Channel: Active questions tagged python - Stack Overflow

How to ensure the TensorFlow GPU version is in use with a Docker image?


I am using TensorFlow 2.13 via the Dockerfile provided in the TensorFlow models/research/object_detection/dockerfiles folder.

The Dockerfile contents are:

    FROM tensorflow/tensorflow:latest-gpu

    ARG DEBIAN_FRONTEND=noninteractive

    # Install apt dependencies
    RUN apt-get update && apt-get install -y \
        git \
        gpg-agent \
        python3-cairocffi \
        protobuf-compiler \
        python3-pil \
        python3-lxml \
        python3-tk \
        python3-opencv \
        libssl-dev \
        software-properties-common \
        wget

    WORKDIR /home/tensorflow

    ## Copy this code (make sure you are under the ../models/research directory)
    COPY models/research/. /home/tensorflow/models

    # Compile protobuf configs
    RUN (cd /home/tensorflow/models/ && protoc object_detection/protos/*.proto --python_out=.)
    WORKDIR /home/tensorflow/models/

    RUN cp object_detection/packages/tf2/setup.py ./
    ENV PATH="/home/tensorflow/.local/bin:${PATH}"

    RUN python -m pip install -U pip
    RUN python -m pip install .

    COPY scripts /home/tensorflow/
    COPY workspace /home/tensorflow/

    #ENTRYPOINT ["python", "object_detection/model_main_tf2.py"]

After completing the build steps, the Docker image runs. My host machine has a GTX 1650 GPU. I tried to check whether my TensorFlow installation is using the GPU with test_tf.py, shown below:

    import tensorflow as tf

    if tf.test.gpu_device_name():
        print('Default GPU Device: {}'.format(tf.test.gpu_device_name()))
    else:
        print("Please install GPU version of TF")
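The check above prints the same fallback message whether the installed build lacks CUDA support or a CUDA build simply cannot reach a GPU. A minimal sketch that separates the two cases follows. The function name `tf_gpu_report` is hypothetical and not part of the original script; `tf.test.is_built_with_cuda()` and `tf.config.list_physical_devices("GPU")` are standard TensorFlow 2.x calls.

```python
def tf_gpu_report():
    """Return a small dict describing TensorFlow's view of the GPU.

    tf_gpu_report is a hypothetical helper name used for illustration.
    """
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this interpreter at all.
        return {"tensorflow_installed": False}

    return {
        "tensorflow_installed": True,
        "version": tf.__version__,
        # True when the installed binary was compiled with CUDA support.
        "built_with_cuda": tf.test.is_built_with_cuda(),
        # Empty list when no GPU is usable from this process.
        "visible_gpus": [d.name for d in tf.config.list_physical_devices("GPU")],
    }


if __name__ == "__main__":
    for key, value in tf_gpu_report().items():
        print(f"{key}: {value}")
```

If `built_with_cuda` is true while `visible_gpus` is empty, the installed package is probably a GPU build and the problem more likely lies between the container and the host driver than in the TensorFlow install.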

Here is the output from test_tf.py

    root@5433479cb167:/home/tensorflow# python3 test_tf.py
    2023-09-07 14:43:50.620651: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
    To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
    2023-09-07 14:43:53.994502: E tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:268] failed call to cuInit: UNKNOWN ERROR (34)
    Please install GPU version of TF

Using Docker is critical for my work, so I need to find a solution. From the messages, my understanding is that TF detects the GPU but is unable to use it. The last message is confusing, since the Dockerfile starts with FROM tensorflow/tensorflow:latest-gpu.
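Because the `docker run` command is not shown, one open possibility is that the container was started without GPU access (for example, without Docker's `--gpus` option, or without the NVIDIA Container Toolkit on the host). That is an assumption, not something the logs confirm. A minimal, standard-library-only sketch for checking from inside the container whether any NVIDIA device is exposed is given below; the helper name `container_gpu_report` is hypothetical.

```python
import glob
import os
import shutil
import subprocess


def container_gpu_report():
    """Collect hints about whether this container can see an NVIDIA GPU.

    container_gpu_report is a hypothetical helper name used for illustration.
    """
    report = {
        # Device nodes such as /dev/nvidia0 and /dev/nvidiactl are normally
        # present only when the container was started with GPU access.
        "device_nodes": sorted(glob.glob("/dev/nvidia*")),
        # nvidia-smi is usually injected by the NVIDIA container runtime.
        "nvidia_smi_path": shutil.which("nvidia-smi"),
        # Only a hint: CUDA-based images may set this variable themselves.
        "NVIDIA_VISIBLE_DEVICES": os.environ.get("NVIDIA_VISIBLE_DEVICES"),
        "nvidia_smi_ok": False,
    }
    if report["nvidia_smi_path"]:
        try:
            # "nvidia-smi -L" lists the GPUs the driver can see.
            result = subprocess.run(
                [report["nvidia_smi_path"], "-L"],
                capture_output=True,
                text=True,
                timeout=30,
            )
            report["nvidia_smi_ok"] = result.returncode == 0
        except (OSError, subprocess.TimeoutExpired):
            pass
    return report


if __name__ == "__main__":
    for key, value in container_gpu_report().items():
        print(f"{key}: {value}")
```

An empty `device_nodes` list together with a missing `nvidia-smi` would suggest that the container has no GPU access at all, in which case the `cuInit` failure would be expected regardless of which TensorFlow image is used.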

I started training on a small dataset (50 images), and it seems to be using my CPU to its full extent. My training loop appears stuck with the following messages on the console:

    I0907 14:31:03.622151 140609981511424 api.py:460] feature_map_spatial_dims: [(128, 128), (64, 64), (32, 32), (16, 16), (8, 8)]
    I0907 14:31:10.580329 140609981511424 api.py:460] feature_map_spatial_dims: [(128, 128), (64, 64), (32, 32), (16, 16), (8, 8)]
    I0907 14:31:16.743497 140609981511424 api.py:460] feature_map_spatial_dims: [(128, 128), (64, 64), (32, 32), (16, 16), (8, 8)]
    I0907 14:31:23.568284 140609981511424 api.py:460] feature_map_spatial_dims: [(128, 128), (64, 64), (32, 32), (16, 16), (8, 8)]

Prior to this run, my training loop had crashed after printing the same four messages, and I had to reduce my batch size, which now allows training to continue. But I have no loss messages on the console yet.

Please suggest any steps I can take to address / resolve these issues.

Update: after about an hour, my training loop produced a loss statement, which confirms that CPU-based training works.

    INFO:tensorflow:Step 100 per-step time 30.199s
    I0907 15:21:21.773339 140617841817408 model_lib_v2.py:705] Step 100 per-step time 30.199s
    INFO:tensorflow:{'Loss/classification_loss': 0.16712503,
     'Loss/localization_loss': 0.101843126,
     'Loss/regularization_loss': 0.29896417,
     'Loss/total_loss': 0.56793237,
     'learning_rate': 0.0141663505}
    I0907 15:21:21.820583 140617841817408 model_lib_v2.py:708] {'Loss/classification_loss': 0.16712503,
     'Loss/localization_loss': 0.101843126,
     'Loss/regularization_loss': 0.29896417,
     'Loss/total_loss': 0.56793237,
     'learning_rate': 0.0141663505}
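The reported per-step time gives a concrete baseline for comparing CPU and GPU runs later. The sketch below extracts it from log text and estimates the elapsed time; the names `STEP_PATTERN` and `parse_step_times` are hypothetical, and the estimate assumes every step took roughly the reported time.

```python
import re

# Matches lines such as "Step 100 per-step time 30.199s".
STEP_PATTERN = re.compile(r"Step (\d+) per-step time ([\d.]+)s")


def parse_step_times(log_text):
    """Return unique (step, seconds_per_step) pairs in order of appearance.

    Each message is logged twice (once with the INFO:tensorflow prefix and
    once in absl format), so duplicates are dropped.
    """
    pairs = [(int(step), float(sec)) for step, sec in STEP_PATTERN.findall(log_text)]
    return list(dict.fromkeys(pairs))


sample_log = (
    "INFO:tensorflow:Step 100 per-step time 30.199s\n"
    "I0907 15:21:21.773339 140617841817408 model_lib_v2.py:705] "
    "Step 100 per-step time 30.199s\n"
)

for step, seconds in parse_step_times(sample_log):
    # Rough estimate only: assumes a constant time per step.
    minutes = step * seconds / 60
    print(f"step {step}: {seconds:.3f} s/step, about {minutes:.1f} min elapsed")
```

At about 30 s per step, 100 steps come to roughly 50 minutes, which is consistent with the first loss message appearing after about an hour.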

Still, I am looking for a solution to my GPU woes.

