How could I run GPU docker? #361

@hyzhak

The question is: how can I use the Kaggle Docker image with a GPU?

I haven't found any examples of how to use the prebuilt kaggle docker-python image with a GPU, so I decided to build it myself.

I cloned this repository and built the GPU image from it (build --gpu). Then I ran a container to check whether the GPUs are visible inside it. The same check works for me with the official TensorFlow image tensorflow/tensorflow:latest-gpu-py3, built from the Dockerfiles here: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/dockerfiles

Script:

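import tensorflow as tf
from tensorflow.python.client import device_lib

def get_available_gpus():
    # List every device TensorFlow can see and keep only the GPUs.
    local_device_protos = device_lib.list_local_devices()
    return [x.name for x in local_device_protos if x.device_type == 'GPU']

# Print the result so it also shows up when run as a script.
print(get_available_gpus())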

With tensorflow/tensorflow:latest-gpu-py3 I get:

['/device:GPU:0']

But with kaggle/python-gpu-build it doesn't work, and the result is:

[]

and I found these errors in the logs:

tensorflow/stream_executor/cuda/cuda_driver.cc:300] failed call to cuInit: UNKNOWN ERROR (-1)
tensorflow/stream_executor/cuda/cuda_diagnostics.cc:163] retrieving CUDA diagnostic information for host: 24cb5b98c9ce
tensorflow/stream_executor/cuda/cuda_diagnostics.cc:170] hostname: 24cb5b98c9ce
tensorflow/stream_executor/cuda/cuda_diagnostics.cc:194] libcuda reported version is: Not found: was unable to find libcuda.so DSO loaded into this program
tensorflow/stream_executor/cuda/cuda_diagnostics.cc:198] kernel reported version is: 410.48.0 eug
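
Since the log says libcuda.so could not be found, one way to narrow this down might be to check from inside the container whether the driver library can be loaded at all, independently of TensorFlow. The following is only a minimal sketch: the helper name check_libcuda and the library names it tries (libcuda.so.1, libcuda.so) are my assumptions, not something taken from the Kaggle image.

```python
import ctypes
import ctypes.util
import os


def check_libcuda(names=("libcuda.so.1", "libcuda.so")):
    """Try to load the CUDA driver library and report what was found.

    The candidate names are assumptions; adjust them for your system.
    """
    report = {
        # What the dynamic linker lookup resolves "cuda" to, or None.
        "find_library": ctypes.util.find_library("cuda"),
        # Extra search path the loader uses, if any is set.
        "LD_LIBRARY_PATH": os.environ.get("LD_LIBRARY_PATH", ""),
        # First candidate name that actually loaded, or None.
        "loaded": None,
    }
    for name in names:
        try:
            ctypes.CDLL(name)
        except OSError:
            continue  # Not loadable under this name; try the next one.
        report["loaded"] = name
        break
    return report


if __name__ == "__main__":
    print(check_libcuda())
```

If "loaded" comes back as None in kaggle/python-gpu-build but not in tensorflow/tensorflow:latest-gpu-py3, that would suggest the driver library is not being made available inside the Kaggle container, rather than a TensorFlow problem.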

Side note: I'm using nvidia-docker2 via --runtime=nvidia.
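
For reference, the kind of invocation I mean can be sketched as below. This only builds and prints the command line. The helper name nvidia_smi_command is hypothetical, and it is an assumption that nvidia-smi is available inside the image when the nvidia runtime is used.

```python
import shlex


def nvidia_smi_command(image="kaggle/python-gpu-build", runtime="nvidia"):
    """Build a `docker run` argv that runs nvidia-smi in the given image.

    Assumes nvidia-docker2 is installed so the named runtime exists.
    """
    return ["docker", "run", "--rm", f"--runtime={runtime}", image, "nvidia-smi"]


if __name__ == "__main__":
    cmd = nvidia_smi_command()
    # Print a shell-safe version of the command instead of executing it.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Passing the returned list to subprocess.run would execute it. Comparing the output for both images could show whether the GPU is exposed to the container at all.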

Does kaggle/python-gpu-build require extra configuration before it can be run? And where can I find more information on how to use it?
Thanks!
