Request for nvidia-docker support in docker build or "stub" driver symbol libraries in devel images
None of the images in this repository or on Docker Hub contain the /usr/local/nvidia directory, because that directory is supposed to be mounted automatically by nvidia-docker (the runtime/volume wrapper).
This becomes a problem when building some applications, such as TensorFlow, with docker build, because docker build allows neither mounting host directories nor using custom runtimes.
For instance, a TensorFlow build fails with a long list of unresolved cuXXX symbol errors:
```
/usr/bin/ld: warning: libcuda.so.1, needed by bazel-out/host/bin/_solib_local/_U_S_Stensorflow_Spython_Cgen_Uio_Uops_Upy_Uwrappers_Ucc___Utensorflow/libtensorflow_framework.so, not found (try using -rpath or -rpath-link)
bazel-out/host/bin/_solib_local/_U_S_Stensorflow_Spython_Cgen_Uio_Uops_Upy_Uwrappers_Ucc___Utensorflow/libtensorflow_framework.so: undefined reference to `cuMemFree_v2'
bazel-out/host/bin/_solib_local/_U_S_Stensorflow_Spython_Cgen_Uio_Uops_Upy_Uwrappers_Ucc___Utensorflow/libtensorflow_framework.so: undefined reference to `cuMemsetD32Async'
...
```
Building such applications under nvidia-docker run works fine, of course.
A workaround is to manually download the *.run distributions from the NVIDIA website and extract both the driver and the CUDA toolkit inside the docker build process, but this seems suboptimal since NVIDIA already maintains this well-established repository for exactly that purpose.
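For reference, a minimal sketch of that workaround; the driver version, download URL, and extraction layout here are assumptions and need to be checked against the actual release:

```dockerfile
# Sketch only: version, URL, and paths are illustrative, not authoritative.
FROM nvidia/cuda:9.0-devel-ubuntu16.04

# Hypothetical driver version; it should match the target host's driver.
ARG DRIVER_VERSION=384.90

RUN apt-get update && apt-get install -y --no-install-recommends \
        curl ca-certificates && \
    rm -rf /var/lib/apt/lists/*

# Download the driver .run installer and only extract it (no kernel module
# install is possible inside docker build), then copy the user-space CUDA
# driver library to /usr/local/nvidia, mimicking the volume that
# nvidia-docker would mount at run time.
RUN cd /tmp && \
    curl -fSLO https://us.download.nvidia.com/XFree86/Linux-x86_64/${DRIVER_VERSION}/NVIDIA-Linux-x86_64-${DRIVER_VERSION}.run && \
    sh NVIDIA-Linux-x86_64-${DRIVER_VERSION}.run --extract-only && \
    mkdir -p /usr/local/nvidia/lib64 && \
    cp -P NVIDIA-Linux-x86_64-${DRIVER_VERSION}/libcuda.so.${DRIVER_VERSION} /usr/local/nvidia/lib64/ && \
    ln -s libcuda.so.${DRIVER_VERSION} /usr/local/nvidia/lib64/libcuda.so.1 && \
    ln -s libcuda.so.1 /usr/local/nvidia/lib64/libcuda.so && \
    echo /usr/local/nvidia/lib64 > /etc/ld.so.conf.d/nvidia.conf && ldconfig && \
    rm -rf /tmp/NVIDIA-Linux-x86_64-${DRIVER_VERSION}*
```

The copied libcuda.so is only needed at link time; at run time nvidia-docker mounts the host's real driver libraries, so version skew between the baked-in and mounted driver is an inherent risk of this approach.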
There is an open issue about allowing host-directory mounts for RUN commands in Dockerfiles, which would resolve this problem as well, but in the meantime I think this caveat should be documented somewhere, a workaround guide should be provided, or the devel images here should ship driver symbol ("stub") libraries.
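As a sketch of what the stub-library option could look like: CUDA toolkit installations ship a linker stub at /usr/local/cuda/lib64/stubs/libcuda.so, so if the devel images exposed it during docker build, the cuXXX symbols would resolve without any host mount. The libcuda.so.1 symlink below is my own addition (the dynamic linker looks for that SONAME) and must not survive into the runtime image:

```dockerfile
# Sketch only: assumes the CUDA devel image ships the toolkit's linker stub
# at /usr/local/cuda/lib64/stubs/libcuda.so.
FROM nvidia/cuda:9.0-devel-ubuntu16.04

# Give the stub the SONAME the linker looks for (libcuda.so.1) and put the
# stubs directory on the library path for the duration of the build.
RUN ln -s /usr/local/cuda/lib64/stubs/libcuda.so \
          /usr/local/cuda/lib64/stubs/libcuda.so.1
ENV LD_LIBRARY_PATH=/usr/local/cuda/lib64/stubs

# ... run the TensorFlow (or other CUDA application) build here ...

# The stub must not shadow the real driver at run time: drop the symlink
# and the library-path entry once the build is done.
RUN rm /usr/local/cuda/lib64/stubs/libcuda.so.1
ENV LD_LIBRARY_PATH=
```

The stub satisfies the linker but aborts if actually called, which is exactly the desired behavior: builds succeed under plain docker build, and the real libcuda.so.1 mounted by nvidia-docker takes over when the container runs.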