Tesla K80 can't run applications in CUDA >=11.5 images
cuda-compat-11-x
Context on Upstream
cuda:11-x images come with the cuda-compat-11-x package. This is the Forward Compatibility package, intended to let older drivers (for example, those released with CUDA 10.x) run 11.x applications. Essentially, it ships a libcuda.so library from a newer driver. Forward Compatibility is stronger than Minor Version Compatibility and has far fewer missing features (https://docs.nvidia.com/deploy/cuda-compatibility/index.html#feature-exceptions).
CUDA 11.x upstream images always install it, regardless of the driver version on the host. See Table 3 in the CUDA Release Notes for the mapping between CUDA version and driver version (https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#cuda-major-component-versions).
K80 can't run applications in upstream images >= 11.5
According to the table under the "Which GPUs are supported by the driver?" section in the FAQ of the CUDA Compatibility docs (https://docs.nvidia.com/deploy/cuda-compatibility/index.html#faq), Tesla K80 supports Forward Compatibility packages up to CUDA 11.5.
Although CUDA images with a version >= 11.5 start, applications in them fail at runtime with a cudaErrorNoDevice error.
Therefore, Kepler devices can only run CUDA images <= 11.4 successfully. They do not support even 11.5, contrary to what the Forward Compatibility documentation suggests.
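To see which libcuda a container actually loads and how it reacts to the GPU, a small probe against the CUDA Driver API can help. The following is only a sketch: the function name probe_cuda_driver and the shape of the returned dict are made up for illustration, and it assumes the library is reachable as libcuda.so.1. cuInit, cuDriverGetVersion and cuDeviceGetCount are Driver API entry points; error code 100 (CUDA_ERROR_NO_DEVICE) is the Driver API counterpart of cudaErrorNoDevice.

```python
import ctypes

CUDA_SUCCESS = 0
CUDA_ERROR_NO_DEVICE = 100  # same numeric value as cudaErrorNoDevice in the runtime API


def probe_cuda_driver(library_name="libcuda.so.1"):
    """Hypothetical helper: report what the loaded libcuda says about itself.

    Returns None when the library cannot be loaded at all.
    """
    try:
        libcuda = ctypes.CDLL(library_name)
    except OSError:
        return None  # no userspace driver visible in this environment

    report = {
        "init_status": libcuda.cuInit(0),  # 0 on success, 100 if no usable device
        "driver_version": None,
        "device_count": None,
    }

    # The version is encoded as 1000 * major + 10 * minor, e.g. 11050 for 11.5.
    version = ctypes.c_int(0)
    if libcuda.cuDriverGetVersion(ctypes.byref(version)) == CUDA_SUCCESS:
        report["driver_version"] = version.value

    # Only ask for the device count if initialization succeeded.
    if report["init_status"] == CUDA_SUCCESS:
        count = ctypes.c_int(0)
        if libcuda.cuDeviceGetCount(ctypes.byref(count)) == CUDA_SUCCESS:
            report["device_count"] = count.value

    return report


if __name__ == "__main__":
    print(probe_cuda_driver())
```

Running this inside an image that fails on a K80 and inside one that works should make the difference visible: the expectation, under the explanation given in this issue, is an init_status of 100 in the failing image.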
My explanation is that the libcuda.so userspace driver library installed by the cuda-compat package, which for CUDA >= 11.5 is version >= 495.x (see Table 3, linked above), does not support Kepler-generation devices (per the "Maximum Driver Supported" column of the table in the CUDA Compatibility docs).
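The reasoning can be written down as a small lookup. This is a sketch under stated assumptions, not an authoritative table: the driver branch per CUDA 11.x release is my transcription of Table 3 in the release notes, the value 470 for Kepler is my reading of the "Maximum Driver Supported" column, and the function names are hypothetical. Check both against the linked docs before relying on them.

```python
# Assumed driver branch shipped by each cuda-compat-11-x package
# (transcribed from Table 3 of the CUDA release notes; verify before use).
COMPAT_DRIVER_BRANCH = {
    "11.0": 450,
    "11.1": 455,
    "11.2": 460,
    "11.3": 465,
    "11.4": 470,
    "11.5": 495,
    "11.6": 510,
    "11.7": 515,
    "11.8": 520,
}

# Assumed last driver branch that still supports Kepler devices such as the K80.
KEPLER_MAX_DRIVER_BRANCH = 470


def compat_libcuda_supports_kepler(cuda_version):
    """True if the compat libcuda for this CUDA version is from a branch Kepler supports."""
    return COMPAT_DRIVER_BRANCH[cuda_version] <= KEPLER_MAX_DRIVER_BRANCH


def images_usable_on_kepler():
    """CUDA 11.x image versions whose bundled compat libcuda should still work on Kepler."""
    return [v for v in COMPAT_DRIVER_BRANCH if compat_libcuda_supports_kepler(v)]


if __name__ == "__main__":
    print(images_usable_on_kepler())
```

Under these assumptions the cut-off falls between 11.4 and 11.5, which matches the observed behaviour.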
If we didn't install the cuda-compat package, we would be able to run any 11.x image thanks to Minor Version Compatibility (https://docs.nvidia.com/deploy/cuda-compatibility/index.html#minor-version-compatibility), as long as the host driver meets the minimum version listed there.
Suggested Fix
Publish a separate set of images, without the cuda-compat package, for CUDA versions >= 11.5, targeted at Kepler-generation hardware.