Upgrade the Nvidia drivers on the saas-linux-medium-amd64-gpu runners


Release notes

On the saas-linux-medium-amd64-gpu runners, the underlying Nvidia driver version is 470.182.03, regardless of the image used.

Problem to solve

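With this driver, nvidia-smi reports CUDA 11.4 as the highest supported version, so jobs that use a newer CUDA runtime (for example, the 12.3 image shown below) fail with "CUDA driver version is insufficient for CUDA runtime version".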

Proposal

Upgrade the Nvidia drivers on our GPU runners so that newer versions of the CUDA runtime can be used.

Currently, a job with the following configuration reports these versions:

image: nvidia/cuda:12.3.0-base-ubuntu22.04
tags: saas-linux-medium-amd64-gpu-standard
Output of nvidia-smi:
NVIDIA-SMI 470.182.03 
Driver Version: 470.182.03 
CUDA Version: 11.4 
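
To check these values inside a job without reading the output by eye, the nvidia-smi header can be parsed. The following is a minimal sketch, not part of the runner tooling: the function name `parse_nvidia_smi_header` is hypothetical, and it assumes the header contains the `Driver Version:` and `CUDA Version:` labels in the form shown above.

```python
import re


def parse_nvidia_smi_header(text):
    """Return (driver_version, (cuda_major, cuda_minor)) from nvidia-smi output.

    Hypothetical helper: assumes the 'Driver Version:' and 'CUDA Version:'
    labels appear as in the output quoted in this issue.
    """
    driver = re.search(r"Driver Version:\s*([\d.]+)", text)
    cuda = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", text)
    if driver is None or cuda is None:
        raise ValueError("driver or CUDA version not found in nvidia-smi output")
    return driver.group(1), (int(cuda.group(1)), int(cuda.group(2)))


if __name__ == "__main__":
    # Header text as reported on the GPU runner in this issue.
    sample = (
        "NVIDIA-SMI 470.182.03\n"
        "Driver Version: 470.182.03\n"
        "CUDA Version: 11.4\n"
    )
    driver_version, max_cuda = parse_nvidia_smi_header(sample)
    print(driver_version, max_cuda)
```

The "CUDA Version" field is the highest CUDA version the installed driver supports, not the runtime version shipped in the image, which is why it stays at 11.4 even with a 12.3 image.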

The errors when attempting to use a newer CUDA runtime are:

ERROR tests/test_optimizers.py - tensorflow.python.framework.errors_impl.InternalError: cudaGetDevice() failed. Status: CUDA driver version is insufficient for CUDA runtime version
ERROR tests/test_sparse.py - tensorflow.python.framework.errors_impl.InternalError: cudaGetDevice() failed. Status: CUDA driver version is insufficient for CUDA runtime version
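
Until the driver is upgraded, a job could fail fast with a clearer message instead of surfacing the error from inside the test suite. The sketch below is one possible pre-flight check under stated assumptions: the helper names are hypothetical, the image reference is assumed to follow the `nvidia/cuda:<major>.<minor>...` tag pattern used above, and only the major version is compared, on the assumption that minor differences within one CUDA major version are tolerated by the driver.

```python
import re


def cuda_version_from_image(image):
    """Extract (major, minor) from an image such as nvidia/cuda:12.3.0-base-ubuntu22.04.

    Hypothetical helper: returns None if the image does not follow that tag pattern.
    """
    match = re.search(r"nvidia/cuda:(\d+)\.(\d+)", image)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def preflight(image, driver_max_cuda):
    """Classify an image against the driver's highest supported CUDA version.

    Assumption: only a newer CUDA *major* version is treated as insufficient.
    """
    runtime = cuda_version_from_image(image)
    if runtime is None:
        return "unknown"
    if runtime[0] > driver_max_cuda[0]:
        return "insufficient"
    return "ok"


if __name__ == "__main__":
    # Values taken from this issue: 12.3 image, driver reporting CUDA 11.4.
    print(preflight("nvidia/cuda:12.3.0-base-ubuntu22.04", (11, 4)))
```

For the configuration in this issue the check reports `insufficient`, which matches the `cudaGetDevice()` failures above.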

/cc @gabrielengel_gl
