
"nvidia-container-runtime": executable file not found in $PATH: unknown

I'm trying to get my GitLab Runner to work with GPUs. In config.toml I added gpus = "all" under the runner's [runners.docker] section, and nvidia-smi in my job script prints the GPUs, so that part seems to be working.

The problem shows up in this CI job:

run_product_tests:
  stage: product_test
  image: ${DOCKER_DOWNLOAD__REGISTRY}/${DOCKER_TEST_IMAGE_NAME}
  tags:
    - linux
  services:
    - name: registry.hub.docker.com/library/docker:20.10.16-dind
      alias: docker
  script:
    - nvidia-smi
    - do-what-i-say.sh  # fails here

In do-what-i-say.sh I run docker-compose up -d against a docker-compose.yml in which one of the services uses the GPU:

...
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              capabilities: [ GPU ]

I get this error:

Error response from daemon: failed to create shim task: OCI runtime create failed: unable to retrieve OCI runtime error (open /var/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/749639f1499acc5848bf962f11d47516e0531ce994965e039e7eb89a21e0e0de/log.json: no such file or directory): exec: "nvidia-container-runtime": executable file not found in $PATH: unknown
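
To narrow this down, a check along these lines could be added to the job script before do-what-i-say.sh. This is only a minimal sketch: report_binary is a made-up helper name, and I'm assuming the docker CLI in the job reaches the dind service through DOCKER_HOST.

```shell
# Hypothetical helper (not part of my real scripts): report whether a binary
# is on PATH in the container where this script runs.
report_binary() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found in PATH"
    fi
}

# Is the NVIDIA runtime visible inside the job container itself?
report_binary nvidia-container-runtime

# Which daemon does the docker CLI talk to? With the dind service I assume
# DOCKER_HOST points at the "docker" alias rather than a local socket.
echo "DOCKER_HOST=${DOCKER_HOST:-unset}"

# List the runtimes registered with that daemon (skipped if there is no CLI).
if command -v docker >/dev/null 2>&1; then
    docker info --format '{{json .Runtimes}}' || echo "docker daemon not reachable"
fi
```

Comparing the first line of output with the runtimes list should show whether the binary is missing from the job container or from the daemon that actually starts the containers.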

The Docker image I mentioned above (${DOCKER_DOWNLOAD__REGISTRY}/${DOCKER_TEST_IMAGE_NAME}) is built from this Dockerfile:

FROM python:3.8

ENV PIP_INDEX_URL="http://my.domain.local:8081/repository/pypigr/simple"
ENV PIP_TRUSTED_HOST="my.domain.local"

COPY requirements.txt /tmp/requirements.txt
RUN python3 -m pip install -r /tmp/requirements.txt

#Docker
RUN apt-get update && apt-get install -y \
    ca-certificates \
    curl \
    gnupg \
    lsb-release
RUN mkdir -p /etc/apt/keyrings
RUN curl -fsSL https://download.docker.com/linux/debian/gpg | gpg --dearmor -o /etc/apt/keyrings/docker.gpg
RUN echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/debian \
  $(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null
RUN chmod a+r /etc/apt/keyrings/docker.gpg
RUN apt-get update
ARG DOCKER_VERSION="5:20.10.21~3-0~debian-bullseye"
ARG CONTAINERD_VERSION="1.6.11-1"
ARG DOCKER_COMPOSE_VERSION="2.12.2~debian-bullseye"
RUN apt-get install -y \
    docker-ce=${DOCKER_VERSION} \
    docker-ce-cli=${DOCKER_VERSION} \
    containerd.io=${CONTAINERD_VERSION} \
    docker-compose-plugin=${DOCKER_COMPOSE_VERSION}

#Container runtime
RUN curl -s -L https://nvidia.github.io/nvidia-container-runtime/gpgkey | \
    apt-key add -
RUN distribution=$(. /etc/os-release;echo $ID$VERSION_ID) && curl -s -L \
    https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.list | \
    tee /etc/apt/sources.list.d/nvidia-container-runtime.list
RUN apt-get update && apt-get install nvidia-container-runtime -y

What can cause this problem?