"nvidia-container-runtime": executable file not found in $PATH: unknown
I'm trying to make my GitLab runner work with a GPU. In config.toml I added a new line under the Docker runner section with gpus = all, and when I run nvidia-smi in my job scripts it returns output listing the GPUs, so that part seems to be working.
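For context, a config.toml with that change might look roughly like the sketch below. Only the gpus line comes from the description above; the runner name is a hypothetical placeholder, the other keys a runner needs (url, token, image and so on) are omitted, and the value is quoted because TOML requires strings to be quoted.

```toml
[[runners]]
  name = "gpu-runner"      # hypothetical name, not from the original post
  executor = "docker"
  [runners.docker]
    gpus = "all"           # expose all host GPUs to the job container
```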
My main issue is in this CI job:
run_product_tests:
  stage: product_test
  image: ${DOCKER_DOWNLOAD__REGISTRY}/${DOCKER_TEST_IMAGE_NAME}
  tags:
    - linux
  services:
    - name: registry.hub.docker.com/library/docker:20.10.16-dind
      alias: docker
  script:
    - nvidia-smi
    - do-what-i-say.sh  # here
In do-what-i-say.sh I run docker-compose up -d against a docker-compose.yml in which one of the services uses the GPU:
...
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              capabilities: [ GPU ]
I get this error:
Error response from daemon: failed to create shim task: OCI runtime create failed: unable to retrieve OCI runtime error (open /var/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/749639f1499acc5848bf962f11d47516e0531ce994965e039e7eb89a21e0e0de/log.json: no such file or directory): exec: "nvidia-container-runtime": executable file not found in $PATH: unknown
The Docker image mentioned above (${DOCKER_DOWNLOAD__REGISTRY}/${DOCKER_TEST_IMAGE_NAME}) is built from this Dockerfile:
FROM python:3.8
ENV PIP_INDEX_URL="http://my.domain.local:8081/repository/pypigr/simple"
ENV PIP_TRUSTED_HOST="my.domain.local"
COPY requirements.txt /tmp/requirements.txt
RUN python3 -m pip install -r /tmp/requirements.txt
# Docker
RUN apt update && apt-get install -y \
    ca-certificates \
    curl \
    gnupg \
    lsb-release
RUN mkdir -p /etc/apt/keyrings
RUN curl -fsSL https://download.docker.com/linux/debian/gpg | gpg --dearmor -o /etc/apt/keyrings/docker.gpg
RUN echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/debian \
  $(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null
RUN chmod a+r /etc/apt/keyrings/docker.gpg
RUN apt-get update
ARG DOCKER_VERSION="5:20.10.21~3-0~debian-bullseye"
ARG CONTAINERD_VERSION="1.6.11-1"
ARG DOCKER_COMPOSE_VERSION="2.12.2~debian-bullseye"
RUN apt-get install -y \
    docker-ce=${DOCKER_VERSION} \
    docker-ce-cli=${DOCKER_VERSION} \
    containerd.io=${CONTAINERD_VERSION} \
    docker-compose-plugin=${DOCKER_COMPOSE_VERSION}
# Container runtime
RUN curl -s -L https://nvidia.github.io/nvidia-container-runtime/gpgkey | \
    apt-key add -
RUN distribution=$(. /etc/os-release;echo $ID$VERSION_ID) && curl -s -L \
    https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.list | \
    tee /etc/apt/sources.list.d/nvidia-container-runtime.list
RUN apt-get update && apt-get install nvidia-container-runtime -y
What can cause this problem?