DIND with GPU: Error response from daemon: could not select device driver "nvidia" with capabilities: [[gpu]]

Summary

When using Docker-in-Docker (DinD), the docker service container does not receive the GPU capabilities, so containers started inside it cannot use the GPU.

Steps to reproduce

Create a runner with GPU capabilities:

[[runners]]
  name = "host1.OUR-DOMAIN.local"
  url = "https://gitlab.OUR-DOMAIN.local/"
  token = "**************"
  tls-ca-file = "/etc/gitlab-runner/ORGANIZATION.pem"
  executor = "docker"
  [runners.custom_build_dir]
  [runners.cache]
    [runners.cache.s3]
    [runners.cache.gcs]
    [runners.cache.azure]
  [runners.docker]
    tls_verify = false
    image = "ubuntu:20.04"
    privileged = true
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["/Releases:/Releases", "/etc/docker/daemon.json:/etc/docker/daemon.json"]
    shm_size = 0
    gpus = "all"

.gitlab-ci.yml:
.run_product_tests:
  stage: test
  image: docker:24.0.5
  tags:
    - linux
  services:
    - name: docker:24.0.5-dind
  variables:
    BASIC_FLAGS: " -e ${CI_JOB_STAGE} "
    CPU_FLAG: " -t CPU "
    DOCKER_HOST: tcp://docker:2375
    DOCKER_TLS_CERTDIR: ""
  before_script:
    - mkdir -p logs
    - apk add bash
    - bash
    - docker version
    
Product test X GPU:
  extends: .run_product_tests
  tags:
    - gpu
  script:
    - scripts/start_server.sh -f BF -t GPU ${BASIC_FLAGS}
    - docker compose logs -f >> logs/docker_server_logs-X-local.log 2>&1 &
    - scripts/run_product_test_contained.sh -j ${JOBS} -f BF -t GPU 

start_server.sh runs docker compose up -d. The relevant part of the compose file (the service that requests the GPU):

runner:
    container_name: runner
    depends_on:
      redis-db:
        condition: service_started
        required: true
    deploy:
      resources:
        reservations:
          devices:
          - capabilities:
            - gpu
            driver: nvidia
    environment:
      TZ: Asia/Jerusalem
    image: Our_image:tag
    labels:
      org.label-schema.group: Project
    networks:
      Project_net: null
    restart: unless-stopped
    volumes:
    - type: bind
      source: /builds/Project/server2/configuration/runner/runner-configuration.ini
      target: /configuration.ini
      read_only: true
      bind:
        create_host_path: true
    - type: volume
      source: data-storage-vol
      target: /data
      volume: {}
    - type: bind
      source: /builds/Project/server2/solutions/algorithmic_solutions_butterfly.txt
      target: /home/scripts/algorithmic_solutions_list.txt
      read_only: true
      bind:
        create_host_path: true
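
To check the GPU request independently of Compose, a roughly equivalent plain `docker run` can be tried against the same daemon. This is only a minimal sketch: `gpu_smoke_test` is a hypothetical helper, `ubuntu:20.04` is just a small image chosen for illustration, and `nvidia-smi` is only available inside the container if the NVIDIA Container Toolkit injects it.

```shell
#!/bin/sh
# Sketch: request all GPUs with plain `docker run`, similar to the
# `driver: nvidia` / `capabilities: [gpu]` reservation in the compose file.
# Assumption: DOCKER_HOST already points at the DinD daemon, as in the job.
gpu_smoke_test() {
  image="${1:-ubuntu:20.04}"   # illustrative image, not taken from the project
  if docker run --rm --gpus all "$image" nvidia-smi; then
    echo "GPU request accepted"
  else
    echo "GPU request rejected"
    return 1
  fi
}
```

If calling `gpu_smoke_test` from the job fails with the same "could not select device driver" message, the problem is likely in the DinD daemon rather than in the compose file.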

Actual behavior

All the other containers in docker-compose.yaml start, but the service that uses the GPU fails with: Error response from daemon: could not select device driver "nvidia" with capabilities: [[gpu]]
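
One way to narrow this down, as a rough sketch: as far as I understand, dockerd only registers the "nvidia" device driver when it finds `nvidia-container-runtime-hook` on its PATH at startup, and the stock docker:24.0.5-dind image does not ship the NVIDIA Container Toolkit. The helper below (`check_nvidia_hook` is a hypothetical name) reports whether that binary is visible. It would need to run in the environment where dockerd runs, i.e. the DinD service container, not the job container.

```shell
#!/bin/sh
# Sketch: report whether the NVIDIA runtime hook is visible on PATH.
# Assumption: dockerd looks this binary up on its own PATH at startup, so
# run this in the same environment as dockerd (the DinD service container).
check_nvidia_hook() {
  if command -v nvidia-container-runtime-hook >/dev/null 2>&1; then
    echo "nvidia-container-runtime-hook found"
  else
    echo "nvidia-container-runtime-hook missing"
    return 1
  fi
}

check_nvidia_hook || true
```

A "missing" result inside the service container would be consistent with the error above, even though `gpus = "all"` is set for the job container.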

Expected behavior

The runner service should start and have access to the GPU.

Relevant logs and/or screenshots

https://imgur.com/a/2gqHRnH

job log
Running with gitlab-runner 13.9.0 (2ebc4dc4)
  on rogue1.DOMAIN.local q4x2x9Ez
section_start:1693399371:prepare_executor
Preparing the "docker" executor
Using Docker executor with image docker:24.0.5 ...
Starting service docker:24.0.5-dind ...
Pulling docker image docker:24.0.5-dind ...
Using docker image sha256:1dab0c1da22afec581cf1e31f16bdc24607039abeb71f6cc57d50c458f28f06c for docker:24.0.5-dind with digest docker@sha256:020562d22f11c27997e00da910ed6b580d93094bc25841cb87aacab4ced4a882 ...
Waiting for services to be up and running...

*** WARNING: Service runner-q4x2x9ez-project-184-concurrent-0-97c483665c6ec768-docker-0 probably didn't start properly.

Health check error:
service "runner-q4x2x9ez-project-184-concurrent-0-97c483665c6ec768-docker-0-wait-for-service" timeout

Health check container logs:


Service container logs:
2023-08-30T12:42:52.622899957Z time="2023-08-30T12:42:52.622770983Z" level=info msg="Starting up"
2023-08-30T12:42:52.623148688Z time="2023-08-30T12:42:52.623098182Z" level=warning msg="Binding to IP address without --tlsverify is insecure and gives root access on this machine to everyone who has access to your network." host="tcp://0.0.0.0:2375"
2023-08-30T12:42:52.623162133Z time="2023-08-30T12:42:52.623108391Z" level=warning msg="Binding to an IP address, even on localhost, can also give access to scripts run in a browser. Be safe out there!" host="tcp://0.0.0.0:2375"
2023-08-30T12:42:53.623475047Z time="2023-08-30T12:42:53.623297642Z" level=warning msg="Binding to an IP address without --tlsverify is deprecated. Startup is intentionally being slowed down to show this message" host="tcp://0.0.0.0:2375"
2023-08-30T12:42:53.623514462Z time="2023-08-30T12:42:53.623338890Z" level=warning msg="Please consider generating tls certificates with client validation to prevent exposing unauthenticated root access to your network" host="tcp://0.0.0.0:2375"
2023-08-30T12:42:53.623521535Z time="2023-08-30T12:42:53.623354770Z" level=warning msg="You can override this by explicitly specifying '--tls=false' or '--tlsverify=false'" host="tcp://0.0.0.0:2375"
2023-08-30T12:42:53.623526394Z time="2023-08-30T12:42:53.623365470Z" level=warning msg="Support for listening on TCP without authentication or explicit intent to run without authentication will be removed in the next release" host="tcp://0.0.0.0:2375"
2023-08-30T12:43:08.630645824Z time="2023-08-30T12:43:08.630370393Z" level=warning msg="could not change group /var/run/docker.sock to docker: group docker not found"
2023-08-30T12:43:08.630693774Z time="2023-08-30T12:43:08.630563338Z" level=info msg="containerd not running, starting managed containerd"
2023-08-30T12:43:08.633354695Z time="2023-08-30T12:43:08.632596161Z" level=info msg="started new containerd process" address=/var/run/docker/containerd/containerd.sock module=libcontainerd pid=42
2023-08-30T12:43:08.647320941Z time="2023-08-30T12:43:08.647201646Z" level=info msg="starting containerd" revision=1677a17964311325ed1c31e2c0a3589ce6d5c30d version=v1.7.1
2023-08-30T12:43:08.667199206Z time="2023-08-30T12:43:08.667078668Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.aufs\"..." type=io.containerd.snapshotter.v1
2023-08-30T12:43:08.674570843Z time="2023-08-30T12:43:08.674475072Z" level=info msg="loading plugin \"io.containerd.content.v1.content\"..." type=io.containerd.content.v1
2023-08-30T12:43:08.674611450Z time="2023-08-30T12:43:08.674546868Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.native\"..." type=io.containerd.snapshotter.v1
2023-08-30T12:43:08.674706580Z time="2023-08-30T12:43:08.674600810Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.overlayfs\"..." type=io.containerd.snapshotter.v1
2023-08-30T12:43:08.674901839Z time="2023-08-30T12:43:08.674828140Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.devmapper\"..." type=io.containerd.snapshotter.v1
2023-08-30T12:43:08.674916918Z time="2023-08-30T12:43:08.674847877Z" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.devmapper" error="devmapper not configured"
2023-08-30T12:43:08.674922909Z time="2023-08-30T12:43:08.674858998Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.zfs\"..." type=io.containerd.snapshotter.v1
2023-08-30T12:43:08.675127035Z time="2023-08-30T12:43:08.675068835Z" level=info msg="skip loading plugin \"io.containerd.snapshotter.v1.zfs\"..." error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.zfs must be a zfs filesystem to be used with the zfs snapshotter: skip plugin" type=io.containerd.snapshotter.v1
2023-08-30T12:43:08.675136813Z time="2023-08-30T12:43:08.675092199Z" level=info msg="loading plugin \"io.containerd.metadata.v1.bolt\"..." type=io.containerd.metadata.v1
2023-08-30T12:43:08.675177711Z time="2023-08-30T12:43:08.675133497Z" level=warning msg="could not use snapshotter devmapper in metadata plugin" error="devmapper not configured"
2023-08-30T12:43:08.675188241Z time="2023-08-30T12:43:08.675145049Z" level=info msg="metadata content store policy set" policy=shared
2023-08-30T12:43:08.676290154Z time="2023-08-30T12:43:08.676166811Z" level=info msg="loading plugin \"io.containerd.differ.v1.walking\"..." type=io.containerd.differ.v1
2023-08-30T12:43:08.676305563Z time="2023-08-30T12:43:08.676237154Z" level=info msg="loading plugin \"io.containerd.event.v1.exchange\"..." type=io.containerd.event.v1
2023-08-30T12:43:08.676318858Z time="2023-08-30T12:43:08.676274865Z" level=info msg="loading plugin \"io.containerd.gc.v1.scheduler\"..." type=io.containerd.gc.v1
2023-08-30T12:43:08.676419629Z time="2023-08-30T12:43:08.676326593Z" level=info msg="loading plugin \"io.containerd.lease.v1.manager\"..." type=io.containerd.lease.v1
2023-08-30T12:43:08.676782264Z time="2023-08-30T12:43:08.676420831Z" level=info msg="loading plugin \"io.containerd.nri.v1.nri\"..." type=io.containerd.nri.v1
2023-08-30T12:43:08.676803004Z time="2023-08-30T12:43:08.676747839Z" level=info msg="NRI interface is disabled by configuration."
2023-08-30T12:43:08.676833481Z time="2023-08-30T12:43:08.676777034Z" level=info msg="loading plugin \"io.containerd.runtime.v2.task\"..." type=io.containerd.runtime.v2
2023-08-30T12:43:08.677063747Z time="2023-08-30T12:43:08.676997431Z" level=info msg="loading plugin \"io.containerd.runtime.v2.shim\"..." type=io.containerd.runtime.v2
2023-08-30T12:43:08.677083324Z time="2023-08-30T12:43:08.677026206Z" level=info msg="loading plugin \"io.containerd.sandbox.store.v1.local\"..." type=io.containerd.sandbox.store.v1
2023-08-30T12:43:08.677100847Z time="2023-08-30T12:43:08.677048207Z" level=info msg="loading plugin \"io.containerd.sandbox.controller.v1.local\"..." type=io.containerd.sandbox.controller.v1
2023-08-30T12:43:08.677106287Z time="2023-08-30T12:43:08.677067754Z" level=info msg="loading plugin \"io.containerd.streaming.v1.manager\"..." type=io.containerd.streaming.v1
2023-08-30T12:43:08.677126766Z time="2023-08-30T12:43:08.677091018Z" level=info msg="loading plugin \"io.containerd.service.v1.introspection-service\"..." type=io.containerd.service.v1
2023-08-30T12:43:08.677154478Z time="2023-08-30T12:43:08.677115595Z" level=info msg="loading plugin \"io.containerd.service.v1.containers-service\"..." type=io.containerd.service.v1
2023-08-30T12:43:08.677161862Z time="2023-08-30T12:43:08.677135923Z" level=info msg="loading plugin \"io.containerd.service.v1.content-service\"..." type=io.containerd.service.v1
2023-08-30T12:43:08.677186689Z time="2023-08-30T12:43:08.677153116Z" level=info msg="loading plugin \"io.containerd.service.v1.diff-service\"..." type=io.containerd.service.v1
2023-08-30T12:43:08.677215253Z time="2023-08-30T12:43:08.677173785Z" level=info msg="loading plugin \"io.containerd.service.v1.images-service\"..." type=io.containerd.service.v1
2023-08-30T12:43:08.677237816Z time="2023-08-30T12:43:08.677193662Z" level=info msg="loading plugin \"io.containerd.service.v1.namespaces-service\"..." type=io.containerd.service.v1
2023-08-30T12:43:08.677247394Z time="2023-08-30T12:43:08.677212448Z" level=info msg="loading plugin \"io.containerd.service.v1.snapshots-service\"..." type=io.containerd.service.v1
2023-08-30T12:43:08.677266931Z time="2023-08-30T12:43:08.677229510Z" level=info msg="loading plugin \"io.containerd.runtime.v1.linux\"..." type=io.containerd.runtime.v1
2023-08-30T12:43:08.677407527Z time="2023-08-30T12:43:08.677358193Z" level=info msg="loading plugin \"io.containerd.monitor.v1.cgroups\"..." type=io.containerd.monitor.v1
2023-08-30T12:43:08.677949782Z time="2023-08-30T12:43:08.677894888Z" level=info msg="loading plugin \"io.containerd.service.v1.tasks-service\"..." type=io.containerd.service.v1
2023-08-30T12:43:08.677982965Z time="2023-08-30T12:43:08.677948760Z" level=info msg="loading plugin \"io.containerd.grpc.v1.introspection\"..." type=io.containerd.grpc.v1
2023-08-30T12:43:08.678015837Z time="2023-08-30T12:43:08.677972234Z" level=info msg="loading plugin \"io.containerd.transfer.v1.local\"..." type=io.containerd.transfer.v1
2023-08-30T12:43:08.678046645Z time="2023-08-30T12:43:08.678012741Z" level=info msg="loading plugin \"io.containerd.internal.v1.restart\"..." type=io.containerd.internal.v1
2023-08-30T12:43:08.678130604Z time="2023-08-30T12:43:08.678094485Z" level=info msg="loading plugin \"io.containerd.grpc.v1.containers\"..." type=io.containerd.grpc.v1
2023-08-30T12:43:08.678145462Z time="2023-08-30T12:43:08.678116417Z" level=info msg="loading plugin \"io.containerd.grpc.v1.content\"..." type=io.containerd.grpc.v1
2023-08-30T12:43:08.678162634Z time="2023-08-30T12:43:08.678134762Z" level=info msg="loading plugin \"io.containerd.grpc.v1.diff\"..." type=io.containerd.grpc.v1
2023-08-30T12:43:08.678176230Z time="2023-08-30T12:43:08.678153357Z" level=info msg="loading plugin \"io.containerd.grpc.v1.events\"..." type=io.containerd.grpc.v1
2023-08-30T12:43:08.678195015Z time="2023-08-30T12:43:08.678171251Z" level=info msg="loading plugin \"io.containerd.grpc.v1.healthcheck\"..." type=io.containerd.grpc.v1
2023-08-30T12:43:08.678213310Z time="2023-08-30T12:43:08.678189144Z" level=info msg="loading plugin \"io.containerd.grpc.v1.images\"..." type=io.containerd.grpc.v1
2023-08-30T12:43:08.678235652Z time="2023-08-30T12:43:08.678205726Z" level=info msg="loading plugin \"io.containerd.grpc.v1.leases\"..." type=io.containerd.grpc.v1
2023-08-30T12:43:08.678254278Z time="2023-08-30T12:43:08.678224952Z" level=info msg="loading plugin \"io.containerd.grpc.v1.namespaces\"..." type=io.containerd.grpc.v1
2023-08-30T12:43:08.678283633Z time="2023-08-30T12:43:08.678246874Z" level=info msg="loading plugin \"io.containerd.internal.v1.opt\"..." type=io.containerd.internal.v1
2023-08-30T12:43:08.678545608Z time="2023-08-30T12:43:08.678497838Z" level=info msg="loading plugin \"io.containerd.grpc.v1.sandbox-controllers\"..." type=io.containerd.grpc.v1
2023-08-30T12:43:08.678571127Z time="2023-08-30T12:43:08.678531442Z" level=info msg="loading plugin \"io.containerd.grpc.v1.sandboxes\"..." type=io.containerd.grpc.v1
2023-08-30T12:43:08.678592036Z time="2023-08-30T12:43:08.678561308Z" level=info msg="loading plugin \"io.containerd.grpc.v1.snapshots\"..." type=io.containerd.grpc.v1
2023-08-30T12:43:08.678613336Z time="2023-08-30T12:43:08.678579873Z" level=info msg="loading plugin \"io.containerd.grpc.v1.streaming\"..." type=io.containerd.grpc.v1
2023-08-30T12:43:08.678640959Z time="2023-08-30T12:43:08.678598729Z" level=info msg="loading plugin \"io.containerd.grpc.v1.tasks\"..." type=io.containerd.grpc.v1
2023-08-30T12:43:08.678672568Z time="2023-08-30T12:43:08.678638584Z" level=info msg="loading plugin \"io.containerd.grpc.v1.transfer\"..." type=io.containerd.grpc.v1
2023-08-30T12:43:08.678692236Z time="2023-08-30T12:43:08.678664183Z" level=info msg="loading plugin \"io.containerd.grpc.v1.version\"..." type=io.containerd.grpc.v1
2023-08-30T12:43:08.678714287Z time="2023-08-30T12:43:08.678681235Z" level=info msg="loading plugin \"io.containerd.tracing.processor.v1.otlp\"..." type=io.containerd.tracing.processor.v1
2023-08-30T12:43:08.678743182Z time="2023-08-30T12:43:08.678703367Z" level=info msg="skip loading plugin \"io.containerd.tracing.processor.v1.otlp\"..." error="no OpenTelemetry endpoint: skip plugin" type=io.containerd.tracing.processor.v1
2023-08-30T12:43:08.678749824Z time="2023-08-30T12:43:08.678723334Z" level=info msg="loading plugin \"io.containerd.internal.v1.tracing\"..." type=io.containerd.internal.v1
2023-08-30T12:43:08.678769371Z time="2023-08-30T12:43:08.678739725Z" level=info msg="skipping tracing processor initialization (no tracing plugin)" error="no OpenTelemetry endpoint: skip plugin"
2023-08-30T12:43:08.679166492Z time="2023-08-30T12:43:08.679121177Z" level=info msg=serving... address=/var/run/docker/containerd/containerd-debug.sock
2023-08-30T12:43:08.679239851Z time="2023-08-30T12:43:08.679204264Z" level=info msg=serving... address=/var/run/docker/containerd/containerd.sock.ttrpc
2023-08-30T12:43:08.679303081Z time="2023-08-30T12:43:08.679270749Z" level=info msg=serving... address=/var/run/docker/containerd/containerd.sock
2023-08-30T12:43:08.679325553Z time="2023-08-30T12:43:08.679294885Z" level=info msg="containerd successfully booted in 0.032877s"
2023-08-30T12:43:08.693870363Z time="2023-08-30T12:43:08.693794620Z" level=info msg="Loading containers: start."
2023-08-30T12:43:08.770872543Z time="2023-08-30T12:43:08.770743208Z" level=info msg="Loading containers: done."
2023-08-30T12:43:08.778765997Z time="2023-08-30T12:43:08.778676458Z" level=warning msg="WARNING: API is accessible on http://0.0.0.0:2375 without encryption.\n         Access to the remote API is equivalent to root access on the host. Refer\n         to the 'Docker daemon attack surface' section in the documentation for\n         more information: https://docs.docker.com/go/attack-surface/"
2023-08-30T12:43:08.778794782Z time="2023-08-30T12:43:08.778706675Z" level=warning msg="WARNING: No swap limit support"
2023-08-30T12:43:08.778811213Z time="2023-08-30T12:43:08.778767981Z" level=info msg="Docker daemon" commit=a61e2b4 graphdriver=overlay2 version=24.0.5
2023-08-30T12:43:08.778921722Z time="2023-08-30T12:43:08.778875174Z" level=info msg="Daemon has completed initialization"
2023-08-30T12:43:08.799374793Z time="2023-08-30T12:43:08.799240088Z" level=info msg="API listen on /var/run/docker.sock"
2023-08-30T12:43:08.799390092Z time="2023-08-30T12:43:08.799264514Z" level=info msg="API listen on [::]:2375"

*********

Pulling docker image docker:24.0.5 ...
Using docker image sha256:1dab0c1da22afec581cf1e31f16bdc24607039abeb71f6cc57d50c458f28f06c for docker:24.0.5 with digest docker@sha256:020562d22f11c27997e00da910ed6b580d93094bc25841cb87aacab4ced4a882 ...
section_end:1693399404:prepare_executor
section_start:1693399404:prepare_script
Preparing environment
Running on runner-q4x2x9ez-project-184-concurrent-0 via rogue1.DOMAIN.local...
section_end:1693399405:prepare_script
section_start:1693399405:get_sources
Getting source from Git repository
Fetching changes with git depth set to 50...
Reinitialized existing Git repository in /builds/PROJECT_NAME/server2/.git/
Checking out c8e0d29c as quite_pull...
Removing logs/

Updating/initializing submodules recursively...
Synchronizing submodule url for 'common'
Entering 'common'
Entering 'common'
HEAD is now at 8d7da28 fix port
Entering 'common'
section_end:1693399407:get_sources
section_start:1693399407:step_script
Executing "step_script" stage of the job script
Using docker image sha256:1dab0c1da22afec581cf1e31f16bdc24607039abeb71f6cc57d50c458f28f06c for docker:24.0.5 with digest docker@sha256:020562d22f11c27997e00da910ed6b580d93094bc25841cb87aacab4ced4a882 ...
$ mkdir -p logs
$ apk add bash
fetch https://dl-cdn.alpinelinux.org/alpine/v3.18/main/x86_64/APKINDEX.tar.gz
fetch https://dl-cdn.alpinelinux.org/alpine/v3.18/community/x86_64/APKINDEX.tar.gz
(1/2) Installing readline (8.2.1-r1)
(2/2) Installing bash (5.2.15-r5)
Executing bash-5.2.15-r5.post-install
Executing busybox-1.36.1-r2.trigger
OK: 33 MiB in 57 packages
$ bash
$ docker version
Client:
 Version:           24.0.5
 API version:       1.43
 Go version:        go1.20.6
 Git commit:        ced0996
 Built:             Fri Jul 21 20:34:32 2023
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          24.0.5
  API version:      1.43 (minimum version 1.12)
  Go version:       go1.20.6
  Git commit:       a61e2b4
  Built:            Fri Jul 21 20:35:56 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          v1.7.1
  GitCommit:        1677a17964311325ed1c31e2c0a3589ce6d5c30d
 runc:
  Version:          1.1.8
  GitCommit:        v1.1.8-0-g82f18fe
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
$ scripts/start_server.sh -f BF -t GPU ${BASIC_FLAGS}
[2023-08-30             12:43:29.              ][WARN  ][main:24 ] No docker-compose.yaml found, trying to generate
logger: unrecognized option: tag
BusyBox v1.36.1 (2023-07-27 17:12:24 UTC) multi-call binary.

Usage: logger [-s] [-t TAG] [-p PRIO] [MESSAGE]

Write MESSAGE (or stdin) to syslog

	-s	Log to stderr as well as the system log
	-t TAG	Log using the specified tag (defaults to user name)
	-p PRIO	Priority (number or FACILITY.LEVEL pair)
Error while logging with syslog. Where these flags ok: '--tag b-log_example_02'
[2023-08-30             12:43:29.              ][INFO  ][main:45 ] Environment: CI Production
logger: unrecognized option: tag
BusyBox v1.36.1 (2023-07-27 17:12:24 UTC) multi-call binary.

Usage: logger [-s] [-t TAG] [-p PRIO] [MESSAGE]

Write MESSAGE (or stdin) to syslog

	-s	Log to stderr as well as the system log
	-t TAG	Log using the specified tag (defaults to user name)
	-p PRIO	Priority (number or FACILITY.LEVEL pair)
Error while logging with syslog. Where these flags ok: '--tag b-log_example_02'
[2023-08-30             12:43:29.              ][INFO  ][main:65 ] Flavor BF selected
logger: unrecognized option: tag
BusyBox v1.36.1 (2023-07-27 17:12:24 UTC) multi-call binary.

Usage: logger [-s] [-t TAG] [-p PRIO] [MESSAGE]

Write MESSAGE (or stdin) to syslog

	-s	Log to stderr as well as the system log
	-t TAG	Log using the specified tag (defaults to user name)
	-p PRIO	Priority (number or FACILITY.LEVEL pair)
Error while logging with syslog. Where these flags ok: '--tag b-log_example_02'
[2023-08-30             12:43:29.              ][INFO  ][main:96 ] Compose set:  -f compose-components/GW/base.yaml -f compose-components/RB/base.yaml -f compose-components/redis/base.yaml -f compose-components/storage-manager/base.yaml -f compose-components/runner/base.yaml -f compose-components/runner/gpu.yaml -f compose-components/runner/remote_solutions.yaml
logger: unrecognized option: tag
BusyBox v1.36.1 (2023-07-27 17:12:24 UTC) multi-call binary.

Usage: logger [-s] [-t TAG] [-p PRIO] [MESSAGE]

Write MESSAGE (or stdin) to syslog

	-s	Log to stderr as well as the system log
	-t TAG	Log using the specified tag (defaults to user name)
	-p PRIO	Priority (number or FACILITY.LEVEL pair)
Error while logging with syslog. Where these flags ok: '--tag b-log_example_02'
[2023-08-30             12:43:29.              ][NOTICE][main:128] Docker compose generated successfully
logger: unrecognized option: tag
BusyBox v1.36.1 (2023-07-27 17:12:24 UTC) multi-call binary.

Usage: logger [-s] [-t TAG] [-p PRIO] [MESSAGE]

Write MESSAGE (or stdin) to syslog

	-s	Log to stderr as well as the system log
	-t TAG	Log using the specified tag (defaults to user name)
	-p PRIO	Priority (number or FACILITY.LEVEL pair)
Error while logging with syslog. Where these flags ok: '--tag b-log_example_02'
[2023-08-30             12:43:29.              ][NOTICE][docker_compose_version_check:12 ] Docker Compose version is higher than 2.20.0
logger: unrecognized option: tag
BusyBox v1.36.1 (2023-07-27 17:12:24 UTC) multi-call binary.

Usage: logger [-s] [-t TAG] [-p PRIO] [MESSAGE]

Write MESSAGE (or stdin) to syslog

	-s	Log to stderr as well as the system log
	-t TAG	Log using the specified tag (defaults to user name)
	-p PRIO	Priority (number or FACILITY.LEVEL pair)
Error while logging with syslog. Where these flags ok: '--tag b-log_example_02'
 redis-db Pulling 
 GW Pulling 
 storage-manager-http Pulling 
 RB Pulling 
 runner Pulling 
 GW Pulled 
 storage-manager-http Pulled 
 RB Pulled 
 redis-db Pulled 
 runner Pulled 
 Network PROJECT_NAME_net  Creating
 Network PROJECT_NAME_net  Created
 Volume "GW_redis-storage-vol"  Creating
 Volume "GW_redis-storage-vol"  Created
 Volume "GW_data-storage-vol"  Creating
 Volume "GW_data-storage-vol"  Created
 Container redis-db  Creating
 Container redis-db  Created
 Container storage-manager-http  Creating
 Container GW  Creating
 Container RB  Creating
 Container runner  Creating
 Container storage-manager-http  Created
 Container GW  Created
 Container RB  Created
 Container runner  Created
 Container redis-db  Starting
 Container redis-db  Started
 Container RB  Starting
 Container storage-manager-http  Starting
 Container runner  Starting
 Container GW  Starting
 Container GW  Started
 Container storage-manager-http  Started
 Container RB  Started
Error response from daemon: could not select device driver "nvidia" with capabilities: [[gpu]]
[2023-08-30             12:46:26.              ][ERROR ][start_server:17 ] Error
logger: unrecognized option: tag
BusyBox v1.36.1 (2023-07-27 17:12:24 UTC) multi-call binary.

Usage: logger [-s] [-t TAG] [-p PRIO] [MESSAGE]

Write MESSAGE (or stdin) to syslog

	-s	Log to stderr as well as the system log
	-t TAG	Log using the specified tag (defaults to user name)
	-p PRIO	Priority (number or FACILITY.LEVEL pair)
Error while logging with syslog. Where these flags ok: '--tag b-log_example_02'
section_end:1693399587:step_script
section_start:1693399587:after_script
Running after_script
Running after script...
$ ls -ltr results/
ls: results/: No such file or directory
section_end:1693399588:after_script
section_start:1693399588:upload_artifacts_on_failure
Uploading artifacts for failed job
Uploading artifacts...
WARNING: results/: no matching files
logs/: found 1 matching files and directories
docker-compose.yaml: found 1 matching files and directories
Uploading artifacts as "archive" to coordinator... ok  id=368127 responseStatus=201 Created token=dxE-4kkx
Uploading artifacts...
WARNING: results/*.xml: no matching files
ERROR: No files to upload
section_end:1693399590:upload_artifacts_on_failure
section_start:1693399590:cleanup_file_variables
Cleaning up file based variables
section_end:1693399591:cleanup_file_variables
ERROR: Job failed: exit code 1

Environment description

Running self-hosted https://i.imgur.com/1Ze7TcJ.png

config.toml contents
[[runners]]
  name = "host1.OUR-DOMAIN.local"
  url = "https://gitlab.OUR-DOMAIN.local/"
  token = "**************"
  tls-ca-file = "/etc/gitlab-runner/ORGANIZATION.pem"
  executor = "docker"
  [runners.custom_build_dir]
  [runners.cache]
    [runners.cache.s3]
    [runners.cache.gcs]
    [runners.cache.azure]
  [runners.docker]
    tls_verify = false
    image = "ubuntu:20.04"
    privileged = true
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["/Releases:/Releases", "/etc/docker/daemon.json:/etc/docker/daemon.json"]
    shm_size = 0
    gpus = "all"

Used GitLab Runner version

Version:      13.9.0
Git revision: 2ebc4dc4
Git branch:   13-9-stable
GO version:   go1.13.8
Built:        2021-02-22T20:17:08+0000
OS/Arch:      linux/amd64

--- docker
Client: Docker Engine - Community
 Version:    24.0.5
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.11.2
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.20.2
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 43
  Running: 0
  Paused: 0
  Stopped: 43
 Images: 34
 Server Version: 24.0.5

Edit: I also tried upgrading the runner to 16.3.0.

Edited by David Tayar