gitlab-runner 15.1.0 regresses docker-in-docker service health check
Summary
Docker in docker worked on 15.0.0. It works on 15.1.0 but the health check fails.
Steps to reproduce
Upgrade gitlab-runner from 15.0.0 to 15.1.0
.gitlab-ci.yml
build:
stage: build
services:
- docker:stable-dind
image: docker:stable
script:
- docker info
Actual behavior
The job succeeds (docker-in-docker functions correctly), but the DIND service fails its health check
Expected behavior
Health check passes
Relevant logs and/or screenshots
job log
*** WARNING: Service runner-gozrxeer-project-870-concurrent-0-2025fe58bcfc5a76-docker-0 probably didn't start properly.
Health check error:
service "runner-gozrxeer-project-870-concurrent-0-2025fe58bcfc5a76-docker-0-wait-for-service" timeout
Health check container logs:
Service container logs:
2022-06-22T16:21:51.692284828Z Generating RSA private key, 4096 bit long modulus (2 primes)
2022-06-22T16:21:52.400190811Z ..............................................................................................++++
2022-06-22T16:21:52.726833234Z ......................................................................................++++
2022-06-22T16:21:52.727754124Z e is 65537 (0x010001)
2022-06-22T16:21:52.740558016Z Generating RSA private key, 4096 bit long modulus (2 primes)
2022-06-22T16:21:52.840138928Z .........................++++
2022-06-22T16:21:52.877602326Z ........++++
2022-06-22T16:21:52.878070916Z e is 65537 (0x010001)
2022-06-22T16:21:52.904964495Z Signature ok
2022-06-22T16:21:52.905280343Z subject=CN = docker:dind server
2022-06-22T16:21:52.905528791Z Getting CA Private Key
2022-06-22T16:21:52.916577300Z /certs/server/cert.pem: OK
2022-06-22T16:21:52.939272330Z Signature ok
2022-06-22T16:21:52.939339478Z subject=CN = docker:dind client
2022-06-22T16:21:52.939557092Z Getting CA Private Key
2022-06-22T16:21:52.950613710Z /certs/client/cert.pem: OK
2022-06-22T16:21:52.986893400Z time="2022-06-22T16:21:52.986731601Z" level=info msg="Starting up"
2022-06-22T16:21:52.988243448Z time="2022-06-22T16:21:52.988127389Z" level=warning msg="could not change group /var/run/docker.sock to docker: group docker not found"
2022-06-22T16:21:52.989189578Z time="2022-06-22T16:21:52.989098142Z" level=info msg="libcontainerd: started new containerd process" pid=56
2022-06-22T16:21:52.989261528Z time="2022-06-22T16:21:52.989152347Z" level=info msg="parsed scheme: \"unix\"" module=grpc
2022-06-22T16:21:52.989269882Z time="2022-06-22T16:21:52.989169847Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
2022-06-22T16:21:52.989274961Z time="2022-06-22T16:21:52.989203121Z" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock 0 <nil>}] <nil>}" module=grpc
2022-06-22T16:21:52.989280292Z time="2022-06-22T16:21:52.989215298Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
2022-06-22T16:21:53.005009048Z time="2022-06-22T16:21:53.004891591Z" level=info msg="starting containerd" revision=ea765aba0d05254012b0b9e595e995c09186427f version=v1.3.9
2022-06-22T16:21:53.026527622Z time="2022-06-22T16:21:53.026395505Z" level=info msg="loading plugin \"io.containerd.content.v1.content\"..." type=io.containerd.content.v1
2022-06-22T16:21:53.026633145Z time="2022-06-22T16:21:53.026590861Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.btrfs\"..." type=io.containerd.snapshotter.v1
2022-06-22T16:21:53.027003179Z time="2022-06-22T16:21:53.026914184Z" level=info msg="skip loading plugin \"io.containerd.snapshotter.v1.btrfs\"..." error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.btrfs (ext4) must be a btrfs filesystem to be used with the btrfs snapshotter: skip plugin" type=io.containerd.snapshotter.v1
2022-06-22T16:21:53.027042407Z time="2022-06-22T16:21:53.026943779Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.devmapper\"..." type=io.containerd.snapshotter.v1
2022-06-22T16:21:53.027096879Z time="2022-06-22T16:21:53.026977449Z" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.devmapper" error="devmapper not configured"
2022-06-22T16:21:53.027104534Z time="2022-06-22T16:21:53.026994062Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.aufs\"..." type=io.containerd.snapshotter.v1
2022-06-22T16:21:53.032803456Z time="2022-06-22T16:21:53.032680076Z" level=info msg="skip loading plugin \"io.containerd.snapshotter.v1.aufs\"..." error="modprobe aufs failed: \"ip: can't find device 'aufs'\\naufs 258048 0 \\nmodprobe: can't change directory to '/lib/modules': No such file or directory\\n\": exit status 1: skip plugin" type=io.containerd.snapshotter.v1
2022-06-22T16:21:53.032837786Z time="2022-06-22T16:21:53.032718662Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.native\"..." type=io.containerd.snapshotter.v1
2022-06-22T16:21:53.032890289Z time="2022-06-22T16:21:53.032844999Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.overlayfs\"..." type=io.containerd.snapshotter.v1
2022-06-22T16:21:53.033073721Z time="2022-06-22T16:21:53.033006024Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.zfs\"..." type=io.containerd.snapshotter.v1
2022-06-22T16:21:53.033327753Z time="2022-06-22T16:21:53.033257254Z" level=info msg="skip loading plugin \"io.containerd.snapshotter.v1.zfs\"..." error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.zfs must be a zfs filesystem to be used with the zfs snapshotter: skip plugin" type=io.containerd.snapshotter.v1
2022-06-22T16:21:53.033338122Z time="2022-06-22T16:21:53.033280955Z" level=info msg="loading plugin \"io.containerd.metadata.v1.bolt\"..." type=io.containerd.metadata.v1
2022-06-22T16:21:53.033383978Z time="2022-06-22T16:21:53.033344402Z" level=warning msg="could not use snapshotter devmapper in metadata plugin" error="devmapper not configured"
2022-06-22T16:21:53.033417772Z time="2022-06-22T16:21:53.033360764Z" level=info msg="metadata content store policy set" policy=shared
2022-06-22T16:21:53.040122691Z time="2022-06-22T16:21:53.040004138Z" level=info msg="loading plugin \"io.containerd.differ.v1.walking\"..." type=io.containerd.differ.v1
2022-06-22T16:21:53.040139590Z time="2022-06-22T16:21:53.040045037Z" level=info msg="loading plugin \"io.containerd.gc.v1.scheduler\"..." type=io.containerd.gc.v1
2022-06-22T16:21:53.040193382Z time="2022-06-22T16:21:53.040093453Z" level=info msg="loading plugin \"io.containerd.service.v1.containers-service\"..." type=io.containerd.service.v1
2022-06-22T16:21:53.040200758Z time="2022-06-22T16:21:53.040111899Z" level=info msg="loading plugin \"io.containerd.service.v1.content-service\"..." type=io.containerd.service.v1
2022-06-22T16:21:53.040206119Z time="2022-06-22T16:21:53.040131546Z" level=info msg="loading plugin \"io.containerd.service.v1.diff-service\"..." type=io.containerd.service.v1
2022-06-22T16:21:53.040210938Z time="2022-06-22T16:21:53.040155564Z" level=info msg="loading plugin \"io.containerd.service.v1.images-service\"..." type=io.containerd.service.v1
2022-06-22T16:21:53.040269640Z time="2022-06-22T16:21:53.040173897Z" level=info msg="loading plugin \"io.containerd.service.v1.leases-service\"..." type=io.containerd.service.v1
2022-06-22T16:21:53.040276528Z time="2022-06-22T16:21:53.040192545Z" level=info msg="loading plugin \"io.containerd.service.v1.namespaces-service\"..." type=io.containerd.service.v1
2022-06-22T16:21:53.040281029Z time="2022-06-22T16:21:53.040211281Z" level=info msg="loading plugin \"io.containerd.service.v1.snapshots-service\"..." type=io.containerd.service.v1
2022-06-22T16:21:53.040285597Z time="2022-06-22T16:21:53.040229280Z" level=info msg="loading plugin \"io.containerd.runtime.v1.linux\"..." type=io.containerd.runtime.v1
2022-06-22T16:21:53.040515557Z time="2022-06-22T16:21:53.040440414Z" level=info msg="loading plugin \"io.containerd.runtime.v2.task\"..." type=io.containerd.runtime.v2
2022-06-22T16:21:53.040677053Z time="2022-06-22T16:21:53.040592248Z" level=info msg="loading plugin \"io.containerd.monitor.v1.cgroups\"..." type=io.containerd.monitor.v1
2022-06-22T16:21:53.041629159Z time="2022-06-22T16:21:53.041518792Z" level=info msg="loading plugin \"io.containerd.service.v1.tasks-service\"..." type=io.containerd.service.v1
2022-06-22T16:21:53.041702161Z time="2022-06-22T16:21:53.041572958Z" level=info msg="loading plugin \"io.containerd.internal.v1.restart\"..." type=io.containerd.internal.v1
2022-06-22T16:21:53.041710569Z time="2022-06-22T16:21:53.041645141Z" level=info msg="loading plugin \"io.containerd.grpc.v1.containers\"..." type=io.containerd.grpc.v1
2022-06-22T16:21:53.041732464Z time="2022-06-22T16:21:53.041673498Z" level=info msg="loading plugin \"io.containerd.grpc.v1.content\"..." type=io.containerd.grpc.v1
2022-06-22T16:21:53.041760088Z time="2022-06-22T16:21:53.041699398Z" level=info msg="loading plugin \"io.containerd.grpc.v1.diff\"..." type=io.containerd.grpc.v1
2022-06-22T16:21:53.041786995Z time="2022-06-22T16:21:53.041724681Z" level=info msg="loading plugin \"io.containerd.grpc.v1.events\"..." type=io.containerd.grpc.v1
2022-06-22T16:21:53.041793006Z time="2022-06-22T16:21:53.041749150Z" level=info msg="loading plugin \"io.containerd.grpc.v1.healthcheck\"..." type=io.containerd.grpc.v1
2022-06-22T16:21:53.041820151Z time="2022-06-22T16:21:53.041771250Z" level=info msg="loading plugin \"io.containerd.grpc.v1.images\"..." type=io.containerd.grpc.v1
2022-06-22T16:21:53.041865887Z time="2022-06-22T16:21:53.041796288Z" level=info msg="loading plugin \"io.containerd.grpc.v1.leases\"..." type=io.containerd.grpc.v1
2022-06-22T16:21:53.041880688Z time="2022-06-22T16:21:53.041821133Z" level=info msg="loading plugin \"io.containerd.grpc.v1.namespaces\"..." type=io.containerd.grpc.v1
2022-06-22T16:21:53.041905839Z time="2022-06-22T16:21:53.041845091Z" level=info msg="loading plugin \"io.containerd.internal.v1.opt\"..." type=io.containerd.internal.v1
2022-06-22T16:21:53.042156856Z time="2022-06-22T16:21:53.042082954Z" level=info msg="loading plugin \"io.containerd.grpc.v1.snapshots\"..." type=io.containerd.grpc.v1
2022-06-22T16:21:53.042166670Z time="2022-06-22T16:21:53.042122841Z" level=info msg="loading plugin \"io.containerd.grpc.v1.tasks\"..." type=io.containerd.grpc.v1
2022-06-22T16:21:53.042224330Z time="2022-06-22T16:21:53.042151496Z" level=info msg="loading plugin \"io.containerd.grpc.v1.version\"..." type=io.containerd.grpc.v1
2022-06-22T16:21:53.042230610Z time="2022-06-22T16:21:53.042183655Z" level=info msg="loading plugin \"io.containerd.grpc.v1.introspection\"..." type=io.containerd.grpc.v1
2022-06-22T16:21:53.043454968Z time="2022-06-22T16:21:53.043333408Z" level=info msg=serving... address=/var/run/docker/containerd/containerd-debug.sock
2022-06-22T16:21:53.043517350Z time="2022-06-22T16:21:53.043425011Z" level=info msg=serving... address=/var/run/docker/containerd/containerd.sock.ttrpc
2022-06-22T16:21:53.043545567Z time="2022-06-22T16:21:53.043511366Z" level=info msg=serving... address=/var/run/docker/containerd/containerd.sock
2022-06-22T16:21:53.043593943Z time="2022-06-22T16:21:53.043530131Z" level=info msg="containerd successfully booted in 0.039259s"
2022-06-22T16:21:53.052701785Z time="2022-06-22T16:21:53.052597965Z" level=info msg="parsed scheme: \"unix\"" module=grpc
2022-06-22T16:21:53.052844911Z time="2022-06-22T16:21:53.052768972Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
2022-06-22T16:21:53.052952790Z time="2022-06-22T16:21:53.052802459Z" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock 0 <nil>}] <nil>}" module=grpc
2022-06-22T16:21:53.052960731Z time="2022-06-22T16:21:53.052888389Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
2022-06-22T16:21:53.053901603Z time="2022-06-22T16:21:53.053776716Z" level=info msg="parsed scheme: \"unix\"" module=grpc
2022-06-22T16:21:53.053915977Z time="2022-06-22T16:21:53.053798948Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
2022-06-22T16:21:53.053921959Z time="2022-06-22T16:21:53.053816040Z" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock 0 <nil>}] <nil>}" module=grpc
2022-06-22T16:21:53.053926889Z time="2022-06-22T16:21:53.053828950Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
2022-06-22T16:21:53.054903682Z time="2022-06-22T16:21:53.054818039Z" level=error msg="No zfs dataset found for root" backingFS=extfs root=/var/lib/docker storage-driver=zfs
2022-06-22T16:21:53.094649664Z time="2022-06-22T16:21:53.094546075Z" level=warning msg="Your kernel does not support cgroup rt period"
2022-06-22T16:21:53.094744457Z time="2022-06-22T16:21:53.094713920Z" level=warning msg="Your kernel does not support cgroup rt runtime"
2022-06-22T16:21:53.094806513Z time="2022-06-22T16:21:53.094782770Z" level=warning msg="Your kernel does not support cgroup blkio weight"
2022-06-22T16:21:53.094855952Z time="2022-06-22T16:21:53.094827033Z" level=warning msg="Your kernel does not support cgroup blkio weight_device"
2022-06-22T16:21:53.095126609Z time="2022-06-22T16:21:53.095079973Z" level=info msg="Loading containers: start."
2022-06-22T16:21:53.179435166Z time="2022-06-22T16:21:53.179338842Z" level=info msg="Default bridge (docker0) is assigned with an IP address 172.18.0.0/16. Daemon option --bip can be used to set a preferred IP address"
2022-06-22T16:21:53.207534140Z time="2022-06-22T16:21:53.207432077Z" level=info msg="Loading containers: done."
2022-06-22T16:21:53.218904062Z time="2022-06-22T16:21:53.218782967Z" level=info msg="Docker daemon" commit=5eb3275 graphdriver(s)=overlay2 version=19.03.14
2022-06-22T16:21:53.219132645Z time="2022-06-22T16:21:53.219076187Z" level=info msg="Daemon has completed initialization"
2022-06-22T16:21:53.251922276Z time="2022-06-22T16:21:53.251784059Z" level=info msg="API listen on [::]:2376"
2022-06-22T16:21:53.251985983Z time="2022-06-22T16:21:53.251885982Z" level=info msg="API listen on /var/run/docker.sock"
*********
...
Executing "step_script" stage of the job script
00:01
Using docker image sha256:b0757c55a1fdbb59c378fd34dde3e12bd25f68094dd69546cf5ca00ddbaa7a33 for docker:stable with digest docker@sha256:fd4d028713fd05a1fb896412805daed82c4a0cc84331d8dad00cb596d7ce3e3a ...
$ docker info
Client:
Debug Mode: false
Server:
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 0
Server Version: 19.03.14
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: ea765aba0d05254012b0b9e595e995c09186427f
runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
init version: fec3683
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 5.13.0-1021-aws
Operating System: Alpine Linux v3.12 (containerized)
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 3.775GiB
Name: dff798adb7cf
ID: NTTI:AIKR:MXEC:5TA3:MJFQ:TTCB:TMFN:7KST:CKKF:UFYD:D4RZ:63CQ
Docker Root Dir: /var/lib/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine
Cleaning up project directory and file based variables
00:00
Job succeeded
Environment description
config.toml contents
concurrent = 30
check_interval = 0
[session_server]
session_timeout = 1800
[[runners]]
name = "gitlab-runner-medium"
url = "REDACTED"
token = "REDACTED"
executor = "docker+machine"
environment = ["DOCKER_TLS_CERTDIR=/certs", "DOCKER_AUTH_CONFIG={\"credsStore\":\"ecr-login\",\"credHelpers\":{\"REDACTED...}"]
[runners.custom_build_dir]
[runners.cache]
Type = "s3"
Shared = true
[runners.cache.s3]
BucketName = "REDACTED"
BucketLocation = "REDACTED"
AuthenticationType = "iam"
[runners.cache.gcs]
[runners.cache.azure]
[runners.docker]
tls_verify = false
image = "REDACTED"
privileged = true
disable_entrypoint_overwrite = false
oom_kill_disable = false
disable_cache = false
volumes = ["/cache", "/certs/client"]
shm_size = 0
[runners.machine]
IdleCount = 0
IdleScaleFactor = 0.0
IdleCountMin = 0
IdleTime = 600
MachineDriver = "amazonec2"
MachineName = "gitlab-runner-medium-%s"
MachineOptions = ["REDACTED"]
[[runners.machine.autoscaling]]
Periods = ["* * 0-2,5-23 * * mon-fri *"]
Timezone = ""
IdleCount = 2
IdleScaleFactor = 0.0
IdleCountMin = 0
IdleTime = 600
Used GitLab Runner version
Version: 15.1.0
Git revision: 76984217
Git branch: 15-1-stable
GO version: go1.17.7
Built: 2022-06-20T10:10:31+0000
OS/Arch: linux/amd64
Possible fixes
I don't know about a fix, but !3033 (merged) seems like a likely candidate for root cause.
Edited by Francis Ferrell