Regression in runner v18.7.0 wrt services

Summary

I'm seeing a regression with both gitlab-runner version 18.7.0-1 and 18.7.1-1.

Today I ran a GitLab CI/CD workflow that uses services (all other workflows run without issues).

With services I'm getting:

*** WARNING: Service runner-vx8dp7gjm-project-192-concurrent-0-ab791c78715ffbde-mariadb-0 probably didn't start properly.
Health check error:
service "runner-vx8dp7gjm-project-192-concurrent-0-ab791c78715ffbde-mariadb-0-wait-for-service" timeout

This means gitlab-runner is unable to connect to my mariadb service, which used to work fine:

test:
  stage: test
  variables:
    DOCKER_DRIVER: overlay2
    MYSQL_ROOT_PASSWORD: 'secret'
    MYSQL_DATABASE: 'secret'
    MYSQL_USER: 'secret'
    MYSQL_PASSWORD: 'secret'
    TZ: 'Europe/Amsterdam'
  services:
    - mariadb

The MariaDB (mysql) Docker container itself is running fine. Its output looks good:

2025-12-24T23:08:32.205792399Z 2025-12-25  0:08:32 0 [Note] InnoDB: Buffer pool(s) load completed at 251225  0:08:32
2025-12-24T23:08:33.153427183Z 2025-12-25  0:08:33 0 [Note] Server socket created on IP: '0.0.0.0', port: '3306'.
2025-12-24T23:08:33.153445313Z 2025-12-25  0:08:33 0 [Note] Server socket created on IP: '::', port: '3306'.
2025-12-24T23:08:33.158397863Z 2025-12-25  0:08:33 0 [Note] mariadbd: Event Scheduler: Loaded 0 events
2025-12-24T23:08:33.159184238Z 2025-12-25  0:08:33 0 [Note] mariadbd: ready for connections.

However, the internal communication between the two Docker containers started by the GitLab Runner is no longer working, so my CI/CD pipeline is failing.
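To check container-to-container reachability independently of the runner, a small probe like the one below can be run from inside the job container. It is only a sketch: it assumes the health check essentially comes down to opening a TCP connection to the service port, and the function name `wait_for_tcp` is hypothetical, not part of gitlab-runner.

```python
import socket
import time


def wait_for_tcp(host: str, port: int, timeout_s: float = 30.0, interval_s: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Assumption: a plain TCP connect is a reasonable stand-in for the
    runner's service health check. This is not the runner's actual code.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # Each attempt is bounded by interval_s so one hang cannot
            # consume the whole timeout budget.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Covers refused connections, unresolved host names and timeouts.
            time.sleep(interval_s)
    return False
```

For this report the call would be `wait_for_tcp("mariadb", 3306)`, where `mariadb` is the service name from the job definition and 3306 is the port MariaDB reports listening on. If it returns False while the MariaDB log shows "ready for connections", that points at the network path between the containers rather than at the database.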

So I downgraded to gitlab-runner version 18.6.6-1, and this works again, meaning something in 18.7.x is broken!

PS: I'm using Docker version 29.1.3 (build f52814d) on Ubuntu Server 24.04, with GitLab v18.7.0.

Steps to reproduce

.gitlab-ci.yml
test:
  stage: test
  variables:
    DOCKER_DRIVER: overlay2
    MYSQL_ROOT_PASSWORD: 'secret'
    MYSQL_DATABASE: 'secret'
    MYSQL_USER: 'secret'
    MYSQL_PASSWORD: 'secret'
    TZ: 'Europe/Amsterdam'
  services:
    - mariadb

Actual behavior

*** WARNING: Service runner-vx8dp7gjm-project-192-concurrent-0-ab791c78715ffbde-mariadb-0 probably didn't start properly.
Health check error:
service "runner-vx8dp7gjm-project-192-concurrent-0-ab791c78715ffbde-mariadb-0-wait-for-service" timeout

Expected behavior

No errors or issues with service health checks, and no failing pipelines.

Job log:

Using Docker executor with image registry.melroy.org/melroy/docker-images/pnpm:24 ...
Starting service mariadb:latest...
Using effective pull policy of [always] for container mariadb:latest
Pulling docker image mariadb:latest ...
Using docker image sha256:f90bc2981a9328d1cf99f733a5c355a5cc869d78f10eea1932cf99d80328ff86 for mariadb:latest with digest mariadb@sha256:e1bcd6f85781f4a875abefb11c4166c1d79e4237c23de597bf0df81fec225b40 ...
Waiting for services to be up and running (timeout 30 seconds)...
Using effective pull policy of [always] for container registry.melroy.org/melroy/docker-images/pnpm:24
Authenticating with credentials from job payload (GitLab Registry)
Pulling docker image registry.melroy.org/melroy/docker-images/pnpm:24 ...
Using docker image sha256:7075474821d05f5bc2d30ceed26663b3d50fe8a1e2083cab529f2a07ece0c01a for registry.melroy.org/melroy/docker-images/pnpm:24 with digest registry-1.docker.io/danger89/pnpm@sha256:6f275adb6cd28a2ab01316c19da12c40921e6451d425f206db6c9b9eac058247 ...
Preparing environment 00:00
Using effective pull policy of [always] for container sha256:a94f7cb84038b6d0510ca7b38d8e4ac6f38863e764ab1a10e70860dff6ab24bd
Running on runner-vx8dp7gjm-project-192-concurrent-0 via ubuntu-server...
Getting source from Git repository 00:06
Gitaly correlation ID: 01KD9BE1EA6QA8QT5NFKSF1DMF
Fetching changes with git depth set to 20...
Reinitialized existing Git repository in /builds/erpjs/erpjs/.git/
Created fresh repository.
Checking out 6d7645da as detached HEAD (ref is upgrade_rgs_3_8)...

Relevant logs and/or screenshots

See my summary above.

Environment description

My Docker daemon.json:

{
  "experimental": false,
  "icc": false,
  "userns-remap": "default",
  "storage-driver": "overlay2",
  "userland-proxy": false,
  "live-restore": false,
  "no-new-privileges": true,
  "dns": ["8.8.8.8", "8.8.4.4"],
  "ipv6": false,
  "ip6tables": false,
  "fixed-cidr-v6": "2a02:22a0:bbba:f900::/64",
  "registry-mirrors": ["http://127.0.0.1:6000"],
  "insecure-registries": ["127.0.0.1:6000"],
  "data-root": "/media/data_extra/docker",
  "bip": "10.254.1.1/24",
  "default-address-pools":[
    {"base":"10.254.0.0/16","size":25}
  ]
}
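Since the symptom is broken container-to-container traffic, it may help triage to pull out the daemon options that touch networking. The sketch below does that for a daemon.json document. The key selection is my own assumption, not an authoritative list, and nothing here claims that any of these settings is the cause. It is worth noting that Docker documents `icc` as the switch for inter-container communication on the default bridge network, and it is set to false above.

```python
import json

# Daemon options that influence container networking. The selection is an
# assumption made for this sketch, not an authoritative list.
NETWORK_RELATED_KEYS = (
    "icc",
    "userland-proxy",
    "ipv6",
    "ip6tables",
    "bip",
    "default-address-pools",
)


def network_settings(daemon_json_text: str) -> dict:
    """Return the networking-related options present in a daemon.json document."""
    config = json.loads(daemon_json_text)
    return {key: config[key] for key in NETWORK_RELATED_KEYS if key in config}


def icc_disabled(daemon_json_text: str) -> bool:
    """True if inter-container communication is explicitly turned off.

    A missing key is treated as enabled, which matches Docker's documented
    default of icc=true.
    """
    return json.loads(daemon_json_text).get("icc", True) is False
```

Running `network_settings()` on the file above and on a host where 18.7.x works would give a compact list of differences to compare.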

docker info:

Client: Docker Engine - Community
 Version:    29.1.3
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.30.1
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v5.0.0
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 44
  Running: 44
  Paused: 0
  Stopped: 0
 Images: 150
 Server Version: 29.1.3
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 CDI spec directories:
  /etc/cdi
  /var/run/cdi
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: dea7da592f5d1d2b7755e3a161be07f43fad8f75
 runc version: v1.3.4-0-gd6d73eb8
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  userns
  cgroupns
  no-new-privileges
 Kernel Version: 6.8.0-90-generic
 Operating System: Ubuntu 24.04.3 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 22
 Total Memory: 63.78GiB
 Name: ubuntu-server
 ID: ee401697-a44c-4d20-9937-a254b0d09618
 Docker Root Dir: /media/data_extra/docker/231072.231072
 Debug Mode: false
 Username: danger89
 Experimental: false
 Insecure Registries:
  127.0.0.1:6000
  ::1/128
  127.0.0.0/8
 Registry Mirrors:
  http://127.0.0.1:6000/
 Live Restore Enabled: false
 Default Address Pools:
   Base: 10.254.0.0/16, Size: 25
 Firewall Backend: iptables

config.toml contents

concurrent = 5
check_interval = 60
connection_max_age = "15m0s"
shutdown_timeout = 0

[session_server]
  session_timeout = 1800

[[runners]]
  name = "Default Docker runner"
  url = "secret"
  id = 42
  token = "secret"
  token_obtained_at = 2023-03-29T21:32:49Z
  token_expires_at = 0001-01-01T00:00:00Z
  wait_for_services_timeout=110
  request_concurrency=2
  executor = "docker"
  environment = ["DOCKER_DRIVER=overlay2"]
  [runners.docker]
    tls_verify = false
    image = "alpine:3.17"
    privileged = false
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["/cache"]
    shm_size = 0
    network_mtu = 0

Used GitLab Runner version

As mentioned, I'm back on version 18.6.6 for now; versions 18.7.0 and higher are broken wrt services for me.

Possible fixes

I don't know the root cause, but I'm 99% sure the bug was introduced in 18.7.0, and about 80% sure it's a regression caused by MR !5980 (merged).

Workaround: downgrade to version 18.6.6.

Edited by Melroy van den Berg