Allow pipeline services configuration for marking services as critical

Problem to solve

In certain cases, a failure to start a service should result in a job failure. Tying a service container failure to a build container failure is not currently possible.

The current healthcheck configuration for services raises a warning when a service fails to start. As described in the docs, there are several cases where a warning may be raised without an actual issue. In other cases, however, it would be preferable for a service container failure to result in the build container's failure.

Target audience

  • Devon, DevOps Engineer, https://design.gitlab.com/research/personas#persona-devon

Further details

One example of this problem is when our SAST job attempts to spin up a docker service for docker-in-docker execution. When the service fails to start, a warning is generated but the job continues to execute. The job eventually fails because it depends on the service, but the failure does not make it clear that the service never started.

Running with gitlab-runner 11.6.0 (f100a208)
  on ci-runner-2 xwRbLsB8
Using Docker executor with image docker:stable ...
Starting service docker:stable-dind ...
Pulling docker image docker:stable-dind ...
Using docker image sha256:5b626cc3459ad077146e8aac1fbe25f7099d71c6765efd6552b9209ca7ea4dc1 for docker:stable-dind ...
Waiting for services to be up and running...

*** WARNING: Service runner-xwRbLsB8-project-26-concurrent-0-docker-0 probably didn't start properly.

Health check error:
ContainerStart: Error response from daemon: Cannot link to a non running container: /runner-xwRbLsB8-project-26-concurrent-0-docker-0 AS /runner-xwRbLsB8-project-26-concurrent-0-docker-0-wait-for-service/service (executor_docker.go:1321:0s)

Service container logs:
2019-02-21T15:57:51.502610505Z mount: permission denied (are you root?)

Proposal

Add a configuration option to the services settings map to recognize a service as critical. Possible options: allow_failure: false (default: true) or critical: true (default: false).

services:
 - name: docker:stable-dind
   allow_failure: false

I would argue that this should be the default behavior for all services. However, that would be a breaking change, so I'm not sure we would want to proceed with it.
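
To make the proposed default concrete, here is a minimal Python sketch of how a services list might be normalized if the option were adopted. It is not gitlab-runner code. The `ServiceSpec` class and `normalize_service` helper are hypothetical names, `postgres:11` is only an illustrative second service, and `allow_failure` on a service is the proposed key from this issue, not an existing setting.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class ServiceSpec:
    """Hypothetical in-memory form of one entry under `services:`."""
    name: str
    # Proposed default: True keeps the current warn-only behavior.
    allow_failure: bool = True


def normalize_service(entry: Union[str, dict]) -> ServiceSpec:
    """Accept the short string form or the map form of a service entry."""
    if isinstance(entry, str):
        # Short form, e.g. "docker:stable-dind": no options, so defaults apply.
        return ServiceSpec(name=entry)
    return ServiceSpec(
        name=entry["name"],
        allow_failure=bool(entry.get("allow_failure", True)),
    )


# Parsed equivalent of a `services:` list mixing both forms.
services = [
    "postgres:11",
    {"name": "docker:stable-dind", "allow_failure": False},
]
specs = [normalize_service(s) for s in services]
```

Defaulting to `True` here mirrors the non-breaking variant of the proposal: existing pipelines that never set the key would behave as they do today.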

What does success look like, and how can we measure that?

When defining a service in .gitlab-ci.yml, if that service fails to start properly, the job should report as a failure.
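
The success criterion above can be sketched as a small decision rule. This is an illustration in Python under stated assumptions, not the runner's implementation: `critical_failures` and `job_status` are hypothetical helpers, the tuple layout is invented for the example, and the health-check results are hard-coded rather than taken from a real executor.

```python
from typing import Iterable, List, Tuple

# (service name, started successfully?, allow_failure setting)
ServiceResult = Tuple[str, bool, bool]


def critical_failures(results: Iterable[ServiceResult]) -> List[str]:
    """Return names of services that failed to start and may not fail."""
    return [
        name
        for name, started_ok, allow_failure in results
        if not started_ok and not allow_failure
    ]


def job_status(results: Iterable[ServiceResult]) -> str:
    """Hypothetical outcome: 'failed' if any critical service did not start."""
    return "failed" if critical_failures(results) else "running"


results = [
    ("docker:stable-dind", False, False),  # failed health check, marked critical
    ("postgres:11", False, True),          # failed health check, warn only
]
print(job_status(results))
```

Under this rule the job would stop at service startup and name the failing service, instead of continuing until a later step fails for a less obvious reason.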

Edited Aug 04, 2025 by 🤖 GitLab Bot 🤖