
Docker in Docker 19.03 service fails

Summary

CE and EE jobs are failing with an error like:

docker: Cannot connect to the Docker daemon at tcp://docker:2375. Is the docker daemon running?.

E.g.:

This is also affecting customers. Presumably it will affect anyone using docker:stable-dind on our runners.


Details

Docker has released a new version, 19.03 (https://hub.docker.com/_/docker?tab=tags), which enables TLS by default. From the image documentation:

Starting in 18.09+, the dind variants of this image will automatically generate TLS certificates in the directory specified by the DOCKER_TLS_CERTDIR environment variable.

Warning: in 18.09, this behavior is disabled by default (for compatibility). If you use --network=host, shared network namespaces (as in Kubernetes pods), or otherwise have network access to the container (including containers started within the dind instance via their gateway interface), this is a potential security issue (which can lead to access to the host system, for example). It is recommended to enable TLS by setting the variable to an appropriate value (-e DOCKER_TLS_CERTDIR=/certs or similar). In 19.03+, this behavior is enabled by default.

This means that when the service starts, it tries to create the certificates, and GitLab Runner does not seem to handle this.
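To make the failure mode easier to follow, here is a minimal shell sketch of how the docker:19.03 client entrypoint appears to pick a default DOCKER_HOST when none is set. It is a simplification of the linked docker-entrypoint.sh, not a copy of it, and the function name `pick_docker_host` is made up for illustration. The assumption is that the client only uses the TLS port (2376) when the client certificates are visible in its own container, and otherwise falls back to 2375, where a TLS-enabled daemon is not listening.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Docker image): a simplified
# imitation of how the client entrypoint chooses a default DOCKER_HOST.
pick_docker_host() {
  certdir="$1"  # value of DOCKER_TLS_CERTDIR, may be empty

  # TLS is only used when all three client certificate files exist
  # and are non-empty under "$certdir/client".
  if [ -n "$certdir" ] \
    && [ -s "$certdir/client/ca.pem" ] \
    && [ -s "$certdir/client/cert.pem" ] \
    && [ -s "$certdir/client/key.pem" ]; then
    echo "tcp://docker:2376"  # TLS port
  else
    echo "tcp://docker:2375"  # plain-text port
  fi
}

# Show what the current environment would resolve to.
pick_docker_host "${DOCKER_TLS_CERTDIR:-}"
```

If this prints the 2375 address inside a job that uses a 19.03 dind service with TLS enabled, the build container probably cannot see the generated client certificates.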

Notes

With the workaround below, you might still see errors such as the service not starting, even though your job succeeds. @tmaczukin left a detailed explanation of why this happens in #4501 (comment 195033385).

Workaround

Support TLS

With 19.03, TLS is enabled by default. To use TLS, update the GitLab Runner configuration so that the certificates are shared between the service and the build container. To do this, update your config.toml to look something like the following:

[[runners]]
  name = "My Docker Runner"
  url = "http://127.0.0.1:3000/"
  token = "oXA2AxcKb8mdGEUrB-3L"
  executor = "docker"
  [runners.custom_build_dir]
  [runners.docker]
    tls_verify = false
    image = "docker:stable"
    privileged = true
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["/certs/client", "/cache"]                      #<-------------- Notice the extra mount to /certs/client
    shm_size = 0

Then update your .gitlab-ci.yml file to explicitly specify that you want the certificates to be generated in a specific path:

    image: docker:19.03

    variables:
      # When using the dind service, we need to instruct docker to talk with
      # the daemon started inside of the service. The daemon is
      # available with a network connection instead of the default
      # /var/run/docker.sock socket. docker:19.03-dind does this
      # automatically by setting the DOCKER_HOST in
      # https://github.com/docker-library/docker/blob/d45051476babc297257df490d22cbd806f1b11e4/19.03/docker-entrypoint.sh#L23-L29
      #
      # The 'docker' hostname is the alias of the service container as described at
      # https://docs.gitlab.com/ee/ci/docker/using_docker_images.html#accessing-the-services.
      #
      # Note that if you're using the Kubernetes executor, the variable should be set to
      # tcp://localhost:2376/ because of how the Kubernetes executor connects services
      # to the job container
      # DOCKER_HOST: tcp://localhost:2376/
      #
      # When using dind, it's wise to use the overlayfs driver for
      # improved performance.
      DOCKER_DRIVER: overlay2
      # Tell Docker where to create the certificates. Docker creates
      # them automatically on boot, including the `/certs/client`
      # directory that is shared between the service and the build
      # container.
      DOCKER_TLS_CERTDIR: "/certs"

    services:
      - docker:19.03-dind

    before_script:
      - docker info

    build:
      stage: build
      script:
        - docker build -t my-docker-image .
        - docker run my-docker-image /script/to/run/tests
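If `docker info` still fails with this setup, one thing worth checking is whether the client certificates actually show up in the build container. The following is a rough sketch of such a check. The function name `check_client_certs` is hypothetical, and the file names (ca.pem, cert.pem, key.pem) are assumed from what the docker:19.03 image generates under `/certs/client`.

```shell
#!/bin/sh
# Hypothetical debugging helper: reports which client certificate
# files are missing or empty in the shared certificate directory.
check_client_certs() {
  dir="${1:-/certs/client}"  # defaults to the path used in this issue
  missing=0

  for f in ca.pem cert.pem key.pem; do
    if [ ! -s "$dir/$f" ]; then
      echo "missing: $dir/$f"
      missing=1
    fi
  done

  # Exit status 0 when all files are present, 1 otherwise.
  return "$missing"
}
```

In a job, the function could be called as `check_client_certs /certs/client` before `docker info` to tell a missing volume mount apart from other connection problems.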

Disable TLS

Set DOCKER_TLS_CERTDIR= (an empty value) as an environment variable to disable TLS. This can be done in a few ways:

config.toml

# config.toml
[[runners]]
  environment = ["DOCKER_TLS_CERTDIR="]

Per job

# .gitlab-ci.yml
variables:
  DOCKER_TLS_CERTDIR: ""

Use older Docker in Docker image

 variables:
   DOCKER_HOST: tcp://docker:2375/
   DOCKER_DRIVER: overlay2
 services:
   - docker:18.09-dind

Proposal

Timeline

Edited by Steve Xuereb