per-build networking breaks DNS configuration for DinD
Summary
With the per-build networking mode enabled, containers running inside Docker-in-Docker (DinD) do not pick up the DNS configuration of the host system. Docker instead falls back to the hard-coded DNS servers 8.8.8.8 and 8.8.4.4.
This is a problem particularly in corporate/institutional networks where outgoing DNS traffic may be blocked, i.e. public DNS resolvers cannot be reached. As a result, any docker build of a container image which requires network access (inside a RUN command) fails.
Steps to reproduce
A Docker-only explanation of the underlying issue follows at the end of this section.
Create a repository consisting of the following Dockerfile and .gitlab-ci.yml.
Dockerfile
FROM busybox
RUN cat /etc/resolv.conf
RUN nslookup google.com || true
RUN wget -O /google.com.html https://google.com/
.gitlab-ci.yml
stages:
  - build

build-image:
  stage: build
  image: docker:20.10
  tags:
    - docker-privileged
  services:
    - docker:20.10-dind
  script:
    - docker build -t myimage .
The underlying problem, docker only
The following example demonstrates the underlying issue by emulating some of the steps performed by GitLab Runner, specifically creating a custom docker network and connecting DinD to it:
# Host resolv.conf, using a company-internal DNS resolver
$ cat /etc/resolv.conf
search corp.com
nameserver 192.168.53.53
# Create per-build network
$ docker network create test
# Start the docker:dind container connected to the network
$ docker run -d --name dind --net test --privileged docker:dind
# DinD resolv.conf, using a forwarding resolver specific to the custom docker network `test`
$ docker exec dind cat /etc/resolv.conf
search corp.com
nameserver 127.0.0.11
options ndots:0
# Name resolution/ping works
$ docker exec dind ping -c 1 google.com
PING google.com (142.250.185.174): 56 data bytes
64 bytes from 142.250.185.174: seq=0 ttl=111 time=10.567 ms
--- google.com ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 10.567/10.567/10.567 ms
# Container inside of DinD resolv.conf, falling back to hard-coded docker defaults
$ docker exec dind docker run busybox cat /etc/resolv.conf
Unable to find image 'busybox:latest' locally
latest: Pulling from library/busybox
aa2a8d90b84c: Pulling fs layer
aa2a8d90b84c: Download complete
aa2a8d90b84c: Pull complete
Digest: sha256:be4684e4004560b2cd1f12148b7120b0ea69c385bcc9b12a637537a2c60f97fb
Status: Downloaded newer image for busybox:latest
search corp.com
options ndots:0
nameserver 8.8.8.8
nameserver 8.8.4.4
# Name resolution does not work as outgoing DNS traffic is blocked by the corporation's firewall
$ docker exec -it dind docker run busybox ping -c 1 google.com
ping: bad address 'google.com'
# Tearing down...
$ docker stop dind
dind
$ docker rm dind
dind
$ docker network remove test
test
This effect is the result of the following:
- Each custom network has a docker-embedded DNS resolver for resolving service names. Connected containers are configured with this resolver in resolv.conf.
  - A custom network is created because the per-build networking mode is enabled via the FF_NETWORK_PER_BUILD feature flag.
  - The resolver is available on 127.0.0.11 for each container connected to the custom network. This cannot be overridden, i.e. --dns 192.168.53.53 does not have any effect.
- For child containers started by dind, the default behaviour of Docker applies for populating resolv.conf, as these are not connected to a custom network.
  - Docker uses the resolv.conf of the "host" (the dind container), stripping away any localhost nameservers (like 127.0.0.11).
  - If no nameservers remain, Docker adds a hard-coded set of default nameservers (8.8.8.8, 8.8.4.4).
  - The resulting list of nameservers is written to the resolv.conf of the child container.
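The steps described above can be approximated with a small shell sketch. It is illustrative only and is not Docker's actual implementation: the function name child_nameservers is hypothetical, and the loopback check is simplified to 127.0.0.0/8 and ::1.

```shell
#!/bin/sh
# Illustrative approximation of how Docker derives a child container's
# nameservers from the parent's resolv.conf. Not Docker's actual code;
# child_nameservers is a hypothetical name.
child_nameservers() {
    # Read resolv.conf from stdin and keep only non-loopback nameservers.
    remaining=$(awk '$1 == "nameserver" && $2 !~ /^127\./ && $2 != "::1" { print $2 }')
    if [ -n "$remaining" ]; then
        printf '%s\n' "$remaining"
    else
        # Nothing usable is left: fall back to the hard-coded public resolvers.
        printf '%s\n' 8.8.8.8 8.8.4.4
    fi
}

# resolv.conf of the dind container on a custom network: only 127.0.0.11 is
# listed, so the fallback resolvers are printed.
printf 'search corp.com\nnameserver 127.0.0.11\noptions ndots:0\n' | child_nameservers

# resolv.conf of a host with an internal resolver: the resolver is kept.
printf 'search corp.com\nnameserver 192.168.53.53\n' | child_nameservers
```

The first call mirrors the per-build networking case, the second the case without a custom network, where the host's resolver is passed on unchanged.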
This issue is known but somewhat stalled: moby/moby#20037 (comment).
The workaround is to specify the DNS servers to use explicitly for child containers running inside of DinD. Either of the following solutions fixes the issue:
- Configure the DNS on each container which is started within DinD.
  $ docker run -d --name dind --net test --privileged docker:dind
  $ docker exec dind docker run --dns 141.52.3.3 --dns 129.13.64.5 busybox ping -c 1 google.com
- Configure the default DNS when starting the docker:dind dockerd.
  $ docker run -d --name dind --net test --privileged docker:dind --dns 141.52.3.3 --dns 129.13.64.5
  $ docker exec dind docker run busybox ping -c 1 google.com
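One way to check whether either variant took effect is to inspect the resolv.conf of a child container and look for the fallback resolvers. The following is a minimal sketch, assuming a hypothetical helper named uses_fallback_dns. It is only a heuristic, since a network that deliberately uses 8.8.8.8 or 8.8.4.4 would also match.

```shell
#!/bin/sh
# Hypothetical helper: exits 0 if the resolv.conf read from stdin lists one
# of Docker's fallback resolvers (8.8.8.8 or 8.8.4.4), and 1 otherwise.
uses_fallback_dns() {
    awk '$1 == "nameserver" && ($2 == "8.8.8.8" || $2 == "8.8.4.4") { found = 1 }
         END { exit !found }'
}

# Usage against a running dind container (requires Docker, so shown as a comment):
#   docker exec dind docker run --rm busybox cat /etc/resolv.conf | uses_fallback_dns \
#       && echo "child containers fall back to public DNS"

# Self-contained demonstration with two resolv.conf variants.
if printf 'search corp.com\nnameserver 8.8.8.8\nnameserver 8.8.4.4\n' | uses_fallback_dns; then
    echo "fallback resolvers in use"
fi
if ! printf 'search corp.com\nnameserver 141.52.3.3\nnameserver 129.13.64.5\n' | uses_fallback_dns; then
    echo "explicit resolvers in use"
fi
```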
Actual behavior
Building the Docker image fails because the build containers running inside of DinD cannot resolve DNS names and therefore cannot fetch the required files from the internet. The same applies to any package fetching or installation step.
Expected behavior
The image is built successfully because the DNS configuration of the host is used inside the build containers.
Relevant logs and/or screenshots
job log
I omitted log output related to #27686 (marked OMITTED).
Running with gitlab-runner 13.11.0 (7f7a4bb0)
on pauls-test-runner-docker-privileged 3gq-aACs
feature flags: FF_NETWORK_PER_BUILD:true
Preparing the "docker" executor
Using Docker executor with image docker:20.10 ...
Starting service docker:20.10-dind ...
Pulling docker image docker:20.10-dind ...
Using docker image sha256:dc8c389414c80f3c6510d3690cd03c29fc99d66f58955f138248499a34186bfa for docker:20.10-dind with digest docker@sha256:87ed8e3a7b251eef42c2e4251f95ae3c5f8c4c0a64900f19cc532d0a42aa7107 ...
Waiting for services to be up and running...
*** WARNING: Service runner-3gq-aacs-project-25822-concurrent-0-070528995f596ee8-docker-0 probably didn't start properly.
Health check error:
service "runner-3gq-aacs-project-25822-concurrent-0-070528995f596ee8-docker-0-wait-for-service" timeout
Health check container logs:
Service container logs:
OMITTED
*********
Pulling docker image docker:20.10 ...
Using docker image sha256:d2979b152a7d43f040c7aef88c4c83de4e545227622b1045adf6fe409293f803 for docker:20.10 with digest docker@sha256:062edd9c11cbdf94e7620d932857a356fa179eaa26a3cc352759e75f04729c49 ...
Preparing environment
Running on runner-3gq-aacs-project-25822-concurrent-0 via build-ci...
Getting source from Git repository
Fetching changes with git depth set to 50...
Initialized empty Git repository in /builds/cy8791/dind-dns-test/.git/
Created fresh repository.
Checking out 71bf4985 as main...
Skipping Git submodules setup
Executing "step_script" stage of the job script
Using docker image sha256:d2979b152a7d43f040c7aef88c4c83de4e545227622b1045adf6fe409293f803 for docker:20.10 with digest docker@sha256:062edd9c11cbdf94e7620d932857a356fa179eaa26a3cc352759e75f04729c49 ...
$ docker build -t myimage .
Step 1/4 : FROM busybox
latest: Pulling from library/busybox
aa2a8d90b84c: Pulling fs layer
aa2a8d90b84c: Verifying Checksum
aa2a8d90b84c: Download complete
aa2a8d90b84c: Pull complete
Digest: sha256:be4684e4004560b2cd1f12148b7120b0ea69c385bcc9b12a637537a2c60f97fb
Status: Downloaded newer image for busybox:latest
---> c55b0f125dc6
Step 2/4 : RUN cat /etc/resolv.conf
---> Running in 7d3e7642c93f
search corp.com
options ndots:0
nameserver 8.8.8.8
nameserver 8.8.4.4
Removing intermediate container 7d3e7642c93f
---> eae2b70b7bcf
Step 3/4 : RUN nslookup google.com || true
---> Running in 2303181946a4
;; connection timed out; no servers could be reached
Removing intermediate container 2303181946a4
---> 242abe799a60
Step 4/4 : RUN wget -O /google.com.html https://google.com/
---> Running in 40ebcfcd7de1
wget: bad address 'google.com'
The command '/bin/sh -c wget -O /google.com.html https://google.com/' returned a non-zero code: 1
Cleaning up file based variables
ERROR: Job failed: exit code 1
Environment description
The custom-installed runner is executed on a host inside a network where outgoing DNS traffic is blocked. That means the DNS servers configured in the host's resolv.conf must be used for performing any DNS query.
The runner uses the Docker executor in privileged mode so that Docker images can be built. Recent versions of GitLab Runner and Docker are installed.
config.toml contents
concurrent = 2
check_interval = 0
[session_server]
session_timeout = 1800
[[runners]]
name = "REDACTED-docker-privileged"
url = "https://REDACTED/"
token = "REDACTED"
executor = "docker"
environment = ["DOCKER_DRIVER=overlay2", "DOCKER_TLS_CERTDIR=/certs"]
[runners.custom_build_dir]
[runners.cache]
[runners.cache.s3]
[runners.cache.gcs]
[runners.cache.azure]
[runners.feature_flags]
FF_NETWORK_PER_BUILD = true
[runners.docker]
tls_verify = false
image = "docker:latest"
privileged = true
disable_entrypoint_overwrite = false
oom_kill_disable = false
disable_cache = false
volumes = ["/certs/client", "/cache"]
pull_policy = ["always"]
shm_size = 0
`docker info` output
Client:
Context: default
Debug Mode: false
Plugins:
app: Docker App (Docker Inc., v0.9.1-beta3)
buildx: Build with BuildKit (Docker Inc., v0.5.1-docker)
scan: Docker Scan (Docker Inc.)
Server:
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 5
Server Version: 20.10.6
Storage Driver: overlay2
Backing Filesystem: xfs
Supports d_type: true
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 1
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 05f951a3781f4f2c1911b05e61c160e9c30eaa8e
runc version: 12644e614e25b05da6fd08a38ffa0cfe1903fdec
init version: de40ad0
Security Options:
seccomp
Profile: default
Kernel Version: 3.10.0-1160.25.1.el7.x86_64
Operating System: Red Hat Enterprise Linux
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 3.696GiB
Name: build-ci
ID: YQM6:ZWJI:UQ73:N5GM:K4JL:7PK5:M7CX:GWA4:RYGP:RHUF:O5YX:VPUI
Docker Root Dir: /var/lib/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Used GitLab Runner version
Version: 13.11.0
Git revision: 7f7a4bb0
Git branch: 13-11-stable
GO version: go1.13.8
Built: 2021-04-20T17:02:28+0000
OS/Arch: linux/amd64
Possible fixes
No straightforward fix is apparent to me.
- Ideally, the proper DNS servers would somehow be picked up automatically. This fix would have to occur in Docker/Moby. moby/moby#20037 (comment)
- Alternatively, GitLab Runner could provide a mechanism to specify the DNS servers in config.toml which get picked up by docker:dind containers (and their child containers) running as services within CI jobs.
Fixing this bug is not critical, as workarounds are available, and per-build networking is not the default (yet?).
Workarounds
- Require each .gitlab-ci.yml to specify DNS explicitly for the docker:dind service, i.e. specify a command dockerd ... --dns 192.168.53.53. This requires developers to know details of the network environment of the GitLab runners.
- Provide a DinD service in the GitLab Runner via config.toml which is properly configured, i.e. has a command dockerd ... --dns 192.168.53.53. As the image is fixed in config.toml, there is no way for developers to specify a different version of the image in .gitlab-ci.yml.
- Disable per-build networking for the GitLab Runner, i.e. remove the feature flag from config.toml. Then, passing on the host's nameservers through Docker's resolv.conf mechanism works: host → dind → child container

IMHO the last option is the preferable workaround as it is simple and preserves the separation between runner administration and developers.
References
- Underlying issue in Moby moby/moby#20037 (comment)
- More general issue concerning DNS nameserver detection by the Docker daemon moby/moby#23910
- Docker documentation on DNS configuration https://docs.docker.com/config/containers/container-networking/#dns-services
- Does not use per-build networking, but also concerns internal corporate networks #2201
- Does not use per-build networking, but also concerns DNS configuration #3054 and !892 (closed)
- Same issue surfacing in Drone CI https://discourse.drone.io/t/dind-container-not-receiving-host-resolv-conf-settings/811
To do
- Test the solution outlined in the comment threads to validate that there is a viable solution for this problem.