Docker in Docker Builds Slow and Consume Huge Amounts of Disk Space

Summary

After provisioning a fresh server with Ubuntu 16.10, Docker CE, and the overlay2 storage driver, I find that Docker-in-Docker CI builds take twice as long to complete and consume a huge amount of disk space during the build (root filesystem usage has gone from 27% to as high as 90% of a 20G disk). All of this disk space appears to be recovered once the build completes.

Steps to reproduce

  1. Provision Ubuntu 16.10 with the script below.
  2. Run GitLab Runner as a Docker container (see config.toml below).
  3. Set up a build pipeline with the .gitlab-ci.yml below.
  4. Observe the current disk usage (see the initial disk usage output below).
  5. Start the build.
  6. Monitor the increase in disk usage during the build.
  7. Use iotop to see that dockerd processes are still running with --storage-driver=vfs rather than overlay or overlay2.
  8. Monitor disk usage returning to normal after the build.
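
Steps 4, 6 and 8 can be scripted so that the readings are taken the same way each time. The following is a minimal sketch, assuming a POSIX shell on the Docker host; root_use_pct and sample_disk are hypothetical helper names introduced here, not part of any tool.

```shell
# Print the used percentage of the filesystem holding a path (default: /).
# df -P forces the POSIX one-line-per-filesystem format, so field 5 is
# always the capacity column (for example "27%").
root_use_pct() {
    df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Print COUNT timestamped readings, INTERVAL seconds apart.
# sample_disk is a hypothetical helper name used for illustration only.
sample_disk() {
    interval="${1:-5}"
    count="${2:-12}"
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s %s%%\n' "$(date +%H:%M:%S)" "$(root_use_pct /)"
        i=$((i + 1))
        # Do not sleep after the final reading.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

Running `sample_disk 5 120` in a second terminal while the pipeline runs would give roughly ten minutes of readings to compare against the figures under "Relevant logs and/or screenshots".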

Actual behavior

The build takes twice as long to run as on my previous hosts and uses far more disk space than expected during the build: used space on the root filesystem climbs from 4.9G to 15G or more.

Expected behavior

After discovering my hosts were running with the devicemapper loopback storage driver I expected a new host with overlay2 storage driver to run builds at least as quickly and with a similar amount of disk usage.

Relevant logs and/or screenshots

Initial disk usage:

Filesystem      Size  Used Avail Use% Mounted on
/dev/root        20G  4.9G   14G  27% /
devtmpfs        492M     0  492M   0% /dev
tmpfs           495M     0  495M   0% /dev/shm
tmpfs           495M  8.9M  486M   2% /run
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           495M     0  495M   0% /sys/fs/cgroup

Disk usage during the build fluctuates a lot and has gone as high as 90%.

During the image pulling stage:

Filesystem      Size  Used Avail Use% Mounted on
/dev/root        20G   12G  7.2G  62% /
devtmpfs        492M     0  492M   0% /dev
tmpfs           495M     0  495M   0% /dev/shm
tmpfs           495M  9.4M  486M   2% /run
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           495M     0  495M   0% /sys/fs/cgroup

During the npm install stage (it's a simple Node project):

Filesystem      Size  Used Avail Use% Mounted on
/dev/root        20G   15G  4.1G  79% /
devtmpfs        492M     0  492M   0% /dev
tmpfs           495M     0  495M   0% /dev/shm
tmpfs           495M  9.4M  486M   2% /run
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           495M     0  495M   0% /sys/fs/cgroup

Output of sudo iotop during the build. Note the references to --storage-driver=vfs, which iotop truncates to -~driver=vfs:

Total DISK READ :      29.78 M/s | Total DISK WRITE :      29.65 M/s
Actual DISK READ:      35.19 M/s | Actual DISK WRITE:      70.23 K/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
48260 be/4 root        6.48 M/s    0.00 B/s  0.00 % 17.11 % dockerd -~driver=vfs
 9257 be/4 root        8.77 M/s    0.00 B/s  0.00 % 12.46 % dockerd -~driver=vfs
 8259 be/4 root       10.88 M/s    0.00 B/s  0.00 %  8.51 % dockerd -~driver=vfs
 8260 be/4 root        3.65 M/s    0.00 B/s  0.00 %  6.12 % dockerd -~driver=vfs
 9360 be/4 root        0.00 B/s   29.65 M/s  0.00 %  0.00 % docker-un~44333-init
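
Because iotop truncates the command column, the full dockerd command lines can be checked with ps instead. This is a sketch only: find_vfs_dockerd is a hypothetical helper name, and it matches just the long --storage-driver flag form seen in the output above.

```shell
# Read "PID COMMAND..." lines on stdin and print only those where a dockerd
# process was started with the vfs storage driver, written either as
# "--storage-driver=vfs" or "--storage-driver vfs".
# find_vfs_dockerd is a hypothetical helper name used for illustration only.
find_vfs_dockerd() {
    grep -E '(^|[ /])dockerd( |$)' | grep -E -- '--storage-driver[= ]vfs( |$)'
}

# Usage on the Docker host while a build is running
# (-ww asks procps ps not to truncate the command line):
#   ps -e -ww -o pid,args | find_vfs_dockerd
```

Any line printed belongs to a daemon using vfs. The host daemon configured by the provisioning script should not appear, since it is started with --storage-driver=overlay2.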

Environment description

  • Self-hosted GitLab CE version 9.0.4
  • GitLab Runner 9.3.0 (recently upgraded from 1.4.2 or something like that; the upgrade made no difference)
  • Docker executor / DIND build

.gitlab-ci.yml:

image: docker:latest

# When using dind, it's wise to use the overlayfs driver for
# improved performance.
variables:
  DOCKER_DRIVER: overlay

services:
- docker:dind

stages:
- build

build:
  stage: build
  script:
    - docker login -u gitlab-ci-token -p $CI_BUILD_TOKEN myregistry.com
    - docker build --pull -t myimage .
    - docker push myimage
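
To confirm which storage driver the daemon used by the job actually reports, a check along these lines could be run from the job. It is a sketch under assumptions: check_storage_driver is a hypothetical helper name, and its placement before docker build is a suggestion, not part of the original pipeline. The vfs warning reflects the fact that vfs copies each layer in full and has no copy-on-write, which would be consistent with the disk usage described above.

```shell
# Classify a storage driver name as reported by the Docker daemon.
# Returns 0 for overlay or overlay2, and 1 for anything else.
# check_storage_driver is a hypothetical helper name used for illustration only.
check_storage_driver() {
    case "$1" in
        overlay|overlay2)
            echo "ok: storage driver is $1"
            ;;
        vfs)
            echo "warning: storage driver is vfs (full copy per layer, no copy-on-write)"
            return 1
            ;;
        *)
            echo "note: storage driver is '$1'"
            return 1
            ;;
    esac
}

# Assumed usage inside the job, before docker build:
#   check_storage_driver "$(docker info --format '{{.Driver}}')" || true
```

The trailing `|| true` keeps the job running so the result shows up in the build log without failing the pipeline.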

config.toml:

concurrent = 1
check_interval = 0

[[runners]]
  name = "DIND runner"
  url = "https://my.gitlab.com/ci"
  token = "REDACTED"
  executor = "docker"
  [runners.docker]
    tls_verify = false
    image = "docker:latest"
    privileged = true
    disable_cache = true
    volumes = ["/cache"]
  [runners.cache]

docker info:

Containers: 16
 Running: 14
 Paused: 0
 Stopped: 2
Images: 16
Server Version: 17.03.1-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 4ab9917febca54791c5f071a9d1f404867857fcc
runc version: 54296cf40ad8143b62dbcaa1d90e520a2136ddfe
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.9.15-x86_64-linode81
Operating System: Ubuntu 16.10
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 989 MiB
Name: dev1
ID: KWTP:NAOT:CGDG:HBCS:UOVG:H3XZ:IW7C:5EJA:IZI7:ROFZ:EHHU:TQNL
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

Server provisioning script:

#!/bin/bash

# Install packages to allow apt to access a repository over https
sudo apt-get -y install \
    apt-transport-https \
    ca-certificates \
    curl \
    software-properties-common

# Add Docker's official GPG key
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

# Verify that the key fingerprint is 9DC8 5822 9FC7 DD38 854A E2D8 8D81 803C 0EBF CD88
sudo apt-key fingerprint 0EBFCD88

# Set up the stable repository
sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) \
   stable"

# Update the apt package index.
sudo apt-get update

# Latest version of Docker
sudo apt-get -y install docker-ce

# Configure the Docker storage driver with a systemd drop-in
mkdir -p /etc/systemd/system/docker.service.d
cat > /etc/systemd/system/docker.service.d/overlay2.conf <<EOF
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H fd:// --storage-driver=overlay2
EOF

# Add users to docker group
usermod -aG docker $(whoami)
usermod -aG docker $SUDO_USER

reboot

Used GitLab Runner version

Please run and paste the output of gitlab-runner --version. If you are using a Runner that you don't have access to, please paste at least the first lines from the build log, like:

Running with gitlab-ci-multi-runner 1.4.2 (bcc1794)
Using Docker executor with image golang:1.6 ...
Edited Jun 24, 2017 by djskinner