Skip to content

13.10 Job fetching performance / docker+machine usage

Summary

Starting with runner 13.10 we experience performance issues in our pipelines. The runner does no longer pick enough jobs and has too long delays between job checks.

We did not have any of these issues with 13.9, there were no configuration changes.

Steps to reproduce

  • configure docker+machine autoscaling
  • start a bunch of jobs

Actual behavior

Runner only picks around 2-3 jobs and does not use all machines it started.

Expected behavior

Pick any pending job and use all the idle machines.

Relevant logs and/or screenshots

During this log period there were around 30 idle jobs that could have been picked by this runner. The runner only had 2 jobs running, whereas there were 9 docker-machines up and running that could have been used.

The runner correctly spins up the required instances and starts fetching jobs. What we noticed is that job checking occurs about every 50 seconds, opposed to what we would expect:

job log
Mar 24 10:27:11 gitlab gitlab-runner[21612]: Checking for jobs... received                     #033[0;m  job#033[0;m=313338 repo_url#033[0;m=https://gitlab.example.com/redacted-project.git runner#033[0;m=ovfJCpH1
Mar 24 10:27:12 gitlab gitlab-runner[21612]: Using existing docker-machine                     #033[0;m  created#033[0;m=2021-03-24 10:01:18.780997449 +0100 CET m=+1.713588173 docker#033[0;m=tcp://116.203.59.132:2376 job#033[0;m=313338 name#033[0;m=runner-ovfjcph1-runner-1616571477-cef13cd7 now#033[0;m=2021-03-24 10:27:12.741892218 +0100 CET m=+1555.674482981 project#033[0;m=461 runner#033[0;m=ovfJCpH1 usedcount#033[0;m=7
Mar 24 10:27:14 gitlab gitlab-runner[21612]: Starting docker-machine build...                  #033[0;m  created#033[0;m=2021-03-24 10:01:18.780997449 +0100 CET m=+1.713588173 docker#033[0;m=tcp://116.203.59.132:2376 job#033[0;m=313338 name#033[0;m=runner-ovfjcph1-runner-1616571477-cef13cd7 now#033[0;m=2021-03-24 10:27:14.497331006 +0100 CET m=+1557.429921751 project#033[0;m=461 runner#033[0;m=ovfJCpH1 usedcount#033[0;m=7
Mar 24 10:28:04 gitlab gitlab-runner[21612]: Checking for jobs... received                     #033[0;m  job#033[0;m=313339 repo_url#033[0;m=https://gitlab.example.com/redacted-project.git runner#033[0;m=ovfJCpH1
Mar 24 10:28:05 gitlab gitlab-runner[21612]: Finished docker-machine build: <nil>              #033[0;m  created#033[0;m=2021-03-24 10:01:18.781005082 +0100 CET m=+1.713595815 docker#033[0;m=tcp://157.90.166.215:2376 job#033[0;m=313333 name#033[0;m=runner-ovfjcph1-runner-1616572126-92c36086 now#033[0;m=2021-03-24 10:28:05.157583303 +0100 CET m=+1608.090174067 project#033[0;m=461 runner#033[0;m=ovfJCpH1 usedcount#033[0;m=11
Mar 24 10:28:05 gitlab gitlab-runner[21612]: Cleaned up docker-machine                         #033[0;m  created#033[0;m=2021-03-24 10:01:18.781005082 +0100 CET m=+1.713595815 docker#033[0;m=tcp://157.90.166.215:2376 job#033[0;m=313333 name#033[0;m=runner-ovfjcph1-runner-1616572126-92c36086 now#033[0;m=2021-03-24 10:28:05.180154732 +0100 CET m=+1608.112745479 project#033[0;m=461 runner#033[0;m=ovfJCpH1 usedcount#033[0;m=11
Mar 24 10:28:05 gitlab gitlab-runner[21612]: Job succeeded                                     #033[0;m  duration#033[0;m=1m43.849102947s job#033[0;m=313333 project#033[0;m=461 runner#033[0;m=ovfJCpH1
Mar 24 10:28:05 gitlab gitlab-runner[21612]: Using existing docker-machine                     #033[0;m  created#033[0;m=2021-03-24 10:01:18.780988916 +0100 CET m=+1.713579639 docker#033[0;m=tcp://157.90.126.211:2376 job#033[0;m=313339 name#033[0;m=runner-ovfjcph1-runner-1616569366-5a0fd8f0 now#033[0;m=2021-03-24 10:28:05.271903506 +0100 CET m=+1608.204494244 project#033[0;m=461 runner#033[0;m=ovfJCpH1 usedcount#033[0;m=7
Mar 24 10:28:06 gitlab gitlab-runner[21612]: Starting docker-machine build...                  #033[0;m  created#033[0;m=2021-03-24 10:01:18.780988916 +0100 CET m=+1.713579639 docker#033[0;m=tcp://157.90.126.211:2376 job#033[0;m=313339 name#033[0;m=runner-ovfjcph1-runner-1616569366-5a0fd8f0 now#033[0;m=2021-03-24 10:28:06.985579886 +0100 CET m=+1609.918170640 project#033[0;m=461 runner#033[0;m=ovfJCpH1 usedcount#033[0;m=7
Mar 24 10:28:52 gitlab gitlab-runner[21612]: Finished docker-machine build: <nil>              #033[0;m  created#033[0;m=2021-03-24 10:01:18.781013068 +0100 CET m=+1.713603790 docker#033[0;m=tcp://157.90.174.232:2376 job#033[0;m=313331 name#033[0;m=runner-ovfjcph1-runner-1616574459-7d0883dd now#033[0;m=2021-03-24 10:28:52.351457155 +0100 CET m=+1655.284047926 project#033[0;m=461 runner#033[0;m=ovfJCpH1 usedcount#033[0;m=12
Mar 24 10:28:52 gitlab gitlab-runner[21612]: Cleaned up docker-machine                         #033[0;m  created#033[0;m=2021-03-24 10:01:18.781013068 +0100 CET m=+1.713603790 docker#033[0;m=tcp://157.90.174.232:2376 job#033[0;m=313331 name#033[0;m=runner-ovfjcph1-runner-1616574459-7d0883dd now#033[0;m=2021-03-24 10:28:52.386604577 +0100 CET m=+1655.319195329 project#033[0;m=461 runner#033[0;m=ovfJCpH1 usedcount#033[0;m=12
Mar 24 10:28:52 gitlab gitlab-runner[21612]: Job succeeded                                     #033[0;m  duration#033[0;m=3m25.597972082s job#033[0;m=313331 project#033[0;m=461 runner#033[0;m=ovfJCpH1
Mar 24 10:28:52 gitlab gitlab-runner[21612]: Finished docker-machine build: <nil>              #033[0;m  created#033[0;m=2021-03-24 10:01:18.780988916 +0100 CET m=+1.713579639 docker#033[0;m=tcp://157.90.126.211:2376 job#033[0;m=313339 name#033[0;m=runner-ovfjcph1-runner-1616569366-5a0fd8f0 now#033[0;m=2021-03-24 10:28:52.781324371 +0100 CET m=+1655.713915122 project#033[0;m=461 runner#033[0;m=ovfJCpH1 usedcount#033[0;m=7
Mar 24 10:28:54 gitlab gitlab-runner[21612]: Cleaned up docker-machine                         #033[0;m  created#033[0;m=2021-03-24 10:01:18.780988916 +0100 CET m=+1.713579639 docker#033[0;m=tcp://157.90.126.211:2376 job#033[0;m=313339 name#033[0;m=runner-ovfjcph1-runner-1616569366-5a0fd8f0 now#033[0;m=2021-03-24 10:28:54.191253434 +0100 CET m=+1657.123844158 project#033[0;m=461 runner#033[0;m=ovfJCpH1 usedcount#033[0;m=7
Mar 24 10:28:54 gitlab gitlab-runner[21612]: Job succeeded                                     #033[0;m  duration#033[0;m=49.930860599s job#033[0;m=313339 project#033[0;m=461 runner#033[0;m=ovfJCpH1
Mar 24 10:28:54 gitlab gitlab-runner[21612]: Checking for jobs... received                     #033[0;m  job#033[0;m=313332 repo_url#033[0;m=https://gitlab.example.com/redacted-project.git runner#033[0;m=ovfJCpH1
Mar 24 10:28:55 gitlab gitlab-runner[21612]: Using existing docker-machine                     #033[0;m  created#033[0;m=2021-03-24 10:01:18.781013068 +0100 CET m=+1.713603790 docker#033[0;m=tcp://157.90.174.232:2376 job#033[0;m=313332 name#033[0;m=runner-ovfjcph1-runner-1616574459-7d0883dd now#033[0;m=2021-03-24 10:28:55.343254134 +0100 CET m=+1658.275844855 project#033[0;m=461 runner#033[0;m=ovfJCpH1 usedcount#033[0;m=13
Mar 24 10:28:57 gitlab gitlab-runner[21612]: Starting docker-machine build...                  #033[0;m  created#033[0;m=2021-03-24 10:01:18.781013068 +0100 CET m=+1.713603790 docker#033[0;m=tcp://157.90.174.232:2376 job#033[0;m=313332 name#033[0;m=runner-ovfjcph1-runner-1616574459-7d0883dd now#033[0;m=2021-03-24 10:28:57.087692868 +0100 CET m=+1660.020283630 project#033[0;m=461 runner#033[0;m=ovfJCpH1 usedcount#033[0;m=13

gitlab-runner --debug run::

Checking for jobs... nothing                        runner=redacted_fPwc
Checking for jobs... nothing                        runner=redacted_3PG7
Checking for jobs... nothing                        runner=redacted_27Gn
Checking for jobs... nothing                        runner=redacted_8819
Checking for jobs... nothing                        runner=redacted_6cb5
Feeding runners to channel                          builds=2
Docker Machine Details                              creating=0 idle=9 maxMachineCreate=0 maxMachines=14 minIdleCount=5 removing=0 runner=redacted_CpH1 time=2021-03-25 11:01:15.730655082 +0100 CET m=+101.597428603 total=11 used=2
Checking for jobs... nothing                        runner=redacted_1a65
Checking for jobs... nothing                        runner=redacted_eb93
Checking for jobs... nothing                        runner=redacted_4f32
Checking for jobs... nothing                        runner=redacted_ae5e
Checking for jobs... nothing                        runner=redacted_dec0
Checking for jobs... nothing                        runner=redacted_fd17
Checking for jobs... received                       job=314792 repo_url=https://gitlab.example.com/redacted-project.git runner=redacted_CpH1
Processing chain                                    chain-leaf=[0xc000cfa580 0xc00058cc00 0xc00071f180] context=certificate-chain-build
Certificate doesn't provide parent URL: exiting the loop  Issuer=DST Root CA X3 IssuerCertURL=[] Serial=91299735575339953335919266965803778155 Subject=DST Root CA X3 context=certificate-chain-build
Failed to requeue the runner                        builds=3 runner=redacted_CpH1
Running with gitlab-runner 13.10.0 (54944146)       job=314792 project=310 runner=redacted_CpH1
  on docker@hetzner.cloud redacted_CpH1                  job=314792 project=310 runner=redacted_CpH1
Preparing the "docker+machine" executor  job=314792 project=310 runner=redacted_CpH1
Checking for jobs... nothing                        runner=redacted_aaef
Checking for jobs... nothing                        runner=redacted_eb6a
Checking for jobs... nothing                        runner=redacted_hENo
Checking for jobs... nothing                        runner=redacted_zx8u

Environment description

We have a single gitlab-runner instance that has 20 distinct runner configurations. One of these configurations is our main runner using docker+machine autoscaling:

config.toml contents
concurrent = 20
check_interval = 1

[session_server]
  session_timeout = 1800

[[runners]]
  name = "my-runner"
  limit = 14
  output_limit = 40960
  url = "https://gitlab.example.com/"
  token = "<redacted>"
  executor = "docker+machine"
  [runners.docker]
    tls_verify = false
    image = "debian:latest"
    privileged = true
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["/cache", "/var/run/docker.sock:/var/run/docker.sock"]
    shm_size = 0
  [runners.cache]
    Type = "s3"
    Shared = true
    [runners.cache.s3]
      ServerAddress = "<redacted>"
      AccessKey = "<redacted>"
      SecretKey = "<redacted>"
      BucketName = "gitlab-cache"
    [runners.cache.gcs]
  [runners.machine]
    IdleCount = 0
    IdleTime = 1800
    MaxBuilds = 100
    MachineDriver = "hetzner"
    MachineName = "runner-%s"
    MachineOptions = [
# redacted
    ]

    [[runners.machine.autoscaling]]
      Periods = ["* * 7-19 * * mon-fri *"]
      IdleCount = 5
      IdleTime = 3600
      Timezone = "Europe/Berlin"

Used GitLab Runner version

Version:      13.10.0
Git revision: 54944146
Git branch:   13-10-stable
GO version:   go1.13.8
Built:        2021-03-21T09:13:32+0000
OS/Arch:      linux/amd64
Edited by Laurens Stötzel