Docker executor intermittently fails to start with "incorrect username or password"

Summary

I have a self-hosted runner attached to a self-managed instance, and I recently started having trouble getting CI jobs to run.

Seemingly out of nowhere, Docker Hub began blocking my IP, and I consistently got "anonymous pull limit reached" (which apparently happens quite often to entire blocks of IPs). After verifying that nothing on my side was spamming pulls, I found that running docker login with my Hub account works fine.

When using the docker CLI on the server hosting the runner, pulls work without trouble after a docker login, so I figured the executor just needed a similar login. I attempted to follow the documentation for private registries (found here), applying it to Docker Hub.

I followed the steps for putting the JSON directly in the runner's config.toml (since the environment variable doesn't seem to work); however, it only works some of the time.

Steps to reproduce

  1. Define DOCKER_AUTH_CONFIG as an inline JSON in config.toml as documented.
  2. Start any job.
  3. Observe that the job intermittently fails with the error shown under "Actual behavior".
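
For reference, step 1 can be sketched roughly as follows. This is a minimal illustration rather than the exact configuration used here: it assumes the `auths` JSON layout that `docker login` writes to `~/.docker/config.json`, assumes `https://index.docker.io/v1/` as the registry key for Docker Hub, and uses placeholder credentials. The helper names `docker_auth_config` and `toml_environment_line` are made up for this example.

```python
import base64
import json


def docker_auth_config(username: str, token: str,
                       registry: str = "https://index.docker.io/v1/") -> str:
    """Return a DOCKER_AUTH_CONFIG JSON string for a single registry.

    The default registry key is an assumption for Docker Hub.
    """
    # Docker stores credentials as base64("username:password").
    auth = base64.b64encode(f"{username}:{token}".encode("utf-8")).decode("ascii")
    return json.dumps({"auths": {registry: {"auth": auth}}}, separators=(",", ":"))


def toml_environment_line(auth_json: str) -> str:
    """Render the `environment` entry for the [[runners]] section of config.toml."""
    # json.dumps on a str yields a double-quoted, backslash-escaped string,
    # which is also a valid TOML basic string for this ASCII-only payload.
    return f"environment = [{json.dumps('DOCKER_AUTH_CONFIG=' + auth_json)}]"


if __name__ == "__main__":
    # Hypothetical placeholder credentials; substitute a real Hub user and access token.
    print(toml_environment_line(docker_auth_config("example-user", "example-token")))
```

The printed line has the same shape as the `environment = ["DOCKER_AUTH_CONFIG=[redacted]"]` entry in the config.toml shown under "Environment description", with the inner quotes escaped so the JSON survives TOML parsing.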

Actual behavior

Every few jobs (roughly 1 in 4), the executor fails to start with this message:

WARNING: Failed to pull image with policy "always": Error response from daemon: Head "https://registry-1.docker.io/v2/library/node/manifests/22": unauthorized: incorrect username or password (manager.go:254:0s)
ERROR: Job failed: failed to pull image "node:22" with specified policies [always]: Error response from daemon: Head "https://registry-1.docker.io/v2/library/node/manifests/22": unauthorized: incorrect username or password (manager.go:254:0s)

Retrying the job usually works; sometimes it takes a few retries, but the job eventually starts.

Recent examples:

All of these jobs ran successfully on the same runner, except the two marked in red, which succeeded only after I clicked "retry":

(screenshot: job list with the two retried jobs marked in red)

The job at the top-most red mark actually took several retries, almost as if Docker Hub temporarily blocked pulls for ~10 seconds, but it eventually succeeded:

(screenshot)

Expected behavior

Job starts normally.

Relevant logs and/or screenshots

An example job log:
Running with gitlab-runner 17.10.0 (67b2b2db)
  on shared [MASKED], system ID: [MASKED]
Preparing the "docker" executor
Using Docker executor with image node:22 ...
Authenticating with credentials from $DOCKER_AUTH_CONFIG
Pulling docker image node:22 ...
WARNING: Failed to pull image with policy "always": Error response from daemon: Head "https://registry-1.docker.io/v2/library/node/manifests/22": unauthorized: incorrect username or password (manager.go:254:0s)
ERROR: Job failed: failed to pull image "node:22" with specified policies [always]: Error response from daemon: Head "https://registry-1.docker.io/v2/library/node/manifests/22": unauthorized: incorrect username or password (manager.go:254:0s)
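
To put a number on the "1 in 4" estimate, a small script along these lines could scan saved job logs for the failure signature above. It is only a sketch: it assumes the logs have already been saved as plain-text strings (how they are fetched is left out), and `is_auth_failure` and `failure_rate` are hypothetical helper names.

```python
import re

# Signature of the failure quoted in the job log above.
AUTH_FAILURE = re.compile(r"unauthorized: incorrect username or password")


def is_auth_failure(job_log: str) -> bool:
    """True if the log has a 'Job failed' line caused by the registry auth error."""
    return any(
        line.startswith("ERROR: Job failed") and AUTH_FAILURE.search(line)
        for line in job_log.splitlines()
    )


def failure_rate(job_logs: list[str]) -> float:
    """Fraction of job logs that failed with the auth error (0.0 if there are none)."""
    if not job_logs:
        return 0.0
    return sum(is_auth_failure(log) for log in job_logs) / len(job_logs)
```

Only the final `ERROR: Job failed` line is counted, so a job whose pull emitted the warning but did not fail for this reason is not treated as a failure.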

Environment description

`docker info`
Client: Docker Engine - Community
 Version:    28.0.4
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.22.0
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.34.0
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

config.toml contents
concurrent = 1
check_interval = 0
shutdown_timeout = 0

[session_server]
  session_timeout = 1800

[[runners]]
  name = "shared"
  url = "[redacted]"
  id = 6
  token = "[redacted]"
  token_obtained_at = [redacted]
  token_expires_at = 0001-01-01T00:00:00Z
  executor = "docker"
  environment = ["DOCKER_AUTH_CONFIG=[redacted]"]
  [runners.custom_build_dir]
  [runners.cache]
    MaxUploadedArchiveSize = 0
    [runners.cache.s3]
    [runners.cache.gcs]
    [runners.cache.azure]
  [runners.docker]
    tls_verify = false
    image = "ubuntu:latest"
    privileged = false
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["/cache"]
    shm_size = 0

Used GitLab Runner version

Running with gitlab-runner 17.10.0 (67b2b2db)
Using Docker executor with image node:22 ...

Possible fixes

Not sure, as the problem appears to be coming from within the executor itself.

Edit: Added some example screenshots.

Edited by Matthew Struble