
Windows gitlab-runner-helper image fails due to invalid volume specification for `/opt/step-runner` path.

Summary

When using the docker-autoscaler executor with Windows EC2 hosts running Windows containers in AWS, the job fails with an invalid volume specification error during the Preparing environment stage.

I believe this relates to !5322 (merged)

Steps to reproduce

  1. Set up a Linux EC2 runner-manager with gitlab-runner v17.9.0.
  2. Configure an Auto Scaling Group (ASG) with a Windows_Server-2019-English-Full-ECS_Optimized-2025.01.15 AMI with container support.
  3. Try to run a job using the docker-autoscaler executor.

Any CI job that uses that runner will fail during the Preparing environment stage.

.gitlab-ci.yml
test_docker_executor:
  stage: .pre
  script:
    - echo "This is a test"
    - "dir env:"
  tags:
    - windows10-docker-executor

Actual behavior

The job fails during the Preparing environment stage.

Expected behavior

The environment is prepared successfully and the rest of the job runs.

Relevant logs and/or screenshots

job log
ERROR: Job failed (system failure): prepare environment: Error response from daemon: invalid volume specification: 'runner-t2dxlw-project-6858-concurrent-0-1c120e7dd2e806ef-cache-df4a2b79b4bdda066e51ec4c84673e4d:/opt/step-runner' (docker.go:642:0s). Check https://docs.gitlab.com/runner/shells/index.html#shell-profile-loading for more information
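
The destination in the rejected spec, `/opt/step-runner`, is a Linux-style path with no drive letter, which is the likely reason a Windows daemon refuses it. The sketch below is a simplified approximation of that check, not Docker's actual parser. The function names and the drive-letter rule are assumptions made for illustration, and the split only handles a named volume as the source (no colons in the source part).

```python
import re
from typing import Tuple

# Assumption: a Windows container destination must start with a drive letter,
# for example c:\cache or c:/cache. Docker's real validation is more involved.
_WINDOWS_ABS = re.compile(r"^[A-Za-z]:[\\/]")


def split_named_volume(spec: str) -> Tuple[str, str]:
    """Split 'volume-name:destination' at the first colon.

    Only valid when the source is a named volume, which contains no colons.
    """
    source, sep, destination = spec.partition(":")
    if not sep or not source or not destination:
        raise ValueError(f"not a 'name:destination' spec: {spec!r}")
    return source, destination


def is_windows_destination(path: str) -> bool:
    """Return True if the path looks like a drive-letter absolute path."""
    return bool(_WINDOWS_ABS.match(path))


if __name__ == "__main__":
    # Volume spec copied from the job log above.
    spec = (
        "runner-t2dxlw-project-6858-concurrent-0-1c120e7dd2e806ef-"
        "cache-df4a2b79b4bdda066e51ec4c84673e4d:/opt/step-runner"
    )
    _, destination = split_named_volume(spec)
    print(destination, is_windows_destination(destination))
    # The cache volume from config.toml passes the same check.
    print("c:\\cache", is_windows_destination("c:\\cache"))
```

Under these assumptions, the `c:\\cache` volume from `config.toml` passes the check while the `/opt/step-runner` destination does not, which matches the volume named in the error.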

Environment description

  • This is in a custom, self-hosted installation.
  • The docker-autoscaler executor is being used.
  • Here is the docker info from the Windows host:
`docker info`
Client:
 Version:    25.0.6.m
 Context:    default
 Debug Mode: false

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 8
 Server Version: 25.0.6
 Storage Driver: windowsfilter
  Windows:
 Logging Driver: json-file
 Plugins:
  Volume: local
  Network: ics internal l2bridge l2tunnel nat null overlay private transparent
  Log: awslogs etwlogs fluentd gcplogs gelf json-file local splunk syslog
 Swarm: inactive
 Default Isolation: process
 Kernel Version: 10.0 17763 (17763.1.amd64fre.rs5_release.180914-1434)
 Operating System: Microsoft Windows Server Version 1809 (OS Build 17763.6775)
 OSType: windows
 Architecture: x86_64
 CPUs: 2
 Total Memory: 3.941GiB
 Name: EC2AMAZ-2LVAN60
 ID: f80481b3-966b-4391-bc23-33bd01636186
 Docker Root Dir: C:\ProgramData\docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
config.toml contents
concurrent = 10
check_interval = 0
connection_max_age = "15m0s"
shutdown_timeout = 0

[session_server]
  session_timeout = 1800

[[runners]]
  name = "<redacted>"
  url = "<redacted>"
  id = 123456
  token = "<redacted>"
  token_obtained_at = 2025-02-25T18:03:55Z
  token_expires_at = 0001-01-01T00:00:00Z
  executor = "docker-autoscaler"
  environment = ["FF_USE_POWERSHELL_PATH_RESOLVER=1"]
  shell = "powershell"
  [runners.custom_build_dir]
  [runners.cache]
    MaxUploadedArchiveSize = 0
    [runners.cache.s3]
    [runners.cache.gcs]
    [runners.cache.azure]
  [runners.docker]
    tls_verify = false
    image = "mcr.microsoft.com/windows/servercore:ltsc2019"
    privileged = false
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["c:\\cache"]
    shm_size = 0
    helper_image = "registry.gitlab.com/gitlab-org/gitlab-runner/gitlab-runner-helper:x86_64-v17.8.3-servercore1809"
    network_mtu = 0
  [runners.autoscaler]
    capacity_per_instance = 2
    max_use_count = 0
    max_instances = 5
    plugin = "aws"
    update_interval = "0s"
    update_interval_when_expecting = "0s"
    [runners.autoscaler.plugin_config]
      config_file = "/home/gitlab-runner/.aws/config"
      name = "asg-name-is-here"
      profile = "default"
    [runners.autoscaler.connector_config]
      username = "Administrator"
      key_path = "/home/gitlab-runner/.ssh/windows-executor-key.pem"
      keepalive = "0s"
      timeout = "0s"

    [[runners.autoscaler.policy]]
      idle_count = 1
      idle_time = "20m0s"
      scale_factor = 0.0
      scale_factor_limit = 0

Used GitLab Runner version

Running with gitlab-runner 17.9.0 (c4cbe9dd)

Possible fixes

Reverting to 17.8.3 seems to work.

I'm pretty sure it is related to this MR: !5322 (merged)

This is the only place I can find the offending path in the code: https://gitlab.com/gitlab-org/gitlab-runner/-/blame/main/executors/docker/steps.go#L18
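
One possible direction, sketched under stated assumptions: pick the step-runner mount destination from the daemon's `OSType` (reported as `windows` in the `docker info` output above) instead of always using `/opt/step-runner`. The Windows path `c:\opt\step-runner` and the function name are hypothetical, and the sketch is written in Python for brevity; it is not the runner's actual Go implementation.

```python
# Path taken from the error message in the job log.
LINUX_STEP_RUNNER_PATH = "/opt/step-runner"
# Hypothetical Windows equivalent, chosen only for illustration.
WINDOWS_STEP_RUNNER_PATH = "c:\\opt\\step-runner"


def step_runner_destination(os_type: str) -> str:
    """Return a mount destination that suits the daemon's OSType."""
    if os_type.strip().lower() == "windows":
        return WINDOWS_STEP_RUNNER_PATH
    return LINUX_STEP_RUNNER_PATH


if __name__ == "__main__":
    for os_type in ("windows", "linux"):
        print(os_type, "->", step_runner_destination(os_type))
```

Whether the step-runner volume should be mounted at all for Windows helper images is a separate question that this sketch does not answer.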