Intermittent job failures ("error during connect") with docker-autoscaler and AWS fleeting plugin

Summary

We are trying out the experimental fleeting support on AWS with GitLab Runner 16.5. Quite often (in more than one job out of ten), a job fails in the middle of the script with an "error during connect" error.

Steps to reproduce

It happens seemingly randomly, across all the jobs in all the repositories. I haven't been able to see any pattern yet.

Actual behavior

Jobs randomly fail, with nearly identical error messages (see below).

Expected behavior

The jobs don't fail :)

Relevant logs and/or screenshots

section_end:1699277419:step_script
section_start:1699277419:cleanup_file_variables
Cleaning up project directory and file based variables
WARNING: Failed to inspect predefined container 01173e3c3a77515196f1f9726205e28878c4ceafe01c687e1fce2bafcc7b4c63 error during connect: Get "http://internel.tunnel.invalid/v1.43/containers/01173e3c3a77515196f1f9726205e28878c4ceafe01c687e1fce2bafcc7b4c63/json": dialing environment connection: EOF (docker_command.go:134:0s)
Using helper image:  registry.gitlab.com/gitlab-org/gitlab-runner/gitlab-runner-helper:x86_64-853330f9
Pulling docker image registry.gitlab.com/gitlab-org/gitlab-runner/gitlab-runner-helper:x86_64-853330f9 ...
WARNING: Failed to pull image with policy "always": error during connect: Post "http://internel.tunnel.invalid/v1.43/images/create?fromImage=registry.gitlab.com%2Fgitlab-org%2Fgitlab-runner%2Fgitlab-runner-helper&tag=x86_64-853330f9": dialing environment connection: EOF (manager.go:237:0s)
section_end:1699277419:cleanup_file_variables
ERROR: Failed to cleanup volumes
ERROR: Job failed (system failure): error during connect: Post "http://internel.tunnel.invalid/v1.43/containers/748a7e651c94ecf42d894f0e8b6bbe5f1d3e45b7e4deea36bff3a47a12719c59/wait?condition=not-running": dialing environment connection: EOF
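
Since every failing job carries the same "dialing environment connection: EOF" text, a small script can help pick the affected jobs out of saved job traces while looking for a pattern. The following is only a minimal sketch in Python: the function name `connect_failures` is hypothetical, the only string taken from the log above is the signature itself, and any colour or erase escape-sequence residue in the trace is stripped before matching.

```python
import re

# Signature copied from the failing log lines; everything else is illustrative.
SIGNATURE = "dialing environment connection: EOF"

# ANSI colour/erase sequences such as "[0K" or "[31;1m". The ESC byte is
# optional because copied traces often lose it.
ANSI = re.compile(r"\x1b?\[[0-9;]*[Km]")


def connect_failures(trace: str) -> list[str]:
    """Return the cleaned trace lines that contain the connection-EOF signature."""
    hits = []
    for line in trace.splitlines():
        clean = ANSI.sub("", line).strip()
        if SIGNATURE in clean:
            hits.append(clean)
    return hits
```

Calling `connect_failures(text)` on the contents of a downloaded trace returns the matching lines, so an empty list means the job did not hit this particular error.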

Environment description

docker info output
Client:
 Version:    24.0.5
 Context:    default
 Debug Mode: false

Server:
 Containers: 1
  Running: 1
  Paused: 0
  Stopped: 0
 Images: 1
 Server Version: 24.0.5
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version:
 runc version:
 init version:
 Security Options:
  apparmor
  seccomp
   Profile: builtin
 Kernel Version: 5.4.0-166-generic
 Operating System: Ubuntu 20.04.6 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 1
 Total Memory: 1.925GiB
 Name: trevor
 ID: 7ce1107e-baff-4b96-a3c6-846b937a96de
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Registry Mirrors:
  https://docker-mirror.active-group.de/
 Live Restore Enabled: false

WARNING: No swap limit support

config.toml contents
concurrent = 20
check_interval = 2
shutdown_timeout = 0

[session_server]
  session_timeout = 1800

[[runners]]
  name = "ec2-docker-autoscaler"
  limit = 10
  url = "https://gitlab.active-group.de"
  id = 16
  token = "FOOBAR"
  token_obtained_at = 2023-07-07T14:14:03Z
  token_expires_at = 0001-01-01T00:00:00Z
  executor = "docker-autoscaler"
  environment = ["DOCKER_TLS_CERTDIR="]
  [runners.docker]
    tls_verify = false
    image = "alpine:latest"
    privileged = true
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = true
    shm_size = 0
  [runners.autoscaler]
    capacity_per_instance = 1
    max_use_count = 1
    max_instances = 10
    plugin = "/etc/gitlab-runner/fleeting-plugin-aws-linux-amd64"
    [runners.autoscaler.plugin_config]
      credentials_file = "/etc/gitlab-runner/aws-credentials"
      config_file = "/etc/gitlab-runner/aws-config"
      name = "gitlab-autoscaling-group"
    [runners.autoscaler.connector_config]
      username = "ubuntu"
      use_static_credentials = false
      use_external_addr = true

Used GitLab Runner version

Version:      16.5.0
Git revision: 853330f9
Git branch:   16-5-stable
GO version:   go1.20.10
Built:        2023-10-20T15:57:21+0000
OS/Arch:      linux/amd64