Build job fails with exit code 4294967295 (Windows docker executor)

Summary

Build jobs intermittently fail with exit code 4294967295.

  • It does not happen every time, but it is frequent (roughly 1 in 3 builds).
  • The failure occurs at a different point in the build each time, which is why I have provided two build logs as examples.
  • It has happened with different Windows Docker images (python, mcr.microsoft.com/dotnet/framework/sdk:3.5).
  • The Windows event log shows a problem with the user name/password used in the container.

Steps to reproduce

.gitlab-ci.yml
stages:
  - build
  - test
  - deploy

build-app:
  image: mcr.microsoft.com/dotnet/framework/sdk:3.5
  stage: build
  script:
    - ***
    - "msbuild /restore /p:Configuration=Release /p:BuildNumber=$CI_PIPELINE_IID"
  coverage: '/= \[\["Total",\d+\.?\d+/'
  tags:
    - DOCKER
  artifacts:
    paths:
      - build/***
      - testresults/coverage/
    expire_in: 3d

build-docs:
  image: python
  stage: test
  script:
    - pip install -i *** -r doc/technical/requirements.txt
    - cd doc/technical/
    - mkdocs build
  artifacts:
    expire_in: 3h
    paths:
      - doc/technical/generated_docs/
  tags:
    - DOCKER

pages:
  stage: deploy
  script:
    - ***
  artifacts:
    paths:
    - public
  tags:
    - DOCKER
  only:
    - basic

Actual behavior

The job fails with the exit code 4294967295.
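
For context, 4294967295 is 0xFFFFFFFF, the largest unsigned 32-bit value, which is -1 when read as a signed 32-bit integer. The short Python sketch below shows the conversion; the helper name `as_signed_32` is made up for illustration. Reading this as "the process reported -1" is my assumption about how the value arises, not something the runner log confirms.

```python
def as_signed_32(code: int) -> int:
    """Reinterpret an unsigned 32-bit exit code as a signed 32-bit integer."""
    code &= 0xFFFFFFFF                      # keep only the low 32 bits
    return code - (1 << 32) if code >= (1 << 31) else code


exit_code = 4294967295                      # value reported in the job log
print(hex(exit_code))                       # hexadecimal form of the exit code
print(as_signed_32(exit_code))              # same bits, read as a signed value
```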

Expected behavior

The build should succeed.

Relevant logs and/or screenshots

First example log of a job which failed
Running with gitlab-runner 12.9.0 (4c96e5ad)
   on ***
Preparing the "docker-windows" executor
00:02
 Using Docker executor with image python ...
 Pulling docker image python ...
 Using docker image sha256:8becec955a226b561eb0a5f6cce1d489225d98e70484539611bb3a674784b15b for python ...
Preparing environment
00:05
 Running on RUNNER-R5CFEJCI via 
 *** ...
Getting source from Git repository
00:08
 Fetching changes...
 Reinitialized existing Git repository in c:/builds/***/.git/
 Checking out 0ff746a6 as development...
 ...
 git-lfs/2.7.1 (GitHub; windows amd64; go 1.11.5; git 6b7fb6e3)
 Skipping Git submodules setup
Restoring cache
00:04
Downloading artifacts
00:22
 Version:      12.9.0
 Git revision: 4c96e5ad
 Git branch:   12-9-stable
 GO version:   go1.13.8
 Built:        2020-03-20T12:20:41+0000
 OS/Arch:      windows/amd64
 Downloading artifacts for build-app (24058096)...
 WARNING: Failed to load system CertPool: crypto/x509: system root pool is not available on Windows 
Uploading artifacts for failed job
00:05
 ERROR: Job failed: exit code 4294967295
Second example log of a job which failed
 Running with gitlab-runner 12.9.0 (4c96e5ad)
   on KHE VLAN - Windows Docker R5cFejci
Preparing the "docker-windows" executor
00:01
 Using Docker executor with image mcr.microsoft.com/dotnet/framework/sdk:3.5 ...
 Pulling docker image mcr.microsoft.com/dotnet/framework/sdk:3.5 ...
 Using docker image sha256:db864339786b5596083bd33a2305c9780052cf9ef57cd12cf67326fb703159c0 for mcr.microsoft.com/dotnet/framework/sdk:3.5 ...
Preparing environment
00:06
 Running on RUNNER-R5CFEJCI via 
 ***...
Getting source from Git repository
00:07
 Fetching changes...
 Reinitialized existing Git repository in c:/builds/***/.git/
 From ***
  * [new ref]         refs/pipelines/5271159 -> refs/pipelines/5271159
  + 8718ff5...782add3 development            -> origin/development  (forced update)
 Checking out 782add36 as development...
 ***
 git-lfs/2.7.1 (GitHub; windows amd64; go 1.11.5; git 6b7fb6e3)
 Skipping Git submodules setup
Restoring cache
00:06
Downloading artifacts
00:04
Running before_script and script
07:49
 $ msbuild -t:restore ***
 Microsoft (R) Build Engine version 16.5.0+d4cbfca49 for .NET Framework
 Copyright (C) Microsoft Corporation. All rights reserved.
 Build started 4/7/2020 9:24:19 PM.
 Project "C:\builds\***" on node 1 (Restore target(s)).
 Restore:
   ***
   Installing Castle.Core 3.3.0.
   Installing RestSharp 106.6.10.
   Installing TestStack.White 0.13.3.
Uploading artifacts for failed job
00:06
 ERROR: Job failed: exit code 4294967295
The Windows event log
Log Name:      Application
Source:        docker
Date:          08.04.2020 15:35:20
Event ID:      11
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      ***
Description:
exec's CreateProcess() failed [module=libcontainerd namespace=moby container=884856b814e3ff90ceb4f8aa16bbd3237610dfff868f790987a4bc2860ffdb8e exec=890c2275caf85218f144a0f28142f5f7f9e9f7b85ff6d799bd7eea52d34c72d0 error=container 884856b814e3ff90ceb4f8aa16bbd3237610dfff868f790987a4bc2860ffdb8e encountered an error during hcsshim::System::CreateProcess: failure in a Windows system call: The user name or password is incorrect. (0x52e) extra info: {"CommandLine":"cmd.exe /C \"ECHO 10.116.42.70    host.docker.internal \u003e\u003e %systemroot%\\system32\\drivers\\etc\\hosts \u0026 ECHO 10.116.42.70    gateway.docker.internal \u003e\u003e %systemroot%\\system32\\drivers\\etc\\hosts\"","User":"Administrator","WorkingDirectory":"/", [***]
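
To make the event easier to read, the sketch below decodes the JSON-escaped "CommandLine" value copied from the entry above and converts the hexadecimal error code to decimal. 0x52e is 1326, the Win32 code ERROR_LOGON_FAILURE, which matches the message text. The variable names are my own. The decoded command appends host.docker.internal entries to the hosts file, so the failing exec does not appear to come from the job script itself; that is my reading of the log, not a confirmed diagnosis.

```python
import json

# "CommandLine" value from the event, kept as the JSON string literal it was logged as.
raw_command_line = (
    r'"cmd.exe /C \"ECHO 10.116.42.70    host.docker.internal \u003e\u003e '
    r'%systemroot%\\system32\\drivers\\etc\\hosts \u0026 '
    r'ECHO 10.116.42.70    gateway.docker.internal \u003e\u003e '
    r'%systemroot%\\system32\\drivers\\etc\\hosts\""'
)

# json.loads resolves the \" and \\ escapes and the \u003e (>) and \u0026 (&) sequences.
command_line = json.loads(raw_command_line)
print(command_line)

# Error code from the event message, converted from hexadecimal to decimal.
error_code = int("0x52e", 16)
print(error_code)
```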

Environment description

GitLab Runner is running on a Windows VM with Docker Desktop installed.

docker info
Client:
 Debug Mode: false
 
Server:
 Containers: 2
  Running: 0
  Paused: 0
  Stopped: 2
 Images: 6
 Server Version: 19.03.8
 Storage Driver: windowsfilter
  Windows:
 Logging Driver: json-file
 Plugins:
  Volume: local
  Network: ics internal l2bridge l2tunnel nat null overlay private transparent
  Log: awslogs etwlogs fluentd gcplogs gelf json-file local logentries splunk syslog
 Swarm: inactive
 Default Isolation: process
 Kernel Version: 10.0 17763 (17763.1.amd64fre.rs5_release.180914-1434)
 Operating System: Windows Server 2019 Standard Version 1809 (OS Build 17763.1098)
 OSType: windows
 Architecture: x86_64
 CPUs: 2
 Total Memory: 6GiB
 Name: ***
 ID: ***
 Docker Root Dir: C:\ProgramData\Docker
 Debug Mode: true
  File Descriptors: -1
  Goroutines: 28
  System Time: 2020-04-08T16:50:20.6359141+02:00
  EventsListeners: 1
 HTTP Proxy: ***
 HTTPS Proxy: ***
 No Proxy: ***
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
 Product License: Community Engine
type config.toml
concurrent = 1
check_interval = 0
 
[session_server]
  session_timeout = 1800
 
[[runners]]
  name = "***"
  url = "***"
  token = "***"
  executor = "docker-windows"
  [runners.custom_build_dir]
  [runners.docker]
    tls_verify = false
    image = "mcr.microsoft.com/windows/servercore:1809"
    privileged = false
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["c:\\cache"]
    shm_size = 0
  [runners.cache]
    [runners.cache.s3]
    [runners.cache.gcs]
  [runners.custom]
    run_exec = ""

Used GitLab Runner version

Version:      12.9.0
Git revision: 4c96e5ad
Git branch:   12-9-stable
GO version:   go1.13.8
Built:        2020-03-20T13:02:39+0000
OS/Arch:      windows/amd64

Possible fixes

I have never seen this failure while I was logged in (via RDP) with the Administrator account on the Windows Server machine where GitLab Runner is installed. This may help to identify the root cause of the problem.