Sporadic Error in Build Jobs: "error: could not write config file [...] *.tmp\git-template\config: Permission denied"
Summary
Our build jobs sporadically fail with a "Permission denied" error. It happens immediately, before our own script starts, during the git fetch operation (see log below).
We have 12 runners, all with exactly the same Windows VM setup, which are reset weekly to a point in time long before this issue first appeared. There is no pattern as to which runner is affected or when; the failures appear to be completely random. It happened a few times in 2023, but since the end of January it has been happening multiple times a day. There seems to be no correlation between the issue and changes to our infrastructure (e.g. GitLab/Runner updates): we updated our GitLab and Runners on Jan 14, and the failures became a daily occurrence on Jan 29.
Because the failure occurs before the pre-build cleanup runs, the failed jobs leave outdated artifacts in place. Subsequent jobs then use these artifacts, which leads to incorrect data being sent to our other systems.
A retry usually helps to "fix" the issue, but we lose a lot of time, as our developers usually just start pipelines and then come back a while later, only to find out that they failed.
I found some other posts with a similar error, but they all seemed to be "could not lock" instead of "could not write" and they were having this problem every single time, instead of just sporadically like in our case.
We're running out of ideas and the current plan is to build a workaround that somehow detects this type of failure and automatically restarts the job...
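One low-effort variant of such a workaround might be GitLab's built-in retry settings rather than custom failure detection. The fragment below is a minimal sketch, assuming the runner's `GET_SOURCES_ATTEMPTS` variable and the job-level `retry` keyword behave as documented; neither has been tested against this specific error, and the values 3 and 2 are arbitrary.

```yaml
# Sketch only: untested against the "could not write config file" failure.
ci_build_job:
  variables:
    # Ask the runner to attempt the "Getting source" step up to 3 times
    # before giving up on the job.
    GET_SOURCES_ATTEMPTS: "3"
  # If the job still fails, let GitLab re-run it automatically
  # (up to 2 additional attempts).
  retry:
    max: 2
```

Without a `when:` filter, `retry` also re-runs jobs that fail for genuine build reasons, so it hides the symptom rather than fixing the cause. It also does not prevent a failed attempt from uploading stale artifacts, since the job uses `when: always`.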
The jobs run under a Windows gMSA account.
Relevant logs and/or screenshots
The Job Log looks like this:
Running with gitlab-runner 16.8.1 (a6097117)
on runner-VM-FX-GLR-6 %%: , system ID: %%
Preparing the "shell" executor 00:00
Using Shell (powershell) executor...
Preparing environment 00:00
Running on VM-FX-GLR-6...
Getting source from Git repository 00:01
Fetching changes...
error: could not write config file C:\Gitlab-Runner\builds\%%\repo.tmp\git-template\config: Permission denied
Uploading artifacts for failed job 00:05
Version: 16.8.1
Git revision: a6097117
Git branch: 16-8-stable
GO version: go1.21.5
Built: 2024-02-15T18:34:46+0000
OS/Arch: windows/amd64
Uploading artifacts...
Runtime platform arch=amd64 os=windows pid=8316 revision=a6097117 version=16.8.1
%%: found 4 matching artifact files and directories
[...]
%%: found 1 matching artifact files and directories
Uploading artifacts as "archive" to coordinator... 201 Created id=921858 responseStatus=201 Created token=64_Csaoe
Cleaning up project directory and file based variables 00:01
ERROR: Job failed: exit status 4
Config for the build job: (The runners are tagged to only run this job)
ci_build_job:
  stage: cibuild
  allow_failure:
    exit_codes: %%
  script:
    - [...]
  artifacts:
    paths:
      - [...]
    when: always
    expire_in: 1 week
  tags:
    - %%
  rules:
    - if: '$CI_PIPELINE_SOURCE == "push"'
      when: never
    - if: '$CI_MERGE_REQUEST_LABELS =~ /^.*p:Hotfix.*/'
      when: never
    - if: '$CI_MERGE_REQUEST_TITLE =~ /^Draft.*/'
      when: manual
    - when: on_success
Runner config:
concurrent = 1
check_interval = 0

[session_server]
  session_timeout = 1800

[[runners]]
  name = "runner-VM-FX-GLR-1"
  url = "%%%"
  token = "%%%"
  executor = "shell"
  shell = "powershell"
  [runners.custom_build_dir]
  [runners.cache]
    [runners.cache.s3]
    [runners.cache.gcs]
GitLab Versions
GitLab 16.8.4
GitLab Runner 16.8.1