CI_PROJECT_DIR not cleared; dangling symlinks from an old build cause `git init` to fail
Summary
Starting with gitlab-runner 11.9.*, all our builds fail before even fetching the repository.
After some troubleshooting, we found that git init fails because it tries to follow a dangling symlink within CI_PROJECT_DIR left over from a previous build.
The failure occurs without any output showing which command was executed. The symlink is dangling because this part of the job runs in gitlab-runner-helper, where the link target does not exist. The helper is completely hidden from the end user, which makes troubleshooting difficult because the failure is easily mistaken for a bug in the project's .gitlab-ci.yml.
Steps to reproduce
Run a job that creates a symlink in the project directory pointing to a folder that does not exist on gitlab-runner-helper (e.g. /go, as in our case).
The next job in that project executed on that runner will fail.
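To make the precondition concrete, here is a minimal sketch of what a dangling symlink looks like and how one might list such links in a directory. It does not invoke gitlab-runner or git; the temporary directory stands in for CI_PROJECT_DIR, and the names go-link and does-not-exist are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Throwaway directory standing in for CI_PROJECT_DIR (assumption for this sketch).
project_dir=$(mktemp -d)

# Simulate the leftover from a previous build: a symlink whose target
# does not exist in the current environment (hypothetical names).
ln -s "$project_dir/does-not-exist" "$project_dir/go-link"

# List symlinks whose target cannot be resolved.
# `test -e` follows the link, so it fails for dangling links.
dangling=$(find "$project_dir" -type l ! -exec test -e {} \; -print)
echo "$dangling"

# Clean up the throwaway directory.
rm -rf "$project_dir"
```

Running a similar find against the real project directory on the runner may help confirm whether a leftover link is present before the next job starts.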
Example Project
n/a
What is the current bug behavior?
Running with gitlab-runner 11.9.2 (fa86510e)
on server ffa325b0
Using Docker executor with image golang:latest ...
Pulling docker image golang:latest ...
Using docker image sha256:83e8267be041b3ddf6a5792c7e464528408f75c446745642db08cfe4e8d58d18 for golang:latest ...
Running on runner-ffa325b0-project-1-concurrent-0 via 0cf373a18e09...
fatal: Invalid path '/go': No such file or directory
ERROR: Job failed: exit code 1
With pre_clone_script = "set -v":
Running with gitlab-runner 11.9.2 (fa86510e)
on server ffa325b0
Using Docker executor with image golang:latest ...
Pulling docker image golang:latest ...
Using docker image sha256:83e8267be041b3ddf6a5792c7e464528408f75c446745642db08cfe4e8d58d18 for golang:latest ...
Running on runner-ffa325b0-project-1-concurrent-0 via 0cf373a18e09...
$ set -v
export GIT_LFS_SKIP_SMUDGE=1
$'mkdir' "-p" "/builds/<project>.tmp/git-template"
$'git' "config" "-f" "/builds/<project>.tmp/git-template/config" "fetch.recurseSubmodules" "false"
$'git' "init" "/builds/<project>" "--template" "/builds/<project>.tmp/git-template"
fatal: Invalid path '/go': No such file or directory
ERROR: Job failed: exit code 1
What is the expected correct behavior?
No error; the change set should be checked out and the job should start.
Relevant logs and/or screenshots
See above.
Output of checks
Tested and fails on 11.9.0, 11.9.1, 11.9.2
Results of GitLab environment info
n/a
Results of GitLab application Check
n/a
Possible fixes
Remove the project directory in a pre_clone_script, e.g.:
...
[[runners]]
...
pre_clone_script = "set -v; rm -rfv $CI_PROJECT_DIR; mkdir -pv $CI_PROJECT_DIR"
...
It would have helped troubleshooting a lot if the "implicit" and "hidden" commands run as part of the CI job were written to the terminal as well (perhaps collapsed, as TravisCI does). Until that is in place, I would recommend adding set -v to each such script.