Runner leaving lock files in refs/remotes if job cancelled during checkout

Summary

The runner leaves lock files under .git/refs/remotes/... if a job is cancelled mid-checkout. I've seen that lock files in the root of the git directory are cleaned up, but no attempt is made to clean out lock files under refs/remotes/. I'd submit a patch for this, but I don't understand the structure of the refs directory well enough to go wandering around deleting lock files.

Fetching changes...
Reinitialized existing Git repository in /builds/foo/foo/.git/
From https://repo.example.com/foo/foo
 - [deleted]             (none)     -> origin/test_branch
 * [new ref]             refs/pipelines/202942 -> refs/pipelines/202942
error: cannot lock ref 'refs/remotes/origin/master': Unable to create '/builds/foo/foo/.git/refs/remotes/origin/master.lock': File exists.

Steps to reproduce

A large repository makes this easier to trigger:

  • Run a job.
  • Cancel it mid-checkout.
  • Run a second job on the same runner.

What the job actually does is unimportant.
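
For anyone who wants to see the failure without a runner, the steps above can be approximated locally. This is only a sketch: the function name simulate_stale_ref_lock, the scratch directory layout, and the demo identity are made up for illustration, and the lock file is created by hand as a stand-in for what a cancelled fetch leaves behind.

```shell
#!/bin/sh
# Local simulation (an assumption, not the runner's actual code path):
# a leftover ref lock makes the next fetch fail.
simulate_stale_ref_lock() {
    work="$1"    # empty scratch directory supplied by the caller

    # Throwaway "upstream" repository with a single commit.
    git init -q "$work/upstream" || return 2
    git -C "$work/upstream" -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "first" || return 2
    branch=$(git -C "$work/upstream" symbolic-ref --short HEAD) || return 2

    git clone -q "$work/upstream" "$work/clone" || return 2

    # New upstream commit, so the next fetch has to update origin/$branch.
    git -C "$work/upstream" -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "second" || return 2

    # Hand-made stand-in for the lock file an interrupted fetch leaves behind.
    mkdir -p "$work/clone/.git/refs/remotes/origin"
    : > "$work/clone/.git/refs/remotes/origin/$branch.lock"

    # Should fail with "cannot lock ref ... File exists"; the function
    # returns the exit status of this fetch.
    git -C "$work/clone" fetch origin
}
```

Calling simulate_stale_ref_lock "$(mktemp -d)" should end with a failing fetch; deleting the .lock file by hand and fetching again should succeed.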

Actual behavior

The error message shown above appears in the job log and the job fails.

Expected behavior

The runner should clean up all stale lock files before running git fetch.
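
As a rough idea of what that cleanup could look like, here is a minimal sketch. The function clean_stale_ref_locks is hypothetical and not part of gitlab-runner; it assumes no other git process is using the repository at the time, and it only looks under refs because git does not allow ref names to end in .lock, so any such file there should be a lock file.

```shell
#!/bin/sh
# Hypothetical helper (not part of gitlab-runner): remove leftover *.lock
# files under a repository's .git/refs directory.
# Assumes no other git process is currently working in the repository.
clean_stale_ref_locks() {
    refs_dir="$1/.git/refs"

    # Nothing to do if the build directory has no repository yet.
    [ -d "$refs_dir" ] || return 0

    # Ref names may not end in ".lock", so a match under refs/ is a lock
    # file, e.g. refs/remotes/origin/master.lock.
    find "$refs_dir" -type f -name '*.lock' -exec rm -f {} +
}
```

With the paths from the log above, this would be called as clean_stale_ref_locks /builds/foo/foo before the fetch. It leaves the root-level lock files to the existing cleanup.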

Relevant logs and/or screenshots

See the log output in the Summary above.

Environment description

OS: Ubuntu
OS Version: 16.04

We run our runners in AWS. We don't create a config.toml ourselves; instead, we use command-line options to control the runner:

sudo gitlab-runner register \
    --non-interactive \
    --url="https://repo.foo.com" \
    --registration-token="magictoken" \
    --locked=false \
    --run-untagged="false" \
    --tag-list="aws, docker, $(hostname), large, xlarge" \
    --executor="docker" \
    --docker-image="alpine:latest" \
    --docker-privileged=true \
    --docker-network-mode="host" \
    --docker-volumes="/var/run/docker.sock:/var/run/docker.sock" \
    --docker-volumes="/srv/gitlab-runner/cache/govendor:/root/go/.cache/govendor:rw" \
    --docker-volumes="/srv/gitlab-runner/cache/bazel:/root/.cache/bazel:rw" \
    --docker-volumes="/srv/gitlab-runner/cache/elm:/root/.elm-install:rw" \
    --docker-volumes="/srv/gitlab-runner/cache/yarn:/usr/local/share/.cache/yarn/v1:rw" \
    --docker-volumes="/srv/gitlab-runner/cache/npm:/root/.npm:rw"

Used GitLab Runner version

Runner version: 12.5.0

Running with gitlab-runner 12.5.0 (577f813d)
  on ip-10-6-1-223
Using Docker executor with image ....
Authenticating with credentials from job payload (GitLab Registry)
Pulling docker image .....
Using docker image ... 

Possible fixes

Some more cleanup is needed here

Edited by Robin Kearney