CI runner should avoid listing remote refs when it can
Upstreaming Git patches has its own issue now: #806 (closed)
We will wrap up this issue by doing the following:
- scalability_ci_fetch_sha feature flag gitlab-org/gitlab!51978 (merged)
- update CI documentation with reference to our experiences in this issue gitlab-org/gitlab!52225 (merged)
- production change request to set --no-tags for gitlab-com/www-gitlab-com production#3369 (closed)
We now believe we have found two server-side performance issues in git fetch that get worse when there are many refs in the repository. Combined with the large number of CI jobs that fetch gitlab-org/gitlab, this adds up to a lot of wasted CPU on file-cny-01. If we can fix these performance issues, we probably no longer need to change CI runner behavior.
We are working on Git patches that should solve these performance problems. #746 (comment 487602722)
When our CI runner runs git fetch, the command makes 3 HTTP requests to GitLab. If we change that git fetch command, we can get the same result with 2 HTTP requests.
We know that git fetch traffic from GitLab CI can cause considerable CPU pressure on Gitaly servers. This traffic has two components: refs and objects. I think that at least some of the time, we can avoid the refs traffic.
Here is a recent perf CPU flamegraph from file-cny-01, the Gitaly server that hosts gitlab-org/gitlab. At the time this profile was recorded the server was at 100% CPU utilization.
Notice how 18% of CPU time is spent in the function cmd_upload_pack. This C function in Git corresponds to a request sent by Git clients. I believe that if we change the specific git fetch command issued by the CI runner, we can prevent the corresponding ls_refs call on the server, saving up to 18% of CPU.
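One way to check whether a given fetch still triggers a ref listing is to capture Git's packet trace and look for ls-refs commands in it. The sketch below assumes the client speaks protocol v2 (where ref listing is sent as command=ls-refs) and that the trace was written with GIT_TRACE_PACKET. The helper name count_ls_refs and the trace path are hypothetical, and a trace may record the same command more than once, so treat zero versus non-zero as the signal rather than the exact count.

```shell
#!/bin/sh
# Count lines mentioning the protocol v2 ls-refs command in a packet trace.
# Assumption: the trace file was produced with GIT_TRACE_PACKET=<file>.
# count_ls_refs is a hypothetical helper name, not part of Git or the runner.
count_ls_refs() {
    trace_file=$1
    # grep -c prints the number of matching lines. It exits non-zero when
    # nothing matches, so mask the status to stay usable under `set -e`.
    grep -c 'command=ls-refs' "$trace_file" || true
}

# Example usage (needs a real remote, so it is left commented out):
#   GIT_TRACE_PACKET=/tmp/fetch.trace \
#     git -c protocol.version=2 fetch origin refs/pipelines/XXX:refs/pipelines/XXX
#   count_ls_refs /tmp/fetch.trace
```

Running the helper on traces from two different fetch commands gives a quick indication of whether one of them skips the ref listing.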
We would change this command:
git fetch origin refs/pipelines/XXX:refs/pipelines/XXX
To this command:
git fetch -n origin $COMMIT_SHA:refs/pipelines/XXX
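The choice between the two commands can be sketched as a small shell helper. This is only an illustration under stated assumptions, not the runner's actual implementation: fetch_args and its two parameters are hypothetical names, and falling back to the ref-based form when no commit SHA is known is an assumption of this sketch.

```shell
#!/bin/sh
# Print the arguments for git fetch, preferring the SHA-based form.
# fetch_args, commit_sha and pipeline_ref are hypothetical names used
# only for this sketch.
fetch_args() {
    commit_sha=$1
    pipeline_ref=$2
    if [ -n "$commit_sha" ]; then
        # Fetch the commit by SHA into the pipeline ref. -n (--no-tags)
        # turns off automatic tag following.
        printf '%s\n' "-n origin ${commit_sha}:${pipeline_ref}"
    else
        # No SHA available: fall back to the original ref-based fetch.
        printf '%s\n' "origin ${pipeline_ref}:${pipeline_ref}"
    fi
}

# Example usage (left unquoted on purpose so the output splits into words):
#   git fetch $(fetch_args "$COMMIT_SHA" refs/pipelines/XXX)
```

Keeping the ref-based form as a fallback means a job without a known commit SHA still behaves as it does today.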
Video demo: https://youtu.be/P6iU6uVSEvo