CI runner should avoid listing remote refs when it can
Update 2020-01-19
Upstreaming Git patches has its own issue now: #806 (closed)
We will wrap up this issue by doing the following:
- remove scalability_ci_fetch_sha feature flag gitlab-org/gitlab!51978 (merged)
- update CI documentation with reference to our experiences in this issue gitlab-org/gitlab!52225 (merged)
- production change request to set --no-tags for gitlab-com/www-gitlab-com production#3369
Update 2020-01-18
We now believe we have found two server side performance issues in git fetch that get worse when there are many refs in the repository. Combined with the large number of CI jobs that fetch gitlab-org/gitlab this adds up to a lot of wasted CPU on file-cny-01. If we can fix these performance issues we probably no longer need to change CI runner behavior.
We are working on Git patches that should solve these performance problems. #746 (comment 487602722)
Original issue
When our CI runner runs git fetch, git fetch makes 3 HTTP requests to GitLab. If we change that git fetch command we can have the same result with 2 HTTP requests.
Details
We know that git fetch traffic from GitLab CI can cause considerable CPU pressure on Gitaly servers. This traffic has two components: refs and objects. I think that at least some of the time, we can avoid the refs traffic.
Here is a recent perf CPU flamegraph from file-cny-01, the Gitaly server that hosts gitlab-org/gitlab. At the time this profile was recorded the server was at 100% CPU utilization.
Notice how 18% of CPU time is spent in the function ls_refs under cmd_upload_pack. This C function in Git corresponds to a request sent by Git clients. I believe that if we change the specific git fetch command issued by the CI runner, we can prevent the corresponding ls_refs call on the server, saving up to 18% of CPU.
We would change this command:
git fetch origin refs/pipelines/XXX:refs/pipelines/XXX
To this command:
git fetch -n origin $COMMIT_SHA:refs/pipelines/XXX
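To check the request counts for the two commands above, something like the following sketch could be used. It relies on Git's GIT_TRACE_CURL environment variable to write an HTTP trace to a file, then counts the request lines in that trace. The helper names count_http_requests and traced_fetch are hypothetical, and the grep pattern assumes the trace records each request as a "=> Send header: GET ..." or "=> Send header: POST ..." line; this may need adjusting for your Git version.

```shell
#!/bin/sh
# Minimal sketch; helper names are hypothetical, not part of Git or the CI runner.

# Count the HTTP requests recorded in a GIT_TRACE_CURL trace file.
# Assumption: each request shows up as exactly one line of the form
# "=> Send header: <METHOD> <path> HTTP/<version>".
count_http_requests() {
  grep -c -E '=> Send header: (GET|POST) ' "$1"
}

# Run "git fetch" with the given arguments while tracing HTTP traffic,
# then print how many HTTP requests the fetch made.
traced_fetch() {
  trace_file=$(mktemp)
  GIT_TRACE_CURL="$trace_file" git fetch "$@"
  count_http_requests "$trace_file"
  rm -f "$trace_file"
}

# Example usage inside a clone with an HTTP(S) remote named "origin":
#   traced_fetch origin refs/pipelines/XXX:refs/pipelines/XXX
#   traced_fetch -n origin "$COMMIT_SHA":refs/pipelines/XXX
```

If the reasoning in this issue holds, the first invocation should report one more request than the second, the extra request being the one that triggers ls_refs on the server.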
Video demo: https://youtu.be/P6iU6uVSEvo