CI runner should avoid listing remote refs when it can
# Update 2020-01-19

Upstreaming Git patches has its own issue now: https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/806

We will wrap up this issue by doing the following:

- [x] remove `scalability_ci_fetch_sha` feature flag https://gitlab.com/gitlab-org/gitlab/-/merge_requests/51978
- [x] update [CI documentation](https://docs.gitlab.com/ee/ci/large_repositories/#git-fetch-extra-flags) with reference to our experiences in this issue https://gitlab.com/gitlab-org/gitlab/-/merge_requests/52225
- [x] production change request to set `--no-tags` for gitlab-com/www-gitlab-com https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3369

# Update 2020-01-18

We now believe we have found two server side performance issues in `git fetch`, that get worse when there are many refs in the repository. Combined with the large number of CI jobs that fetch gitlab-org/gitlab this adds up to a lot of wasted CPU on file-cny-01. If we can fix these performance issues we probably no longer need to change CI runner behavior. We are working on Git patches that should solve these performance problems.

https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/746#note_487602722

# Original issue

When our CI runner runs `git fetch`, `git fetch` makes 3 HTTP requests to GitLab. If we change that `git fetch` command we can have the same result with 2 HTTP requests.

## Details

We know that `git fetch` traffic from GitLab CI can cause considerable CPU pressure on Gitaly servers. This traffic has two components: refs and objects. I think that at least some of the time, we can avoid the refs traffic.

Here is a recent [perf CPU flamegraph](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3104#note_466376148) from file-01-cny, the Gitaly server that hosts gitlab-org/gitlab. At the time this profile was recorded the server was at 100% CPU utilization.

![Screenshot_2020-12-18_at_15.02.06](/uploads/f3b08003261786e9b7dc8ca7a5727fe3/Screenshot_2020-12-18_at_15.02.06.png)

Notice how 18% of CPU time is spent in the function `ls_refs` under `cmd_upload_pack`. This C function in Git corresponds to a request sent by Git clients.

I believe that if we change the specific `git fetch` command issued by the CI runner, we can prevent the corresponding `ls_refs` call on the server, saving up to 18% of CPU.

We would change this command:

```
git fetch origin refs/pipelines/XXX:refs/pipelines/XXX
```

To this command:

```
git fetch -n origin $COMMIT_SHA:refs/pipelines/XXX
```

Video demo: https://youtu.be/P6iU6uVSEvo
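
One way to check the effect of the two command forms locally is to record the packets the client sends and count the `ls-refs` commands. The sketch below assumes Git protocol v2 and a server that accepts fetching by commit SHA. `GIT_TRACE_PACKET` is Git's own packet tracing variable; the helper names `count_ls_refs` and `traced_fetch` are hypothetical and only exist for this illustration. My understanding is that, with protocol v2 over HTTP, the three requests are the capability advertisement, an `ls-refs` command, and a `fetch` command, so a fetch that sends no `ls-refs` would be the 2-request case described in this issue.

```shell
# Sketch only, not what the CI runner does today.
# Assumes protocol v2 and a server that allows fetching by commit SHA.

# Count the ls-refs commands recorded in a GIT_TRACE_PACKET log file.
count_ls_refs() {
  # grep -c prints 0 and exits non-zero when nothing matches; ignore that status.
  grep -c 'command=ls-refs' "$1" || true
}

# Run `git fetch` with packet tracing written to a temporary file, then
# print how many ls-refs commands the client sent.
# Usage: traced_fetch <git fetch arguments...>
traced_fetch() {
  trace=$(mktemp)
  GIT_TRACE_PACKET="$trace" git -c protocol.version=2 fetch "$@" >/dev/null 2>&1
  count_ls_refs "$trace"
  rm -f "$trace"
}

# Example comparison, run inside a clone (XXX and COMMIT_SHA are placeholders):
#   traced_fetch origin refs/pipelines/XXX:refs/pipelines/XXX
#   traced_fetch -n origin "$COMMIT_SHA":refs/pipelines/XXX
```

If the reasoning in this issue holds, the first form should record an `ls-refs` command and the second should not. The helper discards the fetch output and exit status, so it is only useful for counting, not for checking that the fetch itself succeeded.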