GitLab.com / GitLab Infrastructure Team / scalability / Issues / #746
Closed
Issue created Dec 18, 2020 by Jacob Vosmaer (@jacobvosmaer-gitlab, Owner). 3 of 3 checklist items completed.

CI runner should avoid listing remote refs when it can

Update 2021-01-19

Upstreaming Git patches has its own issue now: #806 (closed)

We will wrap up this issue by doing the following:

  • remove the scalability_ci_fetch_sha feature flag: gitlab-org/gitlab!51978 (merged)
  • update the CI documentation with a reference to our experiences in this issue: gitlab-org/gitlab!52225 (merged)
  • production change request to set --no-tags for gitlab-com/www-gitlab-com: production#3369 (closed)

Update 2021-01-18

We now believe we have found two server-side performance issues in git fetch that get worse as the number of refs in the repository grows. Combined with the large number of CI jobs that fetch gitlab-org/gitlab, this adds up to a lot of wasted CPU on file-cny-01. If we can fix these performance issues, we probably no longer need to change CI runner behavior.

We are working on Git patches that should solve these performance problems. #746 (comment 487602722)

Original issue

When our CI runner runs git fetch, the command makes 3 HTTP requests to GitLab. If we change that git fetch command, we can get the same result with 2 HTTP requests.

Details

We know that git fetch traffic from GitLab CI can cause considerable CPU pressure on Gitaly servers. This traffic has two components: refs and objects. I think that at least some of the time, we can avoid the refs traffic.

Here is a recent perf CPU flamegraph from file-cny-01, the Gitaly server that hosts gitlab-org/gitlab. At the time this profile was recorded, the server was at 100% CPU utilization.

[Image: perf CPU flamegraph, Screenshot_2020-12-18_at_15.02.06]

Notice how 18% of CPU time is spent in the function ls_refs under cmd_upload_pack. This C function in Git handles the ls-refs request that Git clients send to list the refs on the remote. I believe that if we change the specific git fetch command issued by the CI runner, we can prevent the corresponding ls_refs call on the server, saving up to 18% of CPU.

We would change this command:

git fetch origin refs/pipelines/XXX:refs/pipelines/XXX

To this command:

git fetch -n origin $COMMIT_SHA:refs/pipelines/XXX
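
The two commands above can be tried side by side on a throwaway local repository. The following is a minimal sketch, not the CI runner's actual code: the scratch directories, the ref name refs/pipelines/1, and the helper uses_ls_refs are made up for illustration, and the remote is a local path rather than GitLab over HTTP. It forces protocol v2 with protocol.version=2 and sets uploadpack.allowAnySHA1InWant on the "server" so that fetching by SHA is accepted regardless of defaults. It records a packet trace for each fetch via GIT_TRACE_PACKET and reports whether a command=ls-refs line shows up.

```shell
#!/bin/sh
set -eu

# Scratch area; all paths and ref names here are hypothetical demo values.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# "Server": a repository with one commit and a pipeline ref pointing at it.
git init -q "$work/src"
git -C "$work/src" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "demo commit"
COMMIT_SHA=$(git -C "$work/src" rev-parse HEAD)
git -C "$work/src" update-ref refs/pipelines/1 "$COMMIT_SHA"
# Assumption: allow wants by bare SHA so the second fetch is accepted.
git -C "$work/src" config uploadpack.allowAnySHA1InWant true

# Two empty clients, one per fetch variant.
for c in a b; do
    git init -q "$work/$c"
    git -C "$work/$c" remote add origin "$work/src"
done

# Variant 1: fetch by ref name (the client has to ask the server for refs).
GIT_TRACE_PACKET="$work/a.trace" git -C "$work/a" -c protocol.version=2 \
    fetch -q origin refs/pipelines/1:refs/pipelines/1

# Variant 2: fetch by commit SHA with -n (--no-tags).
GIT_TRACE_PACKET="$work/b.trace" git -C "$work/b" -c protocol.version=2 \
    fetch -q -n origin "$COMMIT_SHA:refs/pipelines/1"

# Report whether a packet trace contains an ls-refs command.
uses_ls_refs() {
    if grep -q 'command=ls-refs' "$1"; then echo yes; else echo no; fi
}

ref_a=$(git -C "$work/a" rev-parse refs/pipelines/1)
ref_b=$(git -C "$work/b" rev-parse refs/pipelines/1)
ls_a=$(uses_ls_refs "$work/a.trace")
ls_b=$(uses_ls_refs "$work/b.trace")

echo "by ref name: refs/pipelines/1 = $ref_a, ls-refs sent: $ls_a"
echo "by SHA, -n:  refs/pipelines/1 = $ref_b, ls-refs sent: $ls_b"
```

Both clients should end up with refs/pipelines/1 pointing at the same commit. Whether the second fetch actually skips ls-refs depends on the Git client version, so the script prints that result rather than assuming it.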

Video demo: https://youtu.be/P6iU6uVSEvo

Edited Jan 26, 2021 by Jacob Vosmaer