Poor Read Distribution in Gitaly Cluster running Monorepo
Support Request for the Gitaly Team
The goal is to keep these requests public. However, if customer information is required for the support request, please be sure to mark this issue as confidential.
This request template is part of Gitaly Team's intake process.
Author Checklist
- Reached out to #spt_pod_git prior to creating the issue (please provide link)
- Fill out the customer information section
- Provide a detailed summary under Additional Information
- Severity realistically set
- Provided detailed problem description
- Provided detailed troubleshooting performed
- Clearly articulated what is needed from the Gitaly team to support your request by filling out the "What specifically do you need from the Gitaly team" section
Customer Information
Salesforce Link: https://gitlab.my.salesforce.com/0014M00001fbiHe
Zendesk Ticket: https://gitlab.zendesk.com/agent/tickets/447458
Installation Size: 3K CNH Reference Architecture
Architecture Information:
Slack Channel:
Additional Information:
Support Request
Severity
Severity 2
Problem Description
The customer is running a monorepo and facing performance issues, and I am trying to optimize the monorepo performance. I have noticed that the Gitaly nodes have poor read distribution: most of the traffic goes to Gitaly 01, which overwhelms node01 while the other two nodes sit idle (refer to the chart below). The following feature flags are already on: merge_request_cleanup_ref_worker_async, pipeline_cleanup_ref_worker_async, pipeline_delete_gitaly_refs_in_batches, merge_request_delete_gitaly_refs_in_batches
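To put a number on the skew, the per-node read counts can be turned into shares. The sketch below is a minimal illustration, assuming the counts come from Praefect's `gitaly_praefect_read_distribution` Prometheus counter (summed per `storage` label over some window). The storage names, the input numbers, and the helper functions `read_share` and `skew_ratio` are hypothetical placeholders, not measurements or tooling from this instance.

```python
from typing import Dict


def read_share(reads_by_storage: Dict[str, float]) -> Dict[str, float]:
    """Return each storage's fraction of the total reads."""
    total = sum(reads_by_storage.values())
    if total == 0:
        # No reads recorded: report a zero share for every storage.
        return {name: 0.0 for name in reads_by_storage}
    return {name: count / total for name, count in reads_by_storage.items()}


def skew_ratio(reads_by_storage: Dict[str, float]) -> float:
    """Busiest storage's share relative to an even split.

    1.0 means reads are spread evenly; with three nodes, 3.0 means
    a single node serves every read.
    """
    shares = read_share(reads_by_storage)
    if not shares:
        return 0.0
    return max(shares.values()) * len(shares)


# Placeholder numbers for illustration only (hypothetical storage names).
example = {"gitaly-01": 900.0, "gitaly-02": 50.0, "gitaly-03": 50.0}

if __name__ == "__main__":
    for name, share in read_share(example).items():
        print(f"{name}: {share:.0%} of reads")
    print(f"skew ratio: {skew_ratio(example):.2f}")
```

A ratio close to the number of nodes would support the observation that one node is serving nearly all reads.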
- What version is the customer running? 16.8.2
- What is the customer's architecture? 3K CNH
- What is the GitLab architecture?
- Are networking filesystems (like NFS) used?
- What are the filesystems?
- What are the OS and kernel versions?
- How are backup, replication, HA, etc performed?
- Are they using Gitaly Cluster?
- How many Gitaly Clusters does the customer have? 1
- How many Gitaly nodes per cluster has the customer configured? 3
- Has the customer, or some tools/script (backup, synchro, replication, HA, etc) they set up, directly interacted with the Git repository?
  - using rsync or similar tools? Yes
  - using git commands?
  - history changing tools (like git filter-repo)?
- Does the customer have any hooks configured?
- If this is a performance issue, what does the Git workflow look like?
- What are the customer RPS for push and pulls? (use fast-stats)
- How many pipelines does the customer run?
- How many users are working on the instance?
- How big are the repositories? Do they have monorepos?
- Provide the output of git-sizer.
Troubleshooting Performed
Applied optimization recommendations from the official documentation: https://docs.gitlab.com/ee/user/project/repository/monorepos/
What specifically do you need from the Gitaly team
Assistance in resolving the poor read distribution across the Gitaly nodes.
/cc @mjwood @andrashorvath @jcaigitlab @john.mcdonnell @gerardo