Need help with a script to find and delete keep-around refs for old branches
Support Request for the Gitaly Team
The goal is to keep these requests public. However, if customer information is required for the support request, please be sure to mark this issue as confidential.
This request template is part of Gitaly Team's intake process.
Author Checklist
- Reached out to #spt_pod_git prior to creating issue: https://gitlab.slack.com/archives/C04D5FUADAM/p1715679731746359
- Fill out customer information section
- Provide a detailed summary under Additional Information
- Severity realistically set
- Provided detailed problem description
- Provided detailed troubleshooting performed
- Clearly articulated what is needed from the Gitaly team to support your request by filling out the "What specifically do you need from the Gitaly team" section
Customer Information
Salesforce Link: 500PL000006TP4QYAW
Zendesk Ticket: https://gitlab.zendesk.com/agent/tickets/505498
Installation Size: 1k
Architecture Information: 1k omnibus
Slack Channel: #esc_adyen
Additional Information:
I am looking for a script to find and delete keep-around refs for old branches. We have an escalated customer that has a 7 GB repo with 9.5 million keep-around refs. We are working with the Geo team on a related issue, and it would be really helpful to prune the keep-around refs for old branches. TIA!
Support Request
Severity
Customer is in an escalated state.
Problem Description
- What version is the customer running? 16.7.6
- What is the customer's architecture? bare metal
- What is the GitLab architecture? 1k
- Are networking filesystems (like NFS) used? no
- What are the filesystems? ext4
- What are the OS and kernel versions? Oracle Linux Server 9.3
- How are backup, replication, HA, etc. performed? 1k backup, on the 1k Geo primary. They are trying to use Geo for replication to a 3k secondary, but are running into issues with this repo, hence the request for the Geo team for a way to prune the keep-around refs for old branches.
- Are they using Gitaly Cluster? Not for the Geo Primary.
- How many Gitaly Clusters does the customer have? 1, and it's in the Geo secondary site, where this repo will not replicate.
- How many Gitaly nodes per cluster does the customer have configured? 3
- Has the customer, or some tools/script (backup, synchro, replication, HA, etc) they set up, directly interacted with the Git repository?
  - using rsync or similar tools? no
  - using git commands? no
  - history changing tools (like git filter-repo)? no
- Does the customer have any hooks configured? no
- If this is a performance issue, what does the Git workflow look like? Not a performance issue.
- What are the customer RPS for push and pulls? (use fast-stats)
- How many pipelines does the customer run? Several per minute.
- How many users are working on the instance? 2500
- How big are the repositories? Do they have monorepos? Yes, 7 GB, 9.5 million keep-around refs
- Provide the output of git-sizer. I have asked for this from the customer.
Troubleshooting Performed
[danielg@ai208037 adyen-main.git] (gitlab-c-ams1o) $ ~/git-sizer --verbose
Processing blobs: 3089751
Processing trees: 32734669
Processing commits: 11038297
Matching commits to trees: 11038297
Processing annotated tags: 7002
Processing references: 10643194
| Name | Value | Level of concern |
| ---------------------------- | --------- | ------------------------------ |
| Overall repository size | | |
| * Commits | | |
| * Count | 11.0 M | ********************** |
| * Total size | 4.95 GiB | ********************* |
| * Trees | | |
| * Count | 32.7 M | ********************* |
| * Total size | 134 GiB | !!!!!!!!!!!!!!!!!!!!!!!!!!!!!! |
| * Total tree entries | 3.56 G | !!!!!!!!!!!!!!!!!!!!!!!!!!!!!! |
| * Blobs | | |
| * Count | 3.09 M | ** |
| * Total size | 136 GiB | ************** |
| * Annotated tags | | |
| * Count | 7.00 k | |
| * References | | |
| * Count | 10.6 M | !!!!!!!!!!!!!!!!!!!!!!!!!!!!!! |
| * Branches | 1.46 k | |
| * Tags | 2.23 k | |
| * Remote-tracking refs | 394 | |
| * Other | 10.6 M | !!!!!!!!!!!!!!!!!!!!!!!!!!!!!! |
| | | |
| Biggest objects | | |
| * Commits | | |
| * Maximum size [1] | 2.78 MiB | !!!!!!!!!!!!!!!!!!!!!!!!!!!!!! |
| * Maximum parents [2] | 10 | * |
| * Trees | | |
| * Maximum entries [3] | 6.77 k | ****** |
| * Blobs | | |
| * Maximum size [4] | 785 MiB | !!!!!!!!!!!!!!!!!!!!!!!!!!!!!! |
| | | |
| History structure | | |
| * Maximum history depth | 485 k | |
| * Maximum tag depth [5] | 1 | |
| | | |
| Biggest checkouts | | |
| * Number of directories [6] | 97.5 k | !!!!!!!!!!!!!!!!!!!!!!!!!!!!!! |
| * Maximum path depth [7] | 23 | ** |
| * Maximum path length [7] | 236 B | ** |
| * Number of files [8] | 299 k | ***** |
| * Total size of files [9] | 3.48 GiB | *** |
| * Number of symlinks [8] | 273 | |
| * Number of submodules [10] | 2 | |
[1] aaf93dd22a7e4c9482d98051041bfd23ddf8aabf
[2] 763bfe4db4dc2a88d329b3ecbcccdc68f0b394ce
[3] 408427f72580aae2a490372a597a277ad3008471 (refs/keep-around/4007a031dd860bf1bde9a99994d8cde66afc4098:checkoutshopper/src/main/webapp/images/logos/small)
[4] 32bd4ff81669fa1bb31186872f957d84c776269c (refs/keep-around/88ed3c02743cda52adc8a6e2fd080b0f3a0176c0:insightsplc/druid/ingestion-specs/paymentlifecyclebase.json)
[5] ec704da7f503713e752a36ade25726048a23e9a8 (refs/archive/release-tags/V1_150p0)
[6] ebf799f04c3a9f18a39496d0797577f78cccce18 (refs/keep-around/82d4e8642c9f6ac8f407c0e84a929a35b1442cdb^{tree})
[7] da28b1033020dd8817b14777d32b7ff205ad49fc (refs/keep-around/4007a031dd860bf1bde9a99994d8cde66afc4098^{tree})
[8] f9b9dc002b59f459755348e485ca9f1341e90f71 (refs/keep-around/f187aa90637019b46fe76a3b69f2c1751d7e3de6^{tree})
[9] 15b4b950e05706e00fe73cebcdb75939e576bf8f (refs/keep-around/88ed3c02743cda52adc8a6e2fd080b0f3a0176c0^{tree})
[10] d7486f00b1394fc243b3c1b1a3643b12eb479080 (refs/archive/release-branches/V1_185:apidocs/WebContent)
https://gitlab.com/gitlab-com/geo-customers/-/issues/207#note_1902862503
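git-sizer reports 10.6 M references, nearly all of them under "Other". To confirm how many of those sit under refs/keep-around rather than other namespaces such as refs/archive, a per-namespace count could be gathered roughly as follows. This is a minimal sketch that has not been run against the customer repo; the function names are hypothetical, and it assumes git is on PATH and that the script runs on a host with read access to the bare repository.

```python
import subprocess
from collections import Counter
from typing import Iterable, Iterator


def namespace_of(refname: str) -> str:
    """Return the first two path components, e.g. 'refs/keep-around'."""
    return "/".join(refname.split("/")[:2])


def count_by_namespace(refnames: Iterable[str]) -> Counter:
    """Count refs per namespace, skipping empty lines."""
    return Counter(namespace_of(name) for name in refnames if name)


def iter_refnames(repo_path: str) -> Iterator[str]:
    """Stream ref names from `git for-each-ref` one line at a time,
    so that millions of refs are never held in memory at once."""
    proc = subprocess.Popen(
        ["git", "-C", repo_path, "for-each-ref", "--format=%(refname)"],
        stdout=subprocess.PIPE,
        text=True,
    )
    assert proc.stdout is not None
    for line in proc.stdout:
        yield line.rstrip("\n")
    if proc.wait() != 0:
        raise RuntimeError("git for-each-ref failed")
```

Calling `count_by_namespace(iter_refnames("/path/to/repo.git"))` (the path is a placeholder) and printing `most_common()` on the result would show which namespaces account for the bulk of the refs.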
What specifically do you need from the Gitaly team
We need a script to find and delete keep-around refs for old branches. Can you help with this?
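To make the ask more concrete, here is a rough starting point for discussion, not a vetted procedure. It does not identify "old branches", which this request leaves undefined. It uses a narrower criterion: a keep-around ref is treated as a candidate only when the commit it points to is already reachable from a branch or tag. The function names are hypothetical. It only prints delete instructions in the format read by `git update-ref --stdin` and never modifies the repository itself. Whether removing such refs is safe for GitLab is an open question for the Gitaly team, because a commit that is reachable today loses that protection if its branch is deleted later.

```python
from typing import Iterable, Iterator, List, Set, Tuple

KEEP_AROUND_PREFIX = "refs/keep-around/"


def parse_ref_lines(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Parse '<refname> <oid>' lines, as printed by
    git for-each-ref --format='%(refname) %(objectname)'."""
    for line in lines:
        line = line.strip()
        if line:
            refname, oid = line.split(" ", 1)
            yield refname, oid


def redundant_keep_arounds(
    refs: Iterable[Tuple[str, str]], reachable: Set[str]
) -> Iterator[Tuple[str, str]]:
    """Yield keep-around refs whose target commit is in `reachable`,
    i.e. already kept alive by a branch or tag (assumed criterion)."""
    for refname, oid in refs:
        if refname.startswith(KEEP_AROUND_PREFIX) and oid in reachable:
            yield refname, oid


def delete_commands(candidates: Iterable[Tuple[str, str]]) -> List[str]:
    """Format 'delete <ref> <oldvalue>' lines for `git update-ref --stdin`.
    Passing the old value makes git skip the delete if the ref has moved."""
    return [f"delete {refname} {oid}" for refname, oid in candidates]
```

The inputs could come from `git for-each-ref --format='%(refname) %(objectname)' refs/keep-around/` and `git rev-list --branches --tags`. With roughly 11 M commits, the reachable set will need a fair amount of memory. The generated lines should be reviewed, and ideally tried on a copy of the repository, before anything is piped into `git update-ref --stdin`.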
/cc @mjwood @andrashorvath @jcaigitlab @john.mcdonnell @gerardo