Why is GarbageCollect called more often than RepackIncremental?
GitLab's repository housekeeping service Project::HousekeepingService is responsible for running RepackIncremental (git repack), RepackFull (git repack -ad) and GarbageCollect (git gc) at appropriate times.
When we designed Project::HousekeepingService, the intention was to run it every 10 pushes, with the following cadence: 9x RepackIncremental, RepackFull, 9x RepackIncremental, GarbageCollect. However, looking at Grafana, I see the following request rates:
- GarbageCollect 11/s
- RepackIncremental 1/s
- RepackFull 0.15/s
This is not right: RepackIncremental should be the most frequently called of these three RPCs, by a factor of 18.
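To make the expected ratio concrete, here is a minimal sketch that replays the cadence as I described it above and counts how often each RPC would be picked. It is not the actual Project::HousekeepingService logic: the names CYCLE and simulate are hypothetical, and the cycle itself is an assumption taken from my description. The observed rates are the Grafana numbers listed above.

```python
from collections import Counter
from itertools import cycle, islice

# Assumed cycle (from the description above, not from the real service):
# 9 incremental repacks, 1 full repack, 9 incremental repacks, 1 gc.
CYCLE = (
    ["RepackIncremental"] * 9
    + ["RepackFull"]
    + ["RepackIncremental"] * 9
    + ["GarbageCollect"]
)


def simulate(housekeeping_runs):
    """Count which RPC is picked over the given number of housekeeping runs."""
    return Counter(islice(cycle(CYCLE), housekeeping_runs))


# 2000 housekeeping runs is exactly 100 full cycles of 20 runs each.
expected = simulate(2000)

# Request rates per second as read from Grafana.
observed = {"GarbageCollect": 11, "RepackIncremental": 1, "RepackFull": 0.15}

expected_ratio = expected["RepackIncremental"] / expected["GarbageCollect"]
observed_ratio = observed["RepackIncremental"] / observed["GarbageCollect"]

print("expected RepackIncremental per GarbageCollect:", expected_ratio)
print("observed RepackIncremental per GarbageCollect:", round(observed_ratio, 3))
```

Under that assumed cadence there are 18 RepackIncremental calls per GarbageCollect call, while the observed rates give roughly 0.09, so the ordering is inverted and not just off by a small factor.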
Edit: I misremembered the exact pattern, or it has changed. That does not change the fact that, of these three, GarbageCollect should be the least frequently called RPC. This test explains the expected distribution: