Why is GarbageCollect called more often than RepackIncremental?
GitLab's repository housekeeping service, `Project::HousekeepingService`, is responsible for running RepackIncremental (`git repack`), RepackFull (`git repack -ad`) and GarbageCollect (`git gc`) at appropriate times.
When we designed `Project::HousekeepingService`, the intention was to run it every 10 pushes, with the following cadence: 9x RepackIncremental, RepackFull, 9x RepackIncremental, GarbageCollect. However, looking at Grafana, I see the following request rates:
- GarbageCollect 11/s
- RepackIncremental 1/s
- RepackFull 0.15/s
This is not right: RepackIncremental should be the most frequently called of these three RPCs, outnumbering each of the other two by a factor of 18.
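To make the mismatch concrete, here is a minimal sketch that counts the calls in one cycle of the cadence as stated above and compares the resulting ratios with the observed request rates. The names `CYCLE`, `OBSERVED`, `expected_counts` and `ratio_to_gc` are made up for this illustration; they are not part of GitLab or Gitaly, and the cycle is only the pattern described in this issue, which may not match the real schedule.

```python
from collections import Counter

# One full cycle of the cadence as described in this issue.
# Assumption: this is the remembered pattern, not necessarily the real schedule.
CYCLE = (
    ["RepackIncremental"] * 9
    + ["RepackFull"]
    + ["RepackIncremental"] * 9
    + ["GarbageCollect"]
)

# Request rates quoted in this issue, in calls per second.
OBSERVED = {"GarbageCollect": 11.0, "RepackIncremental": 1.0, "RepackFull": 0.15}


def expected_counts(cycle=CYCLE):
    """Count how often each RPC appears in one cycle."""
    return Counter(cycle)


def ratio_to_gc(rates):
    """Express every rate (or count) relative to GarbageCollect."""
    gc = rates["GarbageCollect"]
    return {name: rate / gc for name, rate in rates.items()}


if __name__ == "__main__":
    expected = expected_counts()
    print("expected calls per cycle:", dict(expected))
    print("expected ratio to GarbageCollect:", ratio_to_gc(expected))
    print("observed ratio to GarbageCollect:", ratio_to_gc(OBSERVED))
```

Under this assumed cadence, RepackIncremental should be called many times more often than GarbageCollect, whereas the observed rates show it being called far less often.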
Edit: I misremembered the exact pattern, or it has changed since. That does not change the fact that, of these three, GarbageCollect should be the least frequently called RPC. This test explains the expected distribution: