Why is GarbageCollect called more often than RepackIncremental?

GitLab's repository housekeeping service, Projects::HousekeepingService, is responsible for running RepackIncremental (git repack), RepackFull (git repack -ad) and GarbageCollect (git gc) at appropriate times.

When we designed Projects::HousekeepingService, the intention was to run it every 10 pushes, with the following cadence: 9x RepackIncremental, RepackFull, 9x RepackIncremental, GarbageCollect. However, looking at Grafana I see the following request rates:

  • GarbageCollect 11/s
  • RepackIncremental 1/s
  • RepackFull 0.15/s

This is not right: of these three RPCs, RepackIncremental should be the most frequently called, by a factor of 18.
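
To make the expected ratio concrete, here is a minimal sketch that replays the cadence as stated in the description (9x RepackIncremental, RepackFull, 9x RepackIncremental, GarbageCollect) and counts how often each task would run. The `CYCLE` constant and the `task_counts` helper are hypothetical names made up for this illustration; this is not the actual HousekeepingService code, and the real pattern may differ from the one I described.

```ruby
# Hypothetical model of the cadence described in this issue.
# This is NOT the HousekeepingService implementation.
CYCLE = ([:repack_incremental] * 9 + [:repack_full] +
         [:repack_incremental] * 9 + [:garbage_collect]).freeze

# Counts how often each task runs over `runs` consecutive housekeeping runs,
# assuming the runs simply walk through CYCLE and wrap around.
def task_counts(runs)
  counts = Hash.new(0)
  runs.times { |i| counts[CYCLE[i % CYCLE.size]] += 1 }
  counts
end

counts = task_counts(2000)
ratio = counts[:repack_incremental] / counts[:garbage_collect]

puts "RepackIncremental: #{counts[:repack_incremental]}"
puts "RepackFull:        #{counts[:repack_full]}"
puts "GarbageCollect:    #{counts[:garbage_collect]}"
puts "RepackIncremental : GarbageCollect = #{ratio} : 1"
```

Under that assumed cadence, 2000 housekeeping runs give 1800 RepackIncremental, 100 RepackFull and 100 GarbageCollect calls, so the observed rates (GarbageCollect at 11/s versus RepackIncremental at 1/s) are roughly the reverse of what this model predicts.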

Edit: I misremembered the exact pattern, or it changed. Either way, GarbageCollect should be the least frequently called of these three RPCs. This test explains the expected distribution:

https://gitlab.com/gitlab-org/gitlab-ce/blob/52758b929fa71540f97cd241d1668ade795306a1/spec/services/projects/housekeeping_service_spec.rb#L70-95

Edited by Jacob Vosmaer