
housekeeping: Use more accurate data to inform heuristics

This MR switches the housekeeping optimization heuristics to operate on more accurate data:

  • We use the actual number of loose objects in the repository instead of the extrapolated one. Counting all loose objects is a bit slower, but the difference is unlikely to matter in practice.
  • We decide whether to do a full repack based on the combined size of all packfiles instead of just the largest one. The previous heuristic wasn't all that sensible: the time a full repack takes is dictated not by the largest packfile but by the size of all packfiles combined.

Besides being more accurate, the ultimate goal of this change is to bring those counting functions closer to git-count-objects(1). This will allow us to eventually move those functions over into git/stats and replace the external git-count-objects(1) process with our own logic.
