git/housekeeping: Reduce frequency of full repacks

For different reasons we need to perform regular full repacks both in normal repositories and in object pools. These full repacks are guided by a cooldown period so that we perform them only if the last full repack happened longer ago than the cooldown period.

For object pools, the reason we do full repacks is to refresh deltas so that they again honor our delta islands. This is not all that important to users and should not be noticeable in general when we do this less frequently. Consequently, we only perform a full repack once every week.

For normal repositories, full repacks are mostly done in order to guarantee that objects will get evicted into cruft packs so that they can be expired and thus deleted. This is something that both we and our customers care about given that it directly translates into required disk space. It is thus prudent that we perform this on a more regular basis so that objects get deleted quickly.

That being said, there is interplay between the stale object grace period (which is 14 days) and the cooldown period (which is 1 day). Effectively, assuming that a repository gets daily optimization jobs, and keeping in mind that we need to perform two full repacks in order to evict an unreachable object, objects will get deleted after 14 to 15 days:

1. The first full repack on days 0 to 1 will evict the unreachable
   object into a cruft pack.

2. We wait 14 days and will thus land either on day 14 or 15.

3. We perform a second full repack to expire the object that is part
   of the cruft pack.
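
As a rough illustration of the three steps above, the following sketch reproduces the 14 to 15 day window for the current settings. It is a simplification and not the housekeeping code: the function name `deletion_window` is hypothetical, and it assumes the second full repack runs on the day the grace period ends, which is what the steps imply for daily optimization jobs with a 1-day cooldown. It is not meant to model other settings.

```python
# Values quoted in the description above (configuration before this change).
COOLDOWN_DAYS = 1       # minimum gap between two full repacks
GRACE_PERIOD_DAYS = 14  # stale object grace period


def deletion_window(cooldown_days, grace_days):
    """Return (earliest, latest) day of deletion under the simplified model.

    Assumption: the first full repack happens somewhere between day 0 and
    day `cooldown_days`, and the second full repack happens exactly
    `grace_days` after the first one.
    """
    first_repack_earliest = 0
    first_repack_latest = cooldown_days
    return (first_repack_earliest + grace_days,
            first_repack_latest + grace_days)


window = deletion_window(COOLDOWN_DAYS, GRACE_PERIOD_DAYS)
print(window)
```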

This interplay between both periods is important, because it means that we can make trade-offs between tuning the cooldown period and the stale object grace period without actually impacting the median time to deletion:

- Increasing the cooldown period means we need to perform full
  repacks less frequently, thus saving on resources. Conversely,
  decreasing the cooldown means more frequent full repacks and thus
  using more resources.

- Increasing the grace period means we'll have a longer time to
  avoid racy access to Git objects with the downside of more disk
  space use. Decreasing the grace period means we are more likely to
  hit racy access to Git objects, but evict objects and thus save
  disk space more regularly.

Now optimizing the cooldown period is something we're very keen to do because it directly impacts how many resources we and our customers need to provision for machines. On the other hand, the grace period is mostly there to avoid racy access to Git objects, and two weeks feels excessive for that.

So long story short, this commit changes our strategy to increase the full repack cooldown period to 5 days instead of 1 day while decreasing the stale object grace period from 14 days to 7 days to counteract the longer time-to-deletion for stale objects. This means objects will get deleted 12 to 17 days after becoming unreachable, with a median value of 14.5 days. This is the exact same median value as previously, so the time-to-deletion should not change in practice. But on the other hand, it does allow us to greatly save on compute resources by reducing the frequency at which we perform full repacks to one fifth.
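
The comparison can be checked with a few lines of arithmetic. The sketch below takes the two deletion windows exactly as stated in this description (14 to 15 days before, 12 to 17 days after) and does not derive them. It also assumes deletions are spread evenly across each window, so that the median equals the midpoint. The names `midpoint` and `relative_repack_frequency` are illustrative only.

```python
# Deletion windows in days, taken as stated in the description.
OLD_WINDOW = (14, 15)  # cooldown 1 day, grace period 14 days
NEW_WINDOW = (12, 17)  # cooldown 5 days, grace period 7 days

OLD_COOLDOWN_DAYS = 1
NEW_COOLDOWN_DAYS = 5


def midpoint(window):
    """Midpoint of a (earliest, latest) window, used here as the median
    under the assumption of evenly spread deletions."""
    earliest, latest = window
    return (earliest + latest) / 2


# With daily optimization jobs, at most one full repack runs per cooldown
# period, so the frequency scales with the inverse of the cooldown.
relative_repack_frequency = OLD_COOLDOWN_DAYS / NEW_COOLDOWN_DAYS

print(midpoint(OLD_WINDOW), midpoint(NEW_WINDOW), relative_repack_frequency)
```

Both midpoints come out at 14.5 days, while the new window is wider (5 days instead of 1), so individual objects may be deleted somewhat earlier or later than before.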

Furthermore, as the repack cooldown periods for normal repositories and object pools are almost the same now, let's merge them so that we have one less special case to think about.

Closes #2774 (closed). Related to #5031 (closed).

Edited by Patrick Steinhardt