Skip to content
Snippets Groups Projects

git/housekeeping: Introduce geometric repacking strategy

Merged Patrick Steinhardt requested to merge pks-housekeeping-geometric-repacking-strategy into master
2 unresolved threads

One of the most important tasks performed by our repository housekeeping is repacking objects into packfiles such that they can be efficiently accessed. Unfortunately, this is also the most expensive part in general as it is both CPU- and IOPS-intensive. So even though it would be most efficient to always pack all objects into a single packfile, this is not feasible especially in the context of large repositories.

We are thus using a hybrid approach right now where we alternate between incremmental repacks that soak up all loose objects, and full repacks that repack all packfiles into a single packfile once there are too many of them. In highly active repositories this strategy is too inefficient though as we accumulate incremental packfiles fast and then need to do the full repack quite regularly. And just bumping the limits of how many packfiles are allowed until the next full repack is not ideal either, as the repository will become less efficient to serve the more packfiles there are.

To improve upon this usecase, Git v2.33.0 has introduced support for geometric repacking that is a tradeoff between incremental and full repacks: instead of repacking either only loose objects or everything at once, a geometric repack will arrange packfiles so that the end result forms a geometric sequence regarding the amount of objects contained in each of the packs. Like this, a geometric repack would typically only rewrite a small slice of packfiles.

In the best case, we'd shift to use geometric repacks exclusively in order to reduce the overhead of repository maintenance. But there are two reasons why we cannot do so:

- Geometric repacks don't take reachability into account and will
  always include all loose objects. As we need to expire objects
  over time it means that we have to do a full repack with cruft
  packs every now and then to evict unreachable objects.

- Geometric repacks don't take delta islands into account. We use
  delta islands to make sure that deltas are only created against
  objects which are part of the default refspec used by clients,
  namely branches and tags. We thus need to regularly do a full
  repack that "freshens" our delta islands.

We're thus forced to keep alternating between full and geometric repacks. But we can do the full repacks a whole lot less frequently than we used to do them before. As we don't have control over the number of packfiles anymore (it's controlled by the geometric sequence now), we instead need to use a different heuristic to decide whether to do one or the other. For now, we are going with time-based repacks where we only do a full repack in case a certain amount of time has passed. This effectively puts a rate-limit in place.

Implement this new strategy behind a feature flag.

Closes #4998 (closed).

Merge request reports

Loading
Loading

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
62 63 WriteMultiPackIndex: true,
63 64 }
64 65
66 if featureflag.GeometricRepacking.IsEnabled(ctx) {
67 nonCruftPackfilesCount := s.info.Packfiles.Count - s.info.Packfiles.CruftCount
68
69 // It is mandatory for us that we perform regular full repacks in repositories so
70 // that we can evict objects which are unreachable into a separate cruft pack. So in
71 // the case where we have more than one non-cruft packfiles and the time since our
72 // last full repack is longer than the grace period we'll perform a full repack.
73 //
74 // This heuristic is simple on purpose: customers care about when objects will be
75 // declared as unreachable and when the pruning grace period starts as it impacts
76 // usage quotas. So with this simple policy we can tell customers that we evict and
77 // expire unreachable objects once per day.
  • Technically, repacking frequency also depends on how OptimizeRepository RPC is called. If there is no request during that time frame, it would take more than 1 day.

    Suggested change
    77 // expire unreachable objects once per day.
    77 // expire unreachable objects at most once per day.
  • Please register or sign in to reply
    • Thanks, @pks-gitlab. The MR looks neat and elegant. It adds a lot of contexts and reasons behind the heuristics of picking strategy. I have one nit, but it doesn't matter anyway. I'll approve and set MWPS.

      Although I involved in the reviews of housekeeping for a while, I feel like missing the big picture. I wonder if you can spend sometimes summarizing the direction of the housekeeping in doc/, especially the reasons why we move toward the current direction. Something like this MR's description is super helpful.

    • Although I involved in the reviews of housekeeping for a while, I feel like missing the big picture. I wonder if you can spend sometimes summarizing the direction of the housekeeping in doc/, especially the reasons why we move toward the current direction. Something like this MR's description is super helpful.

      @qmnguyen0711 I'm kind of torn on this, to be honest. I have recently written the blog post about repository maintenance which explains what the issues were with the old strategy. But it's rather a snapshot in time, and things have been evolving quite a lot in that context. And furthermore, we have the housekeeping documentation that is customer-facing which doesn't go into the nitty-gritty details, because ultimately customers should not need to care.

      What is still missing is an overview that details how exactly the current strategy looks like and why it looks that way. But the issue I have with this is that it would grow outdated quite fast if one forgot to update it. So instead, I have been trying to be as verbose as possible with the in-code comments to explain all the different choices I have been doing there. It's easier to update that documentation while at it.

      I'm not quite sure what exactly it is that you're searching for. Is it just an overall picture of the complete housekeeping strategy that we're using by now? If so, I could definitely try to implement something like that, but I'd rather want to do it as in-code documentation.

    • What is still missing is an overview that details how exactly the current strategy looks like and why it looks that way. But the issue I have with this is that it would grow outdated quite fast if one forgot to update it. So instead, I have been trying to be as verbose as possible with the in-code comments to explain all the different choices I have been doing there. It's easier to update that documentation while at it.

      Yeah, I agree. I guess it's a normal mental state where an area moves fast and changes fast. I really appreciate your effort when you spent a lot of time on in-code documentations, as well as commit messages. Combining commits and codes, I think I can get the full vision and reasons behind each change. Sorry for bragging about this :pray:

    • @qmnguyen0711 There really is no need to be sorry -- it is a valid concern if it is hard to understand the bigger picture, and I would love to improve the situation if required.

      As the driving force behind this kind of work that spans over many releases one tends to have a clear picture of what one is aiming for, up to the point where things are "obvious". "Obvious" of course typically only means obvious to oneself as compared to obvious for other involved parties, and this is something that is easy to miss.

      Also, even if I wrote a bunch of documentation, it does not mean that other involved parties actually know that this documentation even exists in the first place. They weren't part of all the steps, so it's on me if I didn't clearly communicate where to find them.

      So again, no need to be apologize in any way. Let me know if you think it would help to have documentation of the overall strategy and the motivations in our code and I'll happily provide it.

    • Please register or sign in to reply
  • Quang-Minh Nguyen added this merge request to the merge train at position 2

    added this merge request to the merge train at position 2

  • mentioned in commit 92f31e24

  • Patrick Steinhardt mentioned in merge request !5510 (merged)

    mentioned in merge request !5510 (merged)

  • added workflowstaging label and removed workflowcanary label

  • Patrick Steinhardt mentioned in merge request !5607 (merged)

    mentioned in merge request !5607 (merged)

  • Please register or sign in to reply
    Loading