Extend heuristic optimizations in OptimizeRepository

OptimizeRepository is currently very limited in what it does: it only repacks repositories, and only when the repository is missing critical caches like the commit-graph or bitmaps. The original intent was for this RPC to gain more heuristics over time, and this is exactly what this MR does.

With this MR in place, a single RPC effectively supersedes almost all RPCs that perform repository optimizations: Cleanup, WriteCommitGraph, RepackIncremental, RepackFull, PackRefs and GarbageCollect. The end result is that OptimizeRepository can now decide, based on the repository's state, whether it needs to be optimized or not, taking both the on-disk state and the repository's size into account.
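To illustrate the idea of deciding from repository state, here is a minimal sketch in Go. It is not the code from this MR: the `repoState` and `plan` types, the `decide` function and all thresholds are hypothetical and chosen only for illustration. It shows how a single entry point could inspect an on-disk snapshot and pick the maintenance steps that the separate RPCs used to cover.

```go
package main

import "fmt"

// repoState is a hypothetical snapshot of what maintenance code
// might observe on disk. None of these names come from Gitaly.
type repoState struct {
	hasCommitGraph bool
	hasBitmap      bool
	packfiles      int
	looseObjects   int
	looseRefs      int
}

// plan lists the maintenance steps chosen for a repository.
type plan struct {
	repackFull        bool
	repackIncremental bool
	writeCommitGraph  bool
	packRefs          bool
}

// needsWork reports whether any step was selected.
func (p plan) needsWork() bool {
	return p.repackFull || p.repackIncremental || p.writeCommitGraph || p.packRefs
}

// Illustrative thresholds only; real values would need tuning.
const (
	maxPackfiles    = 10
	maxLooseObjects = 1024
	maxLooseRefs    = 256
)

// decide maps the observed state to a set of maintenance steps.
func decide(s repoState) plan {
	var p plan

	// An empty repository has nothing to optimize.
	if s.packfiles == 0 && s.looseObjects == 0 {
		return p
	}

	switch {
	case !s.hasBitmap || s.packfiles > maxPackfiles:
		// Missing bitmap or too many packs: consolidate everything.
		p.repackFull = true
	case s.looseObjects > maxLooseObjects:
		// Many loose objects: a cheaper incremental repack suffices.
		p.repackIncremental = true
	}

	p.writeCommitGraph = !s.hasCommitGraph
	p.packRefs = s.looseRefs > maxLooseRefs
	return p
}

func main() {
	state := repoState{hasCommitGraph: true, hasBitmap: true, packfiles: 2, looseObjects: 5000, looseRefs: 12}
	fmt.Printf("%+v\n", decide(state))
}
```

The point of the sketch is that a well-maintained repository yields an empty plan, so calling the entry point frequently is cheap, while a neglected repository gets exactly the steps its state calls for.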

This strategy is superior to the current one, in which Rails blindly schedules different cleanup RPCs after $n operations. Rails cannot tell what the repository actually looks like on disk, so that is the best it can do. Eventually, we'll want to replace most of these call sites in Rails with calls to OptimizeRepository. The most important benefit is that this puts Gitaly in charge of repository maintenance strategies instead of Rails, which also means that we can iterate on them a lot faster and, for example, switch to multi-pack indexes.

For now, this is only used by our nightly maintenance jobs.

Closes #2296 (closed)

Part of #2721 (closed)