repository: Add new RPC to prune unreachable objects
When rewriting the repository's history with the BFG Repo-Cleaner, we
potentially accumulate a large number of unreachable objects in the
repository's object database. By default, we'd clean up those objects
after two weeks, which is a rather long time to sit on such a huge
number of objects. To address this use case, the GarbageCollect RPC has
gained a prune parameter: if set, we prune unreachable objects that
haven't been accessed during the last 30 minutes.
The problem with this though is that GarbageCollect does a lot more than only pruning objects: it may end up packing objects or references, writing commit-graphs, writing bitmaps or doing other things. These are all things we want to control ourselves, but instead we let git-gc(1) dictate how the repository is packed.
We're thus about to deprecate all RPCs which directly influence how a repository is packed in favor of OptimizeRepository: this is our "black box" RPC that, from the viewpoint of the caller, does something with the repository to make it great again. And this is by design: callers should not control the way Gitaly handles repository maintenance.
This highlights the need for a new RPC call which only prunes objects that have become unreachable, so that pruning is disentangled from repository maintenance tasks. This commit thus introduces PruneUnreachableObjects, a new RPC which does exactly that: any unreachable loose object that hasn't been touched in the last 30 minutes is going to be pruned.
Note that to make this work correctly, the caller has to do two RPC calls: the first call to OptimizeRepository is required to unpack unreachable objects into loose objects, and 30 minutes later the caller may prune these objects with a second call to PruneUnreachableObjects.
This is no different from right now, even though it's hidden away and
(naturally) used incorrectly by Rails: GarbageCollect would need to be
called twice, first to explode unreachable objects into loose objects
and then, half an hour later, with prune=true to prune them. This is
because Git will only ever consider loose objects for pruning, and the
grace period is determined by inspecting their access time. So the way
Rails does this is broken, and the new RPC call doesn't change that
fact. This is a separate story though and nothing we can fix in Gitaly:
we must retain the grace period to avoid repository corruption.
Changelog: added
Fixes #4041