housekeeping: Deduplicate objects when alternates change

When creating a fork, we first create the object pool, then create the fork, and finally link the fork to the object pool. As a consequence, the newly created fork does not yet make use of the objects contained in the object pool and instead has them duplicated.

Historically, Rails called the GarbageCollect() RPC afterwards to deduplicate all objects again. This got lost in our transition to the new OptimizeRepository() RPC, where objects may or may not be deduplicated depending on whether our heuristics decide to do a full repack. This means that it will on average take a lot longer before we reap the storage savings in both the origin and fork repositories, if at all.

Introduce a new heuristic in OptimizeRepository() that knows to perform a full repack when the last full repack was performed before the alternates file was last modified. This will cause us to always try to deduplicate objects that have been made obsolete by the changed alternates file.

Closes #5124 (closed).

Edited by Patrick Steinhardt