Performance: repository repacking should be a side effect of replication
Problem to solve
Replicas must be kept properly repacked so that they perform well if a failover occurs. Otherwise, a newly promoted replica could perform very poorly until it has received enough writes.
Proposal
Because replication in Gitaly HA now uses git fetch, it is pointless and impossible to "replicate" a GarbageCollect or RepackXxx RPC call: these calls make changes that are invisible to git fetch.
However, it is important that we keep replicas in a good packed state because this can make a big difference for performance.
I propose we:
- create heuristics that decide when to repack (#2054, closed)
- run the heuristic on the destination each time we replicate a repository, and repack if needed
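The two steps above could be sketched roughly as follows. This is only an illustration, not Gitaly code: `RepoStats`, `Thresholds`, `DefaultThresholds`, `NeedsRepack` and `AfterReplication` are hypothetical names, and the threshold values are placeholders rather than the heuristics from #2054.

```go
package main

// RepoStats is a hypothetical summary of the destination repository's
// object store, gathered after a replication job has finished.
type RepoStats struct {
	LooseObjects int  // objects stored outside any packfile
	Packfiles    int  // number of packfiles in objects/pack
	HasBitmap    bool // whether a pack bitmap index exists
}

// Thresholds holds the limits the heuristic compares against.
type Thresholds struct {
	MaxLooseObjects int
	MaxPackfiles    int
}

// DefaultThresholds uses placeholder values chosen for illustration only.
var DefaultThresholds = Thresholds{MaxLooseObjects: 1024, MaxPackfiles: 10}

// NeedsRepack reports whether the destination looks poorly packed:
// too many loose objects, too many packfiles, or packed without a bitmap.
func NeedsRepack(s RepoStats, t Thresholds) bool {
	if s.LooseObjects > t.MaxLooseObjects {
		return true
	}
	if s.Packfiles > t.MaxPackfiles {
		return true
	}
	return s.Packfiles > 0 && !s.HasBitmap
}

// AfterReplication runs the heuristic on the destination and invokes the
// supplied repack callback only when the heuristic asks for it. It returns
// whether a repack was attempted and any error from the callback.
func AfterReplication(s RepoStats, t Thresholds, repack func() error) (bool, error) {
	if !NeedsRepack(s, t) {
		return false, nil
	}
	return true, repack()
}
```

Passing the repack step in as a callback keeps the decision separate from the action, so the heuristic can be tuned or replaced without touching the replication code path.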
Edited by James Ramsay (ex-GitLab)