Connectivity check in DisconnectGitAlternates is unreasonably expensive
When disconnecting a repository from its object pool, we first hard-link all packfiles and loose objects from the pool into the pool member and then remove the alternates link. To verify that the pool member is still consistent afterwards, we perform a connectivity check that walks over all objects and confirms that none are missing after disconnecting from the pool.
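The disconnect step described above can be sketched roughly as follows. This is a simplified illustration and not the actual implementation: the function name `disconnect_from_pool` and its parameters are hypothetical, and it assumes that only files below `pack/` and the two-character loose-object directories need to be linked.

```python
import os
from pathlib import Path

HEX_DIGITS = set("0123456789abcdef")


def disconnect_from_pool(member_objects: Path, pool_objects: Path) -> int:
    """Hard-link object data from the pool into the member, then drop the alternates file.

    Both arguments point at an `objects` directory. Returns the number of
    files that were linked. Hypothetical helper for illustration only.
    """
    linked = 0
    for source in pool_objects.rglob("*"):
        if not source.is_file():
            continue
        relative = source.relative_to(pool_objects)
        top = relative.parts[0]
        # Assumption: only packfiles and loose objects (directories named
        # with two hex digits) are relevant; everything else is skipped.
        is_loose_dir = len(top) == 2 and set(top) <= HEX_DIGITS
        if top != "pack" and not is_loose_dir:
            continue
        target = member_objects / relative
        if target.exists():
            # The member already has this file, nothing to do.
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        os.link(source, target)
        linked += 1

    # Removing the alternates file is what actually disconnects the member.
    alternates = member_objects / "info" / "alternates"
    if alternates.exists():
        alternates.unlink()
    return linked
```

Because the alternates file is removed only after all links have been created, the member never loses access to an object halfway through. The connectivity check is then the safety net for anything this step missed.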
This connectivity check is currently implemented via git fsck --connectivity-only. We have seen cases, though, where this command exhibits adverse performance, peaking at about 80GB of RAM before the process was killed. From a cursory read, the connectivity check seems rather naive: it simply iterates over all loose objects and packs, and then marks every such object as reachable. This means that it will also happily iterate again over objects that are present in multiple packs, and that it includes unreachable objects, which we don't even care about.
We should check whether it makes sense to implement an alternative.
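One candidate worth evaluating is a reachability walk that starts from the references instead of from the object database. The sketch below is only an assumption about what such an alternative could look like, not a decided approach: the helper names are hypothetical, and it relies on git rev-list --objects --all --quiet exiting with a non-zero status when an object reachable from a ref is missing. It walks only from refs, so whether that covers everything the current check covers would need to be verified.

```python
import subprocess
from typing import List


def rev_list_connectivity_args(repo_path: str) -> List[str]:
    """Build the argument vector for a ref-based connectivity walk.

    Hypothetical helper: --objects includes trees and blobs in the walk,
    --all starts from every ref, and --quiet suppresses the object listing
    so that only the exit status matters.
    """
    return ["git", "-C", repo_path, "rev-list", "--objects", "--all", "--quiet"]


def is_connected(repo_path: str) -> bool:
    """Return True if every object reachable from the refs is present.

    Assumption: a missing object makes `git rev-list` exit non-zero.
    """
    result = subprocess.run(
        rev_list_connectivity_args(repo_path),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
        check=False,
    )
    return result.returncode == 0
```

Such a walk visits each reachable object once and never touches unreachable ones, which addresses both concerns raised above. Its memory and runtime behavior on the affected repositories would still have to be measured.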