Fix slow fetches for repositories using object deduplication

Stan Hu requested to merge sh-fix-slow-fetches-from-forked-repos into master

In gitaly!1297 (merged), we introduced a mechanism to prevent accidental deletion of objects from object pools. The idea is that if an object becomes "dangling", meaning there is no ref pointing to it, we just add a new ref to un-dangle it. It turns out this can create too many refs.

If the pool repository contains many of these extra refs, a fetch from the parent repository into one of its forks can take an extraordinarily long time, because git runs git for-each-ref against the pool repository and walks every single ref in it. This can cause a simple fetch to take minutes.

To avoid this problem, we use a strategy similar to the one we employed in the git push case (!3364 (merged)). Git added a config option, core.alternateRefsCommand, to fix this issue. From https://github.com/git/git/commit/465e73fff380808f0ba3fb17984ab8636afb6405:

When pushing into a repository that borrows its objects from an alternate object store, "git receive-pack" that responds to the push request on the other side lists the tips of refs in the alternate to reduce the amount of objects transferred. This sometimes is detrimental when the number of refs in the alternate is absurdly large, in which case the bandwidth saved in potentially fewer objects transferred is wasted in excessively large ref advertisement. The alternate refs that are advertised are now configurable with a pair of configuration variables.

In this case, we specify a command that exits immediately and returns no alternate refs, because there's no sense in iterating through the refs in the pool repository.
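As a rough illustration of the idea, and not the actual Gitaly change, the sketch below builds a git fetch command line that overrides core.alternateRefsCommand for a single invocation. The helper name git_fetch_argv and the exact command string "exit 0 #" are assumptions made for this example. Git runs the configured command through the shell and passes the alternate's path as its first argument, so the trailing # is meant to turn that path into a comment.

```python
import shlex

# Assumption for this sketch: a shell snippet that exits right away and
# prints no object IDs, so no alternate refs are reported. Git passes the
# alternate's path as an argument; the trailing "#" comments it out.
NO_ALTERNATE_REFS = "exit 0 #"


def git_fetch_argv(remote, refspecs=()):
    """Build a `git fetch` argv that reports no alternate refs.

    Hypothetical helper: `-c` sets the option for this one invocation
    only, leaving the repository's config file untouched.
    """
    return [
        "git",
        "-c", f"core.alternateRefsCommand={NO_ALTERNATE_REFS}",
        "fetch",
        remote,
        *refspecs,
    ]


if __name__ == "__main__":
    # Print the command for inspection instead of running it.
    print(shlex.join(git_fetch_argv("origin")))
```

Passing the option with -c keeps the override scoped to the fetch itself, so other git commands run in the same repository keep the default behavior.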

Closes https://gitlab.com/gitlab-org/gitlab-ce/issues/65985

See also gitaly#1900 (closed)

