
git: Speed up fetches in repos with many refs

Patrick Steinhardt requested to merge pks-git-patch-speed-up-fetch into master

Recently, one of our repos has accumulated so many references that FetchSourceBranch() always times out. The root cause is that Git has to load all commits of the source repo in order to negotiate which refs it has in common with the remote repository when we execute git-fetch(1). These lookups currently always hit the object database, up to the point where git-fetch(1) is heavily dominated by decompressing and parsing objects from disk.

To fix this, I'm currently upstreaming a patch to git.git which optimizes the lookup to make use of the commit-graph: if a commit is part of the commit-graph, then we don't need to hit the object database but can instead parse it via this graph, which is a lot more efficient. Benchmarks in the repo which is causing the problems for us have shown that this brings down the time to fetch from 44 seconds to 19 seconds, which is sufficient to unblock FetchSourceBranch() again.
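To illustrate the idea, here is a toy sketch in Python. It is not Git's implementation and does not use Git's on-disk formats: the `ToyRepo` class, its fields, and the `load_tips` helper are hypothetical names made up for this example. It only models the control flow described above, where a commit found in the commit-graph is answered from pre-parsed data and only the remaining commits pay for decompressing and parsing an object.

```python
import zlib


class ToyRepo:
    """Toy model of a repository; hypothetical, not Git's real data structures."""

    def __init__(self):
        self.odb = {}           # oid -> zlib-compressed commit text ("object database")
        self.commit_graph = {}  # oid -> pre-parsed list of parent oids
        self.odb_reads = 0      # how often we had to decompress and parse an object

    def add_commit(self, oid, parents, in_graph):
        raw = "tree 0000\n" + "".join(f"parent {p}\n" for p in parents) + "\nmessage\n"
        self.odb[oid] = zlib.compress(raw.encode())
        if in_graph:
            self.commit_graph[oid] = list(parents)

    def parents_via_odb(self, oid):
        # Slow path: decompress the object and parse its header lines.
        self.odb_reads += 1
        text = zlib.decompress(self.odb[oid]).decode()
        header = text.split("\n\n", 1)[0]
        return [line.split(" ", 1)[1]
                for line in header.splitlines()
                if line.startswith("parent ")]

    def parents(self, oid, use_graph=True):
        # Fast path: commits covered by the commit-graph need no object read.
        if use_graph and oid in self.commit_graph:
            return self.commit_graph[oid]
        return self.parents_via_odb(oid)


def load_tips(repo, tips, use_graph):
    """Look up every ref tip and return how many object reads that took."""
    repo.odb_reads = 0
    for oid in tips:
        repo.parents(oid, use_graph=use_graph)
    return repo.odb_reads


repo = ToyRepo()
repo.add_commit("a", [], in_graph=True)
repo.add_commit("b", ["a"], in_graph=True)
repo.add_commit("c", ["b"], in_graph=False)  # newer than the commit-graph

reads_without = load_tips(repo, ["a", "b", "c"], use_graph=False)
reads_with = load_tips(repo, ["a", "b", "c"], use_graph=True)
```

In this toy setup, only the commit that is missing from the graph still triggers an object read, while both paths return the same parents. The real savings depend on how much of the repo the commit-graph covers.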

While our process states that we don't apply any patches to Git before they have hit "next", the problem is significant enough to make an exception here given that community contributors cannot create merge requests against gitlab-org/gitlab anymore. Furthermore, the patch itself is simple enough and has only received positive feedback from maintainers so far [1].

Apply the patch to Git to speed up git-fetch(1) in repos with many refs and unblock community contributors again.

Changelog: performance

Part of gitlab#336657 (closed)

Part of git#94 (closed)
