Mirror fetches are slow on large repositories
For large repositories, mirror-fetches are exceedingly slow, often taking five minutes or more. We should investigate whether this can be improved to a reasonable level.
- Patrick Steinhardt added bugperformance groupgitaly labels
- Patrick Steinhardt mentioned in issue gitaly#3670 (closed)
- Author Maintainer
Case in point, with gitlab-org/gitlab:
Benchmark #1: git clone --bare --no-local ~/Development/gitlab/repositories/@hashed/a6/80/a68072e80f075e89bc74a300101a9e71e8363bdb542182580162553462480a52.git/
  Time (mean ± σ):     83.588 s ±  2.176 s    [User: 180.976 s, System: 65.941 s]
  Range (min … max):   82.050 s … 85.127 s    2 runs
Benchmark #2: git clone --bare --mirror --no-local ~/Development/gitlab/repositories/@hashed/a6/80/a68072e80f075e89bc74a300101a9e71e8363bdb542182580162553462480a52.git/
  Time (mean ± σ):     465.668 s ±  3.376 s    [User: 906.299 s, System: 277.059 s]
  Range (min … max):   463.281 s … 468.055 s    2 runs
Summary
  'git clone --bare --no-local ~/Development/gitlab/repositories/@hashed/a6/80/a68072e80f075e89bc74a300101a9e71e8363bdb542182580162553462480a52.git/' ran
    5.57 ± 0.15 times faster than 'git clone --bare --mirror --no-local ~/Development/gitlab/repositories/@hashed/a6/80/a68072e80f075e89bc74a300101a9e71e8363bdb542182580162553462480a52.git/'
Fetches into that mirror clone take nearly two minutes even on this local machine.
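For reference, the summary figure in the benchmark can be re-derived from the two mean timings. The following is a minimal sketch of that arithmetic; the helper names `speedup_factor` and `percent_reduction` are made up for illustration and are not part of Git or the benchmark tool.

```c
/*
 * Hypothetical helpers (not part of Git or the benchmark tool) that
 * re-derive the summary numbers from two mean wall-clock timings.
 */

/* Factor by which the faster run beats the slower one. */
static double speedup_factor(double slow_seconds, double fast_seconds)
{
	return slow_seconds / fast_seconds;
}

/* Share of wall-clock time saved when going from "before" to "after", in percent. */
static double percent_reduction(double before_seconds, double after_seconds)
{
	return (1.0 - after_seconds / before_seconds) * 100.0;
}
```

Calling `speedup_factor(465.668, 83.588)` gives roughly 5.57, which matches the summary line, and `percent_reduction(465.668, 83.588)` shows that the non-mirror clone needs about 82% less time than the mirror clone.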
- Author Maintainer
With commit graphs, fetches drop to about 77 seconds. That's still slow though, and I assume that it shouldn't be hard to find some low-hanging fruit. In fact, with my patch series in #92 (closed) and #94 (closed), this decreases to 59 seconds. The following patch buys another 40% performance improvement, from 59s to 35s:
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 25740c13df..0e9495abe6 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1131,11 +1131,14 @@ static int store_updated_refs(const char *raw_url, const char *remote_name,
 				continue;
 			}
 
-			commit = lookup_commit_reference_gently(the_repository,
-								&rm->old_oid,
-								1);
-			if (!commit)
-				rm->fetch_head_status = FETCH_HEAD_NOT_FOR_MERGE;
+			commit = lookup_commit_in_graph(the_repository, &rm->old_oid);
+			if (!commit) {
+				commit = lookup_commit_reference_gently(the_repository,
+									&rm->old_oid,
+									1);
+				if (!commit)
+					rm->fetch_head_status = FETCH_HEAD_NOT_FOR_MERGE;
+			}
 
 			if (rm->fetch_head_status != want_status)
 				continue;
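The pattern in the patch is "try the cheap index first, fall back to the expensive lookup only on a miss". Below is a self-contained sketch of that idea outside of Git. Everything in it is a hypothetical stand-in: `struct demo_commit`, the sorted `graph` array playing the role of the commit-graph, the `loose` array playing the role of objects only reachable through the slow path, and the function names. It does not use Git's real `lookup_commit_in_graph` or `lookup_commit_reference_gently`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a commit; this is NOT Git's struct commit. */
struct demo_commit {
	const char *oid; /* object ID as a string */
};

/* Counts how often the expensive path ran, so the effect is observable. */
static int slow_lookups;

/* Stand-in for the commit-graph: sorted by oid so it can be binary-searched. */
static const struct demo_commit graph[] = {
	{ "1111" }, { "2222" }, { "3333" },
};

/* Stand-in for objects that are not covered by the graph. */
static const struct demo_commit loose[] = {
	{ "9999" },
};

/* Fast path: binary search in the sorted graph array. */
static const struct demo_commit *lookup_in_graph(const char *oid)
{
	size_t lo = 0, hi = sizeof(graph) / sizeof(graph[0]);

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int cmp = strcmp(oid, graph[mid].oid);

		if (!cmp)
			return &graph[mid];
		if (cmp < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return NULL;
}

/* Slow path: linear scan over everything, standing in for a full object lookup. */
static const struct demo_commit *lookup_slow(const char *oid)
{
	size_t i;

	slow_lookups++;
	for (i = 0; i < sizeof(graph) / sizeof(graph[0]); i++)
		if (!strcmp(oid, graph[i].oid))
			return &graph[i];
	for (i = 0; i < sizeof(loose) / sizeof(loose[0]); i++)
		if (!strcmp(oid, loose[i].oid))
			return &loose[i];
	return NULL;
}

/* Same shape as the patched code: consult the graph first, fall back on a miss. */
static const struct demo_commit *lookup_commit_fast(const char *oid)
{
	const struct demo_commit *commit = lookup_in_graph(oid);

	if (!commit)
		commit = lookup_slow(oid);
	return commit;
}
```

In this sketch, looking up "2222" never touches the slow path, while "9999" and unknown IDs still go through it, so behavior for objects outside the graph is unchanged and only the common case gets cheaper.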
Edited by Patrick Steinhardt - Author Maintainer
I've sent out the first version of this patch series. Overall, it improves performance from the original 69s to 26s. http://public-inbox.org/git/cover.1629452412.git.ps@pks.im/T/#t
- Author Maintainer
I've got a second version pending, but as per Junio's request I'll wait for topics this series depends on to land in next.
- Author Maintainer
I've sent out the second version of this patch series: http://public-inbox.org/git/cover.1629452412.git.ps@pks.im/T/#m16191db34a8e9072506ce504f956dc48186294ed
- Author Maintainer
Sent out another small patch that gives another speedup of 500ms, or 2%. http://public-inbox.org/git/ccd03e685af0f5cf25c68272a758fc88d115e37a.1629899211.git.ps@pks.im/T/#u
- Author Maintainer
And another patch which avoids formatting output with --quiet for a 7% speedup: http://public-inbox.org/git/40c385048a023dbd447c5f0b4c95ff32485e1e23.1629906005.git.ps@pks.im/
- Author Maintainer
Version 3 of the initial patch series: https://public-inbox.org/git/cover.1629452412.git.ps@pks.im/T/#m503a7fa3274d2f2498c0c3911a0077794e90efbb
- Patrick Steinhardt assigned to @pks-t
- Patrick Steinhardt changed milestone to %14.3
- Maintainer
Setting label(s) Category:Gitaly devopscreate sectiondev based on groupgitaly.
- 🤖 GitLab Bot 🤖 added Category:Gitaly devopscreate sectiondev labels
- Patrick Steinhardt added gitseen workflowin review labels
- Patrick Steinhardt mentioned in issue gitlab-com/gl-infra/scalability#1257
- Patrick Steinhardt mentioned in issue gitlab#336657 (closed)
- Patrick Steinhardt mentioned in merge request gitaly!3848 (merged)
- Christian Couder closed with merge request gitaly!3848 (merged)
- Christian Couder mentioned in commit gitaly@ff2f210d
- Patrick Steinhardt added to epic &6733
- Patrick Steinhardt mentioned in merge request gitlab-com/www-gitlab-com!96044 (merged)
- Patrick Steinhardt mentioned in issue gitlab-com/gl-infra/production#6222 (closed)
- DJ Mountney mentioned in issue gitlab-com/gl-infra/production#6221 (closed)