Consider having `all_objects` use `git cat-file --batch-check --batch-all-objects`
When fetching LFS pointers from a repo, we use `all_objects` to find blobs. We might be able to halve how long this takes by using `git cat-file --batch-check --batch-all-objects` instead of `git rev-list --objects --all`; with `gitlab-ee`, this goes from around 6s to around 3s. We might also gain some additional time if we filter that output by type (`blob`) and by size, instead of making additional calls through Rugged.
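As a rough sketch of the filtering idea (not how `all_objects` is implemented), the default `--batch-check` output is one `<oid> <type> <size>` line per object, so type and size can be filtered without a second lookup per object. The names `parse_batch_check`, `small_blobs`, and `candidate_pointer_oids`, and the 1024-byte cutoff, are assumptions made for illustration only; Python is used here just to keep the sketch self-contained.

```python
import subprocess
from typing import Iterable, Iterator, List, NamedTuple


class ObjectInfo(NamedTuple):
    oid: str
    type: str
    size: int


def parse_batch_check(lines: Iterable[str]) -> Iterator[ObjectInfo]:
    """Parse default `--batch-check` lines of the form '<oid> <type> <size>'."""
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        oid, obj_type, size = parts
        yield ObjectInfo(oid, obj_type, int(size))


def small_blobs(objects: Iterable[ObjectInfo], max_size: int = 1024) -> List[str]:
    """Keep only blobs no larger than max_size (hypothetical cutoff in bytes)."""
    return [o.oid for o in objects if o.type == "blob" and o.size <= max_size]


def candidate_pointer_oids(repo_path: str, max_size: int = 1024) -> List[str]:
    """Run git once and return the ids of blobs small enough to be LFS pointers."""
    proc = subprocess.run(
        ["git", "-C", repo_path, "cat-file", "--batch-check", "--batch-all-objects"],
        check=True,
        capture_output=True,
        text=True,
    )
    return small_blobs(parse_batch_check(proc.stdout.splitlines()), max_size)
```

The returned ids are only candidates: each blob would still need its content checked to confirm it really is an LFS pointer.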
The following discussion from !18871 (merged) should be addressed:
- @jamedjo commented on a discussion: (+12 comments)
I just looked up other approaches and came across `git cat-file --batch-check --batch-all-objects`. This also gives us the size and type for us to filter on, and looks to take about half the time, so we could open an MR to investigate moving to that:
lfs-lab3.git|master» time git rev-list --objects --all | wc
945292 1701436 56073287
git rev-list --objects --all 6.83s user 0.41s system 98% cpu 7.369 total
wc 0.25s user 0.23s system 6% cpu 7.367 total
lfs-lab3.git|master» time git rev-list --objects --all | wc
945292 1701436 56073287
git rev-list --objects --all 6.60s user 0.39s system 99% cpu 7.047 total
wc 0.25s user 0.22s system 6% cpu 7.046 total
lfs-lab3.git|master» time git cat-file --batch-check --batch-all-objects | wc
945374 2836122 48019888
git cat-file --batch-check --batch-all-objects 2.96s user 0.11s system 99% cpu 3.099 total
wc 0.17s user 0.03s system 6% cpu 3.097 total
lfs-lab3.git|master» time git cat-file --batch-check --batch-all-objects | wc
945374 2836122 48019888
git cat-file --batch-check --batch-all-objects 2.92s user 0.13s system 97% cpu 3.113 total
wc 0.18s user 0.03s system 6% cpu 3.111 total