Follow option in FindCommitsRequest causes high CPU, long request duration
Summary
Similar to #4749 (closed) and gitlab#421035
In large monorepos, git log with --follow can utilise a large amount of CPU resources and take over 40s. This occurs when attempting to view a specific file's commit history in the UI. It seems that Rails makes the FindCommitsRequest gRPC request with follow set to true if the number of file paths is exactly one: https://gitlab.com/gitlab-org/gitlab/-/blob/e163e9eea397a9957062e26bd95278ca76831c54/app/models/repository.rb#L156
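To illustrate the behaviour described above, here is a minimal shell sketch. It is not the actual Rails or Gitaly code: build_log_args is a hypothetical helper that collapses the Rails-side decision (follow only when exactly one path is given) and the resulting git log arguments into a single function. The --format and --max-count values are copied from the command in the Investigation section.

```shell
# Hypothetical helper, for illustration only (not part of GitLab or Gitaly).
# Usage: build_log_args <ref> [<path>...]
# Prints the git log arguments that would result, adding --follow only
# when exactly one path is supplied.
build_log_args() {
  ref="$1"
  shift
  args="log --format=%H --max-count=40"
  # Assumption based on the description above: follow is enabled
  # only when the number of paths is exactly one.
  if [ "$#" -eq 1 ]; then
    args="$args --follow"
  fi
  printf '%s --end-of-options %s --' "$args" "$ref"
  for p in "$@"; do
    printf ' %s' "$p"
  done
  printf '\n'
}

# One path: --follow is included.
build_log_args refs/heads/main path/to/file
# Two paths: --follow is omitted.
build_log_args refs/heads/main path/a path/b
```

Under this assumption, every single-file history view would take the slower --follow code path, which matches the reports of slow file history pages in large monorepos.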
Investigation
An affected customer has isolated the git command taking a long time in ps, and has confirmed that manually running the same command with the same options also takes a long time. If they remove --follow, the command executes much more quickly:
$ time git -c gc.auto=0 -c maintenance.auto=0 -c core.autocrlf=input -c core.useReplaceRefs=false -c core.fsync=objects,derived-metadata,reference -c core.fsyncMethod=fsync -c core.packedRefsTimeout=10000 -c core.filesRefLockTimeout=1000 -c core.bigFileThreshold=50m -c gc.aggressiveDepth=20 -c gc.aggressiveWindow=150 log --format=%H --max-count=40 --follow --end-of-options refs/heads/current/stage1 -- <redacted>
<git log output>
real 1m16.820s
user 0m0.000s
sys 0m0.000s
$ time git -c gc.auto=0 -c maintenance.auto=0 -c core.autocrlf=input -c core.useReplaceRefs=false -c core.fsync=objects,derived-metadata,reference -c core.fsyncMethod=fsync -c core.packedRefsTimeout=10000 -c core.filesRefLockTimeout=1000 -c core.bigFileThreshold=50m -c gc.aggressiveDepth=20 -c gc.aggressiveWindow=150 log --format=%H --max-count=40 --end-of-options refs/heads/current/stage1 -- <redacted>
<git log output>
real 0m2.352s
user 0m0.015s
sys 0m0.015s
Infrastructure
10k Cloud Native Hybrid reference architecture with Gitaly Cluster
Impact
Viewing a file's commit history takes a very long time
Workaround
There is no workaround currently