GetAllLFSPointers is slow in large repos
We've had a couple of tickets from EE customers where mirror updates on larger repositories fail with a Deadline Exceeded
error in Gitaly. The error occurs while the GetAllLFSPointers
method is running. My guess is that the way we collect these LFS pointers needs to be optimized for larger repos.
I had a call with one of these customers recently and did some digging. We found that running git rev-list --all --filter=blob:limit=200 --in-commit-order --objects
in the repo took 30 seconds and returned 73k lines of output. Running Projects::LfsPointers::LfsListService.new(project).execute
in Rails kept failing until we increased the medium timeout to 180 seconds, at which point the call finished in 150 seconds. This is a "medium" timeout call, so it looks like we normally expect it to finish in under 30 seconds.
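For anyone who wants to reproduce the rev-list measurement on another repository, here is a rough sketch of how the line count and wall-clock time could be captured together. The measure helper is a hypothetical name, not something that exists in GitLab or Gitaly, and timing is only accurate to whole seconds because it relies on date +%s.

```shell
#!/bin/sh
# Hypothetical helper (not part of GitLab or Gitaly): run the given command,
# then print how many lines it wrote to stdout and how many whole seconds
# it took. The exit status of the command itself is not checked.
measure() {
    start=$(date +%s)
    lines=$("$@" | wc -l)
    end=$(date +%s)
    # $((lines)) strips any padding that some wc implementations add.
    echo "lines=$((lines)) seconds=$((end - start))"
}

# Example, run from inside the repository on the Gitaly node:
#   measure git rev-list --all --filter=blob:limit=200 --in-commit-order --objects
```

This only times the Git side of the call. The Rails-side time (150 seconds in our case) also includes whatever Gitaly and Rails do with that output, so the two numbers are not expected to match.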
The customer is running GitLab 11.11 on AWS, with EBS for Git storage (which was quite performant in our tests).
We collected a couple of strace files, one for Gitaly and one for the Rails runner. They should be attached to ticket 122866, linked below.
Tickets (internal-only links):
https://gitlab.zendesk.com/agent/tickets/122866 (the more recent one, where I had the call with the customer)
https://gitlab.zendesk.com/agent/tickets/122482
@zj please let me know if there is anything else we can gather from the customer to understand this issue better. Also, please let me know if there is any sort of immediately helpful workaround we can offer the customer.