500 error when loading older in-process Merge Requests due to slow ListConflictsFiles response
Summary
When users attempt to view in-process MRs submitted several months ago the page loads a minute, eventually returning a 500.
This appears to be triggered by notify_conflict?, which calls into the Gitaly ListConflictFiles service. That service is exceeding 55 seconds, causing a GRPC::Deadline Exceeded error. The customer was able to work around the problem by temporarily bumping their Gitaly timeout from 55 to 120 seconds.
This is occurring on a moderately sized project using the default merge commit setting. The impacted MRs are roughly 200 commits behind HEAD with a diff of ~1,000,000 lines over 200+ files.
Customer's Gitaly performance is good in general, with the exception of ListConflictFiles. strace shows that the offending process appears to be bottlenecked by time spent in userland rather than file or I/O access.
pid actv (ms) wait (ms) user (ms) total (ms) syscalls children
------- ---------- ---------- ---------- ---------- --------- ---------
1380 885.948 4942.070 58367.770 64195.789 227 0
Expand for Gitaly Method stats
| METHOD | COUNT | PERC99 | PERC95 | STDDEV | MAX | MIN | SCORE |
|---|---|---|---|---|---|---|---|
| SSHUploadPack | 127 | 305883.41 | 304052.07 | 99202.63 | 331052.3 | 5.54 | 38847193.1 |
| ListConflictFiles | 19 | 59379.5 | 57959.05 | 25788.72 | 59734.61 | 1.97 | 1128210.5 |
| ListCommitsByOid | 7706 | 11.96 | 9.71 | 2.44 | 18.2 | 2.01 | 92163.8 |
| CommitIsAncestor | 7635 | 7.75 | 7.04 | 2.13 | 13.85 | 1.59 | 59171.3 |
| FindCommit | 9729 | 4.53 | 3.79 | 0.67 | 18.17 | 1.41 | 44072.4 |
| FindCommits | 65 | 364.95 | 349.31 | 100.99 | 366.63 | 2.45 | 23721.8 |
| ListLastCommitsForTree | 22 | 483.34 | 439.83 | 161.52 | 494.91 | 10.23 | 10633.5 |
| CommitLanguages | 1 | 7153.31 | 7153.31 | 0.0 | 7153.31 | 7153.31 | 7153.3 |
| TreeEntry | 424 | 12.5 | 5.27 | 1.75 | 20.47 | 1.96 | 5300.0 |
| SSHReceivePack | 1 | 3875.71 | 3875.71 | 0.0 | 3875.71 | 3875.71 | 3875.7 |
| GetTreeEntries | 176 | 16.02 | 9.25 | 4.04 | 44.27 | 2.3 | 2819.5 |
| CountCommits | 20 | 115.48 | 15.13 | 30.49 | 140.57 | 2.34 | 2309.6 |
| PostUploadPack | 11 | 104.09 | 101.57 | 9.16 | 104.73 | 73.34 | 1145.0 |
| UserMergeBranch | 1 | 1012.77 | 1012.77 | 0.0 | 1012.77 | 1012.77 | 1012.8 |
| GetCommitSignatures | 89 | 9.62 | 5.21 | 1.49 | 9.77 | 1.62 | 856.2 |
| LastCommitForPath | 35 | 23.89 | 19.74 | 5.56 | 25.95 | 3.85 | 836.2 |
| GetBlobs | 29 | 25.98 | 14.61 | 5.73 | 30.36 | 3.09 | 753.4 |
| InfoRefsUploadPack | 14 | 42.63 | 41.96 | 4.41 | 42.79 | 24.37 | 596.8 |
| ListTagNamesContainingCommit | 4 | 149.16 | 147.29 | 10.93 | 149.62 | 123.78 | 596.6 |
| IsRebaseInProgress | 3094 | 0.17 | 0.14 | 0.04 | 0.89 | 0.05 | 526.0 |
| GetNewLFSPointers | 2 | 234.8 | 234.66 | 2.61 | 234.84 | 231.15 | 469.6 |
| FindDefaultBranchName | 42 | 5.39 | 5.3 | 1.12 | 5.41 | 1.59 | 226.4 |
| FindLocalBranches | 1 | 187.57 | 187.57 | 0.0 | 187.57 | 187.57 | 187.6 |
| HasLocalBranches | 7 | 22.09 | 21.96 | 8.19 | 22.12 | 2.12 | 154.6 |
| FetchSourceBranch | 3 | 44.5 | 44.39 | 6.55 | 44.53 | 32.52 | 133.5 |
| CommitsBetween | 7 | 17.07 | 16.38 | 5.29 | 17.25 | 4.05 | 119.5 |
| CommitDelta | 25 | 4.59 | 3.63 | 0.74 | 4.89 | 1.71 | 114.8 |
| CommitDiff | 9 | 12.76 | 11.32 | 3.13 | 13.12 | 2.84 | 114.8 |
| DiffStats | 10 | 11.44 | 9.33 | 2.59 | 11.97 | 3.14 | 114.4 |
| GetLFSPointers | 27 | 2.82 | 2.42 | 0.31 | 2.96 | 1.62 | 76.1 |
| FindBranch | 11 | 5.56 | 5.55 | 0.45 | 5.56 | 4.19 | 61.2 |
| FindMergeBase | 8 | 7.42 | 7.24 | 2.07 | 7.47 | 2.2 | 59.4 |
| RefExists | 17 | 2.93 | 2.37 | 0.42 | 3.07 | 1.29 | 49.8 |
| FindAllBranchNames | 8 | 4.48 | 4.32 | 0.87 | 4.52 | 2.06 | 35.8 |
| GetBlob | 7 | 3.92 | 3.76 | 0.43 | 3.96 | 2.7 | 27.4 |
| FilterShasWithSignatures | 2 | 10.3 | 10.07 | 4.12 | 10.36 | 4.53 | 20.6 |
| Cleanup | 2 | 10.21 | 10.11 | 1.68 | 10.23 | 7.85 | 20.4 |
| FindAllTagNames | 7 | 2.39 | 2.34 | 0.29 | 2.4 | 1.52 | 16.7 |
| RepositoryExists | 46 | 0.3 | 0.26 | 0.08 | 0.33 | 0.04 | 13.8 |
| GetInfoAttributes | 16 | 0.77 | 0.51 | 0.18 | 0.83 | 0.11 | 12.3 |
| RepositorySize | 1 | 10.7 | 10.7 | 0.0 | 10.7 | 10.7 | 10.7 |
| WriteRef | 2 | 2.35 | 2.34 | 0.11 | 2.35 | 2.2 | 4.7 |
| ApplyGitattributes | 1 | 3.16 | 3.16 | 0.0 | 3.16 | 3.16 | 3.2 |
| ServerInfo | 1 | 1.9 | 1.9 | 0.0 | 1.9 | 1.9 | 1.9 |
| DeleteRefs | 1 | 1.79 | 1.79 | 0.0 | 1.79 | 1.79 | 1.8 |
| Check | 11 | 0.06 | 0.06 | 0.01 | 0.06 | 0.03 | 0.7 |
Steps to reproduce
- Create an MR
- Do not merge it
- Wait three months
- Attempt to load MR - timeout or slow load occurs
What is the current bug behavior?
ListConflictFiles is exceeding 55 seconds, causing the MR page to fail to load.
What is the expected correct behavior?
ListConflictFiles runs quickly and the page loads.
Relevant logs and/or screenshots
production.log
Completed 500 Internal Server Error in 55041ms (ActiveRecord: 4.7ms | Elasticsearch: 0.0ms)
GRPC::DeadlineExceeded (4:Deadline Exceeded):
lib/gitlab/gitaly_client/conflict_files_stitcher.rb:15:in `each'
lib/gitlab/gitaly_client/conflict_files_stitcher.rb:15:in `each'
lib/gitlab/gitaly_client/conflicts_service.rb:29:in `any?'
lib/gitlab/gitaly_client/conflicts_service.rb:29:in `conflicts?'
lib/gitlab/git/repository.rb:938:in `can_be_merged?'
app/models/repository.rb:499:in `method_missing'
app/models/merge_request.rb:725:in `check_if_can_be_merged'
app/controllers/projects/merge_requests_controller.rb:272:in `mark_merge_request_mergeable'
app/controllers/projects/merge_requests_controller.rb:38:in `show'
lib/gitlab/i18n.rb:55:in `with_locale'
lib/gitlab/i18n.rb:61:in `with_user_locale'
app/controllers/application_controller.rb:423:in `set_locale'
lib/gitlab/middleware/multipart.rb:103:in `call'
lib/gitlab/request_profiler/middleware.rb:16:in `call'
ee/lib/gitlab/jira/middleware.rb:17:in `call'
lib/gitlab/middleware/go.rb:20:in `call'
lib/gitlab/etag_caching/middleware.rb:13:in `call'
lib/gitlab/middleware/correlation_id.rb:16:in `block in call'
lib/gitlab/correlation_id.rb:15:in `use_id'
lib/gitlab/middleware/correlation_id.rb:15:in `call'
lib/gitlab/middleware/read_only/controller.rb:42:in `call'
lib/gitlab/middleware/read_only.rb:18:in `call'
lib/gitlab/middleware/basic_health_check.rb:25:in `call'
lib/gitlab/request_context.rb:20:in `call'
lib/gitlab/metrics/requests_rack_middleware.rb:29:in `call'
lib/gitlab/middleware/release_env.rb:13:in `call'
production_json.log
{"method":"GET","path":"/group/project/merge_requests/54","format":"html","controller":"Projects::MergeRequestsController","action":"show","status":500,"error":"GRPC::DeadlineExceeded: 4:Deadline Exceeded","duration":55042.49,"view":0.0,"db":4.71,"time":"2019-03-05T00:53:15.503Z","params":[{"key":"namespace_id","value":"group"},{"key":"project_id","value":"project"},{"key":"id","value":"54"}],"remote_ip":"1.1.1.1","user_id":66,"username":"user","ua":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.119 Safari/537.36","gitaly_calls":2,"correlation_id":"DDKqZaEzOt8"}
Gitaly current (same problem, different instance of error)
2019-04-02_17:04:22.11263 time="2019-04-02T17:04:22Z" level=info msg="finished streaming call with code Canceled" correlation_id=SETNdfG9hAa error="
rpc error: code = Canceled desc = rpc error: code = Canceled desc = context canceled" grpc.code=Canceled grpc.meta.auth_version=v2 grpc.meta.client_
name=gitlab-web grpc.method=ListConflictFiles grpc.request.deadline="2019-04-02T17:04:22Z" grpc.request.fullMethod=/gitaly.ConflictsService/ListConf
lictFiles grpc.request.glProjectPath=group/project grpc.request.glRepository=project-34 grpc.request.repoPath=group/project.git grpc.request.repoStorage
=default grpc.request.topLevelGroup=code grpc.service=gitaly.ConflictsService grpc.start_time="2019-04-02T17:03:27Z" grpc.time_ms=55001.2 peer.addre
ss=@ span.kind=server system=grpc
Output of checks
Issue first reported with GitLab EE v11.8.1, still occurring on v11.10.2
I have not been able to reproduce the error on GitLab.com
Results of GitLab environment info
Expand for output related to GitLab environment info
Requested - will add when received from customer.
Results of GitLab application Check
Expand for output related to the GitLab application check
Requested - will add when received from customer.