Merge Request page rendering performance degrades substantially with a high commit count
Summary
As part of our increased performance testing efforts, we've started identifying slow areas in GitLab and raising them as issues. One such area is the Merge Request page, where we've found significantly slow rendering and response times when the number of commits is high (6000+). This affects each of the page's tabs, Discussion (with no notes), Commits and Changes:
- Discussion (no notes) = ~7-10s
- Commits = ~15s
- Changes = ~45s
Steps to reproduce
- Import the gitlabhq project as detailed here in the Performance Toolkit's documentation.
- Once import has completed you should be able to view the issue immediately by loading the page for the MR that has over 6000 commits (10495) - http://localhost:3000/qa-perf-testing/gitlabhq/merge_requests/10495
- The same MR and performance issue can be seen on one of the Quality team's reference environments that's designed to handle up to 10k users - http://10k.testbed.gitlab.net/qa-perf-testing/gitlabhq/merge_requests/10495/
We used SiteSpeed to do full profiling for this page. You can run this against the 10k environment via docker as follows:
docker run --shm-size=1g --rm -v "$(pwd)":/sitespeed.io sitespeedio/sitespeed.io --outputFolder sitespeed-results --browsertime.pageCompleteWaitTime 20000 --browsertime.pageCompleteCheckInactivity true --browsertime.viewPort "1366x1080" -n 1 <URL>
Results will be found in the sitespeed-results folder, in whatever location on your host you ran the command from.
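To profile each tab separately, the same invocation can be scripted. The sketch below only assembles the argument list from the command shown above and uses no flags beyond those. The `/commits` and `/diffs` tab paths and the per-tab output folder names are assumptions made for illustration; they are not stated in this issue.

```python
import os
import subprocess

# MR URL taken from this issue; the tab suffixes below are assumed.
MR_URL = "http://10k.testbed.gitlab.net/qa-perf-testing/gitlabhq/merge_requests/10495"
TAB_SUFFIXES = {"discussion": "", "commits": "/commits", "changes": "/diffs"}


def sitespeed_command(url, output_folder, workdir=None):
    """Build the docker argument list using only the flags from the issue."""
    workdir = workdir or os.getcwd()
    return [
        "docker", "run", "--shm-size=1g", "--rm",
        "-v", f"{workdir}:/sitespeed.io",
        "sitespeedio/sitespeed.io",
        "--outputFolder", output_folder,
        "--browsertime.pageCompleteWaitTime", "20000",
        "--browsertime.pageCompleteCheckInactivity", "true",
        "--browsertime.viewPort", "1366x1080",
        "-n", "1",
        url,
    ]


def run_all_tabs(dry_run=True):
    """Print (or run, when dry_run is False) one sitespeed pass per tab."""
    for tab, suffix in TAB_SUFFIXES.items():
        # Hypothetical folder naming so the runs do not overwrite each other.
        cmd = sitespeed_command(MR_URL + suffix, f"sitespeed-results-{tab}")
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Calling `run_all_tabs()` prints the three commands without executing anything; `run_all_tabs(dry_run=False)` would actually invoke Docker.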
What is the current bug behavior?
The page and its tabs can take a very long time to load, especially the Changes tab.
What is the expected correct behavior?
At a minimum, the main page itself should load under the GitLab Speed Index target of 2s (more info on Speed Index can be found here).
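To put the gap in numbers, the sketch below divides the approximate timings from the Summary by the 2s target. It treats those timings as directly comparable to a Speed Index value, which is a simplification, since Speed Index measures visual progress rather than total load time.

```python
TARGET_S = 2.0  # Speed Index target quoted in this issue

# Approximate (low, high) timings in seconds, copied from the Summary.
observed = {
    "Discussion": (7.0, 10.0),
    "Commits": (15.0, 15.0),
    "Changes": (45.0, 45.0),
}


def times_over_target(low, high, target=TARGET_S):
    """Return each bound expressed as a multiple of the target."""
    return low / target, high / target


for tab, (low, high) in observed.items():
    lo_x, hi_x = times_over_target(low, high)
    if lo_x == hi_x:
        print(f"{tab}: ~{lo_x:.1f}x the target")
    else:
        print(f"{tab}: ~{lo_x:.1f}x to {hi_x:.1f}x the target")
```

Under that simplification, Discussion is roughly 3.5x to 5x over the target, Commits 7.5x and Changes 22.5x.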
SiteSpeed Full Results
Attached is a zip of the full results of a SiteSpeed run - 10k-sitespeed-results.7z
Local GDK fast-stats
Testing locally on GDK produced results similar to those from the 10k environment above. On this environment we were also able to extract more detail from the logs with fast-stats:
~/Dev/tools/fast-stats development_json.log --type production_json
CONTROLLER                                               | COUNT | RPS | PERC99   | PERC95   | MEDIAN  | MAX      | MIN     | SCORE     | % FAIL
---------------------------------------------------------|-------|-----|----------|----------|---------|----------|---------|-----------|-------
Projects::MergeRequests::ContentController#widget        | 93    | inf | 3878.98  | 3735.98  | 2953.87 | 4030.02  | 2239.30 | 360745.47 | 0.00
Projects::MergeRequestsController#show                   | 13    | inf | 14290.68 | 9630.52  | 4013.40 | 15455.72 | 95.89   | 185778.83 | 0.00
Projects::NotesController#index                          | 298   | inf | 519.94   | 423.69   | 151.74  | 559.13   | 111.53  | 154941.38 | 0.00
Projects::RawController#show                             | 40    | inf | 802.94   | 340.57   | 86.72   | 920.01   | 61.10   | 32117.59  | 0.00
Projects::MergeRequestsController#commits                | 2     | inf | 10266.08 | 10035.22 | 7438.09 | 10323.79 | 4552.39 | 20532.15  | 0.00
Projects::MergeRequestsController#ci_environments_status | 38    | inf | 426.45   | 361.37   | 46.33   | 460.21   | 33.37   | 16205.15  | 0.00
Peek::ResultsController#show                             | 48    | inf | 311.54   | 206.34   | 15.52   | 393.04   | 4.64    | 14954.02  | 0.00
Projects::MergeRequests::DiffsController#show            | 2     | inf | 6722.11  | 6687.25  | 6294.99 | 6730.83  | 5859.14 | 13444.23  | 0.00
RootController#index                                     | 1     | inf | 9486.87  | 9486.87  | 9486.87 | 9486.87  | 9486.87 | 9486.87   | 0.00
Projects::MergeRequestsController#discussions            | 5     | inf | 688.46   | 622.32   | 284.06  | 705.00   | 202.58  | 3442.32   | 0.00
Gitlab::RequestForgeryProtection::Controller#index       | 2     | inf | 6.18     | 6.08     | 4.96    | 6.20     | 3.72    | 12.35     | 0.00
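As a reading aid for the fast-stats output, the sketch below copies five of the Merge Request controller rows by hand and ranks them by median latency. It also checks an observation about the figures: SCORE looks like COUNT multiplied by PERC99. That relationship is inferred from the numbers in this issue, not taken from the fast-stats documentation, so treat it as an assumption.

```python
# (controller, count, perc99, median, score) copied from the fast-stats output.
ROWS = [
    ("Projects::MergeRequests::ContentController#widget", 93, 3878.98, 2953.87, 360745.47),
    ("Projects::MergeRequestsController#show", 13, 14290.68, 4013.40, 185778.83),
    ("Projects::MergeRequestsController#commits", 2, 10266.08, 7438.09, 20532.15),
    ("Projects::MergeRequests::DiffsController#show", 2, 6722.11, 6294.99, 13444.23),
    ("Projects::MergeRequestsController#discussions", 5, 688.46, 284.06, 3442.32),
]


def score_matches(count, perc99, score, rel_tol=1e-4):
    """True if score is within rel_tol of count * perc99 (assumed relation)."""
    return abs(count * perc99 - score) <= rel_tol * score


def by_median(rows):
    """Rows sorted from slowest to fastest median latency."""
    return sorted(rows, key=lambda r: r[3], reverse=True)


for name, count, perc99, median, score in by_median(ROWS):
    flag = "ok" if score_matches(count, perc99, score) else "differs"
    print(f"{median:>8.2f}  {name}  (count*p99 vs score: {flag})")
```

Ranking by median rather than SCORE separates endpoints that are slow on every request (`#commits`, `DiffsController#show`) from those that score highly mainly because they are called often (`#widget`).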
Equivalent API Results
For comparison, we also tested the equivalent API endpoints and found that they were also substantially slow by typical API standards, though nowhere near as slow as the page itself:
Environment: http://10k.testbed.gitlab.net
Version: 12.0.2-ee ef76b54fc1e
NAME | RESULT | DURATION | P95 | RPS_COUNT | RPS_MEAN
-----------------------------------------------------|--------|----------|------------|-----------|-------------
api_v4_projects_merge_requests_merge_request | Passed | 30.0s | 146.37ms | 5415 | 180.499432/s
api_v4_projects_merge_requests_merge_request_changes | Passed | 30.0s | 131.92ms | 5432 | 181.066221/s
api_v4_projects_merge_requests_merge_request_commits | Passed | 30.0s | 9965.99ms | 649 | 21.633277/s
api_v4_projects_merge_requests_merge_request_notes | Passed | 30.0s | 225.73ms | 5293 | 176.432752/s
A separate issue has been raised for the Commits API endpoint.
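The API figures can be cross-checked with a few lines of arithmetic. The sketch below copies the four rows by hand, picks out the endpoint with the highest P95, and compares RPS_MEAN against RPS_COUNT divided by the 30.0s duration. That RPS_MEAN is derived this way is an assumption based on the numbers, not something the tool's output states.

```python
DURATION_S = 30.0  # each test ran for 30.0s according to the table

# (endpoint, p95_ms, request_count, reported_rps) copied from the table.
API_ROWS = [
    ("merge_request", 146.37, 5415, 180.499432),
    ("merge_request_changes", 131.92, 5432, 181.066221),
    ("merge_request_commits", 9965.99, 649, 21.633277),
    ("merge_request_notes", 225.73, 5293, 176.432752),
]


def derived_rps(count, duration_s=DURATION_S):
    """Requests per second implied by the request count and test duration."""
    return count / duration_s


def slowest(rows):
    """Row with the highest P95 latency."""
    return max(rows, key=lambda r: r[1])


name, p95, _, _ = slowest(API_ROWS)
next_p95 = max(r[1] for r in API_ROWS if r[0] != name)
print(f"slowest P95: {name} at {p95:.2f}ms")
print(f"next slowest P95: {next_p95:.2f}ms, a factor of {p95 / next_p95:.0f} lower")
for row_name, _, count, reported in API_ROWS:
    print(f"{row_name}: {derived_rps(count):.2f}/s derived vs {reported:.2f}/s reported")
```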
Resource Metrics
A snapshot of the 10k environment's metrics (CPU, memory, etc.) can be found here - https://snapshot.raintank.io/dashboard/snapshot/633jaGYk26jR4sX2FXuufCTBJQSv6Cm7