
Merge Request page rendering performance substantially degrades with high commits count

Summary

With our increased performance testing efforts we've started to identify slow areas in GitLab and raise them as issues. One such area is the Merge Request page, where we've found significantly slow rendering and response times when the number of commits is high (6000+). This affects each of its tabs, namely Discussion (with no notes), Commits and Changes:

Discussion (no notes) = ~7-10s

Commits = ~15s

Changes = ~45s

Additionally, the above times are only for single loads. When testing with sustained load we've found these times to degrade considerably. The following are results from testing against our reference 10k environment with a target of 200 requests per second (times shown in ms):

Environment:    http://10k.testbed.gitlab.net
Version:    12.0.2-ee ef76b54fc1e
NAME                                                 | RESULT | DURATION | P95        | RPS_COUNT | RPS_MEAN    
-----------------------------------------------------|--------|----------|------------|-----------|-------------
projects_merge_requests_controller_show_html         | Passed | 30.0s    | 19351.88ms | 177       | 5.899967/s  
projects_merge_requests_diffs_controller_show_json   | Failed | 30.0s    | 20900.65ms | 71        | 2.366654/s  

Note the second test reports as failed here due to some responses (around 5%) coming back as 500s under strain.
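As a rough illustration of how far these results fall short of the target, the sketch below recomputes the mean request rate from the RPS_COUNT and DURATION columns and expresses it as a percentage of the 200 requests per second target. It assumes RPS_MEAN is simply count divided by duration; the reported values are marginally lower, presumably because the actual run lasted slightly longer than 30s. The function names are illustrative only and are not part of the Performance Toolkit.

```python
# Figures copied from the results table above; the target rate is the one
# stated in the text. Function names here are hypothetical helpers.
TARGET_RPS = 200
DURATION_S = 30.0

REQUEST_COUNTS = {
    "projects_merge_requests_controller_show_html": 177,
    "projects_merge_requests_diffs_controller_show_json": 71,
}


def mean_rps(request_count, duration_s=DURATION_S):
    """Mean requests per second, assuming RPS_MEAN = RPS_COUNT / DURATION."""
    return request_count / duration_s


def percent_of_target(request_count, target_rps=TARGET_RPS):
    """Achieved rate as a percentage of the target rate."""
    return mean_rps(request_count) / target_rps * 100


for name, count in REQUEST_COUNTS.items():
    print(f"{name}: {mean_rps(count):.2f} req/s "
          f"({percent_of_target(count):.2f}% of target)")
```

Under that assumption both endpoints achieve less than 3% of the target rate.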

Steps to reproduce

  1. Import the gitlabhq project as detailed here in the Performance Toolkit's documentation.
  2. Once import has completed you should be able to view the issue immediately by loading the page for the MR that has over 6000 commits (10495) - http://localhost:3000/qa-perf-testing/gitlabhq/merge_requests/10495
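For a quick check without a browser, the following minimal sketch times a single request to each tab of the MR. It measures server response time only, not rendering, so it will not match the SiteSpeed numbers. The `/commits` and `/diffs` paths are assumptions about how the tabs are routed on this version, and the script assumes the project is readable without signing in; otherwise a session cookie or token would need to be added.

```python
import time
import urllib.request

BASE = "http://localhost:3000/qa-perf-testing/gitlabhq/merge_requests/10495"

# Assumed tab routes; adjust if your instance uses different paths.
TABS = {
    "Discussion": BASE,
    "Commits": BASE + "/commits",
    "Changes": BASE + "/diffs",
}


def http_get(url):
    """Fetch the URL, read the full body and return the HTTP status code.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    with urllib.request.urlopen(url, timeout=120) as response:
        response.read()
        return response.status


def time_loads(urls, fetch=http_get, clock=time.perf_counter):
    """Return {name: (status, elapsed_seconds)} for one request per URL."""
    timings = {}
    for name, url in urls.items():
        start = clock()
        status = fetch(url)
        timings[name] = (status, clock() - start)
    return timings
```

With the GDK running, `time_loads(TABS)` returns one status and elapsed time per tab. The `fetch` and `clock` parameters exist only so the timing logic can be exercised without a live instance.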

We used SiteSpeed to do full profiling for this page. You can run this against the 10k environment via docker as follows:

docker run --shm-size=1g --rm -v "$(pwd)":/sitespeed.io sitespeedio/sitespeed.io --outputFolder sitespeed-results --browsertime.pageCompleteWaitTime 20000 --browsertime.pageCompleteCheckInactivity true --browsertime.viewPort "1366x1080" -n 1 <URL>

Results will be written to the sitespeed-results folder in the directory on your host where you ran the command.

What is the current bug behavior?

The page and its tabs can take a very long time to load, especially the Changes tab.

What is the expected correct behavior?

The main page itself should at least load within the GitLab Speed Index target of 2s (more info on Speed Index can be found here).

SiteSpeed Full Results

Attached is an archive of the full results of a SiteSpeed run - 10k-sitespeed-results.7z

Local GDK fast-stats

Testing locally on GDK produced results similar to those from the 10k environment above. In this environment we were also able to extract more detail from the logs with fast-stats:

~/Dev/tools/fast-stats development_json.log --type production_json
CONTROLLER                                                  COUNT     RPS    PERC99    PERC95    MEDIAN       MAX       MIN      SCORE    % FAIL
Projects::MergeRequests::ContentController#widget              93     inf   3878.98   3735.98   2953.87   4030.02   2239.30  360745.47      0.00
Projects::MergeRequestsController#show                         13     inf  14290.68   9630.52   4013.40  15455.72     95.89  185778.83      0.00
Projects::NotesController#index                               298     inf    519.94    423.69    151.74    559.13    111.53  154941.38      0.00
Projects::RawController#show                                   40     inf    802.94    340.57     86.72    920.01     61.10   32117.59      0.00
Projects::MergeRequestsController#commits                       2     inf  10266.08  10035.22   7438.09  10323.79   4552.39   20532.15      0.00
Projects::MergeRequestsController#ci_environments_status       38     inf    426.45    361.37     46.33    460.21     33.37   16205.15      0.00
Peek::ResultsController#show                                   48     inf    311.54    206.34     15.52    393.04      4.64   14954.02      0.00
Projects::MergeRequests::DiffsController#show                   2     inf   6722.11   6687.25   6294.99   6730.83   5859.14   13444.23      0.00
RootController#index                                            1     inf   9486.87   9486.87   9486.87   9486.87   9486.87    9486.87      0.00
Projects::MergeRequestsController#discussions                   5     inf    688.46    622.32    284.06    705.00    202.58    3442.32      0.00
Gitlab::RequestForgeryProtection::Controller#index              2     inf      6.18      6.08      4.96      6.20      3.72      12.35      0.00
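To make the slowest controllers easier to spot, the sketch below parses a few of the Merge Request rows from the output above and ranks them by the PERC95 column. The rows are copied verbatim from the table. The parsing relies on the columns being whitespace-separated in the order shown in the header, which is an assumption about fast-stats' text output rather than a documented format.

```python
# Rows copied from the fast-stats output above (MR-related controllers only).
# Column order assumed from the header:
# CONTROLLER COUNT RPS PERC99 PERC95 MEDIAN MAX MIN SCORE %FAIL
FAST_STATS_ROWS = """\
Projects::MergeRequests::ContentController#widget              93     inf   3878.98   3735.98   2953.87   4030.02   2239.30  360745.47      0.00
Projects::MergeRequestsController#show                         13     inf  14290.68   9630.52   4013.40  15455.72     95.89  185778.83      0.00
Projects::NotesController#index                               298     inf    519.94    423.69    151.74    559.13    111.53  154941.38      0.00
Projects::MergeRequestsController#commits                       2     inf  10266.08  10035.22   7438.09  10323.79   4552.39   20532.15      0.00
Projects::MergeRequests::DiffsController#show                   2     inf   6722.11   6687.25   6294.99   6730.83   5859.14   13444.23      0.00
"""


def parse_rows(text):
    """Turn whitespace-separated fast-stats rows into dictionaries."""
    rows = []
    for line in text.strip().splitlines():
        fields = line.split()
        rows.append({
            "controller": fields[0],
            "count": int(fields[1]),
            "p95_ms": float(fields[4]),
            "median_ms": float(fields[5]),
        })
    return rows


def rank_by_p95(rows):
    """Sort rows from slowest to fastest by 95th percentile duration."""
    return sorted(rows, key=lambda row: row["p95_ms"], reverse=True)


for row in rank_by_p95(parse_rows(FAST_STATS_ROWS)):
    print(f'{row["p95_ms"]:>10.2f} ms  n={row["count"]:<4} {row["controller"]}')
```

Note that some rows are based on very few requests (a count of 2 for the commits and diffs actions), so their percentiles should be read as indicative only.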

Equivalent API Results

For comparison, we also tested the equivalent API endpoints. These were also slow by typical API standards, but far faster than the page results:

Environment:    http://10k.testbed.gitlab.net
Version:    12.0.2-ee ef76b54fc1e
NAME                                                 | RESULT | DURATION | P95        | RPS_COUNT | RPS_MEAN    
-----------------------------------------------------|--------|----------|------------|-----------|-------------
api_v4_projects_merge_requests_merge_request         | Passed | 30.0s    | 146.37ms   | 5415      | 180.499432/s
api_v4_projects_merge_requests_merge_request_changes | Passed | 30.0s    | 131.92ms   | 5432      | 181.066221/s
api_v4_projects_merge_requests_merge_request_commits | Passed | 30.0s    | 9965.99ms  | 649       | 21.633277/s 
api_v4_projects_merge_requests_merge_request_notes   | Passed | 30.0s    | 225.73ms   | 5293      | 176.432752/s

A separate issue has been raised for the Commits API endpoint.
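To put the gap between page and API in numbers, the short sketch below divides the P95 of each page test by the P95 of its closest API counterpart. The pairing of page tests to API tests is an assumption made here for illustration; the figures themselves are copied from the two results tables.

```python
# P95 values in milliseconds, copied from the results tables in this issue.
PAGE_P95_MS = {
    "projects_merge_requests_controller_show_html": 19351.88,
    "projects_merge_requests_diffs_controller_show_json": 20900.65,
}
API_P95_MS = {
    "api_v4_projects_merge_requests_merge_request": 146.37,
    "api_v4_projects_merge_requests_merge_request_changes": 131.92,
}

# Assumed pairing of each page test with its nearest API equivalent.
PAIRS = [
    ("projects_merge_requests_controller_show_html",
     "api_v4_projects_merge_requests_merge_request"),
    ("projects_merge_requests_diffs_controller_show_json",
     "api_v4_projects_merge_requests_merge_request_changes"),
]


def slowdown_factors(pairs=PAIRS):
    """Return {page_test: page P95 divided by API P95} for each pair."""
    return {page: PAGE_P95_MS[page] / API_P95_MS[api] for page, api in pairs}


for page, factor in slowdown_factors().items():
    print(f"{page}: about {factor:.0f}x slower than the API at P95")
```

Under this pairing, both page endpoints are more than 100 times slower than their API counterparts at P95.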

Resource Metrics

A snapshot of the 10k environment's metrics (CPU, memory, etc.) can be found here - https://snapshot.raintank.io/dashboard/snapshot/633jaGYk26jR4sX2FXuufCTBJQSv6Cm7

Edited by Grant Young