Improve the performance of DiffsController#metadata and #batches
What does this MR do?
Related to issue #209786 (closed)
Background information: We have two endpoints on the diff controller, #diffs_batches and #diffs_metadata and these both call the RPC diff_stats call. To avoid caching for these separately we should cache at the diff_stats call to solve both issues.
This merge request caches the diff stats results in redis.
Improvement locally:
Here we can see improvements: !33037 (comment 357896137)
There are 3 sections, initial, caching only, and caching + source branch exists memoized. The source branch exists memoized will be another merge request.
With the caching only we go from:
✓ { endpoint:diffs_batch.json }......: avg=2763.92ms min=2421.93ms med=2650.07ms max=3532.99ms p(90)=3263.60ms p(95)=3398.29ms
✓ { endpoint:diffs_metadata.json }...: avg=2205.83ms min=2132.52ms med=2157.35ms max=2327.62ms p(90)=2293.56ms p(95)=2310.59ms
to:
✓ { endpoint:diffs_batch.json }......: avg=839.78ms min=595.83ms med=788.72ms max=1508.04ms p(90)=1092.42ms p(95)=1322.06ms
✓ { endpoint:diffs_metadata.json }...: avg=390.46ms min=315.32ms med=417.80ms max=453.72ms p(90)=444.90ms p(95)=449.31ms
Which is a good improvement.
For information on the redis caching, for the very large change with 23,000 files, this returns a byte usage of (266939), roughly 0.26 mb, using MEMORY USAGE in redis cli. In a more normal situation, a change with 13 files uses 811 bytes, roughly 0.000811mb.
How to test the impact on Performance for this change:
- Manually check the gitaly timings compared to the above here
- There are daily performance tests that are run against the nightly builds. We can check these out by looking in the pipelines in the performance repo here. We then can search for the day before our change went into production, and then look at the day after our change and compare the differences in the time taken for
endpoint:diffs_metadata
andendpoint:diffs_batch.json
. Note, we should look at the 10k requests, not 1 or 2k. This is the same data that is shown here. - The tool that the daily performance tests use is here. It should be used to run the performance tests locally as well.
- Check the latency for these controller methods in grafana here
- Check the details for the gitaly diff stats call here
- Check if there are any significant changes to redis here
Screenshots
Does this MR meet the acceptance criteria?
Conformity
-
Changelog entry -
Documentation (if required) -
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides -
Separation of EE specific content
Availability and Testing
-
Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process. -
Tested in all supported browsers -
Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
-
Label as security and @ mention @gitlab-com/gl-security/appsec
-
The MR includes necessary changes to maintain consistency between UI, API, email, or other methods -
Security reports checked/validated by a reviewer from the AppSec team