Skip to content

Improve the performance of DiffsController#metadata and #batches

What does this MR do?

Related to issue #209786 (closed)

Background information: We have two endpoints on the diff controller, #diffs_batches and #diffs_metadata and these both call the RPC diff_stats call. To avoid caching for these separately we should cache at the diff_stats call to solve both issues.

This merge request caches the diff stats results in redis.

Improvement locally:

Here we can see improvements: !33037 (comment 357896137)

There are 3 sections, initial, caching only, and caching + source branch exists memoized. The source branch exists memoized will be another merge request.

With the caching only we go from:

    ✓ { endpoint:diffs_batch.json }......: avg=2763.92ms  min=2421.93ms  med=2650.07ms  max=3532.99ms  p(90)=3263.60ms  p(95)=3398.29ms
    ✓ { endpoint:diffs_metadata.json }...: avg=2205.83ms  min=2132.52ms  med=2157.35ms  max=2327.62ms  p(90)=2293.56ms  p(95)=2310.59ms

to:

    ✓ { endpoint:diffs_batch.json }......: avg=839.78ms  min=595.83ms  med=788.72ms  max=1508.04ms p(90)=1092.42ms p(95)=1322.06ms
    ✓ { endpoint:diffs_metadata.json }...: avg=390.46ms  min=315.32ms  med=417.80ms  max=453.72ms  p(90)=444.90ms  p(95)=449.31ms

Which is a good improvement.

For information on the redis caching, for the very large change with 23,000 files, this returns a byte usage of (266939), roughly 0.26 mb, using MEMORY USAGE in redis cli. In a more normal situation, a change with 13 files uses 811 bytes, roughly 0.000811mb.

How to test the impact on Performance for this change:

  • Manually check the gitaly timings compared to the above here
  • There are daily performance tests that are run against the nightly builds. We can check these out by looking in the pipelines in the performance repo here. We then can search for the day before our change went into production, and then look at the day after our change and compare the differences in the time taken for endpoint:diffs_metadata and endpoint:diffs_batch.json. Note, we should look at the 10k requests, not 1 or 2k. This is the same data that is shown here.
  • The tool that the daily performance tests use is here. It should be used to run the performance tests locally as well.
  • Check the latency for these controller methods in grafana here
  • Check the details for the gitaly diff stats call here
  • Check if there are any significant changes to redis here

Screenshots

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by 🤖 GitLab Bot 🤖

Merge request reports