Skip to content

Improve diff_files endpoint performance

Matthias Käppler requested to merge 326316-improve-async-diff-rendering into master

What does this MR do?

On .com, we currently load commit diffs via a special endpoint asynchronously (it's been a long-standing feature toggle, async_commit_diff_files, which was never rolled out beyond gitlab-org/gitlab). In #326316 we found that diff rendering is very expensive both in terms of CPU and memory, but this endpoint specifically was doing some unnecessary work. This MR is trying to improve this by:

  • Rendering the response as HTML not JSON. This removes the need to escape potentially very large diff HTML strings to stuff them into a JSON envelope.
  • Optimize a few unnecessary allocations on frequently visited code paths during diff rendering.

Results

Taking an object profile for the diff_files endpoint and a large commit with ~2000 diffs across 86 files:

Before

{"mem_objects":2587625,"mem_bytes":210585416,"mem_mallocs":621598}

After

{"mem_objects":2350725,"mem_bytes":173328440,"mem_mallocs":611828}

This saves us ~240k allocations (which also improves GC churn and thus saves CPU cycles) and 37MB of memory per call.

Preview

I've enabled the feature toggle on a review-app. You can look at how the commit is rendered here: https://gitlab-review-326316-imp-qmshtg.gitlab-review.app/gitlab-qa-sandbox-group/qa-test-2021-04-19-12-44-24-d48bb126e58af07f/autodevops-project-47056ff5ba73b673/-/commit/d2cbbaf58dc845048c3bb630f80226000f0d2dd3

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Related to #326316

Edited by Matthias Käppler

Merge request reports