Release index page loading is slow
Problem
The release index page (e.g. https://gitlab.com/gitlab-org/gitlab/-/releases) takes 3-4 seconds to present its data. Profiling showed that the bottleneck is the Markdown rendering of the release descriptions.
How the current caching works
Markdown rendering happens in two steps:
- Pre-processing. When a user fills in the description field of a release, the plain Markdown text is converted to HTML. The converted data is persisted to the database as description_html.
- Post-processing. The description_html is re-processed when it is served to users. Specifically,
The key point is that pre-processing doesn't take user information into account, while the final rendered content varies per user. For example, non-project members see a confidential issue as just a plaintext URL, but project members see the issue title in a tooltip.
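The split between the two steps can be sketched in a few lines of Ruby. This is a toy model, not GitLab's actual pipeline: Issue, ISSUES, MEMBERS, pre_process and post_process are all hypothetical names, and the reference syntax is reduced to a bare #123.

```ruby
require "set"

# Hypothetical data model for illustration only.
Issue = Struct.new(:id, :title, :confidential)

ISSUES = {
  1 => Issue.new(1, "Fix login bug", false),
  2 => Issue.new(2, "Security hole", true)
}.freeze

# Hypothetical membership list: names of project members.
MEMBERS = Set["alice"].freeze

# Step 1 (pre-processing): user-independent, so it can run once on save
# and the result can be persisted (description_html in the issue text).
def pre_process(description)
  description.gsub(/#(\d+)/) do
    id = Regexp.last_match(1)
    %(<a class="issue-ref" data-issue="#{id}" href="/issues/#{id}">##{id}</a>)
  end
end

# Step 2 (post-processing): depends on the viewer, so it runs per request.
def post_process(description_html, user)
  pattern = %r{<a class="issue-ref" data-issue="(\d+)" href="([^"]+)">([^<]+)</a>}
  description_html.gsub(pattern) do
    match = Regexp.last_match
    issue = ISSUES.fetch(match[1].to_i)
    if issue.confidential && !MEMBERS.include?(user)
      match[2] # redacted: non-members only get the plain URL
    else
      %(<a href="#{match[2]}" title="#{issue.title}">#{match[3]}</a>)
    end
  end
end

html = pre_process("Fixes #1 and #2") # stored once
puts post_process(html, "alice")       # member: both links carry a title tooltip
puts post_process(html, "guest")       # non-member: #2 collapses to /issues/2
```

Because the second step needs the viewer, its output cannot simply be stored next to description_html, which is why the cost is paid on every page load.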
So if we want to address the core performance problem, here are some approaches to consider:
- Improve the post-processing facility, MarkupHelper.markdown_field. There might be room for further optimization (I haven't checked the entire implementation).
- Cache the post-processed content in Redis when a user accesses the GraphQL query, so that the second and subsequent renderings are significantly faster. However, since cached data could exist once per user, this could put pressure on Redis capacity. We would also have to decide when this data should be invalidated.
- Don't post-process description_html, but present it directly. We would change pre-processing to persist redacted data beforehand. Users could no longer see any special references, i.e. we would sacrifice UX for performance.
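To make the trade-off of the caching approach concrete, here is a minimal sketch that uses an in-memory Hash as a stand-in for Redis. PostProcessCache, the key layout, and the expensive lambda are assumptions made for illustration; a real setup would use a Redis client and most likely a TTL.

```ruby
require "digest"

# In-memory stand-in for Redis (assumption: not a real Redis client).
class PostProcessCache
  attr_reader :store

  def initialize
    @store = {}
  end

  # The key contains the user and a digest of description_html, so an edited
  # description misses the cache instead of serving stale content.
  def fetch(release_id, user_id, description_html)
    digest = Digest::SHA256.hexdigest(description_html)[0, 12]
    key = "release:#{release_id}:user:#{user_id}:#{digest}"
    @store[key] ||= yield
  end
end

cache = PostProcessCache.new
renders = 0
# Placeholder for the expensive per-user post-processing.
expensive = lambda do |html|
  renders += 1
  html.upcase
end

v1 = "<p>v1</p>"
v2 = "<p>v2</p>"
3.times { cache.fetch(1, 42, v1) { expensive.call(v1) } } # 1 render, then 2 hits
cache.fetch(1, 43, v1) { expensive.call(v1) }             # another user: new entry
cache.fetch(1, 42, v2) { expensive.call(v2) }             # edited text: new entry

puts "renders: #{renders}, keys: #{cache.store.size}" # renders: 3, keys: 3
```

The entry count grows with users times releases, which is the capacity concern. Putting a digest in the key only covers edits to the description; a change in the viewer's permissions or in a referenced issue's confidentiality would still need explicit invalidation or a short expiry.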
Proposal
Since the approaches described above will take some time and effort to implement, we first take the most boring but effective approach: reducing the page size from 20 to 10.
With fewer entries whose description_html needs post-processing, we can expect an immediate performance improvement. For example, in my local environment, the loading time is cut roughly in half:
PAGE_SIZE = 20 => 1490 msec
PAGE_SIZE = 10 => 691 msec
This page size matches GitHub releases, which shows 10 releases per page by default (Example).
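As a quick sanity check on the two local measurements, the per-release cost can be worked out as below. It only uses the two numbers quoted above and assumes the time is dominated by per-release work; two data points are not enough to establish that.

```ruby
# Local measurements quoted above: page size => total load time in msec.
measurements_ms = { 20 => 1490, 10 => 691 }

measurements_ms.each do |page_size, total_ms|
  per_release = total_ms.fdiv(page_size)
  puts format("PAGE_SIZE = %2d: %.1f msec per release", page_size, per_release)
end

ratio = measurements_ms[10].fdiv(measurements_ms[20])
puts format("Page size 10 takes %.0f%% of the page size 20 time", ratio * 100)
```

The per-release cost is about the same at both page sizes (74.5 vs 69.1 msec), which is consistent with load time scaling roughly linearly with the number of descriptions that are post-processed.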
Reference
- Spreadsheet (Internal Only)
previous description
https://gitlab.com/api/v4/projects/278964/releases?per_page=20 takes 1-1.5 seconds to load, which is very slow for only 20 items being rendered.
- We expose assets but don't preload them, which creates an N+1 problem.
- Check for any other N+1 problems.
- Write tests to catch future N+1s.
- If the releases endpoint is still slow, create any necessary issues for future optimizations. It's possible that Markdown rendering is the slowest part. If so, we probably can't do anything about it 🤷
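The N+1 pattern and a query-count guard can be illustrated without a database. FakeDb and its methods are hypothetical stand-ins; in a Rails app the batched load would normally come from ActiveRecord's includes or preload on the relation.

```ruby
# Toy data store that counts queries; stands in for the database.
class FakeDb
  attr_reader :query_count

  def initialize(assets_by_release)
    @assets_by_release = assets_by_release
    @query_count = 0
  end

  # One query per release: the "N" in N+1.
  def assets_for(release_id)
    @query_count += 1
    @assets_by_release.fetch(release_id, [])
  end

  # One batched query for all releases on the page (preloading).
  def assets_for_all(release_ids)
    @query_count += 1
    @assets_by_release.slice(*release_ids)
  end
end

release_ids = (1..20).to_a
data = release_ids.to_h { |id| [id, ["asset-#{id}.zip"]] }

naive = FakeDb.new(data)
release_ids.each { |id| naive.assets_for(id) }

preloaded = FakeDb.new(data)
assets = preloaded.assets_for_all(release_ids)
release_ids.each { |id| assets.fetch(id, []) } # served from memory, no query

puts "naive: #{naive.query_count} asset queries"         # naive: 20 asset queries
puts "preloaded: #{preloaded.query_count} asset queries" # preloaded: 1 asset queries

# Guard against regressions: the count must not grow with the page size.
raise "N+1 detected" unless preloaded.query_count == 1
```

A test of this shape, asserting that the query count stays constant as the number of releases grows, is one way to catch future N+1s.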