Diff syntax highlighting takes a significant amount of time to process on the first load
Some context on how the highlighting backend works:
- On the first /diffs load we use Rouge to highlight both old and new blobs (we need both to properly present the diff)
- The whole highlighted diff is cached on Redis
- We reset the cache when reloading diffs, or after 1 week
- Only MRs have their highlighted diffs cached
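The caching behaviour described above can be sketched roughly as follows. This is only an illustration, not the actual backend code: `HighlightCache`, `fetch`, `reset` and the injected `highlight` callable are hypothetical names, and a plain dict stands in for Redis.

```python
import time
from typing import Callable, Dict, Tuple

ONE_WEEK_SECONDS = 7 * 24 * 60 * 60  # the 1 week expiry mentioned above


class HighlightCache:
    """In-memory stand-in for the Redis cache (hypothetical, for illustration)."""

    def __init__(
        self,
        highlight: Callable[[str], str],
        ttl: float = ONE_WEEK_SECONDS,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._highlight = highlight  # stands in for the Rouge call
        self._ttl = ttl
        self._clock = clock
        # mr_key -> (expiry timestamp, highlighted diff)
        self._entries: Dict[str, Tuple[float, str]] = {}

    def fetch(self, mr_key: str, raw_diff: str) -> str:
        """Return the cached highlighted diff, highlighting on a miss or expiry."""
        entry = self._entries.get(mr_key)
        if entry is not None and self._clock() < entry[0]:
            return entry[1]
        highlighted = self._highlight(raw_diff)
        self._entries[mr_key] = (self._clock() + self._ttl, highlighted)
        return highlighted

    def reset(self, mr_key: str) -> None:
        """Drop the cached entry, e.g. when the diffs are reloaded."""
        self._entries.pop(mr_key, None)
```

The first `fetch` for an MR pays the full highlighting cost; later calls are served from the cache until `reset` is called or the entry expires.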
Given that, I ran a few local tests on a big MR such as oswaldo/nautilus-test!1 (diffs); the output is at https://gitlab.com/snippets/1743242.
Summary
- 122 files
- 244 blobs
- Highlighting alone takes 7.7 seconds (take this with a grain of salt, it was measured on localhost)
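For a rough sense of scale, the numbers above can be broken down per blob and per file. This is plain arithmetic on the localhost measurement, so the same caveat applies:

```python
FILES = 122
BLOBS = 244
HIGHLIGHT_SECONDS = 7.7  # measured on localhost, see the summary above

blobs_per_file = BLOBS / FILES  # old + new blob for each file
ms_per_blob = HIGHLIGHT_SECONDS / BLOBS * 1000
ms_per_file = HIGHLIGHT_SECONDS / FILES * 1000

print(f"{blobs_per_file:.0f} blobs per file")  # 2 blobs per file
print(f"{ms_per_blob:.1f} ms per blob")        # 31.6 ms per blob
print(f"{ms_per_file:.1f} ms per file")        # 63.1 ms per file
```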
Possible solutions
A quick improvement would be scheduling the highlighting upon the MR creation:
- 👍 Perceived performance improvement on first load
- 👎 We would probably cache more and spend more memory
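A minimal sketch of the scheduling idea, assuming a thread pool as a stand-in for whatever background job system would actually run it. `HighlightScheduler`, `on_mr_created` and `first_load` are hypothetical names used only for illustration.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict


class HighlightScheduler:
    """Starts highlighting when an MR is created instead of on first load."""

    def __init__(self, highlight: Callable[[str], str], max_workers: int = 2) -> None:
        self._highlight = highlight  # stands in for the Rouge call
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs: Dict[str, Future] = {}  # mr_key -> pending or finished job

    def on_mr_created(self, mr_key: str, raw_diff: str) -> None:
        # Kick off highlighting in the background right away.
        self._jobs[mr_key] = self._pool.submit(self._highlight, raw_diff)

    def first_load(self, mr_key: str, raw_diff: str) -> str:
        job = self._jobs.get(mr_key)
        if job is None:
            # Nothing was scheduled for this MR: highlight inline as today.
            return self._highlight(raw_diff)
        # Returns immediately if the job finished, otherwise waits for the rest.
        return job.result()
```

The trade-off matches the list above: every created MR gets highlighted and stored, whether or not anyone ever opens its diffs.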
Although the frontend still needs some work to handle the amount of data being received, the backend would be improved here.
I wonder if there would be a place in Gitaly for a gRPC that could handle that job in a second step. We have Chroma in the Go realm, and Go probably handles CPU-bound work like highlighting more efficiently (we should still give Chroma a try).
Also cc'ing @smcgivern for now
Edited by Oswaldo Ferreira