Banzai optimization ideas
While working on Banzai optimization tasks, I found several rendering targets that can be improved.

### 1. EmojiFilter ([source link](https://gitlab.com/gitlab-org/gitlab/blob/54b6d8f318a5f94e261600d018798c29ac1594bf/lib/banzai/filter/emoji_filter.rb#L15))

It fetches all text nodes, converts them to HTML, and replaces each :emoji: string it finds with an image.

#### Ideas

* Try to use `node.text` instead of `node.to_html` because it is much faster. However, we have to ensure that this change does not break anything.

### 2. Regex search in reference cache ([source link](https://gitlab.com/gitlab-org/gitlab/blob/54b6d8f318a5f94e261600d018798c29ac1594bf/lib/banzai/filter/references/reference_cache.rb#L32))

We scan the HTML document with a regex for different references (issues, merge requests, ...). For large documents (for example, https://gitlab.com/gitlab-org/gitlab/-/blob/master/CHANGELOG.md), the regex search takes significant time.

#### Ideas

* Try to extract all references at once (currently we run the regex [for each ReferenceFilter in the list](https://gitlab.com/gitlab-org/gitlab/blob/54b6d8f318a5f94e261600d018798c29ac1594bf/lib/banzai/pipeline/gfm_pipeline.rb#L52))
* Find an alternative to regex search
* Use a faster regex implementation

### 3. Optimize queries to verify reference objects ([source link](https://gitlab.com/gitlab-org/gitlab/blob/abf7c6e52aa643a01e66abe6b9f0c6480df44ef8/lib/banzai/filter/references/issue_reference_filter.rb#L25))

We use several `WHERE IN (...)` queries, for example to fetch merge requests. If we have many merge request IDs, the queries become slow. As an alternative, we can use the `INNER JOIN (VALUES (...))` trick.

#### Implementations

https://gitlab.com/gitlab-org/gitlab/-/merge_requests/71947

---

Feel free to update the description with other examples/ideas.
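
To make the "extract all references at once" idea from section 2 more concrete, here is a minimal sketch of a single-pass scan. The patterns and the names `COMBINED_REFERENCE_PATTERN` and `extract_references` are hypothetical and heavily simplified. The real reference filters build their patterns from the referable classes, so this only illustrates the shape of the approach and is not a drop-in replacement.

```ruby
# Hypothetical, simplified patterns: one alternation with a named group per
# reference type, so the document is scanned once instead of once per filter.
COMBINED_REFERENCE_PATTERN =
  /\#(?<issue>\d+)|!(?<merge_request>\d+)|@(?<user>[a-z0-9_]+)/i

# Returns a hash that maps each reference type to the raw identifiers found.
def extract_references(text)
  found = Hash.new { |hash, key| hash[key] = [] }

  text.scan(COMBINED_REFERENCE_PATTERN) do
    match = Regexp.last_match
    # Exactly one named group is non-nil for each match of the alternation.
    match.names.each do |name|
      found[name.to_sym] << match[name] if match[name]
    end
  end

  found
end

refs = extract_references("Fixes #12 and #7, see !345, cc @alice")
```

Each filter could then look up its own key in the result instead of rescanning the document. Whether one large alternation actually beats several small scans would need to be benchmarked on large inputs such as the CHANGELOG mentioned above.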
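
For the `INNER JOIN (VALUES (...))` trick from section 3, the sketch below shows the shape of the generated SQL, assuming PostgreSQL. The helper `values_join_sql`, the `wanted` alias, and the table and column names are illustrative assumptions; the linked merge request may implement this differently, and application code would normally go through the ORM's query builder rather than string interpolation.

```ruby
# Builds a query that joins against an inline VALUES list instead of
# filtering with WHERE <column> IN (...). All names here are hypothetical.
def values_join_sql(table, column, ids)
  raise ArgumentError, "ids must not be empty" if ids.empty?

  # Integer() rejects anything that is not a plain integer, so no
  # untrusted text ends up interpolated into the SQL string.
  rows = ids.map { |id| "(#{Integer(id)})" }.join(", ")

  "SELECT #{table}.* FROM #{table} " \
    "INNER JOIN (VALUES #{rows}) AS wanted (#{column}) " \
    "ON #{table}.#{column} = wanted.#{column}"
end

sql = values_join_sql("merge_requests", "iid", [1, 2, 3])
```

The resulting statement joins `merge_requests` against the row set `(1), (2), (3)` rather than passing the planner one long `IN` list.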
issue