Support for large merge requests
### Problem to solve

Although large feature branches are an anti-pattern, large merge request diffs can occur for many other reasons. Fast, easy-to-navigate diffs become even more important in these situations.

- **big merge requests** - for example, the monthly GitLab release post is small by some standards, but involves thousands of lines being added to a single file.
- **merging between release branches** - many changes might accumulate and need to be merged from one location to another.
- **merging upstream changes into downstream fork** - for large or critical projects, this can't be done directly into master. For example, a hardware vendor pulling kernel updates into their fork for networking hardware, or customer-specific forks of some other critical library.
- **reviewing automatically generated changes** - code generation tools can generate large quantities of code that need to be reviewed manually even if automated tests succeed.

Even for merge requests that load in full today, waiting for the merge request to load can waste tens of seconds many times per day. This is **inefficient** and **frustrating**. Worse still are the situations where it is impossible to view the entire diff in GitLab at all, which renders code review impossible. The interface should provide failure modes that are much more forgiving rather than being completely unusable.

### Further details

Limitations and performance problems associated with rendering diffs have a long history. Prior to Gitaly, Rugged would generate very large strings in memory in Ruby, which would cause Unicorn timeouts. The mitigation was to proactively truncate strings to prevent the memory problems. With the advent of Gitaly, generating the diff was not a problem until it was sent to the Ruby side, where it again needed to be truncated due to memory concerns. However, this still created significant memory pressure on the Rails application.
For this reason, limits are sent to Gitaly to prevent large strings from being sent to the Ruby application.

Syntax highlighting is a significant challenge, because the entire file needs to be held in memory twice:

- the input to be parsed by the syntax highlighting library, https://github.com/jneen/rouge, and
- the syntax highlighted output

The plain text diff generated by Git is very fast to generate, and is done in two stages:

- `git diff --raw` to list files and detect renames
- full git diff, which is processed line by line to split at file boundaries

The current strategy for diffs involves pre-computing them so that they load quickly. The focus of this epic is primarily on optimizing the ~frontend behavior to reduce the amount of data transferred to load the page in a usable state.

This directly supports our Q2 product [OKR](https://gitlab.com/gitlab-com/Product/-/issues/1110) to prioritize UX work for merge requests.

### Proposal

- Improve merge request diff performance under existing diff limits by implementing file-by-file diff viewing https://gitlab.com/groups/gitlab-org/-/epics/516

  Switching to a file-by-file flow means data can be transferred to the client much more efficiently.
- Increase merge request max files limit to 10,000 changed files, rather than 1,000 changed files

  Using the file tree and file-by-file diffs means that there shouldn't be a significant cost to loading the entire list of changed files. The file tree also makes it possible to load even the list of changed files incrementally by not expanding the tree by default.
- Progressively load merge request diffs https://gitlab.com/groups/gitlab-org/-/epics/1816
- Support viewing diffs of very large files without syntax highlighting
- View full file falls back to raw file

### Customers and prospects

- https://gitlab.my.salesforce.com/0014M00001inUHb ~customer ~"GitLab Ultimate" ~"priority::1"
- https://gitlab.my.salesforce.com/00161000004yKzB
- https://gitlab.my.salesforce.com/0016100001Eo8P8 ~customer ~"GitLab Ultimate" ~P3
- https://gitlab.my.salesforce.com/0016100000W44Pc ~"customer+" ~"GitLab Premium" ~P3
- https://gitlab.my.salesforce.com/00161000013aRjG ~customer+ ~"GitLab Premium" ~P4
- https://gitlab.my.salesforce.com/00161000003RIHP ~"customer+" ~"GitLab Premium"
- https://gitlab.my.salesforce.com/00161000004bZPD ~customer ~"GitLab Premium" ~P3
- https://gitlab.my.salesforce.com/00161000003QEaF ~customer ~"GitLab Premium"
- https://gitlab.my.salesforce.com/00161000004zq7Q ~customer ~"GitLab Starter"
- https://gitlab.my.salesforce.com/00161000004xUPr ~customer ~"GitLab Premium"
- https://gitlab.my.salesforce.com/0016100001I755j ~customer ~"GitLab Premium"
- https://gitlab.my.salesforce.com/0016100001Eo2H5 ~customer ~"GitLab Premium"
- Gnome
- https://gitlab.zendesk.com/agent/tickets/90066
- https://gitlab.zendesk.com/agent/tickets/100582
- https://gitlab.zendesk.com/agent/tickets/112937
- https://gitlab.zendesk.com/agent/tickets/119919
- https://gitlab.zendesk.com/agent/tickets/247890
- https://gitlab.zendesk.com/agent/tickets/388122 ~customer ~"GitLab Premium"
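
### Illustrative sketch: splitting a diff at file boundaries

Further details mentions that the full git diff is processed line by line to split it at file boundaries. The Python sketch below shows roughly what that could look like, and why it fits a file-by-file flow: each file's section can be handed on as soon as the next header is seen. This is not the Gitaly or Rails implementation. `FileDiff` and `split_by_file` are hypothetical names, and the sketch assumes every file section begins with a `diff --git ` header line, as in Git's default patch output.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List, Optional


@dataclass
class FileDiff:
    """One file's portion of a patch (hypothetical container)."""
    header: str
    lines: List[str] = field(default_factory=list)


def split_by_file(diff_lines: Iterable[str]) -> Iterator[FileDiff]:
    """Yield one FileDiff per file, consuming the input one line at a time.

    Assumption: a new file section starts at a line beginning with
    "diff --git ". Hunk content lines are prefixed with '+', '-' or ' ',
    so they cannot be mistaken for a header.
    """
    current: Optional[FileDiff] = None
    for raw in diff_lines:
        line = raw.rstrip("\n")
        if line.startswith("diff --git "):
            if current is not None:
                yield current  # the previous file is complete
            current = FileDiff(header=line)
        elif current is not None:
            current.lines.append(line)
    if current is not None:
        yield current  # flush the last file


# Small made-up patch used only to exercise the function.
sample = """\
diff --git a/a.txt b/a.txt
index 1111111..2222222 100644
--- a/a.txt
+++ b/a.txt
@@ -1 +1 @@
-old
+new
diff --git a/b.txt b/b.txt
new file mode 100644
index 0000000..3333333
--- /dev/null
+++ b/b.txt
@@ -0,0 +1 @@
+hello
"""

for file_diff in split_by_file(sample.splitlines()):
    print(file_diff.header, len(file_diff.lines))
```

Because `split_by_file` is a generator, a caller only ever holds one file's lines at a time. A per-file size limit could be applied inside the loop under the same assumptions.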