Defining large MR for performance testing
In theory, there is no upper limit on the size of a merge request (MR). But we need to define what a “large MR” means so that we have a single source of truth (SSOT) for continuously testing performance and tracking improvements.
Characteristics of the test sample
- Must be representative of the 95th or 99th percentile of MRs on GitLab.com, so that we take the largest MRs into account in a reasonable way.
- Must be automatically reproducible, so that we can “destroy” and create the test sample without significant effort.
- Must be testable locally and on GitLab.com, so that we can test and debug it under different circumstances.
Definition of the test sample
- File size: Mix of files with small and large file sizes.
- File types: Mix of image, binary, code, and prose files.
- No. of files: Many changed files.
- No. of lines: Mix of files with few and many changed lines.
- No. of commits: Many commits.
- No. of comments: Many comments and threads, mix of threads with few and many reply comments, and resolved, unresolved, and outdated threads.
- Comments and description content: Mix of text, headings, tables, images, videos, tasks, diagrams, math, emojis, code blocks.
- Pipelines: Many pipelines.
Data on large MRs (based on the 200 most active projects on GitLab.com, as measured by number of MRs in the last 90 days; source)
- By file count
  - 95th percentile: 28.0
  - 99th percentile: 128.0
  - Largest: 5984
- By line count
  - 95th percentile: 1375.0
  - 99th percentile: 11925.6
  - Largest: 14280733
- By commit count
  - 95th percentile: 16.0
  - 99th percentile: 89.0
  - Largest: 8802
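As an illustration of how figures like these can be derived and applied, here is a minimal Python sketch. It assumes percentiles are computed with linear interpolation between neighboring values, which would be consistent with a fractional result such as the 99th percentile by line count, though the source does not state the method used. The function names, the sample counts, and the candidate MR are hypothetical; only the 99th-percentile thresholds are taken from the data above.

```python
import math


def percentile(values, p):
    """Return the p-th percentile (0-100) of values using linear interpolation."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100  # fractional index into the sorted list
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# 99th-percentile values from the data above (line count rounded to one decimal).
P99_THRESHOLDS = {"files": 128.0, "lines": 11925.6, "commits": 89.0}


def dimensions_at_or_above_p99(mr):
    """List the dimensions in which an MR reaches the 99th percentile."""
    return [name for name, limit in P99_THRESHOLDS.items() if mr[name] >= limit]


# Hypothetical changed-file counts for a handful of MRs (not real data).
sample_file_counts = [1, 2, 2, 3, 5, 8, 13, 40, 150]
print(percentile(sample_file_counts, 95))

# Hypothetical candidate MR for the test sample (assumption, not a real MR).
candidate = {"files": 300, "lines": 9000, "commits": 120}
print(dimensions_at_or_above_p99(candidate))
```

A check like this could help confirm that a generated test sample really sits at or above the 99th percentile in each dimension it is meant to stress.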
Decision
Rspec Upgrade will be used for benchmarking and testing moving forward.
Related links
- Example of an old MR with historical records: gitlab-foss!9546 (merged)
- Historical records: https://about.gitlab.com/handbook/engineering/performance/
Edited by Kai Armstrong