Create common suite of benchmarks that regularly run against `master` branch (#1) · Issues · GitLab.org / GitLab Data Access / Git team / benchmarks · GitLab

Create common suite of benchmarks that regularly run against `master` branch

Due to the infrequent release schedule of Git, there typically is a large gap between an upstream change and it having an effect on either GitLab.com or self-managed instances. This gap is typically at least 3 months and often longer, and it is wider still for self-managed instances, where the update schedule may not be as rigorous as it is for GitLab.com. This causes issues in the context of performance changes:

- Determining the root cause of performance regressions is tedious because we have to investigate hundreds of commits.
- Fixing regressions upstream is a slow process and may delay an upgrade by another couple of months.
- Success stories in the context of optimizations cannot be communicated clearly, decreasing visibility of the work the Git team does.

While we have tools like GET that do performance testing at scale, these tools work on top of GitLab. As Git is basically the backend of the backend, it is hard to connect the dots between performance changes seen in GET and changes in Git itself. Furthermore, running GET is on the more expensive side given that it requires a full GitLab cluster, so it cannot be run regularly enough for us to track changes in Git on a granular scale.

This issue thus tracks the creation of a new Git-specific suite of benchmarks that can be run against arbitrary versions of the Git repository. These benchmarks should be built incrementally over time and run as part of regular CI, e.g. every time changes land on the `master` branch, and on releases to compare with the last release. Data should be collected and ideally reported via a dashboard.

This would allow us to notice performance regressions before the changes are released, and to demonstrate the impact that our work has on making Git more efficient.
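To make the idea more concrete, here is a minimal sketch of what a single benchmark in such a suite could look like. It is not the planned implementation. It assumes that two Git builds are already available as separate binaries, and every path, function name and threshold below is hypothetical. The workload (`git rev-list --count --all`) is only an example.

```python
import statistics
import subprocess
import time


def time_command(argv, runs=5, cwd=None):
    """Run argv `runs` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard output so that terminal I/O does not skew the timing.
        subprocess.run(argv, cwd=cwd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def relative_change(baseline, candidate):
    """Relative change of the candidate's median over the baseline's median."""
    base = statistics.median(baseline)
    return (statistics.median(candidate) - base) / base


def is_regression(baseline, candidate, threshold=0.05):
    """Flag the candidate when its median is more than `threshold` slower.

    The 5% default is an arbitrary assumption, not a measured noise floor.
    """
    return relative_change(baseline, candidate) > threshold


def compare_builds(baseline_git, candidate_git, workload, repo, runs=5):
    """Time the same Git workload with two Git binaries in the same repository."""
    baseline = time_command([baseline_git, *workload], runs=runs, cwd=repo)
    candidate = time_command([candidate_git, *workload], runs=runs, cwd=repo)
    return relative_change(baseline, candidate), is_regression(baseline, candidate)
```

A CI job could then call, for example, `compare_builds("/opt/git-release/bin/git", "/opt/git-master/bin/git", ["rev-list", "--count", "--all"], "/srv/bench/repo.git")` (all paths hypothetical) and push the resulting numbers to whatever backs the dashboard. The median is used here because it is less sensitive to outlier runs than the mean. A real suite would likely want warm-up runs and a more robust statistical comparison.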
