Build RSpec Test Profiling Into GitLab
Problem to solve
Software tests make up a significant portion of many CI pipelines, yet there is currently no easy way within the GitLab application to determine which RSpec tests in a pipeline are slowest and cost the most in terms of database queries, run time and, ultimately, CI minutes.
We created an internal tool for this in GitLab and discovered that our suite spends over 15 hours executing tests and issues over 9 million database queries. This slows down feedback cycles, prolongs review times and pushes up CI cost.
We could provide this kind of data to our users for their test pipelines.
Intended users
- Parker (Product Manager)
- Delaney (Development Team Lead)
- Sasha (Software Developer)
- Simone (Software Engineer in Test)
User experience goal
The user should be able to easily insert a job in their CI pipeline that uses data from their test profiler to identify slow tests and suggest improvements.
Proposal
Profilers that print useful data after test execution exist for many test frameworks.
GitLab, the application, uses RSpec as its backend testing framework. To solve this problem in the gitlab-org/gitlab project, @stanhu added scripts/insert-rspec-profiling-data. This script runs in the CI pipeline against merge commits in the default branch and stores the data generated by RSpec's profiler in a database.
@jprovaznik extended this by adding a new database and building a frontend that groups slow test files and specific examples and allows them to be sorted and searched: https://gitlab-org.gitlab.io/rspec_profiling_stats/ (source code: https://gitlab.com/gitlab-org/rspec_profiling_stats)
This page is interesting in two ways.
- The data on individual slow tests and examples/suites allow contributors to quickly identify problem areas of the test code and clean them up.
- The overall data in the bottom table gives a summary view from which trends up or down can be measured over time.
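As a rough illustration of the per-file grouping described above, here is a minimal Ruby sketch. It is not the code behind the stats page. It assumes each example is a hash with `file_path` and `run_time` keys, as in the output of RSpec's built-in JSON formatter, and the helper name `slowest_files` is hypothetical.

```ruby
# Hypothetical helper (not part of rspec_profiling_stats): roll per-example
# timings up into per-file totals, slowest file first.
# Assumes each example hash has "file_path" and "run_time" (seconds) keys,
# as in RSpec's JSON formatter output.
def slowest_files(examples)
  examples
    .group_by { |ex| ex["file_path"] }
    .map do |file, group|
      {
        "file" => file,
        "examples" => group.size,
        "seconds" => group.sum { |ex| ex["run_time"] }
      }
    end
    .sort_by { |row| -row["seconds"] }
end
```

Sorting by total seconds surfaces files where many moderately slow examples add up, which a per-example ranking alone can miss.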
First Iteration Proposal
A first iteration for including this in GitLab could focus on the first of these and be very simple:
- Parse the output from the RSpec profiler and store data on the slowest 1000 tests in an artifact that can then be downloaded by the user (MVC);
- Add documentation on how to set up this job;
- Add further documentation on how to set up similar jobs for other common languages and test framework profilers;
- Build parsers, if necessary, so that the output from each can be converted to the same format;
- Display data from these artifacts in a list view under the 'CI/CD' or 'Analytics' tabs in the sidebar, and periodically clean up old data.
This is similar to the way we do Requirements Management. A job in the pipeline generates an artifact which is parsed and used to mark requirements as passed or failed.
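The MVC step could look roughly like the sketch below. It is only a sketch under stated assumptions: it reads the file written by RSpec's built-in JSON formatter (`rspec --format json --out rspec.json`) rather than the profiling data the existing script stores, and the helper names, output keys and file names are all hypothetical.

```ruby
require "json"

# Hypothetical helper: pick the slowest examples from a parsed RSpec JSON
# formatter report and reduce each one to a small, framework-neutral record.
# The output keys ("name", "file", "line", "seconds") are an assumed common
# format, not an existing GitLab schema.
def slowest_examples(report, limit: 1000)
  report.fetch("examples", [])
        .select { |ex| ex["run_time"] }     # skip entries without timing data
        .sort_by { |ex| -ex["run_time"] }   # slowest first
        .first(limit)
        .map do |ex|
          {
            "name" => ex["full_description"],
            "file" => ex["file_path"],
            "line" => ex["line_number"],
            "seconds" => ex["run_time"].round(4)
          }
        end
end

# Hypothetical entry point for a CI job: read the RSpec report and write the
# file that the job would then expose as an artifact.
def write_artifact(input_path, output_path, limit: 1000)
  report = JSON.parse(File.read(input_path))
  File.write(output_path, JSON.pretty_generate(slowest_examples(report, limit: limit)))
end
```

A job would call something like `write_artifact("rspec.json", "slowest_tests.json")` after the test run and list the output file under its artifacts. Parsers for other frameworks would only need to emit the same four keys.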
Further details
- The GitLab RSpec Profiling project has more info on how that project works: https://gitlab.com/gitlab-org/rspec_profiling_stats
- The epic for making RSpec profiling data usable within gitlab-org/gitlab is here: &3752