Fail pipeline early when there is rspec failure related to MR changes
What does this MR do?
Provide faster feedback to MR authors by failing the pipeline early when there is an rspec failure related to the MR changes.
This is done by:
- Adding an `rspec fail-fast` job that runs in parallel with the other `rspec *` jobs. This job runs the test files identified by the `detect-tests` job.
- Adding a `cancel-current-pipeline` job that runs if the `rspec fail-fast` job fails. This job cancels the ongoing pipeline and all its jobs.
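For illustration, one way a job like `cancel-current-pipeline` could cancel the running pipeline is the GitLab pipelines API endpoint `POST /projects/:id/pipelines/:pipeline_id/cancel`. The sketch below is not taken from this MR and may differ from what the job actually does. It only builds the request from the predefined CI variables `CI_API_V4_URL`, `CI_PROJECT_ID` and `CI_PIPELINE_ID`; the function name `build_cancel_request` and the `token` argument are hypothetical.

```python
import urllib.request


def build_cancel_request(env, token):
    """Build (but do not send) a request that cancels the current pipeline.

    `env` is a mapping of CI variables, e.g. os.environ inside a CI job.
    `token` is an API token allowed to cancel pipelines (assumption:
    how the token is provided is not specified here).
    """
    url = "{api}/projects/{project}/pipelines/{pipeline}/cancel".format(
        api=env["CI_API_V4_URL"],
        project=env["CI_PROJECT_ID"],
        pipeline=env["CI_PIPELINE_ID"],
    )
    return urllib.request.Request(
        url,
        method="POST",
        headers={"PRIVATE-TOKEN": token},
    )
```

Inside a job, the request could then be sent with `urllib.request.urlopen(build_cancel_request(os.environ, token))`.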
The jobs depend on the following CI variables:
- `RSPEC_FAIL_FAST_ENABLED=true` must be explicitly set for the 2 jobs to be created. Otherwise, the pipeline runs as per the status quo, without failing early.
- `RSPEC_FAIL_FAST_TEST_FILE_COUNT_THRESHOLD` sets the maximum number of test files, in order to keep the `rspec fail-fast` job duration short. The `detect-tests` job detects a list of test files that may be affected by the MR. If the number of test files is greater than the threshold, the `rspec fail-fast` job ends without running the tests, and all the other `rspec` jobs in the pipeline run as normal. This prevents `rspec fail-fast` from taking too long and delaying the entire pipeline.
For MR authors:
- MR authors may choose to skip the fast feedback by adding `[SKIP RSPEC FAIL-FAST]` to the MR title.
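The conditions described above can be summarised in a small decision function. This is only a sketch of the described behaviour, not the CI configuration from this MR: the names `Decision` and `fail_fast_decision` are hypothetical, the order of the checks is assumed, and the title tag is assumed to be matched as a case-sensitive substring that prevents the jobs from being created.

```python
from enum import Enum


class Decision(Enum):
    NOT_CREATED = "fail-fast jobs are not part of the pipeline"
    SKIP_TESTS = "rspec fail-fast ends without running tests"
    RUN_TESTS = "rspec fail-fast runs the detected test files"


SKIP_TAG = "[SKIP RSPEC FAIL-FAST]"


def fail_fast_decision(env, mr_title, detected_test_files):
    """Return what the fail-fast jobs would do for a given MR (illustrative)."""
    # The jobs exist only when the variable is explicitly set to "true".
    if env.get("RSPEC_FAIL_FAST_ENABLED") != "true":
        return Decision.NOT_CREATED

    # Assumption: case-sensitive substring match on the MR title.
    if SKIP_TAG in mr_title:
        return Decision.NOT_CREATED

    # Assumption: the threshold variable is set and holds an integer.
    threshold = int(env["RSPEC_FAIL_FAST_TEST_FILE_COUNT_THRESHOLD"])

    # Strictly greater than the threshold: end without running the tests.
    if len(detected_test_files) > threshold:
        return Decision.SKIP_TESTS

    return Decision.RUN_TESTS
```

Note that a test file count equal to the threshold still runs the tests, since the description only skips them when the count is greater than the threshold.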
Test plan
- Test against types of pipelines:
  - docs !41278 (closed) - no change as docs pipelines don't run tests ✅
  - backend !41276 (closed) ✅
  - frontend !41277 (closed) ✅
  - qa !41279 (closed) ✅
- Test threshold: !41659 (closed) ✅
- Test skipping using MR title: https://gitlab.com/gitlab-org/gitlab/-/pipelines/187073389 ✅
- Test security MR: https://gitlab.com/gitlab-org/security/gitlab/-/merge_requests/923 ✅
  - the behaviour works in security MRs, but we could start by disabling it there by not setting the CI variable `RSPEC_FAIL_FAST_ENABLED=true`
Measurements
Leading indicators:
- Failed MR pipeline duration: https://app.periscopedata.com/app/gitlab/496118/Engineering-Productivity-Sandbox?widget=6752376&udv=833427
  - the duration of failed MR pipelines is expected to go down as we are failing the pipeline earlier
- average `rspec fail-fast` job duration: https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=9699553&udv=1005631
  - the average duration of this job correlates to `RSPEC_FAIL_FAST_TEST_FILE_COUNT_THRESHOLD`. This measurement gives us visibility into the effect of varying `RSPEC_FAIL_FAST_TEST_FILE_COUNT_THRESHOLD`, which in turn affects the failed MR pipeline duration.
- pass/fail ratio of the `rspec fail-fast` job, part of https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=9699553&udv=1005631
  - this gives visibility into how useful this job is in shortening feedback time.
Lagging indicators:
- average cost of MR pipeline
- average cost of failed MR pipeline (not sure if we have this at the moment?)
Does this MR meet the acceptance criteria?
Conformity
- Changelog entry
- Documentation (if required)
- Code review guidelines
- Merge request performance guidelines
- Style guides
- Database guides
- Separation of EE specific content
Part of #227531 (closed)
Edited by Albert Salim