[Experiment] Retry failed specs in a new process after the initial run
Abstract
Many flaky tests are unreliable because previous tests affect the global state. Retrying only the failing specs in a new RSpec process should therefore result in a better overall success rate.
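A minimal sketch of the idea, not necessarily how the pipeline implements it. It assumes RSpec's example status persistence file is enabled (`config.example_status_persistence_file_path`), reads the failed example IDs from it, and re-runs only those in a fresh `rspec` process. The helper names and the `bundle exec rspec` invocation are hypothetical.

```ruby
# Hypothetical helpers; only the persistence-file layout
# ("example_id | status | run_time |") comes from RSpec itself.

# Returns the IDs of examples marked as failed in the status file contents.
def failed_example_ids(status_text)
  status_text.each_line.filter_map do |line|
    id, status = line.split('|').map(&:strip)
    id if status == 'failed'
  end
end

# Builds the command for the second run, limited to the failed examples.
def retry_command(example_ids)
  ['bundle', 'exec', 'rspec', *example_ids]
end

# Re-runs the failures in a new process, so global state left behind by
# earlier examples in the first run cannot affect them.
def retry_failures_in_new_process(status_file)
  ids = failed_example_ids(File.read(status_file))
  return true if ids.empty?

  system(*retry_command(ids)) # true only if the retried examples all pass
end
```

An example that passes in the second run is likely order-dependent or sensitive to global state. One that fails again is more likely a genuine failure.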
Pros
- Automatically make the pipelines more successful
Cons
- Retries are transparent, so we'll need a way to surface the automatically retried tests so that they eventually get fixed
- The retrying will add a few more minutes to the RSpec job
Timeframe
- 14 days - from 2022-02-16 to 2022-03-02
Expected results
- 10-day moving average `master` success rate increase to 91% (currently at 86%)
- at least 25% less broken `master` notifications
- lower "Average Retry Count", ideally 0.03 instead of 0.07
- Increase of TtFF due to valid RSpec failures that will be run twice
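To put the targets above in relative terms, here is a small worked calculation. It assumes the failure rate is simply 100 minus the success rate; `relative_reduction` is a hypothetical helper used only for illustration.

```ruby
# Relative reduction from `before` to `after`, as a percentage of `before`.
def relative_reduction(before, after)
  ((before - after) / before.to_f * 100).round(1)
end

failure_rate_now    = 100 - 86 # 14%, derived from the current success rate
failure_rate_target = 100 - 91 # 9%, derived from the target success rate

puts relative_reduction(failure_rate_now, failure_rate_target) # 35.7
puts relative_reduction(0.07, 0.03)                            # 57.1
```

Under that assumption, reaching 91% would mean roughly a third fewer failing `master` pipelines, and the ideal "Average Retry Count" would be a bit less than half the current value.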
MR
Communication
- 2022-01-26 Communicate in #development
  :mega: We’re going to start a 20-day experiment to retry failing tests in a separate RSpec process in order to detect order-dependent/global state problems and to increase `master`’s stability: https://gitlab.com/gitlab-org/quality/team-tasks/-/issues/1148. If you notice anything weird related to RSpec jobs, please let us know in #g_engineering_productivity. :mega:
- Add an item in the upcoming Eng. Week in Review
- 2022-02-16 Communicate in #development
  :mega: We’re going to resume a 14-day experiment to retry failing tests in a separate RSpec process in order to detect order-dependent/global state problems and to increase `master`’s stability: https://gitlab.com/gitlab-org/quality/team-tasks/-/issues/1148. If you notice anything weird related to RSpec jobs, please let us know in #g_engineering_productivity. :mega:
Once the experiment is validated:
- Add an item in the upcoming Eng. Week in Review
- Communicate in #development, #quality and more
How to enable
Set the `$RETRY_FAILED_TESTS_IN_NEW_PROCESS` variable to `true`.
- Set to `true` on 2022-01-26 at 10:50 UTC for https://gitlab.com/gitlab-org/gitlab/-/settings/ci_cd and https://gitlab.com/gitlab-org/security/gitlab/-/settings/ci_cd.
- Set to `false` on 2022-01-27 at 17:29 UTC for https://gitlab.com/gitlab-org/gitlab/-/settings/ci_cd and https://gitlab.com/gitlab-org/security/gitlab/-/settings/ci_cd due to gitlab-org/gitlab#351341 (closed).
- Set to `true` on 2022-02-16 at 10:58 UTC for https://gitlab.com/gitlab-org/gitlab/-/settings/ci_cd and https://gitlab.com/gitlab-org/security/gitlab/-/settings/ci_cd.
- Set to `true` on 2022-02-23 at 16:20 UTC for https://gitlab.com/gitlab-org/gitlab-foss/-/settings/ci_cd.
- Set to `true` on 2022-04-15 at 16:30 UTC for https://dev.gitlab.org/gitlab/gitlab-ee/.
How to disable
- Globally: remove the `$RETRY_FAILED_TESTS_IN_NEW_PROCESS` variable (or set it to anything other than `true`)
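The enable/disable rule above can be sketched as a strict comparison against the string `true`. This is an illustration only: the actual CI scripts may perform the check differently (for example in shell), and the method name is hypothetical.

```ruby
RETRY_VARIABLE = 'RETRY_FAILED_TESTS_IN_NEW_PROCESS'

# Enabled only when the variable is exactly "true"; an unset variable or
# any other value (including "false", "1" or "TRUE") leaves retries off.
# `env` defaults to the process environment and accepts a Hash for testing.
def retry_in_new_process_enabled?(env = ENV)
  env[RETRY_VARIABLE] == 'true'
end
```

Treating every value other than `true` as disabled matches the two ways of disabling listed above: removing the variable or changing its value.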
Edited by Rémy Coutable