
Fail after 20 specs failure by default

David Dieulivol requested to merge ddieulivol-fail_after_x_failures_rspec into master

From Draft to Ready

Context

Failed specs can take a long time to run, because they may hit timeouts before the error is reached. This currently makes some RSpec jobs run for a long time, which lengthens pipeline duration.

Most of those errors would probably be caused by flaky tests anyway, in which case we would want to retry the job.

What does this MR do and why?

Fail after 20 spec failures by default.

This commit introduces a limit on the number of spec failures allowed before we fail the RSpec job that ran them (defaults to 20 for now).
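As a rough illustration of the idea (not the actual change in this MR), the limit could be passed to RSpec through its `--fail-fast=COUNT` command-line option. The helper name `fail_fast_args` and the `RSPEC_FAIL_FAST_THRESHOLD` environment variable below are hypothetical and only serve the sketch:

```ruby
# Minimal sketch, assuming the limit is read from an environment variable.
# RSPEC_FAIL_FAST_THRESHOLD and fail_fast_args are hypothetical names.
DEFAULT_FAIL_FAST_LIMIT = 20

# Builds the RSpec CLI arguments that abort the run after N failures.
# `env` can be ENV or any Hash, which keeps the helper easy to test.
def fail_fast_args(env = ENV)
  limit = Integer(env.fetch('RSPEC_FAIL_FAST_THRESHOLD', DEFAULT_FAIL_FAST_LIMIT.to_s))
  raise ArgumentError, 'fail-fast limit must be at least 1' if limit < 1

  # --fail-fast=COUNT is RSpec's own option for stopping after COUNT failures.
  ["--fail-fast=#{limit}"]
end
```

A job script could then splat the result into the RSpec invocation, for example `system('bundle', 'exec', 'rspec', *fail_fast_args)`.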

Why a fixed value? Isn't it too low?

I think 20 failures in a single job constitutes "good feedback" about which specs are failing, especially if we get that feedback much earlier because the pipeline fails sooner.

A smaller number of failed specs is also easier to digest, as I don't expect people to read more than 5 spec failures from a single job.

I thought about using a percentage, but I don't think it would add much value, particularly since the number of specs varies greatly between jobs (e.g. a 10% threshold on 5000 specs means we would need at least 500 failures before we "fail fast", which doesn't sound helpful, as the engineer most probably won't read through those 500 errors).
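To make the arithmetic concrete, here is a small sketch comparing a percentage-based threshold with the fixed one. The helper name `percentage_threshold` and the job sizes other than 5000 are made up for illustration:

```ruby
# Minimal sketch: how many failures a percentage-based threshold would
# require for jobs of different sizes. percentage_threshold is a
# hypothetical helper, not part of this MR.
FIXED_LIMIT = 20

def percentage_threshold(total_specs, percent)
  (total_specs * percent / 100.0).ceil
end

# Illustrative job sizes; only the 5000-spec case comes from the description.
[200, 1000, 5000].each do |total|
  puts "#{total} specs: 10% threshold = #{percentage_threshold(total, 10)}, fixed = #{FIXED_LIMIT}"
end
```

With a 10% threshold, the 5000-spec job would only stop after 500 failures, while the fixed limit stops every job at 20.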

If we see there are valid use-cases for having a higher limit, we could consider adding a label like skip-rspec-fail-fast to remove this limit for the current MR.
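One possible way to detect such a label, sketched under the assumption that the job runs in a merge request pipeline where GitLab exposes the MR labels as a comma-separated list in the predefined `CI_MERGE_REQUEST_LABELS` variable. The helper name `skip_fail_fast?` is hypothetical:

```ruby
# Minimal sketch: decide whether the fail-fast limit should be lifted
# based on the merge request labels. skip_fail_fast? is a hypothetical helper.
SKIP_LABEL = 'skip-rspec-fail-fast'

# `env` can be ENV or any Hash, which keeps the helper easy to test.
def skip_fail_fast?(env = ENV)
  labels = env.fetch('CI_MERGE_REQUEST_LABELS', '').split(',').map(&:strip)
  labels.include?(SKIP_LABEL)
end
```

The job script could then omit the failure limit whenever `skip_fail_fast?` returns true.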

Does it work?

I introduced a test commit (!124444 (253e865a)) that should introduce 122 spec failures.

We stop after 20 failed specs 🎉

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by David Dieulivol
