Implement a global timeout at the spec level
Context
Some specs are running for more than 60 minutes (see #434948 (closed) and #435135 (comment 1692699550) for examples).
Discussed initially in gitlab-org/quality/engineering-productivity/team#323 (comment 1686250592).
Goal
Consider adding a global timeout mechanism for RSpec tests that would short-circuit any spec taking longer than a set limit (e.g. double the longest test time - apparently 5 minutes). This would have a few advantages:
- We would have fewer RSpec jobs timing out (and more erroring out), so the artifacts would be available in the job
- It would make it clearer which specs took longer than they should have
- It would produce a more actionable error message than "the job timed out after 90min"
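As a rough illustration of the idea, here is a minimal sketch of a per-example timeout built on Ruby's stdlib `Timeout` and an RSpec `around` hook. The `SpecTimeout` module, its `Exceeded` error, and the 300-second default are hypothetical names and values chosen for this example (the default simply reuses the 5-minute figure above as a placeholder); this is not an existing GitLab or RSpec API.

```ruby
require 'timeout'

# Hypothetical helper for illustration; not an existing GitLab or RSpec API.
module SpecTimeout
  # Assumed placeholder limit, reusing the 5-minute figure from this issue.
  DEFAULT_LIMIT_SECONDS = 5 * 60

  # Dedicated error class so a timeout is easy to tell apart from other failures.
  Exceeded = Class.new(StandardError)

  # Runs the block and returns its value. If the block runs longer than
  # `limit` seconds, it is interrupted and Exceeded is raised with a
  # message naming the offending spec.
  def self.run(description, limit: DEFAULT_LIMIT_SECONDS)
    message = "#{description} exceeded the #{limit}s spec-level timeout"
    Timeout.timeout(limit, Exceeded, message) { yield }
  end
end

# Wiring into RSpec, applied only when RSpec is loaded.
if defined?(RSpec) && RSpec.respond_to?(:configure)
  RSpec.configure do |config|
    config.around(:each) do |example|
      SpecTimeout.run(example.full_description) { example.run }
    end
  end
end
```

With this in place, a spec that overruns the limit would fail on its own with a message naming the spec, rather than taking the whole job down. One caveat: `Timeout.timeout` interrupts the block by raising in the running thread, so a spec that rescues `StandardError` broadly could swallow the error.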
Considerations
Interesting points from the initial discussion:
I'm all for it, but we do need to think about what action we are prompting MR authors to take if their MR pipeline fails because a spec exceeded that threshold - should we ask them to quarantine it? Is that going to cause more friction in the workflow? On the plus side, this approach encourages failing early, so I like it.
I do like this, but I think we shouldn't fail the job if it can pass because it can hinder development.
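One way to explore the non-failing option raised in the last comment is a soft limit that only reports slow specs. The sketch below is an assumption-laden illustration: `SlowSpecReporter`, its `measure` method, and the 300-second threshold are hypothetical names and values, not part of RSpec or GitLab. Unlike a hard timeout, it does not short-circuit a hanging spec; it only flags the spec once it finishes.

```ruby
# Hypothetical helper for illustration: reports slow specs without failing them.
module SlowSpecReporter
  # Assumed placeholder soft limit, in seconds.
  DEFAULT_THRESHOLD_SECONDS = 5 * 60

  # Runs the block, then returns a warning string if it took longer than
  # `threshold` seconds, or nil if it finished within the soft limit.
  def self.measure(description, threshold: DEFAULT_THRESHOLD_SECONDS)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    return nil if elapsed <= threshold

    format('%s took %.2fs (soft limit: %s seconds)', description, elapsed, threshold)
  end
end

# Wiring into RSpec, applied only when RSpec is loaded.
if defined?(RSpec) && RSpec.respond_to?(:configure)
  RSpec.configure do |config|
    config.around(:each) do |example|
      message = SlowSpecReporter.measure(example.full_description) { example.run }
      warn(message) if message
    end
  end
end
```

The trade-off is that the job still passes, so MR authors are not blocked, but the job-level timeout remains the only protection against a spec that never finishes.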