Improve systemic errors detection by looking at the first backtrace line (!46) · Merge requests · GitLab.org / ruby / gems / gitlab_quality-test_tooling

Rémy Coutable requested to merge improve-systemic-error-detection into main Jun 30, 2023

What does this MR do and why?

We now look at the first line of the error backtrace to group errors and better detect systemic errors, i.e. errors that happen systematically after a certain point, most probably due to a runner environment issue (e.g. resources are exhausted, PG doesn't have enough memory, etc.).
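To illustrate the idea, here is a minimal sketch of grouping failures by the first line of their backtrace and flagging large groups as systemic. It is not the gem's actual implementation: `FailedTest`, `group_by_first_backtrace_line`, `systemic_failures`, the `threshold` keyword, and its default of 3 are all hypothetical names and values chosen for this example.

```ruby
# Hypothetical sketch, not the gem's actual code.
# A failure is modelled as a name, an error message, and a backtrace (array of strings).
FailedTest = Struct.new(:name, :message, :backtrace)

# Groups failures by the first line of their backtrace.
# Failures without a backtrace end up under the nil key.
def group_by_first_backtrace_line(failures)
  failures.group_by { |failure| Array(failure.backtrace).first }
end

# Returns the groups whose size reaches the (assumed) threshold.
# Failures that share the same first backtrace line this often are
# treated as one systemic error rather than as independent failures.
def systemic_failures(failures, threshold: 3)
  group_by_first_backtrace_line(failures).select do |first_line, group|
    !first_line.nil? && group.size >= threshold
  end
end

failures = [
  FailedTest.new("spec A", "PG::ConnectionBad", ["lib/db.rb:10:in 'connect'", "spec/a_spec.rb:5"]),
  FailedTest.new("spec B", "PG::ConnectionBad", ["lib/db.rb:10:in 'connect'", "spec/b_spec.rb:9"]),
  FailedTest.new("spec C", "PG::ConnectionBad", ["lib/db.rb:10:in 'connect'", "spec/c_spec.rb:3"]),
  FailedTest.new("spec D", "expected 1, got 2", ["spec/d_spec.rb:12"])
]

systemic = systemic_failures(failures, threshold: 3)
```

In this sketch, the three failures that originate from the same line in `lib/db.rb` form one systemic group, while the isolated failure in `spec/d_spec.rb` is left out. A lower threshold catches systemic errors earlier but increases the chance of flagging generic, legitimate failures.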

Note: There's a risk that "legit" failure messages that are a bit generic, e.g. Failure/Error: expect(job).to be_successful(timeout: 400), might be detected as systemic with this change. I'll open an MR to make the systemic error detection threshold customizable.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

I have evaluated the MR acceptance checklist for this MR.

Edited Jun 30, 2023 by Rémy Coutable
