Errors introduced by MRs changing the "lib/gitlab/metrics" folder are not caught in CI
Context
We've had two separate incidents where changes in the lib/gitlab/metrics
folder caused an error that CI did not catch:
- gitlab-org/quality/engineering-productivity/review-apps-broken-incidents#207 (closed) (wasn't caught by review-apps, because review-apps were allowed to fail back then)
- #425841 (closed) (wasn't caught by review-apps, because we disabled them in MRs)
An initial Root Cause Analysis was made after the first incident (1, 2, 3), but we didn't find a fix.
Goal
Find a way to have those changes tested and run inside the MR (the earlier in the CI/CD pipeline, the better).
Questions to answer
- Why wasn't this problem caught by CI/CD preflight checks?
- Why wasn't this problem caught by GDK tests?
Steps to reproduce
- Port the changes from the most recent MR that caused issues into your own MR
- Experiment to find a CI change that actually surfaces this error (ideally in a preflight check, or in an E2E test)
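One possible starting point for the experiment is to detect, inside the MR pipeline, whether the folder was touched, and use that to force a job that would hit the error. This is a minimal sketch, not an existing GitLab tool: the function name `touches_watched_dir` is hypothetical, and how the result gets wired into a job (for example by feeding it the output of `git diff --name-only`) is an assumption left open here.

```python
from pathlib import PurePosixPath

# The folder named in this issue. Everything else in this sketch is hypothetical.
WATCHED_DIR = PurePosixPath("lib/gitlab/metrics")


def touches_watched_dir(changed_files, watched_dir=WATCHED_DIR):
    """Return True if any changed path lies inside watched_dir."""
    for path in changed_files:
        candidate = PurePosixPath(path)
        # A file is inside the folder when the folder is one of its ancestors.
        # Comparing path components avoids false positives on sibling names
        # such as "lib/gitlab/metrics_helper.rb".
        if candidate == watched_dir or watched_dir in candidate.parents:
            return True
    return False
```

A job could call this with the MR's changed file list and, when it returns True, run whichever check turns out to reproduce the error (preflight, GDK, or E2E).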
Edited by David Dieulivol