Automate opening release/tasks issues on QA failures
🌳 Context
According to our Resolving QA Failures runbook, we should open an issue on release/tasks for every QA failure:
When a QA job fails in any of our environments (gstg-cny, gstg, gprd, pre, or release)
- Follow the steps in the Handling Deploy Failures runbook to create an issue in the release tracker.
The Handling Deploy Failures runbook mentioned there says:
If you cannot resolve the failure within 5 minutes...[open an issue]
QA tests almost always take more than 5 minutes to run, so this means we should always be opening an issue.
Having a record of these retries also allows us to see how they affect our deployment velocity and missed packages. Additionally, it gives us a way to identify common flaky tests and communicate them to Quality.
Ideally, we would open an issue for every QA failure. But for flaky tests that succeed after a single retry, this is tedious and time-consuming, adding to release manager toil.
💡 Proposal
- Update the CI rules to automatically open an issue on release/tasks when QA fails and assign it to release managers. If the test succeeds after a retry, the release manager can simply close the issue, but we now have a record of it.
- Update the QA failure runbook to ask release managers to note these failures on the auto-deploy packages dashboard for higher visibility and insight into how they impact the weekly deployment frequency.
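As a rough illustration of the first proposal item, the sketch below builds a request for the GitLab REST endpoint `POST /projects/:id/issues` with the issue assigned to a list of users. It is only a sketch under stated assumptions: the function names, the issue title and description wording, and the `GITLAB_API` base URL are hypothetical, and this is not how release-tools implements the feature.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the tracker lives on gitlab.com; adjust for another instance.
GITLAB_API = "https://gitlab.com/api/v4"


def build_qa_failure_issue(environment, job_url, assignee_ids):
    """Return the issue payload for a failed QA job (hypothetical wording)."""
    return {
        "title": f"QA failure on {environment}",
        "description": (
            f"A QA job failed on `{environment}`.\n\n"
            f"Failed job: {job_url}\n\n"
            "Close this issue if the job succeeds after a retry."
        ),
        # IDs of the current release managers, supplied by the caller.
        "assignee_ids": list(assignee_ids),
    }


def build_request(project_path, token, payload):
    """Build (but do not send) the POST request that creates the issue."""
    # The issues API accepts a URL-encoded project path in place of a numeric ID.
    project_id = urllib.parse.quote(project_path, safe="")
    return urllib.request.Request(
        url=f"{GITLAB_API}/projects/{project_id}/issues",
        data=json.dumps(payload).encode("utf-8"),
        headers={"PRIVATE-TOKEN": token, "Content-Type": "application/json"},
        method="POST",
    )
```

A CI job that runs only when the QA job fails could call `build_request(...)` with the tracker's project path and pass the result to `urllib.request.urlopen`. Keeping payload construction separate from sending makes the issue content easy to test without network access.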
Implementation specs
- Add classes to automatically generate a report: gitlab-org/release-tools!2635 (merged) / gitlab-org/release-tools!2645 (merged)
- Add a feature flag and logging information
- Add CI configuration
- Add a class to link the report in Slack (moved to #19860)
- Test
- Document (add to changelog)