Flaky Test Intervention - part 1
Pipeline instability
The high number of flaky tests is significantly hindering our ability to deliver. Pipelines fail every day because of known flaky tests, and we need to act.
Call to Action
Action Required by Friday, 2025-10-17
Each team listed below has at least one unreliable test. These tests will be quarantined on Friday, 2025-10-17 unless you take action before then.
What you need to do:
Review each test issue and choose one of these options:
- Fix or close – Repair the flaky test, or close the issue if the failure was environment-related (Duo can help identify the root cause)
- Delete – Remove the test if it's no longer needed
- Quarantine – Temporarily quarantine the test if you need more time to investigate or decide if it's necessary
- Reassign – Update the team labels if a different team should own this test
- Remove labels – Remove the group labels if this is a shared test/library that your team cannot fix
The grace period is intended to give teams time to review the affected tests and address any critical tests so we don’t lose test coverage.
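The options above can be read as a small decision procedure. The sketch below is only an illustration: the `FlakyTest` fields, the `Action` names, and the order in which the questions are asked are assumptions made for this example, not an official DevEx rule or tool.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    FIX_OR_CLOSE = "fix or close"
    DELETE = "delete"
    QUARANTINE = "quarantine"
    REASSIGN = "reassign"
    REMOVE_LABELS = "remove labels"


@dataclass
class FlakyTest:
    # Hypothetical triage answers, not fields of any real issue tracker.
    owned_by_other_team: bool = False
    shared_and_unfixable_by_team: bool = False
    still_needed: bool = True
    needs_more_time: bool = False


def choose_action(test: FlakyTest) -> Action:
    """Map triage answers to one of the five options (assumed precedence)."""
    # Ownership questions come first: a team should not fix, delete,
    # or quarantine a test that it does not own.
    if test.owned_by_other_team:
        return Action.REASSIGN
    if test.shared_and_unfixable_by_team:
        return Action.REMOVE_LABELS
    if not test.still_needed:
        return Action.DELETE
    if test.needs_more_time:
        return Action.QUARANTINE
    # Default: repair the test, or close the issue if the failure
    # turned out to be environment-related.
    return Action.FIX_OR_CLOSE


if __name__ == "__main__":
    print(choose_action(FlakyTest(needs_more_time=True)).value)
```

Putting ownership first is a design choice for this sketch; teams may reasonably ask the questions in a different order.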
Affected teams
| Group | Tests categorized as "Top Flaky Test" | Flaky Test Issue* | Quarantined Tests | EM |
|---|---|---|---|---|
| group::authorization | 0 | | | |
| group::authentication | | | | |
| group::security policies | | | | |
| group::compliance | | | | |
| group::product planning | DRI: @mksionek | | | |
| group::project management | | | | |
| group::seat management | | | | |
| group::platform insights | 0 | | | |
| group::ci platform | 0 | | | |
| group::database frameworks | | | | |
| group::provision | 0 | | | |
| group::container registry | 0 | | | |
| group::source code | | | | |
| group::security infrastructure | | | | |
| group::runners platform | 0 | | | |
| group::custom models | 0 | | | |
| group::code review | | | | |
| group::import | | | | |
| group::security insights | 0 | | | |
| group::package registry | 0 | | | |
| group::runner core | | | | |
| group::pipeline execution | | | | |
| group::acquisition | | | | |
| group::knowledge | | | | |
| group::geo | | | | |
| group::engagement | | | | |
| group::mlops | | | | |
| group::editor extensions | 0 | | | |
- (*) The number of test failure issues does not always match the number of failing tests.
Timeline
- Between now and Friday, 2025-10-17 – Investigate test failures and take action
- Friday, 2025-10-17 – DevEx will quarantine all outstanding flaky tests in this report
- From 2025-10-17 to 2026-01-17 – Teams have 3 months to fix and un-quarantine these tests. Any tests still quarantined after 2026-01-16 will be permanently deleted.
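To make the timeline concrete, here is a minimal sketch that maps a calendar date to the phase an unresolved flaky test would be in. The two dates come from this page; the `phase` function, its return strings, and the constant names are hypothetical and exist only for illustration.

```python
from datetime import date

# Dates taken from the timeline above.
QUARANTINE_DATE = date(2025, 10, 17)         # DevEx quarantines outstanding tests
LAST_DAY_IN_QUARANTINE = date(2026, 1, 16)   # still quarantined after this: deleted


def phase(today: date) -> str:
    """Return the phase an unresolved flaky test is in on a given day."""
    if today < QUARANTINE_DATE:
        return "grace period"
    if today <= LAST_DAY_IN_QUARANTINE:
        return "quarantined"
    return "eligible for deletion"


if __name__ == "__main__":
    for day in (date(2025, 10, 16), date(2025, 10, 17), date(2026, 1, 17)):
        print(day.isoformat(), phase(day))
```

The sketch treats the quarantine day itself as the first day of the quarantine window, which is an assumption about how the boundary is handled.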
Why we're doing this
This intervention is meant to quickly restore test health and rebuild confidence in our test suite. It contributes to ongoing efforts to fix flaky tests and to identify the top issues affecting pipeline stability each week.
In the future, DevEx will introduce an automated, data-driven quarantine process powered by Duo. We'll share the working epic as we approach the design and implementation phase.