Flaky Test Intervention - part 1

Pipeline instability

A high number of flaky tests is significantly hindering our ability to deliver. Pipelines fail every day because of known flaky tests, and we need to act.

Call to Action

Action Required by Friday, 2025-10-17

Each team listed below has at least one unreliable test. These tests will be quarantined on Friday, 2025-10-17 unless you take action before then.

What you need to do:

Review each test issue and choose one of these options:

  1. Fix or close – Repair the flaky test, or close the issue if the failure was environment-related (Duo can help identify the root cause)
  2. Delete – Remove the test if it's no longer needed
  3. Quarantine – Temporarily quarantine the test if you need more time to investigate or to decide whether it is still needed
  4. Reassign – Update the team labels if a different team should own this test
  5. Remove labels – Remove the group labels if this is a shared test/library that your team cannot fix

The grace period is intended to give teams time to review the affected tests and address any critical tests so we don’t lose test coverage.
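The five options above can be read as a small decision procedure. The following is only an illustrative sketch: the `FlakyTestIssue` fields, the `choose_action` function, and the order in which the checks run are assumptions made for this example, not part of any DevEx tooling or policy.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    FIX_OR_CLOSE = "fix or close"
    DELETE = "delete"
    QUARANTINE = "quarantine"
    REASSIGN = "reassign"
    REMOVE_LABELS = "remove labels"


@dataclass
class FlakyTestIssue:
    # All fields are hypothetical triage answers, filled in by a reviewer.
    still_needed: bool            # is the test still worth keeping?
    shared_test_or_library: bool  # shared code that this team cannot fix
    owned_by_this_team: bool      # does the current group label fit?
    root_cause_known: bool        # e.g. a code bug or an environment failure


def choose_action(issue: FlakyTestIssue) -> Action:
    """Map a reviewed issue to one of the five options (assumed ordering)."""
    if not issue.still_needed:
        return Action.DELETE
    if issue.shared_test_or_library:
        return Action.REMOVE_LABELS
    if not issue.owned_by_this_team:
        return Action.REASSIGN
    if issue.root_cause_known:
        return Action.FIX_OR_CLOSE
    # More time is needed to investigate or decide.
    return Action.QUARANTINE
```

For example, an issue for a test that is still needed, owned by the reviewing team, and has no known root cause yet would map to `Action.QUARANTINE` under these assumptions.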

Affected teams

| Group | Tests categorized as "Top Flaky Test" | Flaky Test Issue* | Quarantined Tests | EM |
|---|---|---|---|---|
| group::authorization | | 4 | 0 | @ajaythomasinc |
| group::authentication | | 9 | 6 | @adil.farrukh |
| group::security policies | | 3 | 3 | @alan |
| group::compliance | | 45 | 2 | @nrosandich |
| group::product planning | | 96 | 15 | @vshushlin (DRI: @mksionek) |
| group::project management | | 89 | 16 | @acroitor |
| group::seat management | | 23 | 3 | |
| group::platform insights | | 1 | 0 | @nicholasklick |
| group::ci platform | | 3 | 0 | @golnazs |
| group::database frameworks | | 2 | 2 | @alexives |
| group::provision | | 12 | 0 | @bhrai |
| group::container registry | | 0 | 0 | @crystalpoole |
| group::source code | 9 | 38 | 3 | @andrevr |
| group::security infrastructure | | 15 | 2 | @ryaanwells |
| group::runners platform | | 1 | 0 | @kkyrala |
| group::custom models | | 1 | 0 | |
| group::code review | | 23 | 22 | @francoisrose |
| group::import | | 7 | 7 | @thiagocsf |
| group::security insights | 1 | 1 | 0 | |
| group::package registry | | 4 | 0 | @crystalpoole |
| group::runner core | | 2 | 1 | @adebayo_a |
| group::pipeline execution | | 6 | 9 | @drew |
| group::acquisition | | 3 | 1 | @kniechajewicz |
| group::knowledge | | 3 | 1 | @armin.pasalic |
| group::geo | | 5 | 3 | @luciezhao |
| group::engagement | | 10 | 1 | @ghosh-abhinaba |
| group::mlops | | 1 | 1 | |
| group::editor extensions | | 1 | 0 | @aelhusseiny |

  * The number of test failure issues doesn't always match the number of failing tests.

Timeline

  1. Between now and Friday, 2025-10-17 – Investigate test failures and take action
  2. Friday, 2025-10-17 – DevEx will quarantine all outstanding flaky tests in this report
  3. From 2025-10-15 to 2026-01-17 – Teams have 3 months to fix and un-quarantine these tests. Any tests still quarantined after 2026-01-16 will be permanently deleted.
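As a quick check of the date arithmetic in the timeline, the sketch below counts three calendar months forward from the quarantine date. It assumes the window starts on the quarantine date, 2025-10-17; the `add_months` helper and the constant names are hypothetical and exist only for this example.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    The day is clamped to the last day of the target month, so
    2025-11-30 plus 3 months gives 2026-02-28.
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


QUARANTINE_DATE = date(2025, 10, 17)  # quarantine day from the timeline
GRACE_MONTHS = 3                      # "Teams have 3 months"

end_of_window = add_months(QUARANTINE_DATE, GRACE_MONTHS)
print(end_of_window.isoformat())  # 2026-01-17
```

Under that assumption the three-month window ends on 2026-01-17, which matches the end date given in step 3.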

Why we're doing this

This intervention quickly restores our test health and rebuilds confidence in our testing suite. It contributes to ongoing efforts to unflake the flakys and identify the top issues affecting pipeline stability each week.

In the future, DevEx will introduce an automated, data-driven quarantine process powered by Duo. We'll share the working epic as we approach the design and implementation phase.
