Feature Flags sometimes cause QA failures which can go unnoticed

Problem Statement

In one recent example, Auto-Deploy was blocked for a large chunk of the AMER timezone's working day. Although a target change was identified, we first concluded that a flaky test was the culprit and pushed forward with later Auto-Deploy packages. Had we investigated the toggling of feature flags first, we could have found at least 2 flags that were responsible for the test failures.

We should determine how to better test feature flag changes so that they do not impact Auto-Deploy. There are a few ways we could attempt this, but there are some concerns to think through first, including involving the rest of Engineering to ensure that whichever option we implement does not hinder Engineering workflows.

Potential Mitigation Options

Trigger QA after FF Toggle

We already do this!

**Any time a Feature Flag is changed, kick off QA.**

Cons:

We need to update procedures around FFs that indicate a passing QA test
FFs can be flipped at any point in time. Should we block FF changes during deploys?
We'll want to figure out how to expose the QA pipelines to development teams so they can see the failed tests. Currently they are not exposed
We do not limit FF changes, so we could run into a situation where many flags are toggled, each kicking off a new QA pipeline 😬
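
To make the last concern concrete, here is a minimal sketch of one way toggles could be coalesced so that a burst of flag changes results in a single QA pipeline rather than one per toggle. Everything in it is hypothetical: the `QaTriggerCoalescer` class, the `quiet_period_s` parameter, and the flag names are illustrative assumptions, not part of our existing tooling, and actually starting the pipeline is left to the caller.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class QaTriggerCoalescer:
    """Hypothetical helper: batches FF toggles so one QA pipeline covers several."""

    # Assumed setting: seconds without any toggle before a batch is released.
    quiet_period_s: float
    pending_flags: List[str] = field(default_factory=list)
    last_toggle_at: Optional[float] = None

    def record_toggle(self, flag_name: str, now: float) -> None:
        """Remember a toggled flag and restart the quiet period."""
        if flag_name not in self.pending_flags:
            self.pending_flags.append(flag_name)
        self.last_toggle_at = now

    def flush_if_quiet(self, now: float) -> List[str]:
        """Return the flags one QA pipeline should cover, or [] if it is too early."""
        if self.last_toggle_at is None:
            return []
        if now - self.last_toggle_at < self.quiet_period_s:
            return []
        batch, self.pending_flags = self.pending_flags, []
        self.last_toggle_at = None
        return batch


coalescer = QaTriggerCoalescer(quiet_period_s=300)
coalescer.record_toggle("flag_a", now=0)
coalescer.record_toggle("flag_b", now=60)
too_early = coalescer.flush_if_quiet(now=120)  # only 60s since the last toggle
batch = coalescer.flush_if_quiet(now=400)      # 340s since the last toggle
```

The trade-off in a sketch like this is that batching delays the QA signal and makes it harder to attribute a failure to a single flag, which is worth weighing alongside the other concerns above.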

Option N...

Milestones

Discuss options
Issues/Epics for implementation created

cc @gitlab-org/delivery

Edited Nov 16, 2022 by John Skarbek
