Automate Feature Flag Clean Up
Problem
The process described in Feature Flags For GitLab Development involves many manual steps, and one of them is cleanup. Today, engineers have to do the following to correctly finish using a feature flag:
- Remove the feature flag YAML entry in code.
- Remove the overriding feature flag in the database.
- Close a rollout issue.
I think we can automate these steps by the following process change:
Proposal
- We introduce a maintenance worker (i.e. a Sidekiq cron worker) to clean up unused overriding flags.
  - The unused flags can be fetched by `(persisted_flags - YAML_definitions)`. Given that all feature flag references in code must have YAML definitions, persisted flags that don't exist in the definitions are not used anywhere.
  - This approach works in any GitLab instance, including production.
- We introduce a new attribute in the YAML definition, named `auto_clean_up: true/false`.
  - This option can be enabled when:
    - `type` is `development`
    - `default_enabled` is `true`
    - `milestone` is present
    - `introduced_by_url` is in the gitlab canonical project.
    - `rollout_issue_url` is in the gitlab canonical project.
- The maintenance worker auto-generates a merge request for cleanable flags.
  - Definition of "cleanable":
    - `auto_clean_up` is `true`
    - `milestone` is older than the current milestone. For example, if a feature was introduced in `13.6`, the flag is removed in `13.7`.
  - Process flow:
    - Fetch a cleanable flag.
    - Create a branch named `chore/feature-flags/removal/#{flag.name}` in the gitlab canonical project.
    - Create a commit on the branch that deletes the feature flag YAML definition.
    - Create a merge request that targets `master`.
      - This MR includes a `Close #{rollout_issue_url}` mention, which means it automatically closes the rollout issue when it's merged.
      - This MR is labeled feature flag and automations.
      - This MR is assigned to the author of the flag.
    - The author fixes the pipeline in the merge request by removing all occurrences of the flag in the source code.
    - All of the automation events are reported in `rollout_issue_url`.
- Documentation
Please see !48463 (closed) for more details on the complete flow of this.
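
The selection rules in the proposal can be sketched in plain Ruby. This is only an illustration under stated assumptions, not the actual worker: `FlagDefinition`, `CANONICAL_PREFIX`, the helper method names, and the example URLs are all hypothetical, and comparing milestones with `Gem::Version` is one possible way to decide that a milestone is "older".

```ruby
# Hypothetical stand-in for one parsed YAML definition. The field names mirror
# the attributes named in the proposal.
FlagDefinition = Struct.new(
  :name, :type, :default_enabled, :milestone,
  :introduced_by_url, :rollout_issue_url, :auto_clean_up,
  keyword_init: true
)

# Assumption: URLs inside the canonical project start with this prefix.
CANONICAL_PREFIX = "https://gitlab.com/gitlab-org/gitlab/"

# Persisted overriding flags that have no YAML definition are unused.
def unused_persisted_flags(persisted_names, definitions)
  persisted_names - definitions.map(&:name)
end

# Preconditions listed in the proposal for enabling auto_clean_up.
def auto_clean_up_allowed?(flag)
  flag.type == "development" &&
    flag.default_enabled == true &&
    !flag.milestone.to_s.empty? &&
    flag.introduced_by_url.to_s.start_with?(CANONICAL_PREFIX) &&
    flag.rollout_issue_url.to_s.start_with?(CANONICAL_PREFIX)
end

# A flag is cleanable when auto_clean_up is true and its milestone is older
# than the current one. Gem::Version orders "13.6" before "13.10".
def cleanable?(flag, current_milestone)
  flag.auto_clean_up == true &&
    Gem::Version.new(flag.milestone) < Gem::Version.new(current_milestone)
end

# Branch name and closing mention as described in the process flow.
def removal_branch_name(flag)
  "chore/feature-flags/removal/#{flag.name}"
end

def removal_mr_description(flag)
  "Close #{flag.rollout_issue_url}"
end

# Example flag with made-up URLs, for illustration only.
example = FlagDefinition.new(
  name: "my_feature", type: "development", default_enabled: true,
  milestone: "13.6",
  introduced_by_url: "#{CANONICAL_PREFIX}-/merge_requests/EXAMPLE",
  rollout_issue_url: "#{CANONICAL_PREFIX}-/issues/EXAMPLE",
  auto_clean_up: true
)
puts removal_branch_name(example) if cleanable?(example, "13.7")
```

Keeping each rule in its own small predicate makes it easy to report in the rollout issue which condition blocked a cleanup.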
Previous proposal
Development phase
- To enable a feature flag by default (typically for shipping a feature for a release), engineers should change `default_enabled` to `true` in YAML.
- When the `default_enabled` change is merged into `master` and deployed on production, the automation removes overriding flags in the database.
- When the `default_enabled` change is merged into `master` and deployed on production, the automation closes the rollout issue, i.e. `rollout_issue_url`.
Cleanup phase
- If a feature flag fulfills these conditions, the automation creates a cleanup MR:
  - `default_enabled` has been `true` for more than 30 days.
  - An overriding flag does not exist in the database (in case an SRE disabled the flag after the above phase is complete).
  - `type` is `development`
- The cleanup MR is automatically generated with the following changes:
  - The YAML entry is removed automatically.
  - The DRI manually checks the invocations of the feature flag and removes them. Otherwise, the tests will fail.
  - Assign the group label to the MR based on the `group` attribute.
  - Assign the creator of the flag to the MR.
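
The cleanup-phase conditions above amount to a single predicate. The following is a minimal sketch, assuming the date on which `default_enabled` became `true` is known. The method name `cleanup_mr_due?` and its keyword arguments are hypothetical, and the proposal does not say where that date would come from.

```ruby
require "date"

# "More than 30 days" from the cleanup-phase conditions.
CLEANUP_AFTER_DAYS = 30

# enabled_since: date on which default_enabled became true (assumed input).
# override_persisted: whether an overriding flag still exists in the database.
def cleanup_mr_due?(type:, default_enabled:, enabled_since:, override_persisted:, today: Date.today)
  type == "development" &&
    default_enabled == true &&
    (today - enabled_since) > CLEANUP_AFTER_DAYS &&
    !override_persisted
end
```

Passing `today` explicitly keeps the check deterministic when it is exercised in tests.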
Edited by Shinya Maeda