
Automate Feature Flag Clean Up

Problem

The process in Feature Flags For GitLab Development involves many manual steps, one of which is cleanup. Today, engineers have to do the following to correctly finish using a feature flag:

  1. Remove the feature flag YAML entry in code.
  2. Remove the overriding feature flag in the database.
  3. Close the rollout issue.

I think we can automate these steps by the following process change:

Proposal

  • We introduce a maintenance worker (i.e. a Sidekiq cron worker) to clean up unused overriding flags.
    • The unused flags can be fetched by (persisted_flags - YAML_definitions). Given that all feature flag references in code must have YAML definitions, persisted flags that don't exist in the definitions are not used anywhere.
    • This approach works on any GitLab instance, including production.
  • We introduce a new attribute in the YAML definition, named auto_clean_up: true/false.
  • The maintenance worker auto-generates a merge request for each cleanable flag.
    • Definition of "cleanable"
      • auto_clean_up is true
      • milestone is older than the current milestone. For example, if a feature was introduced in 13.6, the flag is removed in 13.7.
    • Process flow:
      • Fetch a cleanable flag
      • Create a branch named chore/feature-flags/removal/#{flag.name} in the GitLab canonical project.
      • Create a commit on the branch that deletes a feature flag YAML definition.
      • Create a merge request that targets master.
        • This MR includes a Close #{rollout_issue_url} mention, which means it automatically closes the rollout issue when it's merged.
        • This MR is labeled feature flag and automations.
        • This MR is assigned to the author of the flag.
      • The author fixes the pipeline in the merge request by removing all occurrences of the flag in the source code.
    • All of the automation events are reported in rollout_issue_url.
  • Documentation
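
The two selection rules in the proposal (which persisted flags are unused, and which defined flags are "cleanable") can be sketched in plain Ruby. This is only an illustration, not GitLab's implementation: the `FlagDefinition` struct, the helper names, the sample flag names, and the numeric "major.minor" milestone comparison are all assumptions. Only `auto_clean_up`, `milestone`, and the idea of comparing persisted flags against YAML definitions come from the text above.

```ruby
require 'set'

# Hypothetical stand-in for a parsed YAML definition; not a GitLab class.
FlagDefinition = Struct.new(:name, :milestone, :auto_clean_up, keyword_init: true)

# Persisted (overriding) flags that have no YAML definition are unused,
# because every flag referenced in code must have a definition.
def unused_persisted_flags(persisted_names, definitions)
  persisted_names.to_set - definitions.map(&:name).to_set
end

# Compare milestones such as "13.6" numerically, so that "13.10" > "13.9".
def milestone_older?(milestone, current)
  (milestone.split('.').map(&:to_i) <=> current.split('.').map(&:to_i)) == -1
end

# "Cleanable" as defined above: opted in, and introduced before the current milestone.
def cleanable?(definition, current_milestone)
  definition.auto_clean_up && milestone_older?(definition.milestone, current_milestone)
end

# Sample data, invented for the example.
definitions = [
  FlagDefinition.new(name: 'new_sidebar', milestone: '13.6', auto_clean_up: true),
  FlagDefinition.new(name: 'fast_search', milestone: '13.7', auto_clean_up: true),
  FlagDefinition.new(name: 'ops_toggle', milestone: '13.5', auto_clean_up: false)
]
persisted = %w[new_sidebar legacy_export]

unused = unused_persisted_flags(persisted, definitions)
cleanable = definitions.select { |d| cleanable?(d, '13.7') }.map(&:name)
```

With the current milestone assumed to be 13.7, only `new_sidebar` is cleanable, and `legacy_export` is the only persisted flag without a definition.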

Please see !48463 (closed) for more details on the complete flow of this.
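
For illustration, the attributes of the auto-generated merge request could be assembled as below. This is a sketch under assumptions: `cleanup_merge_request_params`, the hash keys, the MR title, and the `author` field are hypothetical and do not correspond to a specific GitLab API. The branch name, target branch, closing mention, and labels follow the process flow in the proposal.

```ruby
# Hypothetical flag record; `author` is an assumed field.
Flag = Struct.new(:name, :rollout_issue_url, :author, keyword_init: true)

def cleanup_merge_request_params(flag)
  {
    source_branch: "chore/feature-flags/removal/#{flag.name}",
    target_branch: 'master',
    # The title wording is invented for this example.
    title: "Remove the `#{flag.name}` feature flag",
    # The closing mention closes the rollout issue when the MR is merged.
    description: "Close #{flag.rollout_issue_url}",
    labels: ['feature flag', 'automations'],
    assignee: flag.author
  }
end

# Placeholder values, not real GitLab data.
flag = Flag.new(
  name: 'new_sidebar',
  rollout_issue_url: 'https://gitlab.example.com/group/project/-/issues/1',
  author: 'example_user'
)
params = cleanup_merge_request_params(flag)
```

Keeping the parameters in one plain hash makes the generated MR easy to inspect in a test before anything is created on the project.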

Previous proposal

Development phase

  • To enable a feature flag by default (typically to ship a feature in a release), engineers should change default_enabled to true in the YAML definition.
  • When the default_enabled change is merged into master and deployed to production, the automation removes the overriding flags in the database.
  • When the default_enabled change is merged into master and deployed to production, the automation closes the rollout issue, i.e. rollout_issue_url.

Cleanup phase

  • If a feature flag fulfills these conditions, the automation creates a cleanup MR:
    • default_enabled has been true for more than 30 days.
    • No overriding flag exists in the database (an override may exist if an SRE disabled the flag after the above phase completed).
    • type is development
  • The cleanup MR is automatically generated with the following changes:
    • The YAML entry is removed automatically.
    • The DRI manually checks for invocations of the feature flag and removes them. Otherwise, the tests will fail.
    • The group label is assigned to the MR based on the group attribute.
    • The creator of the flag is assigned to the MR.
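
The three cleanup-phase conditions can be expressed as a single predicate. A minimal sketch, assuming a hypothetical `CandidateFlag` record: `default_enabled_since` is an invented field (the proposal does not say how the 30-day period would be tracked), and `overridden_names` stands in for whatever lookup reports overriding flags in the database.

```ruby
require 'date'

# Hypothetical record; `default_enabled_since` is an assumed field.
CandidateFlag = Struct.new(:name, :type, :default_enabled, :default_enabled_since,
                           keyword_init: true)

CLEANUP_AFTER_DAYS = 30

def cleanup_due?(flag, overridden_names, today: Date.today)
  return false unless flag.type == 'development'
  return false unless flag.default_enabled
  # An override means someone toggled the flag after it was enabled by default.
  return false if overridden_names.include?(flag.name)

  (today - flag.default_enabled_since).to_i > CLEANUP_AFTER_DAYS
end

# Sample data, invented for the example.
today = Date.new(2020, 12, 15)
old_flag = CandidateFlag.new(name: 'new_sidebar', type: 'development',
                             default_enabled: true,
                             default_enabled_since: Date.new(2020, 11, 1))
recent_flag = CandidateFlag.new(name: 'fast_search', type: 'development',
                                default_enabled: true,
                                default_enabled_since: Date.new(2020, 12, 1))

due = cleanup_due?(old_flag, [], today: today)
not_yet = cleanup_due?(recent_flag, [], today: today)
overridden = cleanup_due?(old_flag, ['new_sidebar'], today: today)
```

Passing `today` explicitly keeps the check deterministic, which matters if the same predicate runs from a scheduled worker and from tests.
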
Edited by Shinya Maeda