[Feature flag] Rollout of verify_ff_merge_advancement

Summary

This issue tracks the production rollout of the fix that is currently behind the verify_ff_merge_advancement feature flag.

The flag guards the merge path used by the fast-forward and semi-linear merge methods when automatic rebase is enabled. It verifies, using an optimistic lock, that the fast-forward actually advanced the target branch, instead of silently recording the merge request as merged when no commit landed. Without the guard, such a no-op merge is recorded with merge_commit_sha = nil and, when source-branch deletion is enabled, the source branch is removed, causing data loss.
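The check described above can be sketched as a compare-and-swap on the target ref. This is a minimal illustration in Python, not the actual implementation: Repo, update_ref, ff_merge, and MergeError are hypothetical names, and the in-memory dictionary only stands in for real branch refs.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


class MergeError(Exception):
    """Raised when the fast-forward did not advance the target branch."""


@dataclass
class Repo:
    # Hypothetical in-memory stand-in for branch refs: branch name -> commit SHA.
    refs: Dict[str, str] = field(default_factory=dict)

    def update_ref(self, branch: str, new_sha: str, expected_old_sha: str) -> bool:
        # Optimistic lock: write only if the ref still holds the value we expect.
        if self.refs.get(branch) != expected_old_sha:
            return False
        self.refs[branch] = new_sha
        return True


def ff_merge(repo: Repo, target: str, source_sha: str,
             expected_target_sha: str, flag_enabled: bool) -> Optional[str]:
    """Return the SHA recorded for the merge; None models merge_commit_sha = nil."""
    # A fast-forward to the SHA the target already has is a no-op.
    is_noop = source_sha == expected_target_sha
    advanced = (not is_noop) and repo.update_ref(target, source_sha, expected_target_sha)
    if advanced:
        return source_sha
    if flag_enabled:
        # Guarded path: reject, so the merge request stays open and the
        # source branch is not deleted.
        raise MergeError("Fast-forward merge did not advance the target branch")
    # Unguarded path: the no-op is silently recorded as merged.
    return None
```

In this sketch, a stale or concurrent ref shows up as expected_target_sha no longer matching the stored ref, so update_ref refuses the write and the guarded path raises instead of recording a merge.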

Owners

  • Most appropriate Slack channel to reach out to: #g_code_review
  • Best individual to reach out to: @marc_shaw

Expectations

What are we expecting to happen?

With the fast-forward and "merge commit with semi-linear history" merge methods, when Enable automatic rebase prior to merge is turned on, a merge whose fast-forward would be a no-op (for example, a rebase that collapses onto the target tip, or a stale or concurrent ref) is rejected with a merge error and the merge request stays open. Previously, such a merge was silently recorded as merged and its source branch was deleted. Legitimate merges that advance the target branch are unaffected.

What can go wrong and how would we detect it?

  • The failure this fixes is data loss (see https://gitlab.com/gitlab-org/gitlab/-/work_items/598820): an MR recorded as merged with merge_commit_sha = nil while the target branch never advanced, followed by source-branch deletion.
  • Risk when enabling: a false positive could reject a legitimate fast-forward merge. The user would see "An error occurred while merging", and the merge service log would record "Fast-forward merge did not advance the target branch". Detect via merge error rates, support tickets, and merge service logs.
  • Mitigation: the flag can be disabled instantly to restore the previous behavior (see Rollback Steps).
  • Actor type: project.
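One way to watch for false positives is to track how often the rejection message appears relative to merge attempts. The sketch below assumes log lines are available as plain strings and that the total number of merge attempts is known from elsewhere; the real log format and the rejection_rate helper are assumptions, not part of the merge service.

```python
from typing import Iterable

# Message quoted from this issue; how it is embedded in a log line is assumed.
REJECTION_MESSAGE = "Fast-forward merge did not advance the target branch"


def rejection_rate(log_lines: Iterable[str], total_merge_attempts: int) -> float:
    """Fraction of merge attempts that were rejected by the guard."""
    if total_merge_attempts <= 0:
        return 0.0
    # Count every line that contains the rejection message.
    rejected = sum(1 for line in log_lines if REJECTION_MESSAGE in line)
    return rejected / total_merge_attempts
```

A rate that rises after a rollout step, without a matching rise in genuinely stale or no-op merges, would be a signal to investigate before continuing.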

Rollout Steps

Note: Please make sure to run the chatops commands in the Slack channel that gets impacted by the command.

Rollout on non-production environments

  • Verify the MR with the feature flag is merged to master and has been deployed to non-production environments with /chatops gitlab run auto_deploy status <merge-commit-of-your-feature>
  • Deploy the feature flag at a percentage (recommended percentage: 50%) with /chatops gitlab run feature set verify_ff_merge_advancement <rollout-percentage> --actors --dev --pre --staging --staging-ref
  • Monitor that the error rates did not increase (repeat with a different percentage as necessary).
  • Enable the feature globally on non-production environments with /chatops gitlab run feature set verify_ff_merge_advancement true --dev --pre --staging --staging-ref
  • Verify that the feature works as expected.

Specific rollout on production

For visibility, all /chatops commands that target production must be executed in the #production Slack channel and cross-posted (with the command results) to the responsible team's Slack channel.

  • Ensure that the feature MRs have been deployed to both production and canary with /chatops gitlab run auto_deploy status <merge-commit-of-your-feature>
  • This flag uses a project actor, so enable it for specific projects first: /chatops gitlab run feature set --project=gitlab-org/gitlab,gitlab-org/gitlab-foss,gitlab-com/www-gitlab-com verify_ff_merge_advancement true
  • Verify that the feature works for the specific actors.

Preparation before global rollout

  • Set a milestone on this rollout issue to signal when the feature flag is expected to be enabled and, once stable, removed.
  • Ensure that you or a representative in development can be available for at least 2 hours after feature flag updates in production.
  • Ensure that documentation exists for the feature, and the version history text has been updated.

Global rollout on production

  • Incrementally roll out the feature on production.
    • Example: /chatops gitlab run feature set verify_ff_merge_advancement <rollout-percentage> --actors
    • Between every step wait for at least 15 minutes and monitor the appropriate graphs on https://dashboards.gitlab.net.
  • After the feature has been 100% enabled, wait for at least one day before releasing the feature.
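The incremental steps above can be written out ahead of time as a list of commands. This sketch only builds the command strings from the example in this section; the percentages passed in (for instance 25, 50, 75, 100) are illustrative assumptions, not a recommended schedule, and rollout_commands is a hypothetical helper.

```python
from typing import List

FLAG = "verify_ff_merge_advancement"
WAIT_MINUTES = 15  # minimum wait between steps, as stated in this section


def rollout_commands(percentages: List[int]) -> List[str]:
    """Build one chatops command per rollout step, in increasing order."""
    if any(not 0 < p <= 100 for p in percentages):
        raise ValueError("percentages must be in the range 1..100")
    if percentages != sorted(percentages):
        raise ValueError("percentages must be increasing")
    return [
        f"/chatops gitlab run feature set {FLAG} {p} --actors"
        for p in percentages
    ]


if __name__ == "__main__":
    for command in rollout_commands([25, 50, 75, 100]):
        print(command)
        print(f"  wait at least {WAIT_MINUTES} minutes and check the dashboards")
```

The commands still need to be run by hand in #production; the script is only a checklist generator.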

Release the feature

  • Create a merge request to remove the verify_ff_merge_advancement feature flag, with these changes:
    • Remove all references to the feature flag from the codebase.
    • Remove the YAML definitions for the feature from the repository.
  • Ensure that the cleanup MR has been included in the release package.
  • Once the cleanup MR has been deployed to production, clean up the feature flag from all environments by running this chatops command in #production: /chatops gitlab run feature delete verify_ff_merge_advancement --dev --pre --staging --staging-ref --production
  • Close this rollout issue.

Rollback Steps

  • Disable the feature flag on production by running the following chatops command:
/chatops gitlab run feature set verify_ff_merge_advancement false
  • Disable the feature flag on non-production environments:
/chatops gitlab run feature set verify_ff_merge_advancement false --dev --pre --staging --staging-ref
  • Delete feature flag from all environments:
/chatops gitlab run feature delete verify_ff_merge_advancement --dev --pre --staging --staging-ref --production