[FF] `verify_create_ref_advancement` -- Reject rebase-collapse in CreateRefService (auto-rebase + merge trains)
Summary
This issue tracks the production rollout of the fix that is currently behind the `verify_create_ref_advancement` feature flag.
The flag guards MergeRequests::CreateRefService — the shared ref builder used by both the auto-rebase merge path (MergeStrategies::FromSourceBranch) and merge trains (MergeTrains::CreateRefService). When a fast-forward source is rebased onto the target tip but carries no unique commits, the generated ref collapses back to its first parent. Recording that as a merge writes no commit yet still deletes the source branch — silently losing the merge request's work. With the flag on, the collapse is rejected (CreateRefError) and the error is propagated to both callers instead of producing a no-op merge.
This is the companion guard to verify_ff_merge_advancement (rollout #602293). That flag verifies the fast-forward at the FromSourceBranch#fast_forward! layer; this flag closes the same data-loss gap one layer earlier, inside CreateRefService, which is the only protection for the merge-train path. It is kept as a separate flag so it can be rolled out and rolled back independently.
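The guard described above can be sketched as a standalone check. This is a minimal illustration, not the code in `MergeRequests::CreateRefService`: the `verify_advancement!` helper, its arguments, the error messages, and the local `CreateRefError` class are all hypothetical stand-ins chosen for the example.

```ruby
# Hypothetical stand-in for the error raised by the real service.
class CreateRefError < StandardError; end

# Returns the generated SHA when it advances past the first parent.
# Raises CreateRefError when the build is blank or collapsed onto the
# first parent (the case that previously produced a no-op merge).
def verify_advancement!(generated_sha, first_parent_sha)
  if generated_sha.nil? || generated_sha.empty?
    raise CreateRefError, 'ref build produced a blank result'
  end

  if generated_sha == first_parent_sha
    raise CreateRefError, 'generated ref did not advance past its first parent'
  end

  generated_sha
end

# A real commit on top of the first parent passes through unchanged.
verify_advancement!('b2c3', 'a1b2')

# A collapsed rebase is rejected instead of being recorded as a merge.
begin
  verify_advancement!('a1b2', 'a1b2')
rescue CreateRefError => e
  puts "rejected: #{e.message}"
end
```

Raising rather than returning a falsy value mirrors the behavior described in the summary, where the error is propagated to both callers so neither can record the merge by accident.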
Owners
- Most appropriate Slack channel to reach out to: `#g_code_review`
- Best individual to reach out to: @marc_shaw
Expectations
What are we expecting to happen?
This applies to the fast-forward and "merge commit with semi-linear history" merge methods when Enable automatic rebase prior to merge is turned on, and to merge trains. A ref build whose generated commit does not advance past its first parent (for example, a rebase that collapses onto the target tip or the previous car's ref, or a blank result) is rejected with an error and the merge request stays open, instead of being silently recorded as merged and having its source branch deleted. Legitimate merges that produce a real commit are unaffected.
What can go wrong and how would we detect it?
- The failure this fixes is data loss (see #598820): an MR recorded as merged with `merge_commit_sha = nil` while the target never advanced, followed by source-branch deletion.
- Risk when enabling: a false positive could reject a legitimate ref build. On the merge path the user would see `An error occurred while merging`; the merge service log records `The merge request has no changes to merge after rebasing onto the target branch`. On merge trains the car would fail to build its ref. Detect via merge error rates, merge-train failures, support tickets, and merge service logs.
- Mitigation: the flag can be disabled instantly to restore the previous behavior (see Rollback Steps).
- Actor type: project (target project).
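One way to watch for false positives is to count how often the rejection message appears in a batch of merge service log lines. The sketch below assumes the log lines are already available as plain strings; the `count_rejections` helper and the sample lines are hypothetical, and only the message text comes from this issue.

```ruby
# Message recorded by the merge service log when the guard rejects a build.
# Matched as a substring so trailing punctuation does not matter.
REJECTION_MESSAGE =
  'The merge request has no changes to merge after rebasing onto the target branch'

# Counts the lines that contain the rejection message.
def count_rejections(lines)
  lines.count { |line| line.include?(REJECTION_MESSAGE) }
end

# Hypothetical sample; real log lines will have a different shape.
sample = [
  "merge failed: #{REJECTION_MESSAGE}.",
  'merge succeeded',
  "merge failed: #{REJECTION_MESSAGE}."
]

puts count_rejections(sample)
```

A count that rises sharply right after a percentage increase would be a prompt to inspect the affected merge requests before continuing the rollout.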
Rollout Steps
Note: Please make sure to run the chatops commands in the Slack channel that gets impacted by the command.
Rollout on non-production environments
- Verify the MR with the feature flag is merged to `master` and has been deployed to non-production environments with `/chatops gitlab run auto_deploy status <merge-commit-of-your-feature>`
- Deploy the feature flag at a percentage (recommended percentage: 50%) with `/chatops gitlab run feature set verify_create_ref_advancement <rollout-percentage> --actors --dev --pre --staging --staging-ref`
- Monitor that the error rates did not increase (repeat with a different percentage as necessary).
- Enable the feature globally on non-production environments with `/chatops gitlab run feature set verify_create_ref_advancement true --dev --pre --staging --staging-ref`
- Verify that the feature works as expected.
Specific rollout on production
For visibility, all /chatops commands that target production must be executed in the #production Slack channel and cross-posted (with the command results) to the responsible team's Slack channel.
- Ensure that the feature MRs have been deployed to both production and canary with `/chatops gitlab run auto_deploy status <merge-commit-of-your-feature>`
- This flag uses a project actor: `/chatops gitlab run feature set --project=gitlab-org/gitlab,gitlab-org/gitlab-foss,gitlab-com/www-gitlab-com verify_create_ref_advancement true`
- Verify that the feature works for the specific actors.
Preparation before global rollout
- Set a milestone to this rollout issue to signal for enabling and removing the feature flag when it is stable.
- Ensure that you or a representative in development can be available for at least 2 hours after feature flag updates in production.
- Ensure that documentation exists for the feature, and the version history text has been updated.
Global rollout on production
- Incrementally roll out the feature on production.
  - Example: `/chatops gitlab run feature set verify_create_ref_advancement <rollout-percentage> --actors`.
  - Between every step wait for at least 15 minutes and monitor the appropriate graphs on https://dashboards.gitlab.net.
- After the feature has been 100% enabled, wait for at least one day before releasing the feature.
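To avoid typos when stepping through the incremental rollout, the commands can be generated up front. This is a small sketch: the `rollout_commands` helper and the percentages passed to it are assumptions made for illustration, and only the command shape is taken from the example in this section.

```ruby
FLAG = 'verify_create_ref_advancement'

# Builds one chatops command per rollout step, in the order given.
# Rejects percentages outside 1..100 so a typo fails early.
def rollout_commands(percentages)
  percentages.map do |pct|
    raise ArgumentError, "percentage out of range: #{pct}" unless (1..100).cover?(pct)

    "/chatops gitlab run feature set #{FLAG} #{pct} --actors"
  end
end

# Illustrative steps only; pick percentages that suit the rollout.
puts rollout_commands([10, 50, 100])
```

Each printed command still has to be run manually in `#production`, with at least 15 minutes of monitoring between steps.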
Release the feature
- Create a merge request to remove the `verify_create_ref_advancement` feature flag, removing all references and the YAML definition.
- Close the feature issue to indicate the feature will be released.
- Once the cleanup MR has been deployed to production, clean up the feature flag from all environments by running `/chatops gitlab run feature delete verify_create_ref_advancement --dev --pre --staging --staging-ref --production` in `#production`.
- Close this rollout issue.
Rollback Steps
- This feature can be disabled on production by running the following Chatops command: `/chatops gitlab run feature set verify_create_ref_advancement false`
- Disable the feature flag on non-production environments: `/chatops gitlab run feature set verify_create_ref_advancement false --dev --pre --staging --staging-ref`
- Delete feature flag from all environments: `/chatops gitlab run feature delete verify_create_ref_advancement --dev --pre --staging --staging-ref --production`