[FF] `fix_merge_api_train_bypass` - Fix Accept MR API bypassing merge trains
## Summary

This issue tracks the rollout of [the fix for issue #593465](https://gitlab.com/gitlab-org/gitlab/-/issues/593465) on production, currently behind the `fix_merge_api_train_bypass` feature flag.

The flag gates a correctness fix to `PUT /projects/:id/merge_requests/:iid/merge`:

- **Before:** When merge trains are enabled and the head pipeline has already succeeded, calling the endpoint with `auto_merge=true` silently bypasses the merge train and merges directly into the target branch, defeating the train's purpose.
- **After (flag on):**
  - `auto_merge=true` + train enabled → MR is added to the merge train via the preferred train strategy (`merge_train` or `add_to_merge_train_when_checks_pass`).
  - `auto_merge=false` + train enabled → returns `422` with a message pointing the caller at `auto_merge=true`, the merge trains API, or `skip_merge_train=true`.
  - `skip_merge_train=true` → preserved as the explicit immediate-merge escape hatch.

## Owners

- Most appropriate Slack channel to reach out to: `#g_code_review`
- Best individual to reach out to: @marc_shaw

## Expectations

### What are we expecting to happen?

API clients (CI bots, `glab`, custom automation) calling the Accept MR endpoint on merge-train-enabled projects will:

- Correctly enqueue MRs onto the merge train when passing `auto_merge=true`, instead of silently bypassing the train.
- Receive a clear `422` when attempting an immediate merge without explicit `skip_merge_train=true`, preventing the train from being silently bypassed by ambiguous calls.

### What can go wrong and how would we detect it?

- **Behavior change for existing clients on train-enabled projects.** Clients that today call `auto_merge=true` and get an immediate merge will now get a train enqueue. Clients that today call without `auto_merge` and get an immediate merge will now get a `422` (unless they pass `skip_merge_train=true`).
  - Detection: increase in `422` responses on `PUT /projects/:id/merge_requests/:iid/merge` for train-enabled projects (logs / Kibana), customer reports.
- **Wrong strategy selected.** `preferred_strategy` returns the first available strategy; if `available_strategies` is empty for a train-enabled project, we fall back to immediate merge or `not_allowed!`. This should not differ from current behavior in that edge case.
  - Detection: Kibana logs for the endpoint.

Relevant dashboard: API error-rate panels on https://dashboards.gitlab.net/d/api.

## Rollout Steps

Note: chatops commands run in the Slack channel impacted by the command.

### Rollout on non-production environments

- Verify the MR with the feature flag is merged to `master` and has been deployed to non-production environments with `/chatops gitlab run auto_deploy status <merge-commit-of-your-feature>`
- [ ] Deploy the feature flag at a percentage (recommended percentage: 50%) with `/chatops gitlab run feature set fix_merge_api_train_bypass 50 --actors --dev --pre --staging --staging-ref`
- [ ] Monitor that the error rates did not increase (repeat with a different percentage as necessary).
- [ ] Enable the feature globally on non-production environments with `/chatops gitlab run feature set fix_merge_api_train_bypass true --dev --pre --staging --staging-ref`
- [ ] Verify the feature works as expected against a staging project with merge trains enabled.

### Specific rollout on production

- Ensure that the feature MRs have been deployed to both production and canary.
- [ ] Project-actor rollout to GitLab's own projects first: `/chatops gitlab run feature set --project=gitlab-org/gitlab,gitlab-org/gitlab-foss,gitlab-com/www-gitlab-com fix_merge_api_train_bypass true`
- [ ] Verify behavior on `gitlab-org/gitlab` (which uses merge trains).

### Preparation before global rollout

- [ ] Set a milestone to this rollout issue.
- [ ] Check if the feature flag change needs to be accompanied with a [change management issue](https://about.gitlab.com/handbook/engineering/infrastructure-platforms/change-management/#feature-flags-and-the-change-management-process).
- [ ] Ensure that you or a representative in development can be available for at least 2 hours after feature flag updates in production.
- [ ] Update the [REST API docs](https://docs.gitlab.com/api/merge_requests/#merge-a-merge-request) noting the new `422` behavior on train-enabled projects.

### Global rollout on production

- [ ] [Incrementally roll out](https://docs.gitlab.com/development/feature_flags/controls/#process) the feature on production.
  - Example: `/chatops gitlab run feature set fix_merge_api_train_bypass <rollout-percentage> --actors`.
  - Between every step wait for at least 15 minutes and monitor the appropriate graphs on https://dashboards.gitlab.net.
- [ ] After the feature has been 100% enabled, wait for [at least one day before releasing the feature](#release-the-feature).

### Release the feature

- [ ] Create an MR to remove the `fix_merge_api_train_bypass` feature flag and the legacy code path.
- [ ] Once the cleanup MR has been deployed to production, clean up the feature flag from all environments: `/chatops gitlab run feature delete fix_merge_api_train_bypass --dev --pre --staging --staging-ref --production`
- [ ] Close this rollout issue.

## Rollback Steps

- [ ] Disable on production:

  ```
  /chatops gitlab run feature set fix_merge_api_train_bypass false
  ```

- [ ] Disable on non-production:

  ```
  /chatops gitlab run feature set fix_merge_api_train_bypass false --dev --pre --staging --staging-ref
  ```

- [ ] Delete from all environments:

  ```
  /chatops gitlab run feature delete fix_merge_api_train_bypass --dev --pre --staging --staging-ref --production
  ```
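
When verifying behavior during rollout, or checking that a rollback restored the previous behavior, it may help to have the flag-on rules from the Summary written down as a small decision function. The following is a minimal Python sketch of that decision table, not GitLab's implementation. The function name `expected_outcome` and the outcome labels are hypothetical. Giving `skip_merge_train=true` precedence over `auto_merge=true` is an assumption, since this issue does not state how the two combine. The empty `available_strategies` edge case is not modeled.

```python
from typing import Literal

# Hypothetical labels for illustration; they are not GitLab API values.
Outcome = Literal["enqueued_on_train", "rejected_422", "immediate_merge", "unchanged"]


def expected_outcome(
    train_enabled: bool,
    auto_merge: bool = False,
    skip_merge_train: bool = False,
) -> Outcome:
    """Expected result of the Accept MR call with the flag on, per the Summary."""
    if not train_enabled:
        # Projects without merge trains are outside the scope of this flag.
        return "unchanged"
    if skip_merge_train:
        # Explicit escape hatch. Precedence over auto_merge is an assumption.
        return "immediate_merge"
    if auto_merge:
        # Added to the merge train via the preferred train strategy.
        return "enqueued_on_train"
    # Immediate merge requested on a train-enabled project without the escape hatch.
    return "rejected_422"


if __name__ == "__main__":
    for auto_merge, skip in [(True, False), (False, False), (False, True)]:
        result = expected_outcome(True, auto_merge=auto_merge, skip_merge_train=skip)
        print(f"auto_merge={auto_merge} skip_merge_train={skip} -> {result}")
```

A table like this can be compared against the responses observed on a staging project with merge trains enabled.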