[FF] `fix_merge_api_train_bypass` - Fix Accept MR API bypassing merge trains

Summary

This issue tracks the rollout of the fix for issue #593465 on production, currently behind the fix_merge_api_train_bypass feature flag.

The flag gates a correctness fix to PUT /projects/:id/merge_requests/:iid/merge:

  • Before: When merge trains are enabled and the head pipeline has already succeeded, calling the endpoint with auto_merge=true silently bypasses the merge train and merges directly into the target branch, defeating the train's purpose.
  • After (flag on):
    • auto_merge=true + train enabled → MR is added to the merge train via the preferred train strategy (merge_train or add_to_merge_train_when_checks_pass).
    • auto_merge=false + train enabled → returns 422 with a message pointing the caller at auto_merge=true, the merge trains API, or skip_merge_train=true.
    • skip_merge_train=true → preserved as the explicit immediate-merge escape hatch.
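
The before/after behavior can be summarized as a small decision table. The following is only an illustrative Python model, not the actual Rails implementation: the function name, the `Outcome` enum, and the precedence of skip_merge_train over auto_merge are assumptions made for the sketch, and it only covers the case where the head pipeline has already succeeded.

```python
from enum import Enum


class Outcome(Enum):
    IMMEDIATE_MERGE = "immediate_merge"
    ADD_TO_TRAIN = "add_to_merge_train"
    UNPROCESSABLE = "422"


def accept_mr_outcome(train_enabled: bool, auto_merge: bool,
                      skip_merge_train: bool, flag_enabled: bool = True) -> Outcome:
    """Hypothetical model of the Accept MR outcome (head pipeline already succeeded)."""
    # Simplification: projects without merge trains are not affected by the fix.
    if not train_enabled:
        return Outcome.IMMEDIATE_MERGE
    # Explicit escape hatch. Assumption: it wins over auto_merge when both are set.
    if skip_merge_train:
        return Outcome.IMMEDIATE_MERGE
    # Flag off: legacy path, the train is silently bypassed.
    if not flag_enabled:
        return Outcome.IMMEDIATE_MERGE
    # Flag on: auto_merge=true enqueues on the train, anything else is rejected.
    if auto_merge:
        return Outcome.ADD_TO_TRAIN
    return Outcome.UNPROCESSABLE
```

For example, `accept_mr_outcome(True, True, False)` models the fixed train enqueue, while the same call with `flag_enabled=False` models the legacy bypass.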

Owners

  • Most appropriate Slack channel to reach out to: #g_code_review
  • Best individual to reach out to: @marc_shaw

Expectations

What are we expecting to happen?

API clients (CI bots, glab, custom automation) calling the Accept MR endpoint on merge-train-enabled projects will:

  • Correctly enqueue MRs onto the merge train when passing auto_merge=true, instead of silently bypassing the train.
  • Receive a clear 422 when attempting an immediate merge without an explicit skip_merge_train=true, preventing the train from being silently bypassed by ambiguous calls.
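
For client authors, a request that states its intent explicitly might look like the sketch below. It only builds the request and sends nothing. The helper name and its arguments are hypothetical, and it assumes the standard /api/v4 prefix, PRIVATE-TOKEN authentication, and form-encoded parameters.

```python
import urllib.parse
import urllib.request


def build_accept_mr_request(base_url, project, mr_iid, token,
                            auto_merge=False, skip_merge_train=False):
    """Build (but do not send) a PUT request for the Accept MR endpoint."""
    # A project path such as "group/project" must be URL-encoded in the API path.
    project_ref = urllib.parse.quote(str(project), safe="")
    url = (f"{base_url.rstrip('/')}/api/v4/projects/{project_ref}"
           f"/merge_requests/{mr_iid}/merge")
    # Only send the parameters the caller explicitly asked for.
    params = {}
    if auto_merge:
        params["auto_merge"] = "true"
    if skip_merge_train:
        params["skip_merge_train"] = "true"
    data = urllib.parse.urlencode(params).encode()
    return urllib.request.Request(
        url, data=data, method="PUT", headers={"PRIVATE-TOKEN": token}
    )
```

A caller would pass the result to `urllib.request.urlopen` and catch `urllib.error.HTTPError`; with the flag on, a `code` of 422 on a train-enabled project suggests the call needs either auto_merge=true or skip_merge_train=true.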

What can go wrong and how would we detect it?

  • Behavior change for existing clients on train-enabled projects. Clients that today call auto_merge=true and get an immediate merge will now get a train enqueue. Clients that today call without auto_merge and get an immediate merge will now get a 422 (unless they pass skip_merge_train=true).
    • Detection: increase in 422s on PUT /merge_requests/:iid/merge for train-enabled projects (logs / Kibana), customer reports.
  • Wrong strategy selected. preferred_strategy returns the first available strategy; if available_strategies is empty for a train-enabled project, we fall back to immediate merge or not_allowed!. This should not differ from current behavior in that edge case.
    • Detection: Kibana logs for the endpoint.
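
If exported API logs are easier to work with than a Kibana query, the 422 share can be approximated offline. This is a rough sketch under stated assumptions: each line is a JSON object, and the field names `method`, `route`, and `status` are assumed to match the structured API log format; adjust them to the actual export.

```python
import json


def merge_422_counts(log_lines):
    """Return (rejected, total) for PUT requests whose route ends in /merge."""
    rejected = total = 0
    for line in log_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(entry, dict):
            continue
        # Assumed field names; the route check is a loose match on the endpoint.
        if entry.get("method") != "PUT":
            continue
        if not str(entry.get("route", "")).endswith("/merge"):
            continue
        total += 1
        if entry.get("status") == 422:
            rejected += 1
    return rejected, total
```

Comparing the ratio before and after each flag change gives a quick signal of how many existing clients are hitting the new 422 path.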

Relevant dashboard: API error-rate panels on https://dashboards.gitlab.net/d/api.

Rollout Steps

Note: chatops commands run in the Slack channel impacted by the command.

Rollout on non-production environments

  • Verify the MR with the feature flag is merged to master and has been deployed to non-production environments with /chatops gitlab run auto_deploy status <merge-commit-of-your-feature>
  • Deploy the feature flag at a percentage (recommended percentage: 50%) with /chatops gitlab run feature set fix_merge_api_train_bypass 50 --actors --dev --pre --staging --staging-ref
  • Monitor that the error rates did not increase (repeat with a different percentage as necessary).
  • Enable the feature globally on non-production environments with /chatops gitlab run feature set fix_merge_api_train_bypass true --dev --pre --staging --staging-ref
  • Verify the feature works as expected against a staging project with merge trains enabled.

Specific rollout on production

  • Ensure that the feature MRs have been deployed to both production and canary.
  • Project-actor rollout to GitLab's own projects first: /chatops gitlab run feature set --project=gitlab-org/gitlab,gitlab-org/gitlab-foss,gitlab-com/www-gitlab-com fix_merge_api_train_bypass true
  • Verify behavior on gitlab-org/gitlab (which uses merge trains).

Preparation before global rollout

  • Set a milestone on this rollout issue.
  • Check whether the feature flag change needs to be accompanied by a change management issue.
  • Ensure that you or a representative in development can be available for at least 2 hours after feature flag updates in production.
  • Update the REST API docs to note the new 422 behavior on train-enabled projects.

Global rollout on production

Release the feature

  • Create an MR to remove the fix_merge_api_train_bypass feature flag and the legacy code path.
  • Once the cleanup MR has been deployed to production, clean up the feature flag from all environments: /chatops gitlab run feature delete fix_merge_api_train_bypass --dev --pre --staging --staging-ref --production
  • Close this rollout issue.

Rollback Steps

  • Disable on production:
    /chatops gitlab run feature set fix_merge_api_train_bypass false
  • Disable on non-production:
    /chatops gitlab run feature set fix_merge_api_train_bypass false --dev --pre --staging --staging-ref
  • Delete from all environments:
    /chatops gitlab run feature delete fix_merge_api_train_bypass --dev --pre --staging --staging-ref --production