Add diagnostic logging for stuck auto-merge MRs

What does this MR do and why?

Adds diagnostic logging behind the auto_merge_diagnostic_logging ops flag (default off) to pin down the root cause of intermittently stuck auto-merge merge requests (#596177). These are MRs that stay unmerged after a green pipeline and merge only once the page is loaded.

The stuck state occurs when the auto-merge worker reads the CI mergeability check as checking even though the pipeline has already succeeded, and then bails without retrying. The pipeline-success trigger runs in run_after_commit and the worker is pinned to that WAL location, so a stale pipeline-status read is unlikely. The prime suspect is the pipeline_creating? Redis flag, which forces checking regardless of what the database says and is therefore unaffected by the previously tried, now-reverted "force primary reads" change. This logging should either confirm that or point instead to a stale or mismatched head_pipeline.
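The suspected failure mode can be illustrated with a minimal sketch. This is a hypothetical simplification, not the actual CheckCiStatusService code: the method name ci_check_status, its keyword arguments, and the status mapping are assumptions made only to show why a stale Redis flag would win over a successful pipeline.

```ruby
# Hypothetical simplification of the CI mergeability decision.
# Names and the status mapping are assumptions, not GitLab's implementation.
def ci_check_status(pipeline_creating:, head_pipeline_status:)
  # If the Redis-backed flag says a pipeline is still being created,
  # the check reports :checking without consulting the pipeline status.
  return :checking if pipeline_creating

  case head_pipeline_status
  when "success" then :success
  when "created", "pending", "running" then :checking
  else :failure
  end
end

# A succeeded pipeline is still reported as :checking while the flag is set.
ci_check_status(pipeline_creating: true, head_pipeline_status: "success")
```

Under this model, forcing primary database reads cannot help, because the database is never consulted once the flag is true.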

What it logs (flag-gated, per-project, non-success only)

  • MergeRequests::Mergeability::CheckCiStatusService — when CI is checking/failure for an auto-merge MR, logs auto_merge_ci_diagnostic with pipeline_creating, the raw pipeline_creation_requests, head_pipeline_id vs diff_head_pipeline_id, head_pipeline_status, merge_status, and diff_head_sha.
  • AutoMergeProcessWorker — logs auto_merge_worker_invoked with the trigger source and triggering_pipeline_ids, to correlate the bailing run with the trigger that fired it.
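As a rough sketch of the gating described above, the diagnostic payload could be assembled as follows. The DiagnosticInput struct and the auto_merge_ci_diagnostic helper are hypothetical stand-ins for the real merge request and service objects; only the field names and the event name come from this MR.

```ruby
# Hypothetical stand-in for the values read from the merge request.
DiagnosticInput = Struct.new(
  :ci_check_status, :pipeline_creating, :pipeline_creation_requests,
  :head_pipeline_id, :diff_head_pipeline_id, :head_pipeline_status,
  :merge_status, :diff_head_sha,
  keyword_init: true
)

# Returns the log payload, or nil when nothing should be logged.
# flag_enabled and auto_merge_enabled are assumed to be resolved by the caller.
def auto_merge_ci_diagnostic(input, flag_enabled:, auto_merge_enabled:)
  return nil unless flag_enabled && auto_merge_enabled
  # Only non-success results (checking/failure) are logged.
  return nil if input.ci_check_status == :success

  { message: "auto_merge_ci_diagnostic" }.merge(input.to_h)
end
```

Returning nil for the flag-off and success paths keeps the "no behaviour change while the flag is off" property easy to test.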

How to read it

| Log shows | Verdict |
| --- | --- |
| ci_check_status=checking, head_pipeline_status=success, pipeline_creating=true | stale pipeline_creating? Redis flag |
| pipeline_creation_requests has a lingering in_progress entry | orphaned creation request |
| head_pipeline_status=running and triggering_pipeline_ids differ from diff_head_pipeline_id | stale/mismatched head pipeline |
| diff_head_pipeline_id=nil while head_pipeline_id is set | sha mismatch |
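The rows above can be turned into a small triage helper. This is a sketch under several assumptions: the fields from both log lines have been merged into one hash with symbol keys, pipeline_creation_requests is an array of hashes with a :status key, and string statuses are used. None of those shapes are confirmed by this MR, and the helper only detects the presence of an in_progress entry, not how long it has lingered.

```ruby
# Hypothetical triage helper; the log shape is an assumption (see lead-in).
# Returns every verdict whose conditions match, since rows can overlap.
def verdicts(log)
  out = []

  if log[:ci_check_status] == "checking" &&
     log[:head_pipeline_status] == "success" &&
     log[:pipeline_creating] == true
    out << "stale pipeline_creating? Redis flag"
  end

  requests = Array(log[:pipeline_creation_requests])
  if requests.any? { |r| r[:status] == "in_progress" }
    out << "orphaned creation request"
  end

  triggering = Array(log[:triggering_pipeline_ids])
  if log[:head_pipeline_status] == "running" &&
     !triggering.empty? &&
     !triggering.include?(log[:diff_head_pipeline_id])
    out << "stale/mismatched head pipeline"
  end

  if log[:diff_head_pipeline_id].nil? && !log[:head_pipeline_id].nil?
    out << "sha mismatch"
  end

  out
end
```

Returning a list rather than a single verdict avoids imposing a priority order that the rows themselves do not state.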

No behaviour change while the flag is off.

Rollout

  • Enable for the affected project only, then watch for an auto_merge_ci_diagnostic line on the next stuck MR: /chatops run feature set auto_merge_diagnostic_logging true --project=datahow/projects/dhl3/devops/deployment-dhl-multi
  • Remove the flag and this logging once the root cause is confirmed.

MR acceptance checklist

  • Tests added (flag on / flag off)
  • Behind a feature flag (auto_merge_diagnostic_logging, default off)
  • No documentation changes needed
