Fix PG::UniqueViolation and NoMethodError in merge train services

What does this MR do and why?

This merge request fixes two critical vulnerabilities in the GitLab merge train services that cause production issues:

  1. PG::UniqueViolation on index_merge_trains_on_merge_request_id - Race condition causing duplicate key constraint violations in AutoMerge::MergeTrainService when multiple concurrent requests attempt to create merge train cars.
  2. NoMethodError for undefined sha method - Nil reference error when accessing diff_head_pipeline.sha in AutoMerge::AddToMergeTrainWhenChecksPassService when the pipeline is missing or deleted.

See: Fix race condition to ensure concurrent MRs suc... (#584920)

Implementation Details

Fix 1: Atomic Car Creation with Exception Handling

Modified ee/app/services/auto_merge/merge_train_service.rb to use atomic create_merge_train_car with exception handling:

  • Changed from non-atomic build_merge_train_car() to atomic create_merge_train_car()
  • Added rescue ActiveRecord::RecordNotUnique to handle race conditions gracefully
  • Reload merge request to get existing car after race condition
  • Return early if car is nil after reload

Fix 2: Safe Navigation Operator for Nil-Safe Access

Modified ee/app/services/auto_merge/add_to_merge_train_when_checks_pass_service.rb to use safe navigation operator:

  • Changed from merge_request.diff_head_pipeline.sha to merge_request.diff_head_pipeline&.sha
  • Allows pipeline_sha to be nil when pipeline is missing
  • SystemNoteService handles nil SHA gracefully

Benefits

  • Prevents duplicate key violations and NoMethodError exceptions
  • Handles concurrent requests and missing pipelines gracefully
  • Maintains backward compatibility with no API or database schema changes
  • Follows existing patterns in the codebase
  • Comprehensive test coverage added for both race condition and nil pipeline scenarios

References

How to set up and validate locally

  1. No database migrations needed
  2. No feature flags needed
  3. No configuration changes needed
  4. Can be deployed immediately
  5. Monitor error logs for any issues post-deployment
  6. Monitor merge train success rates post-deployment

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Click to expand

Quality

  • Self-reviewed this MR per code review guidelines
  • Code follows software design guidelines - Uses existing patterns for atomic operations and safe navigation
  • Automated tests exist - Added comprehensive test coverage for race condition and nil pipeline scenarios
  • Considered technical impacts on GitLab.com, Dedicated and self-managed - No schema changes, backward compatible
  • Considered impact on frontend, backend, and database - Backend only, no database schema changes
  • Tested in all supported browsers - Not applicable (backend service changes only)
  • Confirmed backward compatibility - No API or database schema changes, fully backward compatible
  • Properly separated EE content - Changes are in ee/app/services/ directory
  • Considered existing data variation - No data model changes, handles nil gracefully
  • Fixed flaky tests - No flaky tests introduced

Performance, reliability, and availability

  • Confident this MR does not harm performance - Atomic operations improve performance by reducing race conditions
  • Added information for database reviewers - No database changes required
  • Considered availability and reliability risks - Fixes critical production issues, improves reliability
  • Considered scalability risk - No scalability concerns, fixes race condition that impacts scalability
  • Considered performance/reliability/availability impacts on large customers - Fixes issues affecting all customers
  • Considered impacts on minimum system requirements - No impact on system requirements

Observability instrumentation

  • Included enough instrumentation - Existing error handling and logging in services captures issues

Documentation

  • Included changelog trailers - Changelog entry should be added for bug fixes
  • Added/updated documentation - No documentation changes needed for internal service fixes

Security

  • Confirmed credential/token/authorization/authentication handling - Not applicable, no security-sensitive changes
  • Reviewed security guidelines - No security review needed for this bug fix
  • Addressed security scan results - No security concerns identified

Deployment

  • Considered using a feature flag - Not needed for bug fix, can be deployed immediately
  • If using feature flag, planned staging/production testing - N/A
  • Informed Infrastructure department - No infrastructure changes needed

Compliance

  • Confirmed correct MR type label - Should be labeled as ~bug or ~bugfix
Edited by Ket Slaats

Merge request reports

Loading