Replace waiting_for_runner_ack? with runner_ack_wait_status

What does this MR do and why?

This MR replaces the boolean waiting_for_runner_ack? method with a more expressive runner_ack_wait_status method that returns symbolic status values. This improvement provides better clarity and precision when checking the state of jobs in the two-phase commit workflow.

Background

As part of addressing race conditions in the two-phase commit logic (see #578881), we identified that the boolean waiting_for_runner_ack? method was insufficient for representing the different states a job can be in during the runner acknowledgment process.

Changes

Replaced waiting_for_runner_ack? with runner_ack_wait_status:

The new method returns more precise status information:

  • :waiting - Job is actively waiting for runner acknowledgment
  • :not_waiting - Job is not in a waiting state (either not assigned to a runner or not pending)
  • :wait_expired - Job was assigned to a runner but the wait period has expired (runner manager ID is no longer present in Redis)

Updated all callers across the codebase:

  • Ci::AuthJobFinder - Updated job validation logic
  • Ci::RetryWaitingJobService - Simplified logic by removing helper methods and using direct status checks
  • Ci::UpdateBuildStateService - Updated two-phase commit workflow handling
  • All test files - Updated expectations to use the new status values

Benefits

  1. Better expressiveness: The status now clearly indicates whether a job is actively waiting, not waiting, or has an expired wait
  2. Improved debugging: More granular status information helps with troubleshooting
  3. Cleaner logic: Eliminates the need for multiple helper methods in service classes
  4. Future extensibility: The symbolic approach makes it easier to add new states if needed

References

Related to #578881

Screenshots or screen recordings

N/A - Backend API improvement

How to set up and validate locally

  1. Run the updated tests to verify all status transitions work correctly:

    bundle exec rspec spec/models/ci/build_two_phase_commit_spec.rb
    bundle exec rspec spec/services/ci/register_job_service_two_phase_commit_spec.rb
    bundle exec rspec spec/services/ci/retry_waiting_job_service_spec.rb
    bundle exec rspec spec/workers/ci/retry_stuck_waiting_job_worker_spec.rb
  2. Test the new status method in Rails console:

    # Create a build in different states and check the status
    build = Ci::Build.create!(...)
    
    # Not waiting (no runner assigned)
    build.runner_ack_wait_status # => :not_waiting
    
    # Waiting (assigned to runner manager)
    build.set_waiting_for_runner_ack(runner_manager.id)
    build.runner_ack_wait_status # => :waiting
    
    # Wait expired (runner manager ID removed from Redis)
    build.cancel_wait_for_runner_ack
    build.runner_ack_wait_status # => :wait_expired

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

  • Performance: No performance impact - same underlying logic with improved API
  • Reliability: Improves reliability through clearer state representation
  • Security: No security implications
  • Maintainability: Significantly improves code maintainability with more expressive API and simplified service logic
  • Testing: Comprehensive test coverage updated to verify all status transitions and edge cases
Edited by Pedro Pombeiro

Merge request reports

Loading