Anchor build finished_at to pending_state.created_at

What does this MR do and why?

When a runner reports a terminal state to PUT /api/v4/jobs/:id while live trace chunks are still pending in Redis, Ci::UpdateBuildStateService returns 202 with a backoff. The build stays in running until all chunks are persisted, and the CommitStatus state machine sets finished_at only at finalization, not when the runner first reported.

Under Sidekiq pressure, that trace flush window can last up to Ci::UpdateBuildStateService::ACCEPT_TIMEOUT (5.minutes). The window is counted in build.duration, which can inflate compute-minute billing for the customer.

Ci::BuildPendingState#created_at already captures the moment the runner's terminal report first hit the API. This MR uses that timestamp, when available, as the source of truth for Ci::Build#finished_at while the feature flag is enabled.

Implementation details

The behavior is gated by the ci_anchor_finished_at_to_pending_state feature flag. The flag uses Project.actor_from_id(project_id) so the Project record is not loaded only for feature flag evaluation.
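
As a rough illustration of why an id-based actor is enough for flag evaluation, the following framework-free sketch uses stand-ins only. LightweightActor, FakeFlags, and project_actor are hypothetical names, and the "Project:<id>" identifier format is an assumption; none of this is GitLab's actual Feature or Project code.

```ruby
# Stand-in for a flag actor: only the identifier is needed, not a loaded record.
LightweightActor = Struct.new(:flipper_id)

# Hypothetical in-memory flag store; GitLab's real Feature class is not used here.
class FakeFlags
  def initialize(enabled_ids)
    @enabled_ids = enabled_ids
  end

  # True when the actor's identifier is in the enabled set.
  def enabled?(_flag_name, actor)
    @enabled_ids.include?(actor.flipper_id)
  end
end

# Builds the actor from the id alone, so no database row is fetched.
# The "Project:<id>" format is an assumption made for this sketch.
def project_actor(project_id)
  LightweightActor.new("Project:#{project_id}")
end

flags = FakeFlags.new(["Project:42"])
puts flags.enabled?(:ci_anchor_finished_at_to_pending_state, project_actor(42))
```

The point of the sketch is that the flag check needs only an identifier, so a build that already knows its project_id can be evaluated without touching the projects table.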

CommitStatus now passes the state-machine transition into set_finished_at. Ci::Build#set_finished_at only changes behavior when the feature flag is enabled; otherwise it delegates to the existing CommitStatus behavior so the disabled path remains unchanged.

With the feature flag enabled:

  • Ci::Build#set_finished_at is the single place that chooses the terminal finished_at timestamp.
  • pending_state.created_at is compared with the normal terminal timestamp, and the earlier timestamp is used.
  • Server-timeout transitions derive their timeout cap (started_at + timeout) from the transition instead of depending on later timeout hooks overwriting finished_at.
  • Transition failure reasons are normalized through Gitlab::Ci::Build::Status::Reason.fabricate, because transition args can be a symbol, a string, or a Reason object.
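
The enabled-flag rules above can be sketched in plain Ruby. This is a minimal illustration under stated assumptions, not the MR's code: Reason here is a local stand-in struct, TIMEOUT_REASONS is an assumed set, normalize_reason and pick_finished_at are hypothetical helpers, and Time values stand in for Time.current.

```ruby
# Local stand-in for a reason object; not Gitlab::Ci::Build::Status::Reason.
Reason = Struct.new(:failure_reason)

# Assumed set of failure reasons treated as server-side timeouts.
TIMEOUT_REASONS = %i[job_execution_timeout].freeze

# Accepts a Symbol, a String, or a Reason stand-in and returns a Symbol (or nil).
def normalize_reason(arg)
  case arg
  when Reason then arg.failure_reason&.to_sym
  when String, Symbol then arg.to_sym
  end
end

# Chooses finished_at: the normal terminal timestamp (the timeout cap for
# server timeouts, otherwise "now"), or pending_created_at if that is earlier.
def pick_finished_at(now:, pending_created_at: nil, started_at: nil, timeout: nil, reason: nil)
  normal =
    if TIMEOUT_REASONS.include?(normalize_reason(reason)) && started_at && timeout
      started_at + timeout # timeout is in seconds
    else
      now
    end

  [normal, pending_created_at].compact.min
end

started = Time.at(1_000)
# Runner reported 60 seconds in; finalization happened 300 seconds in.
puts pick_finished_at(now: started + 300, pending_created_at: started + 60)
```

With no pending state, the helper falls back to the normal timestamp, which mirrors the statement that the earlier of the two timestamps is used only when a pending state exists.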

With the feature flag disabled:

  • The generic terminal transition continues to set finished_at to Time.current.
  • The existing server-timeout hooks continue to overwrite finished_at with started_at + timeout.
  • No pending_state lookup is needed for timestamp selection.

To avoid query regressions on high-traffic paths while the flag is disabled, the pending_state preloads added by this MR are also gated by the feature flag. Pipeline-scoped paths use the pipeline's project as the feature-flag actor; the shared stuck and timed-out sweepers load their normal batch first, then preload pending_state only for jobs whose project has the flag enabled.
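
The sweeper-side gating described above might look roughly like the following sketch. Job and jobs_to_preload are hypothetical names, and the block stands in for the per-project feature-flag check; the real preload call is not shown.

```ruby
# Minimal stand-in for a batch row; real jobs carry far more state.
Job = Struct.new(:id, :project_id)

# Returns the jobs whose pending_state should be preloaded.
# The block answers "is the flag enabled for this project id?" and is
# called once per distinct project in the batch, not once per job.
def jobs_to_preload(jobs)
  enabled_ids = jobs.map(&:project_id).uniq.select { |project_id| yield(project_id) }
  jobs.select { |job| enabled_ids.include?(job.project_id) }
end

batch = [Job.new(1, 10), Job.new(2, 20), Job.new(3, 10)]
selected = jobs_to_preload(batch) { |project_id| project_id == 10 }
puts selected.map(&:id).inspect
```

When the flag is disabled for every project in the batch, the selection is empty and no extra pending_state query would be issued, which is the property the gating is meant to preserve.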

References

Screenshots or screen recordings

Not applicable. This is a backend-only change.

How to set up and validate locally

Run the targeted specs:

bundle exec rspec spec/models/ci/build_spec.rb spec/services/ci/update_build_state_service_spec.rb

Run targeted RuboCop:

bundle exec rubocop app/models/commit_status.rb app/models/ci/build.rb app/services/ci/cancel_pipeline_service.rb app/services/ci/drop_pipeline_service.rb app/services/ci/stuck_builds/drop_helpers.rb ee/app/services/ci/pipeline_creation/drop_not_runnable_builds_service.rb spec/models/ci/build_spec.rb spec/services/ci/update_build_state_service_spec.rb

The new coverage exercises:

  • Ci::Build#set_finished_at with ci_anchor_finished_at_to_pending_state enabled and disabled.
  • The pending_state.created_at anchor for outdated pending states in Ci::UpdateBuildStateService.
  • Server-timeout clamps preserving disabled-flag behavior.
  • Server-timeout paths using pending_state.created_at when the feature flag is enabled and the pending-state timestamp is earlier.
  • Server-timeout reason normalization for symbol, string, and Reason object transition args.

Validated locally:

bundle exec rspec spec/models/ci/build_spec.rb:4751 spec/models/ci/build_spec.rb:4801 spec/models/ci/build_spec.rb:4881 spec/models/ci/build_spec.rb:4946
bundle exec rspec spec/services/ci/update_build_state_service_spec.rb:430
bundle exec rubocop app/models/commit_status.rb app/models/ci/build.rb spec/models/ci/build_spec.rb

gdk predictive also ran RuboCop against all changed Ruby files with no offenses. Its RSpec phase was not started because the command required interactive confirmation and the terminal was non-interactive.

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Hordur Freyr Yngvason
