Use new status collection to evaluate needs_processing? in AtomicProcessingService

What does this MR do and why?

Context

The needs_processing? query in AtomicProcessingService is executed at a very high rate. The service runs it twice, once at the start and once at the end. We want to reduce this call rate to avoid potential LWLock contention (#598584).

This MR addresses one of the proposed changes, which is to use the new status_collection we already evaluate for new_alive_jobs and check new_collection.processing_jobs.any? in place of the last needs_processing?.

This new_collection is only evaluated in new_alive_jobs when there were stopped jobs at the beginning of processing. We expect this to be the majority of cases, so the change could potentially near-eliminate the last needs_processing? query.

In !235948 (merged), we introduced logging to monitor the return value from both queries. Per #600063 (comment 3364674789), the logs show some mismatching values, but they are deemed inconsequential, so we can proceed with this change behind a new feature flag.

This MR

Updates AtomicProcessingService so that, when @new_collection is populated, we use it to determine needs_processing? instead of calling pipeline.needs_processing?. Otherwise, we fall back to pipeline.needs_processing?.
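For reviewers, the fallback described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the actual diff: SketchCollection, SketchPipeline, SketchProcessingService, the query_count counter, and the flag_enabled argument are all hypothetical stand-ins for the real status collection, pipeline, service, and feature flag check.

```ruby
# Hypothetical stand-in for the status collection; only processing_jobs is modelled.
SketchCollection = Struct.new(:processing_jobs)

# Hypothetical stand-in for the pipeline. It counts how often the
# (expensive) needs_processing? query would have been issued.
class SketchPipeline
  attr_reader :query_count

  def initialize(needs_processing)
    @needs_processing = needs_processing
    @query_count = 0
  end

  def needs_processing?
    @query_count += 1
    @needs_processing
  end
end

# Hypothetical stand-in for the service. flag_enabled stands in for the
# ci_check_needs_processing_using_new_status_collection feature flag.
class SketchProcessingService
  def initialize(pipeline, new_collection: nil, flag_enabled: true)
    @pipeline = pipeline
    @new_collection = new_collection
    @flag_enabled = flag_enabled
  end

  def needs_processing?
    if @flag_enabled && @new_collection
      # Reuse the collection that was already evaluated; no extra query.
      @new_collection.processing_jobs.any?
    else
      # Collection not populated (or flag off): fall back to the query.
      @pipeline.needs_processing?
    end
  end
end
```

In this sketch, a service built with a populated collection never touches the pipeline query, while one built without a collection (or with the flag off) behaves exactly as before.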

As an extra precaution, we will roll out the FF with the strategy in #600063 (comment 3368875813), which is to roll it out on GitLab projects only and spot check discrepancies for a few days before rolling out globally.

Feature flag: ci_check_needs_processing_using_new_status_collection. Roll-out: #600901

References

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #598584

Edited by Leaminn Ma
