[15.8] Fix automatically-retried jobs stuck in pending state
What does this MR do and why?
This backports !116480 (merged) and !117275 (merged) to the 15-8-stable-ee
branch.
This fixes an issue where concurrent runners would not pick up retried builds due to the runner tick value not being invalidated.
Previously when a job failed, Ci::RetryJobService would clone the
failed job, add its own after_commit hooks, and immediately attempt
to start the pipeline. However, starting a pipeline loads its own
instances of Ci::Build, and for builds that are updated, the state
changes add their own after_commit hooks.
For a given transaction, Rails only runs the after_commit hooks of one
instance of a given record (see https://github.com/rails/rails/pull/45280),
so the after_commit hooks registered when starting the build would be
discarded. As a result, BuildQueueWorker was never executed for the
build, causing the runner tick value to be left in a stale state. Other
run_after_commit blocks, such as build hooks, were not executed either.
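The failure mode described above can be sketched in plain Ruby. This is a toy model only, not Rails or GitLab code: `ToyTransaction`, `ToyBuild`, and `simulate_retry` are hypothetical names, and the "first registered instance wins" rule is an assumption standing in for the Rails behavior referenced in the linked pull request.

```ruby
# Toy model (NOT Rails code): a transaction that remembers only the first
# instance registered for each record and runs only that instance's hooks.
class ToyTransaction
  def initialize
    @first_instances = {} # record key => first instance registered
  end

  def register(instance)
    # Later instances of the same record are ignored (assumed behavior).
    @first_instances[instance.key] ||= instance
  end

  def commit
    @first_instances.each_value(&:run_commit_hooks)
  end
end

# Hypothetical stand-in for a Ci::Build instance.
class ToyBuild
  attr_reader :id

  def initialize(id)
    @id = id
    @hooks = []
  end

  def key
    [self.class, id]
  end

  def run_after_commit(&block)
    @hooks << block
  end

  def run_commit_hooks
    @hooks.each(&:call)
  end
end

def simulate_retry
  log = []
  txn = ToyTransaction.new

  # Instance 1: the clone created by the retry service.
  cloned = ToyBuild.new(1)
  cloned.run_after_commit { log << :retry_service_hook }
  txn.register(cloned)

  # Instance 2: the same record, loaded again while starting the pipeline
  # inside the same transaction.
  reloaded = ToyBuild.new(1)
  reloaded.run_after_commit { log << :build_queue_worker }
  txn.register(reloaded)

  txn.commit
  log
end
```

In this model, `simulate_retry` returns only `[:retry_service_hook]`: the hook that would enqueue the queue worker is attached to the second instance and never runs.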
To avoid this double after_commit business, move the starting of the
pipeline into the run_after_commit block of the cloned job. This
slightly delays the starting of the pipeline and the job, but it also
avoids starting a Sidekiq worker inside a transaction
(#398229 (closed)).
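The shape of the fix can be illustrated with a small, self-contained Ruby sketch. It is not the actual GitLab change: `SketchTransaction`, `SketchBuild`, and `retry_with_deferred_start` are hypothetical names, and the transaction class assumes that only the first registered instance of a record gets its commit hooks run.

```ruby
# Toy model (NOT Rails code): only the first instance registered per record
# has its commit hooks run when the transaction commits (assumed behavior).
class SketchTransaction
  def initialize
    @first_instances = {}
  end

  def register(instance)
    @first_instances[instance.id] ||= instance
  end

  def commit
    @first_instances.values.each(&:run_commit_hooks)
  end
end

# Hypothetical stand-in for a Ci::Build instance.
class SketchBuild
  attr_reader :id

  def initialize(id)
    @id = id
    @hooks = []
  end

  def run_after_commit(&block)
    @hooks << block
  end

  def run_commit_hooks
    @hooks.each(&:call)
  end
end

def retry_with_deferred_start
  log = []

  cloned = SketchBuild.new(1)
  # Starting the pipeline is deferred until the clone's transaction commits.
  cloned.run_after_commit do
    log << :start_pipeline

    # This runs after the outer commit, so the state change happens in a
    # separate transaction where the reloaded build is the first instance.
    inner = SketchTransaction.new
    reloaded = SketchBuild.new(1)
    reloaded.run_after_commit { log << :build_queue_worker }
    inner.register(reloaded)
    inner.commit
  end

  outer = SketchTransaction.new
  outer.register(cloned)
  outer.commit
  log
end
```

Here `retry_with_deferred_start` returns `[:start_pipeline, :build_queue_worker]`: because the pipeline starts only after the first commit, the second instance's hook is no longer competing inside the same transaction.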
Relates to #387775 (closed)
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- This MR is backporting a bug fix, documentation update, or spec fix, previously merged in the default branch.
- The original MR has been deployed to GitLab.com (not applicable for documentation or spec changes).
- This MR has a severity label assigned (if applicable).
- Ensure the e2e:package-and-test-ee job has either succeeded or been approved by a Software Engineer in Test.
Note to the merge request author and maintainer
The process of backporting bug fixes into stable branches is tracked as part of an internal pilot. If you have questions about this process, please:
- Refer to the internal pilot issue for feedback or questions.
- Refer to the patch release runbook for engineers and maintainers for guidance.