Delete the duplicate job even in case of errors
🔍 Context
During the investigation of !87649 (merged), we noticed that with the deduplicate :until_executed configuration, the deduplication key doesn't seem to be removed when there is an error in the worker execution.
See:
This was surprising and counterintuitive. A job that fails has still finished: the deduplication key should be freed, and subsequent executions should be allowed to run.
🔬 What does this MR do and why?
- Update the until_executed strategy to make sure that the duplicate job is deleted even in case of errors.
- Update the related specs.
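The intended behavior can be illustrated with a small, self-contained sketch. This is not the actual GitLab implementation: DedupStore and run_until_executed are hypothetical names invented for this illustration, and the key store is a plain in-memory set. The only point it shows is that releasing the key in an ensure clause frees it whether the job succeeds or raises.

```ruby
require 'set'

# Hypothetical in-memory stand-in for the deduplication key store.
class DedupStore
  def initialize
    @keys = Set.new
  end

  # Returns true when the key was newly claimed, false when it is already held.
  def claim(key)
    !@keys.add?(key).nil?
  end

  def release(key)
    @keys.delete(key)
  end

  def held?(key)
    @keys.include?(key)
  end
end

# Hypothetical helper: runs the block unless a job with the same key is
# still pending. The key is released in `ensure`, so a failing job frees it too.
def run_until_executed(store, key)
  return :deduplicated unless store.claim(key)

  begin
    yield
    :done
  ensure
    store.release(key)
  end
end
```

With this shape, a first call whose block raises still leaves the store empty, so a second call with the same key runs instead of being deduplicated.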
📺 Screenshots or screen recordings
n/a
⚙ How to set up and validate locally
This is not easy to test locally, but we can try.
We're going to use the ContainerRegistry::Migration::GuardWorker for this verification.
- Update ContainerRegistry::Migration::GuardWorker#perform to:
    def perform
      raise ArgumentError, 'Boom'
    end
- Update config/initializers/1_settings.rb so that the worker is executed every 2 minutes:
    Settings.cron_jobs['container_registry_migration_guard_worker']['cron'] ||= '*/2 * * * *'
- Start the background jobs and observe the logs:
$ tail -f log/sidekiq.log | grep "GuardWorker"
On master
We get:
{"time":"2022-05-17T11:54:21.704Z","class":"ContainerRegistry::Migration::GuardWorker","job_status":"start"}
{"time":"2022-05-17T11:54:24.199Z","class":"ContainerRegistry::Migration::GuardWorker","job_status":"fail"}
{"time":"2022-05-17T11:56:32.540Z","class":"ContainerRegistry::Migration::GuardWorker","job_status":"deduplicated"}
Execution 1 failed and that failure left the deduplication key behind. As a consequence, execution 2 is deduplicated.
With this MR
We get this log:
{"time":"2022-05-17T11:46:25.180Z","class":"ContainerRegistry::Migration::GuardWorker","job_status":"start"}
{"time":"2022-05-17T11:46:27.650Z","class":"ContainerRegistry::Migration::GuardWorker","job_status":"fail"}
{"time":"2022-05-17T11:48:26.756Z","class":"ContainerRegistry::Migration::GuardWorker","job_status":"start"}
{"time":"2022-05-17T11:48:29.651Z","class":"ContainerRegistry::Migration::GuardWorker","job_status":"fail"}
Execution 1 fails but execution 2 is allowed to run because the deduplication key is freed when execution 1 fails.
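To compare the two runs without reading raw JSON, the job_status sequence can be extracted with a few lines of Ruby. This is only a convenience sketch: parse_entry and job_statuses are hypothetical helper names, and the sample input is the log output quoted above.

```ruby
require 'json'

# Hypothetical helper: parses one log line, returning nil for non-JSON lines.
def parse_entry(line)
  JSON.parse(line)
rescue JSON::ParserError
  nil
end

# Hypothetical helper: returns the job_status values logged for the given
# worker class, in log order.
def job_statuses(lines, worker_class)
  lines.filter_map do |line|
    entry = parse_entry(line)
    next unless entry.is_a?(Hash) && entry['class'] == worker_class

    entry['job_status']
  end
end

WORKER = 'ContainerRegistry::Migration::GuardWorker'

# Log lines copied from the master run above.
master_log = <<~LOG
  {"time":"2022-05-17T11:54:21.704Z","class":"ContainerRegistry::Migration::GuardWorker","job_status":"start"}
  {"time":"2022-05-17T11:54:24.199Z","class":"ContainerRegistry::Migration::GuardWorker","job_status":"fail"}
  {"time":"2022-05-17T11:56:32.540Z","class":"ContainerRegistry::Migration::GuardWorker","job_status":"deduplicated"}
LOG

p job_statuses(master_log.lines, WORKER)
# prints ["start", "fail", "deduplicated"]
```

With this MR, the same helper applied to the second log should show start and fail twice, with no deduplicated entry.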
🚥 MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.