
Implement different deduplication strategies for idempotent jobs

In gitlab-org/gitlab!25447 (merged) we've added a single strategy for recognizing duplicate jobs. The strategy takes a lock when a job is added to the queue, and removes that lock before the job starts.

We could add more strategies to cover more kinds of duplicates, similar to what unique-jobs provides.

Our implementation would need to take the idempotent? attribute of a worker into account before deduplicating, so we can't switch to using that gem just yet. But we could make our API for specifying a deduplication method on a worker compatible with the gem's.

The API works by adding a lock: option to the sidekiq_options. If a worker specifies a lock explicitly, we could treat it as idempotent?; if an idempotent worker specifies no lock, we default to :until_executing.
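A minimal sketch of how that resolution could work, in plain Ruby so it runs without Sidekiq. The `DedupOptions` module, `idempotent!`, and `deduplication_strategy` are hypothetical names used only for illustration; only the `lock:` option, `idempotent?`, and the `:until_executing` default come from the proposal above.

```ruby
# Hypothetical stand-in for the worker DSL. Only `lock:`, `idempotent?`,
# and the :until_executing default are taken from the proposal.
module DedupOptions
  DEFAULT_STRATEGY = :until_executing

  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Mimics Sidekiq's `sidekiq_options` by storing a plain options hash.
    def sidekiq_options(opts = {})
      @sidekiq_options = (@sidekiq_options || {}).merge(opts)
    end

    def get_sidekiq_options
      @sidekiq_options || {}
    end

    # Hypothetical way for a worker to declare itself idempotent.
    def idempotent!
      @idempotent = true
    end

    # An explicit lock implies idempotency.
    def idempotent?
      @idempotent == true || get_sidekiq_options.key?(:lock)
    end

    # Returns the strategy to apply, or nil when the worker is not idempotent.
    def deduplication_strategy
      return nil unless idempotent?

      get_sidekiq_options.fetch(:lock, DEFAULT_STRATEGY)
    end
  end
end

class ExplicitLockWorker
  include DedupOptions
  sidekiq_options lock: :until_executed
end

class IdempotentWorker
  include DedupOptions
  idempotent!
end

class PlainWorker
  include DedupOptions
end
```

Here `ExplicitLockWorker` resolves to its explicit lock, `IdempotentWorker` falls back to the default, and `PlainWorker` is never deduplicated.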

The gem provides the following strategies:

  1. :until_executing (detection implemented in gitlab-org/gitlab!25447 (merged)): Locks until the previously scheduled job starts.

  2. :until_executed: Locks until the previously scheduled job finishes.

  3. :until_expired: Locks solely based on time.

  4. :until_and_while_executing: Prevents a job from being scheduled twice, but allows a job to be scheduled when one is already running. Does not allow 2 jobs to run simultaneously.

  5. :while_executing: Allows jobs to be scheduled at the same time, but does not allow them to run simultaneously. This could be a replacement for our exclusive_lease pattern.

  6. :none: Since we automatically deduplicate with :until_executing, we might want to add a :none for jobs that should not be deduplicated even though they are idempotent.
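The difference between the first two strategies is only the point at which the lock is released. The toy model below illustrates that, together with :none. It is an in-memory sketch, not the gem's or GitLab's implementation: `ToyDeduplicator` and its methods are hypothetical, a real version would presumably keep locks in Redis, and the time-based and while-executing strategies are left out.

```ruby
require 'set'

# Toy in-memory model of lock lifetimes (hypothetical, for illustration only).
class ToyDeduplicator
  # Which lifecycle event releases the lock taken at enqueue time.
  RELEASE_ON = {
    until_executing: :start,
    until_executed: :finish
  }.freeze

  def initialize(strategy)
    @strategy = strategy
    @locks = Set.new
  end

  # Returns true if the job was enqueued, false if it was dropped as a duplicate.
  def enqueue(key)
    return true if @strategy == :none
    return false if @locks.include?(key)

    @locks.add(key)
    true
  end

  def start(key)
    release(key, :start)
  end

  def finish(key)
    release(key, :finish)
  end

  private

  def release(key, event)
    @locks.delete(key) if RELEASE_ON[@strategy] == event
  end
end

early = ToyDeduplicator.new(:until_executing)
early.enqueue("job-1") # => true
early.enqueue("job-1") # => false, duplicate while the first is still queued
early.start("job-1")
early.enqueue("job-1") # => true, the lock was released when the job started

late = ToyDeduplicator.new(:until_executed)
late.enqueue("job-1")  # => true
late.start("job-1")
late.enqueue("job-1")  # => false, the lock is held until the job finishes
late.finish("job-1")
late.enqueue("job-1")  # => true
```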

Edited by Bob Van Landuyt