Skip duplicate job check on resumed concurrency limit job

What does this MR do and why?

Optimize the performance of resuming jobs that were deferred by the concurrency limit: run the duplicate job check before deferring the job, and skip the duplicate job check (and its Redis calls) when the job is resumed.

Before

  • ConcurrencyLimit::Client (deferring job) middleware runs before DuplicateJobs::Client (deduplication check) middleware
  • Job 1 gets deferred by ConcurrencyLimit::Client
  • Job 2 gets deferred by ConcurrencyLimit::Client
  • Jobs 1 & 2 are resumed by ConcurrencyLimit::ResumeWorker
  • Job 1 runs deduplication check. Job 1 sets cookie A.
  • Job 1 is scheduled.
  • Job 2 runs deduplication check. Job 2 tries to set cookie A, but it already exists.
  • Job 2 is deduplicated.
  • Job 1 is picked up by Sidekiq.
  • Job 1's cookie A is deleted by DuplicateJobs::Server middleware.

After

  • ConcurrencyLimit::Client (deferring job) middleware runs after DuplicateJobs::Client (deduplication check) middleware
  • Job 1 runs deduplication check. Job 1 sets duplicate job cookie A in Redis.
  • Job 1 gets deferred by ConcurrencyLimit::Client
  • Cookie A exists for as long as job 1 remains in the concurrency queue, or until its 10 minute TTL expires.
  • Job 2 runs deduplication check.
  • Job 2 tries to set duplicate job cookie A in Redis, but it already exists.
  • Job 2 is deduplicated.
  • Only job 1 is resumed by ConcurrencyLimit::ResumeWorker.
  • Job 1 is rescheduled without checking for duplicate jobs.
  • Job 1 is picked up by Sidekiq.
  • Job 1's cookie A is deleted by DuplicateJobs::Server middleware.
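
The difference between the two flows can be sketched with a toy simulation. This is a minimal sketch, not GitLab's implementation: the method names `simulate_before` and `simulate_after` are hypothetical, and an in-memory `Set` stands in for the duplicate job cookies in Redis. It only counts how many jobs enter the concurrency queue and how many duplicate checks run at resume time.

```ruby
require 'set'

# Old order: every job is deferred first, and the duplicate check
# runs only when the jobs are resumed.
def simulate_before(job_keys)
  cookies = Set.new            # stand-in for the cookies stored in Redis
  deferred = job_keys.dup      # all jobs enter the concurrency queue
  resume_checks = 0
  scheduled = deferred.select do |key|
    resume_checks += 1         # one duplicate check per resumed job
    !cookies.add?(key).nil?    # Set#add? returns nil if the key already exists
  end
  { deferred: deferred.size, resume_checks: resume_checks, scheduled: scheduled.size }
end

# New order: the duplicate check runs before deferring, so only
# non-duplicate jobs are deferred, and resuming skips the check.
def simulate_after(job_keys)
  cookies = Set.new
  deferred = job_keys.select { |key| !cookies.add?(key).nil? }
  scheduled = deferred         # resumed jobs are rescheduled without a check
  { deferred: deferred.size, resume_checks: 0, scheduled: scheduled.size }
end

jobs = %w[cookie_a cookie_a]   # jobs 1 and 2 share the same cookie
puts simulate_before(jobs).inspect
puts simulate_after(jobs).inspect
```

In both flows only one job is scheduled. In this sketch the new order defers one job instead of two and performs no duplicate checks at resume time.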

Changes:

  • Reorder DuplicateJobs::Client middleware to run before ConcurrencyLimit::Client middleware
  • Reorder ConcurrencyLimit::Server middleware to run before DuplicateJobs::Server middleware. As with the client change, if the job ends up deferred in the server middleware, we still want to keep the duplicate job cookie in Redis.
  • Skip the duplicate job check when a job is resumed, because the check was already done before the job was deferred.
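
The skip on resume could look roughly like the following. This is a sketch under stated assumptions, not the code in this MR: the class `SketchDuplicateJobsClient`, the job keys `'resumed_by_concurrency_limit'` and `'idempotency_key'`, and the in-memory `Set` used in place of Redis are all hypothetical.

```ruby
require 'set'

# Hypothetical stand-in for a deduplicating client middleware.
class SketchDuplicateJobsClient
  # Hypothetical marker that a resume worker would set on resumed jobs.
  RESUMED_KEY = 'resumed_by_concurrency_limit'

  attr_reader :checks

  def initialize(cookies = Set.new)
    @cookies = cookies # stand-in for the cookies stored in Redis
    @checks = 0        # number of duplicate checks actually performed
  end

  # Yields if the job should proceed; returns false if it is dropped as a duplicate.
  def call(job)
    # A resumed job was already checked before it was deferred, so skip the check.
    return yield if job[RESUMED_KEY]

    @checks += 1
    return false unless @cookies.add?(job['idempotency_key'])

    yield
  end
end

client = SketchDuplicateJobsClient.new
first = client.call({ 'idempotency_key' => 'a' }) { :enqueued }
duplicate = client.call({ 'idempotency_key' => 'a' }) { :enqueued }
resumed = client.call(
  { 'idempotency_key' => 'a', SketchDuplicateJobsClient::RESUMED_KEY => true }
) { :enqueued }
puts [first, duplicate, resumed, client.checks].inspect
```

The resumed job passes through without touching the cookie store, which is the Redis round trip this change aims to avoid.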

The changes are gated by the environment variable REORDER_DUPLICATE_JOBS_AND_CONCURRENCY_LIMIT_MIDDLEWARE. We can't use a feature flag because the middleware order is initialized at load time.
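
As an illustration of why the gate is read once at load time, here is a minimal sketch. The helper `client_middleware_order` is hypothetical, and comparing the variable against the string 'true' is an assumption based on the validation command in this description; the actual parsing in GitLab may differ.

```ruby
# Hypothetical helper: returns the client middleware names in the order
# they would be added to the chain when the process boots.
def client_middleware_order(env = ENV)
  # Assumption: the variable is considered enabled when set to 'true'.
  reorder = env['REORDER_DUPLICATE_JOBS_AND_CONCURRENCY_LIMIT_MIDDLEWARE'] == 'true'

  if reorder
    %w[DuplicateJobs::Client ConcurrencyLimit::Client]
  else
    %w[ConcurrencyLimit::Client DuplicateJobs::Client]
  end
end

# Evaluated once while the process loads; changing the variable afterwards
# has no effect until the process restarts.
puts client_middleware_order.inspect
```

A feature flag can change while the process is running, but a chain built this way is fixed once the process has booted, so a value read at boot is the natural fit.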

References

gitlab-com/gl-infra/production#20567 (comment 2817004998)

How to set up and validate locally

To validate the middleware order:

❯ GITLAB_LOG_LEVEL=debug bundle exec sidekiq

❯ REORDER_DUPLICATE_JOBS_AND_CONCURRENCY_LIMIT_MIDDLEWARE=true GITLAB_LOG_LEVEL=debug bundle exec sidekiq

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Marco Gregorius
