Change LimitedCapacity::JobTracker to use SharedState redis rather than Queues (sidekiq)

From #867 (comment 510710754)

To move to any multi-cluster sidekiq deployment, be that per-zone clusters in k8s or some other design, LimitedCapacity::JobTracker needs to use a global shared Redis rather than the implied one per cluster. Otherwise it will only limit capacity within the sidekiq cluster where the workers are running, whereas the general use case is a system-wide limit (e.g. Ci::DeleteObjectsWorker).

Proposal

We have four phases:

  1. Support reading this information from both the persistent Redis and the Sidekiq Redis. As we're interested in the number of running jobs at a given point in time, we'll need to sum the cardinalities of the two sets. (Although we typically expect one or the other to be zero, we still need to read both.)
  2. Start writing to the persistent Redis instead of the Sidekiq Redis. This is backwards-compatible because of step 1.
  3. Migrate existing data (with a Redis migration). This will only be needed if the two Redis instances are actually configured to be different, which they are on GitLab.com.
  4. Stop reading from the Sidekiq Redis.
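
The four phases above can be sketched roughly as follows. This is a minimal illustration, not the actual LimitedCapacity::JobTracker code: `FakeRedis` is an in-memory stand-in for a Redis connection, and `DualStoreJobTracker`, its keyword arguments (`write_to_shared`, `read_legacy`), and the key name are all hypothetical. It assumes running jobs are tracked as members of a Redis set, as the proposal implies.

```ruby
require 'set'

# In-memory stand-in for a Redis connection (assumption, for illustration only).
# Only the set commands used below are modelled: SADD, SREM, SCARD, SMEMBERS.
class FakeRedis
  def initialize
    @sets = Hash.new { |hash, key| hash[key] = Set.new }
  end

  def sadd(key, member)
    @sets[key].add(member)
  end

  def srem(key, member)
    @sets[key].delete(member)
  end

  def scard(key)
    @sets[key].size
  end

  def smembers(key)
    @sets[key].to_a
  end
end

# Hypothetical tracker that can be moved through the four phases via flags.
class DualStoreJobTracker
  def initialize(key:, shared:, legacy:, write_to_shared: false, read_legacy: true)
    @key = key
    @shared = shared                   # persistent (shared state) Redis
    @legacy = legacy                   # per-cluster Sidekiq Redis
    @write_to_shared = write_to_shared # phase 2 switch
    @read_legacy = read_legacy         # phase 4 switch (set to false)
  end

  def register(jid)
    (@write_to_shared ? @shared : @legacy).sadd(@key, jid)
  end

  def remove(jid)
    # The job may have been registered before the write switch, so clear both.
    @shared.srem(@key, jid)
    @legacy.srem(@key, jid) if @read_legacy
  end

  # Phase 1: sum the cardinalities of both sets.
  def count
    total = @shared.scard(@key)
    total += @legacy.scard(@key) if @read_legacy
    total
  end

  # Phase 3: copy each member before deleting it, so the summed count can
  # briefly over-report but never under-report the number of running jobs.
  def migrate!
    @legacy.smembers(@key).each do |jid|
      @shared.sadd(@key, jid)
      @legacy.srem(@key, jid)
    end
  end
end

shared = FakeRedis.new
legacy = FakeRedis.new
key = 'example_worker:running_jobs' # hypothetical key name

phase1 = DualStoreJobTracker.new(key: key, shared: shared, legacy: legacy)
phase1.register('jid-1') # still written to the Sidekiq Redis

phase2 = DualStoreJobTracker.new(key: key, shared: shared, legacy: legacy, write_to_shared: true)
phase2.register('jid-2') # now written to the persistent Redis
puts phase2.count        # prints 2: one job in each store

phase2.migrate!
phase4 = DualStoreJobTracker.new(key: key, shared: shared, legacy: legacy,
                                 write_to_shared: true, read_legacy: false)
puts phase4.count        # prints 2: both jobs now live in the persistent Redis
```

In this sketch the migration is not atomic across the two stores; a real Redis migration would need to decide how to handle jobs that start or finish while it runs.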
Edited by Sean McGivern