Update sidekiq migration helpers to be sharding compatible

Sylvester Chin requested to merge sc1-sidekiq-mig-helper into master

What does this MR do and why?

This MR addresses shard-compatibility for sidekiq migration helpers.

  • For deleting jobs, we iterate over all shard instances in Gitlab::Redis::Queues
  • For migrating queues, we perform the usual rpoplpush if there is only 1 instance or if the source and destination instances are identical.
    • Otherwise, we rpop from the source instance and lpush to the destination instance.
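
The two behaviours above can be sketched roughly as follows. This is a minimal illustration and not the code in this MR: FakeRedis is a hypothetical in-memory stand-in for a Redis instance (so the sketch runs without a server), and migrate_queue, delete_jobs and remove_where are illustrative names, not the helper's real methods.

```ruby
require "json"

# Hypothetical in-memory stand-in for one Redis instance, so the sketch runs
# without a server. Only the list commands needed here are modelled.
class FakeRedis
  def initialize
    @lists = Hash.new { |hash, key| hash[key] = [] }
  end

  # Pushes onto the head of the list and returns the new length.
  def lpush(key, value)
    @lists[key].unshift(value).size
  end

  # Pops from the tail of the list; returns nil when the list is empty.
  def rpop(key)
    @lists[key].pop
  end

  # Moves the tail of `source` to the head of `destination` in one step.
  def rpoplpush(source, destination)
    value = rpop(source)
    lpush(destination, value) unless value.nil?
    value
  end

  def llen(key)
    @lists[key].size
  end

  def lrange_all(key)
    @lists[key].dup
  end

  # Illustrative helper (not a Redis command): removes every element for
  # which the block is true and returns the number removed.
  def remove_where(key, &block)
    before = @lists[key].size
    @lists[key].reject!(&block)
    before - @lists[key].size
  end
end

# Moves every job from one queue to another, choosing the strategy by instance.
def migrate_queue(source, destination, from:, to:)
  if source.equal?(destination)
    # Same instance: a single command moves each job.
    source.rpoplpush(from, to) while source.llen(from) > 0
  else
    # Different instances: pop from the source, then push to the destination.
    while (job = source.rpop(from))
      destination.lpush(to, job)
    end
  end
end

# Deletes jobs of the given worker classes from a queue on every shard instance.
def delete_jobs(instances, queue:, worker_classes:)
  instances.sum do |redis|
    redis.remove_where(queue) { |raw| worker_classes.include?(JSON.parse(raw)["class"]) }
  end
end
```

Both branches preserve job order, because each job leaves the tail of the source list and enters the head of the destination list. Note that the cross-instance branch is two separate commands, so unlike rpoplpush it is not atomic.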

When the router receives nil, it defaults to the main store, so this does not affect self-managed (SM) users.
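
As a rough illustration of that fallback, the sketch below resolves a worker to a queue and store using rules of the three-element form shown in the sharded setup steps further down. It is a simplification under stated assumptions: route and MAIN_STORE are hypothetical names, only the "worker_name=..." and "*" selectors are handled, and the real router is not reproduced here.

```ruby
# Name of the fallback store; "main" matches the value seen in the job payloads.
MAIN_STORE = "main"

# Rules have the form [selector, queue, shard]; the shard may be omitted or nil.
# Simplification: only "worker_name=<Class>" and "*" selectors are understood,
# any other selector (for example "tags=...") is skipped.
def route(worker_name, rules)
  rules.each do |selector, queue, shard|
    next unless selector == "*" || selector == "worker_name=#{worker_name}"

    # A missing (nil) shard falls back to the main store.
    return { queue: queue, store: shard || MAIN_STORE }
  end

  # No rule matched: stay on the main store.
  { queue: nil, store: MAIN_STORE }
end

RULES = [
  ["tags=needs_own_queue", nil],
  ["worker_name=Chaos::SleepWorker", "chaos", "queues_shard_01"],
  ["worker_name=Chaos::SleepInvalidWorker", "chaos", "queues_shard_02"],
  ["*", "default"]
].freeze

p route("Chaos::SleepWorker", RULES)
p route("Chaos::CpuSpinWorker", RULES)
```

With no shard configured in any rule, every lookup resolves to the main store, which is why an unsharded installation would see no change in behaviour.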

See gitlab-com/gl-infra/scalability#2817 (comment 1763832817)

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

How to set up and validate locally

For non-sharded setup:

  1. Run gdk stop rails to halt background jobs
  2. Open gdk rails console and enqueue some jobs
[1] pry(main)> Chaos::SleepWorker.perform_async(1)
  Feature::FlipperGate Pluck (0.4ms)  SELECT "feature_gates"."key", "feature_gates"."value" FROM "feature_gates" WHERE "feature_gates"."feature_key" = 'enable_sidekiq_shard_router' /*application:console,db_config_name:main,console_hostname:SylvestersMBP2.localdomain,console_username:sylvesterchin,line:/lib/feature.rb:314:in `block in current_feature_value'*/
=> "d724538d6352f0f7b1a4e753"
[2] pry(main)> Chaos::SleepWorker.perform_async(2)
=> "2e07a373332a4b8496b0d33b"
  3. Verify their existence using gdk redis-cli -n 1 lrange queue:default 0 -1
  4. Run the migration on the console
[4] pry(main)> model = ActiveRecord::Migration.new.extend(Gitlab::Database::Migrations::SidekiqHelpers)
=> #<ActiveRecord::Migration:0x0000000142cd87a8 @connection=nil, @name="ActiveRecord::Migration", @version=nil>
[5] pry(main)> model.sidekiq_queue_migrate('default', to: 'chaos')
[sidekiq#5788] Redis has deprecated the `rpoplpush`command, called at ["/Users/sylvesterchin/work/gitlab-development-kit/gitlab/lib/gitlab/database/migrations/sidekiq_helpers.rb:116:in `block in migrate_within_instance'"]
[sidekiq#5788] Redis has deprecated the `rpoplpush`command, called at ["/Users/sylvesterchin/work/gitlab-development-kit/gitlab/lib/gitlab/database/migrations/sidekiq_helpers.rb:116:in `block in migrate_within_instance'"]
=> nil
  5. Verify in redis-cli that queue:default is now empty and that both jobs are in queue:chaos
redis /Users/sylvesterchin/work/gitlab-development-kit/redis/redis.socket[1]> lrange queue:default 0 -1
(empty array)
redis /Users/sylvesterchin/work/gitlab-development-kit/redis/redis.socket[1]> lrange queue:chaos 0 -1
1) "{\"retry\":3,\"queue\":\"default\",\"backtrace\":true,\"version\":0,\"store\":null,\"queue_namespace\":\"chaos\",\"args\":[2],\"class\":\"Chaos::SleepWorker\",\"jid\":\"2e07a373332a4b8496b0d33b\",\"created_at\":1710392156.5215611,\"meta.sidekiq_destination_shard_redis\":\"main\",\"correlation_id\":\"7d5492b516ff0b6a07514c93946aaac5\",\"worker_data_consistency\":\"always\",\"idempotency_key\":\"resque:gitlab:duplicate:default:24808edb6838c48efcad29c8e4b7b5b1c8243aa7a545bb43a333e8284b18e49e\",\"size_limiter\":\"validated\",\"enqueued_at\":1710392156.529687}"
2) "{\"retry\":3,\"queue\":\"default\",\"backtrace\":true,\"version\":0,\"store\":null,\"queue_namespace\":\"chaos\",\"args\":[1],\"class\":\"Chaos::SleepWorker\",\"jid\":\"d724538d6352f0f7b1a4e753\",\"created_at\":1710392154.7580101,\"meta.sidekiq_destination_shard_redis\":\"main\",\"correlation_id\":\"4ad675b5ae94d3b181875d521aec2ec9\",\"worker_data_consistency\":\"always\",\"idempotency_key\":\"resque:gitlab:duplicate:default:2f0bc834677c58fe9b274ea1acbbcd384d61ad88243f7ae8b6c38429900bba4f\",\"size_limiter\":\"validated\",\"enqueued_at\":1710392154.7707222}"
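
The raw payloads above are long, so a small script can make the verification easier to read. The following is a sketch only: summarize_jobs is a hypothetical helper, and it assumes the payload strings have already been fetched from the queue by whatever client is at hand. The sample input is a trimmed copy of the first entry above.

```ruby
require "json"

# Reduces raw Sidekiq job payloads (JSON strings) to the fields that matter
# when checking a migration: worker class, job ID, arguments and target shard.
def summarize_jobs(payloads)
  payloads.map do |raw|
    job = JSON.parse(raw)
    {
      class: job["class"],
      jid: job["jid"],
      args: job["args"],
      shard: job["meta.sidekiq_destination_shard_redis"]
    }
  end
end

# Trimmed copy of a payload from the output above.
sample = [
  '{"queue":"default","args":[2],"class":"Chaos::SleepWorker",' \
  '"jid":"2e07a373332a4b8496b0d33b","meta.sidekiq_destination_shard_redis":"main"}'
]

summarize_jobs(sample).each { |job| puts job.inspect }
```

Comparing the jid values in the summary with those returned by perform_async confirms that the same jobs arrived in the destination queue.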

Using a sharded setup

  1. Initialise 2 docker containers
docker run -p 6378:6379 -d redis:6.2-alpine
docker run -p 6377:6379 -d redis:6.2-alpine
  2. Update config/redis.yml and create feature flag config files
➜  gitlab git:(sc1-sidekiq-mig-helper) ✗ cat config/redis.yml
---
development:
  queues_shard_01:
    url: "redis://localhost:6378"
  queues_shard_02:
    url: "redis://localhost:6377"
➜  gitlab git:(sc1-sidekiq-mig-helper) ✗ cat config/feature_flags/ops/sidekiq_route_to_queues_shard_01.yml
---
name: sidekiq_route_to_queues_shard_01
feature_issue_url:
introduced_by_url:
rollout_issue_url:
milestone: '16.9'
group: group::scalability
type: ops
default_enabled: false
  3. Update config/gitlab.yml with 2 new routing rules.
sidekiq:
    log_format: json # (default is also supported)
    routing_rules:
      - ["tags=needs_own_queue", null]
      - ["worker_name=Chaos::SleepWorker", "chaos", "queues_shard_01"]
      - ["worker_name=Chaos::SleepInvalidWorker", "chaos", "queues_shard_02"]
      - ["*", "default"]
  4. Run the following commands on the console:
# enqueue 2 different jobs
[1] pry(main)> Chaos::SleepWorker.perform_async(4)
=> "ea796bc94eeb1d110f075d1c"
[2] pry(main)> Chaos::CpuSpinWorker.perform_async(4)
=> "f2bce665d5298a7778b0ae61"

# enable feature flags
Feature.enable(:sidekiq_route_to_queues_shard_01)
Feature.enable(:sidekiq_route_to_queues_shard_02)
  5. Verify their existence using gdk redis-cli -n 1 lrange queue:default 0 -1

  6. Run the migration on the console:

[1] pry(main)> model = ActiveRecord::Migration.new.extend(Gitlab::Database::Migrations::SidekiqHelpers)
=> #<ActiveRecord::Migration:0x0000000154bdc4c8 @connection=nil, @name="ActiveRecord::Migration", @version=nil>
[2] pry(main)> model.sidekiq_queue_migrate('default', to: 'chaos')
=> [nil]
  7. Verify that the queues were migrated across instances. Chaos::SleepWorker is defined in gitlab.yml but Chaos::CpuSpinWorker is not. Nonetheless, it is moved to the first valid store which serves the chaos queue.
127.0.0.1:6378> lrange queue:chaos 0 -1
1) "{\"retry\":3,\"queue\":\"default\",\"backtrace\":true,\"version\":0,\"store\":null,\"queue_namespace\":\"chaos\",\"args\":[4],\"class\":\"Chaos::SleepWorker\",\"jid\":\"c64b5e26a47bce8d212b1f10\",\"created_at\":1710393821.8792598,\"meta.sidekiq_destination_shard_redis\":\"main\",\"correlation_id\":\"f9f093e72eaf2341d98422a9d1668e17\",\"worker_data_consistency\":\"always\",\"idempotency_key\":\"resque:gitlab:duplicate:default:3f4b3177099baa5dd93a8ab6b3ae0f70cc91725fdfd17581421f12284b074382\",\"duplicate-of\":\"68be05076dae204120d30250\",\"size_limiter\":\"validated\",\"enqueued_at\":1710393821.89473}"
2) "{\"retry\":3,\"queue\":\"default\",\"backtrace\":true,\"version\":0,\"store\":null,\"queue_namespace\":\"chaos\",\"args\":[4],\"class\":\"Chaos::CpuSpinWorker\",\"jid\":\"aed1c2b90c2a2400d467303f\",\"created_at\":1710393820.2140522,\"meta.sidekiq_destination_shard_redis\":\"main\",\"correlation_id\":\"d7365ea69ff31b5d59c28a270fc3722a\",\"worker_data_consistency\":\"always\",\"idempotency_key\":\"resque:gitlab:duplicate:default:cc5c37df2fcc9c75db0285309610a5c8f6ff1e5421c9d00dfbea9ee45d3a09eb\",\"duplicate-of\":\"25222e80626a3efbc4d32499\",\"size_limiter\":\"validated\",\"enqueued_at\":1710393820.226442}"
Edited by Sylvester Chin
