
Drop duplicate jobs from Sidekiq when enqueuing

What does this MR do?

This extends the UntilExecuting deduplication strategy to cancel the scheduling of a job when a duplicate of it is already in the queue.

When we drop a job, we log that to Sidekiq.logger.

The log messages look like this (for now):

I, [2020-03-05T11:50:24.765516 #67227]  INFO -- : {"class"=>"ProjectImportScheduleWorker", "retry"=>false, "queue"=>"project_import_schedule", :backtrace=>true, "jid"=>"474ddc7fd2ebecd467ef534e", "created_at"=>1583405424.761091, "enqueued_at"=>1583405424.764548, "meta.project"=>"gitlab-org/gitlab-shell", "meta.root_namespace"=>"gitlab-org", "meta.subscription_plan"=>"default", "correlation_id"=>"16e24e1b1696c7392de6a215d55d61bd", "duplicate-of"=>"ce57381d57ca6c4f671b3fab", "pid"=>67227, "job_status"=>"deduplicated", "message"=>"ProjectImportScheduleWorker JID-474ddc7fd2ebecd467ef534e: deduplicated: dropped until executing", "deduplication_type"=>"dropped until executing"}
I, [2020-03-05T11:50:24.768354 #67227]  INFO -- : {"class"=>"ProjectImportScheduleWorker", "retry"=>false, "queue"=>"project_import_schedule", :backtrace=>true, "jid"=>"0f744ea81107a95f8a648ac5", "created_at"=>1583405424.761091, "enqueued_at"=>1583405424.765619, "meta.project"=>"gnuwget/wget2", "meta.root_namespace"=>"gnuwget", "meta.subscription_plan"=>"default", "correlation_id"=>"3e9a45d693f927b5c119d25cdb8ed33d", "duplicate-of"=>"be79a0f45e3b145a56fccd86", "pid"=>67227, "job_status"=>"deduplicated", "message"=>"ProjectImportScheduleWorker JID-0f744ea81107a95f8a648ac5: deduplicated: dropped until executing", "deduplication_type"=>"dropped until executing"}
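
To illustrate the behaviour described above, here is a minimal sketch of an "until executing" check. It is not the code from this MR: DedupStore, UntilExecutingSketch, and the way the idempotency key is derived are all hypothetical names and choices made for illustration, and an in-memory Hash stands in for whatever shared store the real strategy uses. The idea is that the first job claims a key when it is scheduled, later duplicates are dropped and logged with the jid they duplicate, and the key is released just before the job runs.

```ruby
require "digest"
require "json"
require "logger"
require "securerandom"
require "stringio"

# Hypothetical in-memory stand-in for the store that tracks queued jobs.
class DedupStore
  def initialize
    @keys = {}
  end

  # Stores jid under key unless the key is already taken.
  # Returns the jid that holds the key after the call.
  def claim(key, jid)
    @keys[key] ||= jid
  end

  def release(key)
    @keys.delete(key)
  end
end

# Hypothetical sketch of the strategy; names do not come from the MR.
class UntilExecutingSketch
  def initialize(store:, logger:)
    @store = store
    @logger = logger
  end

  # Assumption: two jobs are duplicates when worker class and arguments match.
  def idempotency_key(job)
    Digest::SHA256.hexdigest(JSON.generate([job["class"], job["args"]]))
  end

  # Client side: yields (enqueues) only when no duplicate is queued.
  # Returns true when the job was enqueued, false when it was dropped.
  def schedule(job)
    existing = @store.claim(idempotency_key(job), job["jid"])
    if existing != job["jid"]
      @logger.info(job.merge(
        "duplicate-of" => existing,
        "job_status" => "deduplicated",
        "deduplication_type" => "dropped until executing"
      ))
      return false
    end
    yield job
    true
  end

  # Server side: release the key before running, so that an identical
  # job can be scheduled again while this one executes.
  def perform(job)
    @store.release(idempotency_key(job))
    yield job
  end
end

log = StringIO.new
strategy = UntilExecutingSketch.new(store: DedupStore.new, logger: Logger.new(log))
queue = []

first = { "class" => "ProjectImportScheduleWorker", "args" => [1], "jid" => SecureRandom.hex(12) }
second = first.merge("jid" => SecureRandom.hex(12))

strategy.schedule(first)  { |job| queue << job } # enqueued
strategy.schedule(second) { |job| queue << job } # dropped and logged
puts queue.size
puts log.string
```

In this sketch the second job never reaches the queue, and the log entry carries the jid of the job it duplicates, in the same spirit as the "duplicate-of" field in the messages above.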

It is part of the work for gitlab-com/gl-infra/scalability#42 (closed).

We can't enable the feature until we've converted the Sidekiq logger to a structured logger and its output is ingested into the Sidekiq index:

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Edited by Bob Van Landuyt
