Geo: Deduplicate more jobs using `:until_executed` strategy
Instead of being deduplicated at the time of work being picked up, maybe switch to not scheduling an item for work if one is already being processed?
Good idea
👍 Independent of exclusive leases, the Scalability team implemented job deduplication generically. These jobs shouldn't be getting enqueued if any are already enqueued with the same job arguments: https://gitlab.com/gitlab-org/gitlab/blob/master/doc/development/sidekiq_style_guide.md#deduplication. So the problem should be somewhat limited already.
Proposal: Deduplicate until executed
But actually this is a perfect case for changing this worker from the default job deduplication strategy (
:until_executing
) to instead use:until_executed
. This will throw out jobs if another one with duplicate args is currently running. I think we could even remove the exclusive lease code. This should help a bit. I'll open an issue.
(Referring to Geo::RepositoryVerification::Primary::ShardWorker
)
There is no reason to schedule one of these workers if one with the same args is already running. Further, this is labelled a bug because in the above issue it contributes to poor/buggy performance.
Proposal
- Add
deduplicate :until_executed
toGeo::RepositoryVerification::Primary::ShardWorker
-
RemoveIt doesn't matter much if we do this now or later, so let's avoid touching everything, while reaping the benefits of the easy change.try_obtain_lease
block and any code it depends on that is no longer needed.
-
- Open similar issues for any others you notice could benefit from this => I modified the superclasses in !59799 (merged) so many similar workers will benefit