Prevent amplification of ReactiveCachingWorker jobs upon failures
When ReactiveCachingWorker hits an SSL error or another exception that occurs quickly and reliably, automatically rescheduling a new worker can lead to an excessive number of jobs being scheduled. Each run of ReactiveCachingWorker reschedules itself, but a failure also causes Sidekiq to schedule up to 3 retries in the retry set. These retries, in turn, schedule more jobs.
On busy instances, this can become an issue because large numbers of running ReactiveCachingWorker jobs can cause high rates of ExclusiveLease reads and possibly saturate the Redis server with queries.
We now disable this automatic rescheduling when a job fails and rely on Sidekiq to perform its 3 retries with a backoff period.
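To illustrate the amplification, here is a minimal, self-contained simulation. It is only a sketch and not the actual ReactiveCachingWorker code: the `simulate` method, the `reschedule_on_failure` flag, and the notion of discrete "rounds" are assumptions made for illustration, and the model ignores Sidekiq's backoff timing. It assumes every run fails, that each failure consumes one of the 3 Sidekiq retries, and that, with the old behavior, each failed run also enqueues a fresh job with its own full retry budget.

```ruby
# Hypothetical model of job amplification; not GitLab's implementation.
MAX_RETRIES = 3 # Sidekiq retries per job, as stated in the description

# Returns the number of job executions in each round, assuming every run fails.
# Each queue entry is the number of Sidekiq retries left for one pending job.
def simulate(rounds:, reschedule_on_failure:)
  queue = [MAX_RETRIES]
  executed = []

  rounds.times do
    next_queue = []

    queue.each do |retries_left|
      # Old behavior: the failed run still reschedules a brand-new job.
      next_queue << MAX_RETRIES if reschedule_on_failure
      # Sidekiq puts the failed job in the retry set while retries remain.
      next_queue << retries_left - 1 if retries_left > 0
    end

    executed << queue.size
    queue = next_queue
  end

  executed
end

old_behavior = simulate(rounds: 5, reschedule_on_failure: true)
new_behavior = simulate(rounds: 5, reschedule_on_failure: false)

puts "reschedule on failure: #{old_behavior.inspect}" # [1, 2, 4, 8, 15]
puts "Sidekiq retries only:  #{new_behavior.inspect}" # [1, 1, 1, 1, 0]
```

Under these assumptions, the old behavior roughly doubles the number of running jobs each round, while relying on Sidekiq retries alone caps the work at the original run plus 3 retries.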
Edited by Stan Hu