Sidekiq does not increment retries when shutting down
Summary
When updating the "retries" value on a job, Sidekiq treats a job that was pushed back to Redis because of a Sidekiq shutdown differently from a job that actually failed. Both are logged as failed, but only the job that actually failed gets its retry value incremented. So if you have a job that (for example) exceeds the Sidekiq max RSS every time and triggers the memory_killer, that job is pushed back to Redis on each shutdown and its retry value never increases. It remains a zombie job that you can't get out of the queue except by shutting down Sidekiq and deleting it from Redis.
A customer experienced this in this Zendesk ticket.
This behavior inadvertently keeps the worst possible Sidekiq jobs around, where they continue to wreak havoc on the system. We should figure out a way to handle this more gracefully. I'm not certain exactly how; this might be a bug that should be filed against Sidekiq.
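
One possible direction, as a rough sketch rather than a proposed fix: count how many times a job has *started*, separately from the retry value, and stop running it once that count passes a limit. Everything named here is hypothetical (`StartCountGuard`, `store`, `max_starts`, `on_poison`); none of it is part of Sidekiq. The class only mirrors the `call(worker, job, queue)` shape of a Sidekiq server middleware, and it assumes a shutdown reaches the job as a non-`StandardError` exception (or a hard kill), so the counter survives that path. In practice the store would need to live in Redis so the count outlives the process.

```ruby
# Hypothetical guard that tracks job starts independently of the retry value.
class StartCountGuard
  def initialize(store:, max_starts: 3, on_poison: ->(job) {})
    @store = store            # Hash-like; a Redis-backed store in real use (assumption)
    @max_starts = max_starts  # how many starts to allow before giving up
    @on_poison = on_poison    # callback for jobs that keep getting interrupted
  end

  # Same shape as a Sidekiq server middleware: call(worker, job, queue) { ... }
  def call(_worker, job, _queue)
    jid = job["jid"]
    starts = @store.fetch(jid, 0) + 1
    @store[jid] = starts

    if starts > @max_starts
      # The job was started too many times without ever finishing or failing
      # normally, so do not yield: the job body never runs again.
      @store.delete(jid)
      @on_poison.call(job)
      return :skipped
    end

    begin
      result = yield
    rescue StandardError
      # An ordinary failure goes through the normal retry handling,
      # so the start counter is no longer needed.
      @store.delete(jid)
      raise
    end

    # Success: clear the counter.
    @store.delete(jid)
    result
  end
end
```

With this shape, a job that is interrupted on every run leaves its counter in place each time, so the guard skips it after `max_starts` attempts and hands it to `on_poison` (which could log it or move it somewhere for inspection) instead of letting it cycle through the queue indefinitely.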