Sidekiq memory killer can cause jobs to be retried indefinitely
This happened again today in https://gitlab.com/gitlab-com/gl-infra/production/issues/804.
If a Sidekiq job uses too much memory and then gets killed, jobs in the queue will be retried indefinitely. The Sidekiq job_retry counter doesn't get decremented here because the failure may not be due to this specific job.
Ultimately we should prevent any one job's memory usage from going too high. Still, it seems to me the memory killer might want a separate way of tracking the jobs it retried, so it can prevent them from being retried too many times.
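As a rough illustration of the separate tracking idea, here is a minimal sketch in plain Ruby. Everything in it is hypothetical: the `InterruptionTracker` class, its method names, and the default limit of 3 are made up for this example and are not part of Sidekiq or GitLab. It also keeps counts in an in-memory Hash only so the sketch runs on its own; a real version would presumably need shared storage such as Redis, since the process holding the counts is the one being killed.

```ruby
# Hypothetical sketch: these names are illustrative, not Sidekiq or GitLab APIs.
class InterruptionTracker
  # Assumed limit, chosen arbitrarily for the example.
  DEFAULT_MAX_INTERRUPTIONS = 3

  def initialize(max_interruptions: DEFAULT_MAX_INTERRUPTIONS)
    @max_interruptions = max_interruptions
    # In-memory store for the sketch only; a real tracker would need storage
    # that survives the worker process being killed.
    @counts = Hash.new(0)
  end

  # Record that the job with this ID was running when the memory killer
  # shut its process down. Returns the updated interruption count.
  def record_interruption(jid)
    @counts[jid] += 1
  end

  # True while the job has been interrupted fewer times than the limit,
  # meaning it is still acceptable to put it back on the queue.
  def requeue?(jid)
    @counts[jid] < @max_interruptions
  end

  # Forget a job once it finishes normally.
  def clear(jid)
    @counts.delete(jid)
  end
end

tracker = InterruptionTracker.new(max_interruptions: 2)
tracker.record_interruption("abc123")
first_decision = tracker.requeue?("abc123")   # still under the limit
tracker.record_interruption("abc123")
second_decision = tracker.requeue?("abc123")  # limit reached, stop requeueing
```

The point of keeping this count apart from the normal retry counter is that an interruption does not prove the job itself is at fault, so it is counted separately and only used to stop a job that keeps coming back after repeated kills.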
Edited by Stan Hu