Investigate why Sidekiq-cluster restarts all workers when one worker dies
After !97694 (merged), it appears that we now run Sidekiq Cluster everywhere.
While working on this epic, I found that we have a really complex termination process for the Sidekiq worker in Sidekiq Memory Killer.
Now that we are running everything in Sidekiq Cluster, if one of the workers is terminated because it exceeds its RSS memory limit, Sidekiq Memory Killer will effectively shut down the whole cluster:
https://gitlab.com/gitlab-org/gitlab/-/blob/master/sidekiq_cluster/cli.rb#L146-149
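To make the concern concrete, here is a minimal sketch of a supervisor that stops every worker as soon as any one of them exits. It is only a simplified model of the behavior described above, not the code from `sidekiq_cluster/cli.rb`; the method names `all_alive?` and `supervise` and the polling interval are assumptions made for illustration.

```ruby
# Hypothetical sketch: method names are illustrative and are not taken
# from sidekiq_cluster/cli.rb.

# Returns true only while every child process is still running.
def all_alive?(pids)
  pids.all? do |pid|
    # With WNOHANG, waitpid returns nil while the child is still running
    # and the pid once it has exited (which also reaps it).
    Process.waitpid(pid, Process::WNOHANG).nil?
  rescue Errno::ECHILD
    # The child was already reaped, so it is no longer alive.
    false
  end
end

# Polls the workers; as soon as one is gone, terminates all the others.
def supervise(pids, interval: 0.1)
  sleep(interval) while all_alive?(pids)

  pids.each do |pid|
    Process.kill(:TERM, pid)
  rescue Errno::ESRCH
    # This worker has already exited.
  end

  pids.each do |pid|
    Process.waitpid(pid)
  rescue Errno::ECHILD
    # Already reaped in all_alive?.
  end
end

# Two healthy workers and one that exits early, standing in for a worker
# stopped by the memory killer.
workers = [fork { sleep 30 }, fork { sleep 30 }, fork { exit 1 }]
supervise(workers)
```

In this model, the two healthy workers are terminated even though only the third one misbehaved. A per-worker strategy would presumably restart only the process that died and leave the rest running.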
This is not a problem for SaaS at the moment, since we are running a single process in a single pod, but as a self-managed customer, I would be upset if all my Sidekiq processes restarted just because one of them exceeded an RSS threshold.
I am not sure what the reason behind this behavior is.
Edited by Nikola Milojevic