Refactor sidekiq memory killer
What does this MR do and why?
This MR extracts the business logic that actually restarts the process into a separate handler (RestartSidekiqHandler).
It also:
- Simplifies the current logic
- Allows us to introduce NullHandler in "Introduce NullHandler for Sidekiq MemoryKiller", which will allow us to safely fine-tune the soft limit and grace balloon period
- Simplifies specs: the current specs focus on implementation details and test private methods, which makes refactoring and adding new logic very difficult
- Separates the current monitoring phase (memory usage is fine, we are over the soft limit, we are over the hard limit) from the lifecycle state (running, shutting down gracefully, killing sidekiq)
phase | description |
---|---|
phase_rss_within_the_range | Do nothing, RSS is within the range |
phase_rss_above_soft_limit | Log and update metrics, only when the grace period is exceeded, handle high rss |
phase_rss_above_hard_limit | Log and update metrics, handle high rss immediately |
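To make the phase table concrete, here is a minimal sketch of how a phase could be derived from an RSS reading. The method names, keyword arguments, and units are hypothetical and are not taken from this MR; the sketch also assumes that in the soft-limit phase the handler is only invoked once the grace period is exceeded.

```ruby
# Hypothetical sketch: method names and parameters are illustrative,
# not the actual MemoryKiller implementation.

# Returns the monitoring phase for an RSS reading (all values in kilobytes).
def monitoring_phase(rss_kb, soft_limit_kb:, hard_limit_kb:)
  return :phase_rss_above_hard_limit if rss_kb > hard_limit_kb
  return :phase_rss_above_soft_limit if rss_kb > soft_limit_kb

  :phase_rss_within_the_range
end

# Decides whether high RSS should be handled in the given phase.
# `seconds_above_soft_limit` is how long RSS has stayed above the soft limit.
def handle_high_rss?(phase, seconds_above_soft_limit:, grace_period_s:)
  case phase
  when :phase_rss_above_hard_limit then true # handle immediately
  when :phase_rss_above_soft_limit then seconds_above_soft_limit > grace_period_s
  else false # within the range: do nothing
  end
end
```

Keeping the phase decision free of side effects like this is one way to let specs cover it without touching private methods.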
state | description |
---|---|
:running | Sidekiq process is running |
:stop_fetching_new_jobs | handler sends SIGTSTP |
:shutting_down | handler sends SIGTERM |
:killing_sidekiq | handler sends SIGKILL if process didn't shut down gracefully |
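The lifecycle states can be sketched as a handler that escalates signals, alongside a no-op handler. This is only an illustration under stated assumptions: the class bodies, the injected `send_signal` and `alive` callables, and the return values are hypothetical, and the waits between signals are omitted. The use of SIGTERM for :shutting_down is assumed from Sidekiq's documented signal handling (TSTP quiets the process, TERM shuts it down).

```ruby
# Hypothetical sketch: not the actual handlers from this MR.

# No-op handler: never signals the process and reports that nothing was done.
class NullHandler
  def call
    false
  end
end

class RestartSidekiqHandler
  attr_reader :state

  # `send_signal` is called with (signal_name, pid).
  # `alive` is called with pid and returns true while the process still exists.
  def initialize(pid:, send_signal:, alive:)
    @pid = pid
    @send_signal = send_signal
    @alive = alive
    @state = :running
  end

  def call
    transition(:stop_fetching_new_jobs, 'SIGTSTP')
    transition(:shutting_down, 'SIGTERM')
    # Escalate only if the process did not shut down gracefully.
    transition(:killing_sidekiq, 'SIGKILL') if @alive.call(@pid)
    true
  end

  private

  # Records the new lifecycle state, then sends the matching signal.
  def transition(state, signal)
    @state = state
    @send_signal.call(signal, @pid)
  end
end
```

Injecting the signal sender keeps the sketch testable without killing a real process; a real caller could pass something like `->(sig, pid) { Process.kill(sig, pid) }`.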
Screenshots or screen recordings
These are strongly recommended to assist reviewers and reduce the time to merge your change.
How to set up and validate locally
Numbered steps to set up and validate the change are strongly suggested.
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.
Related to #368691 (closed)
Edited by Nikola Milojevic