SidekiqHandler should terminate parent process
What does this MR do and why?
Renames TermProcessHandler to SidekiqHandler, since it is used only by Sidekiq. For Sidekiq, we now use SidekiqCluster everywhere. SidekiqCluster uses ProcessSupervisor, which traps the TERM signal and forwards it to all child processes. It then waits for them to terminate and, if a process is stuck, hard-stops it with SIGKILL.
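The "TERM, wait, then SIGKILL" behavior described above can be sketched roughly as follows. This is only an illustration, not the actual ProcessSupervisor code: the method name `terminate_children` and all of its parameters are hypothetical, and signalling and liveness checks are injected so the logic can be exercised without real processes.

```ruby
# Hypothetical sketch of "TERM, wait, then KILL"; not GitLab's ProcessSupervisor.
#
# pids            - child process ids to stop
# grace_period_s  - how long to wait for a graceful exit
# signaler        - callable taking (signal, pid), e.g. Process.method(:kill)
# alive           - callable taking a pid and returning true while it still runs
# sleeper         - callable used to pause between liveness checks
def terminate_children(pids, grace_period_s:, signaler:, alive:,
                       sleeper: ->(seconds) { sleep(seconds) }, poll_interval_s: 0.1)
  # Step 1: ask every child to shut down gracefully.
  pids.each { |pid| signaler.call(:TERM, pid) }

  # Step 2: poll until all children are gone or the grace period runs out.
  waited = 0.0
  while waited < grace_period_s && pids.any? { |pid| alive.call(pid) }
    sleeper.call(poll_interval_s)
    waited += poll_interval_s
  end

  # Step 3: hard-stop whatever is still running and report it.
  stuck = pids.select { |pid| alive.call(pid) }
  stuck.each { |pid| signaler.call(:KILL, pid) }
  stuck
end
```

With real processes, `signaler` could be `Process.method(:kill)`; the liveness check is left to the caller because how the real supervisor tracks its children is not described here.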
We want to reuse this behavior in SidekiqHandler. So instead of sending TERM to the worker's own pid, we send TERM to the parent process pid, which is the SidekiqCluster process.
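A minimal sketch of that idea, assuming the worker's parent is the cluster process. The class name `ParentTermHandlerSketch` and its constructor arguments are hypothetical and exist only for illustration; this is not the SidekiqHandler implementation from this MR.

```ruby
# Hypothetical sketch; not the actual Gitlab::Memory::Watchdog::SidekiqHandler.
class ParentTermHandlerSketch
  # parent_pid defaults to the pid of the process that spawned this worker.
  # signaler is injectable so the behavior can be checked without killing anything.
  def initialize(parent_pid: Process.ppid, signaler: Process.method(:kill))
    @parent_pid = parent_pid
    @signaler = signaler
  end

  # Send TERM to the parent instead of to the worker itself, so the parent
  # can run its own graceful shutdown for all of its children.
  # Returns true so a caller can treat the event as handled.
  def call
    @signaler.call(:TERM, @parent_pid)
    true
  end
end
```

Signalling the parent rather than the worker keeps the shutdown logic (grace period, hard stop) in one place instead of duplicating it in the handler.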
How to set up and validate locally
This is a little tricky in a local environment, especially if you are using macOS:
- Enable Watchdog for Sidekiq
export GITLAB_MEMORY_WATCHDOG_ENABLED=true
- Set MaxRSS
export SIDEKIQ_MEMORY_KILLER_MAX_RSS=500
- On Darwin (macOS), Gitlab::Metrics::System.memory_usage_rss[:total] returns 0, because this module relies on the /proc filesystem being available. As a workaround, monkey patch rss_memory_limit.rb:
diff --git a/lib/gitlab/memory/watchdog/monitor/rss_memory_limit.rb b/lib/gitlab/memory/watchdog/monitor/rss_memory_limit.rb
index ac71592294ca..600c47e25e65 100644
--- a/lib/gitlab/memory/watchdog/monitor/rss_memory_limit.rb
+++ b/lib/gitlab/memory/watchdog/monitor/rss_memory_limit.rb
@@ -13,7 +13,7 @@ def initialize(memory_limit_bytes:)
end
def call
- worker_rss_bytes = Gitlab::Metrics::System.memory_usage_rss[:total]
+ worker_rss_bytes = get_rss#Gitlab::Metrics::System.memory_usage_rss[:total]
return { threshold_violated: false, payload: {} } if worker_rss_bytes <= memory_limit_bytes
@@ -22,6 +22,16 @@ def call
private
+ def get_rss
+ output, status = Gitlab::Popen.popen(%W(ps -o rss= -p #{Process.pid}), Rails.root.to_s)
+
+ return 0 unless status&.zero?
+
+ output.to_i * 1024 # ps reports RSS in 1024-byte units; the monitor compares bytes
+ end
+
def payload(worker_rss_bytes, memory_limit_bytes)
{
message: 'rss memory limit exceeded',
- Restart and tail rails-background-jobs
- Monitor the processes (I am using htop) and watch for sidekiq and sidekiq-cluster
- After ~10 minutes, you should see the following in the logs:
{"severity":"INFO","time":"2023-01-10T15:45:09.857Z","message":"stopped","memwd_reason":"successfully handled","memwd_handler_class":"Gitlab::Memory::Watchdog::SidekiqHandler","memwd_sleep_time_s":3,"pid":36206,"worker_id":"sidekiq_0","memwd_rss_bytes":0,"retry":0}
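To spot that entry without reading the log by eye, a small filter along these lines may help. It is only a sketch: the helper name `sidekiq_handler_stop?` is hypothetical, and it assumes the log is one JSON object per line with the same `message` and `memwd_handler_class` fields as the sample above.

```ruby
require 'json'

# Hypothetical helper: true when a JSON log line reports that the watchdog's
# SidekiqHandler handled a violation (same fields as the sample entry above).
def sidekiq_handler_stop?(line)
  entry = JSON.parse(line)
  entry.is_a?(Hash) &&
    entry['message'] == 'stopped' &&
    entry['memwd_handler_class'] == 'Gitlab::Memory::Watchdog::SidekiqHandler'
rescue JSON::ParserError
  # Non-JSON lines (for example, plain-text output) are simply ignored.
  false
end
```

For example, piping the tailed log output into a script that calls `sidekiq_handler_stop?` on each line of `ARGF` would print only the matching entries.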
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.
Relates to #387794 (closed)