Fix metrics server not shutting down when Puma is stopped

Stan Hu requested to merge sh-fix-metrics-server-shutdown into master

What does this MR do and why?

Previously, if a separate metrics server was configured and Puma was stopped, the metrics server kept running. When Puma was started again, it tried to spawn a new metrics server, but this failed repeatedly because the port was already taken.
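
The port conflict can be reproduced in isolation. The following is a minimal sketch, not code from this MR: `port_available?` is a hypothetical helper, and a plain `TCPServer` stands in for the leftover metrics server.

```ruby
require 'socket'

# Hypothetical helper: true if we can bind a listener to host:port,
# false if another process (or socket) already holds it.
def port_available?(host, port)
  server = TCPServer.new(host, port)
  server.close
  true
rescue Errno::EADDRINUSE
  false
end

# Stand-in for the leftover metrics server: hold a listening socket open.
# Port 0 asks the OS for any free port, so this does not touch 8083.
leftover = TCPServer.new('127.0.0.1', 0)
port = leftover.addr[1]

busy_while_running = !port_available?('127.0.0.1', port)
leftover.close
free_after_shutdown = port_available?('127.0.0.1', port)

puts "busy while leftover server runs: #{busy_while_running}"
puts "free after it shuts down: #{free_after_shutdown}"
```

As long as the old listener is alive, any new server that tries to bind the same address and port gets `Errno::EADDRINUSE`, which matches the repeated spawn failures described above.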

To fix this, add an at_exit handler that shuts down the process supervisor when Puma exits. This does not help if Puma is killed abruptly, however.
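
The difference between a normal exit and an abrupt kill can be illustrated with a small experiment. This is a sketch under stated assumptions rather than the patch itself: `handler_output` is a hypothetical helper, the message written by the handler stands in for the real supervisor shutdown, and `fork` limits it to Unix-like systems.

```ruby
# Hypothetical helper: runs the given block in a forked child that has an
# at_exit handler registered, and returns whatever the handler wrote
# before the child ended.
def handler_output
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    at_exit do
      # Stand-in for shutting down the process supervisor.
      writer.write('shutting down metrics server')
      writer.close
    end
    yield
  end
  writer.close
  Process.wait(pid)
  reader.read
ensure
  reader&.close
end

# Normal exit: Ruby runs the at_exit handler.
graceful = handler_output { exit(0) }

# SIGKILL cannot be caught, so the handler never gets a chance to run.
killed = handler_output do
  Process.kill(:KILL, Process.pid)
  sleep
end

puts "graceful exit: #{graceful.inspect}"
puts "SIGKILL:       #{killed.inspect}"
```

The handler's output only shows up for the graceful exit, which is why a metrics server can still be left behind after something like `kill -9`.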

Relates to omnibus-gitlab#8109 (closed)

How to set up and validate locally

  1. In an Omnibus GitLab instance, enable the metrics server (https://docs.gitlab.com/ee/administration/monitoring/prometheus/web_exporter.html) in /etc/gitlab/gitlab.rb:
     puma['exporter_enabled'] = true
     puma['exporter_address'] = "127.0.0.1"
     puma['exporter_port'] = 8083
  2. Run gitlab-ctl reconfigure.
  3. Run gitlab-ctl restart puma.
  4. After Puma starts, ps -ef | grep metrics-server should show that the metrics server is running.
  5. Run gitlab-ctl stop puma.
  6. ps -ef | grep metrics-server should show that the metrics server is still running.
  7. Apply this patch to /opt/gitlab/embedded/service/gitlab-rails/metrics_server/metrics_server.rb.
  8. Repeat steps 3 to 6 (restart Puma so that the patch is loaded, then stop it again). This time, metrics-server should be gone.
  9. /var/log/gitlab/gitlab-rails/application_json.log should contain messages like these:
{"severity":"INFO","time":"2023-12-12T20:41:56.921Z","message":"Puma process 449340 is exiting, shutting down metrics server..."}
{"severity":"INFO","time":"2023-12-12T20:41:56.925Z","message":"Puma process 449338 is exiting, shutting down metrics server..."}
{"severity":"INFO","time":"2023-12-12T20:41:56.925Z","message":"Puma process 449342 is exiting, shutting down metrics server..."}
{"severity":"INFO","time":"2023-12-12T20:41:56.925Z","message":"Puma process 449336 is exiting, shutting down metrics server..."}
{"severity":"INFO","time":"2023-12-12T20:41:57.321Z","message":"Puma process 449300 is exiting, shutting down metrics server..."}

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Stan Hu