Fix Prometheus Unicorn metrics not coming back after a HUP

Stan Hu requested to merge sh-fix-prometheus-metrics-cleanup into master

In https://gitlab.com/gitlab-com/infrastructure/issues/1962, we saw that HTTP metrics would often not return after a HUP. This is what was happening:

  1. User runs gitlab-ctl hup unicorn
  2. gitlab-unicorn-wrapper sends SIGUSR2 to unicorn, which tells it to spawn a new master
  3. When the new master is running with the right number of worker processes, gitlab-unicorn-wrapper sends a SIGQUIT to the old master
  4. The gitlab-unicorn-wrapper script exits, causing sv to restart it
  5. The sv run script deletes the metrics files
  6. The new unicorn processes are left accessing files that have already been deleted, so their metrics never reappear
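
The last step can be illustrated with a small Ruby sketch. It assumes POSIX unlink semantics (an open file handle stays usable after the file is deleted) and does not reproduce how the Prometheus client actually writes its files. The file name and the `write_after_unlink` helper are hypothetical.

```ruby
require 'tmpdir'

# Hypothetical helper: open a file, delete it while it is still open,
# then keep writing to it. Returns [visible_on_disk, bytes_written].
def write_after_unlink(dir)
  path = File.join(dir, 'metrics.db')   # hypothetical file name
  writer = File.open(path, 'w')         # stands in for a new unicorn process
  File.delete(path)                     # stands in for the sv run script cleanup
  bytes = writer.write("http_requests_total 1\n")
  writer.flush
  [File.exist?(path), bytes]
ensure
  writer&.close
end

Dir.mktmpdir do |dir|
  visible, bytes = write_after_unlink(dir)
  puts "wrote #{bytes} bytes, file visible on disk: #{visible}"
end
```

The write succeeds, but nothing that reads the directory afterwards can find the data, which matches the symptom of metrics not coming back.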

To avoid this problem, we remove the Prometheus .db files in unicorn's before_exec block, which runs just before the new unicorn process is executed.
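
A minimal sketch of what such a cleanup could look like is shown here; it is not the code from this merge request. `before_exec` is a hook in Unicorn's configuration DSL, while the `remove_prometheus_db_files` helper and the `prometheus_multiproc_dir` environment variable name are assumptions made for illustration.

```ruby
require 'fileutils'

# Hypothetical helper: delete every .db file directly inside dir and
# return the list of paths that were targeted. Other files are left alone.
def remove_prometheus_db_files(dir)
  return [] if dir.nil? || dir.empty?

  files = Dir.glob(File.join(dir, '*.db'))
  FileUtils.rm_f(files)
  files
end

# Inside a unicorn config file, the helper could be wired in roughly like
# this (the environment variable name is an assumption):
#
#   before_exec do |_server|
#     remove_prometheus_db_files(ENV['prometheus_multiproc_dir'])
#   end
```

Doing the cleanup in the hook rather than in the sv run script means the files are removed before the new master starts, not after it is already running.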

Edited by GitLab Release Tools Bot
