Fix missing metrics for Sidekiq exporter server
What does this MR do and why?
Fixes missing sidekiq_exporter_ruby_*
metrics for the Sidekiq exporter server process.
We were accidentally deleting metrics the exporter server exported about itself. Actual sidekiq worker metrics were not affected, nor are request metrics for this server. Furthermore, this only happens when running a dedicated metrics server process for Sidekiq, which is not yet the default, but is the case for SaaS since we're dog-fooding it.
This happened because we would wipe the metrics dir
whenever sidekiq_0
starts up, but this happens after
the metrics server starts, so the worker was deleting
those existing metrics.
The fix moves the cleanup logic out of the shared 7_prometheus
initializer and into the CLI wrapper script for Sidekiq. This is now possible to do because we started using this in SaaS, despite there only being a single Sidekiq worker. We can think of this script as the "Sidekiq supervisor", so any work that affects all Sidekiq processes should happen here, prior to starting them up.
I had to refactor CleanupMultiprocDirService
slightly to take the target directory as an argument since otherwise, initialization might enter a race with Prometheus::Client
being configured first, which isn't always the case.
Screenshots or screen recordings
How to set up and validate locally
- Enable the
sidekiq_exporter
(it's amonitoring
settings) - Make sure
sidekiq_exporter
andsidekiq_health_checks
are not sharing a port (since this will start the metrics server in legacy mode, and the bug will not surface.) - Start Sidekiq (via sidekiq-cluster)
- A metrics server should appear as the
sidekiq-exporter
process - curl its
/metrics
endpoint - You should see metrics starting in
sidekiq_exporter_*
(these were previously AWOL)
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.
Related to #356505 (closed)