Sidekiq RubySampler Prometheus metrics files are initialized twice

Summary

config/initializers/7_prometheus_metrics.rb uses Prometheus::Client.reinitialize_on_pid_change to initialize the Prometheus metrics files. When a new Sidekiq master process starts, the initialization happens twice: once from on_worker_start and once from on_master_start.

This results in two .db files in the folder tmp/prometheus_multiproc_dir/sidekiq:

-rw-r--r--  1 qingyu  staff    4096  1 Oct 18:38 gauge_all_sidekiq-0.db
-rw-r--r--  1 qingyu  staff    4096  1 Oct 18:38 gauge_all_sidekiq-1.db 

I guess the metrics data are still correct. It is just not ideal to have redundant metrics files.

Here is the code in config/initializers/7_prometheus_metrics.rb:

if !Rails.env.test? && Gitlab::Metrics.prometheus_metrics_enabled?
  Gitlab::Cluster::LifecycleEvents.on_worker_start do
    defined?(::Prometheus::Client.reinitialize_on_pid_change) && Prometheus::Client.reinitialize_on_pid_change

    Gitlab::Metrics::Samplers::RubySampler.initialize_instance(Settings.monitoring.ruby_sampler_interval).start
  end

  Gitlab::Cluster::LifecycleEvents.on_master_start do
    ::Prometheus::Client.reinitialize_on_pid_change(force: true)

    if defined?(::Unicorn)
      Gitlab::Metrics::Samplers::UnicornSampler.instance(Settings.monitoring.unicorn_sampler_interval).start
    elsif defined?(::Puma)
      Gitlab::Metrics::Samplers::PumaSampler.instance(Settings.monitoring.puma_sampler_interval).start
    end
  end
end
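
To make the double initialization easier to see, here is a toy model of the two hooks above. It is only a sketch: FakeMetricsClient and FakeLifecycleEvents are hypothetical stand-ins, not the real Prometheus::Client or Gitlab::Cluster::LifecycleEvents, and it assumes that in a Sidekiq process (which has no separate master) both hooks run in the same process.

```ruby
# Toy model only: these classes are hypothetical stand-ins, not the real
# Gitlab::Cluster::LifecycleEvents or Prometheus::Client implementations.

# Stand-in for the metrics client. Each (re)initialization opens a new file.
class FakeMetricsClient
  attr_reader :files

  def initialize
    @pid = nil
    @files = []
  end

  def reinitialize_on_pid_change(force: false)
    # Skip only when the PID is unchanged and the caller did not force it.
    return false if !force && @pid == Process.pid

    @pid = Process.pid
    @files << "gauge_all_sidekiq-#{@files.size}.db"
    true
  end
end

# Stand-in for the lifecycle hooks. Assumption: with no separate master
# process, both kinds of hook simply run in the current process.
module FakeLifecycleEvents
  def self.on_worker_start
    yield
  end

  def self.on_master_start
    yield
  end
end

client = FakeMetricsClient.new

# Same two calls as in the initializer above.
FakeLifecycleEvents.on_worker_start { client.reinitialize_on_pid_change }
FakeLifecycleEvents.on_master_start { client.reinitialize_on_pid_change(force: true) }

puts client.files
```

In this model the first call initializes because no PID has been recorded yet, and the second call initializes again because of force: true, so two file names are recorded for one process.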

Steps to reproduce

These are the steps to reproduce this issue in GDK:

  1. In config/gitlab.yml, enable sidekiq_exporter:

       sidekiq_exporter:
         enabled: true
         address: localhost
         port: 3807

  2. Delete all files under tmp/prometheus_multiproc_dir/sidekiq.

  3. In config/initializers/7_prometheus_metrics.rb, comment out the Prometheus::CleanupMultiprocDirService.new.execute line. This is needed because of issue #33125 (closed), where the RubySampler metrics files for Sidekiq are deleted.

Sidekiq.configure_server do |config|
  config.on(:startup) do
    # webserver metrics are cleaned up in config.ru: `warmup` block
    # Prometheus::CleanupMultiprocDirService.new.execute
    Gitlab::Metrics::SidekiqMetricsExporter.instance.start
  end
end

  4. Run gdk run.
  5. Watch the folder tmp/prometheus_multiproc_dir/sidekiq.
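
As an alternative to watching the folder by hand, a small script along these lines could report metric prefixes that have more than one .db file. This is a rough sketch, not part of GitLab: the helper names are made up, and it assumes file names of the form "<prefix>-<index>.db" as in the listing above. With several Sidekiq processes, multiple files per prefix may be expected, so it is only meaningful for the single-process reproduction described here.

```ruby
# Hypothetical helpers, not part of GitLab. Assumes file names of the form
# "<metric prefix>-<index>.db", as in the listing in this issue.

# Groups the .db file names in dir by their prefix (the part before "-<index>.db").
def metric_files_by_prefix(dir)
  Dir.glob(File.join(dir, "*.db"))
     .map { |path| File.basename(path) }
     .sort
     .group_by { |name| name[/\A(.+)-\d+\.db\z/, 1] || name }
end

# Keeps only the prefixes that have more than one file.
def redundant_metric_files(dir)
  metric_files_by_prefix(dir).select { |_prefix, names| names.size > 1 }
end

if __FILE__ == $PROGRAM_NAME
  dir = ARGV.fetch(0, "tmp/prometheus_multiproc_dir/sidekiq")
  redundant_metric_files(dir).each do |prefix, names|
    puts "#{prefix}: #{names.join(', ')}"
  end
end
```

Run from the GitLab root directory after gdk run, it prints one line per prefix that has more than one file, and nothing if each prefix has a single file.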

Possible fixes

I do not have a proposed fix yet.

Edited by Qingyu Zhao