
Fix for sidekiq prometheus files being wiped accidentally

Matthias Käppler requested to merge 33125-fix-sidekiq-prometheus into master

What does this MR do?

We have a Rails initializer for Prometheus, which is run by both rails-web and Sidekiq. We have seen an issue (cf. #33125 (closed)) where Prometheus's *.db files in the Sidekiq multiproc folder get wiped accidentally, because the cleanup logic runs after the Prometheus client is initialized, which is when these files are first created. This apparently wasn't a problem until recently, since the Sidekiq init logic relied on the Prometheus client init happening elsewhere, and in particular after its own init hook (although I'm unsure how or why; maybe someone can clarify?).

A simple but not super clean fix is to run Prometheus::Client.reinitialize_on_pid_change again after running the cleanup.
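To illustrate the ordering problem and why re-running the init after the cleanup helps, here is a minimal, self-contained sketch. It does not use the real Prometheus client: `FakeMetricsClient`, `cleanup_stale_files`, and `boot` are hypothetical stand-ins, and the only assumption carried over from the description above is that client initialization creates the per-process *.db files.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical stand-in for the Prometheus client (NOT the real API).
# Assumption: initializing the client creates a per-process *.db file
# in the multiprocess directory.
class FakeMetricsClient
  def initialize(dir)
    @dir = dir
  end

  # Plays the role of re-running the client init: (re)creates the file.
  def reinitialize
    FileUtils.touch(File.join(@dir, "counter_#{Process.pid}.db"))
  end
end

# Hypothetical cleanup step: removes all *.db files, stale or not.
def cleanup_stale_files(dir)
  FileUtils.rm_f(Dir.glob(File.join(dir, "*.db")))
end

# Simulates the boot order described above and returns the surviving files.
def boot(dir, reinit_after_cleanup:)
  client = FakeMetricsClient.new(dir)
  client.reinitialize                         # client init runs first
  cleanup_stale_files(dir)                    # cleanup wipes the fresh files too
  client.reinitialize if reinit_after_cleanup # the fix: init again afterwards
  Dir.glob(File.join(dir, "*.db")).map { |f| File.basename(f) }
end

BROKEN_RESULT = Dir.mktmpdir { |dir| boot(dir, reinit_after_cleanup: false) }
FIXED_RESULT  = Dir.mktmpdir { |dir| boot(dir, reinit_after_cleanup: true) }
```

Without the second init the directory ends up empty; with it, the current process's file exists again. Whether a second init is safe with the real client is exactly the open question raised in this MR.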

I am not confident that this doesn't introduce other problems, so it would be good if someone with more Prometheus experience could verify that this is safe to do.

Alternatively, I wonder if there is a way to hook into life-cycle events in such a way that we can run the cleanup once, then wait for all processes to come up and run through the prometheus init just once, instead of doing all of this for every process/worker?

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

I have only tested this locally, where I can confirm that the necessary sidekiq/*.db files now survive the application boot.

Closes #33125 (closed)

Edited by 🤖 GitLab Bot 🤖
