Fix for sidekiq prometheus files being wiped accidentally
What does this MR do?
We have a Rails initializer for Prometheus that is run by both rails-web and Sidekiq. We have seen an issue (cf. #33125 (closed)) where Prometheus's `*.db` files in the Sidekiq multiproc folder get wiped accidentally, because the cleanup logic runs after the Prometheus client is initialized (which is when these files are first created). This apparently wasn't a problem until recently, since the Sidekiq init logic relied on the Prometheus client init happening elsewhere and, in particular, after its own init hook (although I'm unsure how or why; maybe someone can clarify?).
A simple but not super clean fix is to run `Prometheus::Client.reinitialize_on_pid_change` again after running the cleanup.
I am not confident that this doesn't introduce other problems, so it would be good if someone with more experience in prometheus could verify that this is safe to do.
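To make the ordering problem concrete, here is a minimal, self-contained sketch. It does not use the real Prometheus client: `FakeMetricsClient`, `cleanup_multiproc_dir`, `boot`, and the `metrics_<pid>.db` file name are hypothetical stand-ins chosen for illustration, on the assumption that client initialization creates a `*.db` file and that the cleanup deletes every `*.db` file in the folder. In the actual initializer, the re-creation step would be the `Prometheus::Client.reinitialize_on_pid_change` call.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical stand-in for the Prometheus client. Initializing it creates a
# per-process *.db file in the multiproc directory (assumed behavior).
class FakeMetricsClient
  def initialize(dir)
    @dir = dir
  end

  # Mimics client (re)initialization: makes sure the .db file exists.
  def reinitialize
    FileUtils.touch(db_path)
  end

  # Illustrative file name only; the real client picks its own names.
  def db_path
    File.join(@dir, "metrics_#{Process.pid}.db")
  end
end

# Hypothetical cleanup: removes every *.db file in the multiproc directory.
def cleanup_multiproc_dir(dir)
  Dir.glob(File.join(dir, '*.db')).each { |file| File.delete(file) }
end

# Simulates one process boot and reports whether the .db file survived.
def boot(dir, reinitialize_after_cleanup:)
  client = FakeMetricsClient.new(dir)
  client.reinitialize                               # client init creates the file
  cleanup_multiproc_dir(dir)                        # cleanup runs too late and wipes it
  client.reinitialize if reinitialize_after_cleanup # the fix: initialize again
  File.exist?(client.db_path)
end
```

Calling `boot(dir, reinitialize_after_cleanup: false)` on a temporary directory reproduces the bug (the file is gone after boot), while passing `true` leaves the file in place. The sketch says nothing about whether reinitializing twice is safe with the real client, which is the open question above.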
Alternatively, I wonder if there is a way to hook into life-cycle events so that we can run the cleanup once, then wait for all processes to come up and run through the Prometheus init just once, instead of doing all of this for every process/worker?
Does this MR meet the acceptance criteria?
- Changelog entry
- Documentation created/updated or follow-up review issue created
- Code review guidelines
- Merge request performance guidelines
- Style guides
- Separation of EE specific content
Availability and Testing
I tested this only locally, and I can confirm that the necessary `sidekiq/*.db` files now survive the application boot.
Closes #33125 (closed)