Extract report logic into Reporter class
What does this MR do and why?
We recently added the ability to collect runtime diagnostics dumps from gitlab-rails and upload these to GCS. For the MVC, we only had one such report, which was running on a timer.
Because we only had a single report, it wasn't entirely clear at the time which parts of this system would need to be reused for multiple reports, so in the spirit of iteration we didn't overthink this. Now that we are looking to add a second kind of report, heap dumps, some cracks started to show in these abstractions.
In order to streamline logic to handle multiple kinds of reports and the signals that trigger them, I made the following changes:
- The actual
Report
types are very light-weight now and conform to a duck-typed API: they have aname
, and arun
method that accepts awriter
into which they emit their data. This will most likely be some kind ofIO
. - The
ReportsDaemon
, a timer thread that was used to trigger the original report is now greatly slimmed down and any remaining logic not related to timers has been moved to theReporter
class. This is a direct follow-up to !104264 (merged), where I had started doing this. - The
Reporter
type is now responsible for providing awriter
to each report; this means all logic from the originalJemalloc
report that involved streaming data to a temp file has moved here since we will re-use this later to produce heap dumps.
In summary, the old flow was:
ReportsDaemon
-> trigger N reports based on a timer
-> ask each report where they wrote their data
Report(s):
-> stream data to a temp dir using a self-chosen path
-> return this path to the caller
-> move report data from that path to an upload queue/directory
The new flow is:
ReportsDaemon
-> creates a new `Reporter`
Reporter
-> prepare a temp dir where reports are written
-> construct a report file writer
-> hand this ready-to-use writer to each report
Report(s)
-> stream data into the writer; that's it
-> move report data from that path to an upload queue/directory
Plus, the Reporter
can also be used outside of the ReportsDaemon
as is the case with HeapDump
, which is not triggered by a timer but rather an on-process-stop signal.
Behavioral changes
While this is largely a refactor, I had to make a few behavioral changes because Reporter
was a new class while ReportsDaemon
had some behavior like error handling baked in already. Changes I made:
- Errors are now caught in
Reporter#run_report
and logged. - I added a
started
log event; I find that these can be useful in case a thread or process gets stuck, in which case there would never be afinished
orfailed
event. - The
finished
log event now contains the full path to the report file.
Screenshots or screen recordings
Screenshots are required for UI changes, and strongly recommended for all other merge requests.
How to set up and validate locally
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.
Related to #370077 (closed)