Avoid using mmap if not required and speed up metrics collection
Avoid using MMAP when parsing metrics
When parsing many metrics files to generate the Prometheus endpoint response, we don't actually need MMAP. We can instead use the much safer option of simply reading each file with standard File methods. These operations are also a bit faster, because plain file access requires fewer operations than MMAP.
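As a rough illustration of reading a metrics file without MMAP, the sketch below loads a whole file with File.binread and decodes it with String#unpack. The file layout (a flat sequence of native-endian 8-byte doubles) and the name read_doubles are assumptions made for this example only; they are not the gem's actual on-disk format or API.

```ruby
require 'tmpdir'

# Hypothetical helper: reads the whole file into a binary String and decodes
# it as a sequence of native-endian doubles. No memory mapping is involved,
# so nothing can be unmapped underneath the String while it is being parsed.
def read_doubles(path)
  File.binread(path).unpack('d*')
end

# Usage: write two doubles to a temporary file and read them back.
Dir.mktmpdir do |dir|
  path = File.join(dir, 'example_metrics.bin')
  File.binwrite(path, [1.0, 2.5].pack('d*'))
  values = read_doubles(path)
  puts values.inspect
end
```

The String returned by File.binread owns its own copy of the data, which is what makes this approach safer than parsing directly out of mapped memory.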
Introduce checks for PID change on each metric operation
When a worker is forked from the master process, all its metrics are copied over. This includes file handles and locks that are already in use. Previously we relied on Unicorn hooks to reset the metrics when the process is forked, but that is not sufficient to guard against possible corruption.
Because a metric can fire between the time when the process is forked and when metrics are reset, we cannot guarantee that files will be reinitialized in time.
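The idea of checking the PID on every metric operation can be sketched as follows. This is a minimal in-memory illustration, not the gem's implementation: the class name PidAwareCounter, the injectable pid_source, and the resets counter are all hypothetical, and a real reset would reopen the metric files rather than zero a variable.

```ruby
# Hypothetical counter that remembers which process created its state and
# reinitializes itself as soon as it is used from a different process.
class PidAwareCounter
  attr_reader :resets

  # pid_source is injectable only so the behaviour can be exercised without
  # actually forking; by default it asks the OS for the current PID.
  def initialize(pid_source = -> { Process.pid })
    @pid_source = pid_source
    @pid = @pid_source.call
    @value = 0
    @resets = 0
  end

  def increment(by = 1)
    reset_if_forked
    @value += by
  end

  def value
    reset_if_forked
    @value
  end

  private

  # Runs before every operation, so no metric can be recorded against state
  # inherited from the parent process.
  def reset_if_forked
    current = @pid_source.call
    return if current == @pid

    @pid = current
    @value = 0 # a real implementation would reopen files and locks here
    @resets += 1
  end
end

counter = PidAwareCounter.new
counter.increment
puts counter.value
```

Because the check runs inside each operation, it does not depend on a fork hook having fired first.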
This method allows a double to be read and parsed from MMAP'ed memory atomically. With the previous implementation, the underlying memory could be freed between reading and parsing.
The previous Ruby implementation used code composed of two method calls: slice(0..3), which returns a String that internally points to mmapped memory, and unpack('d'), which parses that string into a double.
It can theoretically happen that right after slice(0..3) is called and before unpack('d') executes, the underlying memory is freed by another thread. This would cause unpack('d') to access invalid memory, potentially leading to many different errors.
In manual testing, adding access synchronization with additional locks in the code that instruments methods in GitLab resulted in a significant (10x+) slowdown of the execution time of the whole process.
Porting this method to C made it possible to use the GIL to synchronize access, and to use the speed of C to optimize code that is executed whenever a new metric is initialized. That code was also a contributor to the initial warm-up time of code that relies heavily on instrumentation with Prometheus metrics.
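For comparison, the pure-Ruby alternatives discussed above can be sketched like this. The sketch uses an ordinary String instead of mmapped memory, and the names read_double_two_step and LockedDoubleReader are hypothetical. It reads 8 bytes per value, which is the size of a native double for the 'd' directive. The first helper shows the two-step read where another thread could run between the slice and the unpack; the second wraps both steps in a Mutex, which is the kind of extra locking that proved too slow in testing.

```ruby
# Two separate steps: take a byte slice, then parse it. With mmapped memory,
# another thread could invalidate the slice between these two calls.
def read_double_two_step(buffer, offset)
  slice = buffer.byteslice(offset, 8)
  slice.unpack1('d')
end

# Lock-based variant: both steps run while holding a Mutex, so any other
# code that takes the same lock cannot interleave with the read.
class LockedDoubleReader
  def initialize(buffer)
    @buffer = buffer
    @lock = Mutex.new
  end

  def read_double(offset)
    @lock.synchronize do
      @buffer.byteslice(offset, 8).unpack1('d')
    end
  end
end

buffer = [1.5, 2.25].pack('d2')
reader = LockedDoubleReader.new(buffer)
puts read_double_two_step(buffer, 0)
puts reader.read_double(8)
```

The lock only helps if every writer and unmapper takes it too, which is why the cost shows up across all instrumented code paths.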
Refactor metric processing and representation into a separate module to aid with optimization and rewriting in C
Meant as a helper to diagnose and parse various data files. It can output undigested JSON representing all data points, and it can output Prometheus text as well.
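To make the two output modes concrete, here is a small sketch that renders the same data points as raw JSON and as Prometheus text exposition lines. The data point shape (name, labels, value) and the helper names to_json_dump and to_prometheus_text are assumptions for illustration and do not reflect the helper's real interface. Label value escaping is deliberately left out.

```ruby
require 'json'

# Hypothetical in-memory representation of parsed data points.
points = [
  { name: 'http_requests_total', labels: { method: 'get' }, value: 3.0 },
  { name: 'up', labels: {}, value: 1.0 }
]

# Undigested dump: every data point serialized as-is.
def to_json_dump(points)
  JSON.generate(points)
end

# Prometheus text format: one "name{label="value",...} value" line per point.
def to_prometheus_text(points)
  lines = points.map do |point|
    labels = point[:labels].map { |key, val| %(#{key}="#{val}") }.join(',')
    labels = "{#{labels}}" unless labels.empty?
    "#{point[:name]}#{labels} #{point[:value]}"
  end
  lines.join("\n") + "\n"
end

puts to_json_dump(points)
puts to_prometheus_text(points)
```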