Fix broken gauge/all metric emission

Emission of gauges that use the all aggregation had been broken by memory and performance optimizations we made earlier.

We emitted a metric group as soon as a pid file containing gauges with the all aggregation was read. Since we process files in parallel, tuples from the same metric (but different pids) would end up interleaved with other metric data. We also couldn't collect these tuples first and emit them together as one group, because the existing mechanism was keying on the encoded metric string, which does not contain a pid. Sorting samples by this string was incorrect for the same reason: the pid label was only injected in the renderer, after these metrics had already been emitted and sorted.
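To illustrate the keying problem described above, here is a minimal sketch. The reading tuples, the key strings, and the pid names are made up for illustration and do not reflect the exporter's real data format. It only shows why a key without a pid cannot hold one sample per process.

```python
# Hypothetical readings: (pid, encoded metric key, value), in the order they
# might arrive when several pid files are read in parallel.
readings = [
    ("puma_0", "workers|kind=busy", 3.0),
    ("puma_1", "queue|kind=jobs", 7.0),
    ("puma_1", "workers|kind=busy", 5.0),
]

# Keying on the encoded string alone: the two "workers" readings collide,
# so the sample from puma_0 is overwritten by the one from puma_1.
by_encoded = {}
for pid, key, value in readings:
    by_encoded[key] = value

# Keying on (encoded string, pid) keeps one entry per process, which is what
# a gauge with the all aggregation needs.
by_key_and_pid = {}
for pid, key, value in readings:
    by_key_and_pid[(key, pid)] = value
```

With the pid missing from the key, only two entries survive; with the pid included, all three do.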

I fixed all these problems by going back to the original design, in which we do not emit encoded metrics. We now always emit decoded Samples, and decoding happens in the mmap probe rather than the renderer, which makes more sense too. The renderer is a cross-probe concept and must work with all kinds of probes, whereas decoding only applies to mmap files, so the encoded metric never belonged in a Sample.
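The design described above could be sketched roughly as follows. This is not the project's actual code: the Sample fields, the JSON key layout, and the function names are assumptions chosen for illustration. The point is that once the probe decodes entries and attaches the pid label itself, grouping and sorting operate on complete samples.

```python
import json
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Sample:
    # Hypothetical decoded sample; the real type may carry different fields.
    name: str
    labels: tuple  # sorted (label, value) pairs, including pid
    value: float


def decode(encoded_key, value, pid):
    # Assumed key layout: a JSON array of [metric name, labels object].
    name, labels = json.loads(encoded_key)
    labels = dict(labels, pid=pid)  # the pid is attached at decode time
    return Sample(name, tuple(sorted(labels.items())), value)


def group_by_metric(samples):
    # Sort on the decoded name and labels (pid included), then group by name
    # so each metric's samples stay together.
    ordered = sorted(samples, key=lambda s: (s.name, s.labels))
    return {name: list(group) for name, group in groupby(ordered, key=lambda s: s.name)}


entries = [
    ('["workers", {"kind": "busy"}]', 3.0, "puma_1"),
    ('["queue", {"kind": "jobs"}]', 7.0, "puma_0"),
    ('["workers", {"kind": "busy"}]', 5.0, "puma_0"),
]
samples = [decode(*entry) for entry in entries]
groups = group_by_metric(samples)
```

A renderer consuming `groups` would then never need to know that the samples originated from mmap files.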

This comes with additional memory use, so we may need to look for optimizations elsewhere. On the plus side, the data structures are much simpler and more consistent now.

I also added a regression test that performs a sanity check on merged probe results to verify that they do not appear out of order.
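The kind of ordering check such a test performs might look like the sketch below. It is an assumption about the general approach, not the test added in this merge request, and the function name is hypothetical. It walks the metric names of merged results in emission order and reports the first name that reappears after a different metric has started.

```python
def first_interleaved_name(names):
    """Return the first metric name that shows up again after its group
    has already ended, or None if every metric forms one contiguous run."""
    finished = set()
    previous = None
    for name in names:
        if name != previous:
            if name in finished:
                return name  # this metric's samples were split apart
            finished.add(name)
            previous = name
    return None


in_order = ["queue", "workers", "workers"]
interleaved = ["workers", "queue", "workers"]
```

Here `first_interleaved_name(in_order)` returns `None`, while `first_interleaved_name(interleaved)` returns `"workers"`.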

Edited by Matthias Käppler
