
Implement metrics parsing in pure C

Paweł Chojnacki requested to merge waay_too_much_c_code into master

This MR reimplements in C all of the metrics parsing used by the exporter endpoint, in order to speed that endpoint up. Before this branch, the test data used for profiling took roughly 27s to parse on a 2017 MacBook. With the changes introduced by this MR, the same data now parses in about 100ms, which is a speedup of approximately 270x.

Some optimizations done in this branch include:

  • simplified JSON parsing: instead of using a big and relatively slow JSON library, we use JSMN, a zero-copy parser/tokenizer.
  • using only one flat hashmap/hashset to aggregate repeated observations of the same metric; before, this was done with nested hashmaps.
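The flat-hashmap idea can be sketched roughly as follows. This is a minimal illustration, not the code from this branch: the fixed table size, the FNV-1a hash, the key format, the summing merge, and the names `flat_map_t`, `flat_map_add` and `flat_map_get` are all assumptions made for the example. The point is only that one string key per observation replaces a lookup through several nested maps.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE 64   /* power of two; fixed size keeps the sketch short */
#define KEY_MAX 128

typedef struct {
    char key[KEY_MAX];  /* whole entry string, e.g. name{label="value"} */
    double value;
    int used;
} slot_t;

typedef struct {
    slot_t slots[TABLE_SIZE];
    size_t count;
} flat_map_t;           /* hypothetical type, zero-initialize before use */

/* FNV-1a string hash (an assumption; any reasonable hash would do). */
static uint32_t hash_str(const char *s) {
    uint32_t h = 2166136261u;
    for (; *s; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    return h;
}

/* Add `value` to the entry stored under `key`, creating it if absent.
 * Returns 0 on success, -1 if the key is too long or the table is full. */
static int flat_map_add(flat_map_t *m, const char *key, double value) {
    assert(m != NULL && key != NULL);
    if (strlen(key) >= KEY_MAX) return -1;
    uint32_t i = hash_str(key) & (TABLE_SIZE - 1);
    for (size_t probes = 0; probes < TABLE_SIZE; probes++) {
        slot_t *s = &m->slots[i];
        if (!s->used) {                    /* empty slot: new entry */
            strcpy(s->key, key);
            s->value = value;
            s->used = 1;
            m->count++;
            return 0;
        }
        if (strcmp(s->key, key) == 0) {    /* same key: aggregate in place */
            s->value += value;
            return 0;
        }
        i = (i + 1) & (TABLE_SIZE - 1);    /* linear probing */
    }
    return -1;
}

/* Look up an entry by its full key; returns NULL if it is not present. */
static const slot_t *flat_map_get(const flat_map_t *m, const char *key) {
    uint32_t i = hash_str(key) & (TABLE_SIZE - 1);
    for (size_t probes = 0; probes < TABLE_SIZE; probes++) {
        const slot_t *s = &m->slots[i];
        if (!s->used) return NULL;
        if (strcmp(s->key, key) == 0) return s;
        i = (i + 1) & (TABLE_SIZE - 1);
    }
    return NULL;
}
```

With nested maps, each observation would need one lookup per nesting level; here it needs a single hash of the whole key.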

Many of the algorithmic differences were introduced in order to simplify the C implementation. Currently, the process is as follows:

  • An array of file paths and metadata is passed to the C code.
  • Each file is read and its entries are stored in a common hashmap; if an entry with the same key already exists, the two are merged according to the rules for that metric type.
  • The entries in the hashmap are sorted.
  • The sorted entries are rendered to the Prometheus text format.
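The merge, sort and render steps above can be sketched as below. This is an illustration under stated assumptions, not the actual implementation: the two merge rules (counters are summed, gauges keep the maximum), the `entry_t` layout and the function names are hypothetical, a linear search stands in for the hashmap lookup, and the `# HELP` / `# TYPE` lines of the Prometheus text format are left out.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical metric types with assumed merge rules. */
typedef enum { TYPE_COUNTER, TYPE_GAUGE_MAX } metric_type_t;

typedef struct {
    const char *key;     /* whole entry string, e.g. name{label="value"} */
    metric_type_t type;
    double value;
} entry_t;

/* Merge `src` into `dst` according to the (assumed) rule for its type. */
static void merge_entry(entry_t *dst, const entry_t *src) {
    assert(dst->type == src->type);
    if (dst->type == TYPE_COUNTER)
        dst->value += src->value;          /* counters: sum */
    else if (src->value > dst->value)
        dst->value = src->value;           /* gauges: keep the maximum */
}

/* Collapse entries with equal keys. `out` must have room for `n` entries.
 * The linear search stands in for the hashmap lookup. Returns the count. */
static size_t aggregate(const entry_t *in, size_t n, entry_t *out) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        size_t j = 0;
        while (j < count && strcmp(out[j].key, in[i].key) != 0) j++;
        if (j < count) merge_entry(&out[j], &in[i]);
        else out[count++] = in[i];
    }
    return count;
}

static int cmp_entry(const void *a, const void *b) {
    return strcmp(((const entry_t *)a)->key, ((const entry_t *)b)->key);
}

/* Sort entries by key and write one "key value" line per entry into `buf`.
 * Returns the number of characters written, or -1 if `buf` is too small. */
static int render(entry_t *entries, size_t n, char *buf, size_t cap) {
    if (cap == 0) return -1;
    buf[0] = '\0';
    qsort(entries, n, sizeof *entries, cmp_entry);
    size_t off = 0;
    for (size_t i = 0; i < n; i++) {
        int w = snprintf(buf + off, cap - off, "%s %g\n",
                         entries[i].key, entries[i].value);
        if (w < 0 || (size_t)w >= cap - off) return -1;
        off += (size_t)w;
    }
    return (int)off;
}
```

Sorting after aggregation means the sort runs once over the deduplicated entries rather than over every raw observation.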

There is one caveat to the current setup: when ingesting data, entries are stored in the hashmap using the whole entry string as the key. This means that the order of labels matters: the same labels in a different order are treated as a separate entry. The current implementation of label saving relies on Ruby's ordering of KV entries, which could prove problematic if that ordering were to change at runtime.
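To make the caveat concrete, the sketch below builds keys from the same two labels in different orders and shows that the resulting strings differ. The key format and the names `label_t`, `build_key` and `build_canonical_key` are assumptions for illustration. Sorting labels by name before building the key is shown only as one possible mitigation; it is not something this MR does.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    const char *name;
    const char *value;
} label_t;              /* hypothetical label pair */

static int cmp_label(const void *a, const void *b) {
    return strcmp(((const label_t *)a)->name, ((const label_t *)b)->name);
}

/* Build metric{a="1",b="2"} with labels in the order given.
 * Returns 0 on success, -1 if `buf` is too small. */
static int build_key(char *buf, size_t cap, const char *metric,
                     const label_t *labels, size_t n) {
    assert(buf != NULL && cap > 0);
    int w = snprintf(buf, cap, "%s{", metric);
    if (w < 0 || (size_t)w >= cap) return -1;
    size_t off = (size_t)w;
    for (size_t i = 0; i < n; i++) {
        w = snprintf(buf + off, cap - off, "%s%s=\"%s\"",
                     i ? "," : "", labels[i].name, labels[i].value);
        if (w < 0 || (size_t)w >= cap - off) return -1;
        off += (size_t)w;
    }
    w = snprintf(buf + off, cap - off, "}");
    if (w < 0 || (size_t)w >= cap - off) return -1;
    return 0;
}

/* Possible mitigation (an assumption, not part of this MR): sort the
 * labels by name first, so the key no longer depends on insertion order.
 * Note that this reorders the caller's `labels` array in place. */
static int build_canonical_key(char *buf, size_t cap, const char *metric,
                               label_t *labels, size_t n) {
    qsort(labels, n, sizeof *labels, cmp_label);
    return build_key(buf, cap, metric, labels, n);
}
```

With `build_key`, the label orders `method, code` and `code, method` yield two distinct keys and therefore two separate hashmap entries; with `build_canonical_key` both collapse to the same key.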

All code has been run on a server and checked for long-term leaks.

In addition, all the necessary C code from the mmap gem was copied over and cleaned up to some extent; it could be simplified further in the future with some effort. This means that this gem no longer requires the mmap2 gem.

Data writing and reading were wholly ported to C as well to speed up metrics initialization even more.

Edited by Paweł Chojnacki
