Metrics parsing is slow

Parsing metrics from mmaped files to display them is very slow when having too many metrics.

Posible actions

  • limit the number of metrics in the system until the performance of parsing is improved
  • optimize + rewrite some parts into more efficient language - C/C++

Possible optimizations areas

  • JSON usage for serializing, is showing on Prof because we have to parse small JSON for each metric
    • the JSON entries used are quite small and not complicated, writing very simple optimized parser for subset of the code should be possible, and should give at least some performance improvements
    • switch JSON to something else (protobuf, messagepack, etc...)
  • Data processing, after reading the metrics, all data is processed using many intermediary hashmaps. There are quite a few opportuninities where native code could outperform current implementation
  • Investigate possibility of improvements of processing algorithms

Current performance characteristic:

On 2017 MBP it takes ~5s to process ~50k metric entries which represents ~10k metrics (this includes all histogram sub countrers.

Example RubyProf outout

pawel$ ./bin/parse -t -p ~/data/a1/*.db  > /dev/null
Measure Mode: wall_time
Thread ID: 70205279648260
Fiber ID: 70205280417060
Total: 5.534470
Sort by: self_time

 %self      total      self      wait     child     calls  name
  9.18      0.508     0.508     0.000     0.000    45735   JSON::Ext::Parser#parse
  3.79      0.210     0.210     0.000     0.000   850632   String#hash
  3.04      0.179     0.168     0.000     0.011   121446   Kernel#format
  2.62      0.145     0.145     0.000     0.000   576964   String#eql?
  2.36      0.440     0.131     0.000     0.310       51   Enumerator::Generator#each
  2.18      0.120     0.120     0.000     0.000    45735   Array#zip
  2.12      0.125     0.117     0.000     0.007    45786   Array#reject
  1.85      0.179     0.102     0.000     0.077    45735   JSON::Ext::Parser#initialize
  1.78      5.460     0.098     0.000     5.361      232  *Array#each
  1.61      0.751     0.089     0.000     0.661   564896  *Array#hash
  1.61      0.206     0.089     0.000     0.117    45735   Enumerator::Yielder#yield
  1.53      0.085     0.085     0.000     0.000    91572   String#unpack
  1.39      0.077     0.077     0.000     0.000   411615   Hash#key?
  1.37      0.287     0.076     0.000     0.211    45837   Class#new
  1.35      0.151     0.075     0.000     0.076    91470   Prometheus::Client::Helper::EntryParser#type
  1.31      0.692     0.072     0.000     0.620    91470   Hash#fetch
  1.26      0.070     0.070     0.000     0.000    62090   String#gsub
  1.01      0.862     0.056     0.000     0.806    45735   <Module::Prometheus::Client::Helper::JsonParser>#load
  0.89      0.445     0.049     0.000     0.395   386356  *Array#eql?
  0.87      0.050     0.048     0.000     0.001   100400   Prometheus::Client::Helper::EntryParser#parts
  0.78      0.806     0.043     0.000     0.763    45735   <Module::JSON>#parse
  0.76      0.122     0.042     0.000     0.080    61860   <Module::Prometheus::Client::Helper::MetricsRepresentation>#escape
  0.56      0.031     0.031     0.000     0.000       51   <Class::IO>#read
  0.56      0.031     0.031     0.000     0.000    45837   String#slice
  0.54      0.030     0.030     0.000     0.000    91470   String#to_sym
  0.50      0.059     0.028     0.000     0.031    45837   Prometheus::Client::Helper::PlainFile#slice
  0.46      0.025     0.025     0.000     0.000    18222   Array#join
  0.44      0.024     0.024     0.000     0.000    13787   Array#to_h
  0.37      0.302     0.020     0.000     0.282    13787   <Module::Prometheus::Client::Helper::MetricsRepresentation>#format_labels
  0.30      0.016     0.016     0.000     0.000    91521   Fixnum#zero?
  0.22      0.392     0.012     0.000     0.380    13821  *Hash#each
  0.20      3.939     0.011     0.000     3.928    13757   Enumerable#each_with_object
  0.19      0.011     0.011     0.000     0.000    13787   Float#to_s
  0.18      0.010     0.010     0.000     0.000    61416   String#to_s
  0.14      0.008     0.008     0.000     0.000    47561   Kernel#nil?
  0.12      0.043     0.007     0.000     0.036    13787   <Module::Prometheus::Client::Helper::MetricsRepresentation>#metric
  0.10      0.013     0.006     0.000     0.007     4465   Prometheus::Client::Helper::EntryParser#pid
  0.04      0.004     0.002     0.000     0.002     4465   Prometheus::Client::Helper::EntryParser#multiprocess_mode
  0.03      0.002     0.002     0.000     0.000       51   Hash#reject
  0.03      0.001     0.001     0.000     0.000     4465   Array#[]
  0.02      0.001     0.001     0.000     0.000     4942   Kernel#hash
  0.01      0.001     0.001     0.000     0.000     2854   Kernel#eql?
  0.01      0.000     0.000     0.000     0.000       51   <Class::File>#basename
  0.01      0.032     0.000     0.000     0.031       51   Prometheus::Client::Helper::PlainFile#initialize
  0.01      0.033     0.000     0.000     0.032       52   Array#map
  0.00      0.000     0.000     0.000     0.000     1206   Numeric#eql?
  0.00      1.468     0.000     0.000     1.468        1   Hash#each_value
  0.00      0.001     0.000     0.000     0.000      102   Prometheus::Client::Helper::EntryParser#used
  0.00      0.000     0.000     0.000     0.000       51   String#split
  0.00      0.477     0.000     0.000     0.477      115   Enumerable#map
  0.00      0.468     0.000     0.000     0.468       51   Prometheus::Client::Helper::EntryParser#parsed_entries
  0.00      3.673     0.000     0.000     3.673       51   Prometheus::Client::Helper::EntryParser#to_metrics
  0.00      0.000     0.000     0.000     0.000       51   Enumerator#initialize
  0.00      0.000     0.000     0.000     0.000      286   NilClass#to_s
  0.00      0.001     0.000     0.000     0.000       51   Prometheus::Client::Helper::EntryParser#empty?
  0.00      1.431     0.000     0.000     1.431       64   <Module::Prometheus::Client::Helper::MetricsProcessing>#merge_samples
  0.00      0.001     0.000     0.000     0.001       51   Prometheus::Client::Helper::EntryParser#entries
  0.00      0.000     0.000     0.000     0.000       51   Prometheus::Client::Helper::PlainFile#size
  0.00      0.000     0.000     0.000     0.000        1   IO#write
  0.00      0.440     0.000     0.000     0.440       51   Enumerator#each
  0.00      5.534     0.000     0.000     5.534        1   Object#run
  0.00      0.000     0.000     0.000     0.000       64   Symbol#to_s
  0.00      0.000     0.000     0.000     0.000      112   Fixnum#to_s
  0.00      5.534     0.000     0.000     5.534        1   Object#to_prom_text
  0.00      0.000     0.000     0.000     0.000       52   Enumerable#max
  0.00      0.000     0.000     0.000     0.000       78   FalseClass#to_s
  0.00      0.000     0.000     0.000     0.000       52   Float#<=>
  0.00      0.000     0.000     0.000     0.000        1   Kernel#puts
  0.00      1.829     0.000     0.000     1.829        2   Object#transform
  0.00      0.361     0.000     0.000     0.361        1   <Module::Prometheus::Client::Helper::MetricsRepresentation>#to_text
  0.00      0.000     0.000     0.000     0.000        2   Kernel#method
  0.00      0.000     0.000     0.000     0.000        2   Method#to_proc
  0.00      1.468     0.000     0.000     1.468        1   <Module::Prometheus::Client::Helper::MetricsProcessing>#merge_metrics
  0.00      0.000     0.000     0.000     0.000        1   IO#puts

* indicates recursively called methods

Steps to reproduce:

Here is sample metrics files taken from a test GitLab system running with 4 workers. https://drive.google.com/file/d/1jh1Cf51fGsv9UMUXnQwCV1CI1jZ8oSjf/view?usp=sharing

In MR https://gitlab.com/gitlab-org/prometheus-client-mmap/merge_requests/11 there is command bin/parse that is able to parse the files and give Profiling output.

Edited Nov 20, 2017 by Paweł Chojnacki
Assignee Loading
Time tracking Loading