
Parse extra payload parameters for Generic Alert

What does this MR do?

Related to the Generic Alert Endpoint feature.

Besides the expected payload parameters, an external monitoring tool can (and will) send additional data in its payload. All those extra parameters will be listed in the issue description.

Since these parameters can be nested, we inline them by merging their keys.

For example:

{
  "foo": {
    "bar": {
      "baz": "Foo Bar Baz"
    }
  }
}

Becomes

{
  "foo.bar.baz": "Foo Bar Baz"
}

Within the Generic Alert Endpoint feature, we are going to receive payloads from various monitoring tools. We expect those tools to send some predefined parameters, such as:

  • title
  • start_time
  • description
  • monitoring_tool
  • service
  • hosts

On top of those parameters, we expect the payload to contain many additional parameters. They can be anything our customers need to log.
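For illustration, a hypothetical incoming payload might mix the predefined parameters with arbitrary extras (the `labels` block here is invented, not part of the spec):

```json
{
  "title": "CPU usage above 95%",
  "start_time": "2019-09-13T10:20:02Z",
  "monitoring_tool": "Prometheus",
  "hosts": ["web-01"],
  "labels": {
    "severity": "critical",
    "team": "sre"
  }
}
```

Everything outside the predefined keys (here, `labels` and its children) would be inlined and listed in the issue description.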

This is how we are going to display them.

(Screenshot, 2019-09-13: issue description listing the inlined extra parameters)

  1. Why do we merge the keys (using `.`)?

According to the design, nested params are displayed with their keys merged by dots. That's why the default connector is a dot.
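A quick sketch of the intended behaviour, using a stripped-down recursive merge for illustration (not the final implementation, which is shown below):

```ruby
# Stripped-down illustration of key merging with a configurable connector.
def merge_keys(hash, prefix: nil, connector: '.')
  hash.each_with_object({}) do |(key, value), result|
    flat_key = [prefix, key].compact.join(connector)
    if value.is_a?(Hash)
      # Nested hash: recurse, carrying the flattened key as the new prefix.
      result.merge!(merge_keys(value, prefix: flat_key, connector: connector))
    else
      result[flat_key] = value
    end
  end
end

merge_keys({ foo: { bar: { baz: 'Foo Bar Baz' } } })
# => { "foo.bar.baz" => "Foo Bar Baz" }
```

Passing a different `connector:` (e.g. `'/'`) would produce `"foo/bar/baz"` instead.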

Implementation considerations

We considered several implementations: a recursive one and iterative ones (with the tweaks listed below).

We chose an iterative approach for performance and to avoid a possible SystemStackError on deeply nested payloads (note that Tail Call Optimization is disabled by default in Ruby).
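To make the stack concern concrete: the iterative version (condensed here into a standalone function) handles arbitrarily deep nesting with a flat stack, whereas a recursive walk at this depth would risk a SystemStackError:

```ruby
# Standalone copy of the iterative merge: a worklist of [key, value] pairs
# replaces the call stack, so nesting depth never grows the Ruby stack.
def merge_keys(hash, connector: '.')
  result = {}
  pairs = hash.to_a

  until pairs.empty?
    key, value = pairs.shift

    if value.is_a?(Hash)
      value.each { |k, v| pairs.unshift(["#{key}#{connector}#{k}", v]) }
    else
      result[key] = value
    end
  end

  result
end

# Build a hash nested 10,000 levels deep.
deep = { 'leaf' => 'value' }
10_000.times { |i| deep = { "k#{i}" => deep } }

flat = merge_keys(deep)
flat.size # => 1 (a single very long dotted key)
```

The same input fed to a naive recursive implementation would allocate one stack frame per nesting level.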

Code

All implementations
# frozen_string_literal: true

module Gitlab
  module Utils
    class InlineHash
      # Chosen approach: Iterative using `String`-concat and `unshift` to retain key order
      def self.merge_keys(hash, prefix: nil, connector: '.')
        return {} unless hash.is_a?(Hash)

        result = {}
        base_prefix = prefix ? "#{prefix}#{connector}" : ''
        pairs = hash.map { |key, value| ["#{base_prefix}#{key}", value] }

        until pairs.empty?
          key, value = pairs.shift

          if value.is_a?(Hash)
            value.each { |k, v| pairs.unshift ["#{key}#{connector}#{k}", v] }
          else
            result[key] = value
          end
        end

        result
      end

      # Same'ish performance: Iterative using `String`-concat and `append` not retaining key order
      def self.merge_keys_append(hash, prefix: nil, connector: '.')
        return {} unless hash.is_a?(Hash)

        result = {}
        base_prefix = prefix ? "#{prefix}#{connector}" : ''
        pairs = hash.map { |key, value| ["#{base_prefix}#{key}", value] }

        until pairs.empty?
          key, value = pairs.shift

          if value.is_a?(Hash)
            value.each { |k, v| pairs << ["#{key}#{connector}#{k}", v] }
          else
            result[key] = value
          end
        end

        result
      end

      # Slower: Iterative using `Array`-concat
      def self.merge_keys_array(hash, prefix: nil, connector: '.')
        return {} unless hash.is_a?(Hash)

        result = {}
        pairs = hash.map { |key, value| [[prefix, key].compact.join(connector), value] }

        until pairs.empty?
          key, value = pairs.shift

          if value.is_a?(Hash)
            value.each { |k, v| pairs.unshift [[key, k].join(connector), v] }
          else
            result[key] = value
          end
        end

        result
      end

      # Original approach: Best readability but prone to `SystemStackError`
      def self.merge_keys_recur(hash, prefix: nil, connector: '.')
        return {} unless hash.is_a?(Hash)

        hash.reduce({}) do |result, (key, value)|
          flat_key = [prefix, key].compact.join(connector)
          if value.is_a?(Hash)
            result.merge(merge_keys_recur(value, prefix: flat_key, connector: connector))
          else
            result.merge(flat_key => value)
          end
        end
      end
    end
  end
end
Benchmarks
require "benchmark/ips"

def make_wide_hash(depth: 1_000)
  hash = {}
  (0..depth).each do |i|
    hash[i] = "v"
  end

  hash
end

small_hash = {
  nested_param: {
    key: 'Value'
  },
  'root_param' => 'Root',
  'very' => {
    'wide' => {
      'nested' => {
        'param' => 'wide nested value'
      }
    }
  }
}

{
  small: small_hash,
  medium: make_wide_hash(depth: 100),
  large: make_wide_hash(depth: 2000),
  huge: make_wide_hash(depth: 10_000),
}.each do |name, hash|

  puts "Benchmarks for #{name}"
  puts

  Benchmark.ips do |x|
    x.report("#{name} recursive") { Gitlab::Utils::InlineHash.merge_keys_recur(hash) } unless name == :huge
    x.report("#{name} iter array") { Gitlab::Utils::InlineHash.merge_keys_array(hash) }
    x.report("#{name} iter append") { Gitlab::Utils::InlineHash.merge_keys_append(hash) }
    x.report("#{name} iter unshift") { Gitlab::Utils::InlineHash.merge_keys(hash) }

    x.compare!
  end

  puts "=" * 80
  puts
end

Results

The benchmark winner is Gitlab::Utils::InlineHash.merge_keys_append, but it does not retain key order.

Gitlab::Utils::InlineHash.merge_keys retains key order with comparable performance, which is why it is the chosen approach.
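To illustrate the ordering difference between the two fastest variants, here they are condensed into standalone functions and run on a small sample (Ruby hashes preserve insertion order, so `keys` reflects the order entries were written):

```ruby
# `unshift` variant: nested children are pushed to the FRONT of the worklist,
# so a nested entry is emitted in place of its parent.
def merge_keys_unshift(hash, connector: '.')
  result = {}
  pairs = hash.map { |k, v| [k.to_s, v] }
  until pairs.empty?
    key, value = pairs.shift
    if value.is_a?(Hash)
      value.each { |k, v| pairs.unshift(["#{key}#{connector}#{k}", v]) }
    else
      result[key] = value
    end
  end
  result
end

# `append` variant: nested children go to the BACK of the worklist,
# so they are emitted after all top-level entries.
def merge_keys_append(hash, connector: '.')
  result = {}
  pairs = hash.map { |k, v| [k.to_s, v] }
  until pairs.empty?
    key, value = pairs.shift
    if value.is_a?(Hash)
      value.each { |k, v| pairs << ["#{key}#{connector}#{k}", v] }
    else
      result[key] = value
    end
  end
  result
end

sample = { nested_param: { key: 'Value' }, 'root_param' => 'Root' }

merge_keys_unshift(sample).keys # => ["nested_param.key", "root_param"]
merge_keys_append(sample).keys  # => ["root_param", "nested_param.key"]
```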

Benchmarks for small

Warming up --------------------------------------
     small recursive    15.986k i/100ms
    small iter array    23.370k i/100ms
   small iter append    32.904k i/100ms
  small iter unshift    33.010k i/100ms
Calculating -------------------------------------
     small recursive    164.282k (±11.7%) i/s -    815.286k in   5.044761s
    small iter array    225.320k (± 8.3%) i/s -      1.122M in   5.017635s
   small iter append    338.443k (± 2.2%) i/s -      1.711M in   5.058211s
  small iter unshift    308.362k (± 8.8%) i/s -      1.551M in   5.076510s

Comparison:
   small iter append:   338442.8 i/s
  small iter unshift:   308362.2 i/s - same-ish: difference falls within error
    small iter array:   225320.0 i/s - 1.50x  slower
     small recursive:   164282.0 i/s - 2.06x  slower

================================================================================

Benchmarks for medium

Warming up --------------------------------------
    medium recursive   670.000  i/100ms
   medium iter array     1.298k i/100ms
  medium iter append     2.316k i/100ms
 medium iter unshift     2.197k i/100ms
Calculating -------------------------------------
    medium recursive      6.766k (±14.9%) i/s -     33.500k in   5.069706s
   medium iter array     11.390k (±11.2%) i/s -     57.112k in   5.097605s
  medium iter append     20.835k (± 6.4%) i/s -    104.220k in   5.024600s
 medium iter unshift     21.796k (± 4.2%) i/s -    109.850k in   5.049348s

Comparison:
 medium iter unshift:    21796.0 i/s
  medium iter append:    20835.4 i/s - same-ish: difference falls within error
   medium iter array:    11390.2 i/s - 1.91x  slower
    medium recursive:     6766.0 i/s - 3.22x  slower

================================================================================

Benchmarks for large

Warming up --------------------------------------
     large recursive     8.000  i/100ms
    large iter array    55.000  i/100ms
   large iter append   112.000  i/100ms
  large iter unshift   116.000  i/100ms
Calculating -------------------------------------
     large recursive     87.739  (± 5.7%) i/s -    440.000  in   5.029038s
    large iter array    597.455  (± 4.0%) i/s -      3.025k in   5.072750s
   large iter append      1.229k (± 1.7%) i/s -      6.160k in   5.015178s
  large iter unshift      1.184k (± 6.8%) i/s -      6.032k in   5.125802s

Comparison:
   large iter append:     1228.6 i/s
  large iter unshift:     1183.8 i/s - same-ish: difference falls within error
    large iter array:      597.5 i/s - 2.06x  slower
     large recursive:       87.7 i/s - 14.00x  slower

================================================================================

Benchmarks for huge

Warming up --------------------------------------
     huge iter array     7.000  i/100ms
    huge iter append    19.000  i/100ms
   huge iter unshift    22.000  i/100ms
Calculating -------------------------------------
     huge iter array     88.391  (±12.4%) i/s -    441.000  in   5.061004s
    huge iter append    235.649  (± 5.5%) i/s -      1.178k in   5.015261s
   huge iter unshift    223.060  (± 9.0%) i/s -      1.122k in   5.072089s

Comparison:
    huge iter append:      235.6 i/s
   huge iter unshift:      223.1 i/s - same-ish: difference falls within error
     huge iter array:       88.4 i/s - 2.67x  slower

================================================================================

Does this MR meet the acceptance criteria?

Conformity

Performance and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Thong Kuah
