Add property_name handling for RedisHLL

Michał Wielich requested to merge michold-add-redis-uniques into master

What does this MR do and why?

Related to #415139 (closed)

Make it possible to save & read multiple unique_by values for a single event.

To achieve that, we want to have separate Redis counters based on the property_name that the event is triggered with. As we're not migrating any data immediately, we also want to make sure legacy events are still saved under their old keys.
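As a rough illustration of the idea, key selection could look like the sketch below. This is not the code from this MR: hll_key_base is a hypothetical helper, the overrides hash only mirrors the "<event>-<property>" => "<legacy key>" shape of hll_redis_key_overrides.yml, and the handling of a missing property_name is an assumption.

```ruby
# Hypothetical helper for illustration only; not the MR's implementation.
# `overrides` mirrors the shape of hll_redis_key_overrides.yml:
# "<event>-<property>" => "<legacy key>".
def hll_key_base(event_name, property_name, overrides)
  # Assumption: events tracked without a property keep the plain event name.
  return event_name.to_s if property_name.nil?

  # 'user.id' and :user both normalize to "user".
  property = property_name.to_s.split('.').first
  candidate = "#{event_name}-#{property}"

  # Legacy pairs resolve to their old key; any other pair gets its own key.
  overrides.fetch(candidate, candidate)
end

overrides = { 'g_compliance_dashboard-user' => 'g_compliance_dashboard' }

puts hll_key_base('g_compliance_dashboard', 'user.id', overrides)
puts hll_key_base('g_compliance_dashboard', :project, overrides)
```

With this sample mapping, the legacy pair falls back to the old key while the project property gets a key of its own.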

Script used to generate the legacy events file:
comment = "# This file lists all of the internal events that need to be saved with their legacy HLL Redis keys
#
# This file has been generated using the script included in
# the description of https://gitlab.com/gitlab-org/gitlab/-/merge_requests/137890
#
# It is only safe to regenerate it using the same script if the
# :redis_hll_property_name_tracking feature flag is disabled on prod environment.\n"
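# Private method, hence #send; yields each event name with its properties.
events = Gitlab::InternalEvents::EventDefinitions.send(:events)
events = events.flat_map do |event, props|
  props.map do |prop|
    next unless prop
    # Keep only the leading segment of the property, e.g. 'user.id' => :user
    prop = prop.to_s.split('.').first.to_sym

    # New-style key ("<event>-<property>") mapped to the legacy key (the event name)
    ["#{event}-#{prop}", event]
  end.compact
end.sort_by(&:second).to_h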

path = Rails.root.join('lib/gitlab/usage_data_counters/hll_redis_key_overrides.yml')
File.open(path, 'w') { |f| f.write(comment + events.to_yaml) }
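
To see the shape of the generated mapping without a Rails console, the same transformation can be sketched in plain Ruby. The sample_events hash is made-up input (event name => list of properties, possibly containing nil), and legacy_key_overrides is a hypothetical name for the logic above; only the transformation itself follows the script.

```ruby
require 'yaml'

# Made-up sample input: event name => list of unique_by properties.
sample_events = {
  'g_compliance_dashboard' => [:'user.id'],
  'some_project_event' => [:'project.id', nil]
}

# Hypothetical wrapper around the same transformation as the script above.
def legacy_key_overrides(events)
  pairs = events.flat_map do |event, props|
    props.compact.map do |prop|
      # Keep only the leading segment, e.g. 'user.id' => 'user'.
      property = prop.to_s.split('.').first
      ["#{event}-#{property}", event]
    end
  end

  # Plain Ruby has no Array#second, so sort by the last element instead.
  pairs.sort_by(&:last).to_h
end

overrides = legacy_key_overrides(sample_events)
puts overrides.to_yaml
```

Each entry maps a new-style "<event>-<property>" key to the legacy key, which is the bare event name.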

How to set up and validate locally

  1. Choose a legacy event (an event listed as a value in hll_redis_key_overrides.yml) and try reading its value, for example: Gitlab::UsageDataCounters::HLLRedisCounter.unique_events(event_names: 'g_compliance_dashboard', start_date: Date.yesterday, end_date: Date.tomorrow + 7.days)
  2. Enable the new feature flag: Feature.enable(:redis_hll_property_name_tracking)
  3. Try triggering the chosen legacy event, for example: Gitlab::UsageDataCounters::HLLRedisCounter.track_event('g_compliance_dashboard', values: SecureRandom.uuid)
  4. You can also trigger the event with the property that appears as a suffix of its key in hll_redis_key_overrides.yml, for example: Gitlab::UsageDataCounters::HLLRedisCounter.track_event('g_compliance_dashboard', values: SecureRandom.uuid, property_name: 'user.id'). This should also work when the property name lacks the .id part, e.g.: Gitlab::UsageDataCounters::HLLRedisCounter.track_event('g_compliance_dashboard', values: SecureRandom.uuid, property_name: :user)
  5. Read the event's value again, e.g.: Gitlab::UsageDataCounters::HLLRedisCounter.unique_events(event_names: 'g_compliance_dashboard', start_date: Date.yesterday, end_date: Date.tomorrow + 7.days). The result should equal the value retrieved in step 1, incremented by 1 for each call to #track_event.
  6. If the chosen event has a legacy property name, specifying the property_name when reading should return the same value, for example: Gitlab::UsageDataCounters::HLLRedisCounter.unique_events(event_names: 'g_compliance_dashboard', start_date: Date.yesterday, end_date: Date.tomorrow + 7.days, property_names: ['user'])
  7. Increment the legacy event using a non-legacy property_name, for example: Gitlab::UsageDataCounters::HLLRedisCounter.track_event('g_compliance_dashboard', values: SecureRandom.uuid, property_name: 'project')
  8. This should be saved as a separate counter: the data from the previous triggers (e.g. Gitlab::UsageDataCounters::HLLRedisCounter.unique_events(event_names: 'g_compliance_dashboard', start_date: Date.yesterday, end_date: Date.tomorrow + 7.days, property_names: ['user'])) should be unchanged, and the data from the new triggers (e.g. Gitlab::UsageDataCounters::HLLRedisCounter.unique_events(event_names: 'g_compliance_dashboard', start_date: Date.yesterday, end_date: Date.tomorrow + 7.days, property_names: ['project'])) should equal the number of times the new event_name/property_name pair has been triggered.
  9. The same track & read process should also work for new events. To test it with a new event, first add it to known_events (for example, edit a sample metric to use the new event's name) and run reload!. Then repeat the tracking and reading steps using the new event's name.
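
The expected outcome of steps 3 to 8 can be modelled with a toy in-memory counter. This is a sketch under stated assumptions, not the HLLRedisCounter implementation: ToyUniqueCounter is a hypothetical class, it uses exact Ruby sets instead of Redis HyperLogLog, it ignores date ranges, and its method signatures are simplified.

```ruby
require 'securerandom'
require 'set'

# Toy stand-in for the Redis HLL counters; hypothetical, not GitLab code.
class ToyUniqueCounter
  def initialize(overrides)
    @overrides = overrides
    @sets = Hash.new { |hash, key| hash[key] = Set.new }
  end

  def track_event(event_name, values:, property_name: nil)
    @sets[key_for(event_name, property_name)].merge(Array(values))
  end

  def unique_events(event_name, property_name: nil)
    @sets[key_for(event_name, property_name)].size
  end

  private

  # Assumption: no property means the legacy (plain event name) key.
  def key_for(event_name, property_name)
    return event_name if property_name.nil?

    candidate = "#{event_name}-#{property_name.to_s.split('.').first}"
    @overrides.fetch(candidate, candidate)
  end
end

event = 'g_compliance_dashboard'
counter = ToyUniqueCounter.new({ "#{event}-user" => event })

counter.track_event(event, values: SecureRandom.uuid)                              # step 3
counter.track_event(event, values: SecureRandom.uuid, property_name: 'user.id')    # step 4
counter.track_event(event, values: SecureRandom.uuid, property_name: :user)        # step 4
counter.track_event(event, values: SecureRandom.uuid, property_name: 'project')    # step 7

legacy_count = counter.unique_events(event)                             # step 5
user_count = counter.unique_events(event, property_name: 'user')        # step 6
project_count = counter.unique_events(event, property_name: 'project')  # step 8

puts "legacy=#{legacy_count} user=#{user_count} project=#{project_count}"
```

In this model the legacy and user reads share one counter, while the project read only sees the single trigger from step 7.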

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Michał Wielich