Migrate all daily aggregated RedisHLL events to weekly

What does this MR do and why?

This MR is marked as Draft to avoid being merged right before 16.0 release.

This MR migrates some daily aggregated events to weekly aggregation.

Currently we store RedisHLL events in either daily or weekly slots. This is a historical decision; in fact we don't need daily keys or daily granularity, because we aggregate events weekly anyway.

Daily events are stored as (pseudo-keys for clarity) 2023-04-01_eventa, 2023-04-02_eventa ... 2023-04-07_eventa, and during Service Ping generation we compute the HLL union of these keys. The result is exactly the same as if we had stored these events in a weekly slot (like 2023-12_eventa, for the 12th week), but it puts considerably more load on the Redis server and also creates maintenance overhead for developers.
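The equivalence can be illustrated with a minimal sketch. It uses exact Ruby sets in place of HyperLogLog structures (a real HLL returns an approximate count, but the union behaves the same way), and the key names and values are the pseudo-keys from above plus made-up user IDs, not real data.

```ruby
require "set"

# Exact sets stand in for HyperLogLog structures (assumption for illustration).
# Keys are pseudo-keys; values are hypothetical user IDs.
daily_slots = {
  "2023-04-01_eventa" => Set[1, 2],
  "2023-04-02_eventa" => Set[2, 3],
  "2023-04-07_eventa" => Set[3, 4, 5]
}

# Daily aggregation: one key per day, unioned when the report is generated.
daily_union_count = daily_slots.values.reduce(Set.new, :|).size

# Weekly aggregation: every value is written straight into a single key.
weekly_slot = Set.new
daily_slots.each_value { |values| weekly_slot.merge(values) }

puts daily_union_count # distinct values across the daily keys
puts weekly_slot.size  # the same number, read from one key
```

Both approaches count each distinct value once, so the weekly slot gives the same answer while needing one key instead of up to seven.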

The purpose of this MR is to 1) merge old daily keys into the corresponding weekly keys and 2) change the code to write all events into weekly keys.

After this MR is merged, all newly emitted events will be written as weekly events, and once the migration has run, all existing daily keys will be merged into the corresponding weekly ones.
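As a rough sketch of the key mapping such a migration needs, the snippet below derives a weekly key from a daily one. `weekly_key_for` and `DAILY_KEY` are hypothetical names, not the migration's actual code, and the key layouts (`YYYY-DDD-{slot}_event` for daily, `{slot}_event-YYYY-WW` for weekly, with ISO week numbering) are assumed from the console output in the validation steps.

```ruby
require "date"

# Hypothetical pattern for daily keys: "<year>-<day of year>-<slot and event name>".
DAILY_KEY = /\A(?<year>\d{4})-(?<yday>\d{3})-(?<rest>.+)\z/

# Hypothetical helper: returns the weekly key a daily key would be merged into,
# or nil if the input does not look like a daily key.
def weekly_key_for(daily_key)
  match = DAILY_KEY.match(daily_key)
  return nil unless match

  date = Date.strptime("#{match[:year]}-#{match[:yday]}", "%Y-%j")
  # %G-%V is the ISO 8601 week-based year and week number (an assumption here).
  "#{match[:rest]}-#{date.strftime('%G-%V')}"
end

daily_keys = [
  "2023-072-{hll_counters}_g_edit_by_web_ide",
  "2023-073-{hll_counters}_g_edit_by_web_ide",
  "2023-077-{hll_counters}_g_edit_by_web_ide"
]

# Group daily keys by the weekly key they belong to.
grouped = daily_keys.group_by { |key| weekly_key_for(key) }
grouped.each { |weekly, dailies| puts "#{weekly} <= #{dailies.size} daily keys" }
```

Grouping first means each weekly key can then be updated with a single merge of all its daily keys.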

This step is required so that we can remove the known_events config in the future and use weekly aggregation by default.

How to set up and validate locally

  1. Prepare data in Redis
[1] pry(main)> q = Gitlab::UsageDataCounters::HLLRedisCounter.known_events.select {|e| e[:aggregation] == 'daily'}.first
=> {"name"=>"g_edit_by_web_ide", "aggregation"=>"daily"} # random daily event
[2] pry(main)> day = (Date.today - 3.weeks).beginning_of_week
=> Mon, 13 Mar 2023 
[3] pry(main)> key1 = Gitlab::UsageDataCounters::HLLRedisCounter.send(:redis_key, q, day)
=> "2023-072-{hll_counters}_g_edit_by_web_ide" # daily keys
[4] pry(main)> key1 = Gitlab::UsageDataCounters::HLLRedisCounter.send(:redis_key, q, day + 1.day)
=> "2023-073-{hll_counters}_g_edit_by_web_ide"
[5] pry(main)> key1 = Gitlab::UsageDataCounters::HLLRedisCounter.send(:redis_key, q, day + 2.days)
=> "2023-074-{hll_counters}_g_edit_by_web_ide"
[6] pry(main)> key1 = Gitlab::UsageDataCounters::HLLRedisCounter.send(:redis_key, q, day + 4.days)
=> "2023-076-{hll_counters}_g_edit_by_web_ide"
[7] pry(main)> key1 = Gitlab::UsageDataCounters::HLLRedisCounter.send(:redis_key, q, day + 5.days)
=> "2023-077-{hll_counters}_g_edit_by_web_ide"
[8] pry(main)> q['aggregation'] = 'weekly' # change the event aggregation to weekly
=> "weekly"
[9] pry(main)> q
=> {"name"=>"g_edit_by_web_ide", "aggregation"=>"weekly"}
[10] pry(main)> Gitlab::UsageDataCounters::HLLRedisCounter.send(:redis_key, q, day + 2.days)
=> "{hll_counters}_g_edit_by_web_ide-2023-11" # get a weekly key
[11] pry(main)> Gitlab::Redis::HLL.add(key: "2023-072-{hll_counters}_g_edit_by_web_ide", value: 1, expiry: 10.days)
=> [true, true]
[12] pry(main)> Gitlab::Redis::HLL.add(key: "2023-073-{hll_counters}_g_edit_by_web_ide", value: 2, expiry: 10.days)
=> [true, true]
[13] pry(main)> Gitlab::Redis::HLL.add(key: "2023-074-{hll_counters}_g_edit_by_web_ide", value: 3, expiry: 10.days)
=> [true, true]
[14] pry(main)> Gitlab::Redis::HLL.add(key: "2023-076-{hll_counters}_g_edit_by_web_ide", value: 4, expiry: 10.days)
=> [true, true]
[15] pry(main)> Gitlab::Redis::HLL.add(key: "2023-077-{hll_counters}_g_edit_by_web_ide", value: 4, expiry: 10.days)
=> [true, true]
[16] pry(main)> Gitlab::Redis::HLL.add(key: "{hll_counters}_g_edit_by_web_ide-2023-11", value: 5, expiry: 10.days)
=> [true, true]
[17] pry(main)> Gitlab::Redis::HLL.add(key: "{hll_counters}_g_edit_by_web_ide-2023-11", value: 6, expiry: 10.days)
=> [true, true]
[18] pry(main)> Gitlab::Redis::HLL.add(key: "{hll_counters}_g_edit_by_web_ide-2023-11", value: 6, expiry: 10.days)
=> [false, true]
[19] pry(main)> Gitlab::Redis::HLL.add(key: "{hll_counters}_g_edit_by_web_ide-2023-11", value: 7, expiry: 10.days)
=> [true, true]
[20] pry(main)> Gitlab::Redis::HLL.count(keys: "{hll_counters}_g_edit_by_web_ide-2023-11")
=> 3 # verify the weekly key value
  2. Execute the migration: bin/rails db:migrate:redo:main VERSION=20230317004428
  3. Validate that the data was merged
[21] pry(main)> Gitlab::Redis::HLL.count(keys: "{hll_counters}_g_edit_by_web_ide-2023-11")
=> 7 # 4 daily values merged with 3 weekly ones from above
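The expected counts in the steps above can be checked without Redis. This sketch again uses exact sets as a stand-in for HyperLogLog and models the merge only; it is not what the migration executes.

```ruby
require "set"

# Values added to the daily keys in step 1 (value 4 is added on two days).
daily = {
  "2023-072-{hll_counters}_g_edit_by_web_ide" => Set[1],
  "2023-073-{hll_counters}_g_edit_by_web_ide" => Set[2],
  "2023-074-{hll_counters}_g_edit_by_web_ide" => Set[3],
  "2023-076-{hll_counters}_g_edit_by_web_ide" => Set[4],
  "2023-077-{hll_counters}_g_edit_by_web_ide" => Set[4]
}

# Values added to the weekly key in step 1 (6 is added twice, counted once).
weekly = Set[5, 6, 7]
count_before = weekly.size

# Merge every daily set into the existing weekly set, keeping what is already there.
daily.each_value { |values| weekly.merge(values) }
count_after = weekly.size

puts count_before # 3
puts count_after  # 7
```

The duplicate value 4 in two daily keys and the duplicate 6 in the weekly key are each counted once, which is why the merged count is 7 rather than 9.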

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #390357 (closed)

Edited by Niko Belokolodov
