Harden Redis HLL metric

Redis HLL is now available, as of !35580 (merged)

HLL class https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/redis/hll.rb

Redis HLL PFCOUNT docs https://redis.io/commands/pfcount

Gitlab::Redis::HLL.count(keys: keys)

Redis HLL PFADD docs https://redis.io/commands/pfadd

Gitlab::Redis::HLL.add(key: target_key, value: author_id, expiry: KEY_EXPIRY_LENGTH)

Redis HLL PFMERGE not implemented

pfmerge

Merge multiple HyperLogLog values into a unique value that will approximate the cardinality of the union of the observed Sets of the source HyperLogLog structures.

PFMERGE vs PFCOUNT with multiple keys

  • PFMERGE(destination_key, *source_keys) - merges the HyperLogLogs at the source keys into a single HyperLogLog that approximates the cardinality of the union of the observed Sets, and stores it at the given destination_key

  • PFCOUNT(*keys) - When called with multiple keys, returns the approximated cardinality of the union of the HyperLogLogs passed, by internally merging the HyperLogLogs stored at the provided keys into a temporary HyperLogLog.

In the end, it is a matter of whether the merged result is held temporarily or stored under a given key. Given this, I think we do not need a PFMERGE method at the moment.
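To make the comparison concrete, here is a minimal sketch of the two behaviours. It is a toy in-memory model, not the Gitlab::Redis::HLL implementation: exact Ruby Sets stand in for HyperLogLog structures, so the counts are exact rather than approximate, and the class name `ToyHLLStore` and the key names are hypothetical.

```ruby
require 'set'

# Toy stand-in for the Redis HLL commands (assumption: exact Sets replace
# HyperLogLog structures, so results are exact, not approximate).
class ToyHLLStore
  def initialize
    @data = {}
  end

  # PFADD key element [element ...]
  def pfadd(key, *values)
    (@data[key] ||= Set.new).merge(values)
  end

  # PFCOUNT key [key ...]: the union exists only in a temporary object.
  def pfcount(*keys)
    union(keys).size
  end

  # PFMERGE destkey sourcekey [sourcekey ...]: the union is stored at destkey.
  # An existing destination is treated as one of the sources, as in Redis.
  def pfmerge(destination_key, *source_keys)
    @data[destination_key] = union([destination_key, *source_keys])
    'OK'
  end

  def key?(key)
    @data.key?(key)
  end

  private

  def union(keys)
    keys.reduce(Set.new) { |acc, key| acc | @data.fetch(key, Set.new) }
  end
end

store = ToyHLLStore.new
store.pfadd('week:1', 1, 2, 3) # hypothetical keys and author ids
store.pfadd('week:2', 3, 4)

# Counting across keys leaves no new key behind.
count_direct = store.pfcount('week:1', 'week:2')

# Merging first gives the same count, but persists the result.
store.pfmerge('weeks:1-2', 'week:1', 'week:2')
count_merged = store.pfcount('weeks:1-2')

puts count_direct == count_merged
```

Both paths give the same count; the only difference is that the merge leaves an extra key that then needs its own expiry handling.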

Opening this issue to discuss the next steps.

Limitations:

Data is stored in Redis for a limited amount of time

Current implementations

Next steps

Please add any concerns and ideas, thank you!

Edited by Alina Mihaila