feat(rate_limit): SET-mode check_unique / peek_unique primitive

What this MR does

Adds an optional count_distinct: field to Rule. When set to a Symbol naming an identifier key, the rule counts the distinct values seen for that key (SADD + SCARD) within the (characteristics-bucketed) period — the labkit-side equivalent of Gitlab::ApplicationRateLimiter::IncrementPerActionedResource. When count_distinct is nil (default), the rule keeps its current call-counting (INCR) behavior.

Same Limiter#check(identifier) / #peek(identifier) entry points; the rule's shape decides which Redis op to use.

Lays the foundation for the cohort 4 rollout (unique_project_downloads_for_application / unique_project_downloads_for_namespace, EE GitAbuse::*ThrottleService). The consumer-side follow-up will:

  • define static Rules with count_distinct: :project_id for those entries;
  • plumb resource.id as project_id in the identifier hash from LabkitAdapter;
  • add rate_limiter_use_labkit_cohort_4{,_enforce} feature flag YAMLs.

Cohort 4 is also blocked on rule_extras: (gitlab-com/gl-infra/production-engineering#29071 (closed)) so per-namespace limit: / period: callables can resolve without out-of-band DB queries. That lands separately.

API surface

Labkit::RateLimit::Rule.new(
  name: "unique_project_downloads_for_namespace",
  characteristics: [:user_id, :namespace_id],
  count_distinct:  :project_id,
  limit: 5, period: 600,
)

limiter.check(user_id: u, namespace_id: ns, project_id: p)
limiter.peek(user_id: u, namespace_id: ns)   # peek doesn't need :project_id

Result shape unchanged. info.count is the post-add SCARD for count_distinct rules.

Design

  • Distinction at the rule level, not the call level: check/peek take the same identifier shape regardless of mode. Per !292 (merged) review (@reprazent). Earlier revision added Limiter#check_unique / #peek_unique; that API surface is gone.
  • count_distinct: must be a Symbol/String and must not overlap characteristics:. The named key supplies the SET member; characteristics: supply the Redis bucket key.
  • SADD_SCRIPT (atomic Lua), mirroring INCR_SCRIPT — addresses @reprazent's TTL-on-first-write comment. Same shape as the INCR path: pre-read TTL, mutate (SADD+SCARD), EXPIRE if no TTL was set, return the actual post-state TTL. Self-heals orphan keys whose TTL is -1, and closes the pre-existing crash window between SADD and a separate EXPIRE. No more -1 sentinel; build_result no longer needs its TTL fallback for SET-mode.
  • Value coerced to string and SHA-256-encoded above CHAR_VALUE_MAX_LENGTH.
  • :log rule non-termination preserved.
  • Fail-open + log when the count_distinct identifier key is missing/nil/empty for a matched rule: emits a rate_limit_missing_count_distinct warn log, bumps errors_total, and the rule iteration continues to the next rule (does not abort the whole check). peek does not require the count_distinct key — it reads SCARD on the bucket key.

Tests

Rate-limit suite: 313 examples passing (269 unit + 44 integration, of which 6 integration examples in rate_limit_spec.rb are new in this MR). Coverage spans count_distinct validation, the atomic SADD path (including orphan-key self-heal), missing-key fail-open + log, peek without the count_distinct key, and real-Redis integration (distinct counting, bucketing, TTL, enforcement, peek). Rubocop clean.

Reviewer notes

  • Branched off master. Merged master in to pick up !291 (merged)'s INCR_SCRIPT infrastructure; SADD_SCRIPT is built on the same Labkit::Redis::Script helper.
  • This MR is a no-op for end users on its own — cohort 4 also needs rule_extras: (#29071) to land first.
  • Peek is intentionally asymmetric with check on the missing-count_distinct-key path: peek reads SCARD on the rule-keyed bucket, which doesn't need the per-call member, so missing-key fail-open doesn't apply. Documented inline in peek_rules. Happy to make it symmetric if you'd prefer.
Edited by Max Woolf

Merge request reports

Loading