Add labkit rate limit adapter for cohort 1 keys

What does this MR do?

Adds Gitlab::ApplicationRateLimiter::LabkitAdapter, which routes a first cohort of five rate-limit keys (pipelines_create, notes_create, search_rate_limit, users_get_by_id, user_sign_in) through Labkit::RateLimit::Limiter. This is the start of the migration from the in-house ApplicationRateLimiter strategy classes to the shared labkit primitive.

How is the rollout controlled?

Two wip-type feature flags per key:

  • rate_limiter_use_labkit_<key> opts the key into the labkit path.
  • rate_limiter_use_labkit_<key>_enforce lets labkit's decision win over legacy.
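
For reference, the naming scheme above expands to ten flags across the cohort. A minimal sketch of that expansion follows; the constant COHORT_1_KEYS and the helper labkit_flag_names are hypothetical names used for illustration and are not taken from this MR.

```ruby
# Hypothetical constant: the five cohort 1 keys listed in this MR.
COHORT_1_KEYS = %i[
  pipelines_create
  notes_create
  search_rate_limit
  users_get_by_id
  user_sign_in
].freeze

# Hypothetical helper: derives the two flag names for one rate-limit key.
def labkit_flag_names(key)
  opt_in = :"rate_limiter_use_labkit_#{key}"
  { opt_in: opt_in, enforce: :"#{opt_in}_enforce" }
end

# Print both flag names for every key in the cohort.
COHORT_1_KEYS.each do |key|
  flags = labkit_flag_names(key)
  puts "#{key}: #{flags[:opt_in]}, #{flags[:enforce]}"
end
```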

The two flags produce three meaningful states:

_use_labkit   _enforce   What runs
off           off        Legacy only (status quo).
on            off        Both paths run; legacy decides. The Prometheus shadow counter records per-key agreement so a 24-hour shadow run can confirm parity.
on            on         Only the labkit path runs; its decision is returned.
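
The three states can be sketched as a single decision function. This is an illustration under stated assumptions, not the adapter's actual code: flag values and both limiter checks are passed in as plain booleans and lambdas, ShadowCounter is an in-memory stand-in for the Prometheus counter, and the sketch assumes _enforce has no effect while _use_labkit is off.

```ruby
# In-memory stand-in for the Prometheus shadow counter (hypothetical).
ShadowCounter = Struct.new(:counts) do
  def increment(key, agreement)
    counts[[key, agreement]] += 1
  end
end

# Returns true when the request should be throttled.
# legacy and labkit are callables that each return true or false.
def throttled?(key, use_labkit:, enforce:, legacy:, labkit:, shadow:)
  # off / off: legacy only. Assumption: _enforce alone does nothing.
  return legacy.call unless use_labkit

  # on / on: only the labkit path runs and its decision is returned.
  return labkit.call if enforce

  # on / off: shadow mode. Both paths run, legacy decides,
  # and per-key agreement is recorded.
  legacy_result = legacy.call
  labkit_result = labkit.call
  agreement = legacy_result == labkit_result ? :agree : :disagree
  shadow.increment(key, agreement)
  legacy_result
end

shadow = ShadowCounter.new(Hash.new(0))
throttled?(:notes_create, use_labkit: true, enforce: false,
           legacy: -> { false }, labkit: -> { true }, shadow: shadow)
```

In the usage line the two paths disagree, so the legacy decision (not throttled) is returned and one disagreement is recorded for notes_create.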

The legacy and labkit Redis key namespaces are intentionally disjoint (application_rate_limiter:... vs labkit:rl:...) so both counters can run in parallel during shadow validation without interference.

The full per-key rollout procedure (with chatops commands and pass criteria) is tracked in #598560.

Verification

  • 71 RSpec examples (52 existing + 19 new) cover signature stability, scope normalization, the labkit Redis key format, fail-open behavior, the dual-flag wiring, and the Prometheus shadow counter.
  • Manual end-to-end testing against a local GDK confirmed each of the five keys behaves correctly in all three states (off / shadow / enforce). Both Redis key shapes appear under shadow; only the labkit shape appears under enforce.

Operational notes

  • Feature flag changes can take up to 60 seconds to propagate to all Puma workers (the TTL of Flipper's L1 process cache). The rollout runbook should pause at least 60 seconds between toggles so that dashboards do not show mixed-state behavior.
  • The labkit path adds one Redis round-trip per check: a GET that recovers current_count for the utilization-ratio histogram, because Labkit::RateLimit::Result does not yet expose it. This is a temporary cost until labkit's Result carries the count natively.
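
To make the extra round-trip concrete, here is a rough sketch of how a utilization ratio could be derived from a recovered count. It assumes the ratio is current_count divided by the key's threshold. FakeStore, utilization_ratio, and the counter key passed in are hypothetical; the real labkit key format is not reproduced here.

```ruby
# Stand-in for a Redis client. Only #get is needed for this sketch;
# like Redis GET, it returns a String or nil.
class FakeStore
  def initialize(data)
    @data = data
  end

  def get(key)
    @data[key]
  end
end

# Hypothetical helper: reads the counter back and returns count / threshold,
# or nil when the counter is missing or the threshold is not positive.
def utilization_ratio(store, counter_key, threshold)
  raw = store.get(counter_key) # the additional round-trip per check
  return nil if raw.nil? || threshold.to_i <= 0

  raw.to_i / threshold.to_f
end

store = FakeStore.new("example-counter-key" => "30")
ratio = utilization_ratio(store, "example-counter-key", 120)
```

Once the limiter result carries the count itself, the GET and a helper like this would no longer be needed.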

Edited by Max Woolf
