Stage 2a: Migrate all ApplicationRateLimiter call sites to labkit
## Summary
This issue tracks the full migration of `Gitlab::ApplicationRateLimiter` call sites to `Labkit::RateLimit`. The migration is done in cohorts, each following the repeatable process below.
Parent epic: https://gitlab.com/groups/gitlab-com/gl-infra/-/work_items/2021
## Cohorts
| Cohort | Issue | Keys | Strategy | Labkit prerequisite | Status |
|---|---|---|---|---|---|
| 1 | https://gitlab.com/gitlab-com/gl-infra/production-engineering/-/work_items/28803 | `pipelines_create`, `notes_create`, `search_rate_limit`, `users_get_by_id`, `user_sign_in` | IncrementPerAction | None (v1.14.0) | In progress |
| 2 | https://gitlab.com/gitlab-com/gl-infra/production-engineering/-/work_items/28809 | Remaining ~64 IncrementPerAction keys (non-peek) | IncrementPerAction | None | Draft |
| 3 | https://gitlab.com/gitlab-com/gl-infra/production-engineering/-/work_items/28810 | 10 `.peek` callers | IncrementPerAction (peek) | `Limiter#peek` in labkit | Draft |
| 4 | https://gitlab.com/gitlab-com/gl-infra/production-engineering/-/work_items/28811 | 1 `IncrementPerActionedResource` caller | Set-based (SADD/SCARD) | Set strategy in labkit | Draft |
| 5 | https://gitlab.com/gitlab-com/gl-infra/production-engineering/-/work_items/28812 | 2 `IncrementResourceUsagePerAction` callers | Float-cost (INCRBYFLOAT) | Float-cost strategy in labkit | Draft |
_The cohort table will be updated with issue links as draft issues are created and refined._
## Repeatable migration process per cohort
Extracted from https://gitlab.com/gitlab-com/gl-infra/production-engineering/-/work_items/28803 (Cohort 1). Each cohort follows this process:
### 1. Key selection
Select 5-10 rate limit keys for the cohort. Consider:
- Traffic volume diversity (mix of high and low traffic keys)
- Scope shape diversity (different numbers of scope elements)
- Entry point diversity (REST API, GraphQL, controller concern, direct call)
- Redis headroom (see https://gitlab.com/gitlab-com/gl-infra/production-engineering/-/work_items/28807)
### 2. Feature flags
Two ops flags per key:
- `rate_limiter_use_labkit_<key>` — enables the labkit adapter for this key (default: off)
- `rate_limiter_use_labkit_<key>_enforce` — switches the labkit rule action from `:log` to `:block` (default: off)
Flags are independent per key. No global kill switch needed — labkit fails open on Redis errors.
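
The two flags give each key three effective modes. A minimal sketch of that mapping follows. The module and method names are hypothetical, a plain Hash stands in for the real feature-flag lookup, and it assumes the enforce flag has no effect while the use-labkit flag is off:

```ruby
# Sketch only: LabkitRolloutFlags and its methods are hypothetical names,
# and a plain Hash stands in for the real feature-flag lookup.
module LabkitRolloutFlags
  # Returns the two ops flag names for a rate limit key.
  def self.flag_names_for(key)
    {
      use_labkit: "rate_limiter_use_labkit_#{key}",
      enforce: "rate_limiter_use_labkit_#{key}_enforce"
    }
  end

  # Maps flag state to an effective mode:
  #   :legacy  - labkit is not called for this key
  #   :shadow  - labkit counts and logs (:log), legacy path enforces
  #   :enforce - labkit's decision is authoritative (:block)
  # Assumption: the enforce flag is ignored while use_labkit is off.
  def self.mode_for(key, enabled_flags)
    names = flag_names_for(key)
    return :legacy unless enabled_flags[names[:use_labkit]]

    enabled_flags[names[:enforce]] ? :enforce : :shadow
  end
end
```

For example, `LabkitRolloutFlags.mode_for(:notes_create, { "rate_limiter_use_labkit_notes_create" => true })` returns `:shadow`.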
### 3. Adapter implementation
Inside `ApplicationRateLimiter._throttled?`, when the use-labkit flag is on for a key:
- Construct the call site name from the rate limit key name
- Construct the identifier from the scope objects (serialized as key-value pairs)
- Build a single-element rules array with characteristics, limit, period, and action derived from existing config
- Call `Labkit::RateLimit::Limiter#check(identifier)` and use the result
- Preserve: allowlist short-circuit, bypass header check, utilization-ratio histogram
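
The argument construction in the steps above might look roughly like the sketch below. Every name in it is hypothetical: the `User` and `Project` structs stand in for real scope objects, the config is assumed to expose `:threshold` and `:interval`, and the rule is shown as a plain Hash because this issue does not specify how labkit expects rules to be passed. The actual `Limiter#check(identifier)` call is left out for the same reason.

```ruby
# Sketch only: none of these names are the real GitLab or labkit API.
module LabkitAdapterSketch
  User = Struct.new(:id)
  Project = Struct.new(:id)

  # Serializes scope objects as key-value pairs, e.g. { "user" => 42, "project" => 7 }.
  # Assumes at most one scope object per class and that each responds to #id.
  def self.identifier_for(scope)
    [scope].flatten.compact.to_h do |obj|
      [obj.class.name.split("::").last.downcase, obj.id]
    end
  end

  # Builds the single-element rules array from the existing limit config.
  # Assumes the config exposes :threshold and :interval (in seconds).
  def self.rules_for(identifier, config, enforce:)
    [{
      characteristics: identifier.keys,
      limit: config.fetch(:threshold),
      period: config.fetch(:interval),
      action: enforce ? :block : :log
    }]
  end

  # Collects everything the adapter would hand to the limiter for one call.
  def self.limiter_arguments(key, scope, config, enforce: false)
    identifier = identifier_for(scope)
    {
      name: key.to_s,
      identifier: identifier,
      rules: rules_for(identifier, config, enforce: enforce)
    }
  end
end
```

The allowlist short-circuit, bypass header check, and histogram observation would wrap this construction unchanged; they are omitted here to keep the sketch focused.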
### 4. Shadow validation (per key)
1. Enable `rate_limiter_use_labkit_<key>` with `rate_limiter_use_labkit_<key>_enforce` off
2. Labkit counts and logs with `action: :log` but does not block; legacy path still enforces
3. Soak for minimum 24 hours of production traffic
4. Compare labkit decisions against legacy decisions — divergence must be < 0.5% (excluding window-boundary noise within 1 second of period rollover)
5. Post screenshot of divergence query result to the cohort issue
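
The divergence criterion in step 4 can be sketched as below. The `Sample` shape and module name are hypothetical, and the boundary filter assumes fixed windows aligned to multiples of the period; the real comparison is expected to run as a log query rather than in Ruby.

```ruby
# Sketch only: hypothetical record shape, one sample per request seen by both limiters.
Sample = Struct.new(:timestamp, :legacy_throttled, :labkit_throttled)

module ShadowDivergence
  BOUNDARY_SECONDS = 1.0
  MAX_DIVERGENCE = 0.005 # 0.5%

  # True when the sample falls within 1 second of a period rollover.
  # Assumes fixed windows aligned to multiples of the period (in seconds).
  def self.near_boundary?(timestamp, period)
    offset = timestamp % period
    offset < BOUNDARY_SECONDS || (period - offset) < BOUNDARY_SECONDS
  end

  # Fraction of non-boundary samples where the two limiters disagreed.
  def self.ratio(samples, period:)
    counted = samples.reject { |s| near_boundary?(s.timestamp, period) }
    return 0.0 if counted.empty?

    diverged = counted.count { |s| s.legacy_throttled != s.labkit_throttled }
    diverged.to_f / counted.size
  end

  # Passes only when divergence is strictly below 0.5%.
  def self.pass?(samples, period:)
    ratio(samples, period: period) < MAX_DIVERGENCE
  end
end
```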
### 5. Enforcement flip (per key)
1. Enable `rate_limiter_use_labkit_<key>_enforce`; labkit's decision is now authoritative
2. Soak for minimum 24 hours
3. Monitor utilization-ratio histogram — p99 should not shift more than 10% from pre-flip baseline
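
One way to read the p99 criterion is as a relative shift of at most 10% against the pre-flip baseline. The sketch below takes that reading, which is an assumption, and works on raw utilization ratios with a nearest-rank percentile. In production the value would more likely come from the histogram's buckets, so treat this as an illustration of the comparison only; the module name is hypothetical.

```ruby
# Sketch only: HistogramSoak is a hypothetical name, and raw samples stand in
# for the bucketed utilization-ratio histogram.
module HistogramSoak
  MAX_RELATIVE_SHIFT = 0.10

  # Nearest-rank percentile over a non-empty list of values.
  def self.percentile(values, pct)
    sorted = values.sort
    rank = (pct * sorted.size / 100.0).ceil
    sorted[[rank, 1].max - 1]
  end

  # True when p99 after the flip is within 10% (relative) of the baseline p99.
  def self.p99_stable?(baseline, post_flip)
    before = percentile(baseline, 99)
    after = percentile(post_flip, 99)
    return after.zero? if before.zero?

    ((after - before).abs / before.to_f) <= MAX_RELATIVE_SHIFT
  end
end
```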
### 6. Rollback
- Flip `rate_limiter_use_labkit_<key>_enforce` off → enforcement returns to legacy within seconds
- Flip `rate_limiter_use_labkit_<key>` off → stops labkit Redis writes entirely
- The two flags are independent and can be flipped separately
### 7. Sequence
Roll out lowest-traffic key first, one key per deploy cycle. Sequence within each cohort is determined by the cohort issue.
## Requirements for cohort completion
A cohort is complete when:
- All keys in the cohort have `rate_limiter_use_labkit_<key>` and `rate_limiter_use_labkit_<key>_enforce` both enabled in production
- Shadow validation passed for each key (< 0.5% divergence)
- Enforcement soak passed for each key (24h, histogram stable)
- Evidence posted to the cohort issue for each key