Draft: Stage 2a Cohort 2 — Remaining IncrementPerAction keys
Migrate 95 `IncrementPerAction` call sites (83 CE + 12 EE) in `Gitlab::ApplicationRateLimiter` to `Labkit::RateLimit`, gated by a shared `rate_limiter_use_labkit_cohort_2` / `_enforce` flag pair. Builds on Cohort 1 (#28803) without disturbing its per-key flag rollout.

Overarching issue: https://gitlab.com/gitlab-com/gl-infra/production-engineering/-/work_items/28808

Cohort 1 (reference): https://gitlab.com/gitlab-com/gl-infra/production-engineering/-/work_items/28803

Parent epic: https://gitlab.com/groups/gitlab-com/gl-infra/-/work_items/2021

## Pattern changes from Cohort 1

Cohort 2 reuses Cohort 1's adapter module and dispatch boundary, with three deliberate extensions:

1. **Cohort-wide flag pair instead of per-key flags.** `rate_limiter_use_labkit_cohort_2` and `rate_limiter_use_labkit_cohort_2_enforce` gate every Cohort 2 entry. Per-key flags would have meant ~190 YAMLs and 95 individual flag flips; per-key telemetry is preserved at the Prometheus layer (`gitlab_rate_limiter_labkit_shadow_total{key=...}`), so shadow validation and enforce-flip decisions still happen per key. Selection between per-key (Cohort 1) and cohort-wide (Cohort 2+) flag basis is encoded by an optional `flag_scope:` field on each registry entry.
2. **`SUPPORTED_RATE_LIMITS` registry extracted to a sibling file.** Now lives at `lib/gitlab/application_rate_limiter/labkit_adapter/supported_rate_limits.rb`, exposed via `SupportedRateLimits.all` (memoized + frozen) built from `SupportedRateLimits.entries` (overridable). EE-only keys are merged via `super.merge` in `ee/lib/ee/gitlab/application_rate_limiter/labkit_adapter/supported_rate_limits.rb`, prepended onto the CE registry through the standard `prepend_mod` mechanism.
3. **Type-aware identifier building.** The adapter routes scope values to characteristics by AR class (`User → :user`, `Project → :project`, `Group → :group`, `Namespace → :namespace`, `Environment → :environment`, `Ci::PipelineSchedule → :ci_pipeline_schedule`, `Import::SourceUser → :import_source_user`, `Key → :key`); non-AR values fill the remaining characteristics positionally. Polymorphic positions (`web_hook_test` parent is Project or Group; `members_delete` source is Project, Group, or Namespace; `actor` characteristics expand to `%i[user ip]`) list every accepted name; labkit's `_unknown_` sentinel fills the alternatives that aren't in scope, distinguishing Redis keys per real type. This shape caught a latent collision in `web_hook_event_resend`, which would otherwise conflate Project#5 and Group#5 hook resends on `_enforce` flip.

## Cohort 1 entries (preserved as-is)

`pipelines_create`, `notes_create`, `search_rate_limit`, `users_get_by_id`, `user_sign_in`. Per-key flags retained; staged rollout from #28803 / !233816 unaffected.

## Cohort 2 entries

83 CE entries enumerated in `lib/gitlab/application_rate_limiter/labkit_adapter/supported_rate_limits.rb`.

12 EE entries in `ee/lib/ee/gitlab/application_rate_limiter/labkit_adapter/supported_rate_limits.rb`: `ai_catalog_item_report`, `code_suggestions_connection_details`, `code_suggestions_direct_access`, `code_suggestions_x_ray_dependencies`, `code_suggestions_x_ray_scan`, `create_duo_otel_workflow`, `credit_card_verification_check_for_reuse`, `dependency_scanning_sbom_scan_api_throttling`, `duo_workflow_direct_access`, `orbit_query`, `package_metadata`, `virtual_registries_endpoints_api_limit`.

Three keys initially omitted were added during review: `email_verification`, `web_hook_test`, `project_testing_integration`.
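
The type-aware identifier building described under "Pattern changes from Cohort 1" can be illustrated with a minimal sketch. This is not the adapter's actual code: `build_identifier`, `TYPE_TO_CHARACTERISTIC`, and the `Struct` stand-ins for the AR models are hypothetical names introduced only for illustration, and the routing table covers just three of the listed classes. It assumes scope values arrive as an array and that unfilled characteristics receive the `_unknown_` sentinel.

```ruby
# Stand-ins for ActiveRecord models so the sketch runs without Rails.
User    = Struct.new(:id)
Project = Struct.new(:id)
Group   = Struct.new(:id)

# Hypothetical routing table (the description lists more classes than this).
TYPE_TO_CHARACTERISTIC = { User => :user, Project => :project, Group => :group }.freeze

# Sentinel value named in the description above.
UNKNOWN = '_unknown_'

# Hypothetical helper: maps an array of scope values onto named characteristics.
def build_identifier(characteristics, scope)
  # Split scope values into those routed by class and those filled by position.
  typed, positional = scope.partition { |v| TYPE_TO_CHARACTERISTIC.key?(v.class) }

  identifier = {}
  typed.each do |value|
    name = TYPE_TO_CHARACTERISTIC.fetch(value.class)
    identifier[name] = value.id.to_s if characteristics.include?(name)
  end

  # Remaining characteristics take non-AR values in order; anything left
  # unfilled gets the sentinel so the alternatives stay distinguishable.
  remaining = characteristics - identifier.keys
  remaining.zip(positional).each do |name, value|
    identifier[name] = value.nil? ? UNKNOWN : value.to_s
  end

  # Return the hash in the declared characteristic order.
  characteristics.to_h { |name| [name, identifier.fetch(name)] }
end

polymorphic = %i[user project group]
build_identifier(polymorphic, [User.new(1), Project.new(5)])
# => {:user=>"1", :project=>"5", :group=>"_unknown_"}
build_identifier(polymorphic, [User.new(1), Group.new(5)])
# => {:user=>"1", :project=>"_unknown_", :group=>"5"}
```

Under these assumptions, Project#5 and Group#5 produce different identifiers, which is the kind of separation the `web_hook_event_resend` collision called for.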
## Out of scope

Documented in the exclusion comment block at the top of `supported_rate_limits.rb`:

- **Multi-modal keys** (peek + non-peek call sites; would silently see zero on peek under enforce): `notification_emails`, `glql`, `permanent_email_failure`, `temporary_email_failure`, `update_namespace_name`, `hard_phone_verification_transactions_limit`, `soft_phone_verification_transactions_limit`, `web_hook_calls{,_low,_mid}`. `phone_verification_send_code` / `_verify_code` (no peek callers) are in Cohort 2.
- **Resource-strategy keys**: `IncrementPerActionedResource` and `IncrementResourceUsagePerAction`; the dispatcher already filters these.
- **Threshold-zero keys** (`web_hook_calls` trio without explicit thresholds): legacy short-circuits before any Redis call.
- **Override callers** (`threshold:` / `interval:` kwargs): override frequency is observable via `gitlab_rate_limiter_labkit_override_total`.
- **`peek` callers** stay on the legacy path. Adapter gate is `!peek && IncrementPerAction`.
- **EE keys with no current call sites**: `container_scanning_for_registry_scans`, `dependency_scanning_sbom_scan_api_download`, `dependency_scanning_sbom_scan_api_upload`, `semantic_code_search_ad_hoc_indexing`, `semantic_search_rate_limit`. Scope shape is unknown without a caller; reassess when one lands.
- **EE partner API keys** (`partner_aws_api`, `partner_gcp_api`, `partner_postman_api`): sub-second intervals (`interval: 1.second`) don't fit labkit's TTL-based windowing cleanly; deferred to a specialised rollout.

## Inherited contract from Cohort 1

- Adapter module: `lib/gitlab/application_rate_limiter/labkit_adapter.rb`. Each entry registers `limiter_name`, `rule_name`, `characteristics`, `action`, plus optional `flag_scope:`.
- Observability: existing `gitlab_rate_limiter_labkit_shadow_total{key,agreement,boundary}` (per-key labkit-vs-legacy agreement, with `boundary:true` tagging for sub-second window-edge skew) and `gitlab_rate_limiter_labkit_override_total{key,override}` Prometheus counters.
- Disjoint Redis namespaces by design: `application_rate_limiter:...` (legacy) and `labkit:rl:...` (labkit) increment independently during shadow validation. Counts will not match across namespaces.

## Labkit prerequisite

`gitlab-labkit ~> 1.17.0` (matches Cohort 1's floor).

Note: `Labkit::RateLimit::Result#action` reflects outcome (`:block` / `:log` / `:allow` / nil) rather than the rule's configured action. The configured action remains available via `result.rule.action`. The adapter relies on `result.exceeded?`.

## TODO

- [x] Enumerate the Cohort 2 keys (83 CE + 12 EE registered)
- [x] Register each in `SupportedRateLimits.entries` and the EE override
- [x] Add the cohort-wide wip feature flag YAMLs (`rate_limiter_use_labkit_cohort_2{,_enforce}`)
- [x] Extract registry to `supported_rate_limits.rb` sibling file with EE override module
- [ ] Open MR (decision pending: stack on Cohort 1 branch, or wait for #28803 / !233816 to merge then rebase)
- [ ] Confirm shadow-mode parity (cohort-wide flag enabled, enforce off) via `gitlab_rate_limiter_labkit_shadow_total` per key before flipping `_enforce`
- [ ] Follow the shadow validation and enforcement flip process from the overarching issue (#28808)
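
For reference while working through the shadow and enforce steps above, the gating rules in this description (adapter gate `!peek && IncrementPerAction`, then the cohort-wide flag pair) can be read as a small decision function. The sketch below is an illustration under assumptions, not the adapter's implementation: `labkit_mode`, the strategy symbols, and the in-memory `FLAGS` hash are hypothetical, and it assumes the `_enforce` flag has no effect unless the base flag is also enabled.

```ruby
# Flag names are taken from the description; the in-memory hash is a stand-in
# for the real feature-flag lookup.
FLAGS = {
  'rate_limiter_use_labkit_cohort_2' => true,
  'rate_limiter_use_labkit_cohort_2_enforce' => false
}.freeze

# Hypothetical helper: returns which path a Cohort 2 throttle check takes.
#   :legacy  - only the legacy limiter runs
#   :shadow  - legacy decides; labkit is evaluated alongside for comparison
#   :enforce - labkit decides
def labkit_mode(strategy:, peek:, flags: FLAGS)
  # Adapter gate: peek callers and non-IncrementPerAction strategies stay legacy.
  return :legacy if peek
  return :legacy unless strategy == :increment_per_action

  # Assumption: the enforce flag is ignored when the base flag is off.
  return :legacy unless flags.fetch('rate_limiter_use_labkit_cohort_2', false)

  flags.fetch('rate_limiter_use_labkit_cohort_2_enforce', false) ? :enforce : :shadow
end

labkit_mode(strategy: :increment_per_action, peek: false) # => :shadow
labkit_mode(strategy: :increment_per_action, peek: true)  # => :legacy
```

With the base flag on and `_enforce` off, as in the shadow-parity TODO item, every eligible call lands in `:shadow`, which is the state in which `gitlab_rate_limiter_labkit_shadow_total` is meaningful per key.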