Draft: [Spec 7] Stage 1b: Rule names — stable identity in keys, logs, and headers (labkit-ruby)
## Spec
### Problem Statement [required]
Stage 1a (Spec 5 / [labkit-ruby!270](https://gitlab.com/gitlab-org/ruby/gems/labkit-ruby/-/merge_requests/270)) ships the `Labkit::RateLimit.check(call_site:, identifier:, rules:)` entry point, the `Identifier`/`Rule` value objects, and the `Evaluator` that does the per-rule Redis counting. Each matching rule gets an independent counter, with the key shape:
```
labkit:rl:{call_site}:{rule_index}:{char_key}:{char_value}
```
`rule_index` is the rule's 0-based position in the `rules` array. It was introduced in Spec 5 as a uniqueness suffix so that two rules sharing characteristics on the same call site never collide — but it has three problems that this spec addresses:
1. **Counters silently move when a rule is reordered or removed.** Inserting a new rule at index 0 reassigns every existing rule's counter to a new key. In-flight counts are abandoned mid-window, which is invisible at runtime and dangerous during a live rollout (e.g. a SRE adding a temporary `:log` rule to investigate an incident inadvertently resets every other counter for that call site).
2. **Logs and response headers cannot identify which rule fired in human terms.** Spec 5's log payload includes `rule_index: 0`, not a name. The epic's 1b/1c sections explicitly call for a `name` field "for logging and response headers" so that an SRE reading a log or a customer reading `RateLimit-Remaining` can identify *which* rule fired (e.g. `"authenticated_api"` vs `"unauthenticated_api"`).
3. **Spec 6 / Stage 1c (response-header state) has no stable handle to expose.** Its `RuleState` currently carries `rule_index`, which is meaningless to a customer or downstream log consumer.
Stage 1b closes this gap by introducing a required `name` attribute on `Rule`, validating its shape, requiring uniqueness within a single `.check` call, and threading it through the Redis key, the structured log payload, and the data passed to Stage 1c's `RuleState`. The `Labkit::RateLimit.check` signature is unchanged; the change is in the rule shape callers pass.
This is the "real" Stage 1b from epic [#2021](https://gitlab.com/groups/gitlab-com/gl-infra/-/work_items/2021). Stage 1a (Spec 5) bundled the `RateLimit.check` orchestration alongside the identifier work; this spec finishes the Stage 1b scope from the epic — rule identity — without redoing what Spec 5 already shipped.
### Non-Goals [required]
- No changes to `Labkit::RateLimit.check` parameter list (still `call_site:`, `identifier:`, `rules:`)
- No changes to `Labkit::RateLimit.check` return type (still `:allow` / `:block` Symbol — Spec 6 / Stage 1c is responsible for the richer `Result` return type)
- No changes to `Identifier` or `Evaluator` orchestration logic beyond key construction and log payload
- No persistence/migration of existing `rule_index`-keyed counters: the Stage 1a key format has no production callers (Stage 2 has not begun); old counters age out on their natural TTL
- No support for a "rule namespace" or hierarchical name (e.g. `api.authenticated_api`); names are flat strings
- No support for unnamed rules (legacy / convenience)
- No expression-matching, OR-matching, or rule fallback ordering — that remains future work
- No Rails integration (Stage 2)
### Acceptance Criteria [required]
**Scenario 1: Rule requires `name`**
- Given: a rule built without a `name:` keyword
- When: `Rule.new(match: { user: 42 }, characteristics: [:user], limit: 100, period: 60, action: :block)` is called
- Then: `ArgumentError` is raised with a message containing `"name"`; test: `expect { Rule.new(...) }.to raise_error(ArgumentError, /name/)`
**Scenario 2: Valid `name` is preserved on the rule**
- Given: `Rule.new(name: "authenticated_api", match: { user: 42 }, characteristics: [:user], limit: 100, period: 60, action: :block)`
- When: `rule.name` is read
- Then: returns the symbol-equivalent or frozen string `"authenticated_api"` (whichever the spec settles on; tests assert equality against the canonical form documented in the gem's README)
**Scenario 3: Invalid `name` raises in dev/test**
- Given: `RAILS_ENV` is `"development"` or `"test"`; a rule with `name: "Authenticated API"` (contains uppercase and space)
- When: `Rule.new(...)` (or `.check` with that rule in the array) is called
- Then: `ArgumentError` is raised with a message naming the invalid `name` value and the format requirement; test: `expect { Rule.new(name: "Authenticated API", ...) }.to raise_error(ArgumentError, /name/)`
**Scenario 4: Invalid `name` is sanitized in production**
- Given: `RAILS_ENV` is `"production"` (or unset / not in `[development, test]`); a rule with `name: "Authenticated API!"`
- When: `.check` is called with that rule in the array
- Then: no exception; the name is sanitized by replacing characters not matching `/\A[a-z0-9_]+\z/` with `_` and lowercasing, producing `"authenticated_api_"`; a WARN-level structured log entry is written with `message: "rate_limit_invalid_rule_name"`, the original name, the sanitized name, and the `call_site`; the call proceeds with the sanitized name in the Redis key and log payload
**Scenario 5: Duplicate `name`s within one `.check` call raise in dev/test**
- Given: `RAILS_ENV` is `"test"`; a rules array with two rules sharing `name: "authenticated_api"`
- When: `.check` is called with that array
- Then: `ArgumentError` is raised with a message containing the duplicated name; test: `expect { check(rules: [r1, r1_dup]) }.to raise_error(ArgumentError, /authenticated_api/)`
**Scenario 6: Duplicate `name`s in production drop later occurrences**
- Given: `RAILS_ENV` is `"production"`; a rules array with two rules sharing `name: "authenticated_api"` (call them `r_first` and `r_dup`, in that order)
- When: `.check` is called
- Then: no exception; a WARN-level log with `message: "rate_limit_duplicate_rule_name"` is written naming the duplicated name and recording that one occurrence was dropped; only `r_first` is evaluated and only its counter is incremented; `r_dup` is silently skipped. Rationale: silently suffixing duplicates produces an unstable name (it depends on array ordering) and would surface in customer-facing response headers, leaking internal config layout. Dropping is loud in logs, deterministic, and avoids quota sharing.
**Scenario 7: Redis key uses `name` not index**
- Given: a rule `{ name: "authenticated_api", match: { user: 42 }, characteristics: [:user], limit: 100, period: 60, action: :block }` at any position in the rules array
- When: `.check` is called with `call_site: "rack_request"` and identifier `{ user: 42 }`
- Then: the Redis key incremented is `labkit:rl:rack_request:authenticated_api:user:42`; specifically, the key does NOT include the integer rule index
**Scenario 8: Reordering rules in the array does not move counters**
- Given: a rules array `[r_a, r_b]` where `r_a.name = "first"` and `r_b.name = "second"`; `.check` is called once with this array, incrementing both `r_a` and `r_b`'s counters by 1
- When: `.check` is called a second time with the array reordered to `[r_b, r_a]`
- Then: the key for `r_a` (`labkit:rl:rack_request:first:user:42`) has count 2 (not reset by the reordering); the key for `r_b` has count 2 likewise; no key with `:0:` or `:1:` segments exists
**Scenario 9: Two rules with the same characteristics but different names do not collide**
- Given: rules array with `r_a = { name: "rule_a", characteristics: [:user], ... }` and `r_b = { name: "rule_b", characteristics: [:user], ... }`; both match
- When: `.check` is called with identifier `{ user: 42 }`
- Then: two distinct keys are incremented: `labkit:rl:rack_request:rule_a:user:42` and `labkit:rl:rack_request:rule_b:user:42`; deleting `rule_a`'s key does not affect `rule_b`'s counter
**Scenario 10: Structured log includes `rule_name`**
- Given: a matched rule with `name: "authenticated_api"` and `action: :block`, count below limit
- When: `.check` is called
- Then: an INFO-level structured log entry is emitted with `rule_name: "authenticated_api"` (in addition to the fields already in Spec 5's log payload — `call_site`, `action`, `limit`, `period`, `count`, `matched`, `exceeded`, `identifier`, `redis_key`); `rule_index` is no longer present in the payload
**Scenario 11: WARN log when a `:block` rule is exceeded includes `rule_name`**
- Given: a matched rule with `name: "authenticated_api"`, `action: :block`, count above limit
- When: `.check` is called
- Then: a WARN-level structured log entry is emitted with `rule_name: "authenticated_api"`, `exceeded: true`, and `severity: "WARN"`; the aggregate return value is `:block`
**Scenario 12 (failure case): `name` is not a String or Symbol**
- Given: a rule built with `name: 42` (Integer)
- When: `Rule.new(...)` is called
- Then: `ArgumentError` is raised with a message indicating `name` must be a `String` or `Symbol`; test: `expect { Rule.new(name: 42, ...) }.to raise_error(ArgumentError, /name/)`
**Scenario 13 (failure case): `name` exceeds maximum length**
- Given: a rule with `name:` set to a 65-character string of valid characters
- When: `Rule.new(...)` is called in `RAILS_ENV=test`
- Then: `ArgumentError` is raised; the maximum length is **64 characters**, chosen so that the human-readable segment of the Redis key fits well within typical observability tool truncation limits (the surrounding key shape — `labkit:rl:` + a ~32-char `call_site` + `:` + 64-char name + `:` + char_key + `:` + char_value (≤200 raw or 64-char SHA-256 hex) — stays under 256 bytes for the common case). Production sanitizes by truncating to 64 and emitting a WARN.
**Scenario 14: Sanitized name is still unique-checked**
- Given: `RAILS_ENV=production`; rules array `[{ name: "Foo!" }, { name: "foo_" }]` — after sanitization, both become `"foo_"`
- When: `.check` is called
- Then: this is treated as a duplicate (Scenario 6 path) — the second occurrence is dropped with a WARN. The Redis key for the surviving rule is `labkit:rl:rack_request:foo_:user:42`.
### Security Considerations [required]
- **Redis key surface:** `name` becomes part of the Redis key. The `/\A[a-z0-9_]+\z/` validation prevents key-format injection (no `:` separator characters; no `*` glob characters that could mis-target `KEYS`/`SCAN`). The 64-character limit prevents unbounded key growth from a misconfigured caller.
- **Log injection:** `name` appears in structured log payloads. Same validation applies — no control characters, no quotes, no JSON-breaking content. The sanitization-in-production path runs *before* the value reaches the log output; the original (unsanitized) value is included in the `rate_limit_invalid_rule_name` WARN entry as a separate `original_name` field so log consumers can still see what the caller passed.
- **Quota theft via duplicate names:** Two rules with the same `name` (after sanitization) would share a counter, allowing one rule's traffic to consume another rule's quota. The dev/test raise path makes this loud during development; the production drop-with-WARN path eliminates the sharing without raising and without producing an unstable disambiguated name that would leak into customer-facing headers.
- **PII in names:** Names are caller-supplied static strings (e.g. `"authenticated_api"`), not derived from request data. There is no path by which user input can become part of a `name`. This is documented in the gem README and enforced socially, not technically — the same way `call_site` is treated.
- **Backward-compatibility key collision:** The new key shape (`:{name}:`) and the old key shape (`:{rule_index}:`) could in principle collide if a caller chose an all-numeric `name`. This is not prevented by validation — it is acceptable because no production caller exists yet (Stage 2 has not started), so there are no live `rule_index`-shaped counters to collide with. Once Stage 2 begins, callers should use descriptive names; this is a README convention, not a code-enforced rule.
### Rollout & Backwards Compatibility [required]
- **Depends on Spec 5 / MR [!270](https://gitlab.com/gitlab-org/ruby/gems/labkit-ruby/-/merge_requests/270) being merged first.** This spec edits files (`rule.rb`, `evaluator.rb`, README) that Spec 5 introduces. If !270 is still in review when implementation of this spec starts, this spec's MR is rebased onto Spec 5's branch.
- **No production callers exist.** Spec 5 is gem-only; Stage 2 (Rails integration) has not started. Therefore no live Redis counters use the old key shape, and no live log consumers parse `rule_index`. The `rule_index → name` switch is a clean replacement, not a migration.
- **Self-managed / Dedicated / Cells:** Unaffected — gem is not loaded from any new callsite until Stage 2. The new gem version is a minor bump on the 0.x line.
- **Versioning:** Minor version bump of `labkit-ruby`. Rails `Gemfile` is not updated until Stage 2.
- **No feature flag required:** Gem-level change with no live callers; nothing to flag.
- **README update:** The `lib/labkit/rate_limit/README.md` "Counter key construction" section is updated to use `name` in place of `rule_index`. The "Multi-characteristic rules: OR semantics" diagram is updated. The "What is intentionally not here" section's "Response-header state" bullet remains; "Rule names" is removed from any future-work list since this spec lands them.
- **Stage 1c (Spec 6 / 28785) coordination:** Spec 6's `RuleState` is currently spec'd with `rule_index`. The clean ordering is: Spec 6's body is updated to carry `name` instead of `rule_index` *before* Spec 6's implementation begins. If Spec 7 is posted before Spec 6 implementation starts, leave a coordinating note on 28785 referencing this spec so the next person picking up Spec 6 sees it. If Spec 6 has already started, this spec rebases onto Spec 6's branch and `RuleState` is updated as part of this MR.
### Validation Loop / Verification Process [required]
All of the following must pass on the implementation MR. Full output posted as a comment on the issue before the MR is marked ready for human review.
```bash
# In labkit-ruby repo, after rebasing onto Spec 5's branch (or master if !270 is merged):
# Full rate-limit suite
mise exec -- bundle exec rspec \
spec/labkit/rate_limit_spec.rb \
spec/labkit/rate_limit/identifier_spec.rb \
spec/labkit/rate_limit/rule_spec.rb \
spec/labkit/rate_limit/evaluator_spec.rb \
--format documentation
# Required-name validation
RAILS_ENV=test mise exec -- bundle exec rspec spec/labkit/rate_limit/rule_spec.rb -e "name is required"
# Format validation: dev/test raises, production sanitizes
RAILS_ENV=test mise exec -- bundle exec rspec spec/labkit/rate_limit/rule_spec.rb -e "invalid name"
RAILS_ENV=production mise exec -- bundle exec rspec spec/labkit/rate_limit_spec.rb -e "invalid name sanitized"
# Duplicate-name handling
RAILS_ENV=test mise exec -- bundle exec rspec spec/labkit/rate_limit_spec.rb -e "duplicate rule names raise"
RAILS_ENV=production mise exec -- bundle exec rspec spec/labkit/rate_limit_spec.rb -e "duplicate rule names dropped"
# Reorder-stability — the central regression test
mise exec -- bundle exec rspec spec/labkit/rate_limit/evaluator_spec.rb -e "reordering rules does not move counters"
# Rubocop
mise exec -- bundle exec rubocop \
lib/labkit/rate_limit/rule.rb \
lib/labkit/rate_limit/evaluator.rb \
spec/labkit/rate_limit/rule_spec.rb \
spec/labkit/rate_limit/evaluator_spec.rb \
spec/labkit/rate_limit_spec.rb
```
Files to modify:
- `lib/labkit/rate_limit/rule.rb` — add `name` to the `Struct.new(...)` keyword list; add validation in `initialize`
- `lib/labkit/rate_limit/evaluator.rb` — replace `rule_index` with rule `name` in `build_redis_key`; replace `rule_index` with `rule_name` in the log payload; add the duplicate-name check before the rule-evaluation loop
- `lib/labkit/rate_limit/README.md` — update key shape documentation, examples, and diagrams; remove any reference to `rule_index` in the on-call section
- `lib/labkit/rate_limit.rb` — if `.check` itself orchestrates duplicate-name detection (preferred — keeps `Evaluator` focused on counting), add it here
Files to create:
- (none — extends Spec 5's existing files)
Every Given/When/Then scenario maps to at least one `it` block. Tests must catch a no-op implementation: a `Rule` that ignores `name:` must fail Scenarios 7 and 9; an evaluator that retains `rule_index` in keys must fail Scenarios 7 and 8.
### Observability [optional]
Per-rule evaluation log payload after this spec lands (changes from Spec 5 marked with `←`):
```json
{
"severity": "INFO",
"message": "rate_limit_check",
"call_site": "rack_request",
"rule_name": "authenticated_api", // ← new (replaces rule_index)
"action": "block",
"limit": 100,
"period": 60,
"count": 42,
"matched": true,
"exceeded": false,
"identifier": { "user": 42, "ip": "1.2.3.4" },
"redis_key": "labkit:rl:rack_request:authenticated_api:user:42"
}
```
New WARN events introduced by this spec:
```json
{ "severity": "WARN", "message": "rate_limit_invalid_rule_name", "call_site": "rack_request",
"original_name": "Authenticated API!", "sanitized_name": "authenticated_api_" }
{ "severity": "WARN", "message": "rate_limit_duplicate_rule_name", "call_site": "rack_request",
"name": "authenticated_api", "dropped_occurrence": 2 }
```
On-call filters:
- `message=rate_limit_check AND rule_name="authenticated_api" AND exceeded=true` — find requests blocked by a specific rule
- `message=rate_limit_invalid_rule_name` — find misconfigured callers
- `message=rate_limit_duplicate_rule_name` — find accidental rule duplication (e.g. a feature flag adding a rule that already exists statically); the `dropped_occurrence` field tells you which 1-based position in the rules array was discarded
On-call key inspection — given a `rate_limit_check` event, the `redis_key` field is still directly usable; nothing changes about the inspection workflow except that the second-to-last key segment is now a human-readable name rather than an integer index.
---
issue