feat(rate_limit)!: cost-aware atomic increment via Lua script
What this MR does
Implements gitlab-com/gl-infra/production-engineering#28827 (closed). Consolidates the labkit rate-limit Redis interaction into a single atomic Lua script and extends the counter primitive to support arbitrary per-call cost via INCRBYFLOAT. Lays the foundation for cohort 5 (gitlab-com/gl-infra/production-engineering#28812 (closed), IncrementResourceUsagePerAction) without requiring a parallel float-cost path in the Evaluator.
The atomicity property fixes a latent race in the prior pipelined implementation: INCR + EXPIRE were two separate Redis operations, so a Redis crash between them left keys without TTL. The Lua script runs as one operation; no such window exists.
Design
See the design discussion on gitlab-com/gl-infra/production-engineering#28827 (closed) for the full rationale. Summary:
- Single Lua script handles both count-mode (cost=1, INCRBYFLOAT against integer-encoded keys, transparent to existing rules) and cost-mode (fractional cost for cohort 5's resource-usage callers).
- cost=0 short-circuits to a GET so resource-usage callers that observed zero usage do not allocate a Redis key.
- ttl_before < 0 self-heals keys without TTL (defensive recovery from any prior pipelining bug).
- EVALSHA with NOSCRIPT fallback to EVAL handles Redis restarts that wipe the script cache.
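For reviewers who want the script's decision flow in one place, here is a minimal in-memory Ruby model of the semantics listed above. It is a sketch, not the Lua script in this MR: `FakeCounterStore`, `strip_ttl!`, and the `window:` parameter are hypothetical names, and the model does not count TTLs down. The only Redis facts it relies on are that TTL returns -2 for a missing key and -1 for a key without expiry.

```ruby
# In-memory stand-in for the two things the script touches per key:
# the counter value and its TTL. All names here are hypothetical.
class FakeCounterStore
  Entry = Struct.new(:value, :ttl)

  def initialize
    @entries = {}
  end

  # Mirrors Redis TTL: -2 for a missing key, -1 for a key with no expiry.
  def ttl(key)
    entry = @entries[key]
    return -2 if entry.nil?

    entry.ttl.nil? ? -1 : entry.ttl
  end

  def key?(key)
    @entries.key?(key)
  end

  # Models one script invocation. Against real Redis the whole body would run
  # as a single Lua script, so no other command can interleave between steps.
  def increment(key, cost:, window:)
    entry = @entries[key]

    # cost=0: read-only path (GET); never allocates a key.
    if cost.zero?
      return entry ? entry.value : 0.0
    end

    ttl_before = ttl(key)
    entry ||= (@entries[key] = Entry.new(0.0, nil))
    entry.value += cost.to_f # INCRBYFLOAT

    # EXPIRE when the key was just created (-2) or had lost its TTL (-1).
    entry.ttl = window if ttl_before < 0
    entry.value
  end

  # Helper for experiments: simulate a key left behind without a TTL.
  def strip_ttl!(key)
    @entries.fetch(key).ttl = nil
  end
end
```

Calling `increment` with `cost: 0` on a fresh store returns `0.0` and leaves `key?` false; a later call with a positive cost on a key whose TTL was stripped restores the TTL to `window`.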
API surface
# New cost: keyword, default 1 preserves existing call-site behavior
Labkit::RateLimit::Limiter#check(identifier, cost: 1)
Labkit::RateLimit.check(name:, identifier:, rules:, cost: 1)
Limiter#peek is unchanged: it doesn't increment, so it has no cost semantics.
Breaking change disclosure
Result::Info#count is now Float (previously documented as Integer). INCRBYFLOAT returns a string-encoded number that the evaluator parses uniformly as Float for both integer-valued and fractional counters. Implications:
- result.exceeded? and result.action are unaffected.
- Numeric comparisons against Integer thresholds work via Ruby's coercion (Float > Integer is well-defined).
- Pattern-matching on Integer or strict-type checks on info.count / info.remaining will break.
- to_response_headers coerces back to Integer per the IETF RateLimit header fields draft (draft-ietf-httpapi-ratelimit-headers), so HTTP-header consumers see no change.
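The implications above can be checked in plain Ruby. The helper names below (`parse_count`, `describe_count`, `header_count`) are hypothetical and only stand in for the evaluator and `to_response_headers`. In particular, `to_i` is an assumption about how the coercion back to Integer is done; the actual rounding rule may differ.

```ruby
# INCRBYFLOAT replies with a string; parsing it uniformly yields a Float
# even when the counter holds a whole number.
def parse_count(reply)
  Float(reply)
end

# A strict type match of the kind that breaks once count becomes a Float.
def describe_count(count)
  case count
  in Integer then "integer"
  in Float then "float"
  end
end

# Illustrative header coercion (assumption: plain truncation via to_i).
def header_count(count)
  count.to_i
end

count = parse_count("3")     # 3.0
count > 3                    # false: Float/Integer comparison coerces as expected
count == 3                   # true: numeric equality also coerces
count.is_a?(Integer)         # false: strict type checks now fail
describe_count(count)        # "float", where callers previously saw "integer"
header_count(count)          # 3, an Integer again
```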
A git grep result.info.count in gitlab-org/gitlab returned zero hits; only exceeded? and action are consumed downstream.
Reviewer notes
- spec/labkit/rate_limit/evaluator_spec.rb is a full rewrite from mock-based to TestRedis-backed, mirroring rate_limit_spec.rb's idiom now that !289 (merged) makes real Redis available in CI. Recommend reading the new file as a standalone, not as a line-by-line diff.
- spec/support/test_redis.rb Time.zone.now → Time.now is a drive-by fix; split into its own commit (a51a33d). Time.zone.now requires ActiveSupport::TimeWithZone to be loaded, which is not guaranteed on the first cold-spec TestRedis.reset! call. Plain Time.now has equivalent semantics for the 15-second docker compose wait.
- For cohort 5 reviewers (gitlab-com/gl-infra/production-engineering#28812 (closed)): when this MR is wired into Gitlab::ApplicationRateLimiter::LabkitAdapter, callers passing cost: 0 will observe the actual current count via GET (and may block over-quota workers). The legacy IncrementResourceUsagePerAction#increment returns 0 unconditionally when @usage == 0, so over-quota workers currently get a free pass on zero-cost requests. This is a real behavior change at enforce-time. Worth surfacing on the cohort 5 issue before flipping the enforce flag.
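To make the cost: 0 behavior change in the last note concrete, here is a small side-by-side sketch. The three functions are hypothetical stand-ins, not the real `IncrementResourceUsagePerAction` or `LabkitAdapter` code, and the "over quota when the reported count exceeds the threshold" rule is an assumption made for illustration.

```ruby
# Legacy behavior as described above: zero usage reports 0 regardless of
# what is already stored for the key.
def legacy_reported_count(stored_count, usage)
  return 0 if usage.zero?

  stored_count + usage
end

# Behavior after this MR: cost 0 reads the stored count (GET) without writing.
def script_reported_count(stored_count, cost)
  return stored_count.to_f if cost.zero?

  stored_count + cost.to_f
end

# Assumption for illustration: over quota means strictly above the threshold.
def over_quota?(reported_count, threshold)
  reported_count > threshold
end

stored = 120.0 # a worker already past its limit
threshold = 100

over_quota?(legacy_reported_count(stored, 0), threshold) # false: free pass
over_quota?(script_reported_count(stored, 0), threshold) # true: may be blocked
```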
Test plan
- Full labkit-ruby spec suite passes (1796 examples, 0 failures) against the docker-compose Redis introduced by !289 (merged).
- Cost-mode integration scenarios in spec/labkit/rate_limit_spec.rb cover cost=1 INCR parity, fractional accumulation, cost=0 read-without-write, missing-key handling, integer-limit overrun with fractional cost, TTL on first fractional write.
- EVALSHA / NOSCRIPT recovery is exercised explicitly in evaluator_spec.rb.
- Self-healing of TTL=-1 keys is exercised in evaluator_spec.rb.
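For readers unfamiliar with the EVALSHA / NOSCRIPT recovery pattern, a minimal sketch against a fake server follows. Everything named here is hypothetical: `FakeScriptServer`, `NoScriptError`, `COUNTER_SCRIPT`, and `run_counter_script` are illustrative only, and a real client raises its own command error for NOSCRIPT replies. The sketch relies on two Redis facts: scripts are addressed by the SHA1 digest of their source, and EVAL loads the script into the cache as a side effect.

```ruby
require "digest"

# Hypothetical error standing in for the client's NOSCRIPT reply.
class NoScriptError < StandardError; end

# Minimal fake of a Redis script cache; not a real client.
class FakeScriptServer
  attr_reader :calls

  def initialize
    @cache = {}
    @calls = []
  end

  def evalsha(sha, keys, argv)
    @calls << :evalsha
    raise NoScriptError, "NOSCRIPT No matching script" unless @cache.key?(sha)

    run(@cache[sha], keys, argv)
  end

  def eval(script, keys, argv)
    @calls << :eval
    @cache[Digest::SHA1.hexdigest(script)] = script # EVAL also caches the script
    run(script, keys, argv)
  end

  # Simulates a restart that wipes the script cache.
  def flush_scripts!
    @cache.clear
  end

  private

  # Placeholder for script execution; the fake only echoes its inputs.
  def run(_script, keys, argv)
    "ran with keys=#{keys.inspect} argv=#{argv.inspect}"
  end
end

COUNTER_SCRIPT = "return 1" # placeholder body, not the script in this MR
COUNTER_SCRIPT_SHA = Digest::SHA1.hexdigest(COUNTER_SCRIPT)

# Try the cheap path first; fall back to sending the full source once.
def run_counter_script(client, keys, argv)
  client.evalsha(COUNTER_SCRIPT_SHA, keys, argv)
rescue NoScriptError
  client.eval(COUNTER_SCRIPT, keys, argv)
end
```

On a cold cache the first call records `[:evalsha, :eval]`; subsequent calls use EVALSHA alone until the cache is flushed again.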