agent -> kas rate limit on a per-token basis
Goals
To ensure scalability of the Agent and mitigate abuse, we should enable rate limiting at the token auth request level.
Definition of Done
- Enable rate limiting to N new connections per minute per token, where N can be configured with Kas
Background
- Each agentk opens a long-lived connection, along which responses are streamed back by Kas at an interval controlled by Kas
- An agentk can only trigger a gitaly call by starting a new connection
- So Kas should rate limit new connections per token
Proposal
A Redis-based rate limit within Kas, implemented as gRPC middleware
1. Redis client: redigo. This will be the first internal use of Redis, so we need to bring in a Redis client. We should use redigo because it is:
- already used by workhorse
- one of the two officially recommended Redis clients at the time of writing
- actively maintained
2. Algorithm: Discrete minutely buckets. This is an efficient and simple-to-implement algorithm which should be good enough for our use case. It works on discrete minutely buckets, so a burst that straddles a bucket boundary can use up the full limit of two adjacent buckets within a single 1-minute interval. If we want N to be the maximum in any 1-minute interval, we should therefore set the limit per bucket to N/2.
3. Proposed default value of N: "Small", 100 new connections per token per minute. The connections are expected to be long-lived. 100 new connections per token per minute is probably excessive, but it should also be safe while we have low usage. We can revisit this in the future once we have data on the reconnect frequency distribution and better understand the impact of a reconnect on system load.
4. Code-level implementation details
- Implement a rate limiting gRPC middleware
- Use interfaces instead of concrete structures to facilitate testing
- Redis should be a soft dependency (possible to disable) for the first iteration, because we do not have a Redis instance yet