agent -> kas rate limit on per token basis

Goals

To ensure scalability of the Agent and mitigate abuse, we should enable rate limiting at the token auth request level.

Definition of Done

  • Enable rate limiting to N new connections per minute per token, where N is configurable in Kas

Background

  • Each agentk opens a long-lived connection, along which responses are streamed back by Kas at an interval controlled by Kas
  • An agentk can only trigger a gitaly call by starting a new connection
  • So Kas should rate limit new connections per token

Proposal

A redis-based rate limiter within Kas, implemented as gRPC middleware

1. Redis client: redigo

This will be the first internal use of redis, so we need to bring in a redis client. We should use redigo because it is

  1. already used by workhorse
  2. one of the two officially recommended redis clients at the time of writing
  3. actively maintained

2. Algorithm: Discrete minutely buckets

This is an efficient and simple-to-implement algorithm which should be good enough for our use case. It counts new connections in discrete one-minute buckets. Any one-minute interval can overlap at most two adjacent buckets, so if we want N to be the maximum in any one-minute interval, we should set the limit per bucket to N/2.

3. Proposed default value of N: "Small", 100 new connections per token per minute

The connections are expected to be long-lived. 100 new connections per token per minute is probably excessive, but it should also be safe while we have low usage. We can revisit this in the future once we have data on the reconnect frequency distribution and better understand the impact of a reconnect on system load.

4. Code-level implementation details

  • Implement a rate limiting gRPC middleware
  • Use interfaces instead of concrete structures to facilitate testing
  • Redis should be a soft dependency (possible to disable) for the first iteration, because we do not have a redis instance yet
Edited by Hordur Freyr Yngvason