agent -> kas: rate limit on a per-token basis

Goals

To keep the Agent scalable and to mitigate abuse, we should enable rate limiting at the token auth request level.

Definition of Done

Rate limiting is enabled at N new connections per minute per token, where N is configurable in Kas.

Background

Each agentk opens a long-lived connection, along which responses are streamed back by Kas at an interval controlled by Kas.
An agentk can only trigger a Gitaly call by starting a new connection.
Therefore, Kas should rate limit new connections per token.

Proposal

A Redis-based rate limiter inside Kas, implemented as gRPC middleware.

1. Redis client: redigo

This will be the first internal use of Redis, so we need to bring in a Redis client. We should use redigo because it is:

already used by Workhorse
one of the two officially recommended Redis clients at the time of writing
actively maintained

2. Algorithm: Discrete minutely buckets

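This algorithm is efficient and simple to implement, and it should be good enough for our use case. It counts new connections in discrete one-minute buckets. Because a burst can straddle a bucket boundary (the end of one bucket followed by the start of the next), a sliding 1-minute interval can see up to twice the per-bucket limit. So if we want N to be the maximum in any 1-minute interval, we should set the per-bucket limit to N/2.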

3. Proposed default value of `N`: "Small", 100 new connections per token per minute

The connections are expected to be long-lived. 100 new connections per token per minute is probably excessive, but it should also be safe while we have low usage. We can revisit this in the future once we have data on the reconnect frequency distribution and better understand the impact of a reconnect on system load.

4. Code-level implementation details

Implement a rate limiting gRPC middleware
Use interfaces instead of concrete structures to facilitate testing
Redis should be a soft dependency (possible to disable) for the first iteration, because we do not have a Redis instance yet
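The three points above could fit together roughly as follows. This is a sketch under stated assumptions, not the actual Kas code: `RateLimiter`, `AllowAll`, `Handler`, `WithRateLimit`, `fixedLimiter`, and `demoMiddleware` are hypothetical names, and `Handler` is a simplified stand-in for a gRPC handler so the example needs only the standard library. A real implementation would wrap a gRPC interceptor and would likely return a gRPC status instead of a plain error. Failing closed when the limiter itself errors is also an assumption; the proposal does not say which way that should go.

```go
package main

import (
	"context"
	"errors"
)

// ErrRateLimited is returned when a token exceeds its limit (hypothetical).
var ErrRateLimited = errors.New("rate limit exceeded")

// RateLimiter is the interface the middleware depends on, so tests can swap
// in a fake instead of talking to Redis.
type RateLimiter interface {
	Allow(ctx context.Context, token string) (bool, error)
}

// AllowAll is the limiter used when Redis is disabled (soft dependency).
type AllowAll struct{}

func (AllowAll) Allow(context.Context, string) (bool, error) { return true, nil }

// Handler is a simplified stand-in for a gRPC handler.
type Handler func(ctx context.Context, token string) error

// WithRateLimit wraps next so each new connection is checked first.
// A nil limiter means rate limiting is disabled.
func WithRateLimit(limiter RateLimiter, next Handler) Handler {
	if limiter == nil {
		limiter = AllowAll{}
	}
	return func(ctx context.Context, token string) error {
		ok, err := limiter.Allow(ctx, token)
		if err != nil {
			return err // assumption: fail closed if the limiter is unavailable
		}
		if !ok {
			return ErrRateLimited
		}
		return next(ctx, token)
	}
}

// fixedLimiter is a test double that allows the first budget calls.
type fixedLimiter struct{ budget int }

func (f *fixedLimiter) Allow(context.Context, string) (bool, error) {
	f.budget--
	return f.budget >= 0, nil
}

// demoMiddleware sends three connections through a limiter with a budget of
// two and reports how many were served and how many were rejected.
func demoMiddleware() (served, rejected int) {
	handler := WithRateLimit(&fixedLimiter{budget: 2}, func(context.Context, string) error {
		served++
		return nil
	})
	for i := 0; i < 3; i++ {
		if err := handler(context.Background(), "token-a"); errors.Is(err, ErrRateLimited) {
			rejected++
		}
	}
	return served, rejected
}
```

Because the middleware only sees the `RateLimiter` interface, the Redis-backed limiter, the disabled case, and the test double are interchangeable without touching the wrapping logic.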

Edited Oct 08, 2020 by Hordur Freyr Yngvason