Skip to content

Remove Gitaly rate limiter

Quang-Minh Nguyen requested to merge qmnguyen0711/remove-rate-limiter into master

For #5011

Gitaly supports RPC-based rate limiting, which theoretically serves as an effective defense mechanism by guarding Gitaly against a sudden influx of requests. However, this approach may prove to be less effective in practice. Currently, Gitaly serves as an upstream service that caters to other user-facing services, which means that a user-initiated action may ultimately translate into one, a dozen, or even thousands of RPCs directed towards Gitaly. Consequently, setting an appropriate limit that functions effectively in most scenarios is a task that is not easy to accomplish.

Git repositories are inherently diverse, and significant variations in latencies can occur between repositories or even between different parameters within the same repository. Such variations lead to significant differences in Gitaly's ability to handle a load. Given that rate-limiting's success is highly dependent on highly granular configurations that must be continuously adjusted until they are satisfactory, a solution that works in the current context can become obsolete quickly. Furthermore, recent Gitaly incidents suggest that malicious activities never reach the desired RPS threshold and are often indistinguishable from legitimate requests.

It's worth noting that Gitaly is never exposed to the outside world. Even in cases where Gitaly is deliberately exposed to the internet, it is necessary to layer Gitaly behind robust protection, such as Cloudflare and a sophisticated load balancer. These layers are vastly superior in safeguarding against malicious activities and offer greater flexibility.

In light of the above, other protective measures such as adaptive concurrency limits or load-shedding may be more appropriate. Common subsystems discourage rate-limiting and prefer employing other methods such as better resource allocation, management, and concurrency limits. Thus, it makes sense to consider alternative protection strategies because removing rate-limiting seems more appropriate in this context.

Changelog: removed

Merge request reports