Implement rate limiter for GLQL requests that time out

What does this MR do and why?

Please refer to this issue description https://gitlab.com/gitlab-org/gitlab/-/issues/517542 for more information on the problem we are trying to solve.

This MR implements a rate limiter for GLQL queries. Now, if a query fails twice within a 15‑minute window, further attempts are blocked.

We identify a slow/failed query by the SHA we generate from the query itself.
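As a rough illustration of identifying a query by its SHA, here is a minimal sketch. It assumes SHA-256 over the raw query text, which is consistent with the 64-character hex key shown in the testing steps but is not confirmed here. The helper name `glql_query_sha` is hypothetical.

```ruby
require 'digest'

# Hypothetical helper: derive a stable identifier for a GLQL query.
# Assumption: SHA-256 of the raw query string, with no normalization.
def glql_query_sha(query)
  Digest::SHA256.hexdigest(query)
end

query = 'group = "gitlab-org" and label != "bug::vulnerability"'

# The same query text always maps to the same identifier, so repeated
# failures of one query can be counted under a single key.
same_sha = glql_query_sha(query) == glql_query_sha(query.dup)

# Any textual difference produces a different identifier.
other_sha = glql_query_sha(query + ' ')
```

Under this assumption, two queries that differ only in whitespace are counted separately, because the hash is taken over the text as written.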

Important

Although the original intention was to use Gitlab::CircuitBreaker, we decided to switch to Gitlab::ApplicationRateLimiter. The way Gitlab::CircuitBreaker creates Redis keys (one key for every successful query plus three keys for each failed query) would not work for our use case because of the high cardinality of possible GLQL queries. Gitlab::ApplicationRateLimiter uses a single key per failed query only (for example, application_rate_limiter:glql:<SHA>:<TIMESTAMP>) to maintain a counter, which seems more efficient.
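To make the key format concrete, the following sketch composes a key of the shape application_rate_limiter:glql:<SHA>:<TIMESTAMP>. It assumes the trailing component is the Unix time integer-divided by the interval (a fixed-window bucket). The sample value 1934146 in the testing steps is consistent with that reading for a 900-second interval, but this is an assumption, not a statement about the library's internals. The names `period_key` and `glql_rate_limit_key` are hypothetical.

```ruby
require 'digest'

INTERVAL_SECONDS = 15 * 60 # 15-minute window, as described in this MR

# Assumption: the trailing key component is a fixed-window bucket number.
def period_key(now, interval = INTERVAL_SECONDS)
  now.to_i / interval
end

# Hypothetical helper that mirrors the key shape quoted above.
def glql_rate_limit_key(query, now)
  sha = Digest::SHA256.hexdigest(query)
  "application_rate_limiter:glql:#{sha}:#{period_key(now)}"
end

query = 'group = "gitlab-org"'
window_start = Time.at(1_740_731_400)

key_a = glql_rate_limit_key(query, window_start)
key_b = glql_rate_limit_key(query, window_start + 899) # same window
key_c = glql_rate_limit_key(query, window_start + 900) # next window
```

With this scheme a failing query occupies one key per window, and queries that never fail create no keys at all.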

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Screenshots or screen recordings

Old flow

On every request:

image

New flow

First two requests:

image

Third and subsequent requests within the next 15 minutes:

image

How to set up and validate locally

  1. Enable the GLQL feature flag by running Feature.enable(:glql_integration) in the Rails console.
  2. Find or create an issue in GDK, add any GLQL query to a comment or to the description, and save it. For example:
```glql
display: table
fields: title, labels("workflow::*"), author, weight, assignees
query: group = "gitlab-org" and label != "bug::vulnerability"
```
  3. Simulate the timeout error by adding a raise to the execute method in app/controllers/glql/base_controller.rb:
def execute
  # Temporary, for local testing only: simulate a query timeout.
  raise ActiveRecord::QueryAborted
  super
rescue ActiveRecord::QueryAborted => error
  # ...
end
  4. Reload the page. The first time, you should get the 503 timeout error with a retry button:

image

  5. In the Rails console, check that a Redis key was created for the first failed attempt (with counter 1):
# rails c
pry(main)> keys = Gitlab::Redis::RateLimiting.with { |redis| redis.keys }
=> ["application_rate_limiter:glql:64ee2123a605ca3ddcbcaa2b16c13265fee137ca96ca27715f06634255094799:1934146"]
pry(main)> Gitlab::Redis::RateLimiting.with { |redis| redis.get(keys.first) }
=> "1"
  6. Reload the page a second time. You should get the 503 error again, but since this is the second failure, the rate limit now applies. You can verify that the corresponding Redis key has counter 2:
# rails c
pry(main)> Gitlab::Redis::RateLimiting.with { |redis| redis.get(keys.first) }
=> "2"
  7. If you retry the query again, you now get the GlqlQueryLockedError 403 (Forbidden) error:

image

  8. After 15 minutes, the Redis keys are cleaned up and you can retry the query. For testing purposes, you can shorten the interval (for example, to 60 seconds) by editing glql: { threshold: 1, interval: 15.minutes } in lib/gitlab/application_rate_limiter.rb.
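The sequence in the steps above (503, 503, 403, then available again after the interval) can be sketched with a small in-memory simulation. This is not the MR's implementation: `GlqlFlowSimulator` is a hypothetical stand-in that assumes every request times out, that a query is locked once its failure count exceeds the threshold, and that counters are scoped to fixed time windows.

```ruby
require 'digest'

# Hypothetical in-memory stand-in for the Redis-backed counter.
class GlqlFlowSimulator
  def initialize(threshold: 1, interval: 900)
    @threshold = threshold
    @interval = interval
    @counters = Hash.new(0)
  end

  # Returns the HTTP status a client would see, assuming the query
  # always times out when it is actually executed.
  def request(query, now)
    key = [Digest::SHA256.hexdigest(query), now.to_i / @interval]
    return 403 if @counters[key] > @threshold # locked: query is not executed

    @counters[key] += 1 # record the failed attempt
    503
  end
end

# Use a 60-second interval, as suggested for local testing.
sim = GlqlFlowSimulator.new(threshold: 1, interval: 60)
query = 'group = "gitlab-org" and label != "bug::vulnerability"'
t0 = Time.at(0)

statuses = [
  sim.request(query, t0),      # first failure
  sim.request(query, t0 + 10), # second failure, counter reaches 2
  sim.request(query, t0 + 20), # blocked
  sim.request(query, t0 + 60)  # new window, query is tried again
]
```

With threshold: 1, the counter has to exceed 1 before the lock applies, which is why the first two requests still return 503 and only the third is rejected.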

Relates to https://gitlab.com/gitlab-org/gitlab/-/issues/517542

Edited by Alisa Frunza
