
API endpoint /api/:version/groups/:id has an N+1 problem across Postgres, Gitaly, and redis-cache

The /api/:version/groups/:id endpoint can be used to amplify traffic to both redis-cache and gitaly.

Example incident:

Example request:

{
  "route": "/api/:version/groups/:id",
  "status": 200,

  "duration_s": 4.48373,
  "cpu_s": 2.86766,

  "db_replica_count": 142,
  "db_replica_duration_s": 0.239,
  
  "gitaly_calls": 198,
  "gitaly_duration_s": 1.329452,
  
  "redis_cache_calls": 601,
  "redis_cache_duration_s": 0.529326
}

https://log.gprd.gitlab.net/goto/7574c8f8102082b3c79ab87d67db3f7f

The request above made 142 queries against the Postgres replica, 198 calls to Gitaly, and 601 calls to redis-cache. A single API request therefore fans out into hundreds of backend calls, which makes the endpoint a vector for amplifying traffic and poses a scalability and DoS risk.
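
To make the amplification concrete, the following Python sketch sums the downstream calls recorded in a log entry like the example request. The field names are taken from that entry; the `amplification` helper and the `CALL_FIELDS` constant are hypothetical names used for illustration and are not part of GitLab.

```python
import json

# Downstream call counters as they appear in the example log entry.
CALL_FIELDS = ("db_replica_count", "gitaly_calls", "redis_cache_calls")


def amplification(entry: dict) -> int:
    """Return the total number of downstream calls caused by one request."""
    return sum(entry.get(field, 0) for field in CALL_FIELDS)


example = json.loads("""
{
  "route": "/api/:version/groups/:id",
  "db_replica_count": 142,
  "gitaly_calls": 198,
  "redis_cache_calls": 601
}
""")

print(amplification(example))  # 941 downstream calls for a single API request
```

Under these assumptions, one inbound request results in 941 backend calls, which is the amplification this issue is concerned with.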

Related: &3533 (closed).

Verification

We want to vastly reduce or eliminate requests to this endpoint that make 100 or more Gitaly calls each.
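
One way to track this goal is to count the matching requests in exported logs. The sketch below assumes the logs are available as JSON Lines with the same fields as the example request; the function name `count_heavy_requests` and the threshold constant are illustrative, not an existing tool.

```python
import json
from typing import Iterable

ROUTE = "/api/:version/groups/:id"
GITALY_CALL_THRESHOLD = 100  # the "100+ gitaly calls" target from this issue


def count_heavy_requests(lines: Iterable[str]) -> int:
    """Count log entries for ROUTE that made at least GITALY_CALL_THRESHOLD Gitaly calls."""
    heavy = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the export
        entry = json.loads(line)
        if entry.get("route") == ROUTE and entry.get("gitaly_calls", 0) >= GITALY_CALL_THRESHOLD:
            heavy += 1
    return heavy


# Made-up sample entries, for illustration only.
sample = [
    '{"route": "/api/:version/groups/:id", "gitaly_calls": 198}',
    '{"route": "/api/:version/groups/:id", "gitaly_calls": 3}',
    '{"route": "/api/:version/projects/:id", "gitaly_calls": 250}',
]
print(count_heavy_requests(sample))  # 1
```

If the fix works, this count should trend toward zero over time.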

[Screenshot: Screenshot_2021-08-06_at_10.38.03]

https://log.gprd.gitlab.net/goto/a963966f642f7b9a18e667bb664aff31

Problem to solve

Reduce the currently heavy usage of /api/:version/groups/:id by requests that query the projects field. This field is already deprecated and will be removed in %15.0.

Proposal

Add a rate limit to this endpoint, which can be revisited when the projects field is removed in %15.0.
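
As a rough illustration of the idea, here is a minimal in-memory fixed-window rate limiter in Python. It is a sketch under stated assumptions, not GitLab's implementation: the class name, the per-key limit, and the window length are all hypothetical, and a production limiter would need shared storage so that every application node sees the same counters.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowRateLimiter:
    """Allow at most `limit` requests per `window_s` seconds for each key."""

    def __init__(self, limit: int, window_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window_s = window_s
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._windows: Dict[str, Tuple[float, int]] = {}  # key -> (window start, count)

    def allow(self, key: str) -> bool:
        """Record one request for `key` and return whether it is permitted."""
        now = self.clock()
        start, count = self._windows.get(key, (now, 0))
        if now - start >= self.window_s:
            start, count = now, 0  # the previous window expired, start a new one
        if count >= self.limit:
            self._windows[key] = (start, count)
            return False  # over the limit; the caller would respond with HTTP 429
        self._windows[key] = (start, count + 1)
        return True


# Hypothetical policy: 2 requests per user per 60 seconds for this endpoint.
limiter = FixedWindowRateLimiter(limit=2, window_s=60)
for _ in range(3):
    print(limiter.allow("user:42"))  # True, True, False
```

A fixed window is the simplest choice and allows short bursts at window boundaries; a sliding window or token bucket would smooth those out at the cost of more bookkeeping.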

Edited by Sean Carroll