
GET /api/:version/projects/:id/repository/branches/:branch can be very slow for empty repositories

During gitlab-com/gl-infra/production#4582 (closed), we spotted unusual activity on the GET /api/:version/projects/:id/repository/branches/:branch endpoint.

  • If the underlying repository does not exist, the request can spend upwards of 3s on CPU.
  • Because of the way Puma/Ruby threading works, this is likely to cause CPU contention in the process handling the request, slowing down other requests in the same process.
  • These requests for nonexistent repositories also consume large amounts of memory - over 400MB per request.
  • This endpoint is polled by tools such as gitlab-ci-pipelines-exporter, so the problem is amplified by the exporter's polling (about 50k requests per day for one user alone).
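To put the polling volume in perspective, here is a rough back-of-envelope calculation. It assumes every one of the 50k daily requests hits the 3s worst case quoted above, which is an assumption rather than a measurement, so treat the result as an upper-end estimate for a single polling user.

```python
# Back-of-envelope estimate. The per-request figure is the worst case quoted
# above and is assumed (not measured) to apply to every polled request.
REQUESTS_PER_DAY = 50_000        # polling volume for one user
CPU_SECONDS_PER_REQUEST = 3.0    # "upwards of 3s on CPU"
SECONDS_PER_DAY = 24 * 60 * 60

cpu_seconds_per_day = REQUESTS_PER_DAY * CPU_SECONDS_PER_REQUEST
cpu_hours_per_day = cpu_seconds_per_day / 3600
cores_busy = cpu_seconds_per_day / SECONDS_PER_DAY

print(f"{cpu_seconds_per_day:,.0f} CPU-seconds per day")    # 150,000
print(f"{cpu_hours_per_day:.1f} CPU-hours per day")         # 41.7
print(f"{cores_busy:.2f} cores kept busy on average")       # 1.74
```

Under that assumption, one user's polling would occupy roughly one and three-quarter CPU cores around the clock.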

Example calls: https://log.gprd.gitlab.net/goto/4dfa22a51cfc98161f5fdbcb9407bb54

CPU Time Percentiles

image

https://log.gprd.gitlab.net/goto/f29c361409cbbb8f8e18cd9bb28a7d65

Request Latency Percentiles

image

https://log.gprd.gitlab.net/goto/6ab6367c106374237eae86687b84b997

What should change?

  1. All /api/:version/projects/:id/repository/* endpoints should return a 404 when the repository does not exist. A 500 error signals a server problem, and a missing repository is not a server problem.
  2. Additionally, returning a 500 can cause operators to be alerted, and the error counts against the stage group team's error budget.
  3. These requests should be fast: they should take only a few milliseconds of CPU time and complete in far less than several seconds. They should also not use 400MB of memory.
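The behaviour requested above can be sketched as a guard clause that checks for the repository before doing any expensive work. This is a minimal illustration in Python, not GitLab's implementation: `Project`, `repository_exists`, `get_branch`, and the response bodies are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Project:
    # Hypothetical stand-in for a project record; not a GitLab class.
    id: int
    repository_exists: bool
    branches: Dict[str, str]  # branch name -> commit id


def get_branch(project: Optional[Project], branch: str) -> Tuple[int, dict]:
    """Return an (HTTP status, body) pair for a branch lookup."""
    if project is None:
        return 404, {"message": "project not found"}
    # Cheap existence check first: a missing repository is a client-visible
    # "not found" condition, not a server error, so answer 404 immediately.
    if not project.repository_exists:
        return 404, {"message": "repository not found"}
    commit_id = project.branches.get(branch)
    if commit_id is None:
        return 404, {"message": "branch not found"}
    return 200, {"name": branch, "commit": {"id": commit_id}}


empty = Project(id=1, repository_exists=False, branches={})
print(get_branch(empty, "main"))  # (404, {'message': 'repository not found'})
```

Placing the existence check ahead of the branch lookup means the empty-repository case never reaches the slow path, which addresses both the status code and the CPU and memory cost.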
Edited by Andrew Newdigate