Add throttle definition for unauthenticated HTTP Git operations

What does this MR do and why?

What? We propose adding a throttle definition, throttle_unauthenticated_git_http, for all unauthenticated Git HTTP requests. Furthermore, we propose excluding Git-related HTTP requests from the throttle definition throttle_unauthenticated_web. This allows GitLab instance admins to tune the throttling parameters for the two kinds of traffic separately.
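The intended split can be sketched as a small shell function that maps the path of an unauthenticated request to the throttle definition that would apply after this MR. This is only an illustration: the function name and the path patterns are assumptions made for this sketch, not the matcher GitLab uses internally.

```shell
# Hypothetical helper: prints the throttle definition that would apply to an
# unauthenticated request for the given path.
# ASSUMPTION: the three path suffixes below are illustrative patterns for
# Git-over-HTTP endpoints, not GitLab's actual matching logic.
throttle_for_unauthenticated_path() {
  # Drop the query string, if any, before matching
  request_path="${1%%\?*}"
  case "$request_path" in
    *.git/info/refs|*.git/git-upload-pack|*.git/git-receive-pack)
      echo "throttle_unauthenticated_git_http" ;;
    *)
      echo "throttle_unauthenticated_web" ;;
  esac
}

# Usage:
#   throttle_for_unauthenticated_path "/namespace/project-repo.git/info/refs?service=git-upload-pack"
#   throttle_for_unauthenticated_path "/explore/projects"
```

With this split, a burst of parallel clones would count against the Git-specific limit only, and ordinary unauthenticated web traffic would keep its own budget.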

We also considered defining an IP address allowlist, e.g. containing the IP addresses of the CI runners. Addresses on this allowlist would be excluded from the throttle definition throttle_unauthenticated_web. We discarded this approach because it limits the flexibility of our CI runners, even though our own users could potentially DDoS us 😅

Why?

We (the Siemens-internal GitLab team managing a self-managed GitLab instance) have been observing spikes in 429 HTTP errors in our monitoring system. Upon investigation, we found that these errors are triggered by the GitLab::RackAttack throttle definition throttle_unauthenticated_web, specifically when Git-related HTTP requests are sent to the GitLab backend, such as GET "/namespace/project-repo.git/info/refs?service=git-upload-pack".

Further investigation revealed a common scenario contributing to the spike:

  • The CI pipeline of a project needs to access several Git repositories of (internal) projects hosted on the self-managed GitLab instance, e.g. for scanning purposes or other reasons
  • When cloning a project repo in the CI pipeline, the auth credentials (basic auth) are embedded in the git command, e.g. git clone https://username:token@self-managed-gitlab/repository_url.git
  • The git clone command issues a series of web requests to the GitLab backend
  • Out of these requests, the first request does not include the auth credentials because this is how git clients work, see the first info box in the GitLab documentation, the discussion thread in a previous issue, and here.
  • This means the first request is considered an unauthenticated web request (normal unauthenticated web traffic) and is eventually throttled by the mentioned GitLab::RackAttack throttle definition throttle_unauthenticated_web (when a large number of projects are cloned in parallel)
  • During this throttling period, other unauthenticated web requests (and the retries) are also throttled and accumulate, which leads to the spike and to a degraded experience. 💥
  • In short, although the clone as a whole is authenticated, its first request still counts as an unauthenticated web request and is therefore throttled by the mentioned GitLab::RackAttack throttle definition throttle_unauthenticated_web
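The first request of a clone can be reproduced by hand to see how the backend treats it. The following is a minimal sketch, assuming curl is installed; the helper names are hypothetical, and the host and project in the usage comment are placeholders.

```shell
# Hypothetical helper: builds the URL of the first request that an HTTP clone
# sends (the ref advertisement for git-upload-pack).
# $1: base URL of the GitLab instance, $2: project path without ".git"
info_refs_url() {
  printf '%s/%s.git/info/refs?service=git-upload-pack\n' "${1%/}" "$2"
}

# Hypothetical helper: sends that request WITHOUT credentials and prints only
# the HTTP status code of the response.
probe_unauthenticated() {
  curl --silent --output /dev/null --write-out '%{http_code}\n' \
    "$(info_refs_url "$1" "$2")"
}

# Usage (placeholder host and project path):
#   probe_unauthenticated "http://gdk.test:3000" "root/my-project"
```

For a repository that requires authentication, one would expect a 401 here, after which a git client retries with credentials. A 429 would indicate that the request was already rejected by an unauthenticated throttle.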

We tried to mitigate this issue by increasing the rate limits, but we continue to experience spikes in 429 HTTP errors as our user base and usage grow.

Why should the GitLab team integrate this MR? The motivation behind this MR is to:

  • Address these 429 HTTP error spikes caused by unauthenticated Git-related web requests (the first unauthenticated request of HTTP Git operations)
  • Ensure that our GitLab instance remains reliable and performant even as our user base and usage continue to expand
  • Maintain the overall user experience and high quality of the self-managed GitLab service
  • Potentially remove the info box "By default, all Git operations are first tried unauthenticated. Because of this, HTTP Git operations may trigger the rate limits configured for unauthenticated requests." from the GitLab rate limits documentation

🛠 with ❤️ at Siemens

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

MR Checklist (@gerardo-navarro)

Screenshots or screen recordings

The following screencasts illustrate the new behavior introduced by this MR. In both screencasts, the script git_clone_in_parallel.sh (run as bash ./git_clone_in_parallel.sh) is used to clone three local Git repositories in parallel.

The bash script `git_clone_in_parallel.sh`:
#!/bin/bash

# Highest repository index; the loop below clones indices 2..num_clones (three repositories)
num_clones=4

clone_output_dir_prefix="cloned-repos"
echo "Removing old cloned repositories"
rm -rf "$clone_output_dir_prefix"
echo "Removed old cloned repositories"

# Use a loop to clone one test repository per index
for i in $(seq 2 "$num_clones"); do
  # Repository to clone (basic auth credentials are embedded in the URL)
  repo="http://root:5iveL!fe12345@gdk.test:3000/root/test-space-unauthenticated-web-$i.git"

  # Create a new directory for each clone
  clone_output_dir="$clone_output_dir_prefix/test-space-unauthenticated-web-$i"

  # Clone the repository into the new directory, as a background job
  (echo "Starting cloning $clone_output_dir" && git clone "$repo" "$clone_output_dir" && echo "Finished cloning $clone_output_dir") &
done

# Wait for all background jobs to finish
wait
Before (branch master): MR Throttle unauthenticated Git HTTP requests / Screencast of existing behavior on branch master: https://www.loom.com/share/9365997ff833491aa60952b65686a5c6
After (this MR branch): MR Throttle unauthenticated Git HTTP requests / Screencast of new behavior on MR branch: https://www.loom.com/share/ea9050e74fea4492a2d574db1170f9d9

This MR adds the new throttle rate limit settings to the admin network settings, see the screenshot below.

image

How to set up and validate locally

  1. Migrate the database:
rails db:migrate
  2. In the admin network settings, enable the unauthenticated Git HTTP request rate limit and set the rate limit parameters accordingly (you can also increase the period in seconds to have more time to clone the Git repositories within the throttling period)
  3. Do not forget to click the button Save changes
  4. Create three new blank projects (with README) through the web UI: http://gdk.test:3000/projects/new ; we will git clone these projects' repositories in parallel
  5. Try cloning the three Git repositories in parallel, i.e. git clone http://gdk.test:3000/xxxx; NOTE: if you have quick hands 😄, you can do this manually; otherwise, you can use the script from the screencast section
  6. One git clone command should lead to a 429 HTTP error 💥
  7. Now, disable the unauthenticated Git HTTP request rate limit in the admin network settings
  8. Wait for the throttling period to expire, or restart the server to reset the throttling cache
  9. Again, try cloning the three Git repositories in parallel => it should now be successful 🚀
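As an alternative to running full clones, the validation can be approximated by sending only the first, unauthenticated request of each clone in parallel and counting the status codes. This is a rough sketch, assuming curl is installed; both function names are hypothetical, and the project paths in the usage comment are placeholders for the three projects created above.

```shell
# Hypothetical helper: reads one HTTP status code per line on stdin and prints
# how often each code occurred, as "code: count".
tally_status_codes() {
  sort | uniq -c | awk '{ print $2 ": " $1 }'
}

# Hypothetical helper: sends the unauthenticated ref-advertisement request for
# several projects in parallel and tallies the responses.
# $1: base URL of the GitLab instance; remaining arguments: project paths
probe_in_parallel() {
  base_url="${1%/}"
  shift
  {
    for project in "$@"; do
      curl --silent --output /dev/null --write-out '%{http_code}\n' \
        "$base_url/$project.git/info/refs?service=git-upload-pack" &
    done
    # Wait for all background requests before closing the pipe
    wait
  } | tally_status_codes
}

# Usage (placeholder project paths):
#   probe_in_parallel "http://gdk.test:3000" "root/project-a" "root/project-b" "root/project-c"
```

With the rate limit enabled and set low enough, the tally should contain at least one 429 entry; after disabling the limit and resetting the throttling cache, no 429 entry should remain.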
Edited by Gerardo Navarro
