Request rate limiter can consume the HTTP body before retry

In gitlab-com/gl-infra/production#19438 (closed), the Runner was rate limited by GitLab's Rack Attack and logged messages such as:

Appending trace to coordinator... error couldn't execute PATCH against https://us-east1-c.ci-gateway.int.gprd.gitlab.net:8989/api/v4/jobs/9345524420/trace?debug_trace=false: Patch "https://us-east1-c.ci-gateway.int.gprd.gitlab.net:8989/api/v4/jobs/9345524420/trace?debug_trace=false": http: ContentLength=10175 with Body length 0

In reviewing the code in https://gitlab.com/gitlab-org/gitlab-runner/-/blob/b711db59343a6055673f88cf5f733ddef7c4926f/network/ratelimit_requester.go#L41-66, I suspect this happens because:

  1. Runner reads the HTTP request body and sends the PATCH request.
  2. GitLab sends a 429.
  3. Runner retries with the same request, but this fails because the body reader has already been drained: the retry has 0 bytes to send while ContentLength is still 10175.

One way to fix this is to copy the buffer first and retry with that buffer.
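
As a rough illustration of that idea, here is a minimal sketch, not the Runner's actual code. `doWithRetry` and `runDemo` are hypothetical names, the retry policy (retry on 429 up to a fixed number of attempts, with no backoff) is an assumption, and the test server stands in for GitLab. The body is read into a byte slice once, and every attempt gets a fresh reader over those bytes:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// doWithRetry is a hypothetical helper, not the Runner's actual code.
// It buffers the request body once and gives every attempt a fresh reader,
// so a retry after a 429 sends the same bytes as the first attempt.
func doWithRetry(client *http.Client, req *http.Request, maxAttempts int) (*http.Response, error) {
	var payload []byte
	if req.Body != nil {
		var err error
		payload, err = io.ReadAll(req.Body)
		req.Body.Close()
		if err != nil {
			return nil, fmt.Errorf("buffering request body: %w", err)
		}
	}

	for attempt := 1; ; attempt++ {
		// Clone the request so each attempt carries its own body reader.
		r := req.Clone(req.Context())
		if req.Body != nil {
			r.Body = io.NopCloser(bytes.NewReader(payload))
			r.ContentLength = int64(len(payload))
			r.GetBody = func() (io.ReadCloser, error) {
				return io.NopCloser(bytes.NewReader(payload)), nil
			}
		}

		resp, err := client.Do(r)
		if err != nil {
			return nil, err
		}
		if resp.StatusCode != http.StatusTooManyRequests || attempt >= maxAttempts {
			return resp, nil
		}
		// Drain and close the 429 response so the connection can be reused.
		io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
		// A real implementation would wait here before retrying (assumption).
	}
}

// runDemo starts a test server that answers 429 once and 202 afterwards,
// and records the body it received on every attempt.
func runDemo() (int, []string, error) {
	var (
		mu     sync.Mutex
		bodies []string
	)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		b, _ := io.ReadAll(r.Body)
		mu.Lock()
		bodies = append(bodies, string(b))
		first := len(bodies) == 1
		mu.Unlock()
		if first {
			w.WriteHeader(http.StatusTooManyRequests)
			return
		}
		w.WriteHeader(http.StatusAccepted)
	}))
	defer srv.Close()

	req, err := http.NewRequest(http.MethodPatch, srv.URL, strings.NewReader("trace chunk"))
	if err != nil {
		return 0, nil, err
	}
	resp, err := doWithRetry(srv.Client(), req, 3)
	if err != nil {
		return 0, nil, err
	}
	defer resp.Body.Close()

	mu.Lock()
	defer mu.Unlock()
	return resp.StatusCode, append([]string(nil), bodies...), nil
}

func main() {
	fmt.Println(runDemo())
}
```

The trade-off is that the whole payload is held in memory for the lifetime of the request. That seems acceptable for trace patches of the size shown in the error above (about 10 KB), but it may not suit arbitrarily large bodies.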

Edited by Stan Hu