Refactor: move 429 status code retry logic to one place

What does this MR do?

Refactor: move the retry logic for the 429 status code to one place.

  • Move 429 Retry-After handling from the GitLab client to the rate limit requester.
  • This will help us gather metrics for retries: #38654 (closed)
  • Update the rate limit retry tests to be less flaky and to cover all edge cases.

The retry logic has changed slightly. Previously, on the rate limit requester, we would fall back to the default wait time only when we failed to parse the RateLimit-ResetTime header. For Retry-After, we would wait for the specified time but would not retry.

This change combines the two and introduces an enhancement to the retry logic.

The retry logic:

  1. Check RateLimit-ResetTime header

    • If present in a valid HTTP date format (RFC1123), e.g. Wed, 21 Oct 2015 07:28:00 GMT
    • The runner waits until the specified time, then retries
  2. Fall back to Retry-After header

    • If RateLimit-ResetTime is invalid or missing
    • Accepts seconds format: Retry-After: 30
    • Runner waits for the specified duration, then retries
  3. Default wait time

    • If both headers are missing or invalid
    • The runner waits for the default interval before retrying

Why was this MR needed?

What's the best way to test this MR?

go test -timeout 30s gitlab.com/gitlab-org/gitlab-runner/network

What are the relevant issue numbers?

Issue-38654
