@@ -315,27 +315,25 @@ The list of semi-standard rate limiting response headers can be found [here](htt
- `Cloudflare` does not return rate limit response headers on any request.
- `RackAttack` returns rate limit response headers on throttled requests only.
- `ApplicationRateLimiter` will return rate limit response headers once [this issue](https://gitlab.com/gitlab-com/gl-infra/production-engineering/-/issues/25372) is implemented.
- `ApplicationRateLimiter` does not return rate limit response headers.
- `GraphQL` endpoints currently do not return rate limit response headers.
## Troubleshooting
See [this issue](https://gitlab.com/gitlab-com/gl-infra/production-engineering/-/issues/25372) for improvements to returning rate limiting response headers.
### Observability
## Avoiding Rate Limits
The internal links below support troubleshooting rate limiting issues:
To minimize the risk of hitting rate limits, you can try the following:
- Stagger the execution of your automated pipelines.
- Configure [exponential backoff and retry](https://docs.aws.amazon.com/prescriptive-guidance/latest/cloud-design-patterns/retry-backoff.html) for failed attempts.
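
As a rough illustration of the backoff and retry suggestion, the sketch below retries a call while it keeps returning `429`, doubling the maximum wait each time and adding random jitter. The `send` callable, the parameter names, and the default values are hypothetical and only serve the example; a real client should also honour any retry guidance the server returns.

```python
import random
import time


def retry_with_backoff(send, max_attempts=5, base_delay=1.0, max_delay=60.0,
                       sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    send is a hypothetical callable that returns (status_code, body).
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt == max_attempts - 1:
            break  # no point sleeping after the final attempt
        # Exponential backoff: base_delay * 2^attempt, capped at max_delay.
        delay = min(max_delay, base_delay * (2 ** attempt))
        # Full jitter spreads retries out so clients do not retry in lockstep.
        sleep(random.uniform(0, delay))
    return status, body
```

Passing `sleep` as a parameter keeps the function easy to exercise in tests without actually waiting.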
#### Investigating RackAttack logs
## Troubleshooting
- The `json.meta.user` field is set if a request is authenticated, and missing if it is anonymous.
- `json.env` is set to either `throttle` or `blocklist`, the latter of which comes from [failed authentication bans](https://docs.gitlab.com/ee/security/rate_limits.html#failed-authentication-ban-for-git-and-container-registry).
Please see [Rate Limiting Troubleshooting](/handbook/engineering/infrastructure/rate-limiting/troubleshooting/).
Troubleshooting rate limiting issues can be complicated,
particularly as requests can be throttled at different layers of our stack.
This page gives GitLab team members (who have the correct permissions) steps to follow
to find where a customer's request has been rate limited, and why.
## Has a request been rate limited?
Rate limited requests will return a `429 - Too Many Requests` response.
Following these troubleshooting guides for other status codes may still be beneficial.
## What layer is rate limiting the request?
All traffic to GitLab.com is subject to rate limiting,
with different [limits](/handbook/engineering/infrastructure/rate-limiting/#limits) applied at Cloudflare and within the Application.
**Note:** If you are troubleshooting rate limiting issues for GitLab Pages or Registry,
see [other rate limits](/handbook/engineering/infrastructure/rate-limiting/#other-rate-limits) for details on how these are configured.
The following diagram should help you determine where to look first.
For further detail, scroll down to the related section.
```mermaid
flowchart TD
http[HTTP request] --> 429
429[Was there a 429 response?]
not-limited[The request was likely not rate limited]
header[Does the response contain RateLimit-* Headers?]
subgraph Cloudflare
c-status[Filter the Cloudflare Dashboard by status code]
c-http[Did you find the request in the Cloudflare Dashboard?]
end
subgraph Application
r-logs[Did you find the request in the RackAttack logs?]
a-logs[Did you find the request in the ApplicationRateLimiter logs?]
w-logs[Did you find the request in the workhorse logs?]
end
yay[You hopefully found what you were looking for!]
sre[Request SRE help]
429 -- no --> not-limited
429 -- yes --> header
not-limited --> c-status
header -- no --> c-http
header -- not sure --> c-http
header -- yes --> r-logs
c-status --> yay
c-http -- yes --> yay
r-logs -- yes --> yay
a-logs -- yes --> yay
c-http -- no --> r-logs
r-logs -- no --> a-logs
a-logs -- no --> w-logs
w-logs -- no --> sre
yay -- still require assistance? --> sre
```
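
The decision flow in the diagram can also be read as a small lookup. The sketch below is only an illustration of that flow; the function names and the step labels are hypothetical and not part of any GitLab tooling.

```python
# Order in which to continue when a layer does not show the request,
# following the "no" edges of the diagram.
FALLBACK_ORDER = [
    "Cloudflare Dashboard",
    "RackAttack logs",
    "ApplicationRateLimiter logs",
    "Workhorse logs",
    "Request SRE help",
]


def first_step(status_code, has_ratelimit_headers=None):
    """Return where to look first.

    has_ratelimit_headers is True, False, or None when you are not sure.
    """
    if status_code != 429:
        return "Likely not rate limited; filter the Cloudflare Dashboard by status code"
    if has_ratelimit_headers:
        return "RackAttack logs"
    # "no" and "not sure" both start at Cloudflare.
    return "Cloudflare Dashboard"


def next_step(current):
    """Return the step to try when the request was not found at `current`."""
    index = FALLBACK_ORDER.index(current)
    return FALLBACK_ORDER[min(index + 1, len(FALLBACK_ORDER) - 1)]
```

For example, `first_step(429, True)` points at the RackAttack logs, and `next_step("RackAttack logs")` moves on to the ApplicationRateLimiter logs.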
### Rate Limit Response Headers
Sometimes users will see `RateLimit-*` response headers when a request has been rate limited;
this depends on the layer that has throttled the request.
For example, Cloudflare does not return a `RateLimit-*` response header.
This behaviour is documented in more detail in the [Rate Limiting Headers](/handbook/engineering/infrastructure/rate-limiting/#headers) section of the handbook.
The presence (or absence) of these headers can be used to signal where to start your investigation,
as the `RackAttack` rate limits configured in the Application return these response headers on throttled requests.
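
If you have the raw response headers from a customer report, a check along these lines can tell you whether any `RateLimit-*` headers are present. This is a minimal sketch: the function names are hypothetical, and the header names and values in the usage note are illustrative placeholders rather than captured output.

```python
def rate_limit_headers(headers):
    """Return only the RateLimit-* headers from a mapping of response headers.

    HTTP header names are case-insensitive, so the prefix is compared in lowercase.
    """
    return {
        name: value
        for name, value in headers.items()
        if name.lower().startswith("ratelimit-")
    }


def has_rate_limit_headers(headers):
    """True if at least one RateLimit-* header is present."""
    return bool(rate_limit_headers(headers))
```

For instance, `has_rate_limit_headers({"Content-Type": "text/plain", "ratelimit-remaining": "0"})` is `True`, which suggests starting with the RackAttack logs, while a `429` with no such headers suggests starting with Cloudflare.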
## Cloudflare
GitLab team members with access can [use SSO to log in to our Cloudflare account](https://dash.cloudflare.com/login).
To do so, enter your GitLab email and the `Log in with SSO` option will appear.
To request access, open an [access request](https://gitlab.com/gitlab-com/team-member-epics/access-requests/-/issues/new?issuable_template=Access_Change_Request) for the Cloudflare Analytics role.
Watch a [recorded walkthrough of the Cloudflare Dashboard](https://www.youtube.com/watch?v=7oW5WrlJWp0) (private to GitLab Team Members).
The [Network Analytics](https://dash.cloudflare.com/852e9d53d0f8adbd9205389356f2303d/network-analytics/all-traffic?dest-port=22) dashboard allows you to filter by destination port.
Setting a filter of `Destination port equals 22`
will allow you to do basic analysis on SSH traffic.
For more detailed investigation, logs are pushed to a Google Cloud Storage (GCS) bucket
where those with access to GCP can investigate further.
See the [Cloudflare runbook](https://gitlab.com/gitlab-com/runbooks/-/blob/master/docs/cloudflare/logging.md) for details on querying the Cloudflare logs,
or follow guidance to request further SRE assistance.
### Bots
The [Bot Analytics](https://dash.cloudflare.com/852e9d53d0f8adbd9205389356f2303d/gitlab.com/security/bots) dashboard (Administrator access only)
allows you to filter in the same way as other Cloudflare dashboards,
which can be useful if all other options have been exhausted
to determine the likelihood of automation versus human requests.
<details>
<summary>Click to see Cloudflare Bot Analytics</summary>
You can observe trends for both using the [Rate Limiting Overview](https://dashboards.gitlab.net/d/rate-limiting-rate-limiting_overview/rate-limiting3a-rate-limiting3a-overview?orgId=1) Grafana dashboard.
If a request is throttled by [RackAttack](/handbook/engineering/infrastructure/rate-limiting/#rackattack) it will contain `RateLimit-*` response headers.
You can filter the [RackAttack logs](https://log.gprd.gitlab.net/app/discover#/view/0026cc97-6b9a-445a-a364-7197e04053a2?_g=()) by:
- IP address using `json.remote_ip`
- Throttle using `json.matched`
- Path using `json.path`
### ApplicationRateLimiter
You can filter the [ApplicationRateLimiter logs](https://log.gprd.gitlab.net/app/discover#/view/2d2cf10e-b22a-4c07-bbda-45bb665c31ee?_g=()) by:
- IP using `json.meta.remote_ip`
- User using `json.meta.user` or `json.meta.client_id`
- Project using `json.meta.project`
- Throttle using `json.env`
- Path using `json.path`
### Workhorse
If you have not found the request in Cloudflare, RackAttack, or ApplicationRateLimiter,
then you can search for rate limited responses in the [Workhorse logs](https://log.gprd.gitlab.net/app/discover#/view/7b6dc396-5b27-4e86-b150-72b476255faf?_g=()) by:
- IP using `json.remote_ip`
- Path using `json.uri`
- Status using `json.status`
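
When combining several of the fields above, it can help to build the search string programmatically. The sketch below assumes the log views accept Kibana Query Language (`field: value` clauses joined with `and`), which is an assumption and not confirmed here; `build_query` is a hypothetical helper, and the IP address in the usage note is a documentation-range placeholder.

```python
def build_query(filters):
    """Join field/value pairs into a single 'field: value and ...' query string.

    Numbers are emitted bare; everything else is quoted with backslashes
    and double quotes escaped.
    """
    clauses = []
    for field, value in filters.items():
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            clauses.append(f"{field}: {value}")
        else:
            escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
            clauses.append(f'{field}: "{escaped}"')
    return " and ".join(clauses)
```

For example, `build_query({"json.status": 429, "json.remote_ip": "203.0.113.7"})` produces `json.status: 429 and json.remote_ip: "203.0.113.7"`.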
## Requesting further assistance
If you have followed this troubleshooting guidance
and have not found the results you were looking for,
you can request further assistance from a Site Reliability Engineer (SRE)