2025-08-06: Gitaly goserver SLI apdex violation in cny stage
Gitaly goserver SLI apdex violation in cny stage (Severity 3 (Medium))
Apdex for the Gitaly node began to drop on 2025-08-04 and has remained at ~98.5%.
We identified the root cause to be significantly increased traffic originating from bots scraping the repository home pages of gitlab-org/gitlab forks, and potentially other repositories. When the repository page of a fork loads, it calls a getForkDetails GraphQL query whose resolver in turn calls Gitaly's CountDivergingCommits RPC.
The increased traffic raised the latency of this RPC by an order of magnitude, with little effect on other RPCs or on the health of the Gitaly node itself.
We decided to move CountDivergingCommits into the category of "slow" methods, so that its latency is weighted differently in the apdex calculation. Merging this change into the Runbooks repository restored apdex to its normal value.
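To illustrate why reclassifying a method changes the score, here is a minimal sketch of the standard Apdex formula evaluated against two sets of thresholds. The threshold values, the `Thresholds` type, and the sample latencies are hypothetical and chosen only for illustration; the real calculation and thresholds live in the Runbooks repository and may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    satisfied_s: float  # requests at or below this latency count as satisfied
    tolerated_s: float  # requests at or below this latency count as tolerating


# Hypothetical thresholds, NOT the values used in the Runbooks repository.
FAST = Thresholds(satisfied_s=0.5, tolerated_s=1.0)
SLOW = Thresholds(satisfied_s=10.0, tolerated_s=30.0)


def apdex(latencies_s, thresholds):
    """Standard Apdex: (satisfied + tolerating / 2) / total."""
    if not latencies_s:
        return 1.0  # no traffic is treated as fully satisfied in this sketch
    satisfied = sum(1 for l in latencies_s if l <= thresholds.satisfied_s)
    tolerating = sum(
        1
        for l in latencies_s
        if thresholds.satisfied_s < l <= thresholds.tolerated_s
    )
    return (satisfied + tolerating / 2) / len(latencies_s)


# Made-up latencies (seconds) for a single RPC under load.
samples = [0.2, 0.4, 0.8, 3.0, 6.0]

print(apdex(samples, FAST))  # scored as a regular method
print(apdex(samples, SLOW))  # scored as a "slow" method
```

With the same latencies, the score depends entirely on which thresholds apply, which is why recategorising the method restores the apdex without changing the RPC's actual latency.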
We've also implemented Cloudflare rules to block the Bytespider crawler by its user agent, which was contributing a large portion of the bot traffic. Unfortunately, the same operators also use randomised user agents and IP addresses, which are difficult to block. We're experimenting with Cloudflare's AI Bot Blocker feature to weed these out.
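The limitation of user-agent blocking can be shown with a small sketch. This is plain Python illustrating the matching logic only, not the Cloudflare rule itself; the `is_blocked` helper and the example user-agent strings are hypothetical.

```python
# Substrings to block, compared case-insensitively. Hypothetical list.
BLOCKED_UA_SUBSTRINGS = ("bytespider",)


def is_blocked(user_agent: str) -> bool:
    """Return True if the user agent contains any blocked substring."""
    ua = user_agent.lower()
    return any(needle in ua for needle in BLOCKED_UA_SUBSTRINGS)


# A crawler that identifies itself is caught.
declared = "Mozilla/5.0 (compatible; Bytespider; example)"
# A randomised, browser-like user agent passes the same check.
disguised = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"

print(is_blocked(declared))
print(is_blocked(disguised))
```

A rule of this kind only catches clients that declare themselves, so traffic with randomised user agents needs a different signal, such as behavioural bot detection.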
This ticket was created to track INC-3127, by incident.io