Investigate how https://gitlab.my.salesforce.com/00161000005h0vo caused so much damage to a single Gitaly server
Marquee customer https://gitlab.my.salesforce.com/00161000005h0vo is running some PoCs on GitLab.com.
During one of their tests, between 22h05 and 22h19 on 2020-01-23, they, as @cmiskell put it, "melted the Gitaly node down for slag".
In particular, CPU and memory on the node were completely saturated.
Gitaly stats during this period
Gitaly logs during that period
https://log.gitlab.net/goto/b2f26d25f8169d5ce591dd618cfeda6c
What were they doing?
Mostly PostUploadPack operations. The slowest took over 2 hours, but most were in the region of 5 to 10 minutes.
Did the concurrency limiter kick in?
Yes, up to about 15% of requests were rate-limited.
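To make the limiter's role concrete, here is a minimal sketch of a per-repository concurrency cap. It is not Gitaly's implementation: the class name `PerRepoLimiter`, the `max_per_repo` parameter, the repository path, and the choice to reject (rather than queue) requests over the cap are all assumptions made for illustration.

```python
import threading
from collections import defaultdict


class PerRepoLimiter:
    """Hypothetical limiter: caps in-flight requests per repository.

    Requests over the cap are rejected immediately. A real limiter
    might queue them instead; that is an assumption of this sketch.
    """

    def __init__(self, max_per_repo):
        self.max_per_repo = max_per_repo
        self._in_flight = defaultdict(int)  # repo -> requests currently running
        self._lock = threading.Lock()
        self.accepted = 0
        self.rejected = 0

    def try_acquire(self, repo):
        """Return True if the request may run, False if it is limited."""
        with self._lock:
            if self._in_flight[repo] >= self.max_per_repo:
                self.rejected += 1
                return False
            self._in_flight[repo] += 1
            self.accepted += 1
            return True

    def release(self, repo):
        """Mark one request for this repository as finished."""
        with self._lock:
            if self._in_flight[repo] > 0:
                self._in_flight[repo] -= 1

    def rejected_fraction(self):
        """Share of all seen requests that were limited."""
        total = self.accepted + self.rejected
        return self.rejected / total if total else 0.0


# Illustrative run: a cap of 2, three simultaneous requests to one repo.
limiter = PerRepoLimiter(max_per_repo=2)
results = [limiter.try_acquire("group/project.git") for _ in range(3)]

# One request finishes, which frees a slot for the next one.
limiter.release("group/project.git")
after_release = limiter.try_acquire("group/project.git")
```

A cap like this bounds only how many requests run at once, not how expensive each one is, so a limiter can be active while the admitted requests still consume a lot of CPU and memory.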
