Consider resizing file-cny-01 to add more CPUs
Background
file-cny-01 is the box that hosts gitlab-org/gitlab. At peak times, fetches against it are getting queued.
This came up in the context of production#7472.
Limits
The current concurrency limits are configured as:
iwiedler@file-cny-01-stor-gprd.c.gitlab-production.internal:~$ sudo cat /var/opt/gitlab/gitaly/config.toml
[[concurrency]]
rpc = "/gitaly.SmartHTTPService/PostUploadPack"
max_per_repo = 40
max_queue_size = 300
max_queue_wait = "60s"
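To make the three settings concrete, here is a minimal Python sketch of how a per-repository limit with a bounded queue would treat an incoming request. It is a simplified model for illustration only, not Gitaly's implementation. The names `ConcurrencyLimit`, `admit`, and `timed_out` are hypothetical, and the sketch assumes that max_per_repo caps in-flight calls of this RPC per repository, max_queue_size caps how many further requests may wait, and max_queue_wait bounds how long a queued request waits before it is dropped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConcurrencyLimit:
    """Values copied from the [[concurrency]] block above."""
    max_per_repo: int = 40          # in-flight PostUploadPack calls per repo
    max_queue_size: int = 300       # requests allowed to wait for a slot
    max_queue_wait_s: float = 60.0  # "60s" expressed in seconds


def admit(in_flight: int, queued: int, limit: ConcurrencyLimit) -> str:
    """Decide what happens to a new request (simplified model)."""
    if in_flight < limit.max_per_repo:
        return "run"      # a slot is free, serve immediately
    if queued < limit.max_queue_size:
        return "queue"    # all slots busy, wait for one to free up
    return "reject"       # queue is full as well


def timed_out(waited_s: float, limit: ConcurrencyLimit) -> bool:
    """True once a queued request has waited longer than the limit allows."""
    return waited_s > limit.max_queue_wait_s
```

Under this model, the queueing we see at peak corresponds to `admit` returning "queue": all 40 slots for the repository are already in use when a new fetch arrives.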
file-cny-01-stor-gprd is a c2-standard-30 with 30 vCPUs. The next-largest instance would be c2-standard-60 with 60 vCPUs.
Saturation
Here is a recent example of CPU saturation from git upload-pack / git pack-objects: production#7474.
We are also frequently hitting the concurrency limit for upload-pack: production#7472. Raising max_per_repo from 40 may be possible, but given the CPU saturation described above, it may be wiser to do so only after adding more CPU capacity.
Proposal
I'd like to propose increasing the instance size from c2-standard-30 to c2-standard-60. This would let us raise the max_per_repo concurrency limit more safely and, hopefully, serve our peak traffic without queueing delays.
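As a rough way to reason about how far the limit could move, the Python sketch below computes the current ratio of per-repo slots to vCPUs and the limit that would keep that ratio on the larger instance. Proportional scaling is purely an assumption for illustration, not a recommendation: the helper names are hypothetical, and the value we actually pick should come from observing CPU saturation after the resize.

```python
def slots_per_vcpu(max_per_repo: int, vcpus: int) -> float:
    """Concurrent upload-pack slots per vCPU for a single repository."""
    return max_per_repo / vcpus


def proportional_limit(current_limit: int, current_vcpus: int, new_vcpus: int) -> int:
    """Limit that keeps the slots-per-vCPU ratio unchanged (assumed heuristic)."""
    return current_limit * new_vcpus // current_vcpus


if __name__ == "__main__":
    # Figures from this issue: max_per_repo = 40, 30 vCPUs today, 60 after the resize.
    print(f"current ratio: {slots_per_vcpu(40, 30):.2f} slots per vCPU")
    print(f"same ratio on 60 vCPUs: max_per_repo = {proportional_limit(40, 30, 60)}")
```

Keeping the ratio constant would put the ceiling at 80. Anything between 40 and that figure leaves more CPU headroom per request than we have today.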
Logistics
Resizing this box requires stopping the VM, changing the instance type, and booting it again. This operation necessarily incurs some downtime for gitlab-org/gitlab and all of its forks.
The operation should be fairly quick. I'd expect it to take on the order of 5-10 minutes.
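The stop, resize, and start sequence described above can be sketched as a small Python helper that builds the corresponding `gcloud compute instances` commands and prints them as a dry run. This is a sketch under assumptions: the project name is inferred from the internal hostname, the zone is a placeholder that has to be filled in, and `resize_commands` is a hypothetical helper. If the instance is managed through Terraform or similar tooling, the machine type change would be made there instead.

```python
import shlex


def resize_commands(instance: str, zone: str, project: str, machine_type: str) -> list[list[str]]:
    """Build the three gcloud invocations for an offline machine-type change."""
    base = ["gcloud", "compute", "instances"]
    scope = ["--zone", zone, "--project", project]
    return [
        base + ["stop", instance] + scope,
        base + ["set-machine-type", instance, "--machine-type", machine_type] + scope,
        base + ["start", instance] + scope,
    ]


if __name__ == "__main__":
    commands = resize_commands(
        instance="file-cny-01-stor-gprd",
        zone="ZONE-PLACEHOLDER",       # assumption: the real zone is not stated in this issue
        project="gitlab-production",   # assumption: inferred from the internal hostname
        machine_type="c2-standard-60",
    )
    # Dry run: print the commands for review rather than executing them.
    for cmd in commands:
        print(shlex.join(cmd))
```

Printing the commands instead of running them keeps the sketch safe to execute and gives us something concrete to paste into the change request for review.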
We may be able to do this during a low-traffic period (APAC timezone) to reduce impact.
We should find out if this requires a formal maintenance window, and how much notice and/or notification is required.