Resize file-cny-01 to add more CPUs
### Background

`file-cny-01` is the box that hosts `gitlab-org/gitlab`. We are seeing fetches getting queued at peak times.

This came up in the context of https://gitlab.com/gitlab-com/gl-infra/production/-/issues/7472.

### Limits

The current concurrency limits are configured as:

```
iwiedler@file-cny-01-stor-gprd.c.gitlab-production.internal:~$ sudo cat /var/opt/gitlab/gitaly/config.toml

[[concurrency]]
rpc = "/gitaly.SmartHTTPService/PostUploadPack"
max_per_repo = 40
max_queue_size = 300
max_queue_wait = "60s"
```

`file-cny-01-stor-gprd` is a `c2-standard-30` with 30 vCPUs. The next-largest instance would be `c2-standard-60` with 60 vCPUs.

### Saturation

Here is a recent example of CPU saturation from `git upload-pack` / `git pack-objects`: https://gitlab.com/gitlab-com/gl-infra/production/-/issues/7474.

We can also tell that we are frequently hitting the concurrency limit for `upload-pack`: https://gitlab.com/gitlab-com/gl-infra/production/-/issues/7472.

Raising `max_per_repo` from 40 may be possible, but given the above, it may be wiser to do so after adding more CPU capacity.

### Proposal

I'd like to propose increasing the instance size from `c2-standard-30` to `c2-standard-60`.

This would allow us to more safely raise the `max_per_repo` concurrency limit, and thus hopefully be able to serve our peak traffic without queueing delays.

### Logistics

Resizing this box requires stopping the VM, changing the instance type, and booting it again. This operation necessarily incurs some downtime for `gitlab-org/gitlab` and all of its forks.

The operation should be fairly quick. I'd expect it to take on the order of 5-10 minutes.

We may be able to do this during a low-traffic period (APAC timezone) to reduce impact.

We should find out if this requires a formal maintenance window, and how much notice and/or notification is required.
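
For reference, the stop / change type / boot sequence could look roughly like the sketch below if done by hand with the `gcloud` CLI. This is only an illustration under assumptions: the zone is a hypothetical placeholder, the project name is inferred from the hostname above, and in practice the machine type may be managed through infrastructure-as-code rather than changed manually. The script defaults to a dry run that only prints the commands.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Instance name and target type are taken from this issue.
INSTANCE="file-cny-01-stor-gprd"
NEW_TYPE="c2-standard-60"

# Assumptions, not stated in the issue:
PROJECT="${PROJECT:-gitlab-production}"  # inferred from the internal hostname
ZONE="${ZONE:-us-east1-c}"               # hypothetical placeholder, verify before use

# Dry run by default: print each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

resize() {
  # 1. Stop the VM (downtime for gitlab-org/gitlab and its forks starts here).
  run gcloud compute instances stop "$INSTANCE" \
    --zone "$ZONE" --project "$PROJECT"

  # 2. Change the machine type while the VM is stopped.
  run gcloud compute instances set-machine-type "$INSTANCE" \
    --machine-type "$NEW_TYPE" --zone "$ZONE" --project "$PROJECT"

  # 3. Boot the VM again.
  run gcloud compute instances start "$INSTANCE" \
    --zone "$ZONE" --project "$PROJECT"
}

resize
```

Running it with `DRY_RUN=0` would execute the three commands for real; the default only echoes them so the sequence can be reviewed first.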
issue