Split private and shared CI runner subnets and use separate Cloud NATs

At the time of writing, a single Cloud NAT instance covers the default/us-east1 subnet in the default VPC in CI, and all runners are provisioned in this subnet. Private runners are created without public IPs and therefore use the NAT for egress, while shared runners still use public IPs (until production#1199 (closed) lands, which I believe is blocked on this issue).

Splitting private and shared runners into their own subnets would allow different Cloud NAT instances, with different IP pools, to serve the outbound traffic of each group.

It's likely easier to keep everything in the default VPC, as some things (runner managers, Prometheus, firewall rules) assume that runners live there.
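
As a rough illustration of the subnet-scoped setup described above, here is a minimal Terraform sketch. It is not the actual configuration: every resource name, the CIDR range, and the IP count are hypothetical, and only the default network and the us-east1 region come from this issue. It uses the Google provider's `google_compute_router_nat` resource with `source_subnetwork_ip_ranges_to_nat = "LIST_OF_SUBNETWORKS"`, which limits the NAT to the subnets listed in its `subnetwork` blocks.

```hcl
# Sketch only: names, CIDR range, and IP count are hypothetical.

# New subnet for private runners, still inside the default network.
resource "google_compute_subnetwork" "private_runners" {
  name          = "ci-private-runners" # hypothetical name
  region        = "us-east1"
  network       = "default"
  ip_cidr_range = "10.10.0.0/20" # hypothetical range
}

# Cloud Router that hosts the NAT gateway.
resource "google_compute_router" "ci_nat" {
  name    = "ci-nat-router" # hypothetical name
  region  = "us-east1"
  network = "default"
}

# Dedicated IP pool for the private runners' NAT.
resource "google_compute_address" "private_runners_nat" {
  count  = 2 # hypothetical pool size
  name   = "ci-private-runners-nat-${count.index}"
  region = "us-east1"
}

# NAT scoped to the private runners subnet only, not the whole region.
resource "google_compute_router_nat" "private_runners" {
  name   = "ci-private-runners-nat" # hypothetical name
  router = google_compute_router.ci_nat.name
  region = "us-east1"

  nat_ip_allocate_option = "MANUAL_ONLY"
  nat_ips                = google_compute_address.private_runners_nat[*].self_link

  source_subnetwork_ip_ranges_to_nat = "LIST_OF_SUBNETWORKS"

  subnetwork {
    name                    = google_compute_subnetwork.private_runners.self_link
    source_ip_ranges_to_nat = ["ALL_IP_RANGES"]
  }
}
```

A second `google_compute_router_nat` with its own address pool and a `subnetwork` block pointing at the shared runners' subnet would follow the same pattern.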

Checklist:

  • Terraform can provision a NAT scoped by subnet, not just by region. Do not actually scope it yet.
  • New private runners are provisioned in a new subnet, in the default network (also in us-east1)
  • Old CI NAT IP pool resized to deal with shared runners only
  • Old CI NAT re-scoped to cover only the default subnet (only once there are no more private runners in the default subnet)
  • Look into splitting up the shared runner subnets before provisioning any NATs there - in case we need >50 IPs per NAT. Update: I think we're fine: https://ops.gitlab.net/gitlab-com/gitlab-com-infrastructure/merge_requests/1092#note_36271
  • Instances in omnibus-build-runners, provisioned by build-trigger-runner-manager and build-runners, are behind a Cloud NAT instance.
Edited by Craig Furman