Create zonal clusters in GCP to save money on cross-AZ traffic

This issue is to plan and execute the creation of multiple clusters so we can divide traffic by zone. Currently this is being considered for the following services:

  • git
  • web
  • api
  • registry

We are doing this because we anticipate significant cost increases when we migrate Git HTTPS to Kubernetes, due to additional cross-AZ network traffic; see #1150 (closed).

Current status

Tasks

  • The 4 clusters need to share the same environment, but we will have different chart configuration for the 4 different clusters.
  • Can we have two environments, one for helmfile that would be something like gprd#us-east-1d and another for CI that would be gprd?

Cost impact

                            Cross-AZ traffic    Additional clusters
Additional cost per month   ~22k USD            ~1.5k USD

Additional spend from cross-AZ egress with the regional cluster

So I believe we are looking at ~20k USD/month of spend on cross-zone traffic for Git HTTPS, which will be the largest service consumer of egress traffic. Git SSH by comparison will only cost us an additional ~2k USD/month.

Additional spend from 3 zonal clusters

GKE charges a cluster management fee of $0.10 per cluster per hour. The following conditions apply to the cluster management fee: One zonal cluster (single-zone or multi-zonal) per billing account is free.

https://cloud.google.com/kubernetes-engine/pricing

  • Cluster fee: ~ +$150/month. This overhead is very small, and we will also get 1 of the zonal clusters for free.
  • Registry: This service does not use many nodes and does not cause a lot of cross-AZ traffic because pulls go directly to object storage. If we decide to move it into the zonal clusters, I assume we will also create a dedicated node pool with a minimum of 1 node. In that case it will be cost-neutral, as I don't see a reason to have more than one node in a pool since we will have AZ redundancy.
  • Management nodes: We will want a default node pool for K8s management plus monitoring

Assuming 2 management nodes per zone: $200 (n1-standard-8) * 2 * 3 zones = $1200/month

I believe for production we are looking at a cost overhead of ~$1500/month.
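
The arithmetic behind this estimate can be sketched as follows. This is only a rough check, not a billing calculation: the 730 hours per month figure is an assumption, the $200 node price is the figure used above, and the sketch assumes the free zonal cluster allowance is not already used by another cluster in the billing account.

```python
# Rough monthly cost sketch for the 3 additional zonal clusters.
HOURS_PER_MONTH = 730          # assumption: average number of hours in a month
CLUSTER_FEE_PER_HOUR = 0.10    # GKE cluster management fee quoted above (USD)
FREE_ZONAL_CLUSTERS = 1        # assumption: the free zonal cluster is still available
NODE_COST_PER_MONTH = 200      # n1-standard-8 figure used above (USD)


def cluster_fee_per_month(zonal_clusters, free=FREE_ZONAL_CLUSTERS):
    """Management fee for the billable zonal clusters."""
    billable = max(zonal_clusters - free, 0)
    return billable * CLUSTER_FEE_PER_HOUR * HOURS_PER_MONTH


def management_node_cost(zones, nodes_per_zone=2):
    """Cost of the default (management) node pools across all zones."""
    return NODE_COST_PER_MONTH * nodes_per_zone * zones


fee = cluster_fee_per_month(3)
nodes = management_node_cost(3)
total = fee + nodes

print(f"cluster fee:      ~${fee:.0f}/month")
print(f"management nodes: ~${nodes:.0f}/month")
print(f"total overhead:   ~${total:.0f}/month")
```

Under these assumptions the total lands a little under the ~$1500 quoted above, which leaves some headroom in the rounded estimate.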

Current configuration

We currently have a single GKE cluster per environment (production, staging, preprod). These clusters are regional clusters, meaning that every node pool has at least 3 nodes, spread across three zones.

Proposal

  • One regional cluster for sidekiq, exporters, beats.
  • Three zonal clusters for webservice/registry in us-east1-{b,c,d}
  • For helmfile environments, create multiple envs:
    • gprd
    • gprd-us-east1-b
    • gprd-us-east1-c
    • gprd-us-east1-d
  • Places where we reference .Environment.Name will need to be replaced with a value that spans all of these environments, something like .Environment.Values.name
  • For sharing common configuration we will create gprd-us-east1-{b,c,d}.yml.gotmpl with {{ readFile "gprd.yaml.gotmpl" | fromYaml | toYaml }}
  • For deployments we currently assume that the CI environment is the same as the helmfile environment. We will now need to add something like a HELM_ENV_SUFFIX=.. for 4 CI jobs per env, which will allow us to deploy to all clusters
  • For sequencing the deploys we can probably deploy to the regional/us-east1-b cluster first, then follow it with us-east1-c / us-east1-d
  • If we wanted to, we could move us-east1-b into the regional cluster, but I think this arrangement will be better because we may want to shift zones around for deployment.
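
To make the naming and sequencing in this proposal concrete, here is a minimal sketch of how a CI environment could map to the four helmfile environments and two deploy stages. The function names are hypothetical, the suffix format (for example "-us-east1-b") is an assumption about what HELM_ENV_SUFFIX might contain, and the second stage is assumed to cover us-east1-c and us-east1-d.

```python
# Hypothetical helpers; none of these names exist in the current tooling.
ZONES = ["us-east1-b", "us-east1-c", "us-east1-d"]


def helmfile_env(ci_env, suffix=""):
    """Combine the CI environment with an optional HELM_ENV_SUFFIX-style suffix."""
    return f"{ci_env}{suffix}"


def helmfile_environments(ci_env, zones=ZONES):
    """Regional environment first, followed by one environment per zone."""
    return [helmfile_env(ci_env)] + [helmfile_env(ci_env, f"-{z}") for z in zones]


def deploy_stages(ci_env, zones=ZONES):
    """Stage 1: regional plus the first zone. Stage 2: the remaining zones."""
    envs = helmfile_environments(ci_env, zones)
    return [envs[:2], envs[2:]]


for number, stage in enumerate(deploy_stages("gprd"), start=1):
    print(f"stage {number}: {', '.join(stage)}")
```

Keeping the zone list in one place means that shifting zones around for deployment only requires reordering that list.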
Edited by John Jarvis