Validate Cilium as Calico replacement for GKE
There have been multiple issues related to the use of Calico:
- tune calico-node-vertical-autoscaler ConfigMap ... (#16775 - closed)
- Investigate Calico connectivity issues during d... (#15128 - moved)
- Investigate CPU increase for redis ratelimiting... (scalability#1985 - closed)
Following the discussion in scalability#1985 (comment 1176276704), we want to validate whether Cilium could address these problems.
We are mainly looking at:
- Reducing network processing overhead
- Removing Calico pod rotation caused by vertical autoscaling events (which causes a small drop in traffic)
- Improving node startup/shutdown connectivity (e.g. Consul)
Steps
- Deploy a new cluster in pre using GKE dataplane v2 https://ops.gitlab.net/gitlab-com/gl-infra/config-mgmt/-/merge_requests/4568
- Deploy basic services in Helmfiles gitlab-com/gl-infra/k8s-workloads/gitlab-helmfiles!1378 (merged)
- Deploy Redis service in Tanka gitlab-com/gl-infra/k8s-workloads/tanka-deployments!631 (merged)
- Test network performance
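For the network performance step, one rough way to get comparable numbers is to measure TCP connection setup latency from a pod to a service, once in a Calico cluster and once in the dataplane v2 cluster. The sketch below is only an illustration of that idea, not the agreed test method: the function names are hypothetical, and connect latency is a partial proxy that does not capture throughput or CPU overhead on the node.

```python
import socket
import statistics
import time


def measure_connect_latency(host, port, samples=20, timeout=2.0):
    """Return a list of TCP connect latencies in milliseconds."""
    latencies_ms = []
    for _ in range(samples):
        start = time.perf_counter()
        # create_connection returns once the TCP handshake has completed
        with socket.create_connection((host, port), timeout=timeout):
            pass
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


def summarize(latencies_ms):
    """Summarize latencies with median, nearest-rank p95, and max."""
    ordered = sorted(latencies_ms)
    # Nearest-rank style index for the 95th percentile, clamped to the list
    p95_index = max(0, int(round(0.95 * len(ordered))) - 1)
    return {
        "samples": len(ordered),
        "median_ms": statistics.median(ordered),
        "p95_ms": ordered[p95_index],
        "max_ms": ordered[-1],
    }
```

Calling `summarize(measure_connect_latency(host, port))` with the same target service (for example the Redis service deployed above) from both clusters would give a first comparison; the host and port are whatever the service exposes in each cluster.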
Edited by Filipe Santos