Determine how best to adjust the HPA for the Container Registry
The Container Registry currently uses CPU as the primary metric for deciding when to scale its Pod count up or down. The threshold we rely on is 75% CPU usage across all Pods. When we scaled up from 2% of traffic to 20% of traffic, we hit our maximum Pod allowance, which at the time defaulted to 10. We doubled it and, over the course of a few hours, hit the new maximum of 20 Pods. This happened at roughly peak traffic time for GitLab on Friday. We need to scale further, so we've decided to bump the maximum to an arbitrary 100: gitlab-com/gl-infra/k8s-workloads/gitlab-com!39 (merged)
Use this issue to learn how best to monitor our HPAs and make deterministic decisions about raising Pod count maximums.
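For context, here is a rough sketch of the replica calculation described in the Kubernetes HPA documentation: `desired = ceil(current * currentMetric / targetMetric)`, skipped when the ratio is within a tolerance of 1.0 and clamped to the configured bounds. The function name, the `min_replicas=1` default, and the sample numbers are illustrative assumptions, not values from our configuration. Only the 75% target and the maximums of 10, 20, and 100 come from this issue. The 0.1 tolerance is the upstream default and may not match our cluster.

```python
import math


def desired_replicas(current_replicas, current_cpu_pct, target_cpu_pct=75.0,
                     min_replicas=1, max_replicas=20, tolerance=0.1):
    """Approximate the HPA replica calculation for a single CPU metric.

    Simplified sketch: the real controller also handles missing metrics,
    not-yet-ready Pods, and scale-down stabilization.
    """
    ratio = current_cpu_pct / target_cpu_pct
    # Within tolerance of the target: the HPA leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    # Multiply before dividing to limit floating-point drift ahead of ceil().
    wanted = math.ceil(current_replicas * current_cpu_pct / target_cpu_pct)
    # The result is always clamped to the configured min/max.
    return max(min_replicas, min(max_replicas, wanted))


# Hypothetical numbers: 20 Pods at 150% of requested CPU would want 40 Pods,
# but a maximum of 20 caps the result, which is the situation we ran into.
print(desired_replicas(20, 150, max_replicas=20))
print(desired_replicas(20, 150, max_replicas=100))
```

The clamp on the last line of the function is why a saturated HPA looks "stable" from the outside: the wanted count keeps growing while the actual count stays pinned at the maximum.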
- Is there any harm in setting this to something egregiously high?
- Are there metrics that tell us where we are in terms of hitting our maximum Pod counts?
- Are there metrics that show us where we are in terms of the metric being utilized by the HPA?
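As a starting point for the last two questions, a small sketch of the two ratios we would probably want on a dashboard: how close the replica count is to the maximum, and how close the observed metric is to the HPA target. The function and field names are hypothetical. The inputs are assumed to come from whatever exposes HPA state in our monitoring stack (for example, kube-state-metrics publishes HPA current and max replica gauges, though the exact metric names vary by version and should be verified before use).

```python
def hpa_headroom(current_replicas, max_replicas, current_cpu_pct,
                 target_cpu_pct=75.0):
    """Summarize how close an HPA is to its limits.

    All inputs are plain numbers; fetching them from a metrics backend
    is deliberately left out of this sketch.
    """
    return {
        # 1.0 means the HPA has no room left to scale up.
        "replica_saturation": current_replicas / max_replicas,
        # Above 1.0 means the HPA wants more Pods than it currently has.
        "metric_ratio": current_cpu_pct / target_cpu_pct,
        # True when any further scale-up request will be capped.
        "at_max": current_replicas >= max_replicas,
    }


# Hypothetical reading resembling Friday's situation: pinned at 20 of 20 Pods
# while CPU is still above the 75% target.
print(hpa_headroom(current_replicas=20, max_replicas=20, current_cpu_pct=90))
```

Alerting when `replica_saturation` stays near 1.0 while `metric_ratio` is above 1.0 would flag the case where the maximum, not demand, is limiting the Pod count.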