Alert fatigue remediation for HPA
For Sidekiq, we suspect we may start our Kubernetes deployments with the HPA set to the maximum because of the lengthy startup times of the Pods. Regardless, we rely on an alert on the HPA's ability to scale up to determine whether we have room to grow. This alert is currently a high-priority paging alert. For Sidekiq, it will cause alert fatigue if we configure the HPA with no room to grow. How best can we modify this alert for this scenario? Our saturation alerts might be more meaningful here.
- Should we get rid of the alert?
- Should we modify the alert to point to the specific HPAs we care about?
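
To make the second option concrete, here is a minimal sketch of the decision it implies: page only for HPAs that were given headroom and have used all of it, and skip HPAs that are deliberately pinned at their maximum. The `HpaStatus` type, its field names, and `should_page` are hypothetical and exist only for illustration. They are not part of our alerting rules or of any Kubernetes API.

```python
from dataclasses import dataclass


@dataclass
class HpaStatus:
    """Hypothetical snapshot of one HPA (illustrative field names)."""
    name: str
    min_replicas: int
    max_replicas: int
    current_replicas: int


def should_page(hpa: HpaStatus) -> bool:
    """Return True only when an HPA with headroom has exhausted it."""
    # An HPA whose minimum equals its maximum was configured with no room
    # to grow, so sitting at the maximum is expected and not actionable.
    pinned = hpa.min_replicas >= hpa.max_replicas
    at_max = hpa.current_replicas >= hpa.max_replicas
    return at_max and not pinned
```

Under this assumption, a Sidekiq HPA started at its maximum would never page from this alert, so capacity problems there would need to be caught by the saturation alerts instead.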
Edited by John Skarbek