detect webhooks with a single endpoint
We have regularly observed cases where our CI breaks (or takes a long time to converge) because a webhook is unavailable. This typically happens for webhooks backed by a single pod (e.g. a Service matching pods managed by a Deployment with replicas: 1). During CAPI node rolling updates, the pod eviction triggered by a node drain can leave such a webhook unavailable for a significant amount of time.
To avoid this, we should ensure that, in HA deployments, webhooks are always backed by multiple pods.
One idea, discussed today with @feleouet, would be to introduce a Kyverno ClusterPolicy matching webhook definitions, with spec.validationFailureAction: Audit that would trigger if the service backing a webhook has no more than a single endpoint.
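To make the condition concrete, here is a minimal Python sketch of the check such a policy would express. It is not the Kyverno policy itself. It assumes the inputs are the parsed JSON of ValidatingWebhookConfiguration/MutatingWebhookConfiguration objects and of the Endpoints objects for their Services (e.g. from `kubectl get ... -o json`); the function names and the `endpoints_by_service` mapping are hypothetical.

```python
from typing import Dict, List, Tuple


def count_ready_addresses(endpoints: dict) -> int:
    """Count ready addresses across all subsets of an Endpoints object."""
    return sum(
        len(subset.get("addresses") or [])
        for subset in endpoints.get("subsets") or []
    )


def single_endpoint_webhooks(
    webhook_configs: List[dict],
    endpoints_by_service: Dict[Tuple[str, str], dict],
) -> List[str]:
    """Return "<config name>/<webhook name>" for each webhook whose backing
    Service has at most one ready endpoint address.

    endpoints_by_service is keyed by (namespace, service name); this mapping
    is an assumption of the sketch, built by the caller.
    """
    flagged = []
    for config in webhook_configs:
        for webhook in config.get("webhooks") or []:
            service = (webhook.get("clientConfig") or {}).get("service")
            if not service:
                # Webhook is reached through clientConfig.url, not a Service.
                continue
            key = (service["namespace"], service["name"])
            endpoints = endpoints_by_service.get(key, {})
            if count_ready_addresses(endpoints) <= 1:
                flagged.append(f'{config["metadata"]["name"]}/{webhook["name"]}')
    return flagged
```

A Service with no Endpoints object at all is also flagged here, since it has zero ready addresses.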
We could then add a CI step that fails whenever this Audit policy reports a violation.
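A possible shape for the CI side, as a hedged sketch: it assumes Kyverno's audit results are read from policy reports (for instance the JSON list returned by `kubectl get clusterpolicyreports -o json`), where each result carries `policy` and `result` fields. The policy name `webhook-single-endpoint` and the function names are hypothetical, and the report layout (per-result `resources` versus a report-level `scope`) may vary across Kyverno versions, so both are handled.

```python
from typing import List

# Hypothetical policy name; replace with the name of the actual ClusterPolicy.
POLICY_NAME = "webhook-single-endpoint"


def failing_resources(report_list: dict, policy_name: str) -> List[str]:
    """Return "<kind>/<name>" for every resource with a "fail" result
    for the given policy in a list of policy reports."""
    failures = []
    for report in report_list.get("items") or []:
        for result in report.get("results") or []:
            if result.get("policy") != policy_name:
                continue
            if result.get("result") != "fail":
                continue
            # Some report layouts list resources per result, others set a
            # single report-level scope; accept either.
            resources = result.get("resources") or (
                [report["scope"]] if report.get("scope") else []
            )
            for resource in resources:
                failures.append(f'{resource.get("kind")}/{resource.get("name")}')
    return failures


def ci_exit_code(report_list: dict, policy_name: str = POLICY_NAME) -> int:
    """Return 1 (CI failure) if the policy reported any violation, else 0."""
    failures = failing_resources(report_list, policy_name)
    for failure in failures:
        print(f"webhook backed by a single endpoint: {failure}")
    return 1 if failures else 0
```

In a CI job this could be called with `json.load(sys.stdin)` on the piped report list, passing the return value to `sys.exit`.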