a PDB with the default policy can prevent node draining
Having maxUnavailable or minAvailable properly set on a PDB is not sufficient to guarantee that one pod of a given set of pods can be drained.
With the default PDB settings, the pod also needs to be Ready -- https://kubernetes.io/docs/tasks/run-application/configure-pdb/#healthiness-of-a-pod
Since Kubernetes 1.27, there is a new spec.unhealthyPodEvictionPolicy field that can be set to AlwaysAllow to let the eviction API consider unhealthy (non-Ready) pods as eligible for eviction.
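For illustration, a minimal PDB sketch with this field set (resource name and labels are hypothetical):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: example-pdb          # hypothetical name
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: example           # hypothetical label
  # Allow the eviction API to evict pods that are not Ready,
  # so that non-Ready pods cannot indefinitely block node draining:
  unhealthyPodEvictionPolicy: AlwaysAllow
```

With this setting, draining is only constrained by minAvailable/maxUnavailable for healthy pods; unhealthy pods can always be evicted.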
We need to ensure that a Sylva cluster is not subject to node draining being blocked by this case.
List of things to do, or discuss:
- for units that we package in Sylva and that define a PDB, set spec.unhealthyPodEvictionPolicy: AlwaysAllow
- to ensure that we cover them all and don't regress, introduce a Kyverno Audit policy to detect PDBs that would not have this setting
- we'll possibly have to make exceptions (for instance, I'm not sure that it would be safe to let the eviction API evict a Longhorn PV instance that is not Ready) -- the Kyverno policy should allow setting a label on a PDB to have it ignored by the policy
- we should ensure that problematic PDBs are exposed to the monitoring layer so that cluster operators are aware of potential draining issues
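The Kyverno Audit policy described above could look like the following sketch (policy name and the opt-out label key are assumptions, not settled naming):

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-pdb-unhealthy-eviction-policy   # hypothetical name
spec:
  validationFailureAction: Audit   # report violations, don't block
  background: true                 # also scan existing PDBs
  rules:
    - name: check-unhealthy-pod-eviction-policy
      match:
        any:
          - resources:
              kinds:
                - PodDisruptionBudget
      exclude:
        any:
          - resources:
              selector:
                matchLabels:
                  # hypothetical opt-out label for justified exceptions
                  sylva/ignore-unhealthy-eviction-policy: "true"
      validate:
        message: "PDBs should set spec.unhealthyPodEvictionPolicy: AlwaysAllow"
        pattern:
          spec:
            unhealthyPodEvictionPolicy: AlwaysAllow
```

Running it in Audit mode first lets us enumerate all non-compliant PDBs (via PolicyReports) before deciding whether to enforce.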
(related issue: #1560 (comment 2085803168))
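For the monitoring point above, one possible approach, assuming kube-state-metrics is deployed, is to alert on PDBs that currently allow zero disruptions (the PrometheusRule name and thresholds below are illustrative assumptions):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: pdb-draining-alerts      # hypothetical name
spec:
  groups:
    - name: pdb
      rules:
        - alert: PDBBlockingDisruptions
          # kube-state-metrics metric: number of pod evictions the PDB
          # currently allows; 0 means draining the node would be blocked
          expr: kube_poddisruptionbudget_status_pod_disruptions_allowed == 0
          for: 15m
          labels:
            severity: warning
          annotations:
            summary: "PDB {{ $labels.namespace }}/{{ $labels.poddisruptionbudget }} allows no disruptions and may block node draining"
```

This does not distinguish why disruptions are blocked (unhealthy pods vs. tight minAvailable), but it surfaces exactly the PDBs a cluster operator would need to look at before a drain.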