cluster-maxunavailable: don't rely only on status.unavailableReplicas (!28) · Merge requests · Sylva-projects / sylva-elements / misc-controllers-suite

Closes sylva-projects/sylva-core#2786 (closed)

The issue described in sylva-projects/sylva-core#2786 (closed) happens because the cluster-maxunavailable controller relies on status.unavailableReplicas to determine that a node rolling update is in progress. This misses a corner case: during the short window after a Machine has been removed but before its replacement Machine has been created, there is no unavailable Machine yet and unavailableReplicas is zero, which currently leads our controller to wrongly conclude that everything is ready.

The idea of not relying on status.unavailableReplicas had been discussed in !1 (comment 2650964739), but was not identified as urgent to fix; the motivation identified then was to avoid using a deprecated field and to make the controller work for clusters that use maxSurge 1.

What this MR does is:

for MDs, as in a commit previously proposed by @feleouet, use spec.replicas - status.availableReplicas instead of status.unavailableReplicas
for the control plane, because there is no status.availableReplicas that we can use, keep using status.unavailableReplicas, but if that field is zero, fall back to spec.replicas - status.readyReplicas (readyReplicas is a little less robust than availableReplicas, but it is the best we have to cover this corner case)
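
The two rules above can be sketched as follows. This is a minimal illustration, not the code of this MR: the ReplicaCounts type and the function names are hypothetical, the fields only mirror the spec/status fields named above, and clamping negative differences to zero is an added assumption (for example if more replicas are available than desired during a surge).

```go
package main

// ReplicaCounts is a hypothetical, simplified view of the replica fields
// discussed above. It is NOT the Cluster API type used by the controller.
type ReplicaCounts struct {
	SpecReplicas        int32 // spec.replicas
	AvailableReplicas   int32 // status.availableReplicas (MDs only)
	ReadyReplicas       int32 // status.readyReplicas
	UnavailableReplicas int32 // status.unavailableReplicas
}

// missing returns desired - observed, clamped at zero.
// The clamping is an assumption of this sketch, not something the MR states.
func missing(desired, observed int32) int32 {
	if observed >= desired {
		return 0
	}
	return desired - observed
}

// UnavailableForMD applies the MD rule:
// spec.replicas - status.availableReplicas, ignoring status.unavailableReplicas.
func UnavailableForMD(c ReplicaCounts) int32 {
	return missing(c.SpecReplicas, c.AvailableReplicas)
}

// UnavailableForControlPlane applies the control plane rule:
// use status.unavailableReplicas, and only when it is zero fall back to
// spec.replicas - status.readyReplicas.
func UnavailableForControlPlane(c ReplicaCounts) int32 {
	if c.UnavailableReplicas > 0 {
		return c.UnavailableReplicas
	}
	return missing(c.SpecReplicas, c.ReadyReplicas)
}
```

In the corner case described earlier (3 desired replicas, one Machine removed, its replacement not yet created, unavailableReplicas still zero), both functions return 1 rather than 0, so a rolling update would still be seen as in progress.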

This MR was tested in pipelines of sylva-projects/sylva-core!5368 (closed)

Edited Sep 03, 2025 by Thomas Morin
