[Release 1.4] Upgrades of management-cluster from 1.3 may be stuck on a paused cluster

Summary

This issue has been observed several times on management cluster upgrades from release 1.3 to 1.4.

The upgrade was started with apply.sh, but it didn't complete successfully.

  • Management cluster was paused by cluster-pause unit, and resumed by s-c-c pre-upgrade hook, the cluster has been upgraded, but deployment has been stuck on another dependent unit.

After eventually changing some values (or not) apply.sh has been executed once again.

  • Since first reconciliation didn't reach the end of the deployment, sylva-unit-status has not been reconciled
  • consequently, cluster-pause unit has been ran once more even if it is a one-shot unit
  • but since there were no changes in cluster values, s-c-c helmrelease has not been reconciled, so cluster has not been reconciled.

Proposed change: we should just avoid to pause the management cluster, as this is not required.

Edited Jul 21, 2025 by Francois Eleouet
Assignee Loading
Time tracking Loading