cluster-machines-ready: more info on missing post-commands-executed annotation
When cluster-machines-ready times out because some Node is missing the post-commands-executed annotation, we need to be able to tell easily which node (or nodes) is affected.
This MR improves cluster-machines-ready so that it will give this information.
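As a minimal sketch of how such a report could be produced (not necessarily what the script in this MR does): list each Node together with its annotations, then keep the names where the key is absent. The bare key `post-commands-executed` (the real annotation may carry a prefix), the `ANNOTATION_KEY` variable and all function names below are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Assumption: the annotation key is matched as a plain substring; the real
# key may carry a domain prefix, in which case the substring still matches.
ANNOTATION_KEY="${ANNOTATION_KEY:-post-commands-executed}"

# Prints one "<node-name><TAB><annotations>" line per Node.
list_node_annotations() {
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations}{"\n"}{end}'
}

# Reads "<node-name><TAB><annotations>" lines on stdin and prints the names
# of the nodes whose annotations do not mention ANNOTATION_KEY.
nodes_missing_annotation() {
  awk -F'\t' -v key="$ANNOTATION_KEY" 'index($2, key) == 0 { print $1 }'
}

# Hypothetical wrapper: prints the offending nodes and returns 1 if any exist.
report_missing_annotation() {
  missing="$(list_node_annotations | nodes_missing_annotation)"
  if [ -n "$missing" ]; then
    echo "Nodes without the '${ANNOTATION_KEY}' annotation:"
    echo "$missing"
    return 1
  fi
}
```

The substring match is deliberately loose: it would also accept a Node where the key text only appears inside another annotation's value, so a stricter check may be preferable in a real script.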
This MR also changes the output so that detailed information on Machines and Nodes is printed only if the script did not succeed.
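A common way to get "details only on failure" in a shell script is an exit handler that checks the exit status. The sketch below illustrates that pattern under assumptions: the function names are hypothetical, the `kubectl` invocations are guesses at what produces the summary, and namespace selection is omitted.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints a detailed summary of Machines and Nodes.
# Assumption: namespace and context are already configured for kubectl.
print_failure_summary() {
  echo "======================================================"
  echo "failure waiting for all Machines and Nodes to be ready"
  echo "--- summary of resources"
  echo "-- Machines:"
  kubectl get machines
  echo "-- Nodes:"
  kubectl get nodes -o wide
}

# on_exit <exit-code>: prints the summary only for a non-zero exit code.
on_exit() {
  rc="$1"
  if [ "$rc" -ne 0 ]; then
    print_failure_summary
  fi
}

# Registers the handler; call this once near the top of the script.
install_failure_summary() {
  trap 'on_exit $?' EXIT
}
```

With `install_failure_summary` called early, a successful run stays quiet and a failing run ends with the summary.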
Testing
I simulated a failing case with the following command, on a setup where I had removed the post-commands-executed annotation from a Node with kubectl edit:
$ WAIT_TIMEOUT=10m CONTROL_PLANE=rke2controlplane CLUSTER_NAME=management-cluster timeout -v 25s charts/sylva-units/scripts/cluster-machines-ready.sh
result:
======================================================
failure waiting for all Machines and Nodes to be ready
--- summary of resources
-- Control plane:
NAME AGE
management-cluster-control-plane 448d
Complété
-- Machines:
NAME CLUSTER NODENAME PROVIDERID PHASE AGE VERSION
management-cluster-control-plane-4tchz management-cluster management-cluster-cp-dfac864565-mht2z openstack:///9ab62912-6540-48ea-92c0-4416b7a991e8 Running 8d v1.33.7+rke2r1
management-cluster-control-plane-j9c66 management-cluster management-cluster-cp-dfac864565-f898z openstack:///b6b165fc-89a2-42e7-91c0-3d310c56fda2 Running 8d v1.33.7+rke2r1
management-cluster-control-plane-v7vbk management-cluster management-cluster-cp-dfac864565-w97bs openstack:///3efd1f67-df37-4cac-8fd4-31ea37bf31dc Running 7d22h v1.33.7+rke2r1
management-cluster-md-ubuntu-bfc4b-6l5mp management-cluster management-cluster-md-ubuntu-bfc4b-6l5mp openstack:///838cffac-1f7c-4e19-be4a-56cc06124b22 Running 6d22h v1.33.7+rke2r1
management-cluster-md-ubuntu-bfc4b-cwnzk management-cluster management-cluster-md-ubuntu-bfc4b-cwnzk openstack:///98833d47-efce-4af0-a0e4-00f36fdca419 Running 6d22h v1.33.7+rke2r1
management-cluster-md0-2rwbh-7kgr6 management-cluster management-cluster-md0-2rwbh-7kgr6 openstack:///79b98ae7-ee16-437d-862b-5ed0ac71e9fc Running 6d22h v1.33.7+rke2r1
management-cluster-md0-2rwbh-gmxjd management-cluster management-cluster-md0-2rwbh-gmxjd openstack:///0aac0a67-3da3-443d-ada4-41b4c343cd3f Running 7d22h v1.33.7+rke2r1
management-cluster-md0-2rwbh-v4hf6 management-cluster management-cluster-md0-2rwbh-v4hf6 openstack:///59c6c8cd-09a9-4bf8-a1c4-5d74f6095fea Running 8d v1.33.7+rke2r1
-- Nodes:
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
management-cluster-cp-dfac864565-f898z Ready control-plane,etcd,master 8d v1.33.7+rke2r1 172.20.136.234 <none> openSUSE Leap 15.6 6.4.0-150600.23.81-default containerd://2.1.5-k3s1
management-cluster-cp-dfac864565-mht2z Ready control-plane,etcd,master 8d v1.33.7+rke2r1 172.20.136.111 <none> openSUSE Leap 15.6 6.4.0-150600.23.81-default containerd://2.1.5-k3s1
management-cluster-cp-dfac864565-w97bs Ready control-plane,etcd,master 7d22h v1.33.7+rke2r1 172.20.136.20 <none> openSUSE Leap 15.6 6.4.0-150600.23.81-default containerd://2.1.5-k3s1
management-cluster-md-ubuntu-bfc4b-6l5mp Ready <none> 6d22h v1.33.7+rke2r1 172.20.136.113 <none> Ubuntu 24.04.3 LTS 6.8.0-90-generic containerd://2.1.5-k3s1
management-cluster-md-ubuntu-bfc4b-cwnzk Ready <none> 6d22h v1.33.7+rke2r1 172.20.136.13 <none> Ubuntu 24.04.3 LTS 6.8.0-90-generic containerd://2.1.5-k3s1
management-cluster-md0-2rwbh-7kgr6 Ready <none> 6d22h v1.33.7+rke2r1 172.20.136.53 <none> openSUSE Leap 15.6 6.4.0-150600.23.81-default containerd://2.1.5-k3s1
management-cluster-md0-2rwbh-gmxjd Ready <none> 7d22h v1.33.7+rke2r1 172.20.136.79 <none> openSUSE Leap 15.6 6.4.0-150600.23.81-default containerd://2.1.5-k3s1
management-cluster-md0-2rwbh-v4hf6 Ready <none> 8d v1.33.7+rke2r1 172.20.136.147 <none> openSUSE Leap 15.6 6.4.0-150600.23.81-default containerd://2.1.5-k3s1
-- Some nodes do not have the 'post-commands-executed' annotations
(ie. they did not successfully reach end of the post commands execution)
management-cluster-cp-dfac864565-f898z
CI configuration
Below you can choose test deployment variants to run in this MR's CI.
Legend:
| Icon | Meaning | Available values |
|---|---|---|
| ☁️ | Infra Provider | capd, capo, capm3 |
| 🚀 | Bootstrap Provider | kubeadm (alias kadm), rke2, okd, ck8s |
| 🐧 | Node OS | ubuntu, suse, na, leapmicro |
| 🛠️ | Deployment Options | light-deploy, dev-sources, ha, misc, maxsurge-0, logging, no-logging, cilium |
| 🎬 | Pipeline Scenarios | Available scenario list and description |
| 🟢 | Enabled units | Any available units name, by default apply to management and workload cluster. Can be prefixed by mgmt: or wkld: to be applied only to a specific cluster type |
| | Target platform | Can be used to select specific deployment environment (i.e real-bmh for capm3 ) |
- 🎬 preview ☁️ capd 🚀 kadm 🐧 ubuntu
- 🎬 preview ☁️ capo 🚀 rke2 🐧 suse
- 🎬 preview ☁️ capm3 🚀 rke2 🐧 ubuntu
- ☁️ capd 🚀 kadm 🛠️ light-deploy 🐧 ubuntu
- ☁️ capd 🚀 rke2 🛠️ light-deploy 🐧 suse
- ☁️ capo 🚀 rke2 🐧 suse
- ☁️ capo 🚀 rke2 🐧 leapmicro
- ☁️ capo 🚀 kadm 🐧 ubuntu
- ☁️ capo 🚀 kadm 🐧 ubuntu 🟢 neuvector,mgmt:harbor
- ☁️ capo 🚀 rke2 🎬 rolling-update 🛠️ ha 🐧 ubuntu
- ☁️ capo 🚀 kadm 🎬 wkld-k8s-upgrade 🐧 ubuntu
- ☁️ capo 🚀 rke2 🎬 rolling-update-no-wkld 🛠️ ha 🐧 suse
- ☁️ capo 🚀 rke2 🎬 sylva-upgrade 🛠️ ha 🐧 ubuntu
- ☁️ capo 🚀 rke2 🎬 sylva-upgrade-from-1.6.x 🛠️ ha,misc 🐧 ubuntu
- ☁️ capo 🚀 rke2 🛠️ ha,misc 🐧 ubuntu
- ☁️ capo 🚀 rke2 🛠️ ha,misc,openbao 🐧 suse
- ☁️ capo 🚀 rke2 🐧 suse 🎬 upgrade-from-prev-tag
- ☁️ capm3 🚀 rke2 🐧 suse
- ☁️ capm3 🚀 kadm 🐧 ubuntu
- ☁️ capm3 🚀 ck8s 🐧 ubuntu
- ☁️ capm3 🚀 kadm 🎬 rolling-update-no-wkld 🛠️ ha,misc 🐧 ubuntu
- ☁️ capm3 🚀 rke2 🎬 wkld-k8s-upgrade 🛠️ ha 🐧 suse
- ☁️ capm3 🚀 kadm 🎬 rolling-update 🛠️ ha 🐧 ubuntu
- ☁️ capm3 🚀 rke2 🎬 upgrade-from-prev-release-branch 🛠️ ha 🐧 suse
- ☁️ capm3 🚀 rke2 🛠️ misc,ha 🐧 suse
- ☁️ capm3 🚀 rke2 🎬 sylva-upgrade 🛠️ ha,misc 🐧 suse
- ☁️ capm3 🚀 kadm 🎬 rolling-update 🛠️ ha 🐧 suse
- ☁️ capm3 🚀 ck8s 🎬 rolling-update 🛠️ ha 🐧 ubuntu
- ☁️ capm3 🚀 rke2|okd 🎬 no-update 🐧 ubuntu|na
- ☁️ capm3 🚀 rke2 🐧 suse 🎬 upgrade-from-release-1.5
- ☁️ capm3 🚀 rke2 🐧 suse 🎬 upgrade-to-main
Global config for deployment pipelines
- autorun pipelines
- allow failure on pipelines
- record sylvactl events
Notes:
- Enabling autorun will make deployment pipelines run automatically, without human interaction.
- Disabling allow failure will make deployment pipelines mandatory for pipeline success.
- If both autorun and allow failure are disabled, deployment pipelines will need manual triggering but will block the pipeline.
Be aware: after configuration change, pipeline is not triggered automatically.
Please run it manually (by clicking the run pipeline button in Pipelines tab) or push new code.