ceph-csi-rbd failing because nodes are not ready before deployment
In job https://gitlab.com/sylva-projects/sylva-core/-/jobs/12923907675 we can see that cluster-reachable reconciles successfully very early, as soon as the first node is Ready:
crustgather-job-12923907675 ~> kubectl get no
NAME
mgmt-2295591558-ck8s-capm3-virt-management-cp-2
Since this is the only dependency of ceph-csi-rbd, the unit is reconciled as well, but it fails after 5 minutes because only a single node is available while the deployment has replicas: 3:
crustgather-job-12923907675 ~> kubectl -n ceph-csi-rbd get deploy ceph-csi-rbd-provisioner -o yaml | yq .status
availableReplicas: 1
conditions:
- lastTransitionTime: "2026-01-30T04:54:24Z"
  lastUpdateTime: "2026-01-30T04:54:24Z"
  message: Deployment does not have minimum availability.
  reason: MinimumReplicasUnavailable
  status: "False"
  type: Available
- lastTransitionTime: "2026-01-30T04:54:24Z"
  lastUpdateTime: "2026-01-30T04:55:30Z"
  message: ReplicaSet "ceph-csi-rbd-provisioner-775c95486f" is progressing.
  reason: ReplicaSetUpdated
  status: "True"
  type: Progressing
observedGeneration: 1
readyReplicas: 1
replicas: 3
unavailableReplicas: 2
updatedReplicas: 3
The pods also have a required anti-affinity on the hostname, so at most one replica can be scheduled per node:
crustgather-job-12923907675 ~> kubectl -n ceph-csi-rbd get deploy ceph-csi-rbd-provisioner -o yaml | yq .spec.template.spec.affinity
podAntiAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
          - key: app
            operator: In
            values:
              - ceph-csi-rbd
          - key: component
            operator: In
            values:
              - provisioner
      topologyKey: kubernetes.io/hostname
This was observed intermittently in quite a lot of jobs, more often on ck8s and kubeadm. A quick fix would be to change the depends_on of the ceph-csi-rbd-init unit from cluster-reachable to cluster-machines-ready.
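A rough sketch of what that change could look like, assuming the unit is declared under `units:` in the sylva-units values and that `depends_on` is a map of unit names to booleans. The surrounding layout is an assumption, and any other settings or dependencies the unit may have are omitted:

```yaml
# Hypothetical excerpt of the sylva-units values; only the dependency change is shown.
units:
  ceph-csi-rbd-init:
    depends_on:
      # before: cluster-reachable: true
      # (satisfied as soon as the first node is Ready)
      cluster-machines-ready: true  # wait until all machines are ready
```

With this dependency, the provisioner deployment would only be applied once enough nodes exist to satisfy the hostname anti-affinity for its 3 replicas.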
Edited by Cristian Manda