ceph-csi-rbd failing because nodes are not ready before deployment

In job https://gitlab.com/sylva-projects/sylva-core/-/jobs/12923907675 we can see that cluster-reachable reconciles successfully very early, as soon as the first node is Ready:

crustgather-job-12923907675 ~> kubectl get no
NAME
mgmt-2295591558-ck8s-capm3-virt-management-cp-2

Since this is the only dependency of ceph-csi-rbd, that unit is reconciled as well, but it fails after 5 minutes because only a single node is available while the deployment has replicas: 3:

crustgather-job-12923907675 ~> kubectl -n ceph-csi-rbd get deploy ceph-csi-rbd-provisioner -o yaml | yq .status
availableReplicas: 1
conditions:
  - lastTransitionTime: "2026-01-30T04:54:24Z"
    lastUpdateTime: "2026-01-30T04:54:24Z"
    message: Deployment does not have minimum availability.
    reason: MinimumReplicasUnavailable
    status: "False"
    type: Available
  - lastTransitionTime: "2026-01-30T04:54:24Z"
    lastUpdateTime: "2026-01-30T04:55:30Z"
    message: ReplicaSet "ceph-csi-rbd-provisioner-775c95486f" is progressing.
    reason: ReplicaSetUpdated
    status: "True"
    type: Progressing
observedGeneration: 1
readyReplicas: 1
replicas: 3
unavailableReplicas: 2
updatedReplicas: 3

The pods also have a required anti-affinity on kubernetes.io/hostname, so at most one provisioner pod can be scheduled per node:

crustgather-job-12923907675 ~> kubectl -n ceph-csi-rbd get deploy ceph-csi-rbd-provisioner -o yaml | yq .spec.template.spec.affinity
podAntiAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
          - key: app
            operator: In
            values:
              - ceph-csi-rbd
          - key: component
            operator: In
            values:
              - provisioner
      topologyKey: kubernetes.io/hostname

This was observed randomly in quite a lot of jobs, more often on ck8s and kubeadm. A quick fix would be to change the depends_on of the ceph-csi-rbd-init unit from cluster-reachable to cluster-machines-ready.
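
As a rough sketch of the proposed change, the unit definition could look something like the fragment below. The layout is an assumption (a `units` map where `depends_on` maps unit names to booleans); the actual structure and file in sylva-core may differ, and only the unit names are taken from this issue.

```yaml
# Hypothetical fragment, assuming depends_on is a map of unit name -> boolean.
units:
  ceph-csi-rbd-init:
    depends_on:
      # Previously: cluster-reachable: true
      # (satisfied as soon as the first node is Ready)
      cluster-machines-ready: true  # wait until all machines are ready before deploying
```

With this dependency the provisioner deployment would only be created once enough nodes exist to satisfy replicas: 3 under the hostname anti-affinity.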

Edited Jan 30, 2026 by Cristian Manda