fleet-agent pod eviction prevented by toleration of the unschedulable taint
I ran into the following problem in my dev environment:
- CAPI node rolling update stuck failing to drain a node
- drain was prevented because a fleet-agent pod kept reappearing
- this is because fleetcontroller sets, on the fleet-agent Deployment, a toleration for the unschedulable taint that is used for draining
$ k get -n cattle-fleet-local-system deployment fleet-agent -o yaml
...
tolerations:
- effect: NoSchedule
  key: node.cloudprovider.kubernetes.io/uninitialized
  operator: Equal
  value: "true"
- effect: NoSchedule
  key: cattle.io/os
  operator: Equal
  value: linux
- effect: NoSchedule
  key: node.kubernetes.io/unschedulable
  operator: Equal
...
I noticed this in my management cluster, but I presume this can also impact workload clusters.
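
To illustrate why the evicted pod can land on the cordoned node again, here is a minimal Python sketch of the toleration matching rules as I understand them from the Kubernetes documentation. It is not the scheduler's actual code: the `Taint`, `Toleration`, and `tolerates` names are hypothetical, and the sketch assumes the cordoned node carries the `node.kubernetes.io/unschedulable` taint with effect `NoSchedule` and an empty value.

```python
from dataclasses import dataclass


@dataclass
class Taint:
    key: str
    effect: str
    value: str = ""


@dataclass
class Toleration:
    key: str = ""
    operator: str = "Equal"  # "Equal" or "Exists"
    value: str = ""
    effect: str = ""  # empty effect matches every effect


def tolerates(tol: Toleration, taint: Taint) -> bool:
    """Return True if the toleration matches the taint (simplified rules)."""
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.key and tol.key != taint.key:
        return False
    if tol.operator == "Exists":
        return True
    # operator "Equal": values must be identical (both empty also counts)
    return tol.value == taint.value


# Tolerations copied from the fleet-agent Deployment output above.
fleet_agent_tolerations = [
    Toleration("node.cloudprovider.kubernetes.io/uninitialized", "Equal", "true", "NoSchedule"),
    Toleration("cattle.io/os", "Equal", "linux", "NoSchedule"),
    Toleration("node.kubernetes.io/unschedulable", "Equal", "", "NoSchedule"),
]

# Assumption: this is the taint present on a cordoned node.
cordon_taint = Taint("node.kubernetes.io/unschedulable", "NoSchedule")

schedulable_on_cordoned_node = any(
    tolerates(t, cordon_taint) for t in fleet_agent_tolerations
)
print(schedulable_on_cordoned_node)
```

Under these assumptions the third toleration matches the cordon taint, so nothing stops the replacement pod created by the Deployment from being scheduled onto the node that is being drained. Without that third toleration, none of the remaining ones would match.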