fleet-agent pod eviction prevented by its toleration of the unschedulable taint

I ran into the following problem in my dev environment:

  • a CAPI node rolling update got stuck because it failed to drain a node
  • the drain could not complete because a fleet-agent pod kept reappearing on the node after each eviction
  • this happens because fleetcontroller gives the fleet-agent Deployment a toleration for the unschedulable taint that is set when a node is cordoned for draining, so the replacement pod can be scheduled back onto the cordoned node

$ k get -n cattle-fleet-local-system deployment fleet-agent -o yaml
...
      tolerations:
      - effect: NoSchedule
        key: node.cloudprovider.kubernetes.io/uninitialized
        operator: Equal
        value: "true"
      - effect: NoSchedule
        key: cattle.io/os
        operator: Equal
        value: linux
      - effect: NoSchedule
        key: node.kubernetes.io/unschedulable
        operator: Equal
...
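
To illustrate why the last toleration above defeats the cordon, here is a minimal Python sketch of how a toleration is matched against a taint. It is a simplified model of the documented Kubernetes matching rules (only the `Equal` and `Exists` operators), not the scheduler's actual code. The `Taint` and `Toleration` classes and the `tolerates` function are hypothetical names introduced for this example, and the taint shown (key `node.kubernetes.io/unschedulable`, effect `NoSchedule`, empty value) is what I assume a cordoned node carries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    effect: str
    value: str = ""


@dataclass(frozen=True)
class Toleration:
    key: str = ""
    operator: str = "Equal"  # Kubernetes defaults to Equal when unset
    value: str = ""
    effect: str = ""         # empty effect matches every effect


def tolerates(tol: Toleration, taint: Taint) -> bool:
    """Simplified model of toleration/taint matching (Equal and Exists only)."""
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.key and tol.key != taint.key:
        return False
    if tol.operator == "Exists":
        return True
    if tol.operator in ("", "Equal"):
        return tol.value == taint.value
    return False


# Assumption: the taint present on a cordoned node (no value set).
cordon_taint = Taint(key="node.kubernetes.io/unschedulable", effect="NoSchedule")

# The three tolerations from the fleet-agent Deployment shown above.
fleet_agent_tolerations = [
    Toleration(key="node.cloudprovider.kubernetes.io/uninitialized",
               operator="Equal", value="true", effect="NoSchedule"),
    Toleration(key="cattle.io/os",
               operator="Equal", value="linux", effect="NoSchedule"),
    Toleration(key="node.kubernetes.io/unschedulable",
               operator="Equal", effect="NoSchedule"),
]

# Keys of the tolerations that let the pod land on a cordoned node.
matching = [t.key for t in fleet_agent_tolerations if tolerates(t, cordon_taint)]
print(matching)
```

Under this model only the third toleration matches: it has no value, and neither does the taint, so the `Equal` comparison succeeds. Without that toleration, the pod recreated by the Deployment would not be schedulable on the cordoned node.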

I noticed this in my management cluster, but I presume this can also impact workload clusters.
