longhorn-instance-manager-cleanup fails if there are several instance-manager pods on a node

Summary

This issue was observed in https://gitlab.com/sylva-projects/sylva-core/-/jobs/9200283673

(Nightly upgrade from 1.2.1 to 1.3.x)

The longhorn-instance-manager-cleanup job is failing with the following error:

==== START logs for container run-script of pod longhorn-system/longhorn-instance-manager-cleanup-29001425-zcmns ====
=== Checking for nodes being drained ===
=== Node mgmt-1681104299-kubeadm-capm3-virt-management-cp-2 is being drained, checking if it has volumes attached ===
There are no volumes attached to node mgmt-1681104299-kubeadm-capm3-virt-management-cp-2
There are no replicas from single-replica storageclass hosted on node mgmt-1681104299-kubeadm-capm3-virt-management-cp-2
Instance-manager pod is still running on node mgmt-1681104299-kubeadm-capm3-virt-management-cp-2 with following status:
error: there is no need to specify a resource type as a separate argument when passing arguments in resource/name form (e.g. 'kubectl get resource/<resource_name>' instead of 'kubectl get resource resource/<resource_name>'
==== END logs for container run-script of pod longhorn-system/longhorn-instance-manager-cleanup-29001425-zcmns ====
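
The cleanup script itself is not included in the job logs, so the exact command it builds is not visible here, but kubectl emits this error when an explicit resource type is passed together with arguments that are already in resource/name form. A hypothetical invocation of that shape reproduces the message (the pod name is taken from the query below; this is an illustration, not the actual script):

# Hypothetical reproduction: an explicit resource type ("pods") mixed with a
# pod/<name> argument, e.g. after substituting the output of "kubectl get ... -o name"
kubectl get pods -n longhorn-system pod/instance-manager-44fd5f0f444ef6d0f4a105e8b94a6fb2
# -> error: there is no need to specify a resource type as a separate argument
#    when passing arguments in resource/name form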

Indeed, executing the same pod query as the script returns two instance-manager pods:

crustgather-job-9200283673 ~> kubectl get pods -n longhorn-system -l longhorn.io/component=instance-manager -l longhorn.io/node=mgmt-1681104299-kubeadm-capm3-virt-management-cp-2 -o name
pod/instance-manager-44fd5f0f444ef6d0f4a105e8b94a6fb2
pod/instance-manager-e35f862636496f42d845552187561dbb
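
The script apparently assumes a single instance-manager pod per node, so when the query above returns two entries the follow-up kubectl call is built incorrectly. Since -o name already yields one pod/<name> entry per line, a more robust check could handle each entry separately, for example along these lines (an illustrative sketch, not the actual cleanup script; NODE is a placeholder):

# Iterate over every instance-manager pod on the node and print its name and phase;
# each entry is already in pod/<name> form, so no extra resource type is passed.
NODE=mgmt-1681104299-kubeadm-capm3-virt-management-cp-2
kubectl get pods -n longhorn-system \
    -l longhorn.io/component=instance-manager \
    -l longhorn.io/node="${NODE}" -o name |
while read -r pod; do
    kubectl get -n longhorn-system "${pod}" -o jsonpath='{.metadata.name} {.status.phase}{"\n"}'
done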

