Inconsistent CNI plugin versions between nodes due to competition between Calico and Multus
Summary
It was observed in Sylva 1.4.1 that, depending on the node, some CNI plugins (for example host-local and portmap, probably others) had different versions across the cluster.
Workloads are impacted by the discrepancy in host-local: on some nodes, the priority of static routes is not taken into account.
The identified cause is a competition between Calico and Multus, which both install the same plugins; the execution order matters, because whichever runs last overwrites the binaries.
What is the current bug behavior?
Depending on the node, the CNI plugins in /opt/cni/bin don't all have the same version. This creates different behavior depending on the node where the workload is hosted.
Detected because on some nodes the host-local CNI plugin does not honor the priority parameter of static routes.
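To confirm the discrepancy, a sketch like the following can be run on each node and its output diffed across nodes. The plugin list and the `-v` flag behavior are assumptions taken from the console output in this report; `CNI_DIR` defaults to the path observed here.

```shell
#!/bin/sh
# Print the self-reported version of each CNI plugin so the output
# can be compared across nodes (plugin list is illustrative).
CNI_DIR="${CNI_DIR:-/opt/cni/bin}"
for plugin in host-local portmap loopback bridge; do
  if [ -x "$CNI_DIR/$plugin" ]; then
    printf '%s: %s\n' "$plugin" "$("$CNI_DIR/$plugin" -v 2>&1 | head -n 1)"
  else
    printf '%s: not installed\n' "$plugin"
  fi
done
```

Running this on workerX and workerY should reproduce the v1.7.1 vs v3.29.3 mismatch shown below.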
What is the expected correct behavior?
On all nodes, the CNI plugins in /opt/cni/bin have the same version, clearly determined and controlled by the Sylva version in use.
Relevant logs and/or screenshots
It is observed that there are two sources for the same CNI plugins: docker.io/rancher/mirrored-calico-cni:v3.29.3 from calico-node, and rancher/hardened-cni-plugins:v1.7.1-build20250509 from the cni-plugins container in Multus. Depending on which pod runs last, the CNI plugins in /opt/cni/bin are overwritten.
Example of the host-local plugin being a different binary on two nodes:
workerX:/opt/cni/bin # ./host-local -v
CNI host-local plugin v1.7.1 <<<<< set by Multus
CNI protocol versions supported: 0.1.0, 0.2.0, 0.3.0, 0.3.1, 0.4.0, 1.0.0, 1.1.0
workerY:/opt/cni/bin # ./host-local -v
CNI host-local plugin v3.29.3 <<<<< set by Calico
Possible fixes
The Multus package should be the preferred source, because it ships more recent versions and is updated often. Ideally, Calico should be prevented from managing the conflicting CNI plugins when Multus is enabled.
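A minimal sketch of one possible direction, assuming Calico is deployed via the classic calico-node DaemonSet (not the Tigera operator): Calico's install-cni init container supports a SKIP_CNI_BINARIES environment variable listing plugin binaries it should not install. The container name, image, and plugin list below are illustrative and would need to be matched against what Multus actually provides in Sylva.

```yaml
# Hypothetical excerpt of the calico-node DaemonSet spec (names assumed):
# tell Calico's install-cni init container not to ship the plugins
# that Multus already provides, so they are no longer overwritten.
initContainers:
  - name: install-cni
    image: docker.io/rancher/mirrored-calico-cni:v3.29.3
    env:
      - name: SKIP_CNI_BINARIES
        value: "host-local,portmap,loopback,bridge"
```

Whether this list covers all conflicting binaries, and how to wire it through the Sylva chart values, still needs to be verified.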