Failed LACP negotiation during inspection
Summary
The platform consists of Dell R650 servers with OCP E810 NICs and Juniper QFX5120-48Y-8C switches (Junos: 23.4R2-S2.1).
Sylva version: 1.3.10
While IPA is collecting baremetal information, bond0 is UP but does not send any traffic on the network because LACP negotiation fails.
This only happens when booting from the IPA image. Once the server is provisioned with the SUSE OS, LACP negotiation works properly.
Steps to reproduce
In a DHCP-less configuration we normally do not need to configure "lacp force-up" on the switch for the first interface of the aggregate (bond).
Try to provision a node and get it to boot into IPA.
What is the current bug behavior?
When the server boots the IPA image, bond0 and bond0.vlan are configured with an IP address, but no traffic is sent to the management cluster VIP.
What is the expected correct behavior?
IPA should be able to communicate on the network and send the collected inspection results to Ironic at the cluster VIP.
Relevant logs and/or screenshots
Network configuration inside IPA, with bond0 and bond0.2500 both UP, an IP address configured on the bond, and the route configured correctly:
Checking connectivity to the cluster VIP (result: no route to host):
Checking bond0 LACP status parameters:
We can observe that "port state: 77" or "port state: 69" indicates a failed negotiation. In addition, the "system mac address" of the switch is not determined, and the partner port state is displayed as "1".
Here is the same bond after the node is provisioned. We can see that "port state: 63" and "port state: 61" indicate a successful LACP negotiation, and the partner MAC address is displayed correctly:
A detailed explanation of LACP negotiation is available here:
https://hareshkhandelwal.blog/2022/07/28/lets-understand-lacp-state-machine-using-linux-bond/
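The port state values quoted in this report can be decoded bit by bit. The following is a minimal sketch, assuming the Linux bonding driver reports the LACP state byte as a decimal number with the bit layout defined in IEEE 802.1AX. The helper name `decode_port_state` is hypothetical and is not part of IPA or any other tool.

```python
# LACP state bits as defined in IEEE 802.1AX (bit 0 is the least significant).
LACP_STATE_BITS = [
    (0x01, "activity"),         # 1 = active LACP, 0 = passive
    (0x02, "timeout"),          # 1 = short timeout, 0 = long timeout
    (0x04, "aggregation"),      # link may be aggregated
    (0x08, "synchronization"),  # link is in sync with the aggregator
    (0x10, "collecting"),       # receiving frames is enabled
    (0x20, "distributing"),     # sending frames is enabled
    (0x40, "defaulted"),        # using default partner info (no LACPDU received)
    (0x80, "expired"),          # receive state machine is in the expired state
]


def decode_port_state(state: int) -> list[str]:
    """Return the names of the LACP state bits set in `state` (hypothetical helper)."""
    return [name for mask, name in LACP_STATE_BITS if state & mask]


if __name__ == "__main__":
    # Values taken from the bond0 output described in this report.
    for value in (77, 69, 63, 61, 1):
        print(f"{value:3d}: {', '.join(decode_port_state(value))}")
```

Under this decoding, 77 and 69 both have the "defaulted" bit set and lack "collecting" and "distributing". That would be consistent with the server not receiving any LACPDU from the switch, and with the partner MAC address being undetermined. The values 63 and 61 have "collecting" and "distributing" set and "defaulted" cleared.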
Possible fixes
The current workaround is to set "lacp force-up" on the first member interface of the bond/aggregate on the switch for each server.



