Solved

Nutanix CVM Bandwidth on LACP Network

  • 4 November 2022
  • 8 replies
  • 355 views

On an LACP network where the physical switches use MLAG, CVM bandwidth cannot be improved; the maximum is only 10 Gbps.

 


Best answer by JeroenTielen 7 November 2022, 16:10


Userlevel 5
Badge +6

Hello @Aihua 

If you select the Active-Active bond type, you must enable LACP and LAG on the corresponding ToR switch for each node in the cluster, one node at a time. This applies to AHV only.

To configure Active-Active bond type on AHV:

https://portal.nutanix.com/page/documents/details?targetId=Prism-Central-Guide-vpc_2022_6:wc-enable-lag-and-lacp-on-tor-switch-t.html

Userlevel 6
Badge +8

"On an LACP network where the physical switches use MLAG, CVM bandwidth cannot be improved; the maximum is only 10 Gbps."

This is correct. Here is some explanation: Will configuring LACP increase throughput on the link?

Userlevel 2
Badge +1

I also discussed this with support for many months. We tried multiple configurations of AHV and Cisco LACP load-balancing algorithms, and the result was still 10 Gbit between two MAC addresses.

Hosts were able to achieve higher bandwidth (out of a 4x10 Gbit LACP bond) with multiple sources and streams, but 10 Gbit was usually the most a host could saturate.

After this case I reduced all my hosts from 4x10 to 2x10 to avoid wasting precious ports.
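To illustrate why one conversation stays on one link, here is a minimal Python sketch of hash-based member selection. It is not the algorithm OVS or Cisco actually uses: the CRC32 hash, the `pick_link` and `ceiling_gbps` helpers, and the MAC addresses are all made up for illustration. The only point is that a bond picks a member link per flow, so a single flow can never exceed the speed of one member.

```python
import zlib
from collections import Counter

LINK_SPEED_GBPS = 10  # speed of each bond member, as in the thread


def pick_link(flow, n_links):
    """Map a flow's header fields to one bond member.

    CRC32 is a stand-in hash for illustration, not the real OVS or switch hash.
    """
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % n_links


def ceiling_gbps(flows, n_links):
    """Upper bound on aggregate throughput: only members that got a flow carry traffic."""
    used_links = {pick_link(flow, n_links) for flow in flows}
    return len(used_links) * LINK_SPEED_GBPS


# One stream between two MAC addresses: every frame hashes to the same member.
single = [("aa:aa:aa:aa:aa:01", "bb:bb:bb:bb:bb:01")]
print(ceiling_gbps(single, 4))  # 10

# Many sources talking to one host: flows may land on different members.
many = [(f"aa:aa:aa:aa:aa:{i:02x}", "bb:bb:bb:bb:bb:01") for i in range(64)]
print(Counter(pick_link(flow, 4) for flow in many))
```

The first result is 10 regardless of how many members the bond has, which matches the behavior described above. How evenly the second case spreads depends entirely on the hash and the header fields it covers.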

HTH

Userlevel 5
Badge +6

Hello @DenisF 

  • A single user VM with multiple TCP streams could use up to 20 Gbps of bandwidth in an AHV node with two 10 Gb adapters, so there is no need to aggregate 4 ports.
  • It is recommended to enable LACP fallback on the switch used to connect the AHV nodes.

Cisco Nexus: no lacp suspend-individual

Cisco Catalyst: no port-channel standalone-disable

  • In the AHV host CLI and on most switches, the default OVS LACP speed configuration is slow, or 30 seconds. This value—which is independent of the switch timer setting—determines how frequently the AHV host requests LACPDUs from the connected physical switch. The fast setting (1 second) requests LACPDUs from the connected physical switch every second, which helps you detect interface failures more quickly. Failure to receive three LACPDUs—in other words, after 3 seconds with the fast setting—shuts down the link in the bond. Nutanix recommends setting lacp-time to fast on the AHV host and physical switch to decrease link failure detection time from 90 seconds to 3 seconds.

nutanix@CVM$ ssh root@192.168.5.1 "ovs-vsctl set port br0-up other_config:lacp-time=fast"
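The 90-second and 3-second figures follow from the rule quoted above: a link is shut down after three missed LACPDUs. A small Python sketch of that arithmetic, with names chosen here for illustration only:

```python
# LACPDU interval per lacp-time setting, in seconds (values from the text above).
LACPDU_INTERVAL_S = {"slow": 30, "fast": 1}

# Number of consecutive LACPDUs that must be missed before the link is shut down.
MISSED_LACPDUS_BEFORE_DOWN = 3


def detection_time_s(lacp_time):
    """Approximate time to detect a failed link for a given lacp-time setting."""
    return LACPDU_INTERVAL_S[lacp_time] * MISSED_LACPDUS_BEFORE_DOWN


for mode in ("slow", "fast"):
    print(f"lacp-time={mode}: link shut down after about {detection_time_s(mode)} s")
```

This only affects how quickly a failed member is removed from the bond. It does not change how traffic is distributed across members or the per-flow bandwidth limit.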

The LACP configuration on the physical NICs is fine, and so is the MLAG configuration on the corresponding switches. I suspect an issue with the iperf test script, or with the CVM's vNIC, because throughput is only 10 Gb.
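One way to separate a test-script issue from a per-flow limit is to compare a single-stream run with a multi-stream run. The sketch below only builds the client command lines. It assumes iperf3 is the tool in use (the thread says only "iperf"), the server address 192.0.2.10 is a placeholder, and `iperf3_client_cmd` is a hypothetical helper.

```python
import shlex


def iperf3_client_cmd(server_ip, streams, seconds=30):
    """Build an iperf3 client command: -c server, -P parallel streams, -t duration."""
    return ["iperf3", "-c", server_ip, "-P", str(streams), "-t", str(seconds)]


# Compare one TCP stream with several parallel streams to the same server.
for streams in (1, 8):
    print(shlex.join(iperf3_client_cmd("192.0.2.10", streams)))
```

If the single-stream run tops out near 10 Gb and the parallel run goes higher, the limit is per flow rather than in the vNIC. Whether parallel streams actually spread across members depends on the bond mode and the switch hashing including TCP ports.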


It is probably still a virtualization problem: the virtual NIC driver likely supports only 10 Gb.


Thanks, these configurations are fine!

Userlevel 6
Badge +8


"It is probably still a virtualization problem: the virtual NIC driver likely supports only 10 Gb."

It is not a problem. This is how LACP works.