Question

AHV and CVM fail after restarting the AHV nodes

  • 21 August 2021
  • 1 reply
  • 70 views

Hello People,
 We are facing a strange issue when deploying a 3-node Nutanix cluster.
 The deployment completes without any issues. But after restarting the nodes, we are not able to reach the AHV node IPs. After logging in through iLO and manually restarting the network services, the AHV hosts and CVMs become reachable and the cluster is normal again. This happens only when I restart the nodes or shut down AHV.
Before restarting, the network interfaces are up.
  From the switch side we have configured access ports tagged with VLAN 1010.
  Here eth4 is connected to switch-1 port 1 and eth6 is connected to switch-2 port 2.
  Note : switch-1 and switch-2 are stacked
  Model : Cisco SG350XG
  AOS : 5.20.1.1 LTS
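For reference, the manual workaround described above, run from the iLO console of the affected AHV host, looks roughly like this (a sketch, not an official fix; `br0-up` is the bond name from the config below, and `101.101.0.1` is our gateway):

```shell
# Run from the iLO/IPMI console of the affected AHV host.
systemctl restart network       # restart AHV host networking
ovs-appctl bond/show br0-up     # confirm the bond has an enabled active slave
ping -c 3 101.101.0.1           # verify the gateway is reachable again
```

After this, the CVM on that host becomes reachable as well without any changes on the switch side.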

Here is my config:
 
nutanix@NTNX-CVM:101.101.0.13:~$ allssh manage_ovs show_uplinks
================== 101.101.0.14 =================
Bridge: br0
  Bond: br0-up
    bond_mode: active-backup
    interfaces: eth6 eth4
    lacp: off
    lacp-fallback: false
    lacp_speed: slow
================== 101.101.0.15 =================
Bridge: br0
  Bond: br0-up
    bond_mode: active-backup
    interfaces: eth6 eth4
    lacp: off
    lacp-fallback: false
    lacp_speed: slow
================== 101.101.0.13 =================
Bridge: br0
  Bond: br0-up
    bond_mode: active-backup
    interfaces: eth6 eth4
    lacp: off
    lacp-fallback: false
    lacp_speed: slow
nutanix@NTNX-CVM:101.101.0.13:~$ allssh manage_ovs show_interfaces
================== 101.101.0.14 =================
name  mode  link speed
eth0  1000 False  None
eth1  1000 False  None
eth2  1000 False  None
eth3  1000 False  None
eth4 10000  True 10000
eth5 10000  True 10000
eth6 10000  True 10000
eth7 10000  True 10000
================== 101.101.0.15 =================
name  mode  link speed
eth0  1000 False  None
eth1  1000 False  None
eth2  1000 False  None
eth3  1000 False  None
eth4 10000  True 10000
eth5 10000  True 10000
eth6 10000  True 10000
eth7 10000  True 10000
================== 101.101.0.13 =================
name  mode  link speed
eth0  1000 False  None
eth1  1000 False  None
eth2  1000 False  None
eth3  1000 False  None
eth4 10000  True 10000
eth5 10000  True 10000
eth6 10000  True 10000
eth7 10000  True 10000

[root@NUTANIX-AHV1 ~]# cat /etc/sysconfig/network-scripts/ifcfg-br0
# Auto generated by phoenix
DEVICE=br0
NM_CONTROLLED=no
ONBOOT=yes
TYPE=OVSIntPort
DEVICETYPE=ovs
BOOTPROTO=none
IPADDR=101.101.0.10
NETMASK=255.255.255.0
GATEWAY=101.101.0.1
OVSREQUIRES="eth6 eth4"

 

[root@NUTANIX-AHV1 ~]# ovs-appctl bond/show br0-up
---- br0-up ----
bond_mode: active-backup
bond may use recirculation: no, Recirc-ID : -1
bond-hash-basis: 0
lb_output action: disabled, bond-id: -1
updelay: 0 ms
downdelay: 0 ms
lacp_status: off
lacp_fallback_ab: false
active-backup primary: <none>
active slave mac: cs:ds:sd:e3:ds(eth4)

slave eth4: enabled
  active slave
  may_enable: true

slave eth6: enabled
  may_enable: true

 



Hello Senthil_P,

Thanks for reaching out to us. In the absence of cluster access or any logs, we would suggest the following configuration on the NICs:

    lacp: off
    lacp-fallback: true
    lacp_speed: off

Try changing the above-mentioned settings on all hosts and check whether it fixes the reported issue. If not, I would recommend opening a case with Nutanix Support for a more detailed investigation of the reported issue.
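If you want to push the bond settings from a CVM rather than editing each host, something along these lines may work. This is a hedged sketch: the exact `manage_ovs` flags can vary between AOS releases, so please confirm them with `manage_ovs --help` on your version before running anything.

```shell
# Sketch only -- verify flag names against your AOS release first.
# Keeps active-backup bonding but enables LACP fallback on br0-up.
allssh "manage_ovs --bridge_name br0 --bond_name br0-up \
        --bond_mode active-backup --lacp_fallback true update_uplinks"
```

Apply it one host at a time if the cluster is carrying workloads, and re-check with `allssh manage_ovs show_uplinks` afterwards.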

Happy Troubleshooting !!
