We have had a couple of instances recently where network changes have affected our clusters. The lead host detected a network loss and restarted, which resulted in system outages. The cluster is configured with dual network ports in active/passive mode, and our understanding was that it would switch over if any change or failure was detected, without producing error events or taking systems down.
One other thing: the AHV version should be relatively new (for example AHV-20170830.300 or later), as many issues were addressed in the later AHV releases.
You can involve Nutanix support to review any logs or re-test with them online (if possible), but I would probably review the switch logs and configuration before going that route.
We updated the clusters to Nutanix AHV version 20170830.200 earlier this year and would look to do so again soon (is the .300 in your reply a typo?).
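For what it's worth, here is a rough Python sketch for checking an installed release string against the suggested minimum. It assumes the version is a date stamp followed by a build number (as in 20170830.200 and 20170830.300) and that comparing those two numbers in order reflects release order; that is my assumption, not something confirmed in this thread. The function names `parse_ahv_version` and `meets_minimum` are made up for illustration.

```python
import re


def parse_ahv_version(text):
    """Extract (date_stamp, build) from a string such as 'AHV-20170830.300'.

    Assumption: the release string contains an 8-digit date stamp,
    a dot, and a numeric build suffix.
    """
    match = re.search(r"(\d{8})\.(\d+)", text)
    if match is None:
        raise ValueError(f"no AHV-style version found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(installed, minimum):
    """Return True if the installed release is at or above the minimum.

    Assumption: tuple comparison of (date_stamp, build) matches the
    actual release ordering.
    """
    return parse_ahv_version(installed) >= parse_ahv_version(minimum)


if __name__ == "__main__":
    # The two release strings quoted in this thread.
    print(meets_minimum("AHV VERSION NUTANIX 20170830.200", "AHV-20170830.300"))
```

Under those assumptions, 20170830.200 would come out as older than 20170830.300, so the two strings would be different builds rather than one being a typo of the other.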
I ran diagnostics on the cluster and a separate one on the network components, and neither flagged any faults or errors in any config or associated components.
The network team would now need to verify that the switch configuration is OK.
As far as I know, nothing has been changed since the clusters were set up by the hardware provider's technical support team.