Host services restart on network changes, resulting in VMs going down

Question

We have recently had a couple of incidents where network changes affected our clusters. The changes caused a restart on the lead host because it detected a network loss, which then resulted in system outages. The cluster is configured with dual network ports in active-passive mode, and our understanding was that it would fail over on any detected change or failure without producing error events or taking systems down.

sbarab · Accepted Answer

@roberthwl This should definitely have been the case, if the switches involved were configured correctly. I recently dealt with a case where problems on the core switch caused this behavior; the cause was only found after involving the switch vendor.

One other thing: the AHV version should be relatively recent (for example AHV-20170830.300 or later), as many issues were addressed in later AHV releases.
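As a rough illustration of that version check, here is a minimal Python sketch. It assumes AHV version strings contain an 8-digit build date followed by a dot and a build number, and that a higher (date, build) pair means a later release. The function names and the minimum value are hypothetical and come only from the example version mentioned above, not from any Nutanix tool.

```python
import re
from typing import Tuple

# Assumption: minimum taken from the example version mentioned in this thread.
MIN_AHV = "AHV-20170830.300"


def parse_ahv(version: str) -> Tuple[int, int]:
    """Extract (build date, build number) from a string such as 'AHV-20170830.300'."""
    match = re.search(r"(\d{8})\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised AHV version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(installed: str, minimum: str = MIN_AHV) -> bool:
    """Return True if the installed version is at or above the minimum."""
    # Tuples compare element by element: build date first, then build number.
    return parse_ahv(installed) >= parse_ahv(minimum)


if __name__ == "__main__":
    for version in ("NUTANIX 20170830.200", "AHV-20170830.300"):
        status = "ok" if meets_minimum(version) else "older than suggested minimum"
        print(f"{version}: {status}")
```

Comparing the parsed numbers rather than the raw strings avoids mistakes when the prefix differs (for example "NUTANIX" versus "AHV-").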

You can involve Nutanix support to review the logs or re-test with them online (if possible), but I would probably review the switch logs and configuration before going that route.

roberthwl · Answer

We updated the clusters to AHV version Nutanix 20170830.200 earlier this year and will look to update again soon. (Is the .300 in your reply a typo?)

I ran diagnostics on the cluster and a separate set on the network components; neither flagged any faults or errors in the configuration or associated components.

The network team will now need to verify that the switch configuration is OK.

As far as I know, nothing has been changed since the clusters were set up by the hardware provider's technical support team.
