Nodes reset/reboot events | Nutanix Community
We have had two instances where a node reported a fault event and reset, with its VMs being restarted on each occasion. There seems to be no reason for this to have happened.



Details



Host 192.168.xx.x4 appears to have failed. High Availability is restarting VMs on hosts throughout the cluster. 08-17-16, 02:01:41am





Host 192.168.xx.x4 appears to have failed. High Availability is restarting VMs on hosts throughout the cluster. 08-11-16, 07:19:48am



We updated AHV and NCC after the first instance last week, and the problem repeated last night.



Is there a potential hw fault with the host that has not yet been detected or checked?
Could be some sort of hardware NMI or other issue that's causing this. Support can dig in from a diagnostics and log perspective.



I know you had another thread where I recommended opening up a case. Please either piggyback on that one or open a second one to cover this. If Dell sees some sort of hardware issue, they'll do what they do, and if not, they'll pass the case to us to dig into it.
We keep a log of pings between all nodes in a cluster. You can check it at ~/data/logs/sysstats/ping_hosts.INFO.
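If you want to scan that ping log rather than read it by eye, something like the rough Python sketch below may help. The layout of ping_hosts.INFO is not described in this thread, so the sketch assumes blocks that start with a header line (the `#TIMESTAMP` prefix is an assumption) followed by one line per host beginning with its address, with a word such as `unreachable` on failed pings (also an assumption). Both are parameters, so adjust them to whatever your file really contains. The function name and the sample data are made up for illustration.

```python
def unreachable_samples(lines, host, header_prefix="#TIMESTAMP", marker="unreachable"):
    """Return the header of every block in which `host` is reported with `marker`.

    header_prefix and marker are ASSUMPTIONS about the log layout, not a
    documented format; change them to match the real file.
    """
    hits = []
    current_header = None
    for raw in lines:
        line = raw.strip()
        if line.startswith(header_prefix):
            # Remember which sample block we are currently inside.
            current_header = line
            continue
        tokens = line.split()
        # Compare the first token exactly so 10.0.0.1 does not match 10.0.0.14.
        if tokens and tokens[0].rstrip(":") == host and marker.lower() in line.lower():
            hits.append(current_header)
    return hits


# Made-up sample data in the assumed layout, not real log output.
sample_log = """#TIMESTAMP t1
10.0.0.11 : 0.21 ms
10.0.0.14 : 0.19 ms
#TIMESTAMP t2
10.0.0.11 : 0.20 ms
10.0.0.14 : unreachable
"""

print(unreachable_samples(sample_log.splitlines(), "10.0.0.14"))
```

To run it against the real file, read the file's lines and pass them in together with the address of the host named in the alert, then compare the returned headers with the alert timestamps.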



We throw the error you saw if the node is inaccessible over the network. This could be due to a networking or hardware failure.



What does uptime come back with if you check it on all nodes? If the values are uniform, it was likely a networking interruption rather than an actual reboot of the host.
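The uptime comparison can be sketched in Python as follows. This is only an illustration: it assumes you have collected each host's boot time yourself, for example from `uptime -s` where your version of `uptime` supports that flag. The host names, boot times, the 15-minute margin, and the function names are all hypothetical. Only the alert timestamp is taken from the thread.

```python
from datetime import datetime, timedelta


def parse_boot_time(uptime_s_output):
    """Parse the 'YYYY-MM-DD HH:MM:SS' line printed by `uptime -s`."""
    return datetime.strptime(uptime_s_output.strip(), "%Y-%m-%d %H:%M:%S")


def rebooted_hosts(boot_times, alert_time, margin=timedelta(minutes=15)):
    """Return hosts whose last boot is at or after (alert_time - margin).

    The margin is an arbitrary allowance for the alert being raised slightly
    before or after the host actually came back up.
    """
    cutoff = alert_time - margin
    return sorted(host for host, booted in boot_times.items() if booted >= cutoff)


# Alert timestamp copied from the second HA message in this thread.
alert = datetime.strptime("08-17-16, 02:01:41am", "%m-%d-%y, %I:%M:%S%p")

# Hypothetical boot times, as if collected from each host.
sample = {
    "host-a": parse_boot_time("2016-07-02 10:15:00\n"),
    "host-b": parse_boot_time("2016-07-02 10:21:30\n"),
    "host-c": parse_boot_time("2016-08-17 02:06:12\n"),
}

suspects = rebooted_hosts(sample, alert)
print(suspects or "no host booted near the alert; uptimes look uniform")
```

If the list is empty, every host has been up since before the alert, which points toward the network. A host that shows up in the list did restart around that time, which is worth raising in the support case.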