How Nutanix Handles Failures | Node Failure

  • 29 March 2021

Failures are a part of every system, and Nutanix clusters are not immune to them. But how we plan for failures determines the resilience of a product, or of a person for that matter!

Nutanix groups failures into availability domains according to the type of failure. In addition to drive, node, block and network link failures, Nutanix can also tolerate a rack failure for extended data availability.

Node Failure

A Nutanix node comprises a physical host and a Controller VM (CVM). Either of these components can fail without bringing down the Nutanix cluster.

CVM Failure

When a CVM fails, an alert is generated in Prism and the storage path on the affected host is redirected to a CVM on another node. Reads and writes then travel over the 10GbE network until the local CVM comes back online.

For the end customer it is business as usual, with perhaps a slight decrease in performance.

Figure: Controller VM Failure
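
To make the redirection concrete, here is a minimal Python sketch of the idea, not Nutanix code. The `Node` class, the `storage_target` function and the "first healthy peer" selection rule are all assumptions made for illustration; the real cluster decides which remote CVM serves the I/O on its own terms.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Node:
    """Hypothetical model of a node: a host plus the state of its CVM."""
    name: str
    cvm_up: bool = True


def storage_target(host: Node, cluster: List[Node]) -> Tuple[str, str]:
    """Return (node whose CVM serves the I/O, path type) for I/O issued on `host`."""
    if host.cvm_up:
        # Normal case: the local CVM serves the host's storage traffic.
        return host.name, "local"
    # Local CVM is down: pick a healthy CVM on another node (assumed rule:
    # first healthy peer). I/O now crosses the network between nodes.
    for peer in cluster:
        if peer is not host and peer.cvm_up:
            return peer.name, "remote"
    raise RuntimeError("no healthy CVM available in the cluster")


cluster = [Node("node-a"), Node("node-b"), Node("node-c")]
print(storage_target(cluster[0], cluster))  # ('node-a', 'local')

cluster[0].cvm_up = False                   # CVM on node-a fails
print(storage_target(cluster[0], cluster))  # ('node-b', 'remote')

cluster[0].cvm_up = True                    # CVM comes back online
print(storage_target(cluster[0], cluster))  # ('node-a', 'local')
```

The VMs on node-a keep running throughout; only the storage path changes, which is why the effect is at most a small performance hit.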

Physical Host Failure

If a physical host fails, all HA-protected VMs can be automatically restarted on other nodes in the cluster. End users will find their applications unavailable while the VMs are being restarted on the other hosts.

Figure: Node Failure
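
The restart behaviour can be sketched the same way. This is a simplified illustration, not the hypervisor's HA scheduler: the `fail_over` function, the VM and node names, and the "least-loaded surviving host" placement rule are assumptions, and resource checks (CPU, memory) are ignored.

```python
from typing import Dict, List, Set, Tuple


def fail_over(
    placement: Dict[str, str],
    hosts: List[str],
    failed: str,
    ha_protected: Set[str],
) -> Tuple[Dict[str, str], List[str]]:
    """Return (new VM-to-host placement, VMs that stay down) after `failed` dies."""
    survivors = [h for h in hosts if h != failed]
    if not survivors:
        raise RuntimeError("no surviving hosts to restart VMs on")

    # Count the VMs already running on each surviving host.
    load = {h: 0 for h in survivors}
    for host in placement.values():
        if host in load:
            load[host] += 1

    new_placement: Dict[str, str] = {}
    not_restarted: List[str] = []
    for vm, host in sorted(placement.items()):
        if host != failed:
            new_placement[vm] = host          # unaffected VM keeps running
        elif vm in ha_protected:
            # Assumed rule: restart on the surviving host with the fewest VMs.
            target = min(survivors, key=lambda h: load[h])
            new_placement[vm] = target
            load[target] += 1
        else:
            not_restarted.append(vm)          # not HA-protected: stays down
    return new_placement, not_restarted


hosts = ["node-a", "node-b", "node-c"]
placement = {"vm1": "node-a", "vm2": "node-a", "vm3": "node-b", "vm4": "node-a"}
new_placement, down = fail_over(placement, hosts, "node-a", {"vm1", "vm2"})
print(new_placement)  # {'vm1': 'node-c', 'vm2': 'node-b', 'vm3': 'node-b'}
print(down)           # ['vm4']
```

Unlike the CVM case, the VMs here really do go down and come back, which is the outage window end users notice.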

 

For More Info:

  1. Availability Domains from Prism Web Console Guide
  2. Rack Awareness
  3. Block Awareness
