How many nodes can fail in a Nutanix cluster? | Nutanix Community
Solved

How many nodes can fail in a Nutanix cluster?

  • August 25, 2014
  • 5 replies
  • 2671 views

  • Adventurer
  • 3 replies
Hi,

The platform admin guide mentions that the cluster can tolerate only one node failure at a time.
What happens if a complete block is lost?

Thanks and best regards


5 replies

dlink7
  • Moderator
  • 107 replies
  • August 28, 2014
I guess that's an "it depends" question. If you have 5 nodes, you can configure the cluster with RF3 so it can lose 2 nodes. If you have 3 uniform blocks and you lose a block, the cluster will keep running. That feature is called availability domains.


If you have only 1 block and you lose power to the block, everything should come back fine once you restore power. All writes are synced and acknowledged to the guest VMs.

Does that help?
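
To make the "3 uniform blocks" case concrete, here is a toy sketch of block-aware placement. It is only an illustration, not how Nutanix actually places data: the `place_replicas` and `copies_left_after_block_loss` helpers, the block and node names, and the round-robin choice are all assumptions made up for this example. The idea is simply that if every copy of a piece of data sits in a different block, losing one whole block still leaves a copy.

```python
def place_replicas(blocks, rf, start=0):
    """Pick one node from each of `rf` distinct blocks (hypothetical round-robin).

    blocks: dict mapping block name -> list of node names in that block.
    Returns a list of (block, node) tuples, one per replica.
    """
    names = sorted(blocks)
    if rf > len(names):
        raise ValueError("need at least as many blocks as replicas")
    # Choose rf different blocks, starting at an offset so data spreads out.
    chosen = [names[(start + i) % len(names)] for i in range(rf)]
    return [(b, blocks[b][start % len(blocks[b])]) for b in chosen]


def copies_left_after_block_loss(placement, failed_block):
    """Count the replicas that are not in the failed block."""
    return sum(1 for block, _node in placement if block != failed_block)


# Three uniform blocks of four nodes each (names are made up).
blocks = {
    "A": ["A1", "A2", "A3", "A4"],
    "B": ["B1", "B2", "B3", "B4"],
    "C": ["C1", "C2", "C3", "C4"],
}

placement = place_replicas(blocks, rf=2)
print(placement)                                     # [('A', 'A1'), ('B', 'B1')]
print(copies_left_after_block_loss(placement, "A"))  # 1
```

With two copies in two different blocks, any single block failure leaves at least one copy, which is why the cluster can keep running in this simplified model.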

christie
Nutanix Employee
  • Nutanix Employee
  • 4 replies
  • Answer
  • August 29, 2014
If you have RF=2 containers, you can sustain only a single node outage within your cluster.
If you have 4 nodes per block, a single block failure means you have 4 nodes down, so you will likely experience storage unavailability.

If you have RF=3 containers, you can sustain a two-node outage.
However, similar math applies: losing a 4-node block still takes down more nodes than RF=3 can tolerate.

This is why we offer power supply redundancy per block, so you should experience a complete block failure only in rare circumstances.
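
The math in this answer can be written down in a few lines. This is a rough sketch: the function names are made up, and the `rf - 1` rule is just a generalization of the two cases given above (RF=2 tolerates 1 node, RF=3 tolerates 2). It assumes no block awareness, so a block failure counts simply as that many node failures at once.

```python
def tolerated_node_failures(rf):
    """Simultaneous node failures a container can absorb (assumed to be rf - 1)."""
    if rf < 1:
        raise ValueError("replication factor must be at least 1")
    return rf - 1


def survives_block_loss(rf, nodes_per_block):
    """True if losing every node in one block stays within the tolerance.

    Assumes replicas are placed without regard to blocks.
    """
    return nodes_per_block <= tolerated_node_failures(rf)


print(tolerated_node_failures(2))   # 1
print(tolerated_node_failures(3))   # 2
print(survives_block_loss(2, 4))    # False: 4 nodes down, only 1 tolerated
print(survives_block_loss(3, 4))    # False: 4 nodes down, only 2 tolerated
```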

  • Author
  • Adventurer
  • 3 replies
  • August 29, 2014
Thank you for the information.
Is it possible to get a two-rack, "rack-aware" config (assuming a 50% compute reservation)?

  • Author
  • Adventurer
  • 3 replies
  • August 29, 2014
Thank you for the information!
Best regards,
Manfred

  • Voyager
  • 2 replies
  • February 6, 2015
It is worth pointing out that although the cluster can handle only 1 node failure at a time (or 2 with a higher RF setting), you can lose node #1 and the cluster will start to heal. After that process finishes, you can lose another node and still be up and running. You are only "down" if you lose more nodes than your RF setting can handle before the cluster has healed. That's not much help if you lose a whole block, but for something that affects only 1 node (a failed HDD or bad RAM, for example), you are good to go.
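
The difference between failures at the same time and failures one after another can be sketched as a tiny simulation. This is a simplified model, not Nutanix behavior: `stays_up` and the `"fail"` / `"heal"` event names are invented here, healing is treated as instantly restoring full redundancy, and the model ignores whether enough nodes and capacity remain to heal at all.

```python
def stays_up(events, rf):
    """Replay a sequence of "fail" / "heal" events.

    Returns False as soon as the number of unhealed node failures
    exceeds the assumed tolerance of rf - 1, True otherwise.
    """
    tolerance = rf - 1
    unhealed = 0
    for event in events:
        if event == "fail":
            unhealed += 1
            if unhealed > tolerance:
                return False
        elif event == "heal":
            # Assumption: healing fully restores redundancy.
            unhealed = 0
        else:
            raise ValueError(f"unknown event: {event!r}")
    return True


print(stays_up(["fail", "heal", "fail"], rf=2))  # True: second failure comes after healing
print(stays_up(["fail", "fail"], rf=2))          # False: two failures before healing
print(stays_up(["fail"] * 4, rf=3))              # False: a 4-node block lost at once
```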