Question

Nutanix storage capacity and resiliency (three questions)

  • 8 July 2019
  • 3 replies
  • 3920 views

Userlevel 1
Badge +1
I have three questions.
Please answer.

1. For example, we have a three-node cluster.
Prism shows a data resiliency warning because storage utilization is high.
Then one node fails while the cluster is in use.
What happens in this case? What kind of error occurs? Can the cluster keep serving?

2. What is the maximum storage utilization at which Nutanix can continue to provide its services?

3. What happens when storage utilization reaches 100%?

3 replies

Userlevel 3
Badge +5
Hi @Hong

If Data Resiliency reports that the cluster cannot tolerate one node failure and guarantee rebuild capacity, it means the cluster can still survive one node going down and your VMs will continue running, but some data will not have two copies afterwards, because the rebuild capacity is not guaranteed.

If your cluster reaches 95% space usage, the Stargate process will stop accepting new writes. For additional details on how to get the maximum utilization for your cluster, please refer to the following KB: https://portal.nutanix.com/kb/6633
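To make the 95% figure concrete, here is a minimal Python sketch that reports how much headroom is left before writes would stop. It is only arithmetic on the threshold quoted above: the function name `write_headroom` and the 10 TB / 9 TB numbers are made up for illustration, and the real limit for your cluster should come from the KB.

```python
def write_headroom(total_tb: float, used_tb: float, stop_fraction: float = 0.95):
    """Return (usage_fraction, headroom_tb, writes_stopped).

    stop_fraction is the 95% usage level quoted in this thread;
    it is an assumption here, not a value read from the cluster.
    """
    if total_tb <= 0:
        raise ValueError("total_tb must be positive")
    usage = used_tb / total_tb
    # Space that can still be written before the stop threshold is reached.
    headroom = max(0.0, total_tb * stop_fraction - used_tb)
    return usage, headroom, usage >= stop_fraction


# Illustrative numbers only: a cluster with 10 TB of capacity and 9 TB used.
usage, headroom, stopped = write_headroom(total_tb=10.0, used_tb=9.0)
print(f"usage={usage:.1%}  headroom={headroom:.2f} TB  writes_stopped={stopped}")
```

The same function also covers question 3: once usage is at or above the threshold, `writes_stopped` is true well before 100% is reached.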

I recommend you run a full NCC health check and see whether NCC reports any errors. If it does, follow the KB attached to each failing check, and open a support ticket if you have any additional questions.

There are some scenarios where the cluster uses more space than normal, for example because Curator scans are failing, or because you have orphaned snapshots and/or a large amount of garbage data. NCC will report those conditions.
Userlevel 1
Badge +1
Thank you for your answer.
I'd like to ask you more questions.

For example, we have three nodes. Let's assume each node has 1 TB of storage.
Suppose one node goes down while the three-node cluster is using 2.5 TB.

In this case, are the VMs on the failed node no longer available?
Can the VMs on the two healthy nodes continue to run?

I think the cluster will malfunction or shut down.
If so, is the data preserved so that it can be recovered later?

I would appreciate one more answer.
Thank you.


Userlevel 3
Badge +5
Hi @Hong

If one physical node goes down, the CVM on the failed node will be down as well, and the two remaining CVMs on the surviving hosts will keep providing the cluster services.

Also, I am not sure which hypervisor you are using, but if HA (High Availability) is enabled in the cluster, the UVMs (user VMs) from the failed node will automatically restart on the other nodes in the cluster. In that case the user may notice only a single UVM reboot.
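For the storage side of the three-node example, a rough back-of-the-envelope check can be sketched in Python. This is not how Prism calculates resiliency. It assumes each node contributes 1 TB of capacity (my reading of the question), that the 2.5 TB is physical usage with all replicas already counted, and that the 95% write-stop level mentioned earlier in the thread applies to the surviving nodes. The function name `rebuild_after_node_loss` is hypothetical.

```python
def rebuild_after_node_loss(node_capacity_tb: float, nodes: int, used_tb: float,
                            stop_fraction: float = 0.95) -> dict:
    """Estimate whether all replicas still fit after one node is lost.

    Assumptions (not taken from Nutanix documentation):
      - every node has the same capacity,
      - used_tb is physical usage including every replica,
      - the surviving nodes may fill only up to stop_fraction.
    """
    if nodes < 2:
        raise ValueError("need at least two nodes to lose one")
    surviving_capacity = node_capacity_tb * (nodes - 1)
    usable = surviving_capacity * stop_fraction
    # To restore full redundancy, all used_tb must fit on the survivors.
    shortfall = max(0.0, used_tb - usable)
    return {
        "surviving_capacity_tb": surviving_capacity,
        "usable_tb": usable,
        "can_rebuild": shortfall == 0.0,
        "shortfall_tb": shortfall,
    }


# The scenario from the question: 3 nodes x 1 TB, 2.5 TB used, one node down.
result = rebuild_after_node_loss(node_capacity_tb=1.0, nodes=3, used_tb=2.5)
print(result)
```

Under these assumptions the two surviving nodes offer 2 TB in total, so 2.5 TB of replicated data cannot be rebuilt in full. That matches the resiliency warning: the VMs keep running, but part of the data stays with a single copy until the node returns or space is freed.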