Installation & Configuration


about data resilience question

Adventurer

about data resilience question

Hi

I have a Nutanix cluster set to RF=3. According to the documentation, it should tolerate 2 nodes failing at the same time. But the data resilience page shows a failure tolerance of 1 for Oplog and Extent Group, and every message says the cluster can tolerate a maximum of 1 node failure.

The nodes in the cluster are all standalone nodes, so each block is a single node.

The AOS version is 5.1.3 and the hypervisor is ESXi 6.5.

I checked this thread: http://next.nutanix.com/t5/Installation-Configuration/Data-Resiliency-Status-shows-error/m-p/1114

But a full scan runs automatically every 6 hours, and this tolerance problem has lasted for almost a week.

Do you have any comments on this problem? In this state, can the cluster tolerate 2 node failures?

微信图片_20171222205129.jpg

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
Adventurer

Re: about data resilience question

I found the problem.

Although the cluster is set to RF=3, there is also a container set to RF=2, and a VM is running in that container.

That is why oplog and extent store show 1 on the data resilience page.
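
For anyone who lands here with the same symptom, the reasoning can be sketched in a few lines of Python. This is only a simplified model, not how Prism actually computes the value: it assumes a container with replication factor RF survives RF - 1 simultaneous node failures, and that the reported tolerance is the lowest value across all containers holding data. The container names are made up for illustration.

```python
def failures_tolerated(rf):
    """With rf copies of the data, rf - 1 copies can be lost (simplified model)."""
    return rf - 1


def reported_tolerance(container_rfs):
    """Assumed model: the weakest container determines the reported tolerance."""
    return min(failures_tolerated(rf) for rf in container_rfs.values())


# Hypothetical container names; the RF values mirror the situation in this thread.
containers = {"ctr-rf3": 3, "ctr-rf2": 2}

print(failures_tolerated(3))           # an RF=3 container on its own
print(reported_tolerance(containers))  # limited by the RF=2 container
```

Under these assumptions the RF=2 container pulls the reported value down to 1, even though the RF=3 container on its own would tolerate 2 failures.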

2 REPLIES
Trailblazer

Re: about data resilience question

Let's check a couple of things first. Can you verify the cluster redundancy factor via nCLI?

How to check RF on a cluster:

  1. Use PuTTY (or your tool of choice) to connect to a CVM via its IP.
  2. Log in with the proper credentials.
  3. Once you are logged in, you should see a command prompt like this: ncli> (if you land at a regular shell prompt instead, entering ncli should start it).
  4. Type this command, without the quotes, and press Enter: "cluster get-redundancy-state"
  5. You should get output that looks something like this:

    ncli> cluster get-redundancy-state
    Current Redundancy Factor : 2
    Desired Redundancy Factor : 2
    Redundancy Factor Status : kCassandraPrepareDone=true;kZookeeperPrepareDone=true
    ncli>

NCLICheckRF.PNG
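
If you want to check this from a script instead of reading it by eye, here is a rough Python sketch that parses output in the "Key : Value" layout shown above. The sample text is copied from the output in this post; the function name is my own, and the sketch assumes other AOS versions print the same layout, which I have not verified.

```python
SAMPLE_OUTPUT = """\
    Current Redundancy Factor : 2
    Desired Redundancy Factor : 2
    Redundancy Factor Status : kCassandraPrepareDone=true;kZookeeperPrepareDone=true
"""


def parse_redundancy_state(text):
    """Turn 'Key : Value' lines into a dict, skipping lines without that separator."""
    state = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:
            state[key.strip()] = value.strip()
    return state


state = parse_redundancy_state(SAMPLE_OUTPUT)
current = int(state["Current Redundancy Factor"])
desired = int(state["Desired Redundancy Factor"])

if current != desired:
    print(f"Redundancy factor change in progress: {current} -> {desired}")
else:
    print(f"Cluster redundancy factor is {current}")
```

Comparing the current and desired values is a quick way to spot a redundancy factor change that has not finished yet.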

You can also force a Curator full scan, as described here: http://next.nutanix.com/t5/Installation-Configuration/Data-Resiliency-Status-shows-error/m-p/1114

 

This is also a great overview of Redundancy Factor vs. Replication Factor. https://youtu.be/tVPhl52thDY

 

I would also suggest opening a support ticket on this question; the support team would be able to iron it out for sure.

Matthew Gauch explains the meaning of Redundancy Factor and Replication Factor for Nutanix clusters. Learn how you can configure your clusters with the right level of data protection for your system. Further reading on redundancy factor and replication factor: ...