What is the difference between the Redundancy Factor and Replication Factor?
Redundancy Factor 3 is a configurable option that allows a Nutanix cluster to withstand the failure of two nodes or drives in different blocks. By default, Nutanix clusters have Redundancy Factor 2, which means they can tolerate the failure of a single node or drive. The larger the cluster, the more likely it is to experience multiple failures at the same time.
Redundancy Factor 3 requirements:
- Minimum of 5 nodes in the cluster.
- CVM with 32GB RAM configured.
- For a guest VM to tolerate the simultaneous failure of 2 nodes or 2 disks in different blocks, the VM data must be stored on a container with replication factor 3.
NOTE: A Nutanix cluster with FT2 (Redundancy Factor 3) enabled can host storage containers with both RF=2 and RF=3.
Redundancy Factor 2 requirements:
- Minimum of 3 nodes in the cluster.
- CVM with 24GB RAM configured.
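The requirements above can be expressed as a simple pre-check. This is a minimal Python sketch, not a Nutanix tool: the `REQUIREMENTS` table and the function `unmet_redundancy_requirements` are hypothetical names, and the thresholds are only the ones listed in this article.

```python
# Minimum requirements per cluster Redundancy Factor, as listed in this article.
# Both the table and the function name are illustrative, not a Nutanix API.
REQUIREMENTS = {
    2: {"min_nodes": 3, "min_cvm_ram_gb": 24},
    3: {"min_nodes": 5, "min_cvm_ram_gb": 32},
}


def unmet_redundancy_requirements(redundancy_factor, node_count, cvm_ram_gb):
    """Return a list of unmet requirements; an empty list means all are met.

    cvm_ram_gb is assumed to be the smallest CVM RAM size in the cluster.
    """
    if redundancy_factor not in REQUIREMENTS:
        raise ValueError("Redundancy Factor must be 2 or 3")
    req = REQUIREMENTS[redundancy_factor]
    problems = []
    if node_count < req["min_nodes"]:
        problems.append(
            f"need at least {req['min_nodes']} nodes, found {node_count}"
        )
    if cvm_ram_gb < req["min_cvm_ram_gb"]:
        problems.append(
            f"need at least {req['min_cvm_ram_gb']}GB CVM RAM, found {cvm_ram_gb}GB"
        )
    return problems


# Example: a 4-node cluster with 24GB CVMs cannot move to Redundancy Factor 3.
for problem in unmet_redundancy_requirements(3, node_count=4, cvm_ram_gb=24):
    print(problem)
```

A cluster that passes this check still needs containers with RF=3 before guest VM data is protected against two simultaneous failures.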
Some background to understand Redundancy Factor.
Cassandra
- Key Role: Distributed metadata store
- Description: Cassandra stores and manages all of the cluster metadata in a distributed ring-like manner based upon a heavily modified Apache Cassandra. The Paxos algorithm is utilized to enforce strict consistency. This service runs on every node in the cluster. Cassandra is accessed via an interface called Medusa.
Source: Nutanix Bible by Steven Poitras
Zookeeper
- Key Role: Cluster configuration manager
- Description: Zookeeper stores all of the cluster configuration including hosts, IPs, state, etc. and is based upon Apache Zookeeper. This service runs on three nodes in the cluster, one of which is elected as a leader. The leader receives all requests and forwards them to its peers. If the leader fails to respond, a new leader is automatically elected. Zookeeper is accessed via an interface called Zeus.
Source: Nutanix Bible by Steven Poitras
So, what is the impact of changing the Redundancy Factor from 2 to 3? A Redundancy Factor 2 cluster keeps 3 copies of the metadata (Cassandra) and of the configuration data (Zookeeper). With Redundancy Factor 3 enabled, the cluster keeps 5 copies of each.
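These copy counts are consistent with the usual majority-quorum rule, where tolerating f failures requires 2f + 1 copies so that a majority always survives. The sketch below illustrates the arithmetic under that assumption; it is not taken from Nutanix documentation, and the function names are hypothetical.

```python
def failures_tolerated(redundancy_factor):
    """Redundancy Factor 2 tolerates 1 failure, Redundancy Factor 3 tolerates 2."""
    return redundancy_factor - 1


def metadata_copies(redundancy_factor):
    """Copies needed so a majority survives f failures (assumed 2f + 1 rule)."""
    f = failures_tolerated(redundancy_factor)
    return 2 * f + 1


for rf in (2, 3):
    print(
        f"Redundancy Factor {rf}: tolerates {failures_tolerated(rf)} failure(s), "
        f"keeps {metadata_copies(rf)} copies of metadata and configuration data"
    )
```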
How to change the Nutanix cluster Redundancy Factor
Go to Prism Element and click on the Gear button -> Redundancy State
From the drop-down menu, choose Redundancy Factor 3 and save the configuration
NOTE: Changing the Redundancy Factor from 2 to 3 does not affect storage capacity on the cluster.
What is Replication factor (RF)?
The Nutanix Replication Factor is the number of data copies kept on the Nutanix cluster (VM data and Oplog). For example, with Replication Factor 2 (RF=2) set on a container, every VM data block has 2 copies on the Nutanix cluster (including data in the Oplog). With RF=3, all VM data has 3 copies on different blocks.
NOTE #1: You can have a container with RF=2 and another container with RF=3 configured on the same Nutanix cluster.
NOTE #2: Changing a container's Replication Factor from RF=2 to RF=3 consumes more storage space on the Nutanix cluster, because the system has to keep 3 copies of the VM data stored on containers with RF=3.
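To make the capacity impact concrete, here is a rough sketch. It assumes usable capacity is simply raw capacity divided by the replication factor and ignores every other overhead (CVM, metadata, reserved space, compression and so on), so real numbers will differ. The function name and the 60 TiB figure are illustrative only.

```python
def usable_capacity_tib(raw_capacity_tib, replication_factor):
    """Approximate usable capacity when every data block is stored RF times.

    Simplifying assumption: usable = raw / RF, with no other overheads.
    """
    if replication_factor not in (2, 3):
        raise ValueError("Replication factor must be 2 or 3")
    return raw_capacity_tib / replication_factor


raw = 60.0  # hypothetical raw capacity in TiB
for rf in (2, 3):
    usable = usable_capacity_tib(raw, rf)
    print(f"RF={rf}: about {usable:.1f} TiB usable out of {raw:.0f} TiB raw")
```

Because the replication factor is set per container, only the data on RF=3 containers pays the extra cost.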
To read more about VM data placement on containers with RF=2, RF=3, Zookeeper and Metadata placement see the following article:
The table below should clarify what is supported and what the impact on the cluster is.
Redundancy factor | Fault tolerance | Compute cluster settings | Supported RF on containers | Description |
RF2 | FT1 | N+1 | RF2 | A cluster can sustain the failure of a single Nutanix node or disk |
RF3 | FT2 | N+2 | RF2 and RF3 | A cluster can sustain the simultaneous failure of two Nutanix nodes or disks |
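The table can also be read as a lookup keyed by the cluster-level setting in its first column. The following Python sketch encodes only the rows shown above; `SUPPORT_MATRIX` and `container_rf_supported` are hypothetical names used for illustration, not part of any Nutanix API.

```python
# Rows of the table above, keyed by the cluster-level factor (2 or 3).
SUPPORT_MATRIX = {
    2: {"fault_tolerance": "FT1", "compute": "N+1", "container_rf": (2,)},
    3: {"fault_tolerance": "FT2", "compute": "N+2", "container_rf": (2, 3)},
}


def container_rf_supported(cluster_factor, container_rf):
    """Return True if a container with this RF is supported on the cluster."""
    entry = SUPPORT_MATRIX.get(cluster_factor)
    if entry is None:
        raise ValueError("Cluster factor must be 2 or 3")
    return container_rf in entry["container_rf"]


# An RF=3 container is only supported once the cluster itself is at FT2.
print(container_rf_supported(2, 3))
print(container_rf_supported(3, 3))
```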