Two node clusters

  • 7 December 2019
  • 4 replies
  • 773 views

Userlevel 2
Badge +3

A traditional Nutanix cluster requires a minimum of three nodes, but Nutanix also offers the option of a two-node cluster for ROBO implementations and other situations that require a lower-cost yet highly resilient option. Unlike a one-node cluster (see Single-Node Clusters), a two-node cluster can still provide many of the resiliency features of a three-node cluster. This is possible by adding an external Witness VM in a separate failure domain to the configuration (see Configuring a Witness (two-node cluster)). Nevertheless, there are some restrictions when employing a two-node cluster. The following links provide guidelines and information about configuring two-node clusters:

 

https://portal.nutanix.com/#/page/docs/details?targetId=Web-Console-Guide-Prism-v511:wc-cluster-two-node-guidelines-r.html

and

https://portal.nutanix.com/#/page/docs/details?targetId=Web-Console-Guide-Prism-v511:wc-cluster-two-node-c.html

The second link contains a video about the topic.

 

Feel free to ask questions if anything about two-node clusters is unclear.

4 replies

Userlevel 3
Badge +17

How can I create a 2 node cluster?

Userlevel 2
Badge +3

@hienle You can review the links above. Below is a snippet of what you will see under the title “Two node cluster guidelines”:

==

A two-node cluster is configured and upgraded like a regular (three-node or more) cluster in most ways, but note the following for a two-node cluster:

  • Size your implementation for N + 1 so that in the event of a node loss (50% loss of resources) the remaining node will have sufficient resources to allow the cluster to continue functioning.
  • There is a heartbeat check (ping) between the nodes every two seconds. If a successful ping does not occur within 10 seconds (5 consecutive failed tries), a failover is initiated (see Failure and Recovery Scenarios). When the cluster recovers, it must remain in healthy status for at least 15 minutes before it will failback.
  • The upgrade process in a two-node cluster may take longer than the usual process because of the additional step of syncing data while transitioning between the single-node and two-node states. Nevertheless, the cluster remains operational during the upgrade.
  • Witness VM considerations
    • A Witness VM for two-node clusters requires a minimum of 2 vCPUs, 6 GB of memory, and 25 GB of storage.

..

..

==
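To make the heartbeat bullet above a bit more concrete, here is a minimal Python sketch of the timing rules it describes (a ping every two seconds, failover after five consecutive failures, failback only after 15 healthy minutes). This is only an illustration of the arithmetic and is not how Nutanix implements it; the class name `HeartbeatMonitor`, its fields, and the returned action strings are all hypothetical.

```python
from dataclasses import dataclass

# Values taken from the guidelines quoted above.
HEARTBEAT_INTERVAL_S = 2       # one ping every two seconds
MAX_FAILED_PINGS = 5           # 5 consecutive failures = 10 seconds
FAILBACK_HEALTHY_S = 15 * 60   # 15 healthy minutes before failback


@dataclass
class HeartbeatMonitor:
    """Hypothetical model of the failover/failback timing, for illustration only."""
    failed_pings: int = 0
    healthy_seconds: int = 0
    single_node_mode: bool = False

    def record_ping(self, success: bool) -> str:
        """Process one heartbeat result and return the resulting action."""
        if success:
            self.failed_pings = 0
            if self.single_node_mode:
                # Count continuous healthy time while running on one node.
                self.healthy_seconds += HEARTBEAT_INTERVAL_S
                if self.healthy_seconds >= FAILBACK_HEALTHY_S:
                    self.single_node_mode = False
                    self.healthy_seconds = 0
                    return "failback"
            return "ok"

        # A failed ping resets the healthy timer and extends the failure streak.
        self.healthy_seconds = 0
        self.failed_pings += 1
        if not self.single_node_mode and self.failed_pings >= MAX_FAILED_PINGS:
            self.single_node_mode = True
            return "failover"
        return "degraded"
```

Feeding it five failed pings in a row triggers the failover on the fifth one, and it then takes 450 consecutive successful pings (450 × 2 s = 15 minutes) before the sketch reports a failback. A single failed ping during that window resets the healthy timer.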

 

I hope this helps a little. Feel free to ask more questions.

 

Regards,

 

-Said

 

Userlevel 2
Badge +1

Be VERY careful when considering a two-node environment.  We’ve run into a number of caveats, one of which I haven’t seen documented.  When going through maintenance tasks (AOS upgrades, firmware upgrades, etc.) we’ve seen what looks like the data uncompressing when transitioning from a two-node to a single-node cluster.  I’ve attached a screenshot showing the storage usage of our container as it transitioned between two-node and one-node states.

 

I have a ticket open with Nutanix to get clarification, but from what we see, enabling compression is not a good idea when using a two-node deployment.  After having our deployment live for a few months, I feel like it’s not ready for prime time yet.  There are a lot of caveats vs. a 3+ node cluster, and for what you pay I feel like other HCI players do a much better job.

Userlevel 2
Badge +1

I wanted to provide an update on some testing I’ve done in our two-node environment (running ESXi).  From what we were seeing during previous maintenance tasks, there was an increase in usage during the two-node → one-node transition, so I wanted to test this further.  We have our container configured with “Advertised Capacity” at the usage threshold recommended by Nutanix for a two-node cluster (40%, which takes into account a node failure).  I then filled the datastore to ~95% capacity and applied AOS 5.11.2.1 (coming from 5.11.2).  During the upgrade process we did briefly reach max capacity (the Advertised Capacity size), but it didn’t affect the running VMs.  So I can’t really explain what’s happening behind the scenes, but I do feel better about leaving compression enabled.

 

 

Even though the two-node cluster has some caveats, Nutanix offering ROBO licensing on a per-VM basis makes this an enticing option for a very small site.  If you were to go with the full core/capacity licensing, you might as well go with a three-node cluster to get the full feature set of AOS.
