Scheduled power outage? Relocating cluster hardware? If you need to shut down all the nodes in your AHV cluster, here's how.

  • 26 March 2020

For most maintenance tasks and upgrades we can keep the cluster up and the VMs running, but in some cases the whole cluster needs to be shut down. If you only need to power off a single node, a cluster of three or more nodes can stay up.

To stop a single-node cluster, please see the section “Shutting Down a Single-node Cluster” in the NX and SX series hardware administration guide.

To stop a single node in a larger cluster, see the section “Shutting Down a Node in a Cluster (AHV)”.

If you're facing a site power outage, a full network outage, or a physical relocation of the whole cluster, you'll want to shut the entire cluster down gracefully.

The full procedure is covered in the article Shutting Down an AHV Cluster for Maintenance or Relocation.

 

In summary, the procedure is as follows:

  1. Update NCC and perform a health check, then address any items of concern.
  2. Shut down all the user VMs.
  3. Stop any Nutanix Files cluster, if applicable. At this point no VMs other than the CVMs should be running.
  4. Stop the Nutanix cluster. At this point the data hosted on the cluster becomes unavailable, and Prism won't be accessible.
  5. Shut down each node in the cluster.
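
The order of these steps matters, since each one assumes the previous one has finished. As a rough illustration only, the sequence could be modelled as a small Python checklist that refuses to skip ahead. The step labels and the ShutdownPlan class are hypothetical names made up for this sketch, not part of any Nutanix tool, and the code does not run anything against a cluster.

```python
from dataclasses import dataclass, field

# Ordered phases taken from the summary in this post.
# The identifiers themselves are hypothetical labels for this sketch.
SHUTDOWN_ORDER = [
    "health_check",        # update NCC, run health check, fix concerns
    "stop_user_vms",       # shut down all user VMs
    "stop_files_cluster",  # only if Nutanix Files is deployed
    "stop_cluster",        # data and Prism become unavailable here
    "shutdown_nodes",      # power off each node
]


@dataclass
class ShutdownPlan:
    has_files_cluster: bool = False
    done: list = field(default_factory=list)

    def required_steps(self):
        # Skip the Nutanix Files step when no Files cluster exists.
        return [
            step for step in SHUTDOWN_ORDER
            if step != "stop_files_cluster" or self.has_files_cluster
        ]

    def next_step(self):
        # First required step that has not been completed yet, or None.
        remaining = [s for s in self.required_steps() if s not in self.done]
        return remaining[0] if remaining else None

    def complete(self, step):
        # Reject any step that is attempted out of order.
        expected = self.next_step()
        if step != expected:
            raise ValueError(f"out of order: expected {expected!r}, got {step!r}")
        self.done.append(step)

    def safe_to_cut_power(self):
        # True only once every required step has been completed.
        return self.next_step() is None
```

For example, calling `complete("stop_cluster")` before `complete("stop_user_vms")` raises an error, and `safe_to_cut_power()` only returns True once every required step has been recorded.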

 

At this point, from the Nutanix cluster's perspective, it's OK to cut power or networking, and you're ready to proceed with maintenance. The article mentioned above goes into more detail on a few different methods to shut down all the user VMs on AHV.

After maintenance, power on the individual hosts. This automatically starts the CVMs, but you won't see Prism come up just yet. Give it about five minutes, then SSH or console into a CVM and start the cluster.
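
If you'd rather not watch the clock, one possible way to handle the wait is to poll until each CVM accepts a TCP connection on the SSH port. This is a minimal sketch, assuming that an open port 22 is a good-enough hint that a CVM has booted (it does not prove the CVM services are ready). The function names, the example IP addresses, and the timeout values are illustrative assumptions, not anything taken from Nutanix documentation.

```python
import socket
import time


def ssh_port_open(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_cvms(cvm_ips, probe=ssh_port_open, max_wait=600.0,
                  interval=15.0, sleep=time.sleep, clock=time.monotonic):
    """Poll until every CVM answers the probe or max_wait seconds pass.

    Returns the list of CVMs that are still unreachable
    (an empty list means all of them answered).
    """
    deadline = clock() + max_wait
    pending = list(cvm_ips)
    while pending:
        # Keep only the CVMs that still fail the probe.
        pending = [ip for ip in pending if not probe(ip)]
        if not pending:
            break
        if clock() >= deadline:
            return pending
        sleep(interval)
    return []


if __name__ == "__main__":
    # Hypothetical CVM addresses; replace with your own.
    unreachable = wait_for_cvms(["10.0.0.11", "10.0.0.12", "10.0.0.13"])
    if unreachable:
        print("Still unreachable:", ", ".join(unreachable))
    else:
        print("All CVMs accept SSH connections.")
```

Once it reports that every CVM is reachable, log in to any one CVM and start the cluster as described above.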

As always, after maintenance it's a good idea to run a full health check. You can expect to see a warning that CVMs were recently restarted, and you may see a warning about NTP. Both of these checks should pass if you rerun them after another hour.

If you need to shut down an ESXi or Hyper-V cluster, the CVM cluster portion of the process is essentially the same. Take the VMs offline first, then run ‘cluster stop’ from within a CVM, then follow the node shutdown procedure appropriate for your hypervisor.

