Cluster Maintenance or Relocation

  • 9 October 2019
  • 1 reply

Occasionally you need to temporarily power off your cluster hardware for maintenance, relocation, or another necessary task.
To do this, stop the cluster and then start it again once the task is complete.

However, several tasks must be performed before and after the cluster stop/start commands. The shutdown sequence is:

1- Disable any “Protection Domains” in the cluster and make sure no replication is in progress.
2- Run the NCC health check on the cluster and make sure no “Fail” message appears.
3- Power off all the VMs running in the cluster.
4- Stop “Files” (previously AFS) if you are using it in the cluster.
5- Run the "cluster stop" command.
6- Power off all CVMs.
7- Put all hypervisors in maintenance mode.
8- Power off the hardware.
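
Step 2 above can be turned into a simple gate before moving on to the later steps. The following is only a minimal sketch under stated assumptions: it presumes the NCC output was saved to a text file beforehand (the file name is hypothetical) and that failed checks are marked with the word "FAIL" in that file. The function name is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: refuse to continue if a saved NCC report mentions a failure.
# Assumption: the report was captured to a text file beforehand, for example
#   ncc health_checks run_all | tee ncc_before.log
# and failed checks are marked with the word "FAIL" (any letter case).

# ncc_report_is_clean REPORT_FILE
# Returns 0 if the report exists, is non-empty, and contains no "fail" word.
ncc_report_is_clean() {
    local report="$1"
    # A missing or empty report is treated as "not clean" to stay on the safe side.
    [ -s "$report" ] || return 1
    # -i: ignore case, -w: whole word only, -q: quiet (exit status only).
    if grep -iqw 'fail' "$report"; then
        return 1
    fi
    return 0
}
```

A possible use is `ncc_report_is_clean ncc_before.log || echo "Resolve the failed checks first"`. Adjust the marker to whatever your NCC version actually prints.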

After the maintenance is complete and the hardware is powered back on, you will need to undo the steps above.

Take the nodes out of maintenance mode and make sure all CVMs are powered on, then run the command “cluster start”. You can then start “Files”, power on the cluster VMs, and finally re-enable the “Protection Domains”. Always run the NCC health check afterwards and make sure no “Fail” messages are reported.
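
One way to confirm that all CVMs are up before running “cluster start” is to poll them. The sketch below is an assumption-laden illustration, not an official procedure: the function names are hypothetical, the CVM addresses must be supplied by you, and a single ping reply is treated as "powered on", which may not mean every service is ready.

```shell
#!/usr/bin/env bash
# Sketch: wait until every CVM answers before running "cluster start".
# Assumptions: the CVM addresses are passed in by the operator, and one
# ICMP echo reply is a good enough "powered on" signal for this purpose.

# Probe one address (Linux ping: -c 1 = one packet, -W 2 = 2 s timeout).
# Kept as a separate function so it can be replaced by another check.
probe() {
    ping -c 1 -W 2 "$1" >/dev/null 2>&1
}

# wait_for_cvms MAX_TRIES DELAY_SECONDS IP [IP...]
# Returns 0 once all addresses respond, 1 if MAX_TRIES rounds pass first.
wait_for_cvms() {
    local max_tries="$1"
    local delay="$2"
    shift 2
    local try=1
    local pending ip
    while [ "$try" -le "$max_tries" ]; do
        pending=0
        for ip in "$@"; do
            probe "$ip" || pending=$((pending + 1))
        done
        if [ "$pending" -eq 0 ]; then
            return 0
        fi
        echo "round $try: $pending CVM(s) not reachable yet" >&2
        sleep "$delay"
        try=$((try + 1))
    done
    return 1
}
```

For example, `wait_for_cvms 30 10 <cvm-ip-1> <cvm-ip-2> && cluster start` would retry for up to 30 rounds, 10 seconds apart, and only then start the cluster.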

To see how this is done on an ESXi cluster, see “Shutting down Nutanix cluster running VMware vSphere for maintenance or relocation”.
For all supported hypervisors, refer to "How to shutdown a Cluster and Start it again".

1 reply

If you're tracking port statistics with external tools, or have VLAN tags or other security settings on certain ports of your managed Top of Rack switch, I'd recommend saving the CDP or LLDP output before the shutdown and comparing it with the output afterwards to confirm that all uplink ports are connected to the same switch ports.

allssh 'ssh root@ lldpctl'

ESXi (5.x and later): VMware KB 1007069
vim-cmd hostsvc/net/query_networkhint --pnic-name=vmnic[xx]
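
The before/after comparison could be sketched roughly as follows. This assumes you have already saved the neighbour output to two text files (the file names are up to you) and the function name is hypothetical. Because such output often contains timers that change on every run, the optional third argument is a regular expression selecting only the lines worth comparing. The right pattern depends on your tool's output format, so treat any field name you pass as an assumption to verify.

```shell
#!/usr/bin/env bash
# Sketch: compare switch-neighbour output saved before and after the move.
# Assumption: each file holds the raw LLDP/CDP text captured by the operator.

# uplinks_unchanged BEFORE_FILE AFTER_FILE [KEEP_REGEX]
# Compares only lines matching KEEP_REGEX (default: every non-empty line).
# Prints a unified diff and returns 1 if the captures differ, 0 if they match.
uplinks_unchanged() {
    local before="$1"
    local after="$2"
    local keep="${3:-.}"
    local tmp_b tmp_a status
    tmp_b=$(mktemp) && tmp_a=$(mktemp) || return 2
    # Keep only the lines of interest; "|| true" tolerates an empty match.
    grep -E -- "$keep" "$before" > "$tmp_b" || true
    grep -E -- "$keep" "$after" > "$tmp_a" || true
    status=0
    diff -u "$tmp_b" "$tmp_a" || status=1
    rm -f "$tmp_b" "$tmp_a"
    return "$status"
}
```

For instance, `uplinks_unchanged lldp_before.txt lldp_after.txt 'PortID'` would compare only the lines containing "PortID", if that is how your output labels the switch port.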