Maintenance of One ESXi Host in Nutanix Cluster

Pathfinder

Maintenance of One ESXi Host in Nutanix Cluster

I have a Nutanix cluster with 4 ESXi hosts. I would like to perform maintenance on all the hosts one by one, so that I can complete the maintenance without downtime for the VMs. Could you please share the process for doing maintenance on one ESXi host?

 

I am looking for the end-to-end maintenance procedure for one ESXi host in a Nutanix cluster.

 

 

1 ACCEPTED SOLUTION

Wayfarer

Re: Maintenance of One ESXi Host in Nutanix Cluster

Steps before bringing the host down:
1) Ensure that the “Data Resiliency – Status” is Normal in the Prism portal for the target cluster.
2) Migrate all user VMs (except the CVM) residing on the target ESXi host to other healthy nodes in the cluster.
3) Connect to the CVM via SSH and find its UUID using the command below.

     ncli host ls | grep -C7 [IP address of CVM]

4) Place the CVM in maintenance mode using the command below, with the UUID found in the previous step.

    ncli host edit id=[UUID] enable-maintenance-mode="true"

5) Verify that the CVM has been placed in maintenance mode using the following command.

    cluster status | grep CVM

6) Shut down the CVM using the command below.

    cvm_shutdown -h now

7) Wait until the CVM has shut down.
8) Place the ESXi host in maintenance mode and perform your maintenance activity.

 

Steps after bringing the host back online:
1) Take the host out of maintenance mode and power on the CVM.
2) Connect to any neighboring CVM in the cluster using SSH.
3) Check the status of the CVM that was powered on using the following command. At this stage the CVM should still be reported as in maintenance mode.

     ncli host ls | grep -C7 [IP address of CVM]

4) Take the CVM out of maintenance mode using the command below.

    ncli host edit id=[UUID] enable-maintenance-mode="false"

5) Verify that the CVM has been removed from maintenance mode using the following command.

    cluster status | grep CVM

6) Ensure that the “Data Resiliency” and metadata sync statuses return to normal in the Prism portal after the maintenance activity is complete. It may take 5 to 10 minutes for this to be reflected.
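
For anyone who wants to check the maintenance flag from a script rather than by eye, here is a minimal sketch. It assumes that each host block printed by `ncli host ls` consists of `Label : value` lines and that the labels are `Uuid`, `Controller VM Address` and `Under Maintenance Mode`. Those label names and the sample block are my assumptions, not taken from this thread, so compare them against your own output first. `field_value` is just a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print the value of one "Label : value" field from text on stdin.
# Only the first matching field is printed.
field_value() {
  awk -v want="$1" '
    {
      pos = index($0, ":")          # split on the first colon only
      if (pos == 0) next            # skip lines without a colon
      label = substr($0, 1, pos - 1)
      value = substr($0, pos + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", label)
      gsub(/^[ \t]+|[ \t]+$/, "", value)
      if (label == want) { print value; exit }
    }'
}

# Assumed sample of one host block; real output has more fields.
sample_block='    Uuid                      : 11111111-2222-3333-4444-555555555555
    Controller VM Address     : 169.254.10.1
    Under Maintenance Mode    : true'

printf '%s\n' "$sample_block" | field_value "Under Maintenance Mode"
```

On a CVM the same helper could be fed from the command used in the steps above, for example `ncli host ls | grep -C7 169.254.10.1 | field_value "Under Maintenance Mode"`, provided the assumed labels match what your AOS version prints.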

 

NOTE: In the given commands, parameters in brackets [ ] should be replaced with the correct values.
For example:
ncli host ls | grep -C7 [IP address of CVM]  becomes  ncli host ls | grep -C7 169.254.10.1
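
To make the placeholder substitution concrete, here is a small dry-run sketch. It does not run anything against the cluster; it only prints the pre-maintenance commands from the steps above with the CVM IP and UUID filled in, so they can be reviewed before being pasted into an SSH session. The function name `print_pre_maintenance_plan` and the `HOST-UUID` value are hypothetical.

```shell
#!/usr/bin/env bash
# Dry run: print the pre-maintenance commands with the placeholders
# replaced. Nothing is executed against the cluster.
print_pre_maintenance_plan() {
  local cvm_ip="$1"   # IP address of the CVM on the target host
  local uuid="$2"     # host UUID found with "ncli host ls"
  cat <<EOF
ncli host ls | grep -C7 ${cvm_ip}
ncli host edit id=${uuid} enable-maintenance-mode="true"
cluster status | grep CVM
cvm_shutdown -h now
EOF
}

# HOST-UUID is a stand-in; use the UUID reported for your host.
print_pre_maintenance_plan 169.254.10.1 HOST-UUID
```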

7 REPLIES
Vanguard

Re: Maintenance of One ESXi Host in Nutanix Cluster

1) Put the host in maintenance mode and let the guest VMs evacuate.
2) After the guest VMs have evacuated, shut down the CVM by connecting via SSH or the console and issuing the "sudo shutdown -h now" command.
3) Do what you need to do to the host.
4) Reboot the host.
5) Ensure the CVM restarts and the cluster returns to a healthy status.

Pathfinder

Re: Maintenance of One ESXi Host in Nutanix Cluster

Can we skip the step below, since these VMs will be migrated anyway when the ESXi host is put into maintenance mode?

2) Migrated all the user VMs (except CVM) residing in the target ESXi host

Also, what is the difference between the "under maintenance" status values Null and False?

Pathfinder

Re: Maintenance of One ESXi Host in Nutanix Cluster

Hello, thank you for this interesting post.

Do we know what Nutanix's position on this is? In many companies the VMware admin may not be a Nutanix admin who can run SSH commands on a CVM. Is there an official best practice or guide for common tasks on ESXi + Nutanix?

 

Regards,

Jean-Philippe

Pathfinder

Re: Maintenance of One ESXi Host in Nutanix Cluster

Hi, no, the CVM must not be migrated when putting the ESXi host in maintenance mode. However, I would like to hear from Nutanix about simply shutting down the CVM via vCenter, without also needing to put the CVM in maintenance mode through an SSH session.

When an outage happens, the CVM does not switch to maintenance mode, I guess; you just lose the service and get it back when the host comes back online.

Vanguard

Re: Maintenance of One ESXi Host in Nutanix Cluster

If you have a support contract, I would suggest logging in to the support portal and reviewing the vSphere Admin Guide (link below).

 

https://portal.nutanix.com/#/page/docs/details?targetId=vSphere-Admin6-AOS-v51:vSphere-Admin6-AOS-v5...

 

In that doc, there is the following:

 

Shutting Down a Node in a Cluster (vSphere Web Client)

 
Caution: Verify the data resiliency status of your cluster. If the cluster only has replication factor 2 (RF2), you can only shut down one node for each cluster. If an RF2 cluster would have more than one node shut down, shut down the entire cluster.
 
  1. Log on to vCenter Server by using vSphere Web Client.
  2. If DRS is not enabled, manually migrate all the VMs except the Controller VM to another host in the cluster or shut down any VMs other than the Controller VM that you do not want to migrate to another host.

    If DRS is enabled on the cluster, you can skip this step.

  3. Right-click the host and select Maintenance Mode > Enter Maintenance Mode.
  4. In the Confirm Maintenance Mode, click OK.

    The host gets ready to go into maintenance mode, which prevents VMs from running on this host. DRS automatically attempts to migrate all the VMs to another host in the cluster.

    Note: If DRS is not enabled, you need to manually migrate or shut down all the VMs excluding the Controller VM. If some VMs are not migrated automatically even when DRS is enabled, the cause may be a configuration option in the VM that is not present on the target host.

     

  5. Log on to the Controller VM with SSH and shut down the Controller VM.

     

    nutanix@cvm$ cvm_shutdown -P now

     

     

    Note: Do not reset or shut down the Controller VM in any way other than the cvm_shutdown command, to ensure that the cluster is aware that the Controller VM is unavailable.

     

  6. After the Controller VM shuts down, wait for the host to go into maintenance mode.
  7. Right-click the host and select Shut Down.

    Wait until vCenter Server displays that the host is not responding, which may take several minutes. If you are logged on to the ESXi host rather than to vCenter Server, the vSphere Web Client disconnects when the host shuts down.

Shutting Down a Node in a Cluster (vSphere command line)

Before you begin
If DRS is not enabled, manually migrate all the VMs except the Controller VM to another host in the cluster or shut down any VMs other than the Controller VM that you do not want to migrate to another host. If DRS is enabled on the cluster, you can skip this pre-requisite.
 
Caution: Verify the data resiliency status of your cluster. If the cluster only has replication factor 2 (RF2), you can only shut down one node for each cluster. If an RF2 cluster would have more than one node shut down, shut down the entire cluster.

You can put the ESXi host into maintenance mode and shut it down from the command line or by using the vSphere Web Client.

 
  1. Log on to the Controller VM with SSH and shut down the Controller VM.

     

    nutanix@cvm$ cvm_shutdown -P now

     

  2. Log on to another Controller VM in the cluster with SSH.
  3. Shut down the host.

     

    nutanix@cvm$ ~/serviceability/bin/esx-enter-maintenance-mode -s cvm_ip_addr

     

    If successful, this command returns no output. If it fails with a message like the following, VMs are probably still running on the host.

    CRITICAL esx-enter-maintenance-mode:42 Command vim-cmd hostsvc/maintenance_mode_enter failed with ret=-1

    Ensure that all VMs are shut down or moved to another host and try again before proceeding.

    nutanix@cvm$ ~/serviceability/bin/esx-shutdown -s cvm_ip_addr

    Replace cvm_ip_addr with the IP address of the Controller VM on the ESXi host.

    Alternatively, you can put the ESXi host into maintenance mode and shut it down using the vSphere Web Client.

     

     

    If the host shuts down, a message like the following is displayed.

    INFO esx-shutdown:67 Please verify if ESX was successfully shut down using ping hypervisor_ip_addr

     

  4. Confirm that the ESXi host has shut down.

     

    nutanix@cvm$ ping hypervisor_ip_addr

     

    Replace hypervisor_ip_addr with the IP address of the ESXi host.

    If no ping packets are answered, the ESXi host is shut down.
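
The ping check in step 4 can be wrapped in a small polling loop. This is only a sketch under a few assumptions: it uses the Linux `ping` options `-c 1 -W 2` (one packet, two-second timeout), and the function name `wait_for_host_down`, the attempt count, the delay and the `PING_CMD` override (there so the loop can be exercised without a real host) are all hypothetical.

```shell
#!/usr/bin/env bash
# Poll a host with ping until it stops answering.
# Usage: wait_for_host_down <hypervisor_ip_addr> [attempts] [delay_seconds]
wait_for_host_down() {
  local host="$1"
  local attempts="${2:-30}"
  local delay="${3:-10}"
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    # PING_CMD defaults to ping; a failed ping is treated as "host is down".
    if ! "${PING_CMD:-ping}" -c 1 -W 2 "$host" >/dev/null 2>&1; then
      echo "host $host is not answering ping"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "host $host still answers ping after $attempts attempts" >&2
  return 1
}
```

It would be called as `wait_for_host_down hypervisor_ip_addr`, with hypervisor_ip_addr replaced as described above. Note that a failed ping only shows that the address is not answering, so it is worth confirming in vCenter as well.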

Pathfinder

Re: Maintenance of One ESXi Host in Nutanix Cluster

Thanks for the link, patrbng!