Solved

Node is Down after firmware upgrade - Host X.X.X.X appears to have failed

  • 8 February 2023
  • 6 replies
  • 416 views


Hello,

 

After a firmware upgrade, one host is stuck DOWN in maintenance mode:

CVM: 192.168.131.132 Down

I can run the command to exit maintenance mode, but it has no effect, and the node is reported as "removed from metadata store":

nutanix@:~$ ncli host edit id=7 enable-maintenance-mode=false

Id : …

Hypervisor Address : 192.168.131.122

Host Status : NORMAL
Oplog Disk Size : 394 GiB (423,054,278,649 bytes) (3.9%)

Under Maintenance Mode : false (ncli_manual)

Metadata store status : Node is removed from metadata store

...
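
For anyone who wants to script this check instead of reading the output by eye, here is a minimal sketch. It only assumes the `Key : Value` layout shown in the output above; the function names `parse_ncli_fields` and `host_problems` are hypothetical helpers, not part of any Nutanix tool, and the sample text is copied from this post.

```python
def parse_ncli_fields(text):
    """Parse 'Key : Value' lines into a dict, splitting on the first ' : ' only."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


def host_problems(fields):
    """Return human-readable problems found in one host's parsed fields."""
    problems = []
    # Field names below are taken from the output pasted in this thread.
    if fields.get("Under Maintenance Mode", "").lower().startswith("true"):
        problems.append("host is still in maintenance mode")
    status = fields.get("Metadata store status", "").lower()
    if "removed from metadata store" in status:
        problems.append("node is removed from the metadata store")
    return problems


# Sample copied from the post; a real run would feed in captured ncli output.
SAMPLE = """\
Hypervisor Address : 192.168.131.122
Host Status : NORMAL
Under Maintenance Mode : false (ncli_manual)
Metadata store status : Node is removed from metadata store
"""

if __name__ == "__main__":
    for problem in host_problems(parse_ncli_fields(SAMPLE)):
        print(problem)
```

With the output above, this reports only the metadata store problem, which matches what I see: maintenance mode is already off, but the node has not been added back to the metadata store.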

So I tried to recover it, but the script fails:

nutanix@:~$ python /home/nutanix/cluster/bin/lcm/lcm_node_recovery.py 192.168.131.122
Recovering node 192.168.131.122
Checking if the node 192.168.131.122 is in phoenix
Current node status host Node 192.168.131.122 out of phoenix mode
Bringing host None out of maintenance mode
Successfully put host None out of maintenance mode
Bringing CVM 192.168.131.122 out of maintenance mode
Traceback (most recent call last):
  File "/home/nutanix/cluster/bin/lcm/lcm_node_recovery.py", line 255, in <module>
    if not main():
  File "/home/nutanix/cluster/bin/lcm/lcm_node_recovery.py", line 231, in main
    if not obj.recover_node():
  File "/home/nutanix/cluster/bin/lcm/lcm_node_recovery.py", line 195, in recover_node
    svm_ip=self.svm_ip):
  File "build/bdist.linux-x86_64/egg/cluster/client/genesis_utils.py", line 2006, in cvm_set_maintenance_mode_status

 

Any ideas? Thank you in advance!


Best answer by GuiEIVP 14 February 2023, 14:18





Just involve support. They are great and will help you. 


Thanks for the fast answer. Yes, support is great, but this happened on my previous Nutanix cluster, whose support ended only a few months ago.

So I am asking the community for a little help ;)


Hi,

  Not sure if this would help:

https://portal.nutanix.com/page/documents/kbs/details?targetId=kA0600000008ZBnCAM

  However, if you plan to keep running without a valid support entitlement, I would consider reimaging the node at some point, especially if there is no clear way forward and it needs to be brought back up in a timely manner.

 


Thanks Elroy,

 

Yes, I already tried this fix, but it did not work…

I tried to reimage, but Phoenix refused, reporting that there is already a valid installation.

 

Is there a way to force this from the Phoenix command line? I found no documentation about this; every document I have says I should land in the installer, but I cannot: I land directly in the Phoenix shell.


Hi,

  Did you try with a standalone Foundation VM?


I finally managed to get things up. I shut everything down completely and disconnected all the Nutanix nodes and the Cisco switch. Eventually the CVM woke up!

Unfortunately I cannot get it back into the cluster, but I will create a new post...