Solved

Update Failure

  • 10 September 2016
  • 5 replies
  • 2760 views

Badge +2
Hi all,

Today I wanted to upgrade NOS on one of our clusters from 4.1.x to 4.6.3, which worked perfectly fine on another cluster.
The first CVM was updated successfully, but the update of the next CVM failed (maybe due to connectivity issues to the host).
upgrade_status says that the node is upgrading right now, but in fact the CVM is powered off and won't boot. After it displays the following information, it reboots, fails again, and after the second reboot it stays shut down.


Is there a simple fix for this or do I have to open a support ticket?

EDIT: It seems the update process tried to shut down this CVM but hit a timeout, and now the update is stuck. Can I restart the upgrade process or manually update this CVM?

Thank you.

Best answer by Eric-The_Viking 29 June 2018, 15:20


5 replies

Userlevel 4
Badge +18
It looks like the CVM is failing to restart after its upgrade.
This needs support intervention. I suggest opening a support ticket, and we will assist you with the situation.

-NP
Badge +2
Hi,

Thank you for your reply.
I guess that means getting in touch with Dell support first, as it's an XC-630 cluster.
So I will contact Dell support and they will get in touch with you, or how does that work?

I don't really know which update steps had already been performed before it broke.
On the other host (the one that has been updated), the CVM was moved from the admin user profile to the Program Files folder. This has not yet happened.
The only log information I could find is in svm_update.out. It repeatedly states "Waiting for the CVM to shutdown" for over an hour.
svm-error.out states "PM Timed out waiting for the CVM to shutdown"; that's where it is stuck.
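
For anyone wanting to check quickly how long a node has been looping on these messages, here is a minimal sketch that counts them. Only the two message strings come from the logs quoted above; the function name `summarize_log` and the idea of passing the log file paths on the command line are assumptions for illustration, not part of any Nutanix tooling.

```python
from collections import Counter

# Message texts quoted in this thread; everything else about the log
# format is an assumption.
WAITING = "Waiting for the CVM to shutdown"
TIMED_OUT = "Timed out waiting for the CVM to shutdown"


def summarize_log(lines):
    """Count how often each of the two known messages appears in the given lines."""
    counts = Counter()
    for line in lines:
        if TIMED_OUT in line:
            counts["timed_out"] += 1
        elif WAITING in line:
            counts["waiting"] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python summarize.py <path to svm_update.out> <path to svm-error.out>
    for path in sys.argv[1:]:
        with open(path, errors="replace") as fh:
            print(path, dict(summarize_log(fh)))
```

A high "waiting" count followed by a "timed_out" entry would match the situation described here: the shutdown request was sent, never completed, and the upgrade gave up.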


Thank you
Userlevel 4
Badge +18
Yes, please contact Dell support; if required, they will get in touch with us.

The CVM AOS upgrade process is a simple, automated process. Each CVM is upgraded in a rolling manner, and each CVM needs a reboot to finish its upgrade before the next CVM in the queue is upgraded. Going by the screenshot you shared, it looks like the CVM got stuck while booting up.

It is unable to find the active partition to boot from. There are a few internal KBs that cover this issue, and the support person on the call or remote session will be able to assist you.

upgrade_status will continue to show that the CVM is upgrading because the other nodes are waiting for this node to complete the AOS upgrade and pass the token on to the next CVM.
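
To illustrate why one stuck CVM blocks everything behind it, here is a toy model of a token-based rolling upgrade. It is only a sketch of the behaviour described above, not how AOS implements it; the function `rolling_upgrade`, the node names, and the status strings are all made up for the example.

```python
def rolling_upgrade(nodes, boots_ok):
    """Toy model: upgrade nodes one at a time, handing a token down the queue.

    nodes:    node names in upgrade order.
    boots_ok: dict of node name -> True if its CVM comes back after the reboot
              (nodes not listed are assumed to boot fine).
    Returns a dict of node name -> "done", "upgrading", or "pending".
    """
    status = {name: "pending" for name in nodes}
    for name in nodes:
        status[name] = "upgrading"      # this node now holds the token
        if not boots_ok.get(name, True):
            break                       # token never released; the rest stay pending
        status[name] = "done"           # reboot succeeded, token moves on
    return status


if __name__ == "__main__":
    print(rolling_upgrade(["node-1", "node-2", "node-3"], {"node-2": False}))
```

In this model the second node stays in "upgrading" indefinitely and the third never starts, which is the same picture upgrade_status gives in this thread.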

'the CVM was moved from the admin user profile to the Program Files folder. This has not yet happened.'

If I remember correctly, the path has changed in the newer AOS version, so I think it's fine.
Userlevel 2
Badge +3
I have the same problem: one Dell node has upgraded, the other two have stalled, and there is no apparent way of solving the problem myself. It looks from other forums that this is a common problem. What is Nutanix doing to fix it? As someone who is coming from a VMware environment, this is not good...
Userlevel 2
Badge +3
Hi all, it seems this was a known problem in v5.5.3, and Nutanix issued a fix on 22 June 2018; it is fixed in AOS 5.5.3.1.
I hope this helps.

Regards

Eric