While configuring LACP, rolling hypervisor reboot stuck at 90% | Nutanix Community

Good evening Nutanix’ers,

 

After I configured an active-active bundle on my virtual switch in PE and modified my upstream switch for LACP, all seemed well. However, I noticed that the rolling hypervisor reboot never finished; it has been stuck at 90% for ~3 hours.

 

When viewing incomplete tasks in ecli, I noticed that the virtual switch update and the hypervisor rolling restart are both stuck at “kRunning.”

 


I am thinking about killing the rolling restart task; however, doing so warns me that “Using this command can cause database corruption and complete system failure, if used improperly.”

 

I admit that I may not have fully appreciated a step about maintenance mode when following the KB for this specific configuration, so if I’m borked, I’ll chalk it up to a learning experience. However, I’d like to see what my options are 🙂.

 

Despite all of this, I’ve confirmed LACP is negotiated, and the rest of the system seems normal despite the stuck task. What do you guys think?

Is this production? If so: Involve support. 

 

Is this a test environment? Stop the stuck task. Check whether a node is holding the shutdown token; if so, reboot that specific node. Then make the change on the virtual switch again (enable LACP).
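
Before stopping anything, it can help to confirm which tasks are actually stuck rather than just slow. Here is a rough Python sketch of that check. The record layout (`uuid`, `operation`, `status`, `started`) and the sample values are made up for illustration and are not real ecli output; you would have to fill them in from whatever your task listing shows.

```python
from datetime import datetime, timedelta

# Hypothetical task records; field names and values are assumptions,
# not the actual format printed by ecli.
TASKS = [
    {"uuid": "task-a", "operation": "virtual switch update",
     "status": "kRunning", "started": datetime(2024, 1, 1, 18, 0)},
    {"uuid": "task-b", "operation": "hypervisor rolling restart",
     "status": "kRunning", "started": datetime(2024, 1, 1, 18, 5)},
    {"uuid": "task-c", "operation": "some recent task",
     "status": "kRunning", "started": datetime(2024, 1, 1, 20, 50)},
    {"uuid": "task-d", "operation": "some finished task",
     "status": "kSucceeded", "started": datetime(2024, 1, 1, 17, 0)},
]


def find_stuck_tasks(tasks, now, max_age=timedelta(hours=1)):
    """Return tasks still in kRunning that started more than max_age ago."""
    return [
        task for task in tasks
        if task["status"] == "kRunning" and now - task["started"] > max_age
    ]


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 21, 0)
    for task in find_stuck_tasks(TASKS, now):
        age = now - task["started"]
        print(f'{task["uuid"]}: {task["operation"]} running for {age}')
```

The one-hour threshold is arbitrary; pick something comfortably longer than a normal rolling reboot takes on your cluster.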


I believe the shutdown token is held by one of the nodes, and the next node is waiting for the token before it can proceed with its reboot.

 

I would suggest involving support to investigate the issue.
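
To illustrate why a held token would leave the task sitting at a fixed percentage, here is a toy Python model of a rolling reboot guarded by a single token. It is only a sketch of the general idea (one node at a time, nobody proceeds while another node holds the token); the function and node names are hypothetical, and this is not how Nutanix actually implements it.

```python
def rolling_reboot(pending_nodes, token_holder=None):
    """Toy model: reboot nodes one at a time behind a single token.

    Returns (rebooted_nodes, blocking_holder). blocking_holder is None
    when every pending node was rebooted.
    """
    rebooted = []
    for node in pending_nodes:
        if token_holder is not None:
            # Another node still holds the token, so this node can only
            # wait. In this model nothing ever releases a stale token.
            return rebooted, token_holder
        token_holder = node      # acquire the token
        rebooted.append(node)    # reboot and come back healthy
        token_holder = None      # release the token for the next node
    return rebooted, None


if __name__ == "__main__":
    # Healthy run: every node gets the token in turn.
    print(rolling_reboot(["node-1", "node-2", "node-3"]))
    # node-2 never released the token, so node-3 and node-4 never start.
    print(rolling_reboot(["node-3", "node-4"], token_holder="node-2"))
```

In the second call the remaining nodes never reboot, which would look from the outside like a task that stays in a running state without making progress.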