Nutanix Alerts

Trailblazer

We had a worrying scenario this morning on our main production cluster: events reported that two of the four nodes were being put into maintenance mode, which would have taken the cluster down. This was not actually the case, but it caused a major panic and the start of a full DR process.

 

What we saw was as follows:

 

Curator job Full Scan with id 6220 has been running for a long time i.e. 21625 seconds

Controller VM xx.xx.xx.x2 is put in maintenance mode due to Cluster conversion.

Controller VM xx.xx.xx.x4 is put in maintenance mode due to Cluster conversion.

 

The nodes were checked, and the only errors of note picked up by NCC seem to be related to Genesis:

 

WARN: Error while executing cluster status: 2016-10-06 10:50:35 WARNING genesis_utils.py:842 Failed to reach a node where Genesis is up. Retrying... (Hit Ctrl-C to abort)
2016-10-06 10:50:36 WARNING genesis_utils.py:842 Failed to reach a node where Genesis is up. Retrying... (Hit Ctrl-C to abort)
2016-10-06 10:50:37 WARNING genesis_utils.py:842 Failed to reach a node where Genesis is up. Retrying... (Hit Ctrl-C to abort)
2016-10-06 10:50:38 WARNING genesis_utils.py:842 Failed to reach a node where Genesis is up. Retrying... (Hit Ctrl-C to abort)
2016-10-06 10:50:39 WARNING genesis_utils.py:842 Failed to reach a node where Genesis is up. Retrying... (Hit Ctrl-C to abort)
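
For anyone comparing notes on the same retry loop, here is a minimal sketch that counts the repeated Genesis warnings in a pasted log excerpt and reports the time window they cover. It uses only the Python standard library; the function name `summarize_genesis_warnings` and the regex are assumptions based solely on the five lines quoted above, not on any Nutanix tooling or documented log format.

```python
import re
from datetime import datetime

# Pattern inferred only from the warning lines quoted in this post (assumption:
# other Genesis log lines may be laid out differently).
LINE_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) WARNING genesis_utils\.py:\d+ "
    r"Failed to reach a node where Genesis is up"
)


def summarize_genesis_warnings(text):
    """Return (count, first_timestamp, last_timestamp) for matching warnings."""
    stamps = [
        datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        for match in LINE_RE.finditer(text)
    ]
    if not stamps:
        return 0, None, None
    return len(stamps), min(stamps), max(stamps)


# The excerpt from this post, pasted verbatim.
sample = """\
WARN: Error while executing cluster status: 2016-10-06 10:50:35 WARNING genesis_utils.py:842 Failed to reach a node where Genesis is up. Retrying... (Hit Ctrl-C to abort)
2016-10-06 10:50:36 WARNING genesis_utils.py:842 Failed to reach a node where Genesis is up. Retrying... (Hit Ctrl-C to abort)
2016-10-06 10:50:37 WARNING genesis_utils.py:842 Failed to reach a node where Genesis is up. Retrying... (Hit Ctrl-C to abort)
2016-10-06 10:50:38 WARNING genesis_utils.py:842 Failed to reach a node where Genesis is up. Retrying... (Hit Ctrl-C to abort)
2016-10-06 10:50:39 WARNING genesis_utils.py:842 Failed to reach a node where Genesis is up. Retrying... (Hit Ctrl-C to abort)
"""

count, first, last = summarize_genesis_warnings(sample)
print(f"{count} warnings between {first} and {last}")
```

A one-per-second retry like this suggests the check is simply polling for a Genesis service that is not responding, rather than logging distinct failures.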

 

I have done some searching but have found no real answers so far.

 

FYI, there is already a support case logged with our hardware vendor.

 

 

4 REPLIES
Trailblazer

Ref: Nutanix Alerts

The issue with Genesis has been resolved: it had hung/stopped on one node. The other reported issues are still outstanding, with no cause or solution identified.

Moderator

Re: Ref: Nutanix Alerts

Robert - I don't see any cases filed with Nutanix about this under your account. Having Nutanix support dig into this is the absolute best way to get resolution on this one. 

 

Looks like you are a Dell customer and you mentioned you logged a ticket. Please have your Dell support rep escalate the ticket to Nutanix.

Jon Kohler | Principal Architect, Nutanix | Nutanix NPX #003, VCDX #116 | @JonKohler
Please Kudos if useful!
Moderator

Re: Ref: Nutanix Alerts

@roberthwl - Following up, were you able to get through these issues?

Jon Kohler | Principal Architect, Nutanix | Nutanix NPX #003, VCDX #116 | @JonKohler
Trailblazer

Ref: Nutanix Alerts

Hi Jon

    This was investigated by Dell and Nutanix support.

I thought I had already responded to this post.

 

We had a VMware cluster which was converted to AHV (pre-production).

The errors we saw were legacy events from the conversion and the subsequent AHV update.

A problem with Genesis on one of the nodes was thought to be the root cause of these old events being flagged as current.

 

Regards

Robert