Solved

Failover Testing Failed

May 26, 2016

charlie_chuhak
Adventurer

Hello, we have a brand new cluster (4 nodes) and did some failover testing over the weekend before placing the new gear into production. We have two Cisco 4500-X 10GbE switches and have the nodes split between the two switches for redundancy. We simulated a switch failure by pulling the plug on one of them and experienced some very bad results: some CVMs became unresponsive and we lost connectivity to most of our VMs. Has anyone else experienced a similar issue? Or can anyone point me in the right direction for configurations to double-check? Would appreciate any insight!

Best answer by charlie_chuhak

Worked with support to get this squared away. We ended up upgrading the hypervisor, which solved the issue.

Jon
Nutanix Employee
May 26, 2016

First thing - have you put in a support ticket with Nutanix yet?

That's the best first step here, as we can help you validate your configuration on the Nutanix and hypervisor side. We can also help guide the conversation on the network side (a bunch of our support staff are ex-Cisco, and some are even CCIEs).

Past that, it would be key to know how you have your vSwitches set up and how you have your Cisco switches set up.

When you make the support ticket, if you could attach "show run" from both switches, appropriately censored (we don't need SSH keys or anything like that), that would really help.

Feel free to CC me on the ticket: Jon at Nutanix dot com

Jon Kohler | Technical Director, Engineering, Nutanix | Nutanix NPX #003, VCDX #116 | @JonKohler | Please Kudos if useful!

charlie_chuhak
Author
Adventurer
May 27, 2016

Thanks for the reply, it is appreciated! I have a support ticket open but have had a hard time connecting with the engineer, as we are in different time zones. I just copied you in on the ticket.

We are using a distributed switch with each host's uplinks split between the two Cisco 4500-X switches. I've confirmed via CDP that all uplinks are properly connected to each switch.

I'll be sure to add the "sh run" output from the switches to the ticket. Thanks again for chiming in. Looking forward to getting the cluster into production.
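For anyone who wants to repeat the CDP cross-check on their own cluster, here is a minimal sketch in Python. It assumes you have already collected, by hand or however you like, one (host, uplink, switch) record per uplink from the CDP neighbor info. The host, vmnic, and switch names below are made up for illustration, and the function name is hypothetical, not part of any Nutanix or VMware tool.

```python
from collections import defaultdict


def hosts_without_switch_redundancy(cdp_neighbors, min_switches=2):
    """Return hosts whose uplinks reach fewer than `min_switches` distinct switches.

    cdp_neighbors: iterable of (host, uplink, switch) tuples, one per uplink,
    as read from each uplink's CDP neighbor information.
    """
    switches_by_host = defaultdict(set)
    for host, _uplink, switch in cdp_neighbors:
        switches_by_host[host].add(switch)
    # A host is flagged when all of its reported uplinks land on too few switches.
    return sorted(
        host
        for host, switches in switches_by_host.items()
        if len(switches) < min_switches
    )


# Hypothetical inventory: node-2 has both uplinks cabled to the same switch.
neighbors = [
    ("node-1", "vmnic0", "sw-a"), ("node-1", "vmnic1", "sw-b"),
    ("node-2", "vmnic0", "sw-a"), ("node-2", "vmnic1", "sw-a"),
]
print(hosts_without_switch_redundancy(neighbors))  # ['node-2']
```

Note that this only checks cabling. A host with no CDP records at all will not show up in the result, and a clean result says nothing about teaming policy or hypervisor behavior during a failover.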

charlie_chuhak
Author
Adventurer
Answer
June 2, 2016

Worked with support to get this squared away. We ended up upgrading the hypervisor, which solved the issue.

Jon
Nutanix Employee
June 4, 2016

Thanks for the update here.

For everyone else: all was well on the VDS and Cisco config AFAIK. The cluster was hitting a VMware bug; after upgrading to 6.0u2, all is well now.
