vMotion feature re-enabling on original vmkernel


Badge +2
Hello all,

I'm intermittently running into an issue where vMotions fail between my Nutanix hosts. I have two vmkernels on each of our nodes, one for management and one for vMotion (separate VLANs for each). What I've seen is that some indeterminate amount of time after I disable vMotion on the original vmkernel, the option re-enables itself. I'm not sure if the trigger is a reboot of the ESXi host during patching, but this has happened multiple times, so I can confirm it's not a case of missing the setting during initial configuration. Has anyone else run into this?

James

23 replies

Userlevel 7
Badge +30
That's odd. Have you opened a ticket with Nutanix support yet?
Badge +2
I haven't yet. Since it happens so erratically I didn't think I could give support good data on when it occurs or which specific nodes it happens on. I have pending patch installs for ESXi 6 and Acropolis (to go to 4.6.1 from 4.6.0.2), so I'll test out vMotions again after those updates are complete and see if either of those operations can elicit the behavior.

James
Badge +3
We have seen this regularly too. Did you find a fix?

We're using ESXi 6.0 U2
Userlevel 7
Badge +30
Hey TM-Nut, I haven't seen this anywhere else myself. Please open a support ticket if this keeps happening to you.
Badge +3
Seeing the same issue on a brand new install. The default Nutanix ESXi setup puts vMotion on the management vmkernel. I create a completely separate VMK and enable vMotion on it, disabling it on management, then come back the next day and it's back to the original setting. One question I have: is anyone running vMotion from the web GUI for this? The latest 6.0 web GUI allows you to vMotion without some of the prerequisites, like shared storage. It seems more likely to be a VMware "enhanced feature". Haven't gone to support yet.
Userlevel 7
Badge +30
I've personally done vMotion from the web GUI more times than I can count, and it works great. Shared Nothing vMotion is a lifesaver for migrations; it makes the "plumbing" between the old environment and Nutanix simple.

Definitely file a support ticket with us when you can so we can dig into this vMotion enablement issue.
Badge +3
It does appear to be something specific to Nutanix, as I can't find any reports of such behaviour on anything other than Nutanix. VMware has never heard of it either.
Userlevel 7
Badge +30
I haven't heard of this behavior anywhere, Nutanix or otherwise. I flipped through our bug database quickly this morning and haven't seen any similar reports at the engineering level yet either.

Next time this happens, please get support engaged and we can look into it.
Badge +1
I see that this is an old post, but I was wondering if there are any updates. I have a new installation of Nutanix with ESXi 6 U2 and am seeing the same issue.
Badge +3
...log a ticket with Nutanix support, uploading the logs. It looks like a bug with Nutanix and v6 U2. It would be great if you could post their findings afterwards.
Badge +1
We've seen this issue as well across multiple Nutanix clusters in our environment and only with Nutanix and not our other hardware. I opened a ticket as requested but the Nutanix tech has asked me to open another with VMware to investigate further since we use distributed switches. Is anyone else using distributed switches and experiencing this issue? Is anyone seeing this behavior on standard switches?
Badge +1
We have standard switches. I recently recreated the vMotion vmkernel on a new (vMotion) TCP/IP stack instead of the "default" stack, with good results. The management network has stayed vMotion-disabled through several migrations. Doing this requires vCenter 6 for GUI support, although I have heard that you can do it in 5.5 with the ESXi console CLI.
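For anyone who would rather script the change than click through the GUI, here is a rough sketch of the command sequence, written as a small Python helper that only builds the command lines and does not run anything. The interface names, port group name, and IP values are placeholders, and the helper name is made up for this post. The `esxcli network ip ...` subcommands and short flags are the ones I believe ESXi 6.0 provides, so please check them against `esxcli network ip --help` on your own build before relying on them.

```python
import shlex


def vmotion_stack_commands(mgmt_vmk, vmotion_vmk, portgroup, ipv4, netmask):
    """Build (but do not run) esxcli command lines that move vMotion
    onto the dedicated "vmotion" TCP/IP stack.

    Every argument is a placeholder supplied by the caller.
    """
    commands = [
        # Untag vMotion on the management vmkernel.
        ["esxcli", "network", "ip", "interface", "tag", "remove",
         "-i", mgmt_vmk, "-t", "VMotion"],
        # Remove the existing vMotion vmkernel that lives on the default stack.
        ["esxcli", "network", "ip", "interface", "remove", "-i", vmotion_vmk],
        # Create the dedicated vMotion TCP/IP stack instance.
        ["esxcli", "network", "ip", "netstack", "add", "-N", "vmotion"],
        # Recreate the vmkernel interface on that stack.
        ["esxcli", "network", "ip", "interface", "add",
         "-i", vmotion_vmk, "-p", portgroup, "-N", "vmotion"],
        # Assign a static IPv4 address to the new interface.
        ["esxcli", "network", "ip", "interface", "ipv4", "set",
         "-i", vmotion_vmk, "-I", ipv4, "-N", netmask, "-t", "static"],
    ]
    return [" ".join(shlex.quote(part) for part in cmd) for cmd in commands]


if __name__ == "__main__":
    # Hypothetical values; substitute your own before using any of this.
    for line in vmotion_stack_commands(
        "vmk0", "vmk1", "vMotion-PG", "192.168.50.11", "255.255.255.0"
    ):
        print(line)
```

Printing the commands first, rather than executing them, leaves room to review each line on a host in maintenance mode before pasting it into an SSH session.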
Badge +3
We get this issue and only use standard switches.

Is everyone here getting this only with v6 U2, or with other versions too?

Our specific build is 3620759. I'd be interested to know others'.
Badge +1
I heard back from VMware's US-based support and was told that this is a bug which is fixed in ESXi 6.0 Update 3.
Badge
Thanks for the update.

Do you have a VMware KB for this issue or some more detail on the bug?

Thanks
Salman Siddiqi
Badge
I didn't find any related item in the ESXi 6.0 U3 release notes:
http://pubs.vmware.com/Release_Notes/en/vsphere/60/vsphere-esxi-60u3-release-notes.html#rserverissues

There is a related topic on the VMware forum, but it has only a single reply:
https://communities.vmware.com/thread/545510?start=0&tstart=0
Badge +1
All I got from VMware support was the response below:

Resolution: Known issue. Resolved in ESXi 6.0 Update 3. Appears there is an issue with virtualNicManager not correctly tagging/untagging the service on the management vmkernel port.
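Since the problem is described as a tagging issue, one way to catch the setting flipping back is to check the tags on the management vmkernel periodically. Below is a minimal sketch that parses the text printed by `esxcli network ip interface tag get -i vmk0`. It assumes the output contains a single line shaped like `Tags: Management, VMotion`; that format is my recollection and not something confirmed in this thread, and the function names are made up for illustration.

```python
def parse_tags(output):
    """Return the set of tag names found on a 'Tags: A, B' line.

    Assumes the esxcli output has one line that starts with 'Tags:'
    followed by a comma-separated list (an assumption, see above).
    """
    for line in output.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key.strip().lower() == "tags":
            return {tag.strip() for tag in value.split(",") if tag.strip()}
    return set()


def vmotion_reenabled(output):
    """True if the vMotion tag is present on the interface."""
    return "vmotion" in {tag.lower() for tag in parse_tags(output)}


if __name__ == "__main__":
    # Hand-written sample text, not captured from a real host.
    sample = "   Tags: Management, VMotion\n"
    print(sorted(parse_tags(sample)), vmotion_reenabled(sample))
```

Running this from a scheduled job against the saved command output would at least give a timestamp for when the tag comes back, which is the data support asked for earlier in the thread.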
Badge
Thank you. Did you upgrade it and has it resolved the issue?
Badge
In the hostd log, you will see entries similar to this:

hostd.0:2017-04-17T02:27:42.965Z info hostd[7FAC2B70] [Originator@6876 sub=Vimsvc.TaskManager opID=HB-host-10@282-4b28ee67-7f-e4b1 user=vpxuser] Task Created : haTask-ha-host-vim.host.VirtualNicManager.selectVnic-1066
hostd.0:2017-04-17T02:27:42.969Z info hostd[7FE81B70] [Originator@6876 sub=Vimsvc.TaskManager opID=HB-host-10@282-4b28ee67-7f-e4b1 user=vpxuser] Task Completed : haTask-ha-host-vim.host.VirtualNicManager.selectVnic-1066 Status success
As the task is associated with vpxuser, this is vCenter-initiated. When I enable or disable vMotion from vCenter, my AD account is logged in hostd in the form user=vpxuser:domainuser.

When it's done through SSH or locally, user=root. A bare "user=vpxuser]" looks to me like a non-human, vCenter-triggered task.
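To see how often this happens and who triggers it, the log can be scanned for selectVnic task creations and grouped by the user field. The sketch below is only an illustration of the rule of thumb described in this post (bare vpxuser means automated vCenter, vpxuser:someone means a person acting through vCenter, anything else means local or SSH). The regular expression is fitted to the two sample entries shown here and may need adjusting for other builds, and the function names are made up.

```python
import re
from collections import Counter

# Matches "Task Created" entries for VirtualNicManager.selectVnic and
# captures the user= value up to the closing bracket.
SELECT_VNIC = re.compile(
    r"user=(?P<user>[^\]\s]+)\]\s+Task Created\s*:\s*"
    r"\S*VirtualNicManager\.selectVnic"
)


def classify(user):
    """Label the initiator using the heuristic described in the post."""
    if user == "vpxuser":
        return "vCenter (automated)"
    if user.startswith("vpxuser:"):
        return "vCenter (interactive: " + user.split(":", 1)[1] + ")"
    return "local/SSH (" + user + ")"


def summarize(lines):
    """Count selectVnic task creations per initiator label."""
    counts = Counter()
    for line in lines:
        match = SELECT_VNIC.search(line)
        if match:
            counts[classify(match.group("user"))] += 1
    return counts


if __name__ == "__main__":
    # /var/log/hostd.log is the usual location on an ESXi host.
    with open("/var/log/hostd.log", errors="replace") as handle:
        for label, count in summarize(handle).most_common():
            print(count, label)
```

Only the "Task Created" entries are counted, so each selectVnic call is tallied once even though the log also records a matching "Task Completed" line.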

Badge
Hello all,

Is there anything new on this strange issue?
I have the issue on a brand new installation (vSphere 6.0 U2 / AOS 5.0.2).
But in another environment, I am running vSphere 6.0 U2 without this issue, so I wonder whether the issue really comes from the vSphere version or from Nutanix.
Has anybody fixed it definitively, and how?
Thanks
Badge +1



Refer to agibson's posts.

Removing and recreating the vmkernel for vMotion on the dedicated vMotion TCP/IP stack works fine for me.

Thank you.

Note: AOS 5.0.2, ESXi 6.0 U2
Badge
It is OK for us now, thanks for the help!
Badge
I fixed it by disabling the vMotion setting on vmk0, then rebooting the ESXi host.
