Solved

VMware/Unidesk/XenDesktop/Nutanix

  • 5 February 2014
  • 14 replies
  • 9652 views

Badge +3
Hi Everyone,

We currently run our XenDesktop 5.6 environment with Unidesk 2.1 and have only one main issue that we would like to see resolved. With Unidesk, the VMDK disks (which contain the layer information) are classed as non-persistent drives. As a result, when we reboot a virtual desktop VM, the power-on process can take anywhere between 10 and 15 minutes. If these disks were persistent, power-on would be considerably quicker. I have read (but am unable to find the link) that this has been accepted as a VMware issue and is currently being worked on. Can anyone shed some light on when VMware will be resolving the problem?

Cheers
Damon

Best answer by dlink7 8 February 2014, 21:53

Hi CampbellNZ

I haven't heard of any update from VMware. The only workaround today is to create small containers for the VMs.

Here are two links explaining the current issue for people that might not know.

http://www.unidesk.com/support/kb/unidesk-configuration-considerations-nfs-based-storage-including-nutanix-your-boot-images

http://nutanix.blogspot.ca/2013/10/unidesk-on-nfs-taking-about-minute-to.html

14 replies

Badge +6
I've been working with VMware, Nutanix and Unidesk on this issue for the past few weeks and currently have open cases with VMware and Nutanix (14445911002 / 00011020).

I can confirm that the issue is with VMware and that they are aware of it (PR# 913980). Unfortunately the support engineer I was working with denied it was a VMware issue and did not seem eager to assist me or provide more information (stated there was confidential information involved). I offered him both of the links provided here, as well as information and responses from Nutanix engineers.

At this point, the issue is there, and Nutanix is stepping up to resolve it for its customers (and kudos to them for doing so). I was informed that the next NOS release (3.5.3) will mitigate the issue and that 4.x will ultimately design around it.

EDIT:
Just received an official reply from VMware on this issue.

Sorry for the delay in responding. I wanted to provide you with the most up to date information. The issue which you are experiencing has been identified as a design issue within our NFS stack.
The resolution of this problem requires a full re-write which likely will not occur until at least the next major release, which has a ballpark 2015. I'm sorry the update is not more favorable, but I wanted you to have the VMware's official stance on the issue instead of second hand information from a blog. If you have any further concerns or questions, please let me know.
Userlevel 1
Badge +9
NOS 3.5.3 is now available
Userlevel 4
Badge +21
From some quick testing of 3.5.3, if you split up the Bootlayersarchive into separate containers you can get around a 50% improvement. My testing was with vSphere 5.5 and 128 desktops. Your mileage may vary; it will really depend on how many files vSphere has to scan.
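
To get a rough feel for how many files vSphere might have to scan, the following is a minimal sketch (not an official Nutanix or Unidesk tool) that counts files under each top-level folder of a datastore. It assumes the datastore is reachable as an ordinary directory from wherever Python runs; the function name and any path passed to it are hypothetical.

```python
import os
from collections import Counter


def count_files_per_top_folder(datastore_root):
    """Return {top-level folder name: number of files beneath it}.

    Assumption: datastore_root is a normal, readable directory path
    (for example an NFS export mounted on an admin workstation).
    """
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(datastore_root):
        rel = os.path.relpath(dirpath, datastore_root)
        if rel == ".":
            continue  # files sitting directly in the root are not counted
        top = rel.split(os.sep)[0]  # first path component below the root
        counts[top] += len(filenames)
    return dict(counts)
```

Calling it on a mounted path, for example `count_files_per_top_folder("/mnt/unidesk-ds01")` (a made-up mount point), returns a dictionary that can be sorted to see which folders hold the most files and might be candidates for their own container.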
Badge +7
What other design issues are in their NFS stack, I wonder? I notice snapshots on Nutanix under NFS are way slower to delete than on my old iSCSI SAN, and stun times can reach 6 seconds, which causes my VMware 5.5 Update Manager to disconnect from vCenter during a backup (as they have to be on different VMs if you use the VC appliance).

That is a scary implication for customer database servers and their related application servers, and I'm seriously considering implementing iSCSI again (as I believe you can do as well with Nutanix) because I think VMware NFS is flaky as hell.
Userlevel 4
Badge +21
I would encourage you to use Nutanix-based snapshots and stay on Nutanix NFS where possible.

There are limitations around hypervisor-based snapshots. The KB below states that a hypervisor-based snapshot "Negatively impacts the performance of a virtual machine."

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009402
Badge +6
Just curious, but wouldn't it perform better to just create vdisks and create datastores on iSCSI for Unidesk?

For example, create a 500 GB iSCSI LUN called unidesk-ds01 in a container called unidesk-ds01:

ncli vdisk create name=unidesk-ds01 ctr-name=unidesk-ds01 max-capacity=500
Badge +7
Nutanix snapshots are great (and very fast) for protecting whole VMs, but when you need more granular recovery, such as file-level restores, it's often better to use third-party backup tools that rely on VMware snapshots.
Looking forward to NOS 4.0, as there appear to be big improvements in the Nutanix Protect features. I also note Veeam has an increased convergence with Nutanix, so maybe I'll try that product as well to see if it solves my snapshot blues on VMware.
Userlevel 4
Badge +21
Veeam is great. We have a joint webinar this Wednesday if you have time:

http://itbloodpressure.com/2014/04/13/blue-yellow-green-nutanix-and-veeam-paint-by-number-disaster-recovery/

-DL
Badge +5
Is there a screencast of this somewhere by chance?
Userlevel 4
Badge +21
The replay is here:

http://www.veeam.com/videos/select-vmware-hyper-v-protection-data-center-4076.html
Badge +2
I've run into this same issue. I worked for months with VMware and Nutanix on stability issues. If I took a VMware snapshot of any VM, it would spike the host CPU, statistics in vCenter would disappear, and anyone connected to the VM being snapshotted would be disconnected. I also use Unidesk, and the CachePoints would become corrupt when this happened.

In my Unidesk environment, I was careful not to have more than 25 VMs per datastore, but I have a legacy non-persistent View environment all on one datastore. It had accumulated hundreds of orphaned folders from failed reprovisioning. I cleaned up all of the folders and noticed a significant difference, but it was still not perfect.
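
A cleanup like this can be approached as a report-first check. The sketch below is only an illustration under stated assumptions: it assumes each VM lives in a datastore folder named after the VM, and that you already have the folder list and the list of registered VM names from your own inventory export. Both function names are hypothetical, nothing here calls a VMware, Nutanix or Unidesk API, and nothing is deleted.

```python
def find_orphaned_folders(datastore_folders, registered_vm_names):
    """Return folder names with no matching registered VM (case-insensitive).

    Assumption: a VM's folder is named after the VM. System folders
    would also be reported, so review the result by hand before
    removing anything.
    """
    known = {name.lower() for name in registered_vm_names}
    return sorted(f for f in datastore_folders if f.lower() not in known)


def datastores_over_limit(vm_counts, limit=25):
    """Return the datastores whose VM count exceeds the limit.

    The default of 25 is the per-datastore figure mentioned in this post,
    not a documented vendor maximum.
    """
    return {ds: n for ds, n in vm_counts.items() if n > limit}
```

For instance, `find_orphaned_folders(["vdi-01", "vdi-02", "old-07"], ["VDI-01", "vdi-02"])` returns `["old-07"]`, which is then a candidate for manual inspection.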

I updated to Nutanix 3.5.3 and things were better, but still not perfect. I had issues with the Cassandra service running out of memory, so I've had to manually restart it.
During a maintenance period last week, I took the opportunity to update ESXi. I updated to 5.1.0, build 1743533, and the snapshot issues disappeared. Now snapshots behave much like I'm used to and the host doesn't freak out like before. I'm actually going to use the Nutanix snapshots, but since View uses snapshots to provision, I had to solve the issue. I do agree that NFS on VMware still has some issues.

I'm still awaiting 3.5.4 to help with the Cassandra issues, but my environment is much more stable now.

I'm also running Bitdefender using vShield, which also seems to cause sporadic issues. I don't know if anyone else out there is using a vShield product.
Userlevel 4
Badge +21
Unidesk & Nutanix believe they have found a resolution to the NFS bug. Currently Unidesk is using scripting to fix the issue. If you're facing the problem, contact Unidesk support and they can address it. It sounds like you will have to reboot the desktops to get this fixed, however.
Badge +4
FYI, Nutanix and Unidesk recently published a best practice based on extensive testing both within our own labs and in existing customer environments. We now have a best practice that when followed gives us consistent timings for operations such as power on and vMotion. Please review KB1208 within our support portal, or use the link below.

https://portal.nutanix.com/#/page/kbs/details?targetId=kA0600000008XgCCAU

This KB will continue to evolve as future NOS and ESXi versions are released that improve upon what we know today.
