Citrix MCS and PVS on Nutanix: Enhancing XenDesktop VM Provisioning with Nutanix Part 1
This post was written by Kees Baggerman and Martijn Bosschaart
Citrix XenDesktop now offers a fully integrated desktop virtualization suite for both virtual desktop infrastructures (VDI) and hosted shared desktops (HSD); the latter is better known as XenApp. Add to this the powerful HDX protocol and the FlexCast stack and you’ve got the world’s most advanced and widely used desktop virtualization product. From a single broker console, Citrix Studio, you can deploy all types of desktop and application workloads, either persistent or non-persistent, and each of these can be derived from master images and cloned on the spot.
In this blog we will focus on XenDesktop provisioning methods, and, in particular, how Nutanix simplifies and enhances both storage infrastructure configuration and overall deployment.
Machine Creation Services (MCS) and Provisioning Services (PVS) each have their own features and advantages, which we discuss below, but their primary function is the same: to rapidly deploy copies of a golden master virtual machine to many users simultaneously. This saves administrators a great deal of time and ensures a consistent and reliable environment. Instead of patching hundreds, or even thousands, of PCs, admins now update a handful of master images that are then rolled out almost instantaneously. Another advantage is that, because you are managing single images, a rollback to a previous version is basically a matter of managing snapshot file versions. Like Nutanix, Citrix likes to keep it simple. As we will show, though, the Nutanix Distributed File System (NDFS) takes the simplicity of XenDesktop to another level with streamlined management, reduced rollout time, and enhanced performance.
Origins of Machine Creation Services
The first provisioning technique we want to discuss is Machine Creation Services. MCS is, in terms of Citrix’s offerings, still the new kid on the block, even though it has been around since the 5.0 release of XenDesktop, back in 2010. With the release of XenDesktop 5.0, Citrix moved towards a simpler, more scalable, and more agile architecture, one that would lead to the end of the Independent Management Architecture (IMA). XenDesktop versions prior to 5.x were still based on IMA, and this proved to be a huge problem in large environments because IMA was meant to run in XenApp-only environments with a maximum of 1,000 servers. When IMA was designed, environments larger than that were fairly unusual, and so the limit was not considered a problem. Before long, however, XenDesktop customers began trying to run thousands of desktops and quickly hit IMA’s limits. This required a new architecture, and thus the FlexCast Management Architecture (FMA) was born.
FMA was at first only available for VDI workloads, as XenApp was still a separate product and would continue to have another three full versions (5.0, 6.0, and the final 6.5) based on IMA. Only with the release of XD 7.0 did the XenApp workload make its way to FMA, first as a hosted shared desktop option in XD, and then brought back as XenApp 7.5 exclusively for that specific workload. When XD 5.0 was released, MCS became available as well, and its design was focused on simplifying XenDesktop and lowering setup time. The new version significantly simplified both broker installation and group rollouts of desktop VMs. With XD 5.0, an admin could now do the entire setup of a XenDesktop farm in just a couple of hours and still have plenty of time to drink coffee while doing it.
However, when MCS became available, storage infrastructure did not have the advantages that Nutanix makes possible. While Provisioning Services in XenDesktop is partially network based, MCS is a storage-centric provisioning technology. Its adoption was therefore slowed by the state of the technology four to five years ago, as the SANs of that era couldn’t handle the increased IOPS requirement, and they still can’t do it well today.
This is where Nutanix comes in. We manage these challenges on multiple fronts. To give you a better understanding of what Nutanix offers XenDesktop users, we’ll first deep dive a bit into MCS.
MCS – the inner workings
When non-persistent environments use MCS, the broker will copy the master image to each configured datastore specified by the Studio host connection. This can either be a local datastore on each host or a shared datastore on a SAN or NAS. The admin can then select the available datastores, which are read from the hypervisor cluster (through a VMware vCenter, Microsoft SCVMM, or XenCenter interface). After this copy is complete (which can take a while depending on the number of datastores configured), all the VMs in the catalog are then pointed to these local copies.
MCS works roughly as shown in the figure below. We say “roughly” because each supported hypervisor has its own specific MCS implementation in terms of disk management, but the net effect is the same.
To make each VM unique, and to allow data to be written, MCS attaches two disks to each VM in addition to the master disk.
The ID disk is a very small disk (max 16 MB) that contains identity information; this information provides a unique name for the VM and allows it to join Active Directory (AD). The broker fully manages this process; the admin only has to provide or create AD accounts that the VMs can use. The broker then creates a unique ID disk for every VM.
The difference disk, also known as the write cache, is used to separate the writes from the master disk, while the system still acts as if the write has been committed to the master disk.
From a VM perspective, these “chains” act as a single pane of glass. While the base OS disk, the ID disk, and the difference disk are separate, the end user’s perspective will be that of working on a unique, writeable VDI VM.
In the screenshot below, there are two virtual disks in use by the VM (just after a shutdown) in addition to the base disk. The identity disk is about 7 MB and the delta.vmdk disk is a difference disk mounted to the VM. These are the two disks mounted in the VM config (vmx).
In VMware environments, however, this is not the actual difference disk file the changes are written to. MCS on VMware utilizes a VMDK disk chain with multiple child disks. On Hyper-V and XenServer, MCS utilizes VHD-chaining, quite similar to VMware, but slightly different in execution and disk naming.
The delta.vmdk file you see in the above screenshot is actually just a disk descriptor file that references the golden master and diverts writes to a child disk, or REDO log. This is not to be confused with a snapshot.
So MCS more or less sets up a read-only “disk-in-the-middle,” which is not much more than a redirector. However, it allows Citrix to effectively control the persistent or non-persistent behavior from within Studio, as it does not have to empty or delete the disk configured in the VMX file (ESX cleans up the REDO file on reboot).
When using a persistent desktop, this redirector disk does not exist; differential writes are written directly into the configured difference disk in the VMX file that is the only child to the master disk. This disk is not altered on reboots, which allows user changes to persist.
The master disk used in both persistent and non-persistent scenarios is not stored inside the VM’s folder, but is placed in its own folder in the datastore root. In non-persistent mode, the delta disk’s VMDK descriptor file contains a reference to this base disk (see below):
The folder the base disk is placed in gets its name from the catalog name in Studio, followed by the actual filename, which is taken from the date and timestamp when the disk was created.
When Citrix Studio gives the command to boot a non-persistent VM, you can see more files popping up as the REDO file is created:
The hypervisor redirects all writes within the VM to the delta (diff) disk, which in turn redirects them to a separate REDO log file. The REDO log file is not a snapshot, but a child disk chained to the diff disk, which is itself a child of the master image.
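To make the chain concrete, here is a minimal Python sketch of the copy-on-write behavior described above. It is a toy model, not how ESX or MCS is actually implemented: the `Disk` class, the block numbers, and the `boot_non_persistent` helper are all names we made up for illustration. Reads fall through the chain until some disk holds the block, while writes only ever land in the top child.

```python
from typing import Dict, Optional


class Disk:
    """Toy disk that stores only the blocks written to it directly."""

    def __init__(self, name: str, parent: Optional["Disk"] = None,
                 read_only: bool = False) -> None:
        self.name = name
        self.parent = parent
        self.read_only = read_only
        self.blocks: Dict[int, bytes] = {}

    def read(self, block: int) -> Optional[bytes]:
        # Walk from this disk up through its parents until the block is found.
        disk: Optional["Disk"] = self
        while disk is not None:
            if block in disk.blocks:
                return disk.blocks[block]
            disk = disk.parent
        return None

    def write(self, block: int, data: bytes) -> None:
        if self.read_only:
            raise PermissionError(f"{self.name} is read-only")
        self.blocks[block] = data


def boot_non_persistent(diff: Disk) -> Disk:
    """Hypothetical helper: every boot gets a fresh, empty REDO child."""
    return Disk("redo", parent=diff)


# Master image, seeded directly because it is read-only afterwards.
master = Disk("master", read_only=True)
master.blocks[0] = b"golden image"

# The diff disk is only a redirector; the REDO log takes the writes.
diff = Disk("delta", parent=master, read_only=True)
redo = boot_non_persistent(diff)
redo.write(1, b"user data")

print(redo.read(0))  # served from the master
print(redo.read(1))  # served from the REDO log

# A reboot discards the old REDO log, so the user's write is gone.
redo_after_reboot = boot_non_persistent(diff)
print(redo_after_reboot.read(1))
```

In this model, a persistent desktop would simply be a writable diff disk with no REDO child, kept intact across reboots.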
If you copy a 1 GB file to the desktop of the VM, you can see the child disk get bigger:
Now that we have explained the basics of MCS at the disk level, let’s take a look at how MCS manages disk distribution.
Rolling out Master Disks
When you create a new catalog of VMs that use a non-persistent disk, you are first asked to select a snapshot of the master VM, which will then become the master disk. If you select a chained snapshot, Studio will first flatten the snapshot chain into a new master disk.
This disk will be copied to all the datastores configured in the Studio host connection. These are full copies, and so the more datastores you use to spread the IOPS load, the longer this process takes.
This is the first thing we can solve with Nutanix.
To overcome the IOPS burden, Citrix admins around the world have resorted to implementing local-storage-based architectures, which avoid the SAN when placing the write cache. While this solves part of the problem, it also creates new issues, as you lose centralized storage management and have to deal with decentralized islands of disks.
For XenDesktop, this means that you now have to configure these local datastores in the host configuration option in Studio. If you have a big farm this means a lot of clicking (or scripting).
Configuring five hosts is not a problem, but in a bigger enterprise environment with, perhaps, dozens of hosts, this can become cumbersome. This is especially true if you need to put a host into maintenance mode, or if a host is down during deployment of an image, because you would first have to deselect it to prevent the deployment phase from erroring out.
The problem is that when you roll out a new image, the flattened snapshot is going to be copied to all the datastores selected. The more datastores, the longer the copy is going to take—up to hours, depending on the size of your master disk.
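As a rough back-of-the-envelope sketch of why this hurts, the function below estimates the total copy time. It assumes the copies run one after another at a constant throughput, which is our simplification, and the 40 GB image size and 200 MB/s figure are made-up example values, not measurements.

```python
def rollout_copy_minutes(image_gb: float, datastores: int,
                         throughput_mb_s: float) -> float:
    """Estimate minutes needed to copy a master disk to every datastore.

    Assumes sequential full copies at a constant throughput (a simplification).
    """
    if datastores < 1 or image_gb <= 0 or throughput_mb_s <= 0:
        raise ValueError("all inputs must be positive")
    total_mb = image_gb * 1024 * datastores
    return total_mb / throughput_mb_s / 60


# Hypothetical example values: 40 GB master disk, 200 MB/s copy throughput.
for count in (1, 10, 30):
    minutes = rollout_copy_minutes(image_gb=40, datastores=count,
                                   throughput_mb_s=200)
    print(f"{count:>2} datastores: {minutes:.1f} minutes")
```

Under these assumptions the copy time grows linearly with the number of datastores, so every extra local datastore adds one more full copy of the master disk to the rollout.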
Nutanix Distributed File System
Nutanix solves both the IOPS and configuration hassle with the Nutanix Distributed File System.
While the NDFS is made up of local storage, the disks of all the nodes in the cluster are drawn together into a storage pool. This storage pool is then presented back to the hosts as shared NFS containers.
This means that every host sees the same central datastore and XenDesktop no longer requires local storage configuration.
Within Studio, you select only the single container to which you want to roll out your VDI VMs.
Instead of using the option “Local,” you select “Shared” and choose the NFS datastore that has been mounted on the hosts in the cluster you configured in the Studio host connection. Studio gets this information from vCenter.
Now when you roll out a new master disk, the copying process takes place only once, and at most will take several minutes. NDFS will make sure the data gets distributed across the cluster.
Nutanix also brings “data locality” to the table, which enhances MCS even further. The fastest way for a VM to access its data is to make sure it’s being served locally. Data locality ensures that the data always follows the VM and stays as close as possible. This is not only a performance booster, but it also prevents unnecessary traffic across the network between nodes.
This system works great for VMs that have their own VMDK for both reads and writes. With MCS, however, the VMs no longer have a local disk (besides the write cache)—rather, they read from a centralized image. This would cause extra reads to traverse the network.
Studio creates the flattened snapshot copy of the master VM on the host it is currently placed on, so the master disk might only be local to a subset of VMs, as the VMs are most likely spread out. This would mean not every VM would benefit from local data. Those VMs not local to the master disk would need to grab reads over the network, although writes to the difference disk would still remain local.
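To put a number on this, the small sketch below assumes the VMs are spread evenly across the hosts and that a single copy of the master disk lives on one of them. Both are simplifying assumptions of ours, and the function name is hypothetical.

```python
def remote_reader_fraction(hosts: int) -> float:
    """Fraction of VMs that must read the master disk over the network.

    Assumes VMs are spread evenly and the master disk is local to one host.
    """
    if hosts < 1:
        raise ValueError("a cluster needs at least one host")
    return (hosts - 1) / hosts


for hosts in (1, 4, 16):
    share = remote_reader_fraction(hosts)
    print(f"{hosts:>2} hosts: {share:.0%} of VMs read the master remotely")
```

The larger the cluster, the closer this fraction gets to 100 percent, which is why a single local copy of the master disk does little for read locality.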
This is where Shadow Clones come in. Shadow Clones are a mechanism that automatically detects when multiple VMs are using a single virtual disk file. Once this is detected, the mechanism marks the master disk read-only, which allows the system to cache the disk locally on all hosts in the cluster, and thus serve all read IOs locally.
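The following toy model sketches that idea in Python. It is not Nutanix code: the `READER_THRESHOLD` value, the cache-on-first-read behavior, and all class and method names are assumptions chosen purely for illustration.

```python
from typing import List, Set

# Assumption: illustrative trigger, not a documented Nutanix setting.
READER_THRESHOLD = 2


class MasterDisk:
    """Toy model of a master disk that many VMs read from."""

    def __init__(self, home_node: str) -> None:
        self.immutable = False
        self.reader_nodes: Set[str] = set()
        self.cached_on: Set[str] = {home_node}

    def read(self, node: str) -> str:
        """Return 'local' or 'remote' for a read issued from `node`."""
        self.reader_nodes.add(node)
        # Detection: several nodes read the same disk, so mark it read-only.
        if len(self.reader_nodes) >= READER_THRESHOLD:
            self.immutable = True
        if node in self.cached_on:
            return "local"
        if self.immutable:
            # Safe to keep a local copy because the disk can no longer change.
            self.cached_on.add(node)
        return "remote"


disk = MasterDisk(home_node="node-a")
reads: List[str] = [disk.read(n) for n in
                    ("node-a", "node-b", "node-b", "node-c", "node-c")]
print(reads)
```

In this model, only the first read from each additional node crosses the network; every later read on that node is served from its own cached copy.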
As we can see, the Nutanix Distributed File System drastically simplifies the Studio configuration by offering all the benefits of shared storage to XenDesktop Studio. Using only a single datastore target reduces rollout time and enhances performance by utilizing local storage for speed. And NDFS offers Shadow Clones to optimize data placement, removing the need for local storage management.