The Wait is Over: AHV Turbo is Here!

  • 7 December 2017
This post was authored by Felipe Franciosi, Software Engineer at Nutanix

One of the hot new features available in the next major Nutanix OS 5.5 release is the AHV Turbo Technology that you may have heard about at our .NEXT events. A key enabler of this solution is a component internally referred to as Frodo. This article discusses how it fits in the storage datapath and what differentiates it from any other storage virtualisation engine available on the market today.

Traditionally, AHV presents virtual machines with a single-queue Virtio-SCSI PCI controller. This means that, regardless of the number of disks or vCPUs, a guest can submit at most 128 requests at a time. Additionally, this single data structure can only be managed by one thread on the hypervisor side. This model is similar to that of other hypervisors.

While this architecture works well in most scenarios, it shows signs of strain at higher throughput or IOPS rates. This will become a concern as modern hardware such as NVMe and RDMA gains adoption. Other hypervisors circumvent this limitation by allowing more virtual controllers to be configured for each guest. That solution, however, doesn't improve the performance of a single virtual disk and adds complexity to the VM configuration. Additionally, it doesn't reflect what has been done to overcome this limitation on real hardware.

Starting with NVMe controllers, a single drive can deliver hundreds of thousands of IOPS. To achieve such levels of performance, each of these drives presents multiple hardware queues. Operating systems, in turn, introduced support for multi-queue block layers. This allows better scalability across the stack: applications can submit IO in parallel over multiple queues and, at the same time, the hardware can process those queues in parallel.
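As a rough illustration of the multi-queue idea described above, the following Python sketch spreads requests over several queues and drains each queue with its own worker thread. It is a toy model only: the function name `process_multiqueue` and the round-robin distribution are assumptions made for this example, not how any real block layer or NVMe driver is implemented, and Python threads here show the structure rather than true hardware parallelism.

```python
import queue
import threading


def process_multiqueue(requests, num_queues):
    """Toy model: spread requests over num_queues queues, one worker thread each.

    Hypothetical helper for illustration; not a real block-layer API.
    """
    queues = [queue.Queue() for _ in range(num_queues)]
    completed = [[] for _ in range(num_queues)]

    # Submission side: distribute requests round-robin, standing in for
    # the per-CPU submission queues of a multi-queue block layer.
    for i, req in enumerate(requests):
        queues[i % num_queues].put(req)

    def worker(idx):
        # Completion side: each worker drains only its own queue,
        # so no two workers contend on the same data structure.
        while True:
            try:
                req = queues[idx].get_nowait()
            except queue.Empty:
                return
            completed[idx].append(req)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_queues)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return completed
```

Calling `process_multiqueue(list(range(10)), 4)` returns four lists, one per queue, that together contain all ten requests. With a single queue, the same call degenerates to one worker doing all the work, which mirrors the single-queue bottleneck discussed earlier.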

If it works well for real hardware, why can't hypervisors follow suit? AHV does.

Frodo is a new Acropolis Hypervisor component which handles the Virtio-SCSI PCI controller presented to virtual machines. As far as the guest is concerned, the only difference is that this controller is now multi-queue. From the hypervisor side, however, Frodo is designed to be more efficient and also to process the different queues in parallel using multiple threads.

To achieve this, Frodo was specifically designed to work with the Nutanix stack. On top of being multi-threaded to process the request queues in parallel, each thread is also extremely efficient. For starters, Frodo knows that the requests are SCSI commands. It can therefore pass them, without any translation, directly to the CVM, which already supports this protocol. Next, the mapping of virtual queues to threads is done in a way that allows for intelligent request batching, which makes the communication more efficient as well. Finally, it gives AHV complete control over the datapath, laying the foundation for many future possibilities.
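The post does not describe how Frodo batches requests, so the following is only a minimal sketch of the general idea of request batching: pending requests are pulled from a queue in groups, and each group is forwarded with a single call instead of one call per request. The names `drain_batch`, `forward_all`, `max_batch` and `send` are hypothetical and chosen for this example.

```python
from collections import deque


def drain_batch(virtqueue, max_batch):
    """Pop up to max_batch pending requests from one queue as a single batch."""
    batch = []
    while virtqueue and len(batch) < max_batch:
        batch.append(virtqueue.popleft())
    return batch


def forward_all(virtqueue, max_batch, send):
    """Forward every pending request, calling send() once per batch.

    Returns the number of send() calls made, which is the cost
    that batching tries to reduce.
    """
    calls = 0
    while virtqueue:
        send(drain_batch(virtqueue, max_batch))
        calls += 1
    return calls
```

For example, forwarding ten pending requests with `max_batch=4` takes three `send` calls rather than ten. Fewer, larger hand-offs between components is the kind of saving that batching aims for.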

The graph above compares the IOPS obtained for 4k random read requests on an NX-3060-G5 AF setup. It shows how performance improves as the number of outstanding requests is pushed higher on a single virtual machine with 8 vCPUs, 16GB of RAM and 6 virtual disks. The VM powered by Frodo can deliver as many as 180k IOPS, while the standard VM using QEMU does not go above 80k.

The development of Frodo is a great example of why Nutanix is investing in AHV. When Nutanix has control over the entire stack, the possibilities are endless.

In conclusion, Nutanix AHV is an awesome hypervisor today. AHV Turbo builds on that by enabling the Nutanix Enterprise Cloud OS to take advantage of next-generation technologies like RDMA, NVMe and 3D XPoint.

If you have thoughts or questions, let's continue the conversation on our forum. Be sure to label/tag your post with AHVTurbo.

Forward Looking Statements
This blog includes forward-looking statements, including but not limited to statements concerning our plans and expectations relating to product features and technology that are under development or in process and capabilities of such product features and technology and our plans to introduce product features in future releases. These forward-looking statements are not historical facts, and instead are based on our current expectations, estimates, opinions and beliefs. The accuracy of such forward-looking statements depends upon future events, and involves risks, uncertainties and other factors beyond our control that may cause these statements to be inaccurate and cause our actual results, performance or achievements to differ materially and adversely from those anticipated or implied by such statements, including, among others: failure to develop, or unexpected difficulties or delays in developing, new product features or technology on a timely or cost-effective basis; delays in or lack of customer or market acceptance of our new product features or technology; the introduction, or acceleration of adoption of, competing solutions, including public cloud infrastructure; a shift in industry or competitive dynamics or customer demand; and other risks detailed in our Form 10-K for the fiscal year ended July 31, 2017, filed with the Securities and Exchange Commission. These forward-looking statements speak only as of the date of this presentation and, except as required by law, we assume no obligation to update forward-looking statements to reflect actual results or subsequent events or circumstances.

Disclaimer: This blog may contain links to external websites that are not part of Nutanix. Nutanix does not control these sites and disclaims all responsibility for the content or accuracy of any external site. Our decision to link to an external site should not be considered an endorsement of any content on such site.

© 2017 Nutanix, Inc. All rights reserved. Nutanix, the Enterprise Cloud Platform, the Nutanix logo and other Nutanix products mentioned are registered trademarks or trademarks of Nutanix, Inc. in the United States and other countries. All other brand and product names mentioned herein are for identification purposes only and may be the trademarks of their respective holder(s).

4 replies

@aluciani Hey that was a great briefing on the Frodo component and Virtual multiple queuing system!

I would like to clarify one thing: with the introduction of Frodo, does the queue depth setting in IO tool configurations no longer matter, since many more IO requests can be processed in parallel per disk?

Hi @Bala_88

Thanks for the feedback - I'll see if the author can get me a reply and I will share here.
Hi @Bala_88

Here is the reply from the author

With Frodo, the virtual SCSI controller on the virtual machine has a request queue for each vCPU. Each request queue can hold up to 128 requests, irrespective of the number of virtual disks. The number of requests admitted by Stargate (to be processed in parallel), though, can still restrict the total number of in-flight operations at any one time. In any case, the Nutanix stack is always configured for best performance and resiliency. Having said that, we strongly encourage customers to experiment with different benchmarks that are representative of their workloads to find the VM configuration that best suits their needs. For cases where a single VM needs more storage power than a single node can provide, we recommend checking out Volume Group Load Balancing.
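The arithmetic in that reply can be written out as a short Python sketch. The 128-request queue depth and the one-queue-per-vCPU layout come from the reply itself. The `backend_limit` parameter is a hypothetical stand-in for whatever number of requests Stargate admits, which the reply does not state.

```python
QUEUE_DEPTH = 128  # per-queue limit stated in the author's reply


def max_guest_outstanding(vcpus, queue_depth=QUEUE_DEPTH):
    """Upper bound on requests a guest can queue: one queue per vCPU."""
    return vcpus * queue_depth


def effective_inflight(vcpus, backend_limit, queue_depth=QUEUE_DEPTH):
    """In-flight requests are capped by whichever limit is smaller.

    backend_limit is an assumed placeholder for the backend admission
    limit; its real value is not given in the post.
    """
    return min(max_guest_outstanding(vcpus, queue_depth), backend_limit)
```

For the 8-vCPU VM used in the benchmark, `max_guest_outstanding(8)` gives 1024 queued requests, compared with 128 for the single-queue controller. If the backend admitted fewer requests than that, the smaller number would be the effective ceiling, which is why queue depth settings in benchmark tools are still worth experimenting with.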

Let me know if this helps 👍
Thanks @aluciani for syncing with the author and updating me on this!

Yes, it clarifies!