HCI and the Art of Performance Measurement

  • 10 March 2017
This post was authored by Andy Daniel, Senior Technical Marketing Engineer at Nutanix

When I joined Nutanix as part of the PernixData acquisition in the summer of 2016, it was the first time that I had an opportunity to explore “under the hood” of the Acropolis Distributed Storage Fabric. I had spent the previous three and a half years solving storage performance issues using server-side software and flash, so I was keen to see how well Nutanix understood data locality and flash. I certainly knew of Nutanix in the marketplace, but to me, the company was best known for “one-click” ease of management, not performance.

Although there were reference architectures for business-critical apps and the accompanying blog work by various teams including Michael Webster, Josh Odgers, and others, there seemed to be little official performance-related material available. This left me wondering: was great performance just a given on Nutanix? Furthermore, was my confusion an indication that my head had been buried in the sand during the seismic shift to HCI and the rise of the enterprise cloud?

As a precursor, I’ll admit that assessing realistic storage performance can be difficult. Over the years, I’ve been pulled into countless engagements where customers or prospects have fired up fio or IOMeter, followed a well-intentioned blog post step-by-step, but weren’t seeing the results they had expected. Benchmarking tools are difficult to configure even when used in appropriate situations. What these customers didn’t know is that they were doomed from the start. Unfortunately, most systems properly designed for real-world virtualized workloads just aren’t appropriate candidates for testing with simple I/O generation tools. But if that’s the case, where do we turn?

It turns out that when I joined Nutanix, a similar question was being debated across the organization. Since underlying resources are shared throughout the infrastructure, evaluating performance is even more difficult with hyperconverged offerings. Experts in their respective fields held differing perspectives on how best to answer my original Nutanix performance question.

The solutions and performance engineers who create and turn software knobs during the day and dream of byte-addressable persistent memory at night readily made the same benchmarking argument I outlined above. At the same time, product marketers were pointing to the competitive landscape.

The truth was, with flash storage hardware and its relative abundance of IOPS now on the scene, competitors were proudly thumping their chests using synthetically generated “hero numbers.” It wasn’t hard to argue that this was distracting customers and creating considerable confusion in the marketplace. It’d be easy to simply participate in the madness and demonstrate jaw-dropping numbers for attention.

Luckily, around the same time that I joined the conversation, Nutanix decided to turn to a third-party expert for answers. That expert was Enterprise Strategy Group (ESG), and specifically Senior Analyst Mike Leone. With my background, newcomer perspective, and penchant for a challenge, I grabbed the baton and worked directly with Mike as the liaison between the teams on the project. Not only did I want to successfully arbitrate the internal discussion, but I secretly wanted a front-row seat to answer my own performance questions.

Today, I’m proud to announce the final result of our collaboration, presented in a report that can be found here. With ESG’s help, we tested the performance consistency, predictability, and scalability of four enterprise-class, mission-critical application workloads (Microsoft SQL Server database, Oracle Database, Microsoft Exchange, and Citrix XenDesktop VDI) on the Nutanix Enterprise Cloud Platform.

Unlike the “hero number” tests from other vendors that simply generate synthetic I/O, we focused on testing realistic workloads using industry-standard application testing tools like SLOB, Jetstress, and Login VSI. Mike explains it best:

“In all cases, compute and storage resources were exercised so that the testing emulated the real-world performance of HCI solutions containing shared compute and storage resources. All tests and results are meant to present meaningful application performance data that is likely to be achieved in a production environment.”

At the same time, we were keenly aware of the generic IOPS and latency claims from others in the industry. So, for comparison, the report also includes industry-leading, referenceable storage IOPS and latency results generated during the tests on the Nutanix Enterprise Cloud Platform. We think that these results, provided in the context of the application results that matter most, prove that performance is a given on Nutanix.

Stay tuned as we deep-dive into each of the tests over the next several weeks and give additional details about how and why each of the workloads was chosen. With my head firmly rescued from the sand, I’m ready to help elevate the performance conversation. Grab a copy of the ESG report, let us know what you think, and continue the conversation on the Nutanix Next Community.

You can download the report here.

Let's Discuss: Keep the conversation going in our community forums

Disclaimer: This blog may contain links to external websites that are not part of Nutanix. Nutanix does not control these sites and disclaims all responsibility for the content or accuracy of any external site. Our decision to link to an external site should not be considered an endorsement of any content on such site. Please send us a note through the comments below if you have specific feedback on the external links in this blog.

© 2017 Nutanix, Inc. All rights reserved. Nutanix is a trademark of Nutanix, Inc., registered in the United States and other countries. All other brand names mentioned herein are for identification purposes only and may be the trademarks of their respective holder(s).
