Nutanix Connect Blog

11 Reasons why Nutanix is the Best All-Flash Platform

by Community Manager on 12-12-2016 12:10 AM - edited 12-12-2016 05:05 AM (9,928 Views)


This post was authored by Steve Kaplan, VP of Client Strategy at Nutanix

 

All Flash Arrays: Dead Men Walking

All Flash Array (AFA) manufacturers may be rejoicing in the inevitable demise of spinning disk, but hyperconverged infrastructure (HCI) is increasingly upending the entire storage category. While an AFA may be faster and easier to manage than a traditional array, it's still a SAN. Nutanix Enterprise Cloud is a better platform for flash not only than AFAs, but also than other HCI solutions. Here are the 11 reasons why:

 

1) Dramatically reduced network latency effects

Nutanix HCI already bests AFA performance by eliminating network latency (see @vcdxnz001's post, Your Network is Too Slow and What to Do About it). Innovations such as NVMe and 3D XPoint amplify the advantage of storing data on flash or other Storage-Class Memory (SCM) next to the compute in a hyperconverged environment. Accessing data in the traditional model, from an All Flash Array across a slower network, negates the benefits of the faster flash/SCM.

 


 

Putting flash in a proprietary array at the end of a network designed for the latency of magnetic media, instead of next to the compute, makes no intuitive sense. It boils down to simple physics: proximity matters. Flash should be directly connected, not remotely attached behind multiple hops, protocols, and performance-constraining controllers.

 

[Figure: I/O path length for AFAs versus Nutanix]

 

AFA vendors often point to faster networks and NVMe over Fabrics as the path to lower latency and higher bandwidth. Nutanix lets customers realize the full benefit of flash without purchasing expensive new storage fabrics that perpetuate legacy complexity.

 

[Image: flash vs. network latency, from Long White Virtual Clouds by Michael Webster]

 

2) Density advantage

Nutanix packs 92TB of flash, in addition to all of the server resources, into just 2U. AFAs require not just the array, but also the compute, the storage fabric, and possibly lower-cost disk storage - all of which demands more power, rack space, and cooling.

 

3) Commodity Hardware

Most AFAs, such as Pure, utilize proprietary hardware - a roadblock to quickly adopting new hardware innovations. All-flash arrays risk technological leaps that leave customers with obsolete products, facing forklift upgrades the next time they need more capacity. In today's fast-paced technology environment, the companies that succeed are those that ride the global economies of scale and the innovation driven by the world's largest commodity hardware manufacturers.


Take the case of Sun Microsystems. Sun bet on proprietary hardware while the industry shifted to commodity servers utilizing the more cost-effective Intel-compatible microprocessors popularized in personal computers. Sun lost 80% of its value before being acquired by Oracle in a fire sale.


Violin Memory is another example. Violin was one of the first companies to introduce all-flash memory solutions to the marketplace. This was very cool and fast tech, with great engineering when they launched a decade ago.


But consumers had another idea. They loved the speed and reliability of solid-state drives (SSDs), which can now be found in almost every laptop, desktop, and storage array. Even as the price of SSDs plummeted, Violin kept designing its own proprietary field-programmable gate arrays. A sophisticated solution, perhaps, but no match for the rapid improvement of SSDs; Violin's proprietary hardware quickly fell behind, and the company has been delisted from the NYSE.


The hyperconverged business is hardly the only example of a thriving enterprise technology built upon commodity hardware. All of the leading cloud providers also utilize commodity servers. Proprietary hardware, while once essential to protecting a company’s innovations, now hinders, or even destroys, a manufacturer’s ability to compete.


4) Distributed storage controllers

Most AFAs have physical, non-distributed, storage controllers that are easily saturated with traffic. Since the controllers are the bottleneck, adding more shelves of SSDs does not increase performance.


If we assume a single enterprise SSD is capable of delivering ~500MB/s of throughput, then a controller with dual 4Gb FC adapters is bottlenecked with only two SSDs. Even upgrading to dual 16Gb FC adapters only accommodates eight SSDs.
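
Translating that arithmetic into a quick sketch (using raw Fibre Channel line rates and the ~500MB/s-per-SSD assumption from the paragraph above):

```python
# Back-of-envelope controller bottleneck math, using the round numbers above:
# raw FC line rate (divide by 8 for bytes) and ~500 MB/s per enterprise SSD.

SSD_MBPS = 500  # assumed throughput of one enterprise SSD

def ssds_to_saturate(fc_gbits: int, ports: int = 2) -> float:
    """How many SSDs' worth of throughput a controller's FC ports can carry."""
    controller_mbps = fc_gbits * 1000 / 8 * ports  # Gb/s -> MB/s across all ports
    return controller_mbps / SSD_MBPS

for speed in (4, 8, 16, 32):
    print(f"dual {speed}Gb FC is saturated by ~{ssds_to_saturate(speed):.0f} SSDs")
# dual 4Gb FC is saturated by ~2 SSDs; dual 16Gb FC by ~8 SSDs
```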


To overcome these limitations, AFAs must accommodate multiple adapters resulting in complex fabric configurations. But this inevitably hits the controller limits, forcing customers to purchase more AFA systems and creating more silos.


Contrast this with Nutanix, where every node added to a cluster also adds a virtual storage controller, immediately enhancing performance. Resiliency is massively improved because the loss of one controller has very little impact. This is why Nutanix can do non-disruptive 1-click upgrades and maintenance with very low impact.


5) Data locality

Imagine what would happen if 75% of the cars in Los Angeles were suddenly removed from the roads. Not only would traffic congestion quickly dissipate, but the city would see other benefits such as fewer accidents, less road maintenance, reduced pollution, and so on.


Nutanix data locality has a similar effect on the data center by pulling the majority of read traffic off the network; reads are instead served from the local SSD within the node. The freed-up network bandwidth becomes available for writes and end-user applications, improving not just storage performance, but also the performance of the applications the storage is servicing.
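
As a rough illustration of the effect (our assumptions, not figures from this post), the sketch below compares the network traffic generated by the same application I/O with and without local reads, assuming a 70:30 read/write mix and that each write still sends one replica copy to another node:

```python
# Illustrative only: network traffic for the same application I/O with and
# without data locality. Assumptions (ours): 70:30 read/write mix; writes
# cross the network once either way (remote replica or array write).

def network_mbps(total_mbps: float, read_ratio: float, local_reads: bool) -> float:
    reads = total_mbps * read_ratio
    writes = total_mbps * (1 - read_ratio)
    return writes + (0.0 if local_reads else reads)

demand = 1000  # MB/s of application I/O (arbitrary)
print(network_mbps(demand, read_ratio=0.7, local_reads=False))  # 1000.0 - everything on the wire
print(network_mbps(demand, read_ratio=0.7, local_reads=True))   # 300.0 - only write traffic remains
```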

 


 

6) Scalability

Capacity Performance: AFAs, which are typically limited to two physical storage controllers, hit a metadata bottleneck when scaling capacity, constrained by the amount of RAM/NVRAM in the system. Adding SSDs, in most cases, does not improve performance.

 

At some point, the AFA customer must either upgrade to a bigger unit with more processing power, add complex fabric interconnection, or start creating silos. AFA manufacturers will say they can replace existing controllers with new faster ones, but despite the disruption and expense, that shifts the bottleneck to the network or possibly even to the existing flash medium.

 

Contrast this with Nutanix, which, unlike AFAs, is not bottlenecked by two physical storage controllers. The VMs on every node are serviced by the Controller Virtual Machine (CVM) on that node. Every time a node is added to the cluster, a CVM is added with it, linearly scaling not just capacity but also performance and resiliency, while expanding the capabilities of the management stack. Acropolis Block Services (ABS) and Acropolis File Services (AFS) let Nutanix customers scale physical and virtual workloads, as well as file serving, from the same Nutanix cluster, eliminating silo inefficiencies.
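
A toy model of that scaling difference (the per-controller numbers below are made up for illustration; they are not benchmark figures):

```python
# Toy comparison: aggregate controller capability as a cluster grows.
# Both IOPS figures below are illustrative assumptions, not measurements.

CVM_IOPS = 60_000               # assumed capability of one Nutanix CVM
AFA_CONTROLLER_IOPS = 250_000   # assumed capability of one AFA controller

def hci_iops(nodes: int) -> int:
    return nodes * CVM_IOPS         # one CVM is added with every node

def afa_iops(nodes: int) -> int:
    return 2 * AFA_CONTROLLER_IOPS  # fixed dual-controller head, however many shelves

for n in (4, 8, 16, 32):
    print(f"{n:>2} nodes: HCI {hci_iops(n):>9,} IOPS vs AFA {afa_iops(n):>9,} IOPS")
```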

 


 

Dedupe/Compression Performance: Nutanix's unique implementation of dedupe and compression ensures that performance overhead is minimized. Nutanix does not brute-force dedupe or compress all data, since doing so requires more physical resources and impacts all I/O regardless of the outcome.

 

Resiliency: Both resiliency and high availability are built in across the entire Nutanix stack. Replication Factor 2 (RF2) or RF3, along with erasure coding (EC-X), provides superior fault tolerance at the disk level. Block awareness mitigates node failure, while synchronous and asynchronous replication provide resiliency for entire datacenters.
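
For a sense of what those settings mean for usable capacity, here is a rough calculator; the RF overheads follow directly from keeping two or three copies, while the 4+1 EC-X stripe is an assumption for illustration (actual stripe widths depend on cluster size and apply to cold data):

```python
# Rough usable-capacity calculator. RF2/RF3 keep 2/3 copies of each extent;
# EC-X replaces full copies of cold data with a parity stripe. The 4+1 stripe
# below is an illustrative assumption, not a fixed Nutanix setting.

def usable_tb(raw_tb: float, rf: int = 2, ecx: bool = False,
              stripe_data: int = 4, stripe_parity: int = 1) -> float:
    overhead = (stripe_data + stripe_parity) / stripe_data if ecx else rf
    return raw_tb / overhead

print(usable_tb(92, rf=2))      # ~46.0 TB usable from 92 TB raw with RF2
print(usable_tb(92, rf=3))      # ~30.7 TB with RF3
print(usable_tb(92, ecx=True))  # ~73.6 TB if cold data is EC-X encoded (4+1)
```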

 

All-Flash Storage-Only Nodes: Storage-only nodes give Nutanix customers the ability to scale compute and storage separately, minimizing the cost of their all-flash environments.

 

7) Simplicity

Nutanix one-click upgrades reduce both the complexity and the risk involved in the upgrade process - there is no complex interoperability matrix or set of operational guidelines. Nutanix also simplifies the flash-based architecture by eliminating LUNs and their presentation, focusing on VMs rather than storage constructs, and including both centralized management and capacity planning.

 

[Figure: Nutanix's simple and intuitive Prism management dashboard]

 

8) Workload Consolidation

AFAs must send information from the flash array across the network to the compute for processing. Beyond adding the aforementioned latency, this also requires additional queue management and overhead. CPUs can quickly become overloaded when simultaneously receiving small-block, high-IOPS and large-block, high-throughput application requests. To ensure consistent performance, AFA administrators must frequently keep OLTP and OLAP workloads from running on the same platform.

 

Nutanix gives the compute direct access to the storage. Servicing requests with limited overhead and consistent low latency enables mixing of workloads. And with Nutanix Acropolis Block Services, Nutanix becomes the storage backplane for bringing together different types of applications. Customers can even consolidate both physical workloads and virtualized workloads in the same cluster.

 

Additionally, AFAs tend to require separate devices for block storage and file storage. With Nutanix, the same storage is shared between block and file.

 


 

9) Proven Mission-Critical Application Deployment

Nutanix enables optimal performance for critical apps right out of the box, even with multiple workloads. It eliminates the single point of failure challenge with storage access failover, self-healing, and ongoing data integrity checks. Storage performance is predictable, and no complex configuration or tuning is needed.

 

Non-disruptive software updates eliminate planned downtime, enhancing Nutanix's appeal for hosting mission-critical applications. Maintenance windows for software upgrades and scaling become a thing of the past. Unlike almost all other HCI solutions, Nutanix has years of proven maturity and success in enterprise deployments of Splunk, Oracle, SAP, SQL Server, Exchange, and many other mission-critical applications (only Nutanix and VxRack are SAP-certified).

 


 

10) Lower Total Cost of Ownership (TCO)

AFAs eventually run out of controller capacity, technology advances to the point where the existing AFA solution is comparatively uneconomical, or the equipment just gets old. In any of these cases, the AFA owner faces a forklift upgrade - a process that is typically expensive, complex and time-consuming. As a result, AFA owners typically purchase more capacity than required initially in hopes of having enough resources available to meet requirements four or five years down the road.

 

Nutanix owners never face a forklift upgrade, and therefore never need to purchase more nodes than required at any point in time. As technology changes, newer nodes can simply be added to the cluster with a mouse click, and the software takes care of everything else. Nutanix eliminates the risk of under-buying.

 

Completely eliminating the need for storage arrays, storage fabric, and up-front excess capacity lowers the CapEx of a Nutanix deployment. As the project footprint expands over the following years, fewer and fewer nodes are required to run the same workload, thanks to an increasing density of VMs per node driven both by Moore's Law and by performance enhancements in Nutanix software.

 


 

The CapEx for the project lifetime is thereby further reduced along with the associated rack space, power and cooling. Administrative requirements for Nutanix are also slashed - an IDC study found an average 71% reduction in administration time required for organizations migrating to Nutanix.


11) The Advantage of an Enterprise Cloud Platform

At the end of the day, it's not just about the work, it's about how you do it. Nutanix's use of web-scale architecture is a unique differentiator, incorporating hyperconvergence as part of an Enterprise Cloud Platform. Distributed technologies such as Cassandra (NoSQL), MapReduce, and Curator enable significantly higher performance and efficiency when optimizing all-flash environments.

 

Data Access: Classic tree-structured metadata architectures (B-tree and red-black tree) that work well in an array environment - where metadata is stored in each physical controller - are not optimal in all-flash HCI environments. In HCI, the metadata is distributed across many nodes, making tree-structured lookup inefficient. To combat this inefficiency, Nutanix uses big-data technologies such as Cassandra (NoSQL) to enable very fast lookups and very high fault tolerance. No single point of failure exists.
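
As a toy illustration of the difference (not Nutanix's actual implementation), the sketch below shows a ring-partitioned lookup: any node can compute which peer owns a piece of metadata from a hash of the key, with no central tree and no single point of failure.

```python
# Toy illustration only - not Nutanix code. Ring-partitioned (consistent-hash
# style) metadata lookup: the owner of a key is found by hashing it onto a
# ring of nodes, so no single controller has to hold or serve the whole map.

import bisect
import hashlib

def token(value: str) -> int:
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class MetadataRing:
    def __init__(self, nodes):
        self.ring = sorted((token(n), n) for n in nodes)
        self.keys = [t for t, _ in self.ring]

    def owner(self, metadata_key: str) -> str:
        i = bisect.bisect(self.keys, token(metadata_key)) % len(self.ring)
        return self.ring[i][1]

ring = MetadataRing([f"node{i}" for i in range(4)])
print(ring.owner("vdisk:1234:extent:42"))  # any node can compute this locally
```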


Data Sorting: Unlike legacy 3-tier and other HCI approaches, which sort data in the I/O path, Nutanix evaluates it in the background, enabling better performance. The system scales as nodes are added, allowing faster dedupe and compression with increased data locality. Seamless scalability enables rapid evaluation of whether to promote or demote data depending on the memory and storage tiers available.


Analytics: Even all-flash environments have different tiers of flash (performance & endurance). Metadata continues to grow and it can be difficult to cost-effectively keep it in memory or on the fastest tier.


Nutanix has again utilized a big data approach to solve this challenge. A custom-written version of MapReduce/Curator is used to determine key elements of the data, including what is hot, compressible, and dedupable. The same framework similarly determines what data needs to move to another node for data locality, what data has been deleted, and what data needs to be relocated or rebalanced - particularly in the event of failure.


These analytics enable deeper insight including trending, real-time analysis, proactive monitoring and root cause analysis, and alerting.


Timing: In contrast to other solutions that rely solely on sub-optimal inline compression and proprietary dedupe hardware, Nutanix enables offline sorting with MapReduce/Curator. This allows more writes to land before deciding whether to compress or dedupe, and avoids the need for a performance-limiting centralized database.


Unified Cache: Cache enables data locality. Deduplication makes it possible to store more data in this performance tier and maximize the local cache hit potential. To maximize efficiency without limiting performance, Nutanix performs inline local deduplication of the content cache.

 


 

NVMe: Dead Man Running?

At least one legacy storage manufacturer is promoting NVMe as the future. But the migration to NVMe will further amplify the advantages of putting the compute next to the data rather than across the network, and it will accelerate the journey to extinction of all the fabric-stretched monoliths - including AFAs.


Thanks for content and edits to @joshodgers @briansuhr @_praburam @Priyadarshi_Pd  @sudheenair @binnygill @vcdxnz001 @RohitGoyal

Learn More

Nutanix Flash Forward Website Landing Page & eBook
Ten Things You Need to Know About Nutanix Acropolis Block Services
Ten Things You Need to Know About Nutanix Acropolis File Services


Disclaimer: This blog contains links to external websites that are not part of Nutanix.com. Nutanix does not control these sites and disclaims all responsibility for the content or accuracy of any external site. Our decision to link to an external site should not be considered an endorsement of any content on such site.

 

 

Nutanix .NEXT 2016 Europe - Opening Keynote

by Community Manager on 11-23-2016 12:32 PM (3,456 Views)

Thank you for joining us in Vienna for an EPIC .NEXT conference. We had a great sense of community while building and discussing topics such as the Enterprise Cloud. In case you missed the livestream, I have posted the day-one keynote session so you can re-live the excitement from Vienna. Enjoy!

 

Watch Nutanix CEO Dheeraj Pandey, along with Nutanix Chief Product & Development Officer Sunil Potti and special guests from Citrix, Puppet, and Docker, to learn about the new enterprise cloud platform that is radically simplifying virtualization, containers, compute, and storage for all workloads.

Recap of .NEXT Announcements

Nutanix Announces the Industry’s Only Hyperconverged Solution for Cisco UCS Blades

Today, we are at the .NEXT On-Tour event in Boston and are bringing more exciting news about continuous innovation. A number of vendors, including Nutanix™, offer hyperconverged solutions running on Cisco® UCS® C-Series rackmount servers.

 

Nutanix Enterprise Cloud Platform Just Got Better

The ambition of the Nutanix Enterprise Cloud Platform is to bring cloud-like operational simplicity to enterprise datacenters. If the enterprise datacenter behaves and operates like a public cloud, then application demands - instead of vendor-driven requirements imposed on enterprise IT users - will drive private cloud vs. public cloud (aka "buy vs. rent") decisions.

 

Nutanix Unveils Powerful One-Click Networks to Broaden Enterprise Cloud Platform

Built-in Network Orchestration and Microsegmentation will Deliver Seamless Visibility and Control Over the Entire Infrastructure Stack.

 

Continue the conversation on our forums or share any blogs you have on these topics. 

 

Disclaimer: This blog contains links to external websites that are not part of Nutanix.com. Nutanix does not control these sites, and disclaims all responsibility for the content or accuracy of any external site. Our decision to link to an external site should not be considered an endorsement of any content on such site.

 

Assessing Hyperconverged Performance: The Numbers that Matter (Part 1)

by Community Manager on 10-31-2016 07:26 AM - edited 10-31-2016 12:56 PM (6,238 Views)

This post was co-authored by Gary Little, Sr. Performance Engineer/Sr. Manager at Nutanix and John Charles Williamson, Solutions Technical Writer and Editor at Nutanix

 

With the Nutanix Enterprise Cloud Platform, assessing performance for your enterprise applications is much easier than you might think. In this blog series we approach the topic one facet at a time, beginning with IOPS. Our tests demonstrate conclusively that the NX-3060-G4 delivers far more IOPS than even the most demanding enterprise applications require. Rather than simply offer a series of charts and numbers, however, we explain:

 

  • The many advantages that hyperconverged infrastructures have over traditional architectures.
  • The key metrics for evaluating performance for virtualized enterprise applications.
  • How the Nutanix Enterprise Cloud Platform delivers increasing performance as the application demands it.

When Nutanix created hyperconverged infrastructure (HCI), it not only created a new technology, it created a new market. While there are many advantages to being a first mover, there are also unique challenges, such as a lack of publicly available tools and techniques to help users properly evaluate the product's performance.

 

This blog series aims to help remedy this gap. We explain how to assess performance on the Nutanix Enterprise Cloud Platform, with special attention to the most crucial metrics for mission-critical enterprise applications. Performance is a top priority at Nutanix; a large number of the new workloads our customers are deploying include performance-sensitive enterprise applications such as SQL Server, Oracle databases, and SAP Business Suite.

 

Performance is not just raw speed measured in IOPS, however. It’s speed, stability, and scalability. In this first installment we are going to tackle speed, the most widely used metric, and discuss what’s relevant and what isn’t when thinking about IOPS for enterprise applications on HCI. We chose a transactional database (SQL Server), typically the most I/O-demanding enterprise application, to test our popular all-purpose model, NX-3060-G4.

 

Our testing demonstrates conclusively that the NX-3060-G4 can handle far more IOPS than most enterprise workloads need. We’ll not only show you the numbers that matter for enterprise application performance, we’ll tell you what they mean.

 

Platform Configuration

The per-node configuration of the NX-3060-G4 includes:

 

  • Nutanix AOS 4.7.1
  • 2x E5-2680v3 2.5GHz Intel Xeon processors
  • 512 GB RAM
  • Hybrid—2 SSDs and 4 HDDs

 

Comparing HCI and Traditional Storage Architecture

Hyperconverged infrastructure locates a software-based storage controller close to the hypervisor, either as a virtual machine or as a kernel module. This design is a radical shift from what most enterprise architects are familiar with—many of whom still believe that you need dedicated storage hardware to get high performance I/O capabilities.

 

Traditional storage architecture centralizes I/O from all the hypervisor hosts in one location, usually with a pair of storage controllers connected to a set of disk shelves, which contain either SSDs or HDDs. The storage controllers are configured so that if one storage controller fails, the hypervisor hosts can still access data. To maintain performance, each storage controller should run at 50 percent utilization so that if one storage controller fails, the remaining controller can serve 100 percent of the I/O.
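
In sizing terms, that 50 percent rule caps usable I/O at half of the pair's combined capability (a trivial sketch, with a made-up per-controller figure):

```python
# The dual-controller sizing rule described above: keep each controller at or
# below 50% utilization so the survivor can absorb the full load on a failure.
# The per-controller IOPS figure is made up for illustration.

CONTROLLER_MAX_IOPS = 200_000

def usable_iops(controllers: int = 2, headroom: float = 0.5) -> float:
    return controllers * CONTROLLER_MAX_IOPS * headroom

print(f"{usable_iops():,.0f} IOPS usable from a 2 x {CONTROLLER_MAX_IOPS:,} pair")
```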

 

IOPS Obsession

Due to traditional storage architecture design, the storage controller’s I/O capacity has been a preeminent concern. This stands to reason, as the controller’s I/O capacity determines on average how many hypervisor hosts the storage can support, which in turn determines not only the cost (how many storage controllers one needs to buy) but also the complexity of managing the storage. Moving hypervisor hosts between storage devices is time-consuming and risky.

 

In fact, the complexity of managing multiple storage islands increases geometrically. In other words, managing three storage controller pairs is far more complex than managing one. It comes as no surprise, then, that IOPS continue to be a big concern for organizations moving to hyperconverged architecture.

 

The Hyperconvergence Difference

In an HCI environment, there are subtle but crucial differences that reduce the dependence on maximum IOPS as the primary performance metric. In HCI, every node in the cluster serves I/O, which means that as the number of hypervisor hosts grows, so does the I/O capability.

 

This design eliminates reliance on a single pair of storage controllers. The figures below illustrate this difference, where you can see that, with traditional storage architecture, storage controller capacity determines your consolidation ratio (and also leads to overprovisioning).

 

[Figure 1: Traditional Storage Architecture vs. HCI]

 

Another issue that fuels our fixation with IOPS is the need to satisfy different levels of demand from different types of workloads. Some hosts run workloads with relatively little I/O and others require quite a bit. VDI requires little I/O, but a lot of CPU and memory, while databases require a lot of CPU, memory, and storage IOPS. It turns out, though, that even database workloads have fairly modest I/O requirements relative to the high I/O capacity of modern SSDs, which have driven the HCI revolution.

 

For example, a single Intel S3700 supplies up to 45,000 IOPS at 8 KB. Most hybrid HCI platforms contain two or more such SSDs. All-flash platforms provide up to 24 SSDs. It used to take racks of spinning hard disk drives to generate enough IOPS for enterprise apps, but the industry has moved well past these constraints. We no longer need or want massive deployments of monolithic and modular network storage.

 

Measuring I/O for Database Applications

How much I/O do today’s workloads need? To answer this, we measured the I/O characteristics of a commonly virtualized workload by modeling a Microsoft SQL Server 2014 workload with HammerDB OLTP preset as the driver. HammerDB is a free, well-documented, open-source tool that you can use to reproduce our results if you’d like.

 

We chose the TPC-C schema to build a roughly 550 GB database of 5,000 warehouses and used 750 concurrent users to drive the database transactions with no think-time, resulting in 1,200,000 SQL Server transactions per minute. We went with 750 users because if you don’t have enough users, the database tends to sit in cache; this number was large enough to drive a significant amount of I/O to storage. The transactions per minute are simply a measure of work being done.

 

The results of our test may surprise you. We found that running a transactional workload with a Microsoft SQL Server database, on eight cores, generated only around 20,000 IOPS before reaching 100 percent CPU on the database VM.

 

If we assume 24 cores on the host, the total I/O consumption on the entire host would max out around 60,000 IOPS before the host ran out of CPU for the database VMs. As these numbers indicate, a platform with two Intel S3700 SSDs, generating 90,000 IOPS, could support I/O-heavy database VMs with room to spare.
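
The arithmetic behind that conclusion, using the numbers reported above:

```python
# Back-of-envelope from the measurements above: the IOPS a host can demand is
# capped by CPU long before the local flash runs out of I/O capacity.

IOPS_PER_8_CORE_VM = 20_000   # observed: SQL Server VM at 100% CPU on 8 cores
HOST_CORES = 24               # assumed host size from the text
SSD_IOPS = 45_000             # Intel S3700 class, 8 KB random
SSDS_PER_NODE = 2             # hybrid NX-3060-G4 configuration

host_demand = IOPS_PER_8_CORE_VM * (HOST_CORES / 8)   # ~60,000 IOPS max demand
flash_supply = SSD_IOPS * SSDS_PER_NODE               # ~90,000 IOPS available

print(f"host can demand ~{host_demand:,.0f} IOPS; local flash supplies ~{flash_supply:,.0f}")
```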

 

[Table 1: I/O Requirements for Transactional DB Applications]

 

The table above plainly shows that the point where the database VM reaches 100% CPU is what determines the IOPS requirement per host. When the database CPU is at 100%, the database cannot go any faster, even with an infinite amount of I/O capacity from the storage tier. In short, for enterprise applications running on HCI, CPU capacity imposes the constraint, not IOPS capacity.

 

With the Enterprise Cloud Platform, storage architects working on a uniform cluster no longer need to plan for the aggregate I/O requirement for all the hosts. They only need to measure for the most demanding application host, knowing with certainty that the remaining hosts have more than enough capacity for the less demanding applications.

 

Provisioning enough nodes to meet the compute requirements is the more pressing question. The Enterprise Cloud Platform radically simplifies performance capacity planning, another major benefit of an HCI system that provides data locality.

 

Understanding Microbenchmarks with HCI

The only truly reliable way to predict application performance is to run the application on your system. In lieu of this, however, the industry has come to rely on various microbenchmarks as proxies that allow us to compare systems. Following are some common benchmark patterns for assessing storage architectures. In each case, we illustrate that the available I/O performance on an NX-3060-G4 is greater than the required I/O performance for our example workload.

 

Random Read Performance

The random read performance metric is important in database environments where the total active dataset exceeds the database cache. Because read operations cannot be deferred, random read performance often determines the maximum performance of large database workloads.

 

Of course, the data that databases access is not "random," but is instead a function of the incoming requests. For cases like transactional databases (which may service medical records or credit card transactions) the workload is unpredictable and, as such, appears "random" to the storage system.

 

The chart below shows the performance of an NX-3060-G4 node running a random-read benchmark. The amount of read-concurrency that we observe from the database is between 40 and 80.

 

[Figure 2: IOPS Demanded and Generated for Random Read 8 KB]

 

For this test we used 8 KB I/O because it is the minimum that most databases use. In reality, the I/O sizes are varied. Using a uniform 8 KB I/O size is commonly accepted practice when measuring 8 KB “small block” I/O. We also included queue depth in our tests, which is a key variable often missing from discussions of storage performance.

 

Queue depth is a proxy for the amount of I/O the application demands, and it stands in for such things as multiple threads, database users, and concurrent queries. We measure the total queue depth across the entire host, which could be many virtual machines with many virtual disks attached to each one.

 

The chart demonstrates that, no matter the queue depth, the NX-3060-G4 provides more I/O than either an 8vCPU or 12vCPU database demands for random read operations. Notice especially that the performance increases as the application demands it.

 

There is a great deal of latent capacity in the system; as you add more work, you get more IOPS. Finally, the database runs out of CPU resources long before it can consume all of the I/O capacity, particularly with greater queue depth.
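
If you want to run a queue-depth sweep like this yourself, one common approach is to drive it with fio; the sketch below is one way to do that (our parameters and target path, not the exact harness behind the chart above).

```python
# One way to sweep queue depth for 8 KB random reads with fio. The target path
# is a placeholder; point it at a test file or device you can safely overwrite.

import json
import subprocess

TARGET = "/path/to/testfile"  # hypothetical target

for qd in (1, 4, 8, 16, 32, 64, 128):
    result = subprocess.run(
        ["fio", "--name=randread8k", f"--filename={TARGET}", "--size=10g",
         "--rw=randread", "--bs=8k", "--direct=1", "--ioengine=libaio",
         f"--iodepth={qd}", "--runtime=60", "--time_based",
         "--output-format=json"],
        capture_output=True, text=True, check=True)
    iops = json.loads(result.stdout)["jobs"][0]["read"]["iops"]
    print(f"queue depth {qd:>3}: {iops:,.0f} IOPS")
```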

 

Burst Write Performance

Burst write performance is most applicable to database updates and inserts. Databases typically write to a sequential log and then periodically synchronize the data in the main filesystem, which generates write bursts. In the Prism screenshot below, we show the bursty I/O pattern from a SQL Server running the DB workload discussed above.

 

[Figure 3: Prism View of SQL Server Workload]

 

[Figure 4: IOPS Required and Generated for Burst Random Write at 8 KB]

 

The platform achieves high levels of burst write performance with the “Oplog,” which is specifically designed to handle incoming write bursts. SQL Server “file flush” operations are short bursts with high degrees of concurrency (queue depth). With the HammerDB OLTP workload, we observe around 256 outstanding I/Os when measuring the datastore’s “Active” I/O using esxtop.

 

The Oplog resides on SSD and is replicated to other Nutanix nodes to ensure no data loss in the event of a failure. In fact, the Oplog is itself somewhat similar to a database transaction log, in that it is a write-optimized on-disk structure that asynchronously drains into a read-optimized datastore. 

 

Sustained Small Random Write IOPS

Database workloads also require sustained write performance. This can be thought of as the background write rate sustained in-between the bursts. Although writes to the DB log file are sequential, they also use the Oplog since they are small in size and require low latency.

 

The “Sustained Random write” requirements account for both DB log writes and continuous background writes to the main DB files. The concurrency factor for background sustained write I/O demand for HammerDB with 750 users was around 30. The I/O demand is around 8,000 - 9,000 IOPS.

 

Simulating an “Average DB” workload

A reasonable simulation of the HammerDB SQL workload is to use a 70:30 read/write mix with an 8 KB block size. Although this scenario does not simulate bursty behavior, it does approximate the I/O size and I/O mix. Concurrency between 64 and 128 simulates both sustained and burst I/O in a single workload. The database was around 600 GB on-disk and the working-set size was around 400 GB.

 

[Figure 5: Sustained and Burst I/O for an 8K 70:30 Read/Write Mix]
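
One way to approximate this "average DB" profile yourself is with an fio job like the one sketched below (our parameters and target path, not the exact harness behind Figure 5):

```python
# Approximate the profile described above: 8 KB blocks, 70:30 random
# read/write mix, queue depth in the 64-128 range. The target path is a
# placeholder; point it at a test file or device you can safely overwrite.

import subprocess

TARGET = "/path/to/testfile"  # hypothetical target

subprocess.run(
    ["fio", "--name=avgdb", f"--filename={TARGET}", "--size=100g",
     "--rw=randrw", "--rwmixread=70", "--bs=8k", "--direct=1",
     "--ioengine=libaio", "--iodepth=64", "--runtime=300", "--time_based"],
    check=True)
```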

 

What These Numbers Mean to You

For the first blog in this series, we focused on the most commonly cited performance metric—IOPS capacity. Here are a few quick takeaways:

 

  • The Nutanix NX-3060-G4 platform easily exceeds the I/O requirements of even IOPS-hungry enterprise database workloads, with room to spare for workload consolidation.
  • Nutanix radically simplifies performance capacity planning. Once you’ve accounted for your most I/O-demanding application, it’s then only a matter of determining how many nodes you need.
  • CPU and I/O capacity grow linearly on demand to service your enterprise applications—by design, your applications get more as they need more.

Although we’ve proven that we can satisfy today’s I/O-intensive database applications, we also now offer even more powerful Broadwell-based G5 models. We keep pushing the boundaries of performance so that we are ready to serve your enterprise workload needs now and in the future.

 

As happy as we are with our test results, making infrastructure invisible for our thousands of customers, including more than 300 Global 2000 companies, is what really counts. Enterprises like Excelitas Technologies, Lion Group, Hallmark Business Connections, UCS Solutions, Empire Life, and Jabil all run enterprise applications such as SQL Server and Oracle on Nutanix; their real-world accounts of achieving great performance and availability via the Nutanix platform are why we do what we do.

 

Be sure to check out our best practices guides and reference architectures to see our validated designs for running your enterprise apps on Nutanix with confidence.

 

Stay tuned for our next installment, where we will explain the role of stability when assessing performance for HCI.

 

Disclaimer: This blog may contain links to external websites that are not part of Nutanix.com. Nutanix does not control these sites, and disclaims all responsibility for the content or accuracy of any external site. Our decision to link to an external site should not be considered an endorsement of any content on such site.

 
