Nutanix Connect Blog

"D... is Forever": ROBO Series

by Community Manager on 01-30-2017 02:22 PM

 

This blog was authored by Amit Jain, Principal Product Manager at Nutanix.

 

 

[Image: De Beers "A Diamond is Forever" advertisement]

 "Diamond is forever”! The brilliance of this slogan lies in its emphasis on both eternity and the sentiment, whether you wish one for yourself or for your loved ones. 

 

 

Image Source: DeBeers.com

 

Likewise, “Data is forever”! We at Nutanix understand how critical data is to your business and how important it is for us to provide comprehensive data protection. So, if you engage us, we promise to keep that part of the bargain. Not only that, we make ROBO (Remote Office/Branch Office) backup provisioning and management simple and convenient, so you can get your weekend back to have fun with your loved ones and truly enjoy the sentiment!

 

Full Stack ROBO Solution

 

Data protection is one of the top business challenges for ROBO admins (ESG Report). You often have to deal with separate solutions for compute, storage, backup, and disaster recovery, increasing both CapEx and OpEx. That variety can perhaps be managed in an enterprise datacenter, but not at ROBO sites, where there is typically no IT staff on site. ROBO sites might be retail locations, insurance or sales offices, oil rigs, or large manufacturing plants, and you may have to manage tens, hundreds, or even thousands of them remotely from a centralized datacenter. Simplicity is key, and here is the cost-effective, full-stack ROBO solution from Nutanix: we cover the physical compute/storage infrastructure (choice of NX, Lenovo HX, or Dell XC platforms), the virtual infrastructure (AHV, the Nutanix native hypervisor), and data protection, all managed through our consumer-grade Prism Central.

 

[Image: Nutanix full-stack ROBO solution]

 

Integrated ROBO Data Protection: Over-the-WAN, On-Prem

 

Over the WAN, you can replicate snapshots of virtual disks and VMs from ROBO sites to a Nutanix cluster in the centralized enterprise datacenter. Using Nutanix Cloud Connect, you can create long-term backups from ROBO to public cloud services such as Amazon Web Services and Microsoft Azure. The functionality is seamlessly integrated into Nutanix data protection capabilities, allowing you to back up to and recover from the cloud with just a few clicks, just as you would with a remote Nutanix cluster.

 

You can dramatically improve RTO (Recovery Time Objective) and backup storage efficiency with the industry's first redirect-on-write snapshot algorithm for enterprise cloud platforms built on hyperconvergence.
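
To make the idea concrete, here is a minimal Python sketch of redirect-on-write snapshotting. All names are hypothetical and this is an illustration of the general technique, not the actual Nutanix implementation: a snapshot freezes only the block map, and new writes are redirected to fresh extents instead of copying old data first.

```python
# Minimal redirect-on-write sketch (hypothetical names; illustration only).

class RedirectOnWriteDisk:
    def __init__(self):
        self.extents = {}          # extent_id -> data
        self.block_map = {}        # logical block -> extent_id
        self.snapshots = []        # frozen block maps
        self._next_extent = 0

    def write(self, block, data):
        # Redirect-on-write: always write to a brand-new extent and point
        # the live block map at it. Old extents are never overwritten, so
        # frozen snapshot maps stay valid without any copying.
        eid = self._next_extent
        self._next_extent += 1
        self.extents[eid] = data
        self.block_map[block] = eid

    def snapshot(self):
        # A snapshot is just a frozen copy of the (small) block map,
        # so its cost is proportional to metadata, not data.
        self.snapshots.append(dict(self.block_map))
        return len(self.snapshots) - 1

    def read(self, block, snapshot_id=None):
        bmap = self.block_map if snapshot_id is None else self.snapshots[snapshot_id]
        return self.extents.get(bmap.get(block))

disk = RedirectOnWriteDisk()
disk.write(0, b"v1")
snap = disk.snapshot()
disk.write(0, b"v2")                 # redirected to a new extent
assert disk.read(0) == b"v2"
assert disk.read(0, snap) == b"v1"   # the snapshot still sees old data
```

Because a snapshot copies only metadata, creating or restoring one does not scale with the size of the data, which is what drives the RTO and storage-efficiency improvement.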

 


 

We understand that WAN bandwidth is typically limited and can be expensive at ROBO sites, so you may also want a local, on-prem backup solution at ROBO for faster recovery. One option is the Nutanix TimeStream feature, which keeps snapshots on the primary ROBO cluster itself to provide easier, faster restores in case of software misconfiguration or virus attack.

 

You may also need an additional local backup appliance in a separate failure domain. For that, you can leverage our newly introduced 1-Node Replication Target, which can take Nutanix native snapshots and is designed specifically for ROBO environments. It is very cost-effective, has built-in resiliency, and provides up to 40 TB of raw capacity. It runs the Nutanix native hypervisor (AHV) and is ready to be fired up as shipped from the factory. Using the Foundation tool, the cluster association and container mappings between the primary ROBO cluster and the 1-Node Target are set up automatically, making provisioning simpler for you!

 

These choices from Nutanix allow you to protect your data effectively and efficiently without introducing another backup vendor, software or hardware, and without increasing the complexity of your ROBO environment!

 

Backup/DR Nutanix Cluster in the Enterprise datacenter

 

A Backup/DR (Disaster Recovery) cluster in the enterprise datacenter is required to consolidate the replicated snapshots sent over the WAN from multiple ROBO sites. If needed, you can add $/GB-optimized, capacity-heavy Nutanix nodes to your existing web-scale Nutanix systems in the datacenter to support nearly unlimited snapshot retention. We understand that long-term retention may require integration with other software as well, so we also support VADP (VMware vStorage APIs for Data Protection) and application-consistent snapshots using the Microsoft Volume Shadow Copy Service (VSS).

 

Global Distributed De-duplication

 

The Global Distributed De-duplication feature ensures that if one ROBO site has already sent a data block (say ‘A’, as in the figure below) to the Nutanix Backup/DR cluster in the enterprise datacenter, then no other ROBO site holding the same block (‘A’) will send it over the WAN again.

 

[Image: Global distributed de-duplication across ROBO sites]

 

This is ensured by performing a fingerprint check over the wire before data is exchanged between two Nutanix clusters. Such a feature reduces the backup traffic sent over the WAN, efficiently utilizing the already bandwidth-constrained WAN links at ROBO sites. Moreover, it reduces the storage requirement in the datacenter, which can be very cost-effective for you.
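
Conceptually, the over-the-wire check works like the following Python sketch (hypothetical names and a deliberately simplified protocol; the real wire format is internal to AOS): the sender offers fingerprints first, and only blocks the target has never seen, from any site, are shipped.

```python
# Hedged sketch of fingerprint-based WAN de-duplication (illustration only).
import hashlib

def fingerprint(block: bytes) -> str:
    return hashlib.sha1(block).hexdigest()

class BackupTarget:
    """Datacenter cluster: tracks fingerprints of blocks it already stores."""
    def __init__(self):
        self.store = {}  # fingerprint -> block data

    def missing(self, fps):
        # Answer the over-the-wire check: which fingerprints are unknown?
        return [fp for fp in fps if fp not in self.store]

    def receive(self, blocks):
        for b in blocks:
            self.store[fingerprint(b)] = b

def replicate(robo_blocks, target):
    """Send only blocks the target has never seen from ANY site."""
    fps = [fingerprint(b) for b in robo_blocks]
    needed = set(target.missing(fps))
    sent = [b for b in robo_blocks if fingerprint(b) in needed]
    target.receive(sent)
    return len(sent)

target = BackupTarget()
print(replicate([b"A", b"B"], target))  # site 1 sends 2 blocks
print(replicate([b"A", b"C"], target))  # site 2 sends only 'C': 1 block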

 

File-Level Restores

 

An interesting aspect of this feature is that you can recover individual files inside a VM without having to recover the entire VM. This makes the recovery process extremely efficient, with no need for backup administrator intervention.

 

Prism Central: 1-Click Centralized Management

 


 

You can conveniently manage all of your branch, retail, or regional locations and the corresponding data protection policies from a single pane of glass using the simple and intuitive Nutanix Prism Central. It provides centralized infrastructure management, one-click simplicity and intelligence for everyday operations, and insights for capacity planning and forecasting.

 


 

In a nutshell, when it comes to Data Protection and a simple, cost-effective, Full-stack ROBO solution, we are the “diamond standard” and your data is simply protected on Nutanix!

 

 

 

Techy Tidbit: Scientists have discovered a planet that they believe is composed mostly of carbon and is one-third pure diamond; it is named “55 Cancri e” (wonder how they came up with such a name?!). If that’s not enough, scientists have also discovered a star that is essentially a diamond of ten billion trillion trillion carats! It is named Lucy, after the Beatles song “Lucy in the Sky with Diamonds” (now, that’s nice!)

 

 

 

Disclaimer: This blog contains links to external websites that are not part of Nutanix.com. Nutanix does not control these sites and disclaims all responsibility for the content or accuracy of any external site. Our decision to link to an external site should not be considered an endorsement of any content on such site.

AOS 5.0 New Feature: Life Cycle Management

by Community Manager 01-16-2017 09:42 AM - edited 01-16-2017 10:09 AM

This blog was authored by Deepa Pottangadi, Instructional Designer; Nikhil Bhatia, Staff Engineer; Manoj Sudheendra, Member of Technical Staff; and Viswanathan Vaidyanathan, Member of Technical Staff at Nutanix.

 

Life Cycle Management (LCM) is a new feature in AOS 5.0 that enables you to update the software and firmware of your Nutanix clusters. LCM can be installed separately, and going forward, it will have its own release cycle. AOS 5.0 includes the first version of Life Cycle Management, or LCM 1.0.

 

The decoupling of updates from the core Prism functionality will enable you to more efficiently plan your infrastructure upgrade and update cycles.

 

The LCM modules released along with AOS 5.0 will enable you to:

 

  1. Update the LCM framework on all platforms
  2. Detect the firmware versions of BIOS, BMC, HBA, and disk components on NX platforms

 

Because LCM is decoupled from AOS releases, we can deliver LCM updates more frequently. We expect to issue many new updates in the next few months, supporting a wider range of platforms, so stay tuned. Another important point: LCM can update itself. The LCM framework is treated like any other entity that can be detected and updated.

 

The following diagram shows the Inventory page. It displays all of the entities that can be updated using LCM. You will notice that this list includes an entity called Cluster software components. Click the See All link in the Cluster software components section:

 

[Image: LCM Inventory page]

 

Drilling down into the Cluster software components section displays the LCM component. The LCM software version details are displayed as shown in the image below:

 

[Image: LCM software version details]

 

Let’s look at the high-level architecture of LCM:

 

[Image: High-level LCM architecture]

 

Like most services on Nutanix, LCM follows a master/slave architecture. The LCM master selects one node at a time in the cluster to apply updates. Before the master's own node is updated, the master relinquishes its position to an LCM slave on another node in the cluster. LCM persists its configuration in Zookeeper, which is available to all of the nodes in the cluster.
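
The flow can be pictured with a small Python sketch (hypothetical names and simplified logic, purely illustrative, not the actual LCM code): a pre-check gates the operation, updates proceed one node at a time, and the master hands off leadership before its own node is touched.

```python
# Conceptual sketch of LCM's one-node-at-a-time update flow (illustration only).

def precheck(cluster):
    # LCM verifies the cluster state before any update; failure aborts.
    return all(n["healthy"] for n in cluster)

def rolling_update(cluster, master_id):
    if not precheck(cluster):
        raise RuntimeError("pre-check failed; update aborted")
    for node in cluster:
        if node["id"] == master_id:
            # The master relinquishes leadership before its own node is
            # updated, so a slave can drive (and, via the WAL, resume)
            # the currently running operation.
            master_id = next(n["id"] for n in cluster if n["id"] != node["id"])
            print(f"leadership moved to node {master_id}")
        print(f"updating node {node['id']}")  # strictly sequential

cluster = [{"id": i, "healthy": True} for i in range(3)]
rolling_update(cluster, master_id=0)
```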

 

LCM persists its internal state in a write-ahead log (WAL), which is backed by Cassandra and is also available on all nodes when needed (e.g., when the LCM master crashes and another LCM slave needs to acquire leadership and continue the currently running LCM operation).
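
A write-ahead log of this sort can be sketched as follows (an in-memory Python illustration with hypothetical names; the actual WAL is backed by Cassandra and replicated across nodes): intent is recorded before each step, so a new master can find the last incomplete step and resume from there.

```python
# Minimal write-ahead-log sketch (illustration only).

class OperationWAL:
    def __init__(self):
        self.entries = []                 # durable, ordered log records

    def log(self, step, status):
        # Persist intent BEFORE acting, so a new master can replay it.
        self.entries.append({"step": step, "status": status})

    def recover(self):
        # A slave taking over leadership scans the log for the last
        # incomplete step and resumes the operation from there.
        for e in reversed(self.entries):
            if e["status"] != "done":
                return e["step"]
        return None

wal = OperationWAL()
wal.log("update_node_0", "done")
wal.log("update_node_1", "started")      # master crashes mid-step...
print(wal.recover())                     # new master resumes update_node_1
```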

 

LCM has three components:

 

  1. AOS interface: This component is dependent on AOS and is a versioned interface (the current version being 1.0) between AOS and the LCM modules.
  2. Framework module: The framework is the central component; on one hand it interacts with AOS, and on the other it runs the LCM modules. The LCM framework is the main module controlling LCM operations. The framework is organized as a Python module and can therefore be easily upgraded.
  3. LCM modules: The LCM modules abstract entity-level details, such as how to perform inventories and updates on a given platform for a set of entities.

 

LCM operations are executed sequentially on the selected nodes in a cluster. These are irreversible operations and may require service downtime, so plan your tasks in advance before executing an LCM operation. Before performing an update, LCM runs a pre-check to verify the state of the cluster. If the check fails, the update operation is aborted.

 

All LCM operation logs are written to genesis.log and lcm_ops.out. The lcm_ops.out log file records all operations, including successes and failures. In the event of any errors, reach out to Nutanix Support for assistance.

 

The video below shows you how to perform an inventory and update the disks on all nodes: 

 

 

If you are new to Nutanix, we invite you to start the conversation on how the Nutanix Enterprise Cloud Platform can work for your IT environment. Send us a note at info@nutanix.com, follow us on Twitter @nutanix, and join the conversation in our community forums.

 

 

Disclaimer: This blog contains links to external websites that are not part of Nutanix.com. Nutanix does not control these sites and disclaims all responsibility for the content or accuracy of any external site. Our decision to link to an external site should not be considered an endorsement of any content on such site.

 

 

11 Reasons why Nutanix is the Best All-Flash Platform

by Community Manager 12-12-2016 12:10 AM - edited 12-12-2016 05:05 AM


This post was authored by Steve Kaplan, VP of Client Strategy at Nutanix

 

All Flash Arrays: Dead Men Walking

All Flash Array (AFA) manufacturers may be rejoicing in the inevitable demise of spinning disk, but hyperconverged infrastructure (HCI) is increasingly upending the entire storage category. While an AFA may be faster and easier to manage than a traditional array, it's still a SAN. Nutanix Enterprise Cloud is not only a better platform for flash than AFAs, but also better than other HCI solutions. Here are the 11 reasons why:

 

1) Dramatically reduced network latency effects

Nutanix HCI already bests AFA performance by eliminating network latency (see @vcdxnz001's post, Your Network is Too Slow and What to Do About it). Innovations such as NVMe and 3D XPoint amplify the advantage of storing data on flash or other Storage-Class Memory (SCM) next to the compute in a hyperconverged environment. Accessing data in the traditional model, from an All Flash Array over a slower network, negates the benefits of the faster flash/SCM.

 


 

Putting flash in a proprietary array at the end of a network designed for the latency of magnetic media, instead of next to the compute, intuitively makes no sense. This boils down to simple physics: proximity matters. Flash should be directly connected, not remotely attached where multiple hops, protocols, and performance-constraining controllers stand between it and the compute.

 

[Image: I/O path length for AFAs versus Nutanix]

 

AFA vendors will often suggest faster networks and NVMe over fabrics to offer lower latency and higher bandwidth. Nutanix enables customers to optimize the benefits of flash without having to purchase expensive new storage fabrics that perpetuate legacy complexity.

 

[Image: Flash versus network latency, from Long White Virtual Clouds by Michael Webster]

 

2) Density advantage

Nutanix enables packing 92 TB of flash, in addition to all of the server resources, into just 2U. AFAs require not just the array but also the compute, the storage fabric, and possibly lower-cost disk storage. All of this requires more power, rack space, and cooling.

 

3) Commodity Hardware

Most AFAs, such as Pure, utilize proprietary hardware, which is a roadblock to quickly adopting new hardware innovations. All-flash arrays risk technological leaps that leave customers with obsolete products, facing forklift upgrades the next time they need more capacity. In today's fast-paced technology environment, the companies that succeed are those that harness global economies of scale and the innovation driven by the world's largest commodity hardware manufacturers.


Take the case of Sun Microsystems. Sun bet on proprietary hardware while the industry shifted to commodity servers utilizing the more cost-effective Intel-compatible microprocessors popularized in personal computers. Sun lost 80% of its value before being acquired by Oracle in a fire sale.


Violin Memory is another example. Violin was one of the first companies to introduce all-flash memory solutions to the marketplace. This was very cool and fast tech, with great engineering when they launched a decade ago.


But consumers had another idea. They loved the speed and reliability of solid-state drives (SSDs), which can now be found in almost every laptop, desktop, and memory array. Even as the price of SSDs plummeted, Violin preferred to design its own proprietary field-programmable gate arrays. A sophisticated solution, perhaps, but no match for the rapid improvement of SSDs; Violin's proprietary hardware quickly fell behind, and the company has been delisted from the NYSE.


The hyperconverged business is hardly the only example of a thriving enterprise technology built upon commodity hardware. All of the leading cloud providers also utilize commodity servers. Proprietary hardware, while once essential to protecting a company’s innovations, now hinders, or even destroys, a manufacturer’s ability to compete.


4) Distributed storage controllers

Most AFAs have physical, non-distributed storage controllers that are easily saturated with traffic. Since the controllers are the bottleneck, adding more shelves of SSDs does not increase performance.


If we assume a single enterprise SSD can deliver ~500 MB/s of throughput, then a controller with dual 4Gb FC adapters is saturated by only two SSDs. Even upgrading to dual 16Gb FC adapters accommodates only eight SSDs.
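
A quick back-of-envelope calculation confirms these numbers (using raw line rate, where 1 Gb/s is roughly 125 MB/s, and ignoring encoding overhead):

```python
# Back-of-envelope check of the controller-bottleneck numbers above
# (raw line rate, 1 Gb/s ~= 125 MB/s; encoding overhead ignored).
SSD_MBPS = 500                            # assumed per-SSD throughput

for fc_gbps in (4, 16):
    dual_mbps = 2 * fc_gbps * 125         # two FC adapters per controller
    print(f"dual {fc_gbps}Gb FC ~= {dual_mbps} MB/s "
          f"-> saturated by {dual_mbps // SSD_MBPS} SSDs")
# dual 4Gb FC  ~= 1000 MB/s -> saturated by 2 SSDs
# dual 16Gb FC ~= 4000 MB/s -> saturated by 8 SSDs
```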


To overcome these limitations, AFAs must accommodate multiple adapters resulting in complex fabric configurations. But this inevitably hits the controller limits, forcing customers to purchase more AFA systems and creating more silos.


Contrast this with Nutanix, where every node added to a cluster also adds a virtual storage controller, enabling immediately enhanced performance. Resiliency is massively improved, as the loss of one controller has very little impact. This is why Nutanix can do non-disruptive 1-click upgrades and maintenance with very low impact.


5) Data locality

Imagine what would happen if 75% of the cars in Los Angeles were suddenly removed from the roads. Not only would traffic congestion quickly dissipate, but the city would see other benefits as well: fewer accidents, less road maintenance, reduced pollution, and so on.


Nutanix data locality similarly affects the datacenter environment by pulling the majority of read traffic off the network; reads instead come from the local SSD within the node. Available network bandwidth is effectively increased for writes and end-user applications, improving not just storage performance but also the performance of the applications the storage is servicing.

 


 

6) Scalability

Capacity Performance: AFAs, which are typically limited to two physical storage controllers, hit a metadata bottleneck when scaling capacity, limited by the amount of RAM/NVRAM in the system. Adding SSDs, in most cases, does not improve performance.

 

At some point, the AFA customer must either upgrade to a bigger unit with more processing power, add complex fabric interconnects, or start creating silos. AFA manufacturers will say they can replace existing controllers with newer, faster ones, but despite the disruption and expense, that merely shifts the bottleneck to the network or possibly even to the existing flash medium.

 

Contrast this with Nutanix, which, unlike AFAs, is not bottlenecked by two physical storage controllers. The VMs on every node are serviced by the Controller Virtual Machine (CVM) on that node. Every time a node is added to the cluster, a CVM is added with it, linearly scaling not just capacity but also performance and resiliency, as well as expanding the management stack's capabilities. Acropolis Block Services (ABS) and Acropolis File Services (AFS) enable Nutanix customers to scale physical and virtual workloads, as well as file serving, from the same Nutanix cluster, eliminating silo inefficiencies.

 


 

Dedupe/Compression Performance: Nutanix's unique implementation of dedupe and compression ensures that performance overhead is minimized. Nutanix does not brute-force dedupe/compress all data, as doing so requires more physical resources and impacts all I/O regardless of the outcome.

 

Resiliency: Both resiliency and high availability are built in across the entire Nutanix stack. Replication Factor 2 (RF2) or RF3, along with erasure coding (EC-X), enables superior fault tolerance for disks. Block awareness mitigates node failure, while synchronous and asynchronous replication provide resiliency for entire datacenters.
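
As a rough illustration of the replication-factor idea (hypothetical names and a toy placement rule; real placement logic also accounts for blocks, disks, and balance), RF simply means each piece of data is written to RF distinct nodes, so any single failure leaves an intact copy:

```python
# Toy replication-factor placement sketch (illustration only).

def place_copies(extent_id: str, nodes, rf=2):
    # Deterministic starting node, then RF distinct nodes around the ring,
    # so losing any one node still leaves at least one intact copy.
    start = sum(extent_id.encode()) % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(rf)]

print(place_copies("extent-9", ["n1", "n2", "n3", "n4"], rf=2))
print(place_copies("extent-9", ["n1", "n2", "n3", "n4"], rf=3))
```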

 

All-Flash Storage-Only Nodes: Storage-only nodes give Nutanix customers the ability to scale compute and storage separately, minimizing the cost of their all-flash environments.

 

7) Simplicity

Nutanix one-click upgrades reduce both the complexity and the risk involved in the upgrade process; there is no complex interoperability matrix or set of operational guidelines. Nutanix also simplifies the flash-based architecture by eliminating LUNs and their presentation, by focusing on VMs rather than on storage constructs, and by including both centralized management and capacity planning.

 

[Image: Nutanix's simple and intuitive Prism management dashboard]

 

8) Workload Consolidation

AFAs must send information from the flash array across the network to the compute for processing. Beyond adding the aforementioned latency, this also requires additional queue management and overhead. CPUs can quickly become overloaded when simultaneously receiving small-block, high-IOPS and large-block, high-throughput application requests. To ensure consistent performance, AFA administrators must frequently prevent OLTP and OLAP workloads from running on the same platform.

 

Nutanix gives the compute direct access to the storage. Servicing requests with limited overhead and consistently low latency enables the mixing of workloads. And with Nutanix Acropolis Block Services, Nutanix becomes the storage backplane for bringing together different types of applications. Customers can even consolidate physical workloads and virtualized workloads in the same cluster.

 

Additionally, AFA environments tend to require separate arrays for block storage and file storage. With Nutanix, the same storage is shared between block and file.

 


 

9) Proven Mission-Critical Application Deployment

Nutanix enables optimal performance for critical apps right out of the box, even with multiple workloads. It eliminates the single-point-of-failure challenge with storage access failover, self-healing, and ongoing data integrity checks. Storage performance is predictable, and no complex configuration or tuning is needed.

 

Non-disruptive software updates eliminate planned downtime, enhancing Nutanix's appeal for hosting mission-critical applications. Maintenance windows for software upgrades and scaling become a thing of the past. Unlike almost all other HCI solutions, Nutanix has years of proven maturity and success in enterprise deployments of Splunk, Oracle, SAP, SQL Server, Exchange, and many other mission-critical applications (only Nutanix and VxRack are SAP-certified).

 


 

10) Lower Total Cost of Ownership (TCO)

AFAs eventually run out of controller capacity, technology advances to the point where the existing AFA solution is comparatively uneconomical, or the equipment simply gets old. In any of these cases, the AFA owner faces a forklift upgrade, a process that is typically expensive, complex, and time-consuming. As a result, AFA owners typically purchase more capacity than initially required, in hopes of having enough resources available to meet requirements four or five years down the road.

 

Nutanix owners never face a forklift upgrade, and therefore never need to purchase more nodes than required at any point in time. As technology changes, newer nodes can simply be added to the cluster with a mouse click, and the software takes care of everything else. Nutanix eliminates the risk of under-buying.

 

Completely eliminating the need for storage arrays, storage fabric, and excess up-front capacity helps lower the CapEx of a Nutanix deployment. As the project footprint expands over the next few years, fewer and fewer nodes are required to run the same workload, thanks to an increased density of VMs per node driven both by Moore's Law and by performance enhancements in Nutanix software.

 


 

The CapEx over the project lifetime is thereby further reduced, along with the associated rack space, power, and cooling. Administrative requirements for Nutanix are also slashed: an IDC study found an average 71% reduction in administration time for organizations migrating to Nutanix.


11) The Advantage of an Enterprise Cloud Platform

At the end of the day, it's not just about the work, it's about how you do it. Nutanix's web-scale architecture is a unique differentiator, incorporating hyperconvergence as part of an Enterprise Cloud Platform. Distributed technologies such as Cassandra, NoSQL, MapReduce, and Curator enable significantly higher performance and efficiency when optimizing all-flash environments.

 

Data Access: Classic tree-structured metadata query architectures (B-tree and red-black tree) that work well in an array environment, where metadata is stored in each physical controller, are not optimal in all-flash HCI environments. In HCI, the metadata is distributed across many nodes, making tree-structured lookup inefficient. To combat this inefficiency, Nutanix utilizes big-data technologies such as Cassandra and NoSQL to enable very fast lookup and very high fault tolerance. No single point of failure exists.
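
The contrast can be sketched in a few lines of Python (a deliberate simplification with hypothetical names; the real metadata store is a Cassandra-derived ring): a hash of the key identifies the owning node directly, with no multi-hop tree descent across the cluster.

```python
# Sketch of hash-ring metadata lookup (illustration only).
import bisect, hashlib

class MetadataRing:
    def __init__(self, nodes):
        # Each node owns a range of the hash ring.
        self.ring = sorted((self._h(n), n) for n in nodes)
        self.keys = [h for h, _ in self.ring]

    @staticmethod
    def _h(value):
        return int(hashlib.md5(str(value).encode()).hexdigest(), 16)

    def owner(self, key):
        # One hash plus one binary search locates the owning node directly,
        # with no tree traversal spanning multiple cluster nodes.
        i = bisect.bisect(self.keys, self._h(key)) % len(self.keys)
        return self.ring[i][1]

ring = MetadataRing(["node-a", "node-b", "node-c", "node-d"])
print(ring.owner("vdisk:42/extent:7"))   # deterministic single-node lookup
```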


Data Sorting: Unlike legacy 3-tier and other HCI approaches, which sort data in the I/O path, Nutanix evaluates it in the background, enabling better performance. The system scales as nodes are added, allowing faster dedupe and compression with increased data locality. Seamless scalability enables rapid evaluation of whether to promote or demote data depending upon the memory and storage tiers available.


Analytics: Even all-flash environments have different tiers of flash (performance and endurance). Metadata continues to grow, and it can be difficult to cost-effectively keep it in memory or on the fastest tier.


Nutanix has again taken a big-data approach to solve this challenge. A custom-written version of MapReduce/Curator is used to determine key characteristics of the data, including what is hot, compressible, and dedupeable. The same framework similarly determines what data needs to move to another node for data locality, what data has been deleted, and what data needs to be relocated or rebalanced, particularly in the event of failure.
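
As a toy illustration of the approach (hypothetical names and thresholds; Curator's real background jobs are far richer), a map/reduce-style pass over access records can classify extents as hot or cold for tiering decisions:

```python
# Toy map/reduce pass classifying extents by access heat (illustration only).
from collections import defaultdict

access_log = ["e1", "e2", "e1", "e3", "e1"]   # observed extent accesses

# Map: emit (extent, 1) per access; Reduce: sum counts per extent.
# (Shuffle and map phases are collapsed here for brevity.)
counts = defaultdict(int)
for extent in access_log:
    counts[extent] += 1

HOT_THRESHOLD = 2
tiering = {e: ("keep on hot tier" if c >= HOT_THRESHOLD else "demote")
           for e, c in counts.items()}
print(tiering)   # {'e1': 'keep on hot tier', 'e2': 'demote', 'e3': 'demote'}
```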


These analytics enable deeper insight including trending, real-time analysis, proactive monitoring and root cause analysis, and alerting.


Timing: In contrast to other solutions that rely solely on sub-optimal inline compression and proprietary dedupe hardware, Nutanix enables offline sorting with MapReduce/Curator. This allows more writes to land before deciding whether to compress or dedupe, and avoids the need for a performance-limiting centralized database.


Unified Cache: Cache enables data locality. Deduplication makes it possible to store more data in this performance tier and maximize the local cache hit potential. To maximize efficiency without limiting performance, Nutanix performs inline local deduplication of the content cache.

 


 

NVMe: Dead Man Running?

At least one legacy storage manufacturer is promoting NVMe as the future. But migration to NVMe will further amplify the advantage of putting the compute next to the data rather than across the network. It will accelerate the journey to extinction of all the fabric-stretched monoliths, including AFAs.


Thanks for content and edits to @joshodgers @briansuhr @_praburam @Priyadarshi_Pd  @sudheenair @binnygill @vcdxnz001 @RohitGoyal

Learn More

Nutanix Flash Forward Website Landing Page & eBook
Ten Things You Need to Know About Nutanix Acropolis Block Services
Ten Things You Need to Know About Nutanix Acropolis File Services


Disclaimer: This blog contains links to external websites that are not part of Nutanix.com. Nutanix does not control these sites and disclaims all responsibility for the content or accuracy of any external site. Our decision to link to an external site should not be considered an endorsement of any content on such site.

 

 
