by aluciani 01-16-2017 09:42 AM - edited 01-16-2017 10:09 AM
This blog is authored by Deepa Pottangadi, Instructional Designer; Nikhil Bhatia, Staff Engineer; Manoj Sudheendra, Member Technical Staff; and Viswanathan Vaidyanathan, Member Technical Staff, at Nutanix
Life Cycle Management (LCM) is a new feature in AOS 5.0 that enables you to update the software and firmware of your Nutanix clusters. LCM can be installed separately, and going forward, it will have its own release cycle. AOS 5.0 includes the first version of Life Cycle Management, or LCM 1.0.
The decoupling of updates from the core Prism functionality will enable you to more efficiently plan your infrastructure upgrade and update cycles.
The LCM modules released along with AOS 5.0 will enable you to:
Update LCM framework on all platforms
Detect the firmware version of BIOS/BMC/HBA/Disk on NX platforms
As LCM is decoupled from the AOS releases, we can deliver LCM updates more frequently. We expect to issue many new updates in the next few months, which will support a wider range of platforms, so stay tuned. Another important point to note is that LCM can update itself. The LCM framework is treated like any other entity, which can be detected and updated.
The following diagram shows the Inventory page. It displays all of the entities that can be updated using LCM. Notice that this list includes an entity called Cluster software components. Click the See All link in the Cluster software components section:
The drill-down into the Cluster software component section displays the LCM component. The LCM software version details are displayed as shown in the image below:
Let’s look at the high-level architecture of LCM:
Similar to most services on Nutanix, LCM follows a master/slave architecture. The LCM master selects one node at a time in the cluster and applies the updates to it. Before the node that hosts the LCM master is itself updated, the master relinquishes its position to an LCM slave on another node in the cluster. LCM persists its configuration in Zookeeper, which is available to all of the nodes in the cluster.
LCM persists its internal state in a write-ahead log (WAL), which is backed by Cassandra and is also available on all nodes when needed (e.g., when the LCM master crashes and another LCM slave needs to acquire leadership and continue the currently running LCM operation).
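To make the role of the write-ahead log concrete, here is a minimal Python sketch of how a new leader could pick up an operation where a crashed leader left off. This is not LCM's actual code: the WriteAheadLog class, the run_update function, and the crash_after parameter are all hypothetical, and a plain in-memory list stands in for the Cassandra-backed log.

```python
# Illustrative sketch only: every name here is hypothetical, and a plain
# list stands in for the Cassandra-backed log that all nodes can read.
from typing import List, Optional, Set, Tuple


class WriteAheadLog:
    """Shared record of progress: (leader, node, state) tuples."""

    def __init__(self) -> None:
        self.records: List[Tuple[str, str, str]] = []

    def append(self, leader: str, node: str, state: str) -> None:
        self.records.append((leader, node, state))

    def completed_nodes(self) -> Set[str]:
        return {node for _, node, state in self.records if state == "done"}


def run_update(leader: str, nodes: List[str], wal: WriteAheadLog,
               crash_after: Optional[int] = None) -> bool:
    """Update nodes one at a time, logging each step before moving on.

    Returns True when every node is done, or False if the leader
    "crashed" after updating crash_after nodes (a test hook).
    """
    updated = 0
    for node in nodes:
        if node in wal.completed_nodes():
            continue  # an earlier leader already finished this node
        if crash_after is not None and updated == crash_after:
            return False  # simulate the leader going away mid-operation
        wal.append(leader, node, "started")
        # The real update work would happen here.
        wal.append(leader, node, "done")
        updated += 1
    return True


wal = WriteAheadLog()
cluster = ["node-a", "node-b", "node-c"]
run_update("node-a", cluster, wal, crash_after=1)  # first leader stops early
run_update("node-b", cluster, wal)                 # new leader resumes
print(sorted(wal.completed_nodes()))
```

Because progress lives in the shared log rather than in the leader's memory, the second call skips the node that was already finished and updates only the remaining ones.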
LCM has three components:
AOS interface: This component is dependent on the AOS and is a versioned interface (the current version being 1.0) that interacts with the AOS and LCM modules.
Framework module: The framework is the central component and the main module controlling LCM operations. On one side it interacts with AOS, and on the other it runs the LCM modules. The framework is organized as a Python module and can therefore be upgraded easily.
LCM modules: The LCM modules abstract the entity level details, such as how to perform inventories and updates on a given platform for a set of entities.
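The split between the framework and the modules can be pictured with a short Python sketch. The class and method names below (LcmModule, FakeDiskModule, Framework, inventory, update) are assumptions made for illustration and are not the real LCM interfaces. The point is only that the framework drives the workflow while each module hides the entity-specific details.

```python
# Hypothetical sketch of a framework/module split; not the real LCM API.
from abc import ABC, abstractmethod
from typing import Dict, List


class LcmModule(ABC):
    """Contract a module fulfils for one kind of entity."""

    entity = "unknown"

    @abstractmethod
    def inventory(self, node: str) -> str:
        """Return the installed version of the entity on a node."""

    @abstractmethod
    def update(self, node: str, target_version: str) -> None:
        """Bring the entity on a node to target_version."""


class FakeDiskModule(LcmModule):
    """Toy module that keeps versions in memory instead of touching disks."""

    entity = "disk"

    def __init__(self, versions: Dict[str, str]) -> None:
        self._versions = dict(versions)

    def inventory(self, node: str) -> str:
        return self._versions[node]

    def update(self, node: str, target_version: str) -> None:
        self._versions[node] = target_version


class Framework:
    """Runs registered modules without knowing any entity-level details."""

    def __init__(self) -> None:
        self._modules: Dict[str, LcmModule] = {}

    def register(self, module: LcmModule) -> None:
        self._modules[module.entity] = module

    def inventory(self, nodes: List[str]) -> Dict[str, Dict[str, str]]:
        return {
            entity: {node: module.inventory(node) for node in nodes}
            for entity, module in self._modules.items()
        }

    def update(self, entity: str, nodes: List[str], target_version: str) -> None:
        module = self._modules[entity]
        for node in nodes:  # one node at a time
            if module.inventory(node) != target_version:
                module.update(node, target_version)


framework = Framework()
framework.register(FakeDiskModule({"n1": "1.0", "n2": "1.1"}))
framework.update("disk", ["n1", "n2"], "1.1")
print(framework.inventory(["n1", "n2"]))
```

With this shape, supporting a new platform or entity means adding a module, while the framework code stays unchanged.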
LCM operations are executed sequentially on the selected nodes in a cluster. These operations are irreversible and may require service downtime, so plan your tasks before executing an LCM operation. Before performing an update, LCM runs a pre-check to verify the state of the cluster. If the check fails, the update operation is aborted.
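The pre-check-then-update flow described above can be sketched in a few lines of Python. The function and exception names are hypothetical, and the pre-check here is just a callable that returns a list of problems; the actual checks LCM performs are not covered in this post.

```python
# Hypothetical sketch of "pre-check first, then update sequentially".
from typing import Callable, List


class PrecheckFailed(Exception):
    """Raised when the cluster is not in a state that is safe to update."""


def rolling_update(nodes: List[str],
                   precheck: Callable[[List[str]], List[str]],
                   apply_update: Callable[[str], None]) -> List[str]:
    """Abort before touching any node if precheck reports problems."""
    problems = precheck(nodes)
    if problems:
        raise PrecheckFailed("; ".join(problems))
    updated = []
    for node in nodes:  # strictly one node after another
        apply_update(node)
        updated.append(node)
    return updated


def all_healthy(nodes: List[str]) -> List[str]:
    return []  # a real check would inspect cluster state


print(rolling_update(["n1", "n2", "n3"], all_healthy, lambda node: None))
```

Running the check before the loop matters because the operations are irreversible: a failed check stops the operation while every node is still untouched.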
All of the LCM operation logs are written to genesis.log and lcm_ops.out. The lcm_ops.out log file records all operations, including successes and failures. In the event of any errors, reach out to Nutanix support for assistance.
The video below shows you how to perform an inventory and update the disks on all nodes:
If you are new to Nutanix, we invite you to start the conversation on how the Nutanix Enterprise Cloud Platform can work for your IT environment. Send us a note at email@example.com, follow us on Twitter (@Nutanix), and join the conversation in our community forums.
Disclaimer: This blog contains links to external websites that are not part of Nutanix.com. Nutanix does not control these sites and disclaims all responsibility for the content or accuracy of any external site. Our decision to link to an external site should not be considered an endorsement of any content on such site.
by aluciani 01-12-2017 07:40 AM - edited 01-18-2017 09:41 AM
This blog is authored by Harry Yang, Principal Product Manager at Nutanix
In part 1, we shared how Prism has evolved from a single cluster management front end to a complete infrastructure management solution. In this blog, we will introduce the new features that we have developed as part of the 5.0 release.
From speaking with our customers, we have learned that enterprises of all sizes want to align IT with their business demands. The three challenges identified as part of this effort include:
The risk of impeding business growth when customers aren’t sure when to expand or optimize their infrastructure. Many times, they are caught by surprise.
Inefficient use of capital when customers don’t have an easy way to match IT infrastructure expansion with business demands.
Wasted time of expensive IT staff when IT teams take days to collect data, analyze, and figure out what is needed. This time can be better spent on addressing higher value tasks.
Since the first release in early 2016, Prism Pro has included an X-FIT powered capacity consumption trending feature. This feature removes the guesswork, lets you easily understand your current resource behavior, and alerts you precisely when you need to consider expansion or optimization.
In the 5.0 release, Prism Pro adds a new feature called Just-in-Time Forecast. Powered by X-FIT and designed with simplicity, it gives you control over planning and optimizing your resources to align with workload demands. Working together, capacity consumption trending and Just-in-Time Forecast will help you overcome the challenges mentioned above.
There is also a new menu item in the Prism Central console called “Planning” that will help you identify any possible shortages. The Planning page lists the results of runways (the number of days left that your resource can sustain the workloads) of all your clusters.
You can click on any cluster to learn details of the current consumption behavior. If the cluster runs short, you can start the Just-in-Time Forecast flow to find out when and how much capacity will be needed.
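To illustrate what a runway number means, here is a deliberately simple Python sketch that fits a straight line to recent daily usage and reports how many days remain until the line reaches capacity. This is not X-FIT, whose algorithms are not described in this post. The function name, the linear model, and the sample numbers are all assumptions for illustration.

```python
# Simplified illustration of a "runway" estimate using a least-squares
# straight line. This is NOT X-FIT; names and numbers are made up.
from typing import Optional, Sequence


def runway_days(daily_usage: Sequence[float], capacity: float) -> Optional[float]:
    """Days from the latest sample until the trend line reaches capacity.

    Returns None when usage is flat or falling (no projected shortage).
    """
    n = len(daily_usage)
    if n < 2:
        raise ValueError("need at least two daily samples")
    mean_x = (n - 1) / 2
    mean_y = sum(daily_usage) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(daily_usage))
    slope = sxy / sxx  # usage growth per day
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    day_full = (capacity - intercept) / slope  # day index when the line hits capacity
    return max(0.0, day_full - (n - 1))


# Usage in TiB over five days, on a cluster with 28 TiB of capacity.
print(runway_days([10, 12, 14, 16, 18], capacity=28))
```

In this sample, usage grows by 2 TiB per day from 18 TiB, so the estimate is 5 days. Real workloads are rarely this linear, which is why a trend model that learns from the workload's behavior is more useful than a straight line.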
Using the Just-in-Time Forecast page, you will see:
How much expansion is needed if the cluster anticipates a capacity shortage
Whether a cluster has enough runway after adding new workloads
Whether you will need to expand the cluster if the current workload changes its behavior (e.g., high workload volume due to a demand surge, M&A, marketing promotions, etc.)
When and how much capacity you will need for a new cluster to support its workloads
Which cluster is the optimal place to support your new workloads
The impact on capacity if you move nodes into your production cluster from your staging environment
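As a back-of-the-envelope illustration of the first question in the list above (how much expansion is needed), the following Python sketch computes how many nodes to add so that a cluster keeps a target runway. It assumes constant daily growth and identical nodes. The function name and parameters are hypothetical, and the arithmetic is far simpler than what Just-in-Time Forecast does.

```python
# Hypothetical what-if arithmetic; assumes constant daily growth and
# identical nodes. Not the Just-in-Time Forecast algorithm.
import math


def nodes_to_add(capacity: float, usage: float, growth_per_day: float,
                 node_capacity: float, target_days: int) -> int:
    """Number of nodes needed so that capacity lasts target_days."""
    projected_usage = usage + growth_per_day * target_days
    shortfall = projected_usage - capacity
    if shortfall <= 0:
        return 0  # the current cluster already has enough runway
    return math.ceil(shortfall / node_capacity)


# 100 TiB cluster, 80 TiB used, growing 1 TiB per day, 20 TiB per node.
print(nodes_to_add(100, 80, 1, 20, target_days=90))
```

With these sample numbers, usage is projected to reach 170 TiB in 90 days, leaving a 70 TiB shortfall that rounds up to 4 nodes.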
This feature will enable you to better align IT with your business because:
It is powered by X-FIT machine learning technology and is designed to accurately reflect when and what you need, which removes the risk of impeding your business growth.
It is designed to enable pay-as-you-grow planning. The recommendation engine lays out the timeline of your resource onboarding schedule that is closely aligned with your business growth. As a result, your capital will be better used.
It has built-in simplicity, including one-click recommendations and workload-friendly scenario definitions. One customer reported that he can now complete planning tasks within 5 minutes, compared to the 3 days that he used to spend.
Search Enhancements on Monitoring and Troubleshooting
We introduced the new Prism Search capability as part of the Prism Pro release earlier this year. Many of our customers are already using it to quickly locate a resource among the hundreds or thousands that they manage. This release extends its use to monitoring and troubleshooting.
There are two major enhancements to the Prism Search feature in the 5.0 release. First, you can now search alerts by their titles, sources, and categories. This is particularly useful when you want to investigate a cluster of alerts which may point to a common root cause.
Second, you can now use search to identify a hidden issue by searching the corresponding symptom. In this release, you will be able to use an expression in the search string, and combine the expression with a metric to find VMs with a specific symptom. For example, you can spot zombie VMs by searching all VMs whose IOPS < 20.
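The idea of combining a metric with an expression can be illustrated with a toy Python filter. This does not reproduce the Prism Search grammar or its API. The query syntax accepted here, the search_vms function, and the sample VM records are assumptions made for the example.

```python
# Toy illustration of "metric + expression" filtering. The query grammar,
# function name, and VM records are hypothetical, not Prism Search itself.
import operator
import re
from typing import Dict, List

_OPS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "=": operator.eq,
}
# metric name, comparison operator, numeric threshold
_QUERY = re.compile(r"^\s*(\w+)\s*(<=|>=|<|>|=)\s*(\d+(?:\.\d+)?)\s*$")


def search_vms(vms: List[Dict], query: str) -> List[str]:
    """Return names of VMs whose metric satisfies a query like 'IOPS < 20'."""
    match = _QUERY.match(query)
    if not match:
        raise ValueError(f"unsupported query: {query!r}")
    metric = match.group(1).lower()
    compare = _OPS[match.group(2)]
    threshold = float(match.group(3))
    return [vm["name"] for vm in vms
            if metric in vm and compare(vm[metric], threshold)]


vms = [
    {"name": "web-1", "iops": 350},
    {"name": "old-test", "iops": 3},
    {"name": "batch", "iops": 20},
]
print(search_vms(vms, "IOPS < 20"))
```

Here only old-test falls under the threshold, which is how a low-activity "zombie" VM stands out from the busy ones.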
To make Prism Search more accessible, we have also revamped its landing page. When you open the Prism Search page, you will find a list of sample queries that you can use in your environment. Some of them even contain your resource names, because Prism Search generates these sample search strings directly from your environment. This lets you not only familiarize yourself with the new feature very quickly, but also put it to use immediately.
All Prism Pro features live inside the VM that hosts the Prism Central console. In this release, we increased the number of VMs that Prism Pro can support to 12,500, a 250% increase from just six months ago.
We invite you to take a personal guided tour of Prism, and review a series of blogs that will show you the different interfaces and new features inside Prism.
With every innovation cycle, we strive to provide you with greater value, simpler flows, and broader coverage of your day-to-day IT operations. In order to do that, we need to hear from you. Please join the community forum and share your thoughts and experiences with us. Let’s go on this journey together to make data center management easier than ever!