Simplifying Mundane IT Tasks with Intelligent Operations

As an IT administrator, you're already optimizing and streamlining work to meet do-more-with-less requirements. Here’s a quick health check: how many hours do you spend checking in on infrastructure resources, including nights and weekends?

In the midst of the pandemic, your organization supports remote work, and most likely you are managing the IT infrastructure remotely as well. With chronic shortages of IT staff and vendor support, it’s a challenge on top of a challenge. Now more than ever, it is essential to minimize manual, repetitive, maintenance-focused tasks.

It’s time to meet the intelligent automation solution from Nutanix. Prism, the dynamic infrastructure management console, provides a 360-degree view of your hyperconverged infrastructure through a single pane of glass. Its AI Operations (AIOps) tier keeps an eye on your infrastructure so you don’t have to. The AIOps tier also responds to resource anomalies, which frees up your time for more strategic work. Let’s find out how!

Get Under the Hood of Intelligent Operations

The AIOps tier of Prism includes performance anomaly detection, capacity planning, custom dashboards, reporting, and advanced search capabilities. Our purpose-built X-FIT (cross-fit) and X-Play (cross-play) tools empower IT pros like you to leverage these capabilities.

With X-FIT, you gain real-time, actionable analysis of performance anomalies, capacity use, and VM efficiency, driven by a machine-learning-powered engine that monitors resource behavior at the VM level. You can then find and fix inefficiencies in your environment without time-consuming, ongoing manual observation and configuration.

To act on these AI insights from X-FIT, you can use X-Play to create automation and build the Playbooks that greatly simplify infrastructure operations. The following sections show how these capabilities help move datacenter management from reactive response to autonomous IT operation in Nutanix HCI environments.

Dynamic Monitoring

Gain better insight into your workloads’ performance requirements. The X-FIT system learns each VM's behavior in real time and establishes a dynamic threshold as a performance baseline for each resource. If a given data point for your VM strays outside the baseline range, the system detects an anomaly and generates an alert. If the anomalous results persist over time, the system learns the new behavior and adjusts the baseline accordingly. You get an early warning on issues that traditional, static threshold monitoring would not otherwise discover, which enables a proactive response that can save you considerable downtime and reduce costs.
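To make the idea of a dynamic threshold concrete, here is a minimal Python sketch. It is not the X-FIT algorithm, which this post does not describe in detail. It assumes a simple rolling baseline (mean plus or minus k standard deviations over a sliding window), and the class name, window size, and k value are all hypothetical.

```python
from collections import deque
from statistics import mean, stdev


class DynamicBaseline:
    """Rolling baseline: flags values outside mean +/- k * stdev of recent samples.

    Illustrative only; the window, k, and min_samples values are assumptions.
    """

    def __init__(self, window=10, k=3.0, min_samples=5):
        self.samples = deque(maxlen=window)  # old samples fall out automatically
        self.k = k
        self.min_samples = min_samples

    def observe(self, value):
        """Return True if value is anomalous against the current baseline, then learn it."""
        anomalous = False
        if len(self.samples) >= self.min_samples:
            mu = mean(self.samples)
            sigma = stdev(self.samples)
            # Guard against a zero-width band when all samples are identical.
            band = self.k * max(sigma, 1e-9)
            anomalous = abs(value - mu) > band
        # Every sample is learned, so persistent new behavior shifts the baseline.
        self.samples.append(value)
        return anomalous


baseline = DynamicBaseline()
for cpu_pct in [50, 51, 49, 50, 52, 48, 50, 51, 49, 50]:
    baseline.observe(cpu_pct)

spike_flagged = baseline.observe(95)  # far outside the learned range
```

Because every sample is appended to the window, a value that keeps recurring eventually becomes the new normal and stops raising alerts, which mirrors the "learns the new behavior" step described above.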

Capacity Runway

With capacity analysis, you can understand how applications use infrastructure resources by focusing on consumption from three resource buckets:

  • Storage capacity
  • CPU
  • Memory

Based on this usage, the system generates a "capacity runway" — the number of days remaining before the resource item is fully consumed. With this information, your cloud enterprise platform can deliver just-in-time provisioning of infrastructure resources.

X-FIT considers the resources already consumed and the rate at which the system consumes additional resources. Based on this historical data, its algorithms perform capacity calculations. Storage runway calculations factor in live usage, system usage, reserved capacity, and snapshot capacity for your system.

The storage capacity runway calculation is container-aware, calculating capacity needs even when containers are growing at different rates but consuming resources from a single storage pool. This awareness enables X-FIT to create more accurate runway estimates. You will get recommendations on how to optimize existing capacity and performance.
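As a rough illustration of a container-aware runway estimate, the Python sketch below fits a straight-line trend to each container's daily usage, sums the growth rates, and divides the pool's remaining capacity by the combined rate. The linear trend and the function names are assumptions made for this example; X-FIT's actual algorithms are not documented here.

```python
def trend_slope(series):
    """Least-squares slope of a usage series, in units per day."""
    n = len(series)
    if n < 2:
        raise ValueError("need at least two samples")
    x_mean = (n - 1) / 2
    y_mean = sum(series) / n
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(series))
    denominator = sum((x - x_mean) ** 2 for x in range(n))
    return numerator / denominator


def pool_runway_days(container_usage, pool_capacity):
    """Days until a shared storage pool fills, given per-container daily usage.

    container_usage maps a container name to its list of daily usage samples.
    Returns None when combined usage is flat or shrinking.
    """
    used_now = sum(series[-1] for series in container_usage.values())
    growth_per_day = sum(trend_slope(series) for series in container_usage.values())
    if growth_per_day <= 0:
        return None
    return max(pool_capacity - used_now, 0) / growth_per_day


# Two containers growing at different rates but drawing on one pool (values in TiB).
usage = {
    "ctr-a": [10, 20, 30],  # grows 10 per day
    "ctr-b": [5, 6, 7],     # grows 1 per day
}
runway = pool_runway_days(usage, pool_capacity=100)
```

Summing per-container trends, rather than fitting one line to total pool usage, is what lets a fast-growing container dominate the estimate even while other containers stay nearly flat.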

Right-Sizing VMs

Take the time to identify VMs that are not optimally configured before they cause performance degradation. Right-sizing VMs is critical to achieving the best performance for your infrastructure.

A panel on the Prism console displays this data, broken into four different categories for easy identification of problems:

  • Overprovisioned – VMs that are too big for the job required, leading to resource waste.
  • Inactive – zombie VMs that have been inactive for more than 30 days.
  • Constrained – VMs that do not have enough resources to function efficiently.
  • Bully – a set of VMs that consume the majority of resources, causing other VMs to starve for capacity.
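The four categories above can be pictured as a simple rule-based classifier. In the Python sketch below, only the 30-day inactivity window comes from this post. Every other threshold, the field names, and the use of a single VM's cluster share to spot a bully are hypothetical simplifications, not Prism's actual criteria.

```python
from dataclasses import dataclass

# Only the 30-day inactivity window comes from the text above; the other
# thresholds are hypothetical values chosen purely for illustration.
INACTIVE_DAYS = 30
BULLY_SHARE_PCT = 40.0
CONSTRAINED_PCT = 90.0
OVERPROVISIONED_PCT = 20.0


@dataclass
class VmStats:
    name: str
    peak_cpu_pct: float       # peak CPU use as a % of the VM's allocation
    peak_mem_pct: float       # peak memory use as a % of the VM's allocation
    days_inactive: int
    cluster_share_pct: float  # share of cluster resources this VM consumes


def classify(vm: VmStats) -> str:
    """Return one of: inactive, bully, constrained, overprovisioned, ok."""
    busiest = max(vm.peak_cpu_pct, vm.peak_mem_pct)
    if vm.days_inactive > INACTIVE_DAYS:
        return "inactive"
    if vm.cluster_share_pct > BULLY_SHARE_PCT:
        return "bully"
    if busiest > CONSTRAINED_PCT:
        return "constrained"
    if busiest < OVERPROVISIONED_PCT:
        return "overprovisioned"
    return "ok"


fleet = [
    VmStats("build-01", 95.0, 50.0, 0, 5.0),
    VmStats("legacy-07", 5.0, 10.0, 45, 1.0),
]
labels = {vm.name: classify(vm) for vm in fleet}
```

The order of the checks matters in this sketch: an inactive VM is reported as inactive even though its low usage would also make it look overprovisioned.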

You can manage the entire infrastructure globally — allocating resources from one area to another within a virtualized environment via a single console. Existing resources can be reclaimed or boosted with additional capacity as needed.

Capacity Planning

Predict CPU, memory, network, and storage needs based on current and historical consumption trends. With Prism’s capacity planning function, you can plan for peak and non-peak usage periods.

When you can’t reclaim enough resources, or when you need to scale the overall environment, the planning function can make node-based recommendations. This functionality uses X-FIT data to account for consumption rates and growth, helping you meet the target runway period.

X-FIT currently supports workload planning for VMs (CPU, storage, and memory), cluster-level resources, and SQL Server, with more options to come. For these, X-FIT provides percentage modeling of an increase or decrease in overall capacity demand.

Use data from X-FIT and workload models that have been carefully curated over time to inform future capacity planning. For example, you can model scenarios to determine how adding new workloads to a cluster may affect your capacity.
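The arithmetic behind this kind of what-if planning can be sketched in a few lines of Python. This is a simplified model, not the Prism planning function: it assumes linear growth, treats a demand change as a single percentage applied to both current usage and growth, and uses a hypothetical fixed capacity per node.

```python
import math


def runway_days(used, growth_per_day, capacity):
    """Days until capacity is exhausted under linear growth."""
    if growth_per_day <= 0:
        return math.inf
    return max(capacity - used, 0) / growth_per_day


def nodes_to_add(used, growth_per_day, capacity, node_capacity,
                 target_days, demand_change_pct=0.0):
    """Nodes needed to reach target_days of runway after a % change in demand.

    node_capacity is a hypothetical usable capacity per added node.
    """
    factor = 1 + demand_change_pct / 100
    # Capacity required to cover scaled usage plus scaled growth over the target period.
    required = used * factor + growth_per_day * factor * target_days
    shortfall = required - capacity
    if shortfall <= 0:
        return 0
    return math.ceil(shortfall / node_capacity)


# 60 TiB used of 100 TiB, growing 1 TiB per day, with 25 TiB per added node.
today = runway_days(used=60, growth_per_day=1, capacity=100)
baseline_nodes = nodes_to_add(60, 1, 100, node_capacity=25, target_days=90)
growth_nodes = nodes_to_add(60, 1, 100, node_capacity=25, target_days=90,
                            demand_change_pct=20)
```

Running the same model with and without the percentage change shows how much of a node recommendation comes from organic growth and how much from the new workload being modeled.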

Automating Responses with X-Play Playbooks

With X-Play, you can create intelligent Playbooks that fully leverage the machine learning data, accurately reflecting the inner workings of your infrastructure. Prism X-Play reduces automation complexity, allowing you to create helpful workflows in just a few clicks without writing a single line of code. Improve your productivity by automating your day-to-day operational tasks and freeing up time for more meaningful work.

Automate a common set of activities by creating Playbooks. Building a Playbook in X-Play involves three steps:

  • Set a trigger (such as an alert trigger)
  • Define one or more actions
  • Save and enable

Configure an alert trigger for any alert policy, including those generated in response to learned behaviors. For example, Prism can raise an alert in response to VM inefficiency, including constrained, overprovisioned, bully, and inactive VMs.

Use the out-of-the-box VM Memory Constrained alert policy as a trigger to create a Playbook, which can have one or more actions linked together. Playbooks can automate all the steps that you previously had to do manually, such as snapshotting the VM, adding memory, resolving the alert, or sending out an email, saving you time.
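X-Play itself requires no code, but the trigger-then-actions structure of a Playbook is easy to picture as a small data model. The Python sketch below is purely conceptual: the Playbook class, the alert dictionary shape, and the action functions are hypothetical and are not part of any Nutanix API. The actions only record what they would do.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Action = Callable[[Dict], None]
log: List[str] = []  # stands in for real side effects in this sketch


def snapshot_vm(alert: Dict) -> None:
    log.append(f"snapshot {alert['vm']}")


def add_memory(alert: Dict) -> None:
    log.append(f"add memory to {alert['vm']}")


def resolve_alert(alert: Dict) -> None:
    log.append(f"resolve {alert['policy']}")


def send_email(alert: Dict) -> None:
    log.append(f"email admin about {alert['vm']}")


@dataclass
class Playbook:
    name: str
    trigger_policy: str                      # alert policy that fires the Playbook
    actions: List[Action] = field(default_factory=list)
    enabled: bool = False                    # mirrors the "save and enable" step

    def handle(self, alert: Dict) -> bool:
        """Run all actions in order if the Playbook is enabled and the alert matches."""
        if not self.enabled or alert.get("policy") != self.trigger_policy:
            return False
        for action in self.actions:
            action(alert)
        return True


playbook = Playbook(
    name="Fix constrained VM",
    trigger_policy="VM Memory Constrained",
    actions=[snapshot_vm, add_memory, resolve_alert, send_email],
    enabled=True,
)
ran = playbook.handle({"policy": "VM Memory Constrained", "vm": "web-01"})
```

Running the snapshot before the memory change keeps a restore point in case the change needs to be rolled back, which is why the order of linked actions matters.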

Next Steps

We have looked at the following capabilities to minimize manual, repetitive maintenance tasks: 

  • Balance the capacity needs of your global resources via a single console using Prism. Its machine-learning-powered algorithms generate meaningful insights into your VMs' performance and automatically identify VMs that are not optimally configured, a time-saver for the day-to-day management of complex infrastructure.
  • Plan the capacity runway of your storage, CPU, and memory configurations to improve efficiency and responsiveness. You can make your infrastructure management proactive instead of reactive. 
  • Simplify automation for your entire system. With just a few clicks, you can automate day-to-day operational tasks: no code is needed.

But seeing is believing: take the Nutanix Test Drive and try managing your infrastructure with Nutanix Prism today! All you need is a browser; no hardware, setup, or download is required. See firsthand how you can improve team workflows and advance your infrastructure with automation.

This post was authored by Mayank Gupta and Sachi Sawamura

