Planning to upgrade the entire Nutanix environment? Here are a few tips and tricks to plan and perform the activity in a smooth manner.
If you are new to Nutanix upgrades, check out this KB for an overview of upgrades at Nutanix.
If you are unsure which component to upgrade first, the generic order below can help.
Step | Component | Target | Reboots | On the fly | Quick links |
1 | NCC on Prism Central | NCC framework | Health service on PC | Yes | |
2 | Prism Central | PCVM | Prism Central VM | No | |
3 | NCC on Cluster | NCC framework | Health service on all CVMs | Yes | |
4 | AOS | CVM | All CVMs, one at a time | No | |
5 | LCM Inventory | LCM Framework | LCM related services | Yes | |
6 | Software upgrades - LCM | Corresponding software components | Related components | Depends | |
7 | AHV | Host | Hosts, one at a time | No | |
8 | Firmware upgrades - LCM | Corresponding firmware components | Hosts, one at a time | No | |
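The table above can also be kept as data, so that any subset of components you plan to upgrade is sorted into the generic order automatically. The following is only a minimal sketch: the names `UpgradeStep`, `plan` and `has_disruptive_step` are made up for illustration, and treating every row that is not marked "Yes" under "On the fly" as potentially disruptive is an assumption, not a Nutanix rule.

```python
from typing import NamedTuple


class UpgradeStep(NamedTuple):
    step: int
    component: str
    on_the_fly: str  # "Yes", "No" or "Depends", copied from the table


# Rows transcribed from the table above.
UPGRADE_ORDER = [
    UpgradeStep(1, "NCC on Prism Central", "Yes"),
    UpgradeStep(2, "Prism Central", "No"),
    UpgradeStep(3, "NCC on Cluster", "Yes"),
    UpgradeStep(4, "AOS", "No"),
    UpgradeStep(5, "LCM Inventory", "Yes"),
    UpgradeStep(6, "Software upgrades - LCM", "Depends"),
    UpgradeStep(7, "AHV", "No"),
    UpgradeStep(8, "Firmware upgrades - LCM", "No"),
]


def plan(components):
    """Return the requested components sorted into the generic order."""
    wanted = set(components)
    known = {s.component for s in UPGRADE_ORDER}
    unknown = wanted - known
    if unknown:
        raise ValueError(f"not in the generic order: {sorted(unknown)}")
    return [s for s in UPGRADE_ORDER if s.component in wanted]


def has_disruptive_step(steps):
    """Assumption: anything not marked 'Yes' under 'On the fly' may be disruptive."""
    return any(s.on_the_fly != "Yes" for s in steps)
```

For example, `plan(["AHV", "AOS", "NCC on Cluster"])` returns the three steps in the order NCC on Cluster, AOS, AHV, regardless of the order in which they were requested.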
Would you like to understand, at a high level, what you are getting into during an upgrade? Check this link for a simplified description of what happens during the Nutanix core software upgrade process.
- PRO-TIP: AOS upgrades proceed across the cluster one CVM at a time. User VMs are not impacted: VM I/O that the local CVM would normally handle is forwarded to other healthy CVMs in the cluster, so production is not affected. An AOS upgrade does NOT need downtime, but it can become disruptive if two or more CVMs go down due to an unforeseen problem. It is therefore strongly recommended to perform the upgrade during non-production hours.
- PRO-TIP: AHV and firmware upgrades follow a graceful 'one node at a time' approach, and migration of user VMs during this activity is governed by the configuration on the hypervisor. They are safe to perform during production hours, but it is strongly recommended to plan them for a maintenance window because they involve host reboots.
Planning the upgrades:
Not finding the target version in the Upgrade Software panel? The Upgrade Software dialog box in the web console lists the software versions available for upgrade. If a version is listed, it is compatible with your cluster and you can upgrade to it at any time. If a version is not listed, it might not be available for "1-click" download. Depending on compatibility, you can still upgrade to these unlisted versions manually by downloading the binaries and metadata files from the Nutanix Portal and uploading them through the web console. In some cases, you might need to go through an intermediate version as part of a multi-step upgrade to reach your desired version.
Before starting the upgrades, draw up an upgrade outline based on the parameters listed below. This helps avoid upgrade initiation issues caused by incompatibilities and upgrade path restrictions.
Compatibility Matrix:
Planning to upgrade but worried about compatibility between the hardware, hypervisor, AOS and Guest OS? Browse through our Compatibility Matrix to check Hardware-Hypervisor-AOS compatibility. The AHV Guest OS matrix section lists the major and minor Guest OS release versions qualified and supported by Nutanix.
- PRO-TIP: This compatibility matrix shows the AOS, AHV, and hypervisor compatibility for Nutanix NX and SX Series platforms and SW-Only Models qualified by Nutanix (such as Cisco UCS, Dell PowerEdge, HPE ProLiant and others listed here). For other platforms not listed here, such as Dell XC, Lenovo HX, and others, please see your vendor documentation for compatibility.
Upgrade Paths:
If you are planning to upgrade components like AOS, Prism Central or Files, always check the Upgrade Path Matrix before devising the upgrade plan. This matrix lists the target versions supported for upgrade from each source version. If your source-to-target upgrade is not supported in a single jump, use a multi-step upgrade through supported intermediate versions.
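Working out a multi-step upgrade amounts to finding a route through the supported source-to-target pairs. The sketch below illustrates the idea with a breadth-first search that returns a route with the fewest hops. The version labels "A" to "E" and the `SUPPORTED_UPGRADES` mapping are hypothetical placeholders, not real releases; in practice you would fill the mapping by hand from the Upgrade Path Matrix.

```python
from collections import deque

# HYPOTHETICAL data: placeholder labels, not real Nutanix versions.
# Each key maps a source version to the targets supported in one jump.
SUPPORTED_UPGRADES = {
    "A": ["B", "C"],
    "B": ["C", "D"],
    "C": ["D"],
    "D": ["E"],
    "E": [],
}


def shortest_upgrade_path(source, target, supported):
    """Breadth-first search for a route with the fewest hops.

    Returns the list of versions from source to target, or None if the
    target cannot be reached through the supported jumps.
    """
    if source == target:
        return [source]
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        for nxt in supported.get(path[-1], []):
            if nxt in seen:
                continue
            if nxt == target:
                return path + [nxt]
            seen.add(nxt)
            queue.append(path + [nxt])
    return None
```

With the placeholder data, going from "A" to "E" needs two intermediate versions, while asking for a route from "E" back to "A" returns `None` because no supported jump leads there.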
Software Product Interoperability:
Do you have Prism Central and Nutanix Files set up in your cluster? Before deciding on an upgrade plan, refer to the Software Product Interoperability matrix to work out a reliable upgrade pathway to your target versions. If your current versions do not allow a single-hop upgrade of all entities as per this matrix, devise a back-and-forth plan that upgrades the entities in alternating steps, so that they remain on interoperable versions after every step.
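A back-and-forth plan can be checked on paper before you touch the cluster: apply each planned step to the current versions and confirm that the resulting pair is still listed as interoperable. Below is a rough sketch of that check. The labels such as "pc-1" and "files-1", the `INTEROPERABLE` set and the function name `first_bad_step` are all hypothetical and would have to be replaced with entries taken from the Software Product Interoperability matrix.

```python
# HYPOTHETICAL data: placeholder labels, not real product versions.
# Each tuple is a (Prism Central, Files) pair assumed to be interoperable.
INTEROPERABLE = {
    ("pc-1", "files-1"),
    ("pc-2", "files-1"),
    ("pc-2", "files-2"),
    ("pc-3", "files-2"),
    ("pc-3", "files-3"),
}


def first_bad_step(start, steps, interoperable):
    """Apply the steps in order and report the first one that breaks interop.

    start maps "pc" and "files" to their current versions.
    steps is a list of (entity, new_version) tuples.
    Returns the index of the first offending step, or None if all pass.
    """
    state = dict(start)
    for index, (entity, version) in enumerate(steps):
        state[entity] = version
        if (state["pc"], state["files"]) not in interoperable:
            return index
    return None


start = {"pc": "pc-1", "files": "files-1"}
# Alternating plan: each entity moves one version at a time.
alternating = [("pc", "pc-2"), ("files", "files-2"),
               ("pc", "pc-3"), ("files", "files-3")]
# Direct plan: Prism Central jumps straight to its target first.
direct = [("pc", "pc-3"), ("files", "files-3")]
```

With this placeholder data the alternating plan passes every step, whereas the direct plan fails at its first step because the pair ("pc-3", "files-1") is not in the set.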
Performing the upgrades:
Once you have the upgrade plan figured out, always do a basic sanity check on the cluster before performing the actual upgrades.
Check out my post for an upgrade-specific set of sanity checks. KB-2852 lists a few more parameters to check for cluster stability.
The above post in a nutshell:
- When upgrading PC: Check the PC services status, NCC on PE and NCC on PC.
- For AOS and NCC upgrades: Cluster services status, metadata ring, data resiliency, host and CVM maintenance mode and the NCC health check would be good parameters to begin with.
Once you have confirmed that the cluster is stable, you can perform the upgrade activity. Remember to run the same set of sanity checks after the upgrades complete. Note that the cluster takes some time to stabilize after an upgrade or reboot, so wait a few minutes before running the checks again.
Note: When you perform an AOS upgrade, components like Foundation and NCC are implicitly upgraded along with it to the compatible version bundled with that release. Performing an NCC upgrade before the PC or AOS upgrade and running a fresh NCC health check gives you an additional checkpoint to ensure there are no critical failures in the cluster that might interrupt the upgrades.
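The "wait, then re-check" advice can be captured as a small polling loop that re-runs the same checks until they all pass or a retry limit is reached. This is only an illustrative sketch: `wait_until_stable` and its parameters are made-up names, and each check is a placeholder callable that you would implement yourself (for example by wrapping whatever command or API you normally use for that check). No Nutanix command is invoked here.

```python
import time


def wait_until_stable(checks, attempts=5, delay_seconds=60.0, sleep=time.sleep):
    """Re-run every check until all pass or the attempts run out.

    checks maps a check name to a zero-argument callable that returns
    True when the check passes. Returns the names of the checks that are
    still failing after the last attempt (an empty list means stable).
    """
    failing = list(checks)
    for attempt in range(attempts):
        failing = [name for name, check in checks.items() if not check()]
        if not failing:
            return []
        # Give the cluster time to settle before the next round,
        # but do not sleep after the final attempt.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return failing
```

Passing the same `checks` dictionary before and after the upgrade keeps the two rounds of sanity checks identical, and the `sleep` parameter exists only so the loop can be exercised without real waiting.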
I shall be posting a series of posts with quick links, tips and tricks on the individual component upgrades and posting the links on this post as well. Stay Tuned!!