How It Works
Have questions about how the Nutanix platform works? Looking to get started? Start here!
Hi, I have a customer with multiple clusters running Hyper-V 2016. We are noticing dropped RX packets in the clusters. It got to the point where an RX dropped packets alert was generated in one of the clusters, so we upgraded the Intel NIC firmware as recommended by support. After this, the alert no longer occurs, but we still see dropped packets. Is it normal or okay for a cluster/NIC to have RX dropped packets?
Below are the top knowledge base articles for the month of March 2020.
- KB 4116 - Alert - A1187, A1188 - ECCErrorsLast1Day, ECCErrorsLast10Days
- KB 7503 - G6, G7 platforms with BIOS 41.002 and higher - DIMM Error handling and replacement policy
- KB 4141 - Alert - A1046 - PowerSupplyDown
- KB 1540 - What to do when /home partition or /home/nutanix directory is full
- KB 1113 - HDD/SSD Troubleshooting
- KB 4158 - Alert - A1104 - PhysicalDiskBad
- KB 4188 - Alert - A1050, A1008 - IPMIError
- KB 2090 - AHV | Host and Guest Networking
- KB 4519 - NCC Health Check: check_ntp
- KB 4409 - LCM (LifeCycle Manager) Troubleshooting Guide
- KB 8792 - NCC checks: same_hypervisor_version_check, duplicate_cvm_ip_check, same_timezone_check, esx_sioc_status_check, power_supply_check, orphan_vm_snapshot_check giving ERR
- KB 4541 - Alert - A101055 - MetadataDiskMountedCheck
- KB 2486 - NCC Health Check: cvm_mtu_check
- KB 2473 - NCC Health Check: cvm_memory_usage_check
- KB 3357 - NCC Health Check: ipmi_sel_cecc_check
- KB 4494 - N
Below are new knowledge base articles published on the week of March 22-28, 2020.
- KB 8864 - LCM Pre-check test_hyperv_2019_support
- KB 8940 - LCM on HPE - SPP update compatibility matrix
- KB 8999 - NCC Health Check: copyupblockissue_check
- KB 9007 - LCM Darksite: Inventory success but fails to list the available KB versions
- KB 9014 - How to Create a Mapped or Network Drive from your Sandbox to a Utility Server
- KB 9122 - Download of AOS Bundles or any large files from Prism and Portal fails for specific customers
- KB 9132 - Alert - A1305 - Node is in degraded state
- KB 9137 - Overview of Memory related enhancements introduced in BIOS: 42.300 and BMC: 7.07 for Nutanix NX-G6 and NX-G7 systems
- KB 9138 - LCM on INSPUR - BIOS-BMC Compatibility matrix
Note: You may need to log in to the Support Portal to view some of these articles.
SAS is the leader in analytics and Nutanix is the leader in invisible infrastructure. Nutanix has thousands of customers, and many of them already have SAS software running in their organization. They have experienced the benefits of invisible infrastructure and are moving more of their applications (including SAS) to Nutanix. It’s easy to deploy and manage SAS 9.4 and SAS Viya on Nutanix. Today, SAS 9.4 helps discover insights, manage data and make analytics approachable. SAS 9.4 has been tested on Nutanix NX models, both as a hyper-converged infrastructure (HCI) and using Nutanix as back-end storage only for external hosts. Nutanix AHV clusters perform well in both scenarios. SAS software makes great demands of IT infrastructure, so you must get the design right to ensure a successful deployment. Evaluating the SAS I/O requirements accurately is pivotal. Nutanix encourages involvement of Nutanix engineers to determine the back-end service requirements as well as the optimal implementation.
Scheduled power outage? Relocating cluster hardware? If you need to shut down all the nodes in your AHV cluster, here's how.
For most maintenance tasks and upgrades we can keep the cluster up and VMs running, but in some cases the whole cluster will need to be shut down. If you just need to power off a single node, a cluster of three or more nodes won’t need to stop. To stop a single-node cluster, please see the section “Shutting Down a Single-node Cluster” in the NX and SX series hardware administration guide. To stop a single node in a larger cluster, see the section “Shutting Down a Node in a Cluster (AHV)”. If there is going to be a site power outage, a full network outage, or physical relocation of the whole cluster, you will want to gracefully shut down the whole cluster. The full procedure is covered in the article Shutting Down an AHV Cluster for Maintenance or Relocation. In summary, the procedure is as follows (a rough CLI sketch follows below):
- Update NCC and perform a health check, then address any items of concern.
- Shut down all the user VMs.
- Stop any Nutanix Files cluster, if applicable.
- At this point no VMs other than the Controller VMs should be running.
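As a rough CLI sketch of that sequence, run from a CVM (these are the standard command names, but treat the linked article as authoritative; this sketch assumes no Nutanix Files cluster is present):

```
# Sketch of an AHV cluster shutdown, assuming no Nutanix Files cluster.
# Follow "Shutting Down an AHV Cluster for Maintenance or Relocation" for the full procedure.

# 1. Update NCC, run a full health check, and resolve any WARN/FAIL items.
ncc health_checks run_all

# 2. Shut down all user VMs (from Prism, or per VM via acli).
acli vm.shutdown <vm_name>     # sends an ACPI shutdown request to the guest

# 3. Stop cluster services across all CVMs.
cluster stop

# 4. Shut down each CVM, then power off each AHV host.
cvm_shutdown -P now            # run on every CVM
# On each AHV host (as root):
shutdown -h now
```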
Hi, I’ve configured a Nutanix cluster running Prism 5.10 with SNMP - I’ve set a transport for UDP on port 161 and made sure there’s a tick in Enable for Nutanix objects. But when I try to run an snmpwalk or snmpget from a device on the same subnet, to start building a custom service in Solarwinds NCentral, I get no response from the device. To confirm the issue isn’t on the server I’m running the query from, I’ve done the same thing against another Windows server and that works fine, so I’m fairly confident I’ve got the local firewall configured correctly; I’m just not getting a response from the Nutanix device. I configured a trap on the Nutanix device and pointed that at the same server, and that worked; it just doesn’t seem to respond to incoming SNMP requests. Any suggestions as to what I may have missed would be gratefully accepted!
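One thing worth checking (not stated in the post above): Prism’s SNMP service expects SNMPv3 with a configured SNMP user and auth/priv keys, so a plain v1/v2c walk typically gets no reply. A hedged example of a v3 walk against the Nutanix enterprise OID, with placeholder credentials:

```
# Hypothetical SNMPv3 walk against the cluster virtual IP; the user name, auth/priv
# protocols and passphrases are placeholders and must match what is configured
# under Prism > Settings > SNMP.
snmpwalk -v3 -l authPriv \
  -u snmpuser -a SHA -A 'authPassphrase' -x AES -X 'privPassphrase' \
  <cluster-virtual-ip>:161 .1.3.6.1.4.1.41263
```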
Let’s say that you ran the health checks on your cluster and received a failure under the component “cvm_name_check” - what does it mean and how do you fix it? The NCC health check cvm_name_check ensures that any renamed CVMs (Controller VMs) conform to the correct naming convention, to avoid issues with certain operations that depend on identifying the CVM from UVMs on the same host. The default Controller VM name is NTNX-<block_serial>-<position-in-block>-CVM. The display name of the Controller VM must always:
- Start with "NTNX-"; and
- End with "-CVM"
For more information check out: https://portal.nutanix.com/#/page/kbs/details?targetId=kA00e000000XfCMCA0 To see how to modify the hostname of the Controller VM, check out: https://portal.nutanix.com/#/page/kbs/details?targetId=kA032000000TUjkCAG You followed the naming convention and the check is still showing a failure? This might be a false positive depending on your AOS version; contact Nutanix support for verification.
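A quick sketch of running just this check and what a conforming name looks like (the plugin path may differ slightly between NCC versions, and the block serial below is a placeholder):

```
# Run only this check from any CVM (plugin path may vary by NCC version).
ncc health_checks system_checks cvm_name_check

# Example of a conforming CVM display name (serial and position are placeholders):
#   NTNX-18SM6B010123-A-CVM
```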
You may have noticed that when adding disks to a node, or replacing disks with larger-capacity ones, utilization is not redistributed between the disks immediately. You check on the cluster some time later and notice that the newly added disks still show minimal usage, much lower than expected. By default, the aim is to bring disk utilization within a +/-7.5% spread of the tier utilization. There are some things to consider when expecting a certain outcome:
- Disk balancing is not triggered unless the tier usage is at least 35%.
- Only 1 GB of data is moved per node during a Curator scan.
- Even if the tier usage is below 35%, should any disk’s usage in the cluster reach 70%, disk balancing takes place.
Disk balancing ensures data is evenly distributed across all disks in a cluster. In disk balancing, data is moved within the same tier to balance out disk utilization. This is different from ILM (Information Lifecycle Management), where data is moved between different tiers.
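To see the current per-disk spread yourself, you can list disk usage from a CVM; a minimal sketch (field names in the output vary slightly by AOS version):

```
# List physical disks with tier and usage from any CVM, then compare per-disk
# utilization against the tier average to see whether the +/-7.5% target is met yet.
ncli disk ls | egrep 'Id|Tier|Used|Capacity'
```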
Is there a way to execute a script (or PowerShell script/command) inside a newly created AHV VM which does not have an IP address yet? The VM has Nutanix Guest Tools installed. I’m looking to automate the creation of a new VM. In VMware, I use the Invoke-VMScript command. I didn’t know if Nutanix had something similar? Thanks! -TimG
When a Nutanix / vSphere cluster is deployed by Foundation the recommended drivers are installed, but after some time you may want to check if there is a newer driver recommended. From the Nutanix perspective, we have covered this with an NCC Health Check: esx_driver_compatibility_check so if you update NCC and run a health check, this check should tell you whether there is a later driver version qualified by Nutanix. To run the check from the CLI use “ncc health_checks hypervisor_checks esx_driver_compatibility_check” from any CVM in the cluster. You may see a newer driver listed for your NIC hardware and ESXi version. A newer driver may not have been qualified yet by Nutanix and in some cases could cause issues for the cluster, so generally we recommend staying with the recommended drivers as identified by NCC.
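For convenience, the check as you would run it from any CVM:

```
# From any CVM in the cluster:
ncc health_checks hypervisor_checks esx_driver_compatibility_check
```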
Q. Does Nutanix support inline encryption?
Inline encryption is not currently supported on the Nutanix platform. However, two kinds of Data At Rest Encryption (DARE) are supported:
- Self Encrypting Drives (SED): Security Guide v5.16: Preparing for Data-at-Rest Encryption (SEDs); Security Guide v5.16: Configuring Data-at-Rest Encryption (SEDs)
- Software Only Data Encryption: Security Guide: Data-at-Rest-Encryption (Software Only)
Q. How do I know that the data is encrypted?
More details around the encryption status and logs can be viewed via the nCLI, REST APIs or PowerShell cmdlets. KB-7846: How to verify that data is encrypted with Nutanix data-at-rest encryption
Q. Is it possible to monitor the encryption?
Monitoring of the encryption state is done via Nutanix Cluster Checks (NCC), which generate an alert on any issue detected within the cluster. Please keep in mind that enabling encryption is a cluster-scope setting.
Q. What is the recommended sizing of the CVM
Below are new knowledge base articles published on the week of March 15-21, 2020.
- KB 8885 - Alert - A15039 - IPMI SEL UECC Check
- KB 9009 - AHV | No Intel Turbo Boost frequencies shown in the output of "cpupower frequency-info" command
- KB 9070 - [CSI] PVC Volumes Stuck in Pending State | Error: Secret value is not encoded using '<prism-ip>:<prism-port>:<user>:<password>' format
- KB 9071 - Configuring hypervisor after satadom replacement fails with phoenix 4.5.2
- KB 9085 - [Objects 2.0] Error creating the object store at the deploy step
- KB 9095 - Nutanix Files - FSVM expansion may fail with error "IP already in use"
- KB 9102 - How to identify plugged-in SFP module hardware details
- KB 9103 - WARN: Could not use proxy. URL Error <urlopen error [SSL: TLSV1_ALERT_INTERNAL_ERROR]
Note: You may need to log in to the Support Portal to view some of these articles.
In a Nutanix AHV cluster the image service is used to index and manage ISO and virtual disk images for cloning to new VM disks or mounting to the virtual CDROM. Prism Central 5.5 or later adds a global image service to manage these files across multiple clusters. When managing images from Prism Central we will sometimes see an image show up on a cluster as "inactive". This means the metadata for the image exists but the file does not exist locally on that cluster. The article "Prism Central: Adding Images to Prism Central" gives a few options to remediate this condition when an image is needed on a certain cluster but is inactive. These methods are useful with Prism Central 5.5 and 5.10 versions. In Prism Central 5.11 we have added image placement methods to control where your images will be available. During image upload you can choose to select individual clusters where the image should reside, or you can apply a category to the image to utilize an image placement policy.
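As a point of comparison, images can also be created per cluster from the CVM command line with acli; a minimal sketch, where the image name, container and source URL are placeholders:

```
# Hypothetical example: create an ISO image on a single AHV cluster from a URL.
# "my-iso-image", the container name and the URL are placeholders.
acli image.create my-iso-image \
    source_url=http://fileserver.example.com/images/my-image.iso \
    container=default-container \
    image_type=kIsoImage
```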
Hello, does anybody have experience with a container OS like RancherOS on AHV? In the Nutanix Compatibility Matrix I can see that RancherOS v1.5.5 is compatible with my AOS 5.10, but I don’t know which image of RancherOS I should use. There are some images for a specific cloud provider and some for a specific hypervisor, but none for Nutanix, AHV or KVM. I thought I should simply use rancheros.iso, but there are some contradictions that I don’t understand. The good news is that the VM is working as expected, but RancherOS enables the hyperv-vm-tools on every startup and the container os-hypervvmtools is restarted every few seconds. The logs of this container are also empty. Does anybody know if there is something special to consider? For example, do I need to enable the qemu-guest-tools or the kernel-extras? Any information or tips are welcome. Best regards, H.Budde
2 of 3 nodes are fine and working. The 3rd CVM is up and I can ping it. Restarting the cluster with “allssh genesis stop cluster_health; cluster start” does not bring this cluster partner back. After logging in to Prism I saw a “Disk degraded” alert for this 3rd node, and now that disk is missing. How do I fix node no. 3 in a 3-node cluster?
What is Erasure Coding? Erasure coding increases the usable capacity of a cluster. Instead of replicating data, erasure coding uses parity information to rebuild data in the event of a disk failure. The capacity savings of erasure coding are in addition to deduplication and compression savings. If you have configured redundancy factor 2, two data copies are maintained. For example, consider a 6-node cluster with 4 data blocks (a b c d) configured with redundancy factor 2. In the following images, the white text represents the data blocks and the green text represents the copies.
[Figure: Data copies before Erasure Coding]
[Figure: Computing Parity]
[Figure: Data copies after Computation of Parity]
Erasure Coding Best Practices and Requirements:
- A cluster must have at least four nodes, with each storage tier (SSD/HDD) represented, to enable erasure coding.
- Avoid strips greater than (4, 1) because capacity savings provide diminishing returns and
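As a rough, back-of-the-envelope illustration of where the savings come from (my own arithmetic, not from the article): with RF2, 4 data blocks consume 8 blocks of raw capacity, whereas a (4,1) erasure-coded strip consumes 5 blocks (4 data + 1 parity):

```
# Illustrative capacity-overhead arithmetic for 4 data blocks:
#   RF2:               4 data blocks x 2 copies  = 8 blocks stored  -> 2.00x overhead
#   EC-X (4,1) strip:  4 data blocks + 1 parity  = 5 blocks stored  -> 1.25x overhead
echo "RF2 overhead:      $(echo 'scale=2; 8/4' | bc)x"
echo "EC (4,1) overhead: $(echo 'scale=2; 5/4' | bc)x"
```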
What is Metro Availability? Nutanix provides native “stretch clustering” capabilities which allow a compute and storage cluster to span multiple physical sites. In these deployments, the compute cluster spans two locations and has access to a shared pool of storage. The solution is currently available for ESXi only. This expands the VM HA domain from a single site to two sites, providing a near-zero RTO and an RPO of 0. In this deployment, each site has its own Nutanix cluster; however, the containers are “stretched” by synchronously replicating writes to the remote site before acknowledging them. The following figure shows the high-level architecture of a Nutanix Metro Availability deployment:
[Figure: Metro Availability high-level architecture]
The following figure shows an example link failure:
[Figure: Example link failure]
Nutanix Metro Availability can also be set up with Async-DR replication to a third site to combine the multi-site resiliency of the Metro setup with traditional space-efficient incremental snapshot backups. Metro Availability Configuration
This alert is generated when an API call comes in that is authenticated as "admin". Nutanix recommends that any script or third-party application sending API calls to the cluster use a service account rather than 'admin'. You can read more about this alert in the article "Alert - ExternalClientAccessCheck". If you are seeing this alert, it is informing you that some system is authenticating as admin; to aid investigation, the source IP address is provided. The intent is that any third-party application or script should use a service account and not ‘admin’, as this makes command auditing much more reasonable and helps keep the admin password secure. When a third-party application such as Veeam is set up to authenticate to the cluster as ‘admin’, that should generate this alert. If you log in to Prism Element as admin, access the REST API explorer, and then test an API, you should see this alert, because that’s your desktop sending an API call as ‘admin’. Likewise, if you set up a PowerShell script that authenticates as ‘admin’, expect the same alert.
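As an illustration of the recommended pattern, a hedged call to the v2.0 REST API using a dedicated service account instead of 'admin' (the account name, password and cluster VIP below are placeholders):

```
# Hypothetical: query cluster details via the v2.0 REST API with a dedicated
# service account ("svc-backup") rather than the built-in admin user.
curl -sk -u 'svc-backup:ItsPassword' \
  https://cluster-vip.example.com:9440/PrismGateway/services/rest/v2.0/cluster
```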
I have an external NTP server, which is configured on my CVMs. Unfortunately, my AHV host is not syncing with the NTP server even though it is reachable from the hypervisor. I have checked my hypervisor through SSH and run the command ‘date’; it gives me a different time from my NTP server.
Below are new knowledge base articles published on the week of March 8-14, 2020.
- KB 7424 - NCC Health Check: metro_invalid_break_replication_timeout_check
- KB 9000 - NCC reports LSI firmware is blacklisted for DELL XC nodes, when LCM inventory does not have any newer versions
- KB 9042 - Prism Central session times out unexpectedly when a user logged in with 'Admin' role
- KB 9063 - [Karbon] Kube DaemonSet Rollout Stuck Alert; Daemonset wrongly reports unavailable pods
- KB 9068 - AHV | nutanix-network-crashcart scripts fail with "No module named fc_progress" error on hosts imaged with Foundation 4.5.2
- KB 9074 - [Karbon] Kubernetes Upgrade Fails with Error: Upgrade failed in component Monitoring Stack Could not upgrade k8s and/or addons
Note: You may need to log in to the Support Portal to view some of these articles.
Below are new knowledge base articles published on the week of March 1-7, 2020.
- KB 8869 - NGT installation via Prism Central on Windows Server 2016 or more recent Operating Systems fails with INTERNAL ERROR message
- KB 8905 - How to download images from Prism Element clusters via command line
- KB 8917 - NCC - ERR: The plugin timed out
- KB 8993 - Foundation: Upgrade foundation using LCM Dark site bundle
- KB 8997 - Not able to delete a Role in Prism Central
- KB 8998 - Era Registration fails if container is not mounted on all hosts
- KB 9004 - Increased number of connections to File Server once migrated from Windows to Nutanix Files
- KB 9013 - How to Create a Shared Folder in Windows Server 2016/Windows 10
- KB 9016 - Unable to open Java console after BMC upgrade from version 7.00 to 7.05
- KB 9028 - ERA-DB provisioned from the OOB template failed to register with ERA server
- KB 9031 - Prism and Microsoft LDAP Channel Binding and Signing
- KB 9045 - How to find a VM creation date and time in Prism Central
Suppose you need more disk capacity on a virtual machine in your environment. You choose the VM in Prism, click ‘Update’, select the appropriate disk to edit, and change the size of the disk from 200 GiB to 300 GiB. You click update, see that the task completes successfully, then close the VM update UI. The VM details reflect the increase in disk space, but when you access the VM it appears the capacity of the drive is unchanged! This is actually expected; there is just a bit more work to be done. The partition will need to be extended following the steps for your VM guest operating system. You can see the steps to complete this in Windows in the KB article “Expand volume group disk size on Windows OS”, or if you are using Linux, check the KB article “Increase disk size on Linux UVM”.
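For a Linux guest, the in-guest part typically looks something like the sketch below (the device name, partition number and ext4 filesystem are assumptions; the referenced KB articles are authoritative):

```
# Inside the Linux guest, after the virtual disk has been enlarged in Prism.
# /dev/sda, partition 1 and ext4 are assumptions for this example.
lsblk                      # confirm the disk now shows the new size
sudo growpart /dev/sda 1   # grow partition 1 to fill the disk (growpart is in cloud-utils/cloud-guest-utils)
sudo resize2fs /dev/sda1   # grow an ext4 filesystem; use xfs_growfs <mountpoint> for XFS
```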
Let’s say you received an alert stating that all CVMs are not in the same timezone, or all hosts are not in the same timezone. What does it mean? Well, just as the alert indicates, the CVMs/hosts are not in the same timezone. We need to ensure that the same timezone is configured across all the CVMs/hosts, as this ensures that all the guest VM log messages are timestamped consistently. How will you know about the timezone issue? There is an NCC health check, “same_timezone_check”, in place to report any discrepancy in the timezones. To learn more about the alerts and errors that can be seen, and how to change the timezone, take a look at https://support-portal.nutanix.com/#/page/kbs/details?targetId=kA0600000008hm9CAA Have any questions? Leave a comment and let’s start a discussion.
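A quick sketch of checking and fixing this from a CVM (the plugin path may vary by NCC version, and “UTC” is only an example timezone; see the linked KB for the full procedure):

```
# Run only the timezone check (plugin path may differ by NCC version).
ncc health_checks system_checks same_timezone_check

# Set the cluster (CVM) timezone via ncli; "UTC" here is just an example value.
ncli cluster set-timezone timezone=UTC
```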
Hello guys, can anyone please help with the NCC Health Check report by email? The goal is to receive the NCC Health Check report every week on Tuesday at 7:00 AM CET. I configured the NCC report frequency via Nutanix Prism Central on each Nutanix cluster, but now I receive a report each Tuesday and Wednesday at 7:00 AM CET. I have tried disabling the NCC frequency and setting it up again, but no results, no changes. For configuring I used this guide - https://portal.nutanix.com/#/page/docs/details?targetId=Web-Console-Guide-Prism-v55:wc-ncc-frequency-configuration-t.html Thank you in advance