How It Works
Have questions about how the Nutanix Platform works? Looking to get started - start here!
- 1,118 Topics
- 1,677 Replies
Hello Community! I'd like to discuss an ambiguous issue. There is a 3-node hybrid-storage AHV cluster set to RF2. Each node has the same storage capacity and is at 70% storage usage, so overall usage is also 70%. The data resiliency status is critical. The cluster has enough CPU and memory. Here is the problem: if one node goes down, will a data rebuild occur? If it does, total cluster usage will exceed 95%, which would cause problems with the cluster's I/O. Can anyone clearly explain how the cluster behaves on a failure in this situation?
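As a rough way to reason about the question above, the sketch below estimates how full the surviving nodes would be if all data were re-protected after a node failure. It is a simplified model, not a description of how AOS decides whether to rebuild: it assumes equal-capacity nodes with uniform usage and ignores reserved space, compression, and tiering. The function name and parameters are hypothetical.

```python
def usage_after_node_loss(node_count: int, usage_fraction: float,
                          failed_nodes: int = 1) -> float:
    """Estimate the usage fraction on surviving nodes after a full rebuild.

    Assumption (simplified model): every node has the same capacity and the
    same usage, and all data held by failed nodes is re-replicated onto the
    survivors.
    """
    if failed_nodes >= node_count:
        raise ValueError("at least one node must survive")
    # Work in units of "one node's capacity".
    total_used = node_count * usage_fraction
    surviving_capacity = node_count - failed_nodes
    return total_used / surviving_capacity


projected = usage_after_node_loss(node_count=3, usage_fraction=0.70)
print(f"Projected usage after rebuild: {projected:.0%}")
```

Under these assumptions, three nodes at 70% hold 2.1 nodes' worth of data, which is more than the two surviving nodes can store, so the concern raised in the post is well founded.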
Self-Service Restore (SSR) lets you open and copy a previous version of a file. For SMB, you can use SSR to restore files. Enable or disable SSR through the Prism web console. SSR for SMB does not restore streams or attributes in directories. Files does not support SSR at the root of distributed shares or exports. Files takes snapshots of the stored cluster data at the share/export level. SSR exposes these snapshots to the share or export and lets you restore a file from any of the previous snapshots without an administrator. SSR is disabled by default, but you can enable it during or after share or export creation. By default, Files creates a snapshot every hour and retains the most recent 24 snapshots. By default, Files deletes the oldest SSR snapshot after exceeding the retention count for the snapshot type. The snapshot retention count corresponds to the retention period. SSR has the following retention periods: 24 hours of hourly snapshots. 7 days for daily s
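The count-based retention described in this excerpt (keep the newest N snapshots, delete the oldest once the count is exceeded) can be sketched as follows. This is only an illustration of the idea, not the Files implementation; the function name, the default of 24, and the sample timestamps are assumptions made for the example.

```python
from datetime import datetime, timedelta


def prune_snapshots(snapshots: list[datetime],
                    retention_count: int = 24) -> list[datetime]:
    """Return the snapshots that survive count-based retention.

    Keeps the newest `retention_count` snapshots and drops the oldest ones,
    returning the survivors in chronological order.
    """
    if retention_count < 0:
        raise ValueError("retention_count must be non-negative")
    newest_first = sorted(snapshots, reverse=True)
    return sorted(newest_first[:retention_count])


# Hypothetical data: 30 hourly snapshots starting at midnight.
start = datetime(2021, 4, 1, 0, 0)
hourly = [start + timedelta(hours=h) for h in range(30)]
kept = prune_snapshots(hourly)
print(len(kept), kept[0], kept[-1])
```

With 30 hourly snapshots and a retention count of 24, the six oldest are dropped, so the retained window spans the most recent 24 hours.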
Below are the new knowledge base articles published during the week of April 11-17, 2021.
- KB 10388 - Alert - A801102 - L2StretchInvalidAncVersion
- KB 10390 - Alert - A801104 - L2StretchVpnConnectionNotFound
- KB 10391 - Alert - A801105 - L2StretchRemoteAzUnreachable
- KB 10392 - Alert - A801106 - L2StretchSubnetNotFound
- KB 10393 - Alert - A801107 - L2StretchCidrMismatch
- KB 10394 - Alert - A801108 - L2StretchDhcpPoolOverlap
- KB 10395 - Alert - A801109 - L2StretchLocalIfConflict
- KB 10396 - Alert - A801110 - L2StretchRemoteIfConflict
- KB 10604 - Alert - A801111 - L2StretchIpConflict
- KB 10718 - NCC Health Check: node_storage_tier_skew_check
- KB 11034 - How to query the CVM, backplane, host, and IPMI IP address mappings from CVM CLI
- KB 11063 - How to convert the certificate from PKCS #7 to PEM format if openssl commands fail
- KB 11091 - "java.lang.NullPointerException" error during update of virtual switch in Prism UI
- KB 11101 - How to disable Prism Ultimate Trial license in Prism Central
- KB 11106 - Xi-Frame
Nutanix Insights is a new software-as-a-service (SaaS) offering that aims to redefine the Support experience for our customers, and significantly improve the health of their clusters, by leveraging the telemetry we receive from clusters where a customer has activated Pulse. Nutanix includes a set of features known collectively as Insights that provides a predictive health and support automation platform. Insights dynamically analyzes the extent to which you are following best practices in configuring your clusters for long-term reliability, availability, and performance. Insights works as follows: Pulse collects cluster data and sends it to Nutanix customer support. The Pulse data goes to the Insights engine, a SaaS-like service in the cloud, that does deep analytical processing of the Pulse telemetry and identifies potential issues based on findings or patterns in the data. Insights employs analytics built on historical data and best practices to identify cluster configuration gaps t
Below are the new knowledge base articles published during the week of March 7-13, 2021.
- KB 9484 - Alert - A802003 - VpcRerouteRoutingPolicyInactive
- KB 10545 - Alert - A110457 - StaleVMPresent
- KB 10898 - Objects - Unable to register s3 endpoints that are self signed | Peer certificate cannot be authenticated with given CA Certificates
- KB 10909 - Move fails with SCSI VirtIO device driver not found if RedHat includes kernel version 2.6.32-220 or prior in /boot
Note: You may need to log in to the Support Portal to view some of these articles.
A Storage Replication Adapter (SRA) allows VMware Site Recovery Manager (SRM) to integrate with third-party storage array technology. Nutanix SRA is one such software module; it allows SRM to interact with Nutanix clusters. This allows SRM to perform "Array Based Replication" using Nutanix replication, data protection, and disaster recovery. SRM is an orchestration tool that provides recovery plans and runbook functionality. You can set up, test, and perform pre- and post-recovery steps in case of a failover, set the order in which the VMs come up, and change IP addresses if needed. SRM depends on vCenter and is licensed separately. Before you start, ensure the following. SRM and vCenter versions are compatible. Please visit the VMware website to confirm compatibility. AOS, SRA, and SRM versions are compatible. Please refer to the Nutanix SRA for SRM compatibility matrix on the Nutanix portal to confirm compatibility. You will need 2 clusters managed by 2 vCenter
I’m sure you have seen that one before. In most cases you expect it or at least understand what caused it. In some instances you probably ignore it (we all do, no shame). What if this happens when you log in to the CVM or the host? Has cluster security been compromised? During an AOS upgrade or rescue, new keys are created for each node in the cluster. When you open an SSH session, these keys are compared to those previously recorded on the client, and since there is a mismatch, a warning is triggered. KB-2388, "Upgrade/Re-install of AOS changes the ssh key for remote host identification", explains how to clean up the keys to get rid of the warnings.
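On a client that uses OpenSSH, one common way to clear a stale host key is `ssh-keygen -R <host>`, which removes the stored keys for that host from `known_hosts`. The sketch below only builds that command; the helper name and the example IP address are hypothetical, and the steps in KB-2388 itself may differ.

```python
import shlex


def stale_key_cleanup_command(host: str) -> list[str]:
    """Build the OpenSSH command that removes stored keys for `host`.

    `ssh-keygen -R` deletes all keys belonging to the given host from the
    user's known_hosts file.
    """
    if not host or host.startswith("-"):
        raise ValueError("host must be a non-empty hostname or IP address")
    return ["ssh-keygen", "-R", host]


# Hypothetical CVM address, used only for illustration.
cmd = stale_key_cleanup_command("10.0.0.10")
print(shlex.join(cmd))
```

To actually run it, the list could be passed to `subprocess.run(cmd, check=True)`. Before clearing a key, confirm that the mismatch really follows an upgrade or rescue.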
A single-node cluster is configured like a regular cluster (three or more nodes) in many ways, but some conditions apply. Nutanix offers the option of a single-node cluster for ROBO implementations and other situations that require a lower-cost option and can accept reduced resiliency. Single-node clusters are supported only on a select set of hardware models; refer to the following article for details: single-node-supported-hardwares. Do not exceed a maximum of 1000 IOPS. Do not exceed a maximum of 5 guest VMs. To protect the guest VMs against node failure, Nutanix recommends configuring backups. These clusters are unlike single-node replication targets, which are for replication and backup purposes. LCM is supported for software updates, but not firmware updates. There is no built-in resiliency for Prism Central on a single-node cluster. Do not create a Prism Central instance (VM) in the single-node cluster. Async DR is supported for 6-hour RPO only. Use
Below are the new knowledge base articles published during the week of March 14-20, 2021.
- KB 9365 - Alert - A802002 - AncDnsUnresolvable
- KB 10693 - Alert - A130334 - NGT CD-ROM not Unmounted on the VM
- KB 10694 - Alert - A130192 - Conflicting NGT policies
- KB 10746 - AHV Guest VM Boot Order Being Modified to Default by Prism After Any Config Save
- KB 10847 - LCM Pre-check: "test_ncc_checks"
- KB 10857 - UVMs with VLAN tag may disconnect from network when uplink bond is tagged with VLAN on AHV host
- KB 10874 - VM migration tasks stuck and libvirt in inconsistent state on clusters running AHV 20170830.x
- KB 10897 - Xi Frame - Enterprise profile disks growing on every reboot (not extending their partition)
- KB 10912 - Nutanix Files: Managing Files-At-Root on NFS distributed shares
- KB 10944 - Alert - A130103 - NGT Mount failed
- KB 10957 - AHV networking interfaces are renamed after AHV upgrade to 20201105.1082 or later if RDMA NICs are present
Note: You may need to log in to the Support Portal to view some o
Nutanix Era is a software suite that automates and simplifies database management, bringing one-click simplicity and invisible operations to database provisioning and lifecycle management (LCM). Starting with Copy Data Management (CDM) as its first offering, Nutanix Era enables database admins to provision, clone, refresh, and restore their databases to any point in time. Through a rich but simple-to-use UI and CLI, they can restore to the latest application-consistent transaction. Era enables you to easily provision database environments (production or otherwise) on your Nutanix clusters. You can also provision only the database server VM that hosts a database, so that you can later create or clone databases on that database server. Some of the components include: Database engines: Custom software images that are tailor-made to enterprise needs. Database profiles: Customizable database profiles for software, compute, networking, and database parameters. Database recovery
NVIDIA GPUs primarily have two modes of operation: Compute and Graphics. Compute Mode: the GPU operates in a configuration that is optimized for high-performance computing applications. Graphics Mode: the GPU is optimized for graphics processing and can subsequently be assigned vGPU profiles for virtual machines (vGPU profiles cannot be used while in compute mode). Various NVIDIA GPUs ship with a default configuration for one of these modes, and sometimes it is necessary to change the mode to better suit the workload of the host. With previous GPU models, it was necessary to temporarily boot the AHV host into an NVIDIA-provided Linux ISO and invoke a “gpumodeswitch” command with options to apply this change. With newer GPU models, a command is available natively within the AHV host filesystem after the corresponding GRID driver has been installed. You can find more information regarding this command via the “Nvidia: Unable to Assign vGPUs to guests w
Below are the new knowledge base articles published during the week of March 21-27, 2021.
- KB 10516 - [ Karbon ] PE cluster is showing alerts for VGs used by the Kubernetes cluster(s)
- KB 10651 - NCC Health Check: metering_rest_connection_check
- KB 10768 - A number of VMs may be missing from the list when monitoring cluster using SNMP protocol
- KB 10813 - "UnicodeDecodeError" and "UnicodeEncodeError" for VM operation
- KB 10936 - Duplicate scheduled reports triggered after Daylight Savings Time (DST) change
- KB 10946 - Identifying the source IP generating TCP Reset packets in a network path
- KB 10953 - Using Nutanix Objects Self-Signed Certificate with Veritas Enterprise Vault
- KB 10954 - SMCIPMITool commands output "The node product key needs to be activated for this device" on BMC 7.10
- KB 10967 - Cloning a Secure boot enabled VM on AHV with the "Custom Script" option enabled fails with "q35 machine type does not support ide bus type" error
- KB 10976 - Cluster instability after upgrading both primary an
Failures are a part of everything, and Nutanix clusters are not immune to them. But how we plan for failures determines the versatility of the product, or of a person for that matter! Nutanix categorizes failures into availability domains based on the type of failure. Nutanix provides the ability to tolerate rack failure for extended data availability, in addition to drive, node, block, and network link failure.
Node Failure
A Nutanix node comprises a physical host and a Controller VM (CVM). Both of these components can fail without any impact to the Nutanix cluster.
CVM Failure
When a CVM fails, an alert is generated in Prism and the storage path on the related host is redirected to another CVM. Reads and writes occur over the 10GbE network until the CVM comes back online. It is business as usual for the end customer, with maybe a slight performance decrease.
(Figure: Controller VM Failure)
Physical Host Failure
If a node fails, all HA-protected VMs can be automatically restarted on other nod
Hardware failures are a part of any datacenter lifecycle. The Nutanix architecture was designed with this inevitability in mind. A cluster can tolerate one or two failures (depending on the replication factor of the cluster or container) of a variety of hardware components while still running guest VMs and responding to commands through the management console. Many of these failures also trigger an alert through that same management console in order to give the administrator a chance to respond to the situation. Nutanix provides the ability to tolerate rack failures for extended data availability, in addition to drive, node, block, and network link failure. Block fault tolerance lets a Nutanix cluster make redundant copies of data and metadata and place the copies on nodes in different blocks. A block is a rack-mountable enclosure that contains one to four Nutanix nodes. All nodes in a block share power supplies, front control panels (ears), backplane, and fans. Nutanix offers block fault
Below are the top knowledge base articles for the month of March 2021.
- KB 7503 - NX Hardware [Memory] – G6, G7 platforms - DIMM Error handling and replacement policy
- KB 4141 - Alert - A1046 - PowerSupplyDown
- KB 1540 - What to do when /home partition or /home/nutanix directory on a Controller VM is full
- KB 1113 - HDD/SSD Troubleshooting
- KB 4409 - LCM: (Life Cycle Manager) Troubleshooting Guide
- KB 9937 - Alert ID 111066 - Failed to send alert emails
- KB 4158 - Alert - A1104 - PhysicalDiskBad
- KB 6945 - How Upgrades Work at Nutanix
- KB 2090 - AHV host networking
- KB 2473 - NCC Health Check: cvm_memory_usage_check
- KB 3784 - Alert - A1030 - StargateTemporarilyDown
- KB 4519 - NCC Health Check: check_ntp
- KB 5582 - NCC Health Check: idf_db_to_db_sync_heartbeat_status_check
- KB 3741 - Nutanix Guest Tools Troubleshooting Guide
- KB 1863 - NCC Health Check: sufficient_disk_space_check
- KB 7386 - NCC Health Check: power_supply_check
- KB 6153 - NCC Health Check: default_password_check and pc_default_password_
The recycle bin feature is available from AOS 5.18 onwards. With help from Nutanix Support, the recycle bin tool lets you restore deleted storage entities (guest VMs and volume group vDisks) and manage the recycle bin itself. After you delete a guest VM, its configuration file and disk remain in the recycle bin for up to 24 hours. After 24 hours, these files are deleted. The files are deleted sooner if your cluster is unable to maintain sufficient free disk space.
Recycle Bin Limitations and Guidelines
The recycle bin stores vDisk and configuration data for up to 24 hours. After 24 hours, these files are deleted. The files are deleted sooner if your cluster is unable to maintain sufficient free disk space. The recycle bin is not supported on storage containers where Metro Availability is enabled. The recycle bin is not available for recovering protection domain snapshots. You can disable and enable the recycle bin or clear its contents. As a default, the recycle
Below are the new knowledge base articles published during the week of March 28-April 3, 2021.
- KB 8378 - NCC Health Check: Check Interface Configuration Files
- KB 10075 - NCC Health Check: ngt_client_cert_expiry_check
- KB 10743 - NCC Health Check: pulse_enablement_checks
- KB 10966 - Using storage_container_reference functionality with Nutanix provider for Terraform fails with error "SPEC_INCOMPATIBLE_ERROR" or "INVALID_REQUEST"
- KB 10980 - Hyper-V VM created in Windows Server 2008 R2 or older may fail to be migrated by Move
- KB 10998 - NCC ipmi_checksf fail after upgrading ESXi to version 7.0
- KB 11018 - LCM-driven AHV upgrade may fail due to missing /bin/bash binary
- KB 11026 - Fujitsu hardware platform BIOS power setting preventing NVIDIA AHV vGPU driver loading
- KB 11031 - Agentless Management Service (AMS) version 11.4.0 filling up /tmp on HPE ESXi hosts
Note: You may need to log in to the Support Portal to view some of these articles.
I want to gather data for the following stats metrics using the REST API v2. I understand that I need to pass the metric ID as one of the parameters. Is that correct? In addition, can anyone tell me the IDs for the following stats metrics?
- Disk Usage (%)
- GPU Framebuffer Usage
- GPU Usage
- GPU Video Decoder Usage
- GPU Video Encoder Usage
- Memory Usage (%)
- Network Rx Bytes
- Network Tx Bytes
Thanks!
Hi, I'm looking to automate some reports using the REST API. So far I can successfully pull CPU, RAM, and physical and logical storage figures cluster-wide, as well as snapshot details, but I am unable to calculate the values shown on the capacity runway when viewing a cluster in PC, specifically the storage usage figures for ‘snapshots’ and ‘system’. Does anyone know how these are calculated? Thanks
Below are the new knowledge base articles published during the week of April 4-10, 2021.
- KB 9952 - NCC Health Check: list_vms_with_qos_attrs
- KB 10350 - NCC Health Check: rdma_enabled_check
- KB 10482 - Getting Error Message "You’ve uploaded an invalid cluster summary file" When Licensing a Cluster
- KB 10668 - NCC Health Check: list_containers_being_converted_to_aes
- KB 10955 - Create Category in Prism returns "The name you entered is in use" even though the name does not show up in the list of categories
- KB 11010 - Adding Nutanix Objects as Primary or Secondary Storage with Veritas Enterprise Vault
- KB 11041 - Nutanix Files share not accessible after performing Nutanix Files upgrade or rebooting File Server VMs
- KB 11051 - Message "kvm: already loaded the other module" when booting into Phoenix
- KB 11054 - Expand cluster with node reimaging on AOS 5.19.1 and ESXi 7.0u1 fails
- KB 11057 - Cluster creation fails on HPE nodes when using generic VMWare ESXi image
- KB 11059 - Unable to enable HA or HA in crit