Nutanix Disaster Recovery
Make Disaster Recovery a Breeze
- 205 Topics
- 592 Replies
Good day, I am very new to Nutanix and recently purchased a cluster. It has only been running for about 30 days now, managing my network, and I am in the process of making some configuration changes to it. I need some assistance with an error message that I am receiving. I am using AHV as the hypervisor on my 3-node cluster. On this cluster I am running 5 Windows-based server VMs (not using Hyper-V or VMware). I followed the instructions from the Administration Guide by enabling VSS Shadow Copies on the servers, then installing guest tools on all the servers and creating an Async DR protection domain. My configurations are working and snapshots are being created for my Domain Controllers and Application Servers. However, when a snapshot of my File Server is attempted, I keep getting the following error: "Warning : VSS snapshot failed for the VM(s) FS-01 protected by the FileServer in the snapshot (169035, 1563300389081879, 960) because Quiescing guest VM(s) failed or timed out. Impact
Today we will highlight the flexibility Nutanix offers with Protection Domains. You can access Protection Domains via the “Data Protection” menu option in Prism. DR Basics in Nutanix: The foundation of data protection and disaster recovery in a Nutanix cluster is the concept of snapshots. Snapshots work very similarly to VM/vDisk clones, leveraging a "redirect on write" algorithm that marks a snapshotted vDisk as immutable and directs new write operations (block overwrites and new blocks) to a new vDisk. Read operations reference the correct vDisk blocks based on metadata lookup from the metadata store. From the Nutanix Bible - Book of Acropolis - Backup & Disaster Recovery. What is a Protection Domain (PD)? Key Role: Macro group of VMs and/or files to protect. Description: A group of VMs and/or files to be replicated together on a desired schedule. A PD can protect a full container or you can select individual VMs and/or files. A protection domain is a group of Virtual Machines fo
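To make the redirect-on-write idea from the post above concrete, here is a toy sketch in Python. It is not how Nutanix implements snapshots internally: the `VDisk` class, the parent-chain lookup, and the `snapshot` helper are all hypothetical names invented for illustration, and the real metadata store is far more involved. It only shows the behaviour described: the snapshotted vDisk becomes immutable, new writes land on a new vDisk, and reads resolve to the correct blocks through a metadata lookup.

```python
class VDisk:
    """Toy vDisk (hypothetical): a block map plus an optional parent vDisk."""

    def __init__(self, parent=None):
        self.blocks = {}        # block_id -> data written to this vDisk
        self.parent = parent    # older vDisk holding unchanged blocks
        self.immutable = False  # set once the vDisk has been snapshotted

    def read(self, block_id):
        # Walk from the newest vDisk to older ones until the block is found.
        disk = self
        while disk is not None:
            if block_id in disk.blocks:
                return disk.blocks[block_id]
            disk = disk.parent
        return None  # block was never written

    def write(self, block_id, data):
        if self.immutable:
            raise PermissionError("snapshotted vDisk is immutable")
        self.blocks[block_id] = data


def snapshot(active):
    """Freeze the active vDisk and return (snapshot, new_active_vdisk)."""
    active.immutable = True
    return active, VDisk(parent=active)


# Example: write two blocks, snapshot, then overwrite one and add one.
base = VDisk()
base.write(0, "A0")
base.write(1, "B0")
snap, live = snapshot(base)
live.write(1, "B1")  # block overwrite is redirected to the new vDisk
live.write(2, "C1")  # new block also lands on the new vDisk
```

In this sketch the snapshot keeps returning the old contents of block 1 while the live vDisk returns the new contents, and unchanged block 0 is read through the parent without being copied.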
Trying the Veeam AHV appliance, I had a crash while taking a backup and am now left with orphaned snapshots on a PAID ACCOUNT. Nutanix passes the ball to Veeam, and Veeam passes it back to Nutanix, so here I am. The alert reads: "Protection Domain DP-QC-3 has 3 aged third-party backup snapshot(s) and may unnecessarily consume storage space in the cluster." I have a multitude of these and want to manually remove those snapshots. Is there a process for this?
Hello! I’m pretty new to this Nutanix world. I have been dealing with standard server+storage for more than a decade. We have 2 clusters here, with 3 nodes in each site, with Metro Availability. There are 3 protection domains active in each site (active) that are replicated to the other site (passive) and vice versa. Site1: Node1 - Node3 - Node5. PDs site1 (active): PROD_001, DEV_001, INFRA_001. PDs site1 (passive): PROD_002, DEV_002, INFRA_002. Site2: Node2 - Node4 - Node6. PDs site2 (active): PROD_002, DEV_002, INFRA_002. PDs site2 (passive): PROD_001, DEV_001, INFRA_001. In the vCenter cluster configuration we obviously have affinity rules for VMs/Hosts in site1 and VMs/Hosts in site2, preventing the VMs running on “odd” nodes from being stored on “even” nodes. Sometimes we have to migrate VMs from one site to another, so we do a complete vMotion (compute and storage). After the migration, we start to constantly receive alerts with this message: Snapshot status for vstore INFRA_001: Faile
The abundance of options causes anxiety. That’s a proven fact. It is only true, however, if those options are hardly distinguishable. This is not the case with Nutanix DR options at all as each feature has a clear purpose and application. It all starts with an SLA. Understanding your disaster scenarios and the amount of data that you are ready to lose in the event of the disaster is the key to choosing a solution that is right for you. Roughly, there are several main options: Local backups (also known as Time Stream). Use these to roll back any Guest OS maintenance that went wrong. Remove a snapshot when the maintenance is over and the rollback is no longer required. Do not leave your snapshots unattended. Remote backup and DR: Asynchronous Replication: Protection from VM Corruption and Deletion as well as from a total site failure. Snapshots can be 1hr or longer apart. Near-sync Replication - leverages Lightweight Snapshots that are snapshots of the metadata. Recovery points can
There are many situations when you might need to abort an ongoing disaster recovery replication job on your Nutanix cluster. You may be deleting the protection domain, or maybe you added a large VM to an existing protection domain and now you don’t want to include this new VM’s baseline replication in the existing job, at least not before the weekend. Maybe the destination is running short of space and you need to stop adding to the problem. You can see the replication in Prism, but there’s no button to pause or cancel it. To abort the replication you will need to go to the CLI. This process is covered in the KB article (https://portal.nutanix.com/kb/3272). SSH to a CVM in the cluster as the user ‘nutanix’, then run ‘ncli’ to enter the interactive CLI. First, list current ongoing replications with this command. You’ll need the Protection Domain name and ID from this output for the next command. <ncli> pd ls-repl-status
While learning AHV I’m a bit confused on a few things as it relates to vStores and Remote Sites. What is the purpose of a vStore and how do they differ from Containers? What are the implications of mapping a Site-A’s vStore (Container-A) to Site-B’s vStore (Container-B) that has running VMs in it? What are the implications of not including all VMs from within Site-A’s Container-A in a Protection Domain and failing-over said PD to Site-B? - Will the unprotected VMs from Container-A continue to run in Site-A? - What happens when I fail-back the PD to Site-A? Let’s say I’ve mapped Container-A (Site-A) to Container-B (Site-B). If I create a Protection Domain that contains VMs from Container-A2 (Site-A), how does this factor into my vStore mappings between sites? Thanks for the assist!
I'm looking into backing up our Nutanix Acropolis cluster to either Azure or AWS. What I'm curious about are the actual costs once you factor in not just their basic storage rate, but PUTs, data writes, and everything else that happens during a backup and restore (rare as those would be). We have around 5 TB total with all our VMs, with the daily snapshots averaging around 40 GB in changed data. We would want to retain maybe a month's worth of backups with dailies and weeklies. Is this enough data to do a decent estimate, or is it one of those 'try and see' scenarios? I'd be interested in hearing from anybody else who's using it as a remote backup site and what kinds of costs they're seeing. Thanks! Clint
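As a starting point for the estimate asked about above, here is a rough back-of-the-envelope sketch in Python. Everything beyond the 5 TB baseline, the 40 GB daily change, and the roughly one month of retention is an assumption: the prices are placeholders rather than actual Azure or AWS rates, the 4 MB object size is hypothetical, and weekly copies, restore egress, compression, and deduplication are ignored.

```python
def estimate_monthly_cost(full_gb, daily_change_gb, retained_days,
                          storage_price_per_gb, put_price_per_1000,
                          object_size_mb):
    """First-order monthly cost: stored capacity plus upload (PUT) requests."""
    # Capacity held in the cloud: one full baseline plus retained daily deltas.
    stored_gb = full_gb + daily_change_gb * retained_days
    storage_cost = stored_gb * storage_price_per_gb

    # PUT requests: each day's changed data uploaded as fixed-size objects,
    # assuming 30 upload days per month.
    puts_per_month = (daily_change_gb * 1024 / object_size_mb) * 30
    put_cost = puts_per_month / 1000 * put_price_per_1000

    return {
        "stored_gb": stored_gb,
        "storage_cost": storage_cost,
        "puts_per_month": puts_per_month,
        "put_cost": put_cost,
        "total": storage_cost + put_cost,
    }


# Numbers from the post; all prices and the object size are placeholders.
estimate = estimate_monthly_cost(
    full_gb=5 * 1024,           # ~5 TB across all VMs
    daily_change_gb=40,         # average daily changed data
    retained_days=30,           # about a month of dailies
    storage_price_per_gb=0.02,  # hypothetical price per GB-month
    put_price_per_1000=0.005,   # hypothetical price per 1,000 PUT requests
    object_size_mb=4,           # hypothetical upload object size
)
```

With these inputs the stored capacity works out to 6320 GB. Swapping in the provider's published rates and the real object size gives a first estimate that a short trial can then confirm.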
Hello, When I try to take an application-consistent snapshot I get this: Warning : Vss snapshot failed for the VM(s) xxx, protected by xxx because Quiescing guest VM(s) failed or timed out. Impact : Crash consistent snapshot is taken instead of application consistent snapshot. Cause : Guest is not able to quiesce VM due to internal error. The servers have NGT installed and are alone in their consistency group. Before I open a case, I'd like to have your insight. Thanks.
We love our VM level snapshots on VMware ESXi. It is a quick and easy way to roll back recent changes as if nothing has ever happened. It gives peace of mind and boosts confidence. It is not an answer to all prayers as, just like anything, it has its limitations. What are the limitations of VM level snapshot in a Nutanix cluster? First, let’s take a look at what are some of the interesting events that occur during the snapshot operation and its presence: If a virtual machine is running off of a snapshot, it is making changes to a child or sparse disk also called delta disk. The delta disk metadata in-memory of vSphere host includes the delta disk header. Updates to the header of the delta disks happen in memory as required and the changes are written to disk only upon certain events such as snapshot consolidation or when the delta disk is closed. Storage snapshot operations and storage replications are transparent to ESXi hosts. If the storage snapshot used to restore a VM wa
There are 2 Nutanix clusters in different regions, and I'm curious about a replication issue. On the first cluster site I currently have 19.77 TiB free of 50.7 TiB (physical) and 9.74 TiB free of 25.35 TiB (logical). There is enough free space in the second region; I checked it through Nutanix. Now, suppose I replicate 2 servers (both 500 GB) to the other region, with 1 local snapshot and 1 remote snapshot selected in the configuration. How much Nutanix storage will I lose? For example, with 9.7 TB of free space in the cluster, if I replicate 10 servers of 500 GB each, how much space will I lose on Nutanix storage? Is there a calculation for this? I want to learn this. I have 9.7 TB of free space, but I want to replicate 10-15 servers. Will there be a problem with storage space? How do I calculate it?
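One way to reason about the question above is a simple first-order model, sketched here in Python. It rests on assumptions that are not in the post: a snapshot only consumes extra space for blocks changed after it was taken, the remote site holds one full copy plus the retained changes, and compression, deduplication, and temporary overhead during replication are ignored. The 5% change rate is a hypothetical placeholder, and `replication_space_gb` is an illustrative name, not a Nutanix tool.

```python
def replication_space_gb(vm_count, vm_used_gb, change_fraction,
                         local_retained, remote_retained):
    """Rough logical space consumed by snapshots, in GB, under the
    assumptions stated above."""
    protected_gb = vm_count * vm_used_gb
    changed_gb = protected_gb * change_fraction  # per snapshot interval

    # Source cluster: the VMs already occupy their space, so the extra cost
    # is only the changed blocks kept alive by retained local snapshots.
    local_extra_gb = changed_gb * local_retained

    # Remote cluster: one full baseline copy plus retained changed blocks.
    remote_gb = protected_gb + changed_gb * remote_retained

    return {"local_extra_gb": local_extra_gb, "remote_gb": remote_gb}


# 10 servers of 500 GB, 1 local and 1 remote snapshot retained,
# with a hypothetical 5% change between snapshots.
usage = replication_space_gb(
    vm_count=10, vm_used_gb=500, change_fraction=0.05,
    local_retained=1, remote_retained=1,
)
```

Under this model the source cluster loses only the changed data per retained snapshot rather than another full 500 GB per server, while the remote site needs room for the full baseline. Measure the real change rate before relying on the result.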
Hi, Does Nutanix Leap support ESXi Hypervisor for on-premise DR automation? In two different sources I have found different information: AAPM 5.10: Data Protection: Prism Central and Acropolis 5.10 or higher must be deployed on both sites. Orchestration is only available if you run AHV. Xi Leap Admin Guide (page 7): Hypervisor Requirements Supported Hypervisors On premises, the hypervisor must be one of the following: • AHV • ESXi https://nutanixbible.com/ Supported Environment(s): On-Premise: AHV (As of AOS 5.10) Thank you!
From this tech note, http://go.nutanix.com/rs/nutanix/images/TechNote-Nutanix_Storage_Configuration_for_vSphere.pdf, my impression is that Nutanix thin provisions a VM if the disks are set to "Thick Provisioned Lazy Zeroed". [i]All Nutanix containers are thin provisioned by default; this is a feature of NDFS. Thin provisioning is a widely accepted technology that has been proven over time by multiple storage vendors, including VMware. As containers are presented by default as NFS datastores to VMware vSphere hosts, all VMs will also be thin provisioned by default. This results in dramatically improved storage capacity utilization without the traditional performance impact. Thick provisioning on a VMDK level is available if required for the limited use cases such as fault tolerance (FT) or highly demanding database and I/O workloads. Thick provisioning can be accomplished by cr
Hi there, I have many PDs and inside these PDs I have many VMs. I want to be able to run NCLI (or the like) to unprotect VMs using a wildcard if possible. Example: ncli pd unprotect name=ProtectionDomain vm_name=VM* Does anyone have an idea how to unprotect machines by wildcard, to remove a large number at once?
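If the CLI does not accept wildcards directly, one workaround is to expand the pattern yourself and generate one command per protection domain. The Python sketch below only builds command strings and does not run anything. The inventory dictionary is hypothetical example data, and the `vm-names=` parameter spelling is an assumption to verify against the help output of `ncli pd unprotect` on your AOS version (the post above writes it as `vm_name=`).

```python
from fnmatch import fnmatchcase


def build_unprotect_commands(pd_to_vms, pattern, batch_size=10):
    """Return ncli command strings for every VM whose name matches pattern.

    pd_to_vms maps a protection domain name to the VM names it protects.
    NOTE: 'vm-names=' is an assumed parameter name; check it before use.
    """
    commands = []
    for pd_name, vm_names in pd_to_vms.items():
        # Case-sensitive shell-style match, e.g. "VM*" matches "VM01".
        matched = [vm for vm in vm_names if fnmatchcase(vm, pattern)]
        # Split long lists into batches to keep each command short.
        for start in range(0, len(matched), batch_size):
            batch = ",".join(matched[start:start + batch_size])
            commands.append(
                f"ncli pd unprotect name={pd_name} vm-names={batch}"
            )
    return commands


# Hypothetical inventory; in practice, export this from your cluster.
inventory = {
    "PD-A": ["VM01", "VM02", "SQL01"],
    "PD-B": ["APP01"],
}
commands = build_unprotect_commands(inventory, "VM*")
```

Reviewing the generated list before pasting it into a CVM session keeps a mistyped pattern from unprotecting more VMs than intended.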
Does anyone have thoughts on how backup and recovery will work with the Acropolis hypervisor? Our company is definitely interested in moving in this direction, but the simplicity of Veeam B&R has been great, and I do not want to move away from that unless there is a solution that provides all of Veeam's features.
The best way to know if your DR solution will work when you need it is to actually test the DR workflows, right? Of course, a DR test can be disruptive, so you’ll want to understand the procedures and best practices before your testing window actually starts. The Async-DR solution built into your Nutanix cluster can handle both planned and unplanned failover scenarios. Testing these capabilities is not much different from simply using them when needed. The two most relevant documents for these procedures will be the Prism Web Console Guide and the Data Protection and Disaster Recovery best practice guide. The Prism Web Console Guide provides the execution steps for setup and failover, while the best practice guide provides additional detail on available solutions, requirements, and related considerations around space, bandwidth, and seeding, and a best practices checklist. After reviewing the requirements and scheduling a test window, you can follow the planned failover workflow in the Pri
Getting the following error “VSS Scripts Not Installed” in your Nutanix environment with the description “VSS software or pre_freeze/post_thaw Scripts Not Installed”? Confused about VSS and the above-mentioned scripts? Let us help you understand the use of VSS for a snapshot! When you enable Nutanix Guest Tools, the VSS and application-consistent snapshot features are enabled by default. The Nutanix native in-guest VmQuiesced Snapshot Service (VSS) agent is used to take application-consistent snapshots for all the VMs that support VSS. This mechanism takes application-consistent snapshots without any VM stuns (temporary unresponsive VMs) and also enables third-party backup providers like CommVault and Rubrik to take application-consistent snapshots on the Nutanix platform in a hypervisor-agnostic manner. What’s the use of pre_freeze/post_thaw then? Within a Windows VM, NGT will use the Microsoft VSS writer built into the OS to quiesce the VM to take the app consistent snapsho
I am new to our environment and have been tasked with backing up our VM Servers. The backups are scheduled hourly but have not been configured properly (I believe we are missing a backup location, but I have not been able to find where to configure that item). Any help would be appreciated.
We are moving away from VMware over to Acropolis, and I was wondering what third-party tools others have been using to back up their VMs under Acropolis. Under VMware, we had been using Veeam, but this is no longer a choice. Any comments/suggestions would be helpful.
Hi, We are looking for a solution to restore the VMs running on Nutanix AHV to a vSphere (SAN-based) environment. I've heard that HYCU can do backups on non-Nutanix ESXi; however, it is not clear whether I can restore VMs running on AHV to non-Nutanix ESXi and vice versa. If HYCU is not the answer, is there any other vendor that can achieve this goal: Nutanix AHV backup/DR to non-Nutanix ESXi? We need to run the VMs on non-Nutanix ESXi in case of a Nutanix/primary DC failure.
I am backing up with Veeam in a Nutanix cluster. I use ESXi 6.5 on the Nutanix cluster. I also replicate some virtual machines to a remote office with data protection, and back up the same virtual servers with Veeam. Is there a limit here, or is this setup built correctly? Both are using the same snapshot. Would it be a problem if replication starts at the same time as a Veeam backup?