Protection Domain Based DR


Badge +2

Hi, I have two physical sites, each with an on-prem Nutanix cluster. I am trying to configure async DR from the primary to the secondary site every hour, so a snapshot will be taken every hour and replicated to the secondary site. I read in the portal that the first replica is a full copy and the subsequent ones are incremental. My question: once the first replica, which is a full copy of my data, has been replicated to the secondary site, it will eventually expire and be deleted based on my retention configuration. In that case, how can I restore my full data from the incremental snapshots on the remote site?



32 replies

Userlevel 3
Badge +5

Hi Jay,

 

Yes, exactly.

 

F>P

Badge +2

Yes. But that doc mentions we can also create recovery points manually. I understand that is like an on-demand snapshot and replication. That is fine. But can you finally confirm the point below?

If I want to automate my recovery point creation and replicate (async) to the secondary site based on the schedule:

  1. I can go either with protection domain-based DR with a Starter license, or with DR (Leap) in an automated manner (creating protection policies, replicating recovery points, using recovery plans) with a Pro license plus the Advanced Replication add-on.

     I understand, however, that a failover or failback, whether planned or unplanned, has to be done manually from the respective site.

Userlevel 3
Badge +5

Hi Jay,

It's a pleasure helping you. To add to the statement “DR with manual creation Recovery point and manual replication will not suit for my requirement.”

The recovery points are always created automatically based on the defined schedule, and they are replicated automatically. The only manual step is that, when you have a disaster, you must activate and power on the VMs.

 

F>P

Badge +2

OK, thanks for the info. I much appreciate your help on this. I hope the two points below are valid.

If I want to automate my recovery point creation (async) and replicate to the secondary site based on the schedule:

  1. I can go either with protection domain-based DR with a Starter license, or with DR in an automated manner (creating protection policies, replicating recovery points, using recovery plans) with a Pro license plus the Advanced Replication add-on.
  2. DR with manual creation Recovery point and manual replication will not suit for my requirement.

-JAI

Userlevel 3
Badge +5

Hi,

You can use https://www.nutanix.com/one-platform to access the DR option, which gives you a guided tour and a lab to validate all the options before deploying on your production cluster.

F>P

Userlevel 3
Badge +5

Hi,

Using DR through PC, you have the option to see all recovery points. If the VM is not active, you can only clone on the recovery site; you cannot revert, as the entity is not active there.

The documentation is generic, since you can have different VMs running on both clusters and replicated to each other, so that neither cluster sits idle.

F>P

Badge +2

Yes, in PD I understood. I am referring to the Disaster Recovery section below.

https://portal.nutanix.com/page/documents/details?targetId=Disaster-Recovery-DRaaS-Guide-vpc_2023_1_0_1:ecd-ecdr-procedure-manualprotection-pc-c.html. Kindly look at this and let me know.

 

Userlevel 3
Badge +5

Hi

There is no revert option on the recovery site. On the recovery site you have the activate option: when you have a failure and the primary site is not available, you have to activate, and the snapshots and entities will become available. With PD you will need to power on the VMs yourself; with a recovery plan the VMs will be powered on automatically as defined.

 

F>P

 

Badge +2

But my question is how the revert option can be available on the recovery site. The revert option will overwrite the original entities, which are on my primary site.

-JAI

Userlevel 3
Badge +5

Hi Jay,

Yes, with Nutanix snapshots, whether local or remote, you can revert (replace the original entity with that point in time) or clone from the snapshot (create a new entity from that point in time and leave the original entity intact).

If required, you can pull the remote snapshot and take the same actions on it as above.
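
To make the revert/clone distinction concrete, here is a minimal Python sketch. It is only a toy model, not Nutanix code: the `Snapshot` class and the `revert` and `clone` functions are hypothetical names chosen for illustration, and a site's entities are modeled as a plain dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical point-in-time copy of one entity's state."""
    entity_name: str
    state: str


def revert(entities: dict, snap: Snapshot) -> None:
    """Replace the original entity with the snapshot's point in time."""
    if snap.entity_name not in entities:
        # Assumption mirroring the thread: with no active entity on this
        # site there is nothing to overwrite, so only clone makes sense.
        raise KeyError(f"{snap.entity_name} is not active here; clone instead")
    entities[snap.entity_name] = snap.state


def clone(entities: dict, snap: Snapshot, new_name: str) -> None:
    """Create a new entity from the snapshot; the original stays intact."""
    entities[new_name] = snap.state


primary = {"vm1": "state at 12:00"}
snap = Snapshot("vm1", "state at 11:00")

clone(primary, snap, "vm1-clone")  # vm1 unchanged, vm1-clone added
revert(primary, snap)              # vm1 now holds the 11:00 state
```

On a site where `vm1` is not present (an empty dictionary in this model), `revert` raises and `clone` is the only usable action.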

 

F>P

Badge +2

Hi,

Thanks for your reply.

Protection and Manual DR (Disaster Recovery): from this link, https://portal.nutanix.com/page/documents/details?targetId=Disaster-Recovery-DRaaS-Guide-vpc_2023_1_0_1:ecd-ecdr-procedure-manualprotection-pc-c.html, there are two options, Clone and Revert. Is the Revert option only available on the primary site? The revert option will overwrite the existing VM, which is running only on the primary site. Can you clarify this, please?

 

Userlevel 3
Badge +5

Hi Jay,

Exactly, 

Ref: https://www.nutanix.com/products/cloud-platform/software-options

Data Protection and Disaster Recovery section

F>P

Badge +2

OK. I want to clear up the licensing part. Let me put it like this, and correct me if anything is wrong.

 

  1. Protection domain-based DR: works with a Starter license (async); implemented in Prism Element on both primary and secondary.
  2. Disaster Recovery (Leap): either manual or automatic (async); we need Pro with the Advanced Replication license, or Ultimate. It needs to be enabled from Prism Central.

For NearSync or sync, with either PD-based DR or Leap, we need the required license.

Userlevel 3
Badge +5

Hi Jay,

DR orchestration is always triggered manually through a Recovery Plan. This means that when the primary site fails for any reason, the recovery plans are triggered manually by the administrator on the secondary site. That will automatically power on the replicated VMs in the defined sequence.

When you need a fully automated failover, you have to consider a third site that hosts the witness to avoid split-brain scenarios. Once a site fails, the VMs are restarted automatically on the other cluster, like HA within a cluster when a node fails. This is also referred to as MSC (Metro Stretch Cluster) or Metro Availability.
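
As a rough illustration of the two modes described above, here is a small Python sketch. It is a toy model, not Nutanix code: `failover_mode` and `run_recovery_plan` are hypothetical helper names, and the staged boot order is an assumed way of representing the "sequence defined" in a recovery plan.

```python
def failover_mode(metro_availability: bool, witness_at_third_site: bool) -> str:
    """Return how failover is triggered, per the description in this thread."""
    if metro_availability and witness_at_third_site:
        return "automatic"  # VMs restart on the other cluster, HA-style
    return "manual"         # an administrator triggers the recovery plan


def run_recovery_plan(boot_stages: list) -> list:
    """Return VM names in power-on order, stage by stage (assumed model)."""
    powered_on = []
    for stage in boot_stages:
        for vm in stage:
            powered_on.append(vm)
    return powered_on


# Example: database first, then application servers, then the web tier.
order = run_recovery_plan([["db"], ["app1", "app2"], ["web"]])
```

The point of the sketch is that the administrator's trigger is the only manual step in a recovery plan; the power-on sequence itself runs as defined.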

F>P 

Badge +2

You mean to say "Automate DR orchestration (Prism Central)" in the second line?

-JAI

Userlevel 3
Badge +5

Hi Jay,

To be more specific,

Async with an RPO of 60 min or above and manual DR: AOS Starter.

Near-sync (RPO 1-15 min) and sync replication, Metro Availability (RPO 0), and manual DR orchestration (Prism Central) need the Advanced Replication add-on on top of an AOS Pro license, or AOS Ultimate.
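
The two rules above can be written down as a small lookup. This is only a sketch of the licensing summary as stated in this thread, not an official licensing tool; the function name `required_license` and the replication labels are assumptions made for illustration.

```python
ADVANCED = "AOS Pro + Advanced Replication add-on, or AOS Ultimate"


def required_license(replication: str, orchestration: bool = False) -> str:
    """Map a replication type to the license named in this thread.

    replication: "async" (RPO of 60 min or above), "nearsync", "sync",
                 or "metro" (hypothetical labels).
    orchestration: True if Prism Central DR orchestration is wanted.
    """
    if replication not in {"async", "nearsync", "sync", "metro"}:
        raise ValueError(f"unknown replication type: {replication}")
    if replication == "async" and not orchestration:
        return "AOS Starter"
    return ADVANCED


hourly_pd = required_license("async")
hourly_with_runbooks = required_license("async", orchestration=True)
```

For the hourly async schedule discussed here, `hourly_pd` comes out as AOS Starter, while adding orchestration moves the requirement to the Advanced Replication tier.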

F>P

Badge +2

Hi,

@sl.farhanparkar, thanks for your reply. I am seeing the topics below in the DR guide.

Protection and Manual DR (Disaster Recovery)
Protection and Automated DR (Disaster Recovery)

Do both options need the Advanced Replication add-on on top of an AOS Pro license, or AOS Ultimate?

-JAI

Userlevel 3
Badge +5

Hi Jay.

There are a few things to understand.

Automatic failover is ONLY possible using Metro Availability, and ONLY with a witness site. That also requires the Advanced Replication add-on on top of an AOS Pro license, or AOS Ultimate.

Metro Availability for vSphere / Hyper-V is configured using Prism Element, and for AHV using Prism Central.

Prism Central allows you to create protection policies and recovery plans, which give you DR runbook capabilities: DR test drills, VM power-on/shut-down sequences, re-IP, etc. This is referred to as DR orchestration or Runbook, and it requires the Advanced Replication add-on on top of an AOS Pro license, or AOS Ultimate.

Async DR using Prism Element gives basic DR capabilities through a Protection Domain, with a 60 min RPO and manual activation and VM power-on actions. This does not require any additional licenses.
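
As a quick reference for where each feature is configured, here is a minimal Python sketch of the mapping described above. The function name `config_console` and the feature and hypervisor labels are hypothetical; the mapping itself only restates what is said in this post.

```python
from typing import Optional


def config_console(feature: str, hypervisor: Optional[str] = None) -> str:
    """Return the console used to configure a DR feature, per this post."""
    if feature == "protection_domain":
        return "Prism Element"
    if feature in {"protection_policy", "recovery_plan"}:
        return "Prism Central"
    if feature == "metro_availability":
        # Metro Availability depends on the hypervisor in use.
        consoles = {
            "vsphere": "Prism Element",
            "hyper-v": "Prism Element",
            "ahv": "Prism Central",
        }
        if hypervisor is None or hypervisor.lower() not in consoles:
            raise ValueError("metro_availability needs a known hypervisor")
        return consoles[hypervisor.lower()]
    raise ValueError(f"unknown feature: {feature}")


metro_on_ahv = config_console("metro_availability", "AHV")
```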

Hope that clarifies things. If you need to, you can DM me for a peer discussion.

F>P

 

Badge +2

@StuB - 

What kind of orchestration are you referring to? Actually, I want to create protection policies with async replication and recovery plans, and ensure my entities will run automatically in DR if my primary goes down. Can I achieve this with my Starter license?

-JAI

Badge +2

@hscavetta - Yes! But that is only if we use protection domain-based DR. If we are going to use Disaster Recovery (Leap), then we need to deploy PC.

Badge +1

Actually, you don't NEED to deploy Prism Central to configure replication between two Nutanix clusters; you just need to configure the Remote Sites and Protection Domains in each Prism Element. But it's highly advisable to have at least a single PC instance as a single pane of glass for managing all Nutanix clusters and features.

Userlevel 5
Badge +6

Thanks. If I want to use the Disaster Recovery (Leap) solution in my environment between two different sites, how many Prism Central instances do I need to deploy? And what license do I need to use the DR feature?

-jai

A Starter license will work for async replication; for NearSync and sync replication you will need a Pro license plus the Advanced Replication license.

One PC will work too.


Userlevel 1
Badge +1

Hi, technically yes, both points are valid. You can use protection policies for async replication with any license, but you don't get the advanced features such as orchestration. You can also operate Disaster Recovery from a single PC to which both clusters are registered. Obviously, if you are going with a single PC, it would need to be running on the target cluster, not the source one.

Badge +2

Thanks for your reply. I can see that disaster recovery will work even with a Starter license. Only if we want Advanced Orchestration with Runbook Automation do we need a Pro license with the Advanced Replication add-on, or an Ultimate license. Can you validate my point, please?

 

Also, having two PCs, one at each site, is best practice. But it will also work if we have one PC to which both clusters are registered. Please validate this point as well.