Hi everyone
I have to deploy a Metro Availability architecture for two AHV clusters (no ESXi involved). I have several questions, so I’d appreciate some feedback from those who have deployed similar setups.
The main goal is to achieve the lowest possible RPO and RTO between two datacenters with a robust and simple architecture. The two clusters will be in separate buildings about 100 m apart, connected with FC cables (<5 ms latency).
As far as I know, only ESXi with vMSC supports true RTO = 0.
However, since this environment is AHV-only, my understanding is that in a failover event the VMs from Site A would need to be powered on in Site B rather than running continuously on both sides... Is that correct?
That being said, these are the possible scenarios for deploying Metro Availability between both sites:
- Two AHV clusters in active–active mode, without an external Witness
- Two AHV clusters in active–active mode, with an external Witness (this one looks best to me)
- Two AHV clusters in active–passive mode, without an external Witness
- Two AHV clusters in active–passive mode, with an external Witness
On top of that, I’m also evaluating how to deploy Prism Central in this scenario. These are the options I’m considering:
- A single Prism Central managing both clusters (I assume this only makes sense without an external Witness, but it's very fragile because it has a single point of failure)
- One Prism Central per cluster (I assume this only makes sense with an external Witness, and it seems to be the most robust: if one cluster fails, the other site still has its own PC)
- A single PC with async DR to the second site (the idea is to simplify the architecture by managing a single PC for both clusters; if the cluster hosting the PC fails, the PC can be recovered on Site B)
- One Prism Central per cluster, plus Nutanix Central on-prem to unify the PCs and avoid a single point of failure (as far as I know, since PC 7.3 you can add an additional management layer with Nutanix Central on-prem to manage several PCs, and I think it removes the need for an external Witness)
I’m particularly interested in what you consider the most stable and practical approach when running Metro Availability across two sites.
If you have experience running MA with AHV, especially around Prism Central design choices, I’d really appreciate your opinions, recommendations, or pitfalls to avoid.
Thanks in advance!
