Connectivity to the remote site is not normal. What to do?

  • 4 March 2020

The alert text “Connectivity to remote site is not normal” may show up for a number of different reasons, so it may not be immediately clear what to do about it.

If you recently added a remote site configuration in Prism, the cause could be a configuration problem. If Async DR replication had been working for several days before the error appeared, the cause is more likely a network issue or the state of the remote site.

Since the cause might only be temporary, the first thing to do is to re-run the check. To see the full detail, I recommend running it from the CLI on a CVM:

nutanix@cvm$ ncc health_checks data_protection_checks remote_site_checks remote_site_connectivity_check

If you now see a PASS result and no corrective action was taken, the issue was temporary. In that case, check the alert timestamp and find out whether the network or remote site was undergoing maintenance or an upgrade, or showing some other problem, at the time of the alert. High ping latency is one possible cause, so also look at cluster and CVM utilization levels around the time of the alert: I/O, CPU or RAM contention can raise ping latency.
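To put a number on ping latency, a small helper can pull the average round-trip time out of ping's summary line. This is only a sketch: `avg_rtt_ms` and `latency_status` are hypothetical helper names, the 150 ms default is an arbitrary placeholder (not the threshold NCC uses), and the parsing assumes the `min/avg/max` summary line printed by Linux iputils `ping`.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; not part of NCC or any Nutanix tooling.

# Read ping output on stdin and print the average RTT in ms.
# Expects a summary line such as:
#   rtt min/avg/max/mdev = 0.321/0.402/0.512/0.071 ms
avg_rtt_ms() {
  awk -F'/' '/min\/avg\/max/ { print $5 }'
}

# Compare an average RTT against a threshold in ms.
# The default of 150 is an arbitrary placeholder, not an NCC value.
latency_status() {
  local avg="$1" threshold="${2:-150}"
  if [ -z "$avg" ]; then
    # No summary line was parsed, for example because ping failed outright.
    echo "UNKNOWN"
    return 1
  fi
  awk -v a="$avg" -v t="$threshold" \
    'BEGIN { if (a + 0 > t + 0) print "HIGH"; else print "OK" }'
}

# Example (REMOTE_IP is a placeholder for the remote cluster address):
#   latency_status "$(ping -c 5 "$REMOTE_IP" | avg_rtt_ms)"
```

Running this a few times during busy and quiet periods gives a rough baseline to compare against when the alert fires again.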

If you see a FAIL result and this is a new setup, the details given should help narrow down the cause. Note that ping is tested first; if ping fails, further tests are not run, because other connectivity is assumed to be impossible. Ensure ping (ICMP) is allowed across any firewall between the sites.
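The same ping-first ordering can be reproduced by hand when testing from a CVM. The sketch below is not how NCC implements the check. `check_remote` and `tcp_open` are hypothetical names, the port test relies on bash's `/dev/tcp` redirection and coreutils `timeout`, and the ping flags assume Linux iputils. The ports to test are passed as arguments; 2009 and 2020 in the usage comment are my assumption about the remote site service ports, so confirm them against the KB article for your AOS version.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a ping-first connectivity test; not NCC code.

# Ping command is overridable; the default assumes Linux iputils flags.
PING_CMD="${PING_CMD:-ping -c 3 -W 2}"

# Return 0 if a TCP connection to host $1 on port $2 can be opened.
tcp_open() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Usage: check_remote HOST [PORT...]
# Pings HOST first and only tests the ports if ping succeeds.
check_remote() {
  local host="$1"; shift
  if ! $PING_CMD "$host" >/dev/null 2>&1; then
    echo "ping FAIL: skipping port tests"
    return 1
  fi
  echo "ping PASS"
  local port rc=0
  for port in "$@"; do
    if tcp_open "$host" "$port"; then
      echo "port $port PASS"
    else
      echo "port $port FAIL"
      rc=1
    fi
  done
  return "$rc"
}

# Example (REMOTE_IP is a placeholder; the ports are an assumption):
#   check_remote "$REMOTE_IP" 2009 2020
```

If ping fails here but you know the remote cluster is up, a firewall dropping ICMP is the first thing to rule out.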

If you see a FAIL result and replication was previously working, first review any recent changes and maintenance tasks to see whether something could have affected connectivity. After that, test basic network connectivity and validate the health of the remote cluster. Because the check validates that the services on the remote site are reachable, any disruption to cluster services or cluster stability on the remote site could be the cause.

For more detail on testing and resolution steps, review the KB article “NCC Health Check: remote_site_connectivity_check”.

