Solved

Remote site replication tuning


Userlevel 2
Badge +5
What options are available within Nutanix to speed up replication to a remote site? The best I can get on a single stream is about 20 MB/s (160 Mbps), with an aggregate of about 60 MB/s (480 Mbps) spread over several streams. However, we have a 10 Gbps link between sites, so I expected better throughput.

Best answer by penguindows 8 June 2018, 16:35

Yes, I see dlink7's reply. Thanks @dlink7 for the good info on expected performance under ideal conditions. That does give me a good baseline.

Our issue is definitely latency. Our remote site is across the continental US, and each VM has 30 TB (10 TB used) on a single disk. As you can imagine, this creates a bottleneck for us. Replications with default settings are not meeting RPO.

Our AOS version is 5.5.2.

I have found (with help from support) these tunable settings within the cluster:


code:
nutanix@NTNX:~$ python /home/nutanix/serviceability/bin/edit-aos-gflags | grep stargate_
2018-06-08 10:00:09 INFO zookeeper_session.py:110 edit-aos-gflags is attempting to connect to Zookeeper
stargate_cerebro_replication_max_rpc_vblocks = 16 #default 4
stargate_cerebro_replication_max_rpc_data = 4194304 #default 1048576
stargate_cerebro_max_outstanding_vdisk_replication_rpcs = 16 #default 4
stargate_cerebro_replication_param_multiplier = 32 #default 16
stargate_vdisk_read_extents_max_outstanding_egroup_reads = 6 #default 3
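
To see at a glance how far each flag was moved from stock, the listing can be parsed and each tuned value divided by its default. This is a minimal sketch: `parse_gflags` and `scale_factors` are hypothetical helper names, and it assumes the `#default N` annotations in the listing above are accurate.

```python
import re

# Listing copied from the output above; the "#default N" parts are the
# poster's annotations, not something this script verifies.
LISTING = """\
stargate_cerebro_replication_max_rpc_vblocks = 16 #default 4
stargate_cerebro_replication_max_rpc_data = 4194304 #default 1048576
stargate_cerebro_max_outstanding_vdisk_replication_rpcs = 16 #default 4
stargate_cerebro_replication_param_multiplier = 32 #default 16
stargate_vdisk_read_extents_max_outstanding_egroup_reads = 6 #default 3
"""

# Matches lines of the form: name = value #default N
LINE_RE = re.compile(r"^(\w+)\s*=\s*(\d+)\s*#default\s*(\d+)\s*$")


def parse_gflags(text):
    """Return {flag_name: (tuned_value, default_value)} for matching lines."""
    flags = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            name, tuned, default = match.groups()
            flags[name] = (int(tuned), int(default))
    return flags


def scale_factors(flags):
    """Return {flag_name: tuned / default} for each parsed flag."""
    return {name: tuned / default for name, (tuned, default) in flags.items()}


if __name__ == "__main__":
    for name, factor in scale_factors(parse_gflags(LISTING)).items():
        print(f"{name}: {factor:g}x the default")
```

Three of the five flags are raised to four times their default and the other two are doubled.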


I am putting together a how-to on tuning these parameters, but the gist of it is editing with...

code:
nutanix@NTNX:~$ python /home/nutanix/serviceability/bin/edit-aos-gflags --service=stargate


...then restarting Stargate on each CVM.

The restarts went perfectly with a sleep 60 in there: we had no failures on running replications, and our throughput went up ~4x as expected. Our replications, which had been running for 12 days, began to finish.
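
The "one CVM at a time, with a pause" pattern can be sketched as below. This is only an outline under assumptions: it assumes the 60-second sleep sits between CVMs, `rolling_restart` and `restart_one` are hypothetical names, and the actual restart command is deliberately left to the caller because the post does not state it.

```python
import time


def rolling_restart(cvm_ips, restart_one, pause_s=60, sleep=time.sleep):
    """Call restart_one(ip) for each CVM in turn, pausing between CVMs.

    restart_one is supplied by the caller (hypothetical); this sketch does
    not know or run any Nutanix command itself.
    """
    done = []
    for index, ip in enumerate(cvm_ips):
        restart_one(ip)
        done.append(ip)
        # Pause after every CVM except the last so services can settle.
        if index < len(cvm_ips) - 1:
            sleep(pause_s)
    return done


if __name__ == "__main__":
    # Dry run: print instead of restarting, and skip the real sleep.
    example_ips = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]  # made-up addresses
    rolling_restart(
        example_ips,
        restart_one=lambda ip: print(f"would restart stargate on {ip}"),
        sleep=lambda seconds: print(f"would wait {seconds} s"),
    )
```

Passing the restart action and the sleep function in as arguments keeps the loop testable as a dry run before anything touches a live cluster.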

My ultimate solution is twofold. First, the above settings accelerate each stream. Second, we are working to break our 30 TB disk into 10 x 3 TB disks.

RE: going off stock: I think that as Nutanix grows into more environments (and it will; the technology is amazing and I foresee continued adoption), there are going to be many more flavors of environment that stock options just won't solve. I expect that in the next few years Nutanix will pivot to a more open position. I'd love to see some of these settings within reach of the Prism GUI, along with better man/info pages on what each individual setting and switch does.

That being said, your warning about going off stock is noted and appreciated. I understand that Nutanix attempts to tune AOS as well as possible out of the box, and that tuning these settings can have an impact (sometimes bad, sometimes good). Therefore, I will only change a setting when I need a specific change in behavior to achieve a desired result.

This topic has been closed for comments

6 replies

Userlevel 4
Badge +20
Hi, what AOS version are you rocking?

In Acropolis, every node can replicate four files, up to an aggregate of 100 MB/s at one time. Thus, in a four-node configuration, the cluster can replicate 400 MB/s or 3.2 Gb/s.
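
The arithmetic behind that figure, as a small sketch. `cluster_replication_rate` is a hypothetical helper; it assumes the 100 MB/s per-node ceiling quoted above and decimal units (1 Gb = 1000 Mb).

```python
def cluster_replication_rate(nodes, per_node_mb_s=100):
    """Return (aggregate MB/s, aggregate Gb/s) for a cluster of `nodes` nodes.

    Assumes every node can reach the same per-node ceiling at the same time.
    """
    total_mb_s = nodes * per_node_mb_s
    total_gbit_s = total_mb_s * 8 / 1000  # megabytes -> megabits -> gigabits
    return total_mb_s, total_gbit_s


if __name__ == "__main__":
    for nodes in (4, 8):
        mb_s, gbit_s = cluster_replication_rate(nodes)
        print(f"{nodes} nodes: {mb_s} MB/s = {gbit_s:.1f} Gb/s")
```

This is an upper bound under ideal conditions, not a prediction for any particular link.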

What can cause this number to go down? Other tasks on the cluster (such as Curator running if the snapshot chain is too long) and bad latency.

If you're still meeting your RPO, I wouldn't change anything. Going off stock will come back to haunt you later.
Userlevel 7
Badge +35
Hi @penguindows

Did you see the reply from @dlink7?
Userlevel 4
Badge +20
You'll be happy to know that eng had a project to make these settings more dynamic. Glad you got sorted.
penguindows wrote:

Yes, I see dlink7's reply. Thanks @dlink7 for the good info on expected performance under ideal conditions. That does give me a good baseline.

Our issue is definitely latency. Our remote site is across the continental US, and each VM has 30 TB (10 TB used) on a single disk. As you can imagine, this creates a bottleneck for us. Replications with default settings are not meeting RPO.

Our AOS version is 5.5.2.

I have found (with help from support) these tunable settings within the cluster:


code:
nutanix@NTNX:~$ python /home/nutanix/serviceability/bin/edit-aos-gflags | grep stargate_
2018-06-08 10:00:09 INFO zookeeper_session.py:110 edit-aos-gflags is attempting to connect to Zookeeper
stargate_cerebro_replication_max_rpc_vblocks = 16 #default 4
stargate_cerebro_replication_max_rpc_data = 4194304 #default 1048576
stargate_cerebro_max_outstanding_vdisk_replication_rpcs = 16 #default 4
stargate_cerebro_replication_param_multiplier = 32 #default 16
stargate_vdisk_read_extents_max_outstanding_egroup_reads = 6 #default 3



I am putting together a how-to on tuning these parameters, but the gist of it is editing with...

code:
nutanix@NTNX:~$ python /home/nutanix/serviceability/bin/edit-aos-gflags --service=stargate



...then restarting Stargate on each CVM.

How did you make these changes, or where is the how-to you were creating? Thanks!
Userlevel 2
Badge +5
@Bradley4681 When you run:
code:
nutanix@NTNX:~$ python /home/nutanix/serviceability/bin/edit-aos-gflags --service=stargate

You enter a text editor with all the Stargate service gflags. Here, you can edit the settings to achieve different results.

The settings I adjusted that wound up working for me basically amounted to an increase in the number of streams per vdisk, the amount of data saturation allowed on the line, and the priority that Stargate gives to replication processing. This allowed me to overcome the high latency on my cross-US pipe.

You can find my specific setting adjustments in the block that you quoted. Also, you'll probably need a Genesis restart at a minimum.

It's worth noting a few things about my environment that made this combination of settings work without any negative impact:

  1. Homogeneous images: every guest in the entire cluster does the same type of work, is the same size, and has the same schedule. This means I can apply cluster-wide changes without adversely affecting oddball systems.
  2. Low compute demand: I have 6 guests in an 8-node cluster. I can afford to give loads of power to Nutanix services rather than to guest compute, because my guests do not have a high compute demand.
  3. Scheduled workloads: the guests here are backup media servers. They have a "busy" timeframe when guest IO is important and a "slow" timeframe when IO can be reserved for Nutanix services. Basically, we back up more at night and less during the day.
  4. High latency, but high bandwidth: the pipe between my cross-country datacenters is 2 x 10 Gbps. We have a QoS policy that limits this cluster's workload to 5 Gbps, but that is still a rather wide pipe. The latency is 74 ms.
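
Why latency matters so much on a wide pipe can be sketched with the bandwidth-delay product: a stream can move at most (data in flight) / (round-trip time). The sketch below assumes the 74 ms figure is a round-trip time and uses decimal megabytes. The function names are hypothetical, and nothing here models how Stargate actually combines its gflags.

```python
def bytes_in_flight(throughput_bytes_s, rtt_s):
    """Data that must be unacknowledged on the wire to sustain a throughput."""
    return throughput_bytes_s * rtt_s


def max_throughput(in_flight_bytes, rtt_s):
    """Upper bound on throughput for a given amount of data in flight."""
    return in_flight_bytes / rtt_s


RTT_S = 0.074          # assumption: 74 ms is the round-trip time
LINK_BITS_S = 5e9      # QoS-limited share of the link from the post
SINGLE_STREAM_B_S = 20e6  # best single-stream rate reported in the question

if __name__ == "__main__":
    link_bytes_s = LINK_BITS_S / 8
    needed = bytes_in_flight(link_bytes_s, RTT_S)
    implied = bytes_in_flight(SINGLE_STREAM_B_S, RTT_S)
    print(f"in flight needed to fill the link: {needed / 1e6:.2f} MB")
    print(f"in flight implied by one 20 MB/s stream: {implied / 1e6:.2f} MB")
```

Under that assumption, filling the 5 Gbps allowance takes roughly 46 MB in flight, versus about 1.5 MB implied by a single 20 MB/s stream. That gap is why raising the per-stream limits, or adding streams, helps on a high-latency link.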
RE: the guide: To my embarrassment, my guide is limited to some internal notes and a few wiki pages for my team. It isn't organized or cleaned up in a fashion that I'd be comfortable sharing with the community. I have high aspirations to get these notes into a form that would be useful to the community. I do need to beg for your patience, however, as new demands seem to arrive in a constant stream right now.