Installation & Configuration

Jumbo Frames and CVM Interfaces

SOLVED
Explorer

Jumbo Frames and CVM Interfaces

We have a six-node cluster (ESXi), and each node has 4 x 10G NICs. Two of those NICs we're using just for vMotion, management, and CVM traffic. We enabled Jumbo Frames on the physical switch, and enabled Jumbo Frames on both the virtual switch and the vDS, but reading through the documentation it's stated that Jumbo Frames should also be enabled on the CVM internal and external interfaces:

 

"The Nutanix CVMs must be configured for Jumbo Frames on both the internal and external interfaces. The converged network also needs to be configured for Jumbo Frames. Most importantly, the configuration needs to be validated to ensure Jumbo Frames are properly implemented end-to-end." (page 17, http://go.nutanix.com/rs/nutanix/images/Nutanix_TechNote-VMware_vSphere_Networking_with_Nutanix.pdf)

 

Does this mean we need to log on to the CVM and change the network configuration, i.e. /etc/sysconfig/network-scripts/ifcfg-eth0? We are pre-deployment; I've set the MTU on these interfaces and tested with pings, and it works properly, however the internal interface seems to reset to 1500 each time the CVM is rebooted. I feel like Jumbo Frames are recommended as a best practice, but it's not fully explained which interfaces should be set. I have the same question regarding vSwitchNutanix: it appears it's best to set it to 9000, but since there is no physical layer, does it make a difference?
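For reference, here's roughly the change I made and how I've been validating it end to end (the interface name, file, and addresses below are just placeholders from our environment, not an official procedure):

# Appended to /etc/sysconfig/network-scripts/ifcfg-eth0 on the CVM so the
# setting should persist across reboots (rest of the file left untouched):
MTU=9000

# End-to-end check: 8972 bytes of payload + 28 bytes of IP/ICMP header = 9000,
# with the don't-fragment bit set so any hop still at 1500 MTU fails loudly.
ping -M do -s 8972 <remote CVM IP>        # from the CVM (Linux)
vmkping -d -s 8972 <remote vmkernel IP>   # from the ESXi host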


Thanks!

 

7 REPLIES
Moderator

Re: Jumbo Frames and CVM Interfaces

Hey @nsegalle, thanks for reaching out. Following up, as I see no one has responded to you here yet.

 

Honestly, we're big fans of the KISS principle here at Nutanix, so unless your application or technical requirements are driving jumbo frames, most folks don't actually need them.

 

Non-scientifically, I'd wager a large bet that the vast, vast majority of the Nutanix install base is not using jumbo frames, and we've got some very high-performance customers on non-jumbo setups.

Jon Kohler | Principal Architect, Nutanix | Nutanix NPX #003, VCDX #116 | @JonKohler
Please Kudos if useful!
Journeyman

Re: Jumbo Frames and CVM Interfaces

Jon,

 

I think Nick and I are going to try a little experiment: take everything back down to 1500 MTU and see if there is any difference. I am actually really surprised, because all the best practices I've read say to use 9000 for storage-type traffic... this is large data and I would expect it to benefit from jumbo frames...

 

Rob

Moderator

Re: Jumbo Frames and CVM Interfaces (Accepted Solution)

@Bladerunner - No doubt that jumbo frames have a place, especially in very high-throughput situations like high-end databases. I don't disagree with you there.

 

At our current code levels, it really only makes a difference in corner and edge cases. For most customers, the default configuration will have more than enough performance. We've even got high-end database customers using the defaults with great success. Generally we only go down the jumbo route when we're tuning for those last few percentage points of performance on ultra-high-end workloads.

 

Keep in mind, the days when storage traffic automatically meant jumbo traffic are gone with Nutanix, as Nutanix's data locality greatly reduces the amount of chatty bandwidth chewing up the wire.

 

 

My best analogy here is to ask what the read/write ratio is on your current storage system. Most people will say something like 60% reads or 70% reads, or something along those lines.

 

Shift that workload over to Nutanix and, given that Nutanix has an extremely high focus on data locality, those 60% or 70% (or whatever the percentage) of reads *don't even hit the network*. There are no switch-side CPU cycles to optimize and no switch ASIC buffers to manage; the VM just reads via the local PCI bus, sometimes right from DRAM. The data doesn't even leave the server.

 

Obviously writes are always going to hit the network to some degree, although one copy is always written locally; the replication traffic with RF2 means one copy goes remote, and with RF3 two copies go remote. In this example, that means only 30-40% of your traffic ever hits the network, and it's basically got a clear highway to do so, since the read traffic is all local and never on the wire.
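If it helps, here's a back-of-the-envelope version of that math; the 70/30 split and RF values are just example numbers, not measurements from any particular system:

#!/bin/sh
# Rough model only: reads are served from the local node, and each write
# sends (RF - 1) extra copies across the wire. Numbers are hypothetical.
READ_PCT=70                      # assumed read share of the I/O mix
WRITE_PCT=$((100 - READ_PCT))    # writes = 30%
RF=2                             # RF2 = one remote copy per write
REMOTE_COPIES=$((RF - 1))
echo "Reads hitting the wire : 0% (served locally)"
echo "Writes hitting the wire: ~$((WRITE_PCT * REMOTE_COPIES))% of the I/O mix"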

 

Meaning those switch-side CPUs have a heck of a lot of free time and the switch-side buffers are relatively empty, so jumbo frames just become "something extra" that you need to configure and manage, without adding a whole lot of business value, IMHO.

 

 

RE Best Practices

There's a reason I personally call them recommended practices instead: "best practices" implies an immutable edict for all time. We wrote the vSphere networking guide back in 2014, when jumbo frames made a bit more sense for our product. We've made more enhancements than I can count since then, and default, non-jumbo performance is pretty darn good now.

 

We've actually updated our vSphere networking guide (it's waiting to go through publishing right now), and you'll see the verbiage around jumbo frames change.

Jon Kohler | Principal Architect, Nutanix | Nutanix NPX #003, VCDX #116 | @JonKohler
Please Kudos if useful!
Explorer

Re: Jumbo Frames and CVM Interfaces

Jon,

 

First of all, thank you for the great reply. Rob and I are setting up the Nutanix cluster and are in pre-production, but we're about to move SQL servers over next week. We went back and forth over jumbo frames, and earlier today we actually had the network team remove them from the switch and set everything back to default. Our final decision was based on the fact that while jumbo frames might be of some benefit, albeit a small one, the complexity of the setup was not worth the headache, especially since there are plans to expand the cluster within the next year.

 

Thanks again for the insight, it was just what I needed to read!

 

-Nick

Journeyman

Re: Jumbo Frames and CVM Interfaces

Thanks Jon... appreciate your insight.

 

Most of my concern centers on the migration of one server. We are migrating many servers via live Storage vMotion from HP hosts (FC-connected to XIV and v7K) to Nutanix over the 10Gb network. This beast has over 12TB of disk and almost a full TB of RAM. That being said, we are going to follow your advice, as I am not seeing any real benefit in enabling jumbo frames.

 

Cheers,

 

Rob

Journeyman

Re: Jumbo Frames and CVM Interfaces

Hi,

 

Just to clarify here: if I had a 4-node Nutanix block with 12 drives in it and a replication factor of 2, I could have all 12 drives in the same node fail and still have all data available, correct? In my VMware cluster, each VMDK would have a copy on another node within the cluster, so if every disk fails within a node, you would still have a copy of the data on another node's disks. Is this correct?

 

I understand this would probably never happen; I just want to make sure I am fully understanding the data locality within a Nutanix cluster.

Guardian

Re: Jumbo Frames and CVM Interfaces

That's correct. Every write a VM makes is synchronously accepted and written to SSD by another node in the cluster. With three or more blocks (groups of nodes), the cluster makes every attempt to place that second copy in a different block. All of that means that if you lost a node full of drives, the data can still be read from the remaining nodes in the cluster. The solution will also automatically start to re-protect that data, as long as capacity remains.
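If it helps to picture it, here's a toy sketch (emphatically not the real placement algorithm) showing the idea that with RF2 the second copy always lives on a different node, so a whole node's worth of drives can fail and every piece of data still has a surviving copy:

#!/bin/sh
# Toy illustration only: 4 nodes, RF2, replica always on a different node.
NODES=4
for extent in 1 2 3 4 5 6; do
  first=$(( (extent % NODES) + 1 ))    # node holding the first copy
  second=$(( (first % NODES) + 1 ))    # a different node holding the replica
  echo "extent $extent: copies on node $first and node $second"
done
# Lose node 2 (and all of its drives) and every extent above still has one
# copy left on another node; the cluster then re-replicates to restore RF2.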

 

Does that help?