Disk I/O Latency on a Nutanix Cluster

  • 16 September 2020

The Prism interface lets you investigate disk I/O latency, which commonly raises the following questions.


Note: Nutanix recommends that maximum latency readings should not be used as a measure of cluster performance and health. Average latency is a useful measure of cluster performance and health.

  • What should be the average latency on a production cluster?

  • What should be the maximum latency?

  • At what point is the latency too high?

  • How to investigate high latency?

Consider the following for latency investigations.

  • The end-user impact for any performance investigation. If the impact is not measurable by the end-user, then any investigation of performance statistics is going to reveal normal and healthy cluster operations.

  • The factors on which the investigation depends: VM combinations, traffic type at the time, read or write size, sequential versus non-sequential access, and read versus write.

Latency Variables in a Nutanix Cluster

The following points describe what affects latency on a Nutanix cluster.

  • Nutanix also provides all-flash nodes, but the focus of this KB is on the two-tier (SSD and HDD) nodes. The two-tier design aims to keep frequently read data in the hot (SSD) tier, and Information Life Cycle Management (ILM) promotes data to and demotes data from the hot tier. This provides a cost-effective solution that has a variable latency response.

  • Extent store: HDD and SSD together make up the extent store. However, some portion of the SSDs is used for the Oplog.

  • Oplog: This is used for random writes. Data is written here temporarily so that it can be acknowledged quickly, and is eventually drained to the extent store.

  • Clusters that are correctly sized have a Working Set Size (WSS) that fits within the SSD tier. This ensures that the commonly accessed data on the cluster is available from the SSD. If ILM is moving data from the hot tier to the cold tier and back, it implies that the cluster is undersized, and higher latencies will be experienced because more data reads hit the cold tier.

  • Data that is read from the cold tier (HDD - spinning disk) will have higher latency than the data that is read from the hot tier.

  • Data writes on a Nutanix cluster are not acknowledged back to the VM until the data is written to two nodes (if the default redundancy factor 2 (RF2) configuration is used) in the cluster. This introduces some latency compared to a single, local write.

  • Non-sequential data writes are small and time sensitive. They are normally candidates for writing to the hot tier. Non-sequential (random) writes are first written to the Oplog and eventually moved to the extent store.

  • Sequential writes skip the Oplog if more than 1.5 MB of writes are outstanding. In that case, the data is written directly to the extent store.

  • Write size has a large impact on the latency of the write. A 1 MB write has much higher latency than an 8 KB write.
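
The write-path rules above can be sketched as a small decision function. This is a simplified model for illustration only, not Nutanix code: the names `Write`, `write_target`, and `rf2_ack_latency_ms` are hypothetical, 1.5 MB is assumed to mean 1.5 × 1024 × 1024 bytes, and the two RF2 copies are assumed to be written in parallel.

```python
from dataclasses import dataclass

# From the text: sequential writes bypass the Oplog once more than
# 1.5 MB of writes are outstanding. Interpreting MB as 1024 * 1024
# bytes is an assumption of this sketch.
SEQUENTIAL_BYPASS_BYTES = int(1.5 * 1024 * 1024)


@dataclass
class Write:
    size_bytes: int
    sequential: bool
    outstanding_bytes: int  # hypothetical field: outstanding write data


def write_target(write: Write) -> str:
    """Return 'oplog' or 'extent_store' for a write (simplified model)."""
    if write.sequential and write.outstanding_bytes > SEQUENTIAL_BYPASS_BYTES:
        return "extent_store"
    # Random writes, and sequential writes below the threshold, are
    # acknowledged from the Oplog and drained to the extent store later.
    return "oplog"


def rf2_ack_latency_ms(local_ms: float, remote_ms: float) -> float:
    """Latency seen by the VM when both RF2 copies must be written.

    Assumes the local and remote writes run in parallel, so the slower
    copy determines when the acknowledgement is returned.
    """
    return max(local_ms, remote_ms)
```

For example, `write_target(Write(8 * 1024, False, 0))` returns `"oplog"`, while a sequential write with 2 MB outstanding returns `"extent_store"`.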


Average Latency versus Maximum Latency

The factors noted in the Latency Variables in a Nutanix Cluster section introduce periods of high latency. For example, a short write to HDD can cause a nearly instantaneous spike.
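
To illustrate why the maximum is a poor health measure, the following sketch compares the average and the maximum of a made-up series of latency samples. The sample values are hypothetical and chosen only to show the effect of a single cold-tier spike.

```python
from statistics import mean

# Hypothetical latency samples in milliseconds: 59 readings of steady
# hot-tier I/O followed by one short cold-tier spike.
samples_ms = [2.0] * 59 + [180.0]

average_ms = mean(samples_ms)                   # barely moved by the spike
maximum_ms = max(samples_ms)                    # determined by the spike alone
spike_free_average_ms = mean(samples_ms[:-1])   # average without the spike

print(f"average: {average_ms:.2f} ms")          # average: 4.97 ms
print(f"maximum: {maximum_ms:.2f} ms")          # maximum: 180.00 ms
print(f"average without spike: {spike_free_average_ms:.2f} ms")  # 2.00 ms
```

One spike raises the maximum by two orders of magnitude, while the average stays well inside the 1 to 10 millisecond range.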


What should be the average latency on a production cluster?

It depends on the type of the workload on the cluster, but most workloads should see average latency reported at 1 to 10 milliseconds with ranges of 10 to 20 milliseconds due to particular traffic patterns (for example, sequential large-block writes).


At what point is latency too high?

Ideally, the answer to this question is "at the point that end-users are reporting slow response". More precisely, if you are concerned about higher latency, investigate whether it repeats (if it is intermittent or sporadic) and work with end-users to see if it has any impact on them.


Periods of high latency:

  • Latency that is above 10 milliseconds most of the time, or above 20 milliseconds for minutes at a time, is a candidate for further investigation.

  • Very high spikes (high hundreds to thousands of milliseconds of latency) are very likely to have end-user impact and must be investigated.

However, instantaneous, infrequent, non-periodic spikes in latency beyond 20 milliseconds (into the low hundreds of milliseconds) are much more likely to be caused by normal reads or writes to the cold tier. If there is no end-user impact and no correlation to known VM or network events, then these spikes should be ignored.


Note: The NCC health check for the VM I/O latency will report problems at 200 milliseconds or higher.
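
The guidance in this section can be turned into a rough triage function. This is only a sketch under stated assumptions: the function name `classify_latency` is hypothetical, "most of the time" is taken to mean more than half of the samples, "minutes at a time" is taken to mean at least 120 seconds, "high hundreds of milliseconds" is taken to mean 500 ms or more, and the 30-second sample interval is an arbitrary default. Only the 10 ms and 20 ms values come from the text.

```python
from typing import Sequence

MOSTLY_ABOVE_MS = 10.0       # from the text: above 10 ms most of the time
SUSTAINED_ABOVE_MS = 20.0    # from the text: above 20 ms for minutes
VERY_HIGH_SPIKE_MS = 500.0   # assumption: "high hundreds" of milliseconds
SUSTAINED_SECONDS = 120.0    # assumption: "minutes at a time"


def classify_latency(samples_ms: Sequence[float], interval_s: float = 30.0) -> str:
    """Classify a series of average-latency samples taken every interval_s seconds."""
    if not samples_ms:
        raise ValueError("no latency samples supplied")

    if max(samples_ms) >= VERY_HIGH_SPIKE_MS:
        return "investigate: very high spike"

    # Longest consecutive run of samples above 20 ms.
    longest_run = current_run = 0
    for value in samples_ms:
        current_run = current_run + 1 if value > SUSTAINED_ABOVE_MS else 0
        longest_run = max(longest_run, current_run)
    if longest_run * interval_s >= SUSTAINED_SECONDS:
        return "investigate: sustained above 20 ms"

    above_10 = sum(1 for value in samples_ms if value > MOSTLY_ABOVE_MS)
    if above_10 > len(samples_ms) / 2:
        return "investigate: mostly above 10 ms"

    if max(samples_ms) > SUSTAINED_ABOVE_MS:
        # Isolated spike: check for end-user impact before acting on it.
        return "likely normal: isolated spike"
    return "normal"
```

A result of "likely normal: isolated spike" still needs the check described above: confirm that there is no end-user impact and no correlation with known VM or network events.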


How to investigate high latency?

The following are some of the methods that you can use to investigate high latency. Use the methods that suit your circumstances:

  • Checking the WSS (available in Prism) to see if the working set is too large for the hot tier.

  • Creating a graph in Prism to check the read latency and the write latency separately.

  • Considering the network:

    • Are the hosts connected to 10 GbE line-rate switches (as required for most Nutanix clusters)? In particular, are Cisco Fabric Extender switches used? If so, see KB 1612 and then replace the Fabric Extender with a 10 GbE line-rate switch.

    • Are there network errors (for example, Rx errors on host NICs or switch interfaces)? Does a change in cabling stabilize the error counts?

  • Considering the initial sizing plan for the cluster.

    • How many VMs was the cluster specified to run? What size and combinations of VMs? Are you running more VMs than what the cluster was configured for?

  • Correlating latency events with activities on the cluster:

    • Anti-virus scans on a large number of (or all) VMs on the cluster at the same time.
      Note: The best practice for running anti-virus on a Nutanix cluster is to stagger scans across the VMs.

    • Database batch jobs. Databases that read large amounts of cold data should be isolated from other VMs on a different node (wherever possible) so that their hot-tier requirements do not interfere with the other VMs.

    • Protection domain replications. The cluster should not normally allow these background tasks to interfere with the VM I/O requirements.

    • Other backup tasks

    • New VM creations

  • Raising a ticket with Nutanix Support

    • Nutanix Support can investigate the performance issues with you. If you have any unexplained latency issue, especially anything that is having an end-user impact, log a case to discuss with Nutanix Support.
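
Correlating latency events with cluster activities, as described in the list above, can be sketched as a simple timestamp overlap check. The function name `correlate_spikes`, the event names, and the data layout are hypothetical. The spike times and activity windows would have to be collected by you (for example, from Prism graphs and your job schedules).

```python
from datetime import datetime
from typing import Dict, List, Tuple

Window = Tuple[datetime, datetime]  # (start, end) of a known activity


def correlate_spikes(
    spikes: List[datetime],
    events: Dict[str, List[Window]],
) -> Dict[datetime, List[str]]:
    """Map each latency spike to the names of activities running at that time."""
    result: Dict[datetime, List[str]] = {}
    for spike in spikes:
        result[spike] = [
            name
            for name, windows in events.items()
            if any(start <= spike <= end for start, end in windows)
        ]
    return result


# Hypothetical example data.
events = {
    "anti-virus scan": [(datetime(2020, 9, 16, 1, 0), datetime(2020, 9, 16, 2, 0))],
    "backup": [(datetime(2020, 9, 16, 1, 30), datetime(2020, 9, 16, 3, 0))],
}
spikes = [datetime(2020, 9, 16, 1, 45), datetime(2020, 9, 16, 12, 0)]
matches = correlate_spikes(spikes, events)
```

Here the 01:45 spike overlaps both the anti-virus scan and the backup, while the 12:00 spike matches no known activity and is a candidate for other explanations, such as the cold-tier reads discussed earlier.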

For more information, please follow: https://portal.nutanix.com/page/documents/kbs/details?targetId=kA03200000098bBCAQ

