Solved

Criteria for Periodic Inspections of Nutanix Clusters

September 26, 2022
7 replies
174 views

junsu
Trailblazer

Hello,

I'm a beginner with Nutanix.

I regularly inspect a client's cluster.

I run the NCC health checks, review newly raised alerts, look up the relevant KB articles and documentation,
and run commands such as the following from a CVM:
cs | grep -v UP
gs
df -h
nodetool -h 0 ring

===================================================================

I also check I/O bandwidth usage and disk IOPS.

Is there a standard for judging whether these two metrics are at a safe level?

Each cluster has a different structure and environment, but I would like to set a reference point. How should I approach it?

====================================================================

Thank you for reading this long comment.

This topic has been closed for replies.

mikkisse
Vanguard
September 26, 2022

Hello
The best advice I can give is to configure alert policies and receive notifications by email. It's also possible to send a daily digest, which will list the actual cluster problems (if any exist).

junsu
Author
Trailblazer
September 26, 2022

At what level is I/O bandwidth usage generally considered risky?

At what point is disk IOPS considered to be at a risky level?

ㅠㅠ

junsu
Author
Trailblazer
Answer
October 7, 2022

https://portal.nutanix.com/page/documents/kbs/details?targetId=kA03200000098bBCAQ

I think this KB article gives at least a rough hint.

Kcmount
Vanguard
October 8, 2022

At what level is I/O bandwidth usage generally considered risky?

At what point is disk IOPS considered to be at a risky level?

ㅠㅠ

This is a very subjective question, I'm afraid; the answer depends on workload and hardware.

I have some clusters running at 70k IOPS with 1 ms latency and others at 5k IOPS with 3 ms latency. Both are healthy and working fine.

I'd recommend establishing baselines for what is 'normal' on each of your clusters and investigating deviations from them. This way, I found a noisy VM that was generating lots of writes and pushing up the average latency well before it became a problem.
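
To make the baseline idea concrete, here is a minimal sketch in Python. It is only an illustration, not a Nutanix tool: the sample values are made up, the function names `build_baseline` and `find_deviations` are hypothetical, and the "more than 3 standard deviations from the mean" rule is an assumed starting point that you would tune per cluster. It assumes you have already exported per-interval values for one metric (average latency, IOPS, or bandwidth) by whatever means you normally use.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarise historical samples of one metric as (mean, standard deviation)."""
    return mean(samples), stdev(samples)


def find_deviations(baseline, new_samples, k=3.0):
    """Return (index, value) pairs more than k standard deviations from the baseline mean.

    k=3.0 is an assumed default, not a vendor recommendation.
    """
    avg, sd = baseline
    return [(i, v) for i, v in enumerate(new_samples) if abs(v - avg) > k * sd]


# Hypothetical average-latency samples in milliseconds for one cluster.
history_ms = [1.0, 1.1, 0.9, 1.2, 1.0, 0.95, 1.05, 1.1]
today_ms = [1.0, 1.15, 2.4, 1.05]

baseline = build_baseline(history_ms)
outliers = find_deviations(baseline, today_ms)
print(outliers)  # [(2, 2.4)]
```

Because the baseline is built per cluster, the 70k IOPS cluster and the 5k IOPS cluster each get their own definition of 'normal', and only a change relative to that cluster's own history is flagged for investigation.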