Criteria for Periodic Inspections of Nutanix Clusters | Nutanix Community
Solved

Criteria for Periodic Inspections of Nutanix Clusters

  • September 26, 2022
  • 7 replies
  • 88 views

junsu
  • Trailblazer
  • 38 replies

Hello,

I'm a beginner with Nutanix.

I regularly inspect a client's cluster.

I run the NCC health check, review newly raised alerts, look up the relevant KB articles and documentation, and run commands such as the following from a CVM:

cs | grep -v UP
gs
df -h
nodetool -h 0 ring

===================================================================

I also check I/O bandwidth usage and disk IOPS.

Is there a standard for judging whether these two metrics are at a safe level?

Each cluster has a different structure and environment, but I would like to set a reference point. How should I approach this?

====================================================================

Thank you for reading this long comment.

Best answer by junsu

https://portal.nutanix.com/page/documents/kbs/details?targetId=kA03200000098bBCAQ

I think this KB article gives a rough hint.

 


7 replies

mikkisse
  • Vanguard
  • 108 replies
  • September 26, 2022

Hello
The best advice I can give is to configure alert policies and receive notifications by email. It is also possible to send a daily digest, which will list current cluster problems, if any exist.

 


junsu
  • Author
  • Trailblazer
  • 38 replies
  • September 26, 2022

At what level does I/O bandwidth usage typically become a risk?

At what level is disk IOPS considered a risk?

ㅠㅠ


junsu
  • Author
  • Trailblazer
  • 38 replies
  • Answer
  • October 7, 2022

Kcmount
  • Vanguard
  • 367 replies
  • October 8, 2022
junsu wrote:

At what level does I/O bandwidth usage typically become a risk?

At what level is disk IOPS considered a risk?

ㅠㅠ

I'm afraid this is very subjective and depends on the workload and hardware.

I have some clusters running at 70k IOPS with 1 ms latency and others at 5k IOPS with 3 ms latency. Both are healthy and working.

I'd recommend establishing a baseline of what is 'normal' for each of your clusters and investigating deviations from it. This is how I found a noisy VM generating lots of writes and pushing up the average latency, well before it became a problem.
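
A minimal sketch of the baseline idea above, in Python. It assumes latency samples have already been exported from the cluster by some means. The sample values, the function names, and the 3-sigma cutoff are illustrative assumptions, not Nutanix recommendations.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarise a window of known-normal samples as (mean, standard deviation)."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to build a baseline")
    return mean(samples), stdev(samples)


def flag_deviations(baseline, readings, sigmas=3.0):
    """Return readings more than `sigmas` standard deviations from the baseline mean."""
    avg, sd = baseline
    return [r for r in readings if abs(r - avg) > sigmas * sd]


# Made-up average latency samples (ms) from a period when the cluster felt normal.
normal_latency_ms = [1.0, 1.1, 0.9, 1.2, 1.0, 0.95, 1.05, 1.1]
baseline = build_baseline(normal_latency_ms)

# Made-up new readings: the last one is well above what is normal for this cluster.
new_readings_ms = [1.05, 0.98, 1.15, 2.4]
print(flag_deviations(baseline, new_readings_ms))
```

The same approach could be applied per cluster to IOPS or bandwidth, since what counts as normal differs from one cluster to the next.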


junsu
  • Author
  • Trailblazer
  • 38 replies
  • October 9, 2022

Is it okay to define the 'normal' baseline from measurements taken while the VMs in the cluster are operating without any problems?


Kcmount
  • Vanguard
  • 367 replies
  • October 9, 2022

Hello,

Yes, absolutely. Set your baseline on a ‘good experience’ so you know what has changed if you have a bad experience.

The built-in thresholds are good for worst-case scenarios when something is already wrong, but you’d rather know earlier ;)
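
The difference between an early warning and a worst-case threshold could be sketched like this. It is an illustration only: `hard_limit_ms` stands in for whatever built-in threshold applies, and the 1.5x margin over the known-good baseline is an arbitrary assumption to tune per cluster.

```python
from statistics import median


def classify(reading_ms, good_samples_ms, hard_limit_ms, margin=1.5):
    """Classify a latency reading against a known-good baseline and a hard limit.

    Returns "critical" at or above the hard limit, "early-warning" when the
    reading exceeds margin * median of the known-good samples, else "ok".
    """
    if not good_samples_ms:
        raise ValueError("need at least one known-good sample")
    baseline = median(good_samples_ms)
    if reading_ms >= hard_limit_ms:
        return "critical"
    if reading_ms > margin * baseline:
        return "early-warning"
    return "ok"


# Made-up samples recorded while users reported a good experience.
good_samples_ms = [2.8, 3.0, 3.1, 2.9, 3.2]

# hard_limit_ms=20.0 is a placeholder, not a documented Nutanix threshold.
for reading in (3.3, 6.0, 25.0):
    print(reading, classify(reading, good_samples_ms, hard_limit_ms=20.0))
```

The early-warning tier is what lets a change be noticed while the cluster is still far from the hard limit.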


junsu
  • Author
  • Trailblazer
  • 38 replies
  • October 11, 2022

Thank you so much for your kind reply.

Have a nice day.