Solved

Troubleshooting Steps for a Nutanix System

  • 6 May 2014
  • 6 replies
  • 47027 views

Hi,

Can anyone with Nutanix support experience share the steps for troubleshooting an issue that occurs in a Nutanix system?

Thanks.

Best answer by shuguet 7 May 2014, 15:45


6 replies

Depending on the issue (broken hardware, broken service, performance issue, etc.) you may have to look at different locations.

If there is no obvious error in the Prism UI, you may need to look in the ~nutanix/data/logs folder on any CVM (via SSH).
There you will find the logs of all the Nutanix components (Stargate, Curator, Cassandra, Genesis, ZooKeeper, etc.).
In this folder, you will find raw logs (.out), ERROR-only logs (.ERROR), FATAL-only logs (.FATAL), etc.
For example, the file "~nutanix/data/logs/stargate.out" is the current log file for all Stargate logs, and "~nutanix/data/logs/stargate.FATAL" is the current log file containing only the FATAL messages from Stargate.

That's the basic place for all logs.
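If the folder feels overwhelming, one way to narrow it down is to list only the FATAL files that actually contain something. The snippet below is just a rough sketch using the Python standard library, not a Nutanix tool: the helper name `nonempty_logs` is made up for this example, and it assumes the `<component>.FATAL` naming described above.

```python
import os
from pathlib import Path


def nonempty_logs(log_dir, suffix=".FATAL"):
    """Return non-empty log files ending in `suffix`, newest first.

    Hypothetical helper for illustration; it only looks at file names,
    sizes and modification times, not at the log contents.
    """
    root = Path(os.path.expanduser(str(log_dir)))
    if not root.is_dir():
        # Nothing to report if the folder does not exist on this machine.
        return []
    hits = [
        p for p in root.glob("*" + suffix)
        if p.is_file() and p.stat().st_size > 0
    ]
    return sorted(hits, key=lambda p: p.stat().st_mtime, reverse=True)
```

On a CVM you could call `nonempty_logs("~nutanix/data/logs")` and start with the component at the top of the list. Passing `suffix=".ERROR"` does the same for the ERROR-only files.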

You may want to run diagnostics (~/diagnostics/diagnostics.py) to check for performance-related problems (do not run this while other VMs are running on the cluster!), or you may want to run NCC (especially "ncc health_checks run_all") to perform all kinds of health checks on the Nutanix cluster.

If you find errors in the NCC results, I strongly suggest you open a support case with Nutanix so that they can help you resolve the issue.
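If you plan to open a case, it helps to keep the full NCC output in a file so you can attach it. Here is a minimal sketch of a generic wrapper that runs a command and saves everything it prints. The function name `run_and_save` and the output file name are only placeholders for this example, and I am not assuming anything about what NCC's exit code means.

```python
import subprocess
from pathlib import Path


def run_and_save(command, out_file):
    """Run `command` (a list of arguments), write its combined
    stdout and stderr to `out_file`, and return the exit code."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge error output into the same stream
        text=True,
    )
    Path(out_file).write_text(result.stdout)
    return result.returncode
```

On a CVM this would be called as `run_and_save(["ncc", "health_checks", "run_all"], "ncc_run_all.txt")`. Read the saved file itself for the results rather than relying on the return value.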

Sylvain.
Badge +6
Hi Shuguet, thanks for your explanation, it is very helpful.

For example, when I check the cluster status, it shows "CVM: 192.168.7.2 Down". From my understanding, this means the Genesis process is not running on that node. Which file under the ~nutanix/data/logs folder should I look into? Where should I start checking, and how do I tell which component is having the problem? Can you give me an example, based on your previous experience, of a case where the Genesis process was not running?

Can you also advise where I can get NCC and the steps to run it on my current Nutanix system? And where can I open a support case with Nutanix?

Thanks for helping, as I am new to Nutanix and trying to catch up on everything I can.
Userlevel 4
Badge +21
To open a support case, go to http://portal.nutanix.com then log in using your credentials or create an account using your block serial number.

As for the genesis process logs, you can find them in ~/data/logs/genesis.* (genesis.out, and the previous versions if any).

On the node which is reported as down, you can also use the "genesis status" command to try and get the status directly.
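To get a quick look at the most recent genesis messages without opening every file, something like the following could work. It is only a sketch with the standard library: `newest_first` and `tail` are names invented for this example, and the only Nutanix-specific detail is the ~/data/logs/genesis.* pattern mentioned above.

```python
import glob
import os
from collections import deque


def newest_first(pattern):
    """Return the files matching a glob pattern, most recently modified first."""
    paths = glob.glob(os.path.expanduser(pattern))
    return sorted(paths, key=os.path.getmtime, reverse=True)


def tail(path, n=40):
    """Return the last n lines of a text file, without trailing newlines."""
    with open(path, "r", errors="replace") as fh:
        # deque with maxlen keeps only the final n lines while reading.
        return [line.rstrip("\n") for line in deque(fh, maxlen=n)]
```

For example, `newest_first("~/data/logs/genesis.*")` shows which genesis file was written last, and `tail(os.path.expanduser("~/data/logs/genesis.out"))` returns its final lines.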

As for NCC, it is shipped by default with the latest version (3.5.3.1 I think?) but it is also available on the support site along with the installation instructions.

Sylvain.
Badge +6
I will work through the messages in genesis.out and will post the log here if I can't fix it.

Thanks so much for the help and explanation.
Badge +5
Thanks for your reply. I am new to Nutanix. Which file should I inspect for the installation log? I tried the /data/logs directory, but there are too many files...
Userlevel 3
Badge +10
Try checking this log:

/home/nutanix/data/logs/install.out
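If the file is long, a small filter can pull out the lines worth reading first. This is just a sketch: the function name `matching_lines` is made up, and the default keywords are my guess at what a failure might look like, not a documented format for install.out.

```python
def matching_lines(path, keywords=("error", "fail", "traceback")):
    """Yield (line_number, line) for lines containing any keyword.

    Matching is case-insensitive. The default keywords are assumptions,
    so adjust them to whatever your log actually contains.
    """
    lowered = [k.lower() for k in keywords]
    with open(path, "r", errors="replace") as fh:
        for number, line in enumerate(fh, start=1):
            text = line.lower()
            if any(k in text for k in lowered):
                yield number, line.rstrip("\n")
```

Calling `list(matching_lines("/home/nutanix/data/logs/install.out"))` gives the line numbers to jump to in the full log.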
