Sometimes NCC (Nutanix Cluster Check) raises a PSU down alert, and a case is automatically created when a PSU down condition is detected. The alert can be a false positive or the result of a transient condition (for example, relocating a node or performing maintenance that requires a PSU to be powered down).
However, a power supply alert should never be taken lightly. You can always verify the physical status of a power supply by checking the LED indicators on the PSU itself, but in some cases this may not be possible right away, for example because of where the equipment is located.
To confirm that the power supplies of a block or node are in an optimal state, log in to the physical host or run a command from the CVM to check their status:
From an AHV host:
[root@AHV-HOST~]# ipmitool sdr | grep -i ps
PS1 Status | 0x01 | ok
PS2 Status | 0x01 | ok
From a CVM on AHV (this queries all hosts in the cluster):
nutanix@cvm$ hostssh "ipmitool sdr | grep -i ps"
From a CVM on an ESXi host (this queries the PSU status of all hosts):
nutanix@cvm$ hostssh "/ipmitool sensor | grep -i ps"
How to interpret the output above:
Each of these commands should show PS1 and PS2 with a value of 0x01, 0x1, or 01, which indicates a good PSU reading. If a PSU shows any other value, run the same command on the other hosts in the same block to compare the readings.
Note: a value of 'nr' means Non-Recoverable and requires a PSU replacement.
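The interpretation rules above can be sketched as a small shell filter. This is only an illustration, not a Nutanix-provided tool: the function name check_psu_status is hypothetical, and it assumes the three-column "name | value | state" layout shown in the sample ipmitool sdr output. The ipmitool sensor output used on ESXi has more columns, so the 'nr' check here may not apply to it as written.

```shell
#!/bin/sh
# Hypothetical helper (not part of NCC or ipmitool): reads `ipmitool sdr`-style
# lines on stdin and classifies each PSU sensor line.
# Assumption: lines look like "PS1 Status | 0x01 | ok".
check_psu_status() {
    awk -F'|' '
        # Only consider sensors whose name starts with PS<number>, e.g. "PS1 Status"
        $1 ~ /^[ \t]*PS[0-9]+/ {
            name = $1; value = $2; state = $3
            # Trim leading and trailing whitespace from each field
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            gsub(/^[ \t]+|[ \t]+$/, "", state)
            if (state == "nr")
                print name ": REPLACE (non-recoverable)"
            else if (value == "0x01" || value == "0x1" || value == "01")
                print name ": OK"
            else
                print name ": CHECK (value " value ")"
        }
    '
}
```

On an AHV host, it could be used as `ipmitool sdr | check_psu_status`. Any line reported as CHECK should still be compared against the other hosts in the same block, as described above.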