INFORMATIONAL: how to correct FAN critical alerts on Nutanix-Dell configs

  • 17 January 2017

Hi all !

My client has encountered a strange behaviour on some of its Nutanix-Dell blocks, and I'd like to share the resolution.

Thanks to the Nutanix / Dell support teams for their 'patience' with the IT team while resolving this false-positive critical alert.

Context:
Multiple critical alerts seen in PRISM: FAN SPEED LOW for a whole cluster (this means all fans on all blocks).
No alerts detected on IPMI / iDRAC or in the VI client (connected directly to ESXi for hardware messages).
HW: Dell XC630-10
SW: NOS 4.6.4
HV: ESX 6.0 u2

Resolution:
This is a false-positive alert. It occurs when NOS cannot correctly interpret the IPMI messages because of a misconfiguration in /etc/nutanix/hardware_config.json.

The sensors of type "fans : rpm" have a misconfigured address.

GOOD address is "ipmi_sensor:FAN"
BAD address is "ipmi_sensor:FAN RPM"

The fix is to modify this descriptor for each fan (14 on an XC630-10 model) on each host.

If you encounter this kind of issue, please contact Nutanix / Dell support: they will give you a py script that modifies the JSON on all nodes. You can also modify it yourself (at your own risk 😉). Last possibility: upgrade your cluster to a 5.x / 4.7.x version.
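
For anyone curious what the manual edit could look like, here is a minimal Python 3 sketch. It is not the script provided by support, and I am assuming several things: the exact layout of hardware_config.json is unknown to me, so the code simply walks the whole JSON tree and strips a trailing " RPM" from any string value that starts with "ipmi_sensor:FAN" (matched case-insensitively). The function names (fix_address, fix_tree, fix_file) are made up for illustration. Try it on a copy of the file first, and keep the default dry run until the reported count matches the number of fans you expect.

```python
import json
import shutil

# Assumption: bad addresses look like "ipmi_sensor:FAN... RPM" and the
# good form is the same string without the trailing " RPM".
PREFIX = "ipmi_sensor:fan"
BAD_SUFFIX = " RPM"


def fix_address(value):
    """Return the corrected address, or the value itself if it is not a bad one."""
    if (
        isinstance(value, str)
        and value.lower().startswith(PREFIX)
        and value.endswith(BAD_SUFFIX)
    ):
        return value[: -len(BAD_SUFFIX)]
    return value


def fix_tree(node):
    """Walk dicts and lists recursively; return (fixed_copy, number_of_changes)."""
    if isinstance(node, dict):
        fixed, changes = {}, 0
        for key, child in node.items():
            fixed[key], n = fix_tree(child)
            changes += n
        return fixed, changes
    if isinstance(node, list):
        fixed, changes = [], 0
        for child in node:
            new_child, n = fix_tree(child)
            fixed.append(new_child)
            changes += n
        return fixed, changes
    new_value = fix_address(node)
    # fix_address returns the same object when nothing was changed.
    return new_value, int(new_value is not node)


def fix_file(path, dry_run=True):
    """Count (and optionally apply) the fixes in one JSON file.

    With dry_run=False a backup copy "<path>.bak" is written before the
    file is overwritten.
    """
    with open(path) as handle:
        config = json.load(handle)
    fixed, changes = fix_tree(config)
    if changes and not dry_run:
        shutil.copy2(path, path + ".bak")
        with open(path, "w") as handle:
            json.dump(fixed, handle, indent=2)
    return changes
```

Calling fix_file("hardware_config.json") on a local copy only reports how many addresses would change; pass dry_run=False to write the result. Whether any service needs a restart afterwards is something to confirm with support.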

best regards,

Thomas


1 reply

NewVirtTom thanks for sharing