Question

SATADOM wearout check

  • 20 May 2022
  • 2 replies
  • 106 views

I have an old cluster of 4 NX-8035 nodes running AHV that I use for testing. The LCM update is failing on sata_dom_wearout_check. Is there any way to skip that check when running LCM updates? This is old hardware and I won't be replacing any parts; I just want to get the updates to run.

2 replies


Warning

This is a temporary workaround.
The steps below can crash your cluster.

Please use them only on test beds.

 

1. Back up “/home/nutanix/ncc/plugin_config/plugin_schema/health_checks/disk_checks.json”

cd ~/ncc/plugin_config/plugin_schema/health_checks

cp disk_checks.json disk_checks.json.orig
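
If the steps are repeated, a plain cp would overwrite the good backup with an already edited file. A minimal sketch of a guard against that is below; the helper name backup_once is hypothetical and not part of NCC.

```shell
#!/bin/sh
# Hypothetical helper: create FILE.orig only if it does not exist yet,
# so a second run cannot overwrite the pristine backup.
backup_once() {
  src="$1"
  bak="${src}.orig"
  if [ -e "$bak" ]; then
    echo "backup already exists: $bak" >&2
    return 1
  fi
  # -p keeps the original mode and timestamps on the copy.
  cp -p "$src" "$bak"
}
```

Run from the health_checks directory, this would be called as: backup_once disk_checks.json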

 

2. Open disk_checks.json in vi and delete the entire “plugin_schema_list” block that contains name: "sata_dom_wearout_check", from its opening “plugin_schema_list {” through the matching closing “}”:

plugin_schema_list {
  name: "sata_dom_wearout_check"
  executable_name: "health_checks hardware_checks disk_checks sata_dom_wearout_check"
  check_schema_list {
...

    check_metadata {
      ncc_version: "Unknown"
      source_jira_ticket: "TBD"
      fix_jira_ticket: "NA"
      component: kPlatformSolutions
      hw_specific: False
      hypervisor_specific: False
      comment_list: "NA"
    } 
  } 
}  # delete through this closing “}”
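
The same deletion could be scripted instead of done by hand in vi. The sketch below is only an illustration: strip_check_block is a hypothetical helper, and it assumes each block opens with “plugin_schema_list {” at the start of a line, that braces are balanced, and that no brace appears inside a quoted string. It buffers each top-level block, counts braces to find its end, and drops the block if its own name line matches.

```shell
#!/bin/sh
# Hypothetical helper: strip_check_block NAME < input > output
# Copies stdin to stdout, dropping every top-level "plugin_schema_list { ... }"
# block whose direct child line is: name: "NAME".
strip_check_block() {
  awk -v name="$1" '
    # Start buffering when a top-level block opens.
    !inblock && /^plugin_schema_list[ \t]*[{]/ { inblock = 1; depth = 0; drop = 0; buf = "" }
    inblock {
      buf = buf $0 "\n"
      # Only a name line directly inside the block (depth 1) marks it for removal.
      if (depth == 1 && $0 ~ ("^[ \t]*name:[ \t]*\"" name "\"")) drop = 1
      line = $0
      depth += gsub(/[{]/, "", line)
      depth -= gsub(/[}]/, "", line)
      # Block closed: emit it unless it was marked.
      if (depth == 0) { if (!drop) printf "%s", buf; inblock = 0 }
      next
    }
    { print }
    # Unbalanced input: emit whatever is still buffered rather than lose it.
    END { if (inblock) printf "%s", buf }
  '
}
```

One cautious way to use it is to write the result to a new file, for example strip_check_block sata_dom_wearout_check < disk_checks.json.orig > disk_checks.json.new, and compare the two with diff before replacing disk_checks.json.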
 

3. Copy disk_checks.json to every other CVM in the cluster

scp disk_checks.json <CVMIP2>:~/ncc/plugin_config/plugin_schema/health_checks

scp disk_checks.json <CVMIP3>:~/ncc/plugin_config/plugin_schema/health_checks
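
With several CVMs, typing each scp line by hand invites typos. A small loop sketch follows, assuming you pass the other CVM addresses yourself; push_to_cvms and SCP_CMD are hypothetical names introduced only for this example. Setting SCP_CMD to "echo scp" gives a dry run that prints the commands without copying anything.

```shell
#!/bin/sh
# Hypothetical helper: copy one file into the health_checks directory
# on each CVM address given after the file name.
# The tilde is kept literal here so that it is expanded on the remote side.
REMOTE_DIR='~/ncc/plugin_config/plugin_schema/health_checks'

push_to_cvms() {
  file="$1"
  shift
  for ip in "$@"; do
    # SCP_CMD defaults to scp; it is left unquoted so "echo scp" splits into two words.
    ${SCP_CMD:-scp} "$file" "${ip}:${REMOTE_DIR}" || {
      echo "copy to ${ip} failed" >&2
      return 1
    }
  done
}
```

For example: push_to_cvms disk_checks.json <CVMIP2> <CVMIP3>, with the placeholders replaced by real addresses. The same call works for pushing the restored file back after the upgrade.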

 

4. Run the LCM upgrade job

 

5. After the upgrade, roll back to the backup file on every CVM

cp disk_checks.json.orig disk_checks.json

scp disk_checks.json <CVMIP2>:~/ncc/plugin_config/plugin_schema/health_checks

scp disk_checks.json <CVMIP3>:~/ncc/plugin_config/plugin_schema/health_checks

 

Done.

I have been fighting to get Nutanix CE to install on a Dell PowerEdge R710 (two 6-core CPUs, 128 GB RAM, a 128 GB SD card, a 256 GB SSD in the optical drive bay, and a PERC 6/i with several SAS and SATA drives). I have been booting from a USB drive with the ISO image written to it using Rufus. The installer booted, went through the AHV setup and license screens, and reached the screen that asks for a reboot after removing the USB drive. After the reboot I got an error indicating that the install did not complete correctly, and when I logged in as root I was not able to ping the address. I tried another install with the same results.

Then I wiped all of the drives, cleared the SSD and SD card, and reinitialized the drives on the PERC 6/i, going back to bare metal to make sure nothing left over was causing an issue. I also swapped the SD card for a 256 GB one. The new install did not reach the AHV config screen; it dumped me out to a root@phoenix prompt (CentOS Linux 7 (Core)). I thought it might be the SD card, so I put the original 128 GB one back and got the same result.

I am at my wits' end as to what is going on with the system. Can anyone clue me in on what I am doing wrong?
