Solved

NOS 4.0.1

  • 4 July 2014
  • 5 replies
  • 1388 views

I have a 3350 cluster.
After upgrading the cluster to 4.0.1, I get this error:

RESILIENCY STATUS: Critical
REBUILD CAPACITY AVAILABLE: Yes
AUTO REBUILD IN PROGRESS: Yes

What do I have to do?

ncli> cluster get-domain-fault-tolerance-status type=node
Domain Type               : NODE
Component Type            : STATIC_CONFIGURATION
Current Fault Tolerance   : 1
Fault Tolerance Details   :
Last Update Time          : Fri Jul 04 06:14:33 PDT 2014

Domain Type               : NODE
Component Type            : ZOOKEEPER
Current Fault Tolerance   : 1
Fault Tolerance Details   :
Last Update Time          : Fri Jul 04 05:50:41 PDT 2014

Domain Type               : NODE
Component Type            : EXTENT_GROUPS
Current Fault Tolerance   : 0
Fault Tolerance Details   : Based on placement of extent group replicas the cluster can tolerate a maximum of 0 node failure(s)
Last Update Time          : Fri Jul 04 05:55:55 PDT 2014

Domain Type               : NODE
Component Type            : OPLOG
Current Fault Tolerance   : 1
Fault Tolerance Details   :
Last Update Time          : Fri Jul 04 05:55:55 PDT 2014

Domain Type               : NODE
Component Type            : METADATA
Current Fault Tolerance   : 1
Fault Tolerance Details   :
Last Update Time          : Tue Jun 24 05:35:08 PDT 2014

Domain Type               : NODE
Component Type            : FREE_SPACE
Current Fault Tolerance   : 1
Fault Tolerance Details   :
Last Update Time          : Fri Jul 04 06:09:34 PDT 2014
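For anyone who wants to spot the degraded component without reading every record by eye, here is a minimal Python sketch that parses text shaped like the output above and lists the components whose fault tolerance is 0. It only assumes the two field labels seen in this post ("Component Type" and "Current Fault Tolerance"); the function names and the SAMPLE text are made up for illustration, and this is not a Nutanix tool or API.

```python
import re

# One record = a component name followed by its current fault tolerance.
# Whitespace between the fields is optional, so the pattern copes with
# output where the fields sit on separate lines or are run together.
RECORD = re.compile(
    r"Component Type\s*:\s*([A-Z_]+?)\s*"
    r"Current Fault Tolerance\s*:\s*(\d+)"
)


def parse_fault_tolerance(text):
    """Return {component name: current fault tolerance} for pasted output."""
    return {name: int(value) for name, value in RECORD.findall(text)}


def components_at_risk(text, minimum=1):
    """Return the components whose fault tolerance is below `minimum`."""
    parsed = parse_fault_tolerance(text)
    return [name for name, tolerance in parsed.items() if tolerance < minimum]


# Hypothetical sample: three of the records from this post, trimmed.
SAMPLE = """\
Domain Type : NODE
Component Type : ZOOKEEPER
Current Fault Tolerance : 1
Fault Tolerance Details :

Domain Type : NODE
Component Type : EXTENT_GROUPS
Current Fault Tolerance : 0
Fault Tolerance Details : Based on placement of extent group replicas

Domain Type : NODE
Component Type : OPLOG
Current Fault Tolerance : 1
Fault Tolerance Details :
"""

if __name__ == "__main__":
    print(components_at_risk(SAMPLE))  # ['EXTENT_GROUPS']
```

Run against the output in this post, the only component it would flag is EXTENT_GROUPS, which matches the "Critical" resiliency status shown in the UI.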


--cluster_function_list: List of functions of the cluster (use with create). Accepted functions are ['ndfs', 'multicluster', 'cloud_data_gateway']

NDFS = Nutanix
Multicluster = Prism Central
What is Cloud Data Gateway? Is it something that runs in Azure/Amazon?

Best answer by dlink7 5 July 2014, 19:23



5 replies

Hi,

The UI is just being conservative. When a CVM goes down, there could be writes in flight to that CVM, so the cluster marks the data as unhealthy until the next partial scan can confirm its state. The Oplog has its own recovery process and fixes itself right away.

Nutanix does support backup to Amazon now; Azure DR is coming.

I upgraded an NX-1350 this past weekend from NOS 3.5.1 to NOS 3.5.4 and then to NOS 4.0.1, following a nice, simple guide: http://craigwaters.org/2014/04/29/nos-4-0-feature-1-click-os-upgrade/ together with the Upgrade Guide for NOS 3.5.4 and the Upgrade Guide for NOS 4.0.1. It was almost trouble-free, and the upgrade from NOS 3.5.4 to NOS 4.0.1 took less than 20 minutes in my case.

I noticed this same error or status after I rebooted a controller/host while doing updates. Is it OK to proceed to shut down the next CVM and reboot the node?

Hello -

For partial scans, which component runs the scan: Stargate, Curator, or something else?

Thanks,

It's always Curator that is responsible for running scans, whether a full scan or a partial scan. Stargate performs the I/O operations after Curator has identified the data that needs to be moved around in NDFS.

-Navpreet
SRE Nutanix