Solved

SATADOM Failure

  • 11 April 2017
  • 18 replies
  • 12870 views

Userlevel 1
Badge +5
I assume the hypervisor (ESXi in this case) and the CVM configuration files reside on the SATADOM.

1) What happens if the SATADOM fails?
2) Is there any redundancy mechanism available to withstand a failure?

Best answer by patrbng 11 April 2017, 17:35


18 replies

Userlevel 4
Badge +20
1) There is only one SATADOM, and if it fails you'll have to re-image the node using the Foundation/Phoenix process to re-install the hypervisor and the Nutanix CVM virtual machine. After that's done, the host can be added back to the cluster (a quick verification sketch follows at the end of this reply). Support can and will walk you through the process.

2) I don't think any of the NX/XC/HX solutions support redundant SATADOMs.
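After the re-image, a quick way to verify the node has rejoined; a minimal sketch, assuming SSH access to any CVM in the cluster:

# Confirm all CVMs, including the re-imaged node's, report Up
cluster status
# Confirm the host is listed in the cluster again
ncli host list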
Userlevel 1
Badge +5
Thanks for the quick reply. What would happen if the ESXi node hit by a SATADOM failure has running user VMs? I suspect the user VMs will fail over and restart on a working neighbor ESXi node once PDL (Permanent Device Loss) is declared.
Userlevel 4
Badge +20
If the SATADOM fails, the hypervisor fails and the VMs will see it as a node failure. If vSphere HA is set up, the VMs should recover onto the other nodes in the cluster, as long as enough compute and memory capacity exists to run them; it's recommended to configure vSphere HA with admission control enabled to ensure that capacity is reserved. When the VMs restart, they will read whatever RF (replication factor) data is already local to the node they land on. If the data isn't local to that node, it will be fetched over the network, and blocks that are accessed frequently will be replicated back to the VM's new host node for data locality. If the failed node remains offline long enough, it is dropped from the metadata ring and the protection data is re-created on the remaining nodes.
Make sense?
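As a quick sanity check before relying on that behaviour, one way to confirm the cluster can currently tolerate a node loss is to query the fault-tolerance status from a CVM; a minimal sketch, assuming SSH access to a CVM and that this ncli subcommand is available in your AOS version:

# From any CVM: show how many node failures each component can currently tolerate
ncli cluster get-domain-fault-tolerance-status type=node
# A current fault tolerance of 1 (or higher) for every component means the cluster
# should ride out a single node/SATADOM failure.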
Userlevel 1
Badge +5
Thanks a lot. Clear enough.
Userlevel 1
Badge +3
THIS IS A VERY PAINFUL PROCESS.

I have had 2 SATADOM failures on one of my customers' clusters.
I was told these things last 1-2 years. There are newer models out, which the failed units were replaced with, that are supposed to last longer. But they are a single point of failure with a very short life.

In both instances we never received any alerts or predictive-failure warnings, the NCC health checks didn't find any problems, and the other utility that shows the remaining life of these devices didn't report correct information either.

In both instances this was a very painful process. Basically the node was running from memory and we lost management of the host. The VMs were still running but could not be migrated to any other host, so we incurred outages on our VMs to get the environment working again. I spent a few sleepless nights and many hours troubleshooting this with Nutanix Support and had to respond in emergency fashion to get things working.

I have 3 more hosts with the old SATADOMs and don't really have a clear plan to prevent this from happening again. I have been requesting that these be replaced, but no word on this yet.

This is a very bad design, and I expect better when claims are made about invisible infrastructure and self-healing. That is not true in this case, and if you have the old SATADOMs you should expect this to happen to you within 1-2 years.
I do have one question for chadgiardina: how do you check what the SATADOM model is, and whether it is the improved version or not?

Many Thanks


UPDATE - I have this model: SATADOM 2DSL_3ME

How do I find out whether that is the new version or not?
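For anyone else trying to identify theirs: a rough way to check, assuming SSH access to the ESXi host (device names and fields vary by platform, so treat this as a sketch), is to list the storage devices and read the model string of the local boot device:

# From an SSH session on the ESXi host: list the local storage devices
esxcli storage core device list
# Look at the "Display Name" / "Model" fields of the local boot device to see which
# SATADOM model/revision is installed, then compare it against the replacement model
# Nutanix Support quotes for your platform.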
Userlevel 4
Badge +8
@patrbng Is @nutanix considering a design that allows mirroring of the SATADOM/hypervisor? I'm not looking forward to repeating @chadgiardina's experience.
Userlevel 4
Badge +20
The G6s are using M.2 drives instead of SATADOMs, so there is "some" improvement, but there is no capability of mirroring to a second M.2 yet, even though a second one is installed. I would work with your Nutanix SE to see if mirroring is on the roadmap.
Badge +1
I hope the CVM will take care of restarting the VMs on another node in case of a node failure. But in case of a SATADOM failure, how will the CVM act, given that the SATADOM holds the CVM configuration too?
We have just deployed some G6s with this new M.2 drive, and before it even went into production we've already lost one host due to this drive failing...

Hi everybody.

Is it possible to know whether a SATADOM is broken or showing anomalies?
Or is the only way to find out to restart a node?
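One way to look for early warning signs without rebooting, sketched here on the assumption that you have CVM SSH access and a current NCC version (check the exact check names against your release), is to run the NCC health checks and, optionally, query SMART data for the boot device directly from the ESXi host:

# From a CVM: run the full NCC health check suite and review the hardware/disk results
ncc health_checks run_all

# From the ESXi host: query SMART data for the boot device
# (replace <device_id> with the identifier shown by 'esxcli storage core device list';
# not every SATADOM exposes useful SMART attributes)
esxcli storage core device smart get -d <device_id>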

Badge +2

I decided to update each node one by one. After one successful update, the second (or any other) node gives this failure.

Operation failed. Reason: Lcm prechecks detected 1 issue that would cause upgrade failures. Check 'test_under_replication' failed with 'Failure reason: Cluster currently under replicated, please wait and retry after sometime, Please refer KB 2826'

I have searched high and low to understand what this is, so I'm hoping someone has seen this before and can advise me.

Userlevel 3
Badge +3

Hello @jolivette  

Have you tried following KB-2826 to find out the current utilisation of the cluster?
Can you post your findings here?
Can you also verify whether Data Resiliency in Prism is OK or not?

Badge +2

@HITESH0801 

In Prism everything was OK for data resiliency, so I was under the impression I could move on to the next node. Cluster status shows Up and the ring is normal.

 

nutanix@NTNX-16SM6B400132-A-CVM:10.2.99.104:~$ curator_cli get_under_replication_info summary=true
Using curator master: 10.2.99.110:2010
+---------+--------------------------------+
| Disk Id | Under replication data (bytes) |
+---------+--------------------------------+
 

Lockdown mode: Disabled

        CVM: 10.2.99.100 Up
        CVM: 10.2.99.101 Up
        CVM: 10.2.99.102 Up
        CVM: 10.2.99.103 Up
        CVM: 10.2.99.104 Up
        CVM: 10.2.99.105 Up
        CVM: 10.2.99.106 Up
        CVM: 10.2.99.108 Up
        CVM: 10.2.99.109 Up
        CVM: 10.2.99.110 Up, ZeusLeader
        CVM: 10.2.99.111 Up
        CVM: 10.2.99.112 Up
        CVM: 10.2.99.113 Up
        CVM: 10.2.99.117 Up
        CVM: 10.2.99.118 Up
        CVM: 10.2.99.119 Up
2019-11-13 08:07:32 INFO cluster:2747 Success!
nutanix@NTNX-16SM6B400132-A-CVM:10.2.99.104:~$ nodetool -h 0 ring
Address         Status State      Load            Owns    Token
                                                          zzzzzzzzR7styAbE3tsFVam28Wb6JMWu1EBYh6bLDTSRrARY5kv3LMSmPS0i
10.2.99.119     Up     Normal     10.67 GB        10.00%  6COnbCOm0000000000000000000000000000000000000000000000000000
10.2.99.111     Up     Normal     9.82 GB         5.00%   9IbCOnbB0000000000000000000000000000000000000000000000000000
10.2.99.100     Up     Normal     8.65 GB         5.00%   COnbCOna0000000000000000000000000000000000000000000000000000
10.2.99.112     Up     Normal     8.87 GB         5.00%   FUzzzzzz0000000000000000000000000000000000000000000000000000
10.2.99.105     Up     Normal     8.47 GB         5.00%   IbCOnbCO0000000000000000000000000000000000000000000000000000
10.2.99.118     Up     Normal     7.59 GB         5.00%   LhOnbCOn0000000000000000000000000000000000000000000000000000
10.2.99.109     Up     Normal     8.48 GB         7.50%   QLhOnbCO0000000000000000000000000000000000000000000000000000
10.2.99.102     Up     Normal     11.47 GB        11.25%  XK9IbCOmMs0tRK41iBdvbUmWbisrblMsH1oMxADyksv5N3zFK0oA4BSwHw1B
10.2.99.113     Up     Normal     10.7 GB         5.63%   aoNFUzzy0000000000000000000000000000000000000000000000000000
10.2.99.117     Up     Normal     10.33 GB        5.63%   eIbCOnbB0000000000000000000000000000000000000000000000000000
10.2.99.106     Up     Normal     12.84 GB        5.00%   hOnbCOna0000000000000000000000000000000000000000000000000000
10.2.99.108     Up     Normal     10.1 GB         5.00%   kUzzzzzz0000000000000000000000000000000000000000000000000000
10.2.99.103     Up     Normal     8.14 GB         5.00%   nbCOnbCOQFT5cDkvjbjxwsI88Mj1yYF3jgWcef19SXw0dDjAk2SgDaf3t7Xy
10.2.99.104     Up     Normal     10.05 GB        10.00%  tnbCOnbBEadJAe0YO7MWlPXJ7y2RGvQ2neiwtVqUAlgxSOGvy66XQQv6a5g0
10.2.99.110     Up     Normal     9.54 GB         5.00%   wtnbCOna0000000000000000000000000000000000000000000000000000
10.2.99.101     Up     Normal     7.3 GB          5.00%   zzzzzzzzR7styAbE3tsFVam28Wb6JMWu1EBYh6bLDTSRrARY5kv3LMSmPS0i
 

Badge +2

@HITESH0801 Forgot to mention that I checked at least the cluster status and data resiliency before I went on to the next node :-)

Userlevel 3
Badge +3

Hey @jolivette  

From the output, the cluster looks healthy and not under-replicated.
I think we need to do some in-depth troubleshooting as to what exactly happens when LCM updates the first node and moves on to the second. We need to analyse the logs, especially the LCM and Curator logs, to find out why this pre-check is failing.

Can you open a case with Nutanix Support?

 

Badge +2

Done, I will post back any findings.

Badge +2

@HITESH0801 Strangely enough, the next try is working. No real root cause from Support on why the pre-check failed, other than that data resiliency wasn't complete, even though Prism and the cluster checks showed otherwise. Thanks all for chiming in.

Reply