Solved

Removing nodes from cluster

Badge +2
I have a customer that is planning to relocate half of their 11-node cluster to a different rack. They are using less than 50% of the cluster's resources, and they are asking if this is something that can be done online. I'm pretty sure I know the answer to this question but wanted to double-check. Can they move 5 (of 11) nodes without impacting production, since they are using less than 50% of the resources?

Can we temporarily remove nodes from the cluster and then re-add them after the move?

Best answer by DonnieBrasco 31 May 2016, 19:58

Yes, you can remove the nodes from the cluster.

Note that you can only remove a single node at a time, so you can assess the cluster's resiliency status before you remove the next node.

-NP

11 replies

Userlevel 4
Badge +18
Yes, you can remove the nodes from the cluster.

Note that you can only remove a single node at a time, so you can assess the cluster's resiliency status before you remove the next node.

-NP
Userlevel 7
Badge +30
What Nav said ... remove one node at a time, move the nodes to the other rack, then re-add them.
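If you'd rather drive that from the CLI than click through Prism, the loop is roughly the sketch below. This is a minimal illustration, not an official tool: it assumes you run it from a CVM where nCLI is available, and the exact subcommand names (host remove-start, cluster get-domain-fault-tolerance-status) should be double-checked against the Prism Web Console Guide for your AOS version before use.

```python
import subprocess
import time

# Host IDs to evict, one at a time. Get real IDs from `ncli host list`;
# these are placeholders.
HOSTS_TO_MOVE = ["<host-id-1>", "<host-id-2>"]

def node_fault_tolerance():
    # Assumed nCLI subcommand; verify the name and output format
    # against the docs for your AOS version.
    return subprocess.check_output(
        ["ncli", "cluster", "get-domain-fault-tolerance-status", "type=node"],
        text=True,
    )

for host_id in HOSTS_TO_MOVE:
    # 1. Confirm the cluster can tolerate a node failure *before*
    #    starting a removal.
    print(node_fault_tolerance())
    input(f"Resiliency OK? Press Enter to start removing {host_id} ")

    # 2. Kick off the removal (assumed subcommand name; check the guide).
    subprocess.run(["ncli", "host", "remove-start", f"id={host_id}"], check=True)

    # 3. Wait for the removal/rebuild to finish before touching the next
    #    node. Prism's progress monitor shows the same thing graphically.
    while input("Removal finished and resiliency restored? [y/N] ").lower() != "y":
        time.sleep(300)
        print(node_fault_tolerance())
```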

Cheers,
Jon
Userlevel 1
Badge +8
Hi all,

Do you have the documentation for removing a node?

Cheers,
Jaffer
Userlevel 7
Badge +30
See the Prism Web Console Guide for documentation on removing a node or modifying a cluster: https://portal.nutanix.com/#/page/docs/details?targetId=Web-Console-Guide-Prism-v50:wc-cluster-modify-wc-t.html
Userlevel 1
Badge +8
Thank you :)

Cheers,
Jaffer
Badge
@Jon Is this article still a valid procedure for 5.x and removing a node temporarily? I have a 16-node cluster and need to physically move a 4-node block. Cluster capacity is less than 50%, and our plan was to shut down CVMs one at a time, each time waiting for the cluster to evict the node and then heal back to RF3 before moving to the next node. This article describes a different approach, but it seems like it might be better. Any input would be helpful. Thanks!
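For what it's worth, here's the back-of-the-envelope math behind the "less than 50%" claim. All figures are placeholders and this is illustrative only; real sizing also depends on per-node skew and how much usable headroom you want to keep free.

```python
def can_evict(total_tib, used_tib, n_nodes, n_to_remove, headroom=0.90):
    # Rough headroom check: the data (replicas already counted in
    # used_tib) must still fit on the remaining nodes, keeping ~10%
    # free. The headroom figure is an assumption, not an official limit.
    per_node = total_tib / n_nodes
    remaining = per_node * (n_nodes - n_to_remove)
    return used_tib <= remaining * headroom

# This case: 16 homogeneous nodes, 4 leaving, cluster ~45% used.
# 20 TiB raw per node is a placeholder figure.
total = 16 * 20.0
print(can_evict(total, used_tib=0.45 * total, n_nodes=16, n_to_remove=4))
# True: 144 TiB used vs 0.90 * 240 TiB = 216 TiB still available.
```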
Userlevel 7
Badge +30
That seems like a bad idea (to shut down CVMs and just wait). Why not just click the Remove button in Prism, @tonynt?

Also, as an aside, I'm not sure what version you're on, but in 5.9.1 and later we made removing nodes (and rebuilds in general) stupid fast, to the tune of multiple times faster.
Userlevel 7
Badge +30
Meaning: if you have to do this with any sort of gusto, being at ~5.10.3.2 or later (as of the current minute) will help you get it done quicker. Cheers - Jon
Badge
That seems like a bad idea (to shut down CVMs and just wait). Why not just click the Remove button in Prism, @tonynt?

Also, as an aside, I'm not sure what version you're on, but in 5.9.1 and later we made removing nodes (and rebuilds in general) stupid fast, to the tune of multiple times faster.

@Jon I thought this was a bad idea as well, which is why I was searching through the forums and found this thread. Support had recommended the shutdown method, and then I found this gem. 🙂 Thank you.
Userlevel 7
Badge +30
Shutdown is both disruptive and incrementally longer. It takes ~30 minutes for a ring-level timeout (i.e., for Cassandra to start kicking itself in the butt), and then some amount of time to fully re-replicate all the data.

During that time, the data on that node is exposed to single-copy behavior, meaning some blocks have only one copy.

NEW data is ALWAYS written RF2/RF3 (depending on your policies), but the existing data needs to rebuild itself.

If you hit the Remove button instead, the cluster will "over replicate" temporarily, such that all copies are online 100% of the time with 100% redundancy.
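To put rough numbers on that exposure window (every figure below is an assumption for illustration; substitute your own):

```python
# Back-of-the-envelope exposure window for the shutdown-and-wait approach.
ring_timeout_min = 30        # ~30 min before the ring evicts the node
data_per_node_tib = 8.0      # data resident on the downed node (assumed)
rebuild_tput_gbs = 2.0       # aggregate rebuild throughput in GB/s (assumed)

rebuild_min = (data_per_node_tib * 1024) / rebuild_tput_gbs / 60
print(f"Single-copy exposure: ~{ring_timeout_min + rebuild_min:.0f} min "
      f"({ring_timeout_min} min timeout + {rebuild_min:.0f} min rebuild)")
# With the Remove button, the node stays online while its data is
# over-replicated first, so this exposure window is effectively zero.
```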

Can you hit me up at jon@nutanix.com with your support ticket number? Not a witch hunt, nor do I want to kick any butts; I just need to make sure we don't ever recommend this to another customer.
Badge

Hi Jon,

I am running a 3-node AHV cluster, and my company plans to migrate it to an ESX-based cluster.

The challenge is that we bought two extra nodes and plan to build a new ESX-based cluster with them. The plan is to empty one node from the 3-node AHV cluster and add it to the ESX cluster to make it a 3-node ESX cluster. So my question is: is it possible to run on a 2-node AHV cluster for roughly 7-10 days?

Will it impact the data or the cluster?
