EC-X error, disk ID, and Storage vMotion

Question

Hello folks,
I'm installing a 6-node cluster and have run into an error and a few doubts along the way.
My environment: 3 blocks, 6 nodes (2 nodes per block), AOS 4.6.2.
 
1. When I create a new container and check the Enable Erasure Coding box, an error pops up: "enable EC-X will break the current block awareness !" Does this mean that ECX disables block-level awareness?
 
2. As many of us know, to find the physical slot of a disk you can simply turn on that drive's LED. But the LED function may be inaccurate or disabled, and in that case you have to locate the disk by a disk ID rule, e.g. shelf ID.slot/bay ID. The disk ID in the Prism hardware section seems to be a random number with no rule behind it, so I can't work out the physical location of a disk.
 
3. While testing the removal of a node from a Nutanix cluster, a doubt came up: do I need to, or is it best practice to, run Storage vMotion (rather than only vMotion for the VMs) before removing the node? I think the difference between the two methods is who moves the data: ESXi svMotion in one case, the CVM's RF replication in the other. Perhaps I'm totally wrong. Even if I choose svMotion, does RF recovery still occur? Thank you for your valuable time and help in advance!
 
4. Suppose I enable RF3 and, after some days, the user would like to revert to RF2. Does Nutanix support this? I know I can change RF2 to RF3 online.
   Another RF case: when I image a node using the Foundation tool, I choose the default RF2. Can I enable RF3 support afterwards with an NCLI command? (I noticed there is a related command in NCLI.)
   Similarly, can I enable EC-X and then disable it again? I'd like to know which common features can be changed back and forth, and which changes are permanent.
 
Thanks, community friends!

Jon · Accepted Answer

RE ECX
Yes, ECX and block awareness do not mix. There is a long technical explanation of why, but the short of it is that the majority of customers who would use ECX wouldn't have enough physical blocks to actually enforce block awareness on ECX data (due to the different way that ECX stores blocks on disk), so for now we have deferred work on making the two work together.

RE Physical disk LED / finding a disk
The disk ID is really an "internal" construct, so do not pay much attention to it. If the LED on a disk was not working, or an HDD was completely dead and the disk was fully offline, you would simply look for the node it is attached to (which is visible in the Prism hardware display diagram), go to that physical node, and remove the disk. Alternatively, you could flash the LED on that node, which would help, or flash the LEDs of all the disks around the dead disk, so that the dead disk stands out as "not flashing". If there was any doubt, you would compare the serial number of the disk in Prism to the one printed on the physical drive to double check. This is the general approach for almost all storage systems.

RE Removing a node
You do not need to svMotion anything at all. The CVMs will move the data off that node. Even if you did "svMotion" all of those VMs, you'd still have data on that CVM before it is removed from the cluster, as it is part of the cluster's overall capacity, so the svMotion literally doesn't do anything but waste time. Just regular vMotion the VMs to another node first, then remove the node using Prism and wait for the task to complete.

RE RF3 to RF2 change
Are you talking about the cluster level or the container level? The cluster level cannot be changed back. I *think* the container level can be changed back, but I've never tried it. Absolute worst case, just provision another container at RF2 and storage vMotion the data.
That said, this is one of those settings you should really think through rather than change arbitrarily. This is why the "change RF" command is NCLI only, and not in the GUI.

RE Foundation RF2
This sets up the "cluster level" RF, but regardless of Foundation, if you have a cluster that is RF2, you can go to RF3 with NCLI, yes.

RE Enable/Disable ECX
Sure, you can change that all you want. Just note that encoded data will stay encoded until the data is overwritten. This is slightly different behavior from enabling/disabling compression, which does actively compress/decompress data when the setting is changed.
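
On the node-removal point: since the CVMs move the data off the departing node, the remaining nodes need enough free space to absorb it. A minimal Python sketch of that sanity check follows. The function name, the per-node figures in TiB, and the 90% fill ceiling are all illustrative assumptions, not Nutanix values or APIs; Prism's own capacity figures remain the thing to trust.

```python
def can_absorb_node_removal(nodes, leaving, max_fill=0.90):
    """Return True if all currently stored data fits on the remaining nodes.

    nodes:    dict of node name -> (used_tib, capacity_tib); hypothetical figures.
    leaving:  name of the node that is about to be removed.
    max_fill: illustrative ceiling on post-removal utilisation (an assumption).
    """
    if leaving not in nodes:
        raise KeyError(f"unknown node: {leaving}")
    # Data on the leaving node must be rehomed, so count every node's usage.
    total_used = sum(used for used, _ in nodes.values())
    # Only the nodes that stay contribute capacity afterwards.
    remaining_capacity = sum(
        cap for name, (_, cap) in nodes.items() if name != leaving
    )
    return total_used <= remaining_capacity * max_fill


# Hypothetical 6-node cluster, each node holding 4 TiB of its 10 TiB.
cluster = {f"node{i}": (4.0, 10.0) for i in range(1, 7)}
print(can_absorb_node_removal(cluster, "node6"))
```

With these made-up numbers, 24 TiB of data has to fit within 90% of the remaining 50 TiB, so the check passes; at 8 TiB used per node it would not.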

xiaowei · Answer

Dear Jon,
Thanks for your detailed reply. I'd like to confirm a few points further:

--RE ECX--
You mean ECX and block awareness can't be used together as of now. But I think most users will want block-level fault tolerance and a reduced RF footprint (more usable capacity) at the same time. Also, 3 blocks is a conventional configuration; are more blocks needed to enable ECX alongside block awareness?

--RE RF3 to RF2 change & RE Foundation RF2--
I just want to confirm the difference and relationship between cluster-level and container-level RF. In particular, if the cluster level is set to RF3, can containers be set to RF2 and RF3? (I mean different containers with different RFs; I know a single container can have only one RF at a time.) Conversely, if the cluster level is set to RF2, can containers be set to RF2 and RF3? In my current understanding, the cluster level controls the RF of some cluster components, e.g. Zookeeper, and the container level controls ordinary VM data. Also, the RF of some cluster components can even be set to 5; I'm curious whether I can change it.

--RE Enable/Disable ECX--
From what you said, I guess that if I disable ECX, no additional capacity is required, since the data stays encoded. But if I disable compression, must additional capacity be available for the decompressed data? Otherwise, might disabling it fail? If so, how do I calculate the required free space?

Thanks. Best regards
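
The free-space question above can at least be framed as first-order arithmetic: each logical byte is stored once per replica, and compression divides that footprint by the achieved ratio. The Python sketch below is only that back-of-the-envelope estimate. The function names and the example figures are hypothetical, it ignores metadata and other overheads, and it does not settle whether disabling compression can actually fail for lack of space, which the thread leaves unanswered.

```python
def physical_footprint(logical_tib, rf, compression_ratio=1.0):
    """Physical space for `logical_tib` of data: one copy per replica,
    divided by the compression ratio (1.0 means uncompressed)."""
    if rf < 1 or compression_ratio <= 0:
        raise ValueError("rf must be >= 1 and compression_ratio > 0")
    return logical_tib * rf / compression_ratio


def extra_space_to_decompress(logical_tib, rf, compression_ratio):
    """Additional physical space needed if the same data were stored
    uncompressed (a rough estimate, not Nutanix's own accounting)."""
    compressed = physical_footprint(logical_tib, rf, compression_ratio)
    uncompressed = physical_footprint(logical_tib, rf)
    return uncompressed - compressed


# Hypothetical container: 10 TiB of logical data at RF2, compressed 2:1.
print(extra_space_to_decompress(10.0, rf=2, compression_ratio=2.0))  # 10.0
```

Under these assumed numbers the data occupies 10 TiB compressed and 20 TiB uncompressed, so roughly 10 TiB of extra free space would be needed.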
