Solved

Data rebuild when CVM is down and Oplog question

  • 11 June 2020
  • 6 replies
  • 1306 views

Badge +1

Please answer the questions below:

  1. When one CVM goes down (in a 10-node cluster with RF3) for 20 minutes, the guest VMs' new write and read I/O would be served by another CVM, and all of it would travel across the 10G network.

A: Will those new I/Os be served from the local copy via another CVM, or from a replica copy?

B: Will the cluster start building a new replica to replace the missing copy and restore RF3?

  2. When a new write I/O request comes in, the data is first written to the oplog and then synchronously sent to another CVM's oplog. Every host in the cluster has 2 SSDs and 6 HDDs.

Will write I/O be processed by both SSDs in every host, or does only one SSD have an active oplog partition at a time?

I think there is only one oplog per CVM/host, whether the host is all-flash or has 2 SSDs with the rest HDDs. But I am not sure about this, so please clarify.


Best answer by Neel Kotak 12 June 2020, 06:47

When one CVM goes down (in a 10-node cluster with RF3) for 20 minutes, the guest VMs' new write and read I/O would be served by another CVM, and all of it would travel across the 10G network.

  • All new I/Os are redirected to another CVM when the local CVM is down. The redirect does not wait 20 minutes. The hypervisor injects a route to a remote CVM as soon as it detects that the local CVM, or Stargate on that node, is down.

 

Will those new I/Os be served from the local copy via another CVM, or from a replica copy?

  • I/Os will be served from a replica copy, since the local CVM (or Stargate on that CVM) is down.

 

Will the cluster start building a new replica to replace the missing copy and restore RF3?

  • Yes, the cluster will start rebuilding the data as soon as Stargate is down, to replace the missing copy and restore RF3.


When a new write I/O request comes in, the data is first written to the oplog and then synchronously sent to another CVM's oplog. Every host in the cluster has 2 SSDs and 6 HDDs.

  • Not all new writes go through the oplog. The oplog is a portion of the SSDs used to absorb random I/O. Sequential I/O is served by the extent store, which is a combination of SSDs and HDDs.

 

If it is a random I/O, the data is first written to the local CVM's oplog and then synchronously written to the remote CVM's oplog.

I think there is only one oplog per CVM/host, whether the host is all-flash or has 2 SSDs with the rest HDDs. But I am not sure about this, so please clarify.

  • There is only one oplog store per CVM, and it holds the oplog data of the vDisks. Each vDisk has its own oplog space, limited to 6 GB. The CVM also has an overall oplog limit, which is based on the size of the SSDs on that node.
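
As a rough illustration of the write path described in this answer, here is a toy Python model. It is only a sketch: the `Cvm` class, the `handle_write` function, and the CVM names are hypothetical and are not part of any Nutanix API. It also assumes that whether a write is random or sequential is already known, and that sequential writes are replicated to the remote CVMs' extent stores, which the answer does not spell out.

```python
from dataclasses import dataclass, field


@dataclass
class Cvm:
    """Hypothetical stand-in for a CVM; not a Nutanix API."""
    name: str
    oplog: list = field(default_factory=list)
    extent_store: list = field(default_factory=list)


def handle_write(data, is_random, local, remotes):
    """Route one write and return the tier that received it.

    Random writes land in the local oplog and are copied to the oplog
    of each remote CVM. Sequential writes skip the oplog and go to the
    extent store (replicating them to the remotes as well is an
    assumption of this sketch).
    """
    tier = "oplog" if is_random else "extent_store"
    for cvm in [local, *remotes]:
        getattr(cvm, tier).append(data)
    return tier


local = Cvm("cvm-a")
remotes = [Cvm("cvm-b"), Cvm("cvm-c")]  # two remote copies, as with RF3

print(handle_write(b"random-block", True, local, remotes))
print(handle_write(b"sequential-block", False, local, remotes))
```

The point of the sketch is only that the oplog is one store per CVM and that it sees random writes, not every write.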

 

 

 


6 replies

Badge +1

Thanks a lot for answering the above questions.

Could you please answer the ones below as well?

  1. What would the CVM oplog size be if we have 2 SSDs (1 TB each) per node?
  2. What are the vDisk, VM configuration, and snapshot file extensions of a virtual machine running on an AHV cluster? (For example, in VMware they are .vmdk, .xml, and -delta.)
Userlevel 2
Badge +3

What would the CVM oplog size be if we have 2 SSDs (1 TB each) per node?

By default, the oplog size is 25% of the SSD size. It can be raised manually to a maximum of 400 GB, but only when the workload requires it and in consultation with Nutanix Support.
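
To put numbers on this for the 2 x 1 TB case, here is a small arithmetic sketch in Python. It takes the 25% and 400 GB figures from the reply above. It assumes 1 TB = 1000 GB and that the 25% applies to each SSD separately, neither of which is confirmed in this thread. The function name `oplog_estimate_gb` is made up for illustration.

```python
OPLOG_FRACTION = 0.25  # "25% of the size of SSD", from the reply above
MANUAL_MAX_GB = 400    # "maximum up to 400GB", from the reply above


def oplog_estimate_gb(ssd_sizes_gb):
    """Return (per-SSD share, naive total) in GB.

    Assumption: the 25% applies to each SSD separately and the
    shares are simply summed. This is not confirmed by the thread.
    """
    per_ssd = [size * OPLOG_FRACTION for size in ssd_sizes_gb]
    return per_ssd, sum(per_ssd)


per_ssd, total = oplog_estimate_gb([1000, 1000])  # two 1 TB SSDs
print(per_ssd, total, total > MANUAL_MAX_GB)
```

Under these assumptions the naive total is above the 400 GB figure, so how the two numbers interact on a real node is best confirmed with Nutanix Support.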

 

What are the vDisk, VM configuration, and snapshot file extensions of a virtual machine running on an AHV cluster? (For example, in VMware they are .vmdk, .xml, and -delta.)

A vDisk is any file over 512 KB on the Distributed File System, including .vmdks and VM hard disks. vDisks are logically composed of vBlocks, which make up the 'block map.'

 

Please refer to the Nutanix Bible for more details.

 

Badge +1

A vDisk is any file over 512 KB on the Distributed File System, including .vmdks and VM hard disks. vDisks are logically composed of vBlocks, which make up the 'block map.'

 

Thanks again for answering the requested questions.

I agree with the above statement, but I would like to know the file extensions of a VM running on the AHV hypervisor. I have heard the terms below; please correct me if I am wrong.

The VM vDisk format in AHV is qcow2.

The VM config extension is Bin or XML.

Userlevel 2
Badge +3

Yes, that's correct. 

 

The VM vDisk format in AHV is qcow2.

The VM config extension is XML.

Badge +1

Okay, thanks again for your response.
