Duplicated CVM machine? | Nutanix Community

Hello,

After deploying a 3-node cluster, I can see 4 CVMs.

Two of them look like the same VM, duplicated.

How can I fix this and remove one of them? I cannot update vs0 on the node where the two CVMs are placed.
Rebooting the AHV host and CVM did not help, and I cannot use acli on that node.

 

nutanix@NTNX-b7617012-A-CVM:10.170.0.11:~$ ncli vm list

Id : 00062cfc-0318-cd70-17eb-005056b90eac::7fa39b58-506c-40f9-b5b8-54146d84855e
Uuid : b452d890-a399-49b3-b415-e975c489d577
Name : NTNX-b7617012-A-CVM
VM IP Addresses : 10.170.0.11, 192.168.5.2, 192.168.5.254
Hypervisor Host Id : 00062cfc-0318-cd70-17eb-005056b90eac::5
Hypervisor Host Uuid : f20c4224-3d4f-4472-bb79-2e8aa0968bc5
Hypervisor Host Name : NTNX-b7617012-A
Memory : 20 GiB (21,474,836,480 bytes)
Virtual CPUs : 2
VDisk Count : 0
Protection Domain :
Consistency Group :

Id : 00062cfc-0318-cd70-17eb-005056b90eac::9aef76d3-3584-49a1-af9c-4b50c900dd9d
Uuid : 1a269e5c-0aee-4af3-af21-e036ccf0e80f
Name : NTNX-b7617012-A-CVM
VM IP Addresses : 10.170.0.11, 192.168.5.2, 192.168.5.254
Hypervisor Host Id : 00062cfc-0318-cd70-17eb-005056b90eac::5
Hypervisor Host Uuid : f20c4224-3d4f-4472-bb79-2e8aa0968bc5
Hypervisor Host Name : NTNX-b7617012-A
Memory : 20 GiB (21,474,836,480 bytes)
Virtual CPUs : 2
VDisk Count : 0
Protection Domain :
Consistency Group :

Id : 00062cfc-0318-cd70-17eb-005056b90eac::a26669b2-6d6e-4a2d-89ad-085902f41792
Uuid : ddad9870-29f1-4b41-b247-007b10463ac5
Name : NTNX-073ed744-A-CVM
VM IP Addresses : 10.170.0.12, 192.168.5.254, 192.168.5.2
Hypervisor Host Id : 00062cfc-0318-cd70-17eb-005056b90eac::6
Hypervisor Host Uuid : 4d97eeb4-86f3-4818-877d-598864bdf7a2
Hypervisor Host Name : NTNX-073ed744-A
Memory : 20 GiB (21,474,836,480 bytes)
Virtual CPUs : 2
VDisk Count : 0
Protection Domain :
Consistency Group :

Id : 00062cfc-0318-cd70-17eb-005056b90eac::f696a5af-ace4-411b-804d-d1223fdaf246
Uuid : c2203270-3a25-4e52-8e96-4d05dadb8ee7
Name : NTNX-7b205cf1-A-CVM
VM IP Addresses : 10.170.0.13, 192.168.5.2, 192.168.5.254
Hypervisor Host Id : 00062cfc-0318-cd70-17eb-005056b90eac::7
Hypervisor Host Uuid : d355c39c-46c8-4b02-a2d6-c5e445b5c0d6
Hypervisor Host Name : NTNX-7b205cf1-A
Memory : 20 GiB (21,474,836,480 bytes)
Virtual CPUs : 2
VDisk Count : 0
Protection Domain :
Consistency Group :
nutanix@NTNX-b7617012-A-CVM:10.170.0.11:~$ acli
Failed to connect to server: [Errno 111] Connection refused
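
The “Connection refused” from acli suggests the Acropolis service it talks to is not running on this CVM. One way to check (just a sketch, assuming the standard genesis service listing on the CVM) is:

nutanix@NTNX-b7617012-A-CVM:10.170.0.11:~$ genesis status | grep -i acropolis
# an empty PID list (or no matching line at all) would mean Acropolis is down on this CVM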


 

Wow, never seen that before. I know you want to understand what happened and how to solve it, but to be honest, just reinstall the nodes. That is a far faster way to get up and running again.

 

I don't know about your environment. Is this Community Edition? (I assume yes, since the CVMs only have 2 vCPUs.) Are you running nested? If so, did you clone the virtual machines? What was the exact cluster create command; maybe there was a typo in it (although the create script should catch that)? As you can see, there are many questions. And Zookeeper already has a wrong configuration in it, so just reinstall ;)


Yes, it is Community Edition and it is a nested environment.

Cluster create command:

nutanix@cvm$ cluster -s 10.170.0.11,10.170.0.12,10.170.0.13  --cluster_external_ip 10.170.0.10 --cluster_name nut-cl01 create

I would remove the wrong VM (I have identified which one), but unfortunately none of the commands I know work.


I’m also running CE and ran into the same issue with duplicate CVMs. After initially installing my 3-node cluster, I renamed all hosts and CVMs following the Nutanix docs. After installing and configuring Prism Central, I went through LCM in Prism Element to run updates. During the AOS upgrade, when the first AHV host rebooted, it came back up with two CVMs, both powered off. The “new” CVM that appeared carried the old hostname from before I renamed them.

I logged into the AHV hosts one at a time and ran “virsh list --all” to see the CVM names, then ran “virsh start <newAhvHostname>” to start the correct CVM. After manually starting the correct CVM, the upgrade process completed for that node and moved to the next node. I had to repeat this process for all three nodes in the cluster.
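
Roughly, the per-host recovery looked like this (run as root on each AHV host; the prompt and the <newAhvHostname> domain name are placeholders for whatever “virsh list --all” actually reports on that host):

root@ahv# virsh list --all              # every defined CVM shows up here, running or not
root@ahv# virsh start <newAhvHostname>  # start the correct (renamed) CVM
root@ahv# virsh list                    # confirm only the correct CVM is now running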

I’m still trying to figure out the “safest” way to remove the CVMs with the old hostnames.

CE version is 2.1. The following upgrades were available:

  • Foundation from 5.7 to 5.7.1
  • FSM from 5.0.0.3 to 5.1.0
  • AOS from 6.8.1 to 6.10.1
  • AHV hypervisor from el8.nutanix.20230302.101026 to el8.nutanix.20230302.103003

The component compatibility information for these versions is behind the support paywall, and we don’t have a support account yet since we’re in our PoC phase, so I can’t say whether a compatibility mismatch contributed to the upgrade issues. I can say, though, that we were not able to verify any compatibility before beginning the process.


Running “virsh list --all” shows the existing VMs in KVM.
Running “virsh list” shows that the duplicate CVM is not running.
I checked /etc/libvirt/qemu and found a config file for each CVM. I ran diff against the two files and found they were almost identical. I was able to remove the duplicate CVM by running “virsh undefine <incorrectAhvHostname>”, which also deleted its config file in /etc/libvirt/qemu. Now I just have to rinse and repeat for the remaining hosts in the cluster.
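
For each host, the cleanup was roughly this sequence (the prompt and domain names are placeholders; double-check with “virsh list --all” that the VM you undefine is the powered-off duplicate, not the CVM in use):

root@ahv# virsh list --all
root@ahv# diff /etc/libvirt/qemu/<incorrectAhvHostname>.xml /etc/libvirt/qemu/<correctAhvHostname>.xml   # nearly identical
root@ahv# virsh undefine <incorrectAhvHostname>   # removes the stale definition and its XML file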


I am having the same issue upgrading from 6.8.1 to 6.10.1

 

The upgrade is still running, so I will see whether things settle once it is all done.