Failed to start CVM after CE 2.1 installation | Nutanix Community

Hi,

I’ve been trying to install Nutanix CE 2.1 (from this post) on a bare metal server bought from OVH (RISE-1-11) with the following specifications:

  • CPU: Intel Xeon-E 2136 - 6c/12t - 3.3 GHz/4.5 GHz
  • RAM: 64 GB ECC 2666 MHz
  • Disks:
    • 2×512 GB SSD NVMe
    • 2×4 TB HDD SATA

After multiple tries (OVH’s IPMI/KVM feature is pretty bad, so I had a very hard time completing the installation process), I finally succeeded in installing the ISO on this machine.

However, once I boot in AHV, I try to ping the CVM but get a Destination Host Unreachable error. After checking with virsh list --all I found out that the CVM was in a shut off state. I tried to start it with virsh start <CVM_NAME> but got the following errors from qemu:

error: Failed to start domain ‘NTNX-b05f675e-A-CVM’
error: internal error: qemu unexpectedly closed the monitor: 2024-08-28T14:42:04.640171Z qemu-kvm: warning: Large machine and max_ram_below_4g (536870912) not a multiple of 1G; possible bad performance.
2024-08-28T14:42:04.653150Z qemu-kvm: -device cirrus-vga,id=video0,bus=pci.0,addr=0x2: warning: ‘cirrus-vga’ is deprecated, please use a different VGA card instead
2024-08-28T14:42:04.654006Z qemu-kvm: -device vfio-pci,host=0000:02:00.0,id=ua-a3b24f6a-3a35-4eb6-90ff-1a082c9f57af,bus=pci.0,addr=0x7,rombar=0: vfio 0000:02:00.0: group 1 is not viable
Please ensure all devices within the iommu_group are bound to their vfio bus driver.
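
For reference, the "group 1 is not viable" message refers to the IOMMU group that contains 0000:02:00.0. A minimal sketch for listing every device in each group along with its bound driver, assuming the kernel exposes groups under /sys/kernel/iommu_groups (the function name and the root parameter are illustrative, not part of any Nutanix tooling):

```python
import os


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to [(pci_address, driver_or_None), ...]."""
    groups = {}
    if not os.path.isdir(root):
        return groups  # IOMMU disabled or not exposed by this kernel
    for group in sorted(os.listdir(root), key=int):
        devices_dir = os.path.join(root, group, "devices")
        entries = []
        for addr in sorted(os.listdir(devices_dir)):
            driver_link = os.path.join(devices_dir, addr, "driver")
            # 'driver' is a symlink to the bound driver; it is absent if unbound
            if os.path.islink(driver_link):
                driver = os.path.basename(os.path.realpath(driver_link))
            else:
                driver = None
            entries.append((addr, driver))
        groups[int(group)] = entries
    return groups


if __name__ == "__main__":
    for group, devices in list_iommu_groups().items():
        for addr, driver in devices:
            print(f"group {group}: {addr} driver={driver or '(none)'}")
```

Any device that shares group 1 with 0000:02:00.0 and is bound to a regular host driver would be a candidate explanation for the error, though this sketch only reports the bindings and does not decide viability.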

I checked the CVM XML definition and didn’t see anything weird aside from the fact that the RAM amount is 20GB (20971520 exactly). I read somewhere that it should be at least 32GB, so I edited the definition with virsh edit <CVM_NAME> and set memory and currentMemory to 32150000. Then I tried to start the CVM but got the exact same error as above.
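
As a quick sanity check on the numbers above: libvirt's memory values here are in KiB, so the conversion can be sketched as follows (the helper names are illustrative):

```python
KIB_PER_GIB = 1024 * 1024  # 1 GiB = 1024 MiB = 1024 * 1024 KiB


def kib_to_gib(kib):
    """Convert a libvirt-style KiB value to GiB."""
    return kib / KIB_PER_GIB


def gib_to_kib(gib):
    """Convert GiB to the KiB value libvirt expects."""
    return gib * KIB_PER_GIB


print(kib_to_gib(20971520))            # 20.0
print(round(kib_to_gib(32150000), 2))  # 30.66
print(gib_to_kib(32))                  # 33554432
```

So 20971520 KiB is exactly 20 GiB, while 32150000 KiB is a little under 31 GiB rather than a full 32 GiB.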

Furthermore, I found out that I don’t have access to the internet although I did specify the right gateway upon installation. When I run ping 8.8.8.8 I just get the following error: connect: Network is unreachable.

The result of ip route is the following:

10.0.0.0/16 dev br0 proto kernel scope link src 10.0.0.1
169.254.0.0/16 dev br0 scope link metric 1006
192.168.5.0/24 dev virbr0 proto kernel scope link src 192.168.5.1 linkdown

I tried adding the default route manually with ip route add default via <GATEWAY_IP> dev eth0 but got Error: Nexthop has invalid gateway. I checked the available network adapters with ip a and saw there was an eth1 adapter, tried the same ip route … command with it and got the same error.

I don’t believe that not having access to the internet should prevent the CVM from starting. It sure will be an issue later on but right now I can’t even start the CVM at all, with the aforementioned qemu error.

If that’s of any help, here is the configuration I specified upon installation:

  • Host IP: 10.0.0.1
  • CVM IP: 10.0.0.2
  • Subnet mask: 255.255.0.0
  • Gateway: <GATEWAY_IP>

I left the default selections for disks attribution: the two SSDs are used by the CVM and AHV and the two HDDs are used for data.

I should also mention that I already tried to install the previous version, CE 2.0, but got the same qemu errors. I also searched for the error online but couldn’t find any definitive answer as to what should be done to fix this issue.

If anyone would be willing to help, it would mean the world to me!

A couple of items: the CVM should be 20GB; the rest of the memory is there for VMs to use. Changing it doesn’t hurt anything, but the CVM will just use more memory.

Can you post a copy of the CVM XML file here, and also the output of the following commands?

lspci
lspci -k | grep vfio

 

The networking here seems strange. You have 10.0.0.1 and 10.0.0.2 as your IPs in a /16 network; is the gateway IP at the top of that range? Are you behind a NAT? The only reason the installer won’t apply the gateway IP is if it’s invalid within the IP range given (which is also why you got that error when trying to set it manually). If you could provide the gateway IP (or send it via DM if you don’t want to post it publicly), that would help immensely.
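
To illustrate the point about the gateway having to fall inside the configured range, here is a minimal sketch using Python's standard ipaddress module. The gateway values are placeholders (203.0.113.254 is a documentation-only address), since the real gateway was not posted:

```python
import ipaddress


def gateway_is_on_link(host_ip, netmask, gateway_ip):
    """Return True if the gateway lies inside the host's own subnet."""
    # strict=False lets us pass a host address instead of the network address
    network = ipaddress.ip_network(f"{host_ip}/{netmask}", strict=False)
    return ipaddress.ip_address(gateway_ip) in network


# Gateway inside 10.0.0.0/16: a default route via it would be accepted.
print(gateway_is_on_link("10.0.0.1", "255.255.0.0", "10.0.255.254"))   # True
# Hypothetical public gateway outside 10.0.0.0/16: not directly reachable.
print(gateway_is_on_link("10.0.0.1", "255.255.0.0", "203.0.113.254"))  # False
```

A gateway that fails this check is not on-link for a 10.0.0.1/16 host address, which is consistent with the "Nexthop has invalid gateway" error.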


 


Thanks for your reply. I’ll try to answer your questions as best I can.

First, I should tell you that I only have KVM access to the machine, so I can’t copy/paste anything. Therefore, everything I share here was retyped by hand. I tried to avoid any mistakes, but if something doesn’t look right, I may have made one…

Here is the CVM XML file (note that I set back the memory to 20GB):

<domain type='kvm'>
<name>NTNX-b05f675e-A-CVM</name>
<uuid>294589fb-09ab-4184-a7f6-9f79777654fb</uuid>
<memory unit='KiB'>20971520</memory>
<currentMemory unit='KiB'>20971520</currentMemory>
<memoryBacking>
<hugepages/>
<nosharepages/>
</memoryBacking>
<vcpu placement='static' cpuset='0-11'>2</vcpu>
<resource>
<partition>/machine</partition>
</resource>
<os>
<type arch='x86_64' machine='pc-i440fx-rhel7.6.0' max_ram_below_4g='536870912'>hvm</type>
<boot dev='cdrom'/>
<bootmenu enable='no'/>
</os>
<features>
<acpi/>
<pae/>
</features>
<cpu mode='host-passthrough' check='none' migratable='on'>
<topology sockets='2' dies='1' cores='1' threads='1'/>
<numa>
<cell id='0' cpus='0-1' memory='20971520' unit='KiB' memAccess='shared'/>
</numa>
</cpu>
<clock offset='utc'/>
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>restart</on_crash>
<devices>
<emulator>/usr/libexec/qemu-kvm</emulator>
<disk type='file' device='cdrom'>
<driver name='qemu' type='raw'/>
<source file='/var/lib/libvirt/NTNX-CVM/svmboot.iso'/>
<target dev='hdc' bus='ide'/>
<readonly/>
<address type='drive' controller='0' bus='1' target='0' unit='0'/>
</disk>
<disk type='block' device='disk'>
<driver name='qemu' type='raw' cache='none' io='native'/>
<source dev='/dev/disk/by-id/ata-HGST_HUS726T4TALA6L1_V1H6S4PH'/>
<backingStore/>
<target dev='sda' bus='scsi'/>
<serial>V1H6S4PH</serial>
<wwn>5000cca0bcd128a3</wwn>
<vendor>ATA</vendor>
<product>HGST HUS726T4TAL</product>
<alias name='ua-08e4c300-95bb-4c84-80c6-dc45d75c1bfc'/>
<address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
<disk type='block' device='disk'>
<driver name='qemu' type='raw' cache='none' io='native'/>
<source dev='/dev/disk/by-id/ata-HGST_HUS726T4TALA6L1_V6HMX2US'/>
<backingStore/>
<target dev='sdb' bus='scsi'/>
<serial>V6HMX2US</serial>
<wwn>5000cca097d72401</wwn>
<vendor>ATA</vendor>
<product>HGST HUS726T4TAL</product>
<alias name='ua-779e6b5e-81bf-4feb-9463-0ff7cb958e84'/>
<address type='drive' controller='0' bus='0' target='0' unit='1'/>
</disk>
<controller type='usb' index='0' model='none'/>
<controller type='pci' index='0' model='pci-root'/>
<controller type='scsi' index='0' model='virtio-scsi'>
<driver max_sectors='2048'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</controller>
<controller type='ide' index='0'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
</controller>
<interface type='bridge'>
<mac address='52:54:00:ef:7b:97'/>
<source bridge='br0'/>
<virtualport type='openvswitch'>
<parameters interfaceid='d3fb43af-f6e7-41a3-aef6-2347ca83613d'/>
</virtualport>
<model type='virtio'/>
<driver name='vhost' queues='4'/>
<alias name='ua-0464b746-2b94-425a-a04c-c6a69ac611e6'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>
<interface type='network'>
<mac address='52:54:00:07:f3:11'/>
<source network='NTNX-Local-Network'/>
<model type='virtio'/>
<driver name='vhost' queues='4'/>
<alias name='ua-db957cf8-99c2-431b-ba9e-1dc25b6a82d2'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</interface>
<interface type='bridge'>
<mac address='52:54:00:b2:ba:84'/>
<source bridge='br0'/>
<virtualport type='openvswitch'>
<parameters interfaceid='4c5403d8-6552-4b0c-9c55-aaaefe067672'/>
</virtualport>
<model type='virtio'/>
<driver name='vhost' queues='4'/>
<alias name='ua-d55a8935-f89a-4dd5-9d87-671e3822e316'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
</interface>
<serial type='file'>
<source path='/var/log/NTNX.serial.out.0'/>
<target type='isa-serial' port='0'>
<model name='isa-serial'/>
</target>
</serial>
<console type='file'>
<source path='/var/log/NTNX.serial.out.0'/>
<target type='serial' port='0'/>
</console>
<input type='mouse' bus='ps2'/>
<input type='keyboard' bus='ps2'/>
<graphics type='vnc' port='-1' autoport='yes' listen='127.0.0.1'>
<listen type='address' address='127.0.0.1'/>
</graphics>
<audio id='1' type='none'/>
<video>
<model type='cirrus' vram='16384' heads='1' primary='yes'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
</video>
<hostdev mode='subsystem' type='pci' managed='yes'>
<source>
<address domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
</source>
<alias name='ua-a3b24f6a-3a35-4eb6-90ff-1a082c9f57af'/>
<rom bar='off'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
</hostdev>
<memballoon model='none'/>
</devices>
<seclabel type='dynamic' model='selinux' relabel='yes'/>
<seclabel type='dynamic' model='dac' relabel='yes'/>
</domain>
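
To tie the definition above back to the qemu error, the PCI passthrough entries can be pulled out of the XML and printed in the same domain:bus:slot.function form that qemu and lspci use. This is a minimal sketch with Python's standard xml.etree module; the function name is illustrative and the snippet is trimmed down to the hostdev element from the definition above:

```python
import xml.etree.ElementTree as ET


def hostdev_pci_addresses(domain_xml):
    """Return host PCI addresses passed through to the domain."""
    root = ET.fromstring(domain_xml)
    addresses = []
    # <source><address .../></source> holds the *host* address of the device
    for addr in root.findall("./devices/hostdev[@type='pci']/source/address"):
        domain = int(addr.get("domain", "0x0"), 16)
        bus = int(addr.get("bus"), 16)
        slot = int(addr.get("slot"), 16)
        function = int(addr.get("function"), 16)
        addresses.append(f"{domain:04x}:{bus:02x}:{slot:02x}.{function:x}")
    return addresses


SNIPPET = """<domain type='kvm'>
  <devices>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
      </source>
    </hostdev>
  </devices>
</domain>"""

print(hostdev_pci_addresses(SNIPPET))  # ['0000:02:00.0']
```

The result, 0000:02:00.0, is the same device named in the vfio error.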

Result of the lspci command:

00:00.0 Host bridge: Intel Corporation 8th Gen Core Processor Host Bridge/DRAM Registers (rev 07)
00:01.0 PCI bridge: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) (rev 07)
00:01.1 PCI bridge: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x8) (rev 07)
00:08.0 System peripheral: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model
00:12.0 Signal processing controller: Intel Corporation Cannon Lake PCH Thermal Controller (rev 10)
00:14.0 USB controller: Intel Corporation Cannon Lake PCH USB 3.1 xHCI Host Controller (rev 10)
00:14.2 RAM memory: Intel Corporation Cannon Lake PCH Shared SRAM (rev 10)
00:15.0 Serial bus controller: Intel Corporation Cannon Lake PCH Serial IO I2C Controller #0 (rev 10)
00:15.1 Serial bus controller: Intel Corporation Cannon Lake PCH Serial IO I2C Controller #1 (rev 10)
00:16.0 Communication controller: Intel Corporation Cannon Lake PCH HECI Controller (rev 10)
00:16.1 Communication controller: Intel Corporation Device a361 (rev 10)
00:16.4 Communication controller: Intel Corporation Cannon Lake PCH HECI Controller #2 (rev 10)
00:17.0 SATA controller: Intel Corporation Cannon Lake PCH SATA AHCI Controller (rev 10)
00:1b.0 PCI bridge: Intel Corporation Cannon Lake PCH PCI Express Root Port #17 (rev f0)
00:1b.4 PCI bridge: Intel Corporation Cannon Lake PCH PCI Express Root Port #21 (rev f0)
00:1c.0 PCI bridge: Intel Corporation Cannon Lake PCH PCI Express Root Port #1 (rev f0)
00:1d.0 PCI bridge: Intel Corporation Cannon Lake PCH PCI Express Root Port #9 (rev f0)
00:1e.0 Communication controller: Intel Corporation Cannon Lake PCH Serial IO UART Host Controller (rev 10)
00:1f.0 ISA bridge: Intel Corporation Cannon Point-LP LPC Controller (rev 10)
00:1f.4 SMBus: Intel Corporation Cannon Lake PCH SMBus Controller (rev 10)
00:1f.5 Serial bus controller: Intel Corporation Cannon Lake PCH SPI Controller (rev 10)
01:00.0 Non-Volatile memory controller: Sandisk Corp SanDisk Extreme Pro / WD Black 2018/SN750/PC SN720 NVMe SSD
02:00.0 Non-Volatile memory controller: Sandisk Corp SanDisk Extreme Pro / WD Black 2018/SN750/PC SN720 NVMe SSD
04:00.0 Ethernet controller: Intel Corporation Ethernet Controller X550 (rev 01)
04:00.1 Ethernet controller: Intel Corporation Ethernet Controller X550 (rev 01)
05:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge (rev 04)
06:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 41)

Result of the lspci -k | grep vfio command: Kernel driver in use: vfio-pci
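
Since grep only keeps the matching line, the output above does not show which device is bound to vfio-pci. A minimal sketch that pairs each device in the lspci -k output with its "Kernel driver in use" line, assuming the usual layout where device lines start at column 0 and detail lines are indented (the function name is illustrative):

```python
import shutil
import subprocess


def drivers_by_device(lspci_k_output):
    """Map each PCI address to its bound kernel driver (None if unbound)."""
    result = {}
    current = None
    for line in lspci_k_output.splitlines():
        if line and not line[0].isspace():
            # Device line, e.g. "02:00.0 Non-Volatile memory controller: ..."
            current = line.split(" ", 1)[0]
            result[current] = None
        elif current and "Kernel driver in use:" in line:
            result[current] = line.split(":", 1)[1].strip()
    return result


if __name__ == "__main__" and shutil.which("lspci"):
    output = subprocess.run(["lspci", "-k"], capture_output=True,
                            text=True, check=True).stdout
    for address, driver in drivers_by_device(output).items():
        if driver == "vfio-pci":
            print(address, driver)
```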

As for the networking side, it was unclear what was expected here from the documentation I read and the How to Install Community Edition Episode 1 | Nutanix University video I watched, so I assumed those were the internal IPs (except for the gateway). I can’t divulge the public IP that was assigned to the server, so I’ll send it to you privately, but the host public IP and gateway IP are not 10.0.x.x: they are under a 255.255.255.0 mask, and the gateway is indeed at the top of this /24 range. No NAT involved afaict.

Thanks again for your help! Looking forward to hearing back from you.