Question

Boot error after upgrade with LCM

  • 19 November 2021
  • 3 replies
  • 44 views

Using LCM, I updated one of our hosts to the latest firmware, from G4G5T6.0 to G4G5T8.0 (BMC 3.64 to 3.65), and since then all I get is a black screen with the words “boot error”.

I can get into the BIOS via IPMI and see the boot device, “sSATA P3: SATADOM-SL 3ME”, but it is just not booting from it anymore. I also could not find any documentation on how to troubleshoot it.

Could someone point me in the right direction please, thanks!

Note: of the 6 hosts in our environment, 3 have already updated fine and nothing has changed, so I'm a little confused. I’ve also been updating one host at a time.


3 replies


Hello,

Can you please share a screenshot of the host's IPMI page where you see this boot error? We can try checking the health of the SATADOM to see if it is fine. Please open a case with Nutanix Support to troubleshoot further.

As I said, it's a black screen with just the words ‘boot error’ on it, nothing else.

How do I check the SATADOM?

No support, unfortunately, which is why I’ve come to the forums to resolve this.

 

So I tried to recover the boot device with the “Repair host boot device” option: I booted from the phoenix.iso and uploaded “VMware-VMvisor-Installer-7.0U2a-17867351.x86_64”, and even tried “VMware-VMvisor-Installer-7.0.0-15843807.x86_64”, but both keep failing with “log.FATAL("Failed to commit cow disk image.")”.

 

I also tried to create an ISO from the other hosts using the following guide: https://portal.nutanix.com/page/documents/kbs/details?targetId=kA032000000TUksCAG

I tried to reinstall ESXi, but this fails too with the same error. Below are some more of the error details; Google is not showing much on this either.

 

2021-11-23 11:19:37,277Z DEBUG foundationapi.upload_local called with
2021-11-23 11:19:37,277Z DEBUG {u'installer_type': u'esx', u'localpath': u'/home/nutanix/software_downloads/hypervisor_installer/VMware-VMvisor-Installer-7.0.0-15843807.x86_64.iso'}
2021-11-23 11:19:37,278Z DEBUG link created
2021-11-23 11:19:40,573Z DEBUG foundationapi.image_nodes called with
2021-11-23 11:19:40,573Z DEBUG {u'cvm_gateway': u'10.31.80.1', u'blocks': [{u'nodes': [{u'node_position': u'B', u'hypervisor': u'esx', u'image_now': True, u'svm_version': u'el7.3-release-fraser-6.0.1.7-stable-927717a4d1e8b021d465bd5bdfa314183c40608e', u'cvm_gb_ram': 48, u'nos_version': u'6.0.1.7', u'current_cvm_vlan_tag': 1435, u'cvm_ip': u'10.31.80.139', u'hypervisor_ip': u'10.31.80.133', u'svm_install_type': None}], u'block_id': u'15SM60340028'}], u'hypervisor_netmask': u'255.255.255.0', u'nos_version': u'6.0.1.7', u'cvm_netmask': u'255.255.255.0', u'skip_hypervisor': False, u'hypervisor_gateway': u'10.31.80.1', u'hypervisor_iso': {u'esx': u'VMware-VMvisor-Installer-7.0.0-15843807.x86_64.iso'}, u'clusters': [], u'hypervisor_nameserver': u'10.31.80.3'}
2021-11-23 11:21:12,294Z INFO Model detected: NX-3060-G4
2021-11-23 11:21:13,273Z INFO vpd_info {'node_serial': 'OM157S010050', 'node_position': u'B', 'vpd_method': None, 'cluster_id': 33764, 'rackable_unit_serial': u'15SM60340028'}
2021-11-23 11:21:13,278Z INFO Using node_serial from FRU
2021-11-23 11:21:13,284Z INFO Using block_id from FRU
2021-11-23 11:21:13,289Z INFO Using cluster_id from FRU
2021-11-23 11:21:13,293Z INFO Using node_position from FRU
2021-11-23 11:21:13,297Z INFO node_serial = OM157S010050, node_uuid = None, block_id = 15SM60340028, cluster_id = 33764, model = USE_LAYOUT, model_string = NX-3060-G4, node_position = B
2021-11-23 11:21:13,300Z INFO Running updated Phoenix
2021-11-23 11:21:14,265Z INFO This node doesn't support VMD
2021-11-23 11:21:14,269Z INFO Skipping BIOS settings updates for VMD: Skipping BIOS settings updates for VMD
2021-11-23 11:21:14,272Z INFO Getting NOS version from the CVM
2021-11-23 11:21:14,545Z INFO CVM release version is el7.3-release-fraser-6.0.1.7-stable-927717a4d1e8b021d465bd5bdfa314183c40608e, nos version is 6.0.1.7
2021-11-23 11:21:14,550Z INFO Attempting to get nos tarball from CVM
2021-11-23 11:21:14,566Z INFO Found nos 6.0.1.7 on the cvm
2021-11-23 11:21:32,671Z INFO Generated /mnt/nos_from_svm/6.0.1.7/nutanix_installer_package.tar from cvm successfully
2021-11-23 11:21:32,928Z INFO Downloading hypervisor iso
2021-11-23 11:21:32,932Z INFO Downloading hypervisor iso: Downloading hypervisor iso
2021-11-23 11:21:32,938Z INFO Downloading file 'VMware-VMvisor-Installer-7.0.0-15843807.x86_64.iso' with size: 367314944 bytes.
2021-11-23 11:21:33,967Z INFO Downloading driver package from url http://10.31.80.138:8000/files/tmp/sessions/20211123-111940-1/files/driver_package_SMIPMI_esx.tar.gz
2021-11-23 11:21:33,975Z INFO Downloading file 'driver_package.tar.gz' with size: 5327818 bytes.
2021-11-23 11:21:33,995Z INFO Completed file transfer: Completed file transfer
2021-11-23 11:21:36,608Z INFO Skipping rdma checks as rdma_passthrough is disabled
2021-11-23 11:21:36,626Z DEBUG Available memory 251.795730591, Max allowed memory for CVM: 245.795730591
2021-11-23 11:21:36,629Z DEBUG Using new CVM sizing policy: False
2021-11-23 11:21:36,632Z DEBUG Determining numa node for allocating vcpu
2021-11-23 11:21:36,635Z DEBUG Determining numa node for storage controller: 1000:0097:0
2021-11-23 11:21:36,751Z DEBUG Numa node for storage controller 1000:0097:0 is 0
2021-11-23 11:21:36,755Z DEBUG Storage controller(s) are on numa node 0
2021-11-23 11:21:36,786Z DEBUG Computed CVM size to be vcpu 8, mem 48
2021-11-23 11:21:36,789Z DEBUG Enabling numa because sufficient memory and CPUs available on a numa node 0
2021-11-23 11:21:36,806Z INFO Running CVM Installer: Running CVM Installer
2021-11-23 11:21:36,809Z INFO Running CVM Installer
2021-11-23 11:21:37,082Z INFO Extracting the SVM installer into memory. This will take some time...
2021-11-23 11:21:41,136Z INFO ESX iso unpacked in /tmp/tmpKhrKNF
2021-11-23 11:21:41,420Z INFO Installation Device = /dev/sdg
2021-11-23 11:21:41,423Z INFO Creating raw disk at /phoenix/imaging_helper/disk.raw
2021-11-23 11:21:41,431Z INFO blockdev --getsize64 returned 64023257088
2021-11-23 11:21:41,434Z INFO Reserved 20GiB for a VMFS partition
2021-11-23 11:21:41,439Z INFO Advising ESXi that boot drive size is 42548420608
2021-11-23 11:21:41,446Z INFO Creating cow disk at /phoenix/imaging_helper/disk.cow
2021-11-23 11:21:41,514Z INFO Executing /usr/bin/qemu-system-x86_64 -m 4096 -machine q35 -machine vmport=off -enable-kvm -drive file=/phoenix/imaging_helper/disk.cow,format=qcow2 -cdrom /phoenix/imaging_helper/installer.iso -netdev user,id=net0,net=192.168.5.0/24 -device vmxnet3,netdev=net0,id=net0,addr=1d.0,mac=0c:c4:7a:bc:5b:04 -vnc :1 -boot order=d -pidfile installer_vm.pid -daemonize -cpu host,+vmx -smp 8 \
2021-11-23 11:21:41,563Z INFO Installer VM is now running the installation
2021-11-23 11:21:41,566Z INFO Installer VM running with PID = 8674
2021-11-23 11:22:11,593Z INFO Loaded config parameters successfully
2021-11-23 11:22:11,595Z INFO [30/1830] Hypervisor installation in progress
2021-11-23 11:22:41,630Z INFO [60/1830] Hypervisor installation in progress
2021-11-23 11:23:07,945Z INFO Installing ESXi: Installing ESXi
2021-11-23 11:23:11,665Z INFO [90/1830] Hypervisor installation in progress
2021-11-23 11:23:41,698Z INFO [120/1830] Hypervisor installation in progress
2021-11-23 11:24:11,732Z INFO [150/1830] Hypervisor installation in progress
2021-11-23 11:24:41,766Z INFO [180/1830] Hypervisor installation in progress
2021-11-23 11:24:54,066Z INFO Installed ESXi successfully: Installed ESXi successfully
2021-11-23 11:25:11,800Z INFO [210/1830] Hypervisor installation in progress
2021-11-23 11:25:11,803Z INFO Installer VM finished in 210.345644236s.
2021-11-23 11:25:11,807Z INFO Hypervisor installation is done
2021-11-23 11:25:11,810Z INFO Rebasing cow disk at /phoenix/imaging_helper/disk.cow to /dev/sdg
2021-11-23 11:25:11,822Z INFO Commiting cow disk at /phoenix/imaging_helper/disk.cow
2021-11-23 12:05:11,885Z INFO Commited in 2400.1s
2021-11-23 12:05:11,889Z CRITICAL Failed to commit cow disk image.
2021-11-23 12:05:11,892Z INFO Imaging thread 'hypervisor' failed with reason [Traceback (most recent call last):
File "/root/phoenix/esx.py", line 338, in image
self.__image_boot_disk()
File "/root/phoenix/esx.py", line 205, in __image_boot_disk
if not vm.install_os():
File "/root/phoenix/imaging_helper/installer_vm.py", line 174, in install_os
self._rebase_and_commit_cow_disk()
File "/root/phoenix/imaging_helper/installer_vm.py", line 264, in _rebase_and_commit_cow_disk
log.FATAL("Failed to commit cow disk image.")
File "/root/phoenix/log.py", line 261, in FATAL
_fatal(msg)
File "/root/phoenix/log.py", line 77, in _fatal
sys.exit(1)
SystemExit: 1
]
2021-11-23 12:05:11,896Z CRITICAL Imaging thread 'hypervisor' failed with reason [Traceback (most recent call last):
File "/root/phoenix/esx.py", line 338, in image
self.__image_boot_disk()
File "/root/phoenix/esx.py", line 205, in __image_boot_disk
if not vm.install_os():
File "/root/phoenix/imaging_helper/installer_vm.py", line 174, in install_os
self._rebase_and_commit_cow_disk()
File "/root/phoenix/imaging_helper/installer_vm.py", line 264, in _rebase_and_commit_cow_disk
log.FATAL("Failed to commit cow disk image.")
File "/root/phoenix/log.py", line 261, in FATAL
_fatal(msg)
File "/root/phoenix/log.py", line 77, in _fatal
sys.exit(1)
SystemExit: 1
]
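
For anyone reading a log like this, one thing that stands out is where the time went: the ESXi install itself finished in about 210 seconds, while the commit step sat for 2400 seconds before failing. Here is a minimal sketch, assuming the log has been saved as text with the same timestamp prefix as above, that lists the largest gaps between consecutive timestamped lines. The function names (`parse_ts`, `longest_gaps`) are made up for illustration and are not part of Phoenix or Foundation.

```python
from datetime import datetime

def parse_ts(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    try:
        # The first 23 characters look like "2021-11-23 11:25:11,822".
        return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")
    except ValueError:
        return None  # traceback lines, blank lines, etc.

def longest_gaps(lines, top=3):
    """Return (seconds, line) pairs for the lines followed by the longest waits."""
    stamped = [(parse_ts(l), l.rstrip()) for l in lines]
    stamped = [(t, l) for t, l in stamped if t is not None]
    gaps = [
        ((nxt[0] - cur[0]).total_seconds(), cur[1])
        for cur, nxt in zip(stamped, stamped[1:])
    ]
    return sorted(gaps, key=lambda g: g[0], reverse=True)[:top]

# Four lines copied verbatim from the log above.
SAMPLE = [
    "2021-11-23 11:25:11,810Z INFO Rebasing cow disk at /phoenix/imaging_helper/disk.cow to /dev/sdg",
    "2021-11-23 11:25:11,822Z INFO Commiting cow disk at /phoenix/imaging_helper/disk.cow",
    "2021-11-23 12:05:11,885Z INFO Commited in 2400.1s",
    "2021-11-23 12:05:11,889Z CRITICAL Failed to commit cow disk image.",
]

if __name__ == "__main__":
    for seconds, line in longest_gaps(SAMPLE):
        print(f"{seconds:8.1f}s after: {line}")
```

On the sample lines, the biggest gap is the 40 minutes after "Commiting cow disk". Such a round number may point to a timeout while writing to /dev/sdg rather than a normal completion, but that is only an inference from the timestamps, not something the log states.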

I am at a loss as to how to get this node back up and running again.

 

Thanks!

So, it looks like the ISO I was using was incomplete. I generated a new ISO on another node that had lots of free space, and it was a lot larger.
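
A truncated ISO is easy to miss, so it can help to record the size and checksum of an image before uploading it. The following is a minimal sketch, assuming Python is available wherever the ISO is stored; `file_fingerprint` and `matches_checksum` are illustrative names, and the expected checksum would have to come from a source you trust (for a stock installer, the value published on the vendor's download page; for a custom ISO, a copy you already know to be good).

```python
import hashlib
import os

def file_fingerprint(path, chunk_size=1024 * 1024):
    """Return (size_in_bytes, sha256_hex) for the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so a multi-hundred-MB ISO is not loaded into memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()

def matches_checksum(path, expected_sha256):
    """True if the file's SHA-256 equals the expected hex string."""
    _, actual = file_fingerprint(path)
    return actual == expected_sha256.strip().lower()

if __name__ == "__main__":
    import sys
    # Usage: python check_iso.py /path/to/image.iso [expected_sha256]
    size, sha = file_fingerprint(sys.argv[1])
    print(f"size:   {size} bytes")
    print(f"sha256: {sha}")
    if len(sys.argv) > 2:
        print("match:", matches_checksum(sys.argv[1], sys.argv[2]))
```

Running it on the ISO on each node and comparing the size and hash would show quickly whether one copy was cut short, for example because the disk it was generated on ran out of space.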

The install went through fully, although very slowly, but on the requested reboot I got the below, and I'm not sure what needs to be done or what failed to set the correct setting here.

 
