Question

Complete newbie not sure how to get to the config screen


Hi all experts,

I’m completely new to Nutanix.

I tried to install Nutanix Community Edition on my existing rack-mounted HPE ProLiant DL380 Gen9 server by following this YouTube video, but it is clearly outdated: the links shown in the video differ from the download links in the post from 3 years ago (created on 29th September 2020).

 

So I downloaded the installer ISO (version ce/2023.03.01/phoenix-ce2.0-fraser-6.5.2-stable-fnd-5.3.4-x86_64.iso) and nothing else. The other download links are for Windows, and my goal is to install Nutanix directly on the host to replace ESXi completely, so I assumed the VirtIO for Windows ISO and the AHV Plugin (.msi files) don't apply in my case.

The installer gets stuck at the "getting files" screen for a long time and then errors out, as shown in the screenshot at the bottom. After that, a root login prompt appears. I can't even get to the grey screen to select the hypervisor and configure the IP, as shown in the Getting Started tutorial.

I have checked the hardware compatibility here, and my server should be supported.

Could anyone advise what I have missed? Note that since the existing HPE server is rack-mounted high up, I can't get to the back of the server.

 


13 replies

I was told not to mount the ISO in the iLO console but to write the installer to a USB stick instead. I am going down the same path with a DL360 Gen9 with v4 CPUs and a mix of HDD and SSD disks. I can get the ISO to start, but then it fails with "can't find iso installer" or something like that. I did put my RAID controller into HBA mode. I will post back once I have tried legacy BIOS vs. UEFI.

Legacy BIOS still returns this error for me.

 

Badge +1

Hi, 

I also have issues installing CE. I'm installing on a bare-metal DL380 from a USB stick. The "Get AOS version" step takes a very long time, and I reach the grey selection screen only about one time in 10 tries.

It's driving me crazy. I tried putting the disks in RAID-0, in RAID-5, selecting just 3 disks,… Nothing works.

Yesterday I managed to get as far as entering the IP settings twice.

The first time, when the installation finished, there was no boot device.

The second time the installation was "successful" too, but I spent the entire night trying to start the services without any success.

I have been working with VMware for several years and I want to learn more about Nutanix, which to me is a good alternative to VxRail for customers. But the first encounter is quite discouraging. I have been trying to install it for 6 days.

I read the hardware requirements, docs, forums,… but nothing is working in my lab right now.

Does anyone have advice so that I don't give up?

Thanks

 

Badge +1

New error this morning after retrying.

 

Userlevel 4
Badge +4

My first piece of advice is to update all firmware, especially the RAID controller.

Put your disks into JBOD mode, and select a boot disk in the RAID-controller settings if possible (use this disk for AHV during setup).

From my own experience in a bare-metal CE deployment, firmware is essential to getting the cluster up and running without any lags.

 

 

Userlevel 5
Badge +5

@Judasani make sure the disk you’re using for the hypervisor boot disk is the smallest disk in the system.   It appears that you’re running into a known issue at the end of the install where the system is not properly re-identifying the AHV boot disk to perform some AHV customization steps.

I also recommend doing the install in Legacy boot mode vs. UEFI.
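If you want to double-check which disk is the smallest before picking the hypervisor boot disk, here is a minimal Python sketch. It assumes you can get a shell on the host and capture the output of `lsblk -J -b -o NAME,SIZE,TYPE` (whether `lsblk` is available in your installer environment is an assumption on my part). The sample JSON and the function name `disks_by_size` are made up for illustration.

```python
import json

# Hypothetical sample of `lsblk -J -b -o NAME,SIZE,TYPE` output (sizes in bytes).
SAMPLE_LSBLK = """
{"blockdevices": [
  {"name": "sda", "size": 500107862016, "type": "disk"},
  {"name": "sdb", "size": 300000000000, "type": "disk"},
  {"name": "sdc", "size": 300000000000, "type": "disk"},
  {"name": "sr0", "size": 1073741824, "type": "rom"}
]}
"""


def disks_by_size(lsblk_json: str) -> list[tuple[str, int]]:
    """Return (name, size_in_bytes) for whole disks, smallest first."""
    devices = json.loads(lsblk_json)["blockdevices"]
    # Keep only whole disks; skip ROM drives, partitions, loop devices, etc.
    # int() covers util-linux versions that emit sizes as strings.
    disks = [(d["name"], int(d["size"])) for d in devices if d.get("type") == "disk"]
    return sorted(disks, key=lambda item: item[1])


if __name__ == "__main__":
    for name, size in disks_by_size(SAMPLE_LSBLK):
        print(f"{name}: {size / 1e9:.0f} GB")
```

In the sample, two disks tie for smallest, so "the smallest disk" is ambiguous there; that is the kind of layout worth checking before you start the install.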

 

Badge +1

Thanks a lot @ktelep and @mikkisse.

I set the disk as the primary boot device in my RAID controller. It's better, but not there yet :)

I then checked the hades.out log and saw that /dev/sda4 is already mounted.

I tried to unmount it, but that was not a good idea.

I reinstalled everything (again 🤣) but the behavior is the same.

One step further, but not yet the end of the journey.

 

 

Userlevel 5
Badge +5

We see this often on HP servers: the RAID controller reports duplicate serial numbers for the drives. If you look at the GUI during the installation you will probably notice that the drives either have NO serial numbers or all have the same serial. If any CVM or data disks share a serial, you'll see this issue.

Thankfully, it's not difficult to fix:

  1. SSH into the AHV host (username: root, password: nutanix/4u)
  2. Get the name of the CVM from AHV via "virsh list --all"
  3. Shut down the CVM
  4. Modify the CVM XML file with "virsh edit <name of CVM from Step 2>"
  5. Scroll down to the <disk> sections that have your devices with duplicate serials
  6. Modify JUST the serial listed between the <serial> </serial> tags. Just change a digit or something so that it is unique; do NOT change anything else. The editor is vi-based, BTW.
  7. Save the file (Esc, :wq, Enter)
  8. Start the CVM with "virsh start <name of CVM from Step 2>"

Note that we're only changing the serial number presented to the CVM; it does not have to reflect the actual serial number of the disk. When I'm building virtual clusters I just use random strings of characters.

The fix for this will be in the next release of CE.
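To spot which disks share a serial before editing, a rough Python sketch like the one below may help. It assumes you have saved the CVM definition to text (for example with `virsh dumpxml <name of CVM>`) and that the disks appear as <disk> elements with <target> and <serial> children, as in standard libvirt domain XML. The sample XML, the name EXAMPLE-CVM and the function name `find_duplicate_serials` are hypothetical; a real CVM definition contains much more.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical, heavily trimmed domain XML for illustration only.
SAMPLE_XML = """
<domain type='kvm'>
  <name>EXAMPLE-CVM</name>
  <devices>
    <disk type='block' device='disk'>
      <source dev='/dev/sdb'/>
      <target dev='sda' bus='scsi'/>
      <serial>ABC123</serial>
    </disk>
    <disk type='block' device='disk'>
      <source dev='/dev/sdc'/>
      <target dev='sdb' bus='scsi'/>
      <serial>ABC123</serial>
    </disk>
    <disk type='block' device='disk'>
      <source dev='/dev/sdd'/>
      <target dev='sdc' bus='scsi'/>
      <serial>XYZ789</serial>
    </disk>
  </devices>
</domain>
"""


def find_duplicate_serials(domain_xml: str) -> dict[str, list[str]]:
    """Map each serial used by more than one disk to the target device names."""
    root = ET.fromstring(domain_xml.strip())
    by_serial = defaultdict(list)
    for disk in root.iter("disk"):
        # A missing <serial> is recorded as "", so several disks with no
        # serial at all are also reported as a clash.
        serial = (disk.findtext("serial") or "").strip()
        target = disk.find("target")
        name = target.get("dev") if target is not None else "unknown"
        by_serial[serial].append(name)
    return {s: devs for s, devs in by_serial.items() if len(devs) > 1}


if __name__ == "__main__":
    for serial, devs in find_duplicate_serials(SAMPLE_XML).items():
        print(f"serial {serial!r} is shared by: {', '.join(devs)}")
```

This only reports the clashes; the actual change is still made by hand in "virsh edit" as described in the steps above.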

Badge +1

@ktelep you're amazing :)

Thanks to you, I was able to get an AHV host and a CVM running last night!

I created a cluster (with the command: cluster -s 172.16.11.130 --redundancy_factor=1 create)

I saw the unexpected Prism login screen but… disillusionment 😅 I could not log in.

I checked the logs (in data/logs), the internet, … but I could not determine why I got this error in Prism and could not log in.

The first time I opened the Prism GUI I got this message:

I connected to the CVM to run cluster status and remembered that I may have forgotten to start the cluster.

So after a cluster start command I got this error (an error I have been getting all the time since this morning on my various re-installs 😅)

Cluster start and cluster status showing:

Each time I checked, the genesis service appeared to be up:

I'm not far from victory, I think. Can I still take advantage of your kindness? 🤗

If my cluster works, I can start my PoC and demonstrate that VMware can be replaced by Nutanix 🤗😎

Have a nice weekend

Badge +1

Sunday update 😅:

After a long log review and two reboots, the genesis start issue went away.

Services seem to have started and Prism is reachable! 😎

I will go further with the configuration and try to migrate some VMs from my lab!

Thanks again for your help @ktelep!

It may not have been easy to install, but the Nutanix community seems very friendly! 💪

How do I SSH to a host that is not installed at all in order to make the changes?

I started the installation from USB, and the installation host got an IP from DHCP. I tried to SSH to it, but that failed.

I tried breaking out of the installer with Ctrl+C and then running the virsh command; the result was "command not found".

I would appreciate it if someone has a workaround for this duplicate serial number bug on HPE servers.

Userlevel 5
Badge +5

The installer needs to complete and the system needs to reboot before you reach the point where the duplicate serial bug can be fixed. Does your installation actually finish, and does the system reboot into AHV?

If you don't mind starting a new thread and tagging me in it with details of which hardware you're using, a screenshot of your drive selection screen, and where the installation is failing, that would be the best way for us to help.

I don't mind starting another thread, but I believe keeping the answer on one page will help future readers find it.

FYI, this is an HPE ProLiant DL380p Gen8 server with 1 x 500GB SATA SSD in RAID0 mode, 1 x 300GB SAS HDD in RAID0 mode and 2 x 300GB SAS HDD in RAID1 mode, plus an installer USB drive and another exFAT-formatted USB drive.

Even with the duplicate serial numbers, the installation was successful, but once it restarted, the server displayed no boot drive.

 
