Cluster not fully starting in CE | Nutanix Community

HP DL380P Gen 8

2x Xeon E5 2667

128GB RAM

boot: 128GB SSD

hypervisor: 1TB SSD

storage: 1TB HDD

=============================

Thanks in advance for any advice! Everything installs fine. I selected the ‘single node cluster’ option in the installer and entered my local DNS server. I waited the ‘15 minutes after installation’ for the cluster to be created automatically.

=============================

I do see ports opening on the CVM but not 8000:

Starting Nmap 7.94 ( https://nmap.org ) at 2024-08-15 16:58 Atlantic Daylight Time
Nmap scan report for 10.110.10.101
Host is up (0.0033s latency).
Not shown: 960 filtered tcp ports (no-response), 33 closed tcp ports (reset)
PORT     STATE SERVICE
22/tcp   open  ssh
111/tcp  open  rpcbind
2099/tcp open  h2250-annex-g
2100/tcp open  amiganetfs
2103/tcp open  zephyr-clt
9876/tcp open  sd
9877/tcp open  x510
MAC Address: 50:6B:8D:F7:85:3A (Nutanix)

=============================

$ cluster status

2024-08-15 12:06:08,460Z INFO MainThread zookeeper_session.py:191 cluster is attempting to connect to Zookeeper
2024-08-15 12:06:08,469Z INFO Dummy-1 zookeeper_session.py:625 ZK session establishment complete, sessionId=0x19155ecd3200009, negotiated timeout=20 secs
2024-08-15 12:06:08,478Z INFO MainThread cluster:2943 Executing action status on SVMs 10.110.10.101
The state of the cluster: start
Lockdown mode: Disabled

        CVM: 10.110.10.101 Up, ZeusLeader
                                Zeus   UP       [4378, 4431, 4432, 4435, 4445, 4462]
                    SysStatCollector DOWN       []
                           IkatProxy DOWN       []
                    IkatControlPlane DOWN       []
                       SSLTerminator DOWN       []
                      SecureFileSync DOWN       []
                              Medusa DOWN       []
                  DynamicRingChanger DOWN       []
                              Pithos DOWN       []
                          InsightsDB DOWN       []
                              Athena DOWN       []
                             Mercury DOWN       []
                              Mantle DOWN       []
                            Stargate DOWN       []
                InsightsDataTransfer DOWN       []
                               Ergon DOWN       []
                             GoErgon DOWN       []
                             Cerebro DOWN       []
                             Chronos DOWN       []
                             Curator DOWN       []
                               Prism DOWN       []
                                Hera DOWN       []
                                 CIM DOWN       []
                        AlertManager DOWN       []
                            Arithmos DOWN       []
                             Catalog DOWN       []
                           Acropolis DOWN       []
                               Uhura DOWN       []
                   NutanixGuestTools DOWN       []
                          MinervaCVM DOWN       []
                       ClusterConfig DOWN       []
                         APLOSEngine DOWN       []
                               APLOS DOWN       []
                     PlacementSolver DOWN       []
                               Lazan DOWN       []
                             Polaris DOWN       []
                              Delphi DOWN       []
                            Security DOWN       []
                                Flow DOWN       []
                             Anduril DOWN       []
                               XTrim DOWN       []
                       ClusterHealth DOWN       []
2024-08-15 12:06:10,756Z INFO MainThread cluster:3104 Success!
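
For anyone who wants to summarize output like this instead of reading it line by line, here is a minimal Python sketch. It assumes each service line looks like the ones pasted above (a service name, then UP or DOWN, then a PID list); the function name `services_by_state` is made up for illustration and is not part of any Nutanix tooling.

```python
import re

# Assumed line shape, taken from the pasted output:
#   "<spaces><ServiceName> UP|DOWN <pid list>"
# Log lines and the "CVM: ..." header do not match this pattern.
SERVICE_LINE = re.compile(r"^\s*(\w+)\s+(UP|DOWN)\b")


def services_by_state(status_text):
    """Group service names from `cluster status` text by UP/DOWN state."""
    states = {"UP": [], "DOWN": []}
    for line in status_text.splitlines():
        match = SERVICE_LINE.match(line)
        if match:
            name, state = match.groups()
            states[state].append(name)
    return states
```

Feeding it the text saved from `cluster status` returns a dict with two lists, so `len(result["DOWN"])` gives a quick count of services that have not started.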

=====================================

$ cluster start just posts this ad nauseam:

2024-08-15 12:15:38,980Z INFO MainThread zookeeper_session.py:191 cluster is attempting to connect to Zookeeper
2024-08-15 12:15:38,988Z INFO Dummy-1 zookeeper_session.py:625 ZK session establishment complete, sessionId=0x19155ecd3200014, negotiated timeout=20 secs
2024-08-15 12:15:38,989Z INFO MainThread cluster:2943 Executing action start on SVMs 10.110.10.101
Waiting on 10.110.10.101 (Up, ZeusLeader) to start:  SysStatCollector IkatProxy IkatControlPlane SSLTerminator SecureFileSync Medusa DynamicRingChanger Pithos InsightsDB Athena Mercury Mantle Stargate InsightsDataTransfer Ergon GoErgon Cerebro Chronos Curator Prism Hera CIM AlertManager Arithmos Catalog Acropolis Uhura NutanixGuestTools MinervaCVM ClusterConfig APLOSEngine APLOS PlacementSolver Lazan Polaris Delphi Security Flow Anduril XTrim ClusterHealth

Waiting on 10.110.10.101 (Up, ZeusLeader) to start:  SysStatCollector IkatProxy IkatControlPlane SSLTerminator SecureFileSync Medusa DynamicRingChanger Pithos InsightsDB Athena Mercury Mantle Stargate InsightsDataTransfer Ergon GoErgon Cerebro Chronos Curator Prism Hera CIM AlertManager Arithmos Catalog Acropolis Uhura NutanixGuestTools MinervaCVM ClusterConfig APLOSEngine APLOS PlacementSolver Lazan Polaris Delphi Security Flow Anduril XTrim ClusterHealth

Waiting on 10.110.10.101 (Up, ZeusLeader) to start:  SysStatCollector IkatProxy IkatControlPlane SSLTerminator SecureFileSync Medusa DynamicRingChanger Pithos InsightsDB Athena Mercury Mantle Stargate InsightsDataTransfer Ergon GoErgon Cerebro Chronos Curator Prism Hera CIM AlertManager Arithmos Catalog Acropolis Uhura NutanixGuestTools MinervaCVM ClusterConfig APLOSEngine APLOS PlacementSolver Lazan Polaris Delphi Security Flow Anduril XTrim ClusterHealth

Waiting on 10.110.10.101 (Up, ZeusLeader) to start:  SysStatCollector IkatProxy IkatControlPlane SSLTerminator SecureFileSync Medusa DynamicRingChanger Pithos InsightsDB Athena Mercury Mantle Stargate InsightsDataTransfer Ergon GoErgon Cerebro Chronos Curator Prism Hera CIM AlertManager Arithmos Catalog Acropolis Uhura NutanixGuestTools MinervaCVM ClusterConfig APLOSEngine APLOS PlacementSolver Lazan Polaris Delphi Security Flow Anduril XTrim ClusterHealth

Waiting on 10.110.10.101 (Up, ZeusLeader) to start:  SysStatCollector IkatProxy IkatControlPlane SSLTerminator SecureFileSync Medusa DynamicRingChanger Pithos InsightsDB Athena Mercury Mantle Stargate InsightsDataTransfer Ergon GoErgon Cerebro Chronos Curator Prism Hera CIM AlertManager Arithmos Catalog Acropolis Uhura NutanixGuestTools MinervaCVM ClusterConfig APLOSEngine APLOS PlacementSolver Lazan Polaris Delphi Security Flow Anduril XTrim ClusterHealth

^C2024-08-15 12:15:52,147Z WARNING MainThread cluster:3139 Exiting on Ctrl-C

==================================

$ watch -d genesis status

Every 2.0s: genesis status                                                                                                           Thu Aug 15 13:45:44 2024

2024-08-15 13:45:45.829793: Services running on this node:
  foundation: []
  genesis: [2824, 2994, 3018, 3019]
  zookeeper: [4378, 4431, 4432, 4435, 4445, 4462]

 

==================================

I also noticed that the host has the DNS address supplied during setup, but the CVM does not, and for the life of me I can't elevate to root on the CVM to edit the /etc/resolv.conf file.

 

Thank you!

 

 

 

In the future, for CE help, I’d recommend posting in the CE-specific forum. We tend to respond more quickly over there to these types of questions.

This looks like the CVM first boot scripts didn’t complete 100% for some reason or another. The “Create Single Node Cluster” function has been problematic, so I would suggest that you redeploy the node, but do not select “create single node cluster”. Make sure the CVM and the host IPs are in the same network.
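
If you want to double-check the "same network" requirement before redeploying, a small Python sketch like the one below can do it. The function name, the example host address and the /24 prefix are assumptions for illustration only; use whatever addresses and mask you actually entered in the installer.

```python
import ipaddress


def same_network(cvm_ip, host_ip, prefix_len):
    """Return True if both addresses fall inside the same IP network."""
    cvm_net = ipaddress.ip_interface(f"{cvm_ip}/{prefix_len}").network
    host_net = ipaddress.ip_interface(f"{host_ip}/{prefix_len}").network
    return cvm_net == host_net


# Hypothetical example: CVM from this thread, host address and /24 mask assumed.
# same_network("10.110.10.101", "10.110.10.100", 24)
```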

Once the installation completes, SSH into the CVM and run the ‘cluster status’ command. It may take a little while, but the command will return “cluster is not configured” once everything is ready.

Run the cluster creation command:

‘cluster -s <IP address of CVM> --redundancy_factor=1 create’

You should never modify any of the system files directly on a Nutanix node; they are all managed by the system itself. When the cluster creation completes, it handles any additional DNS entries and configurations that need to be created.

The port you’re really looking for on the CVM for it to be up and running is 9440, not 8000.
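
As a rough alternative to a full nmap scan, a few lines of Python can test just that one port from another machine on the network. This is only a sketch: `port_is_open` is a made-up helper name, and the 3 second timeout is an arbitrary choice.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Example call, using the CVM address from the scan in this thread:
# port_is_open("10.110.10.101", 9440)
```

A False result only says the connection did not succeed; it cannot distinguish a closed port from a filtered one the way nmap does.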


@ktelep thanks for the input. Apologies for posting this to the wrong forum.

I figured out the problem.

Solution:

My servers have P420i RAID cards. I made a RAID 0 array for each drive and installed CE that way. This will not work. These RAID cards need to be in HBA mode to pass the disks through to the OS. The easiest way to configure this is to create a bootable USB with the HP Service Pack for ProLiant (SPP) ISO. You need an HP account to download this ISO officially, but it's very easy to find the ISO unofficially; just make sure the hash matches the one on the HP site.
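
For the hash check, something like the following Python sketch works on any OS. It assumes the published checksum is SHA-256; pass a different `algorithm` name if the download page lists another one. The function names are just illustrative.

```python
import hashlib


def file_digest_hex(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a file in chunks so a multi-GB ISO does not need to fit in RAM."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex, algorithm="sha256"):
    """Compare the file's digest with the value copied from the vendor page."""
    return file_digest_hex(path, algorithm) == published_hex.strip().lower()
```

Call `matches_published` with the path to the downloaded ISO and the checksum string copied from the official download page; anything other than True means the file should not be used.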

Abstract:

    Create bootable SPP USB
    Enter CLI and execute command to put RAID card into HBA mode
    Install Nutanix

Secondary abstract:

    Create bootable SPP USB
    Enter CLI and execute command to put RAID card into HBA mode
    Google why it's not executing
    Update the server component firmware with SPP
    Enter CLI and execute command to put RAID card into HBA mode
    Install Nutanix
    Google why it's not booting
    Install Nutanix and use a USB for the boot drive because you can't boot from HBA mode
    Party

Detailed steps:

    Download latest SPP ISO (8.1)
    Download and install the HP USB creation tool and create a bootable USB with the SPP ISO: https://support.hpe.com/connect/s/softwaredetails?language=en_US&collectionId=MTX-b3eceaa86de349e1
    Disconnect all drives from the RAID card
    Plug the USB into the server and boot from it
    Select the ‘interactive’ or ‘manual’ option
    When you are at the main menu you will see three options. At this menu press Ctrl+Alt+D+B+X to get into a CLI
    type ‘hpssacli’ and hit enter
    type ‘ctrl all show status’ and hit enter to find out which slot your RAID card is in (mine was 0)
    type ‘controller slot=0 modify hbamode=on forced’ and hit enter
        Your slot number may be different. Use the slot number shown by the previous command.
    type ‘exit’ and hit enter
    Shut down the server
    Reconnect the drives
    Restart the server
    Install Nutanix
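
If you have several servers to convert, the slot lookup and the HBA command from the steps above can be sketched in Python. This assumes the ‘ctrl all show status’ output names each controller on a line like "Smart Array P420i in Slot 0 (Embedded)"; that format is an assumption here, so check it against what your own card prints. The helper names are made up, and the code only builds the command text; it does not run hpssacli.

```python
import re

# Assumed controller line format: "Smart Array P420i in Slot 0 (Embedded)"
SLOT_PATTERN = re.compile(r"\bin Slot (\d+)\b")


def find_controller_slots(status_text):
    """Return every slot number mentioned in the controller status text."""
    return [int(m.group(1)) for m in SLOT_PATTERN.finditer(status_text)]


def hba_mode_command(slot):
    """Build the hpssacli command from this post for the given slot."""
    return f"controller slot={slot} modify hbamode=on forced"
```

Type the resulting string at the hpssacli prompt exactly as in the steps above.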

Problems that can arise:

‘controller slot=0 modify hbamode=on forced’ command/feature unknown/unavailable:

You may get an error message saying that HBA mode is unavailable. Use the SPP to update the server components fully. You will see that the RAID card firmware updates to 8.x or later. Once all these updates are done, restart the server and go through the above steps again. The command ‘controller slot=0 modify hbamode=on forced’ should execute now.

Once you've installed Nutanix CE, the server will not boot:

YOU CANNOT BOOT FROM HBA MODE

You will need to boot from a USB, an SD card on the board, or a directly attached SATA drive (remove an optical drive or something). You cannot boot from drives connected to the RAID card while it is in HBA mode. I don't know about other cards, but this is the case for the P420i.

I ended up using one of these for my boot drive

They are small and difficult to dislodge mistakenly.

Good luck, have fun.


Solution: https://next.nutanix.com/discussion-forum-14/installing-nutanix-ce-with-p420i-raid-card-43400