Question

Cluster creation stuck after Foundation – Ergon service never comes up (Genesis loop, RPC/NFS checked)

  • December 20, 2025
  • 7 replies
  • 73 views

Hello Nutanix Community,

I’m reaching out for help after spending a significant amount of time troubleshooting an issue where a Nutanix cluster cannot complete initialization. I would truly appreciate any insight from the community, as I believe I have exhausted all standard troubleshooting paths.

This cluster was previously created successfully and had been running in production without issues.

However, due to security-related changes in our IT infrastructure, we were required to re-address the entire environment, including:

  • CVM IPs

  • Hypervisor IPs

  • IPMI IPs

Because of this requirement, I intentionally chose the cleanest and officially recommended approach, which was:

  1. Reclaim licenses

  2. Stop the cluster

  3. Destroy the cluster

  4. Rebuild the cluster from scratch using Nutanix Foundation

This was not an accidental failure during a first-time installation.
The issue only started after performing a controlled teardown and reinstallation due to the IP address changes.
 

Support experience and request for community help

I have already contacted Nutanix Support in my region regarding this issue.
Unfortunately, I was informed that this case is considered out of scope, and therefore no further assistance could be provided.

I must say that this is deeply disappointing, as we have renewed Maintenance & Support (MA) every year and have always expected to receive full technical support when encountering critical issues such as this — especially on hardware and clusters that were previously running successfully.

At this point, I sincerely hope that the Nutanix community can help provide insight, guidance, or direction to resolve this problem.

Any advice, experience, or suggestion would be greatly appreciated.
Thank you very much in advance for your time and support.

 

 

7 replies

  • Outrider
  • December 22, 2025

Hi,
What is the hardware model of your three servers?
Which AOS version have you tried? Which Foundation version have you tried?
Which hypervisor are you using?
What is the network connectivity between the three nodes (active-passive, LACP)?
Are you able to ping each CVM and hypervisor (from each node to the other nodes, and to the gateway)?

 


  • Author
  • Adventurer
  • December 24, 2025

Hi Jamali,
thank you very much for your response. Please find the details below:
1. Hardware model: NX-1175S-G7
2. AOS version: 6.10.1
3. Foundation version: 5.10
4. Hypervisor: AHV-20230302.103003
5. Network configuration:

  • Switch ports are configured as standard VLAN trunks

  • On the AHV/CVM side, NIC bonding is active-backup

  • MTU is 1500, consistent across all nodes and switch ports

6. Connectivity tests

Yes, all connectivity checks pass:

  • All CVMs can ping each other

  • All hypervisors can ping each other

  • CVM ↔ Hypervisor communication works

  • All nodes can ping the gateway

  • No packet loss observed
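
For anyone who wants to repeat the connectivity and MTU checks above, here is a minimal sketch in Python. It is an assumption-laden illustration, not a Nutanix tool: the helper names (`ping_command`, `reachable`, `check_all`) are hypothetical, the flags are those of the Linux iputils `ping`, and the addresses in the usage note are placeholders from the documentation range rather than the real CVM, hypervisor, or gateway IPs. The second probe sets the Don't Fragment bit with a 1472-byte payload (1500 minus 28 bytes of IPv4 and ICMP headers), so it only succeeds if the path really carries a 1500-byte MTU.

```python
import subprocess

# 20-byte IPv4 header + 8-byte ICMP header
IP_ICMP_OVERHEAD = 28


def ping_command(target, mtu=None, count=3):
    """Build a Linux iputils ping command line for one target.

    With mtu set, the probe forbids fragmentation and sizes the ICMP
    payload so the whole packet is exactly `mtu` bytes.
    """
    cmd = ["ping", "-c", str(count), "-W", "2"]
    if mtu is not None:
        # -M do sets the Don't Fragment bit; -s is the ICMP payload size.
        cmd += ["-M", "do", "-s", str(mtu - IP_ICMP_OVERHEAD)]
    return cmd + [target]


def reachable(target, mtu=None):
    """Return True if ping exits with status 0 for the target."""
    result = subprocess.run(ping_command(target, mtu), capture_output=True)
    return result.returncode == 0


def check_all(targets, mtu=1500):
    """Map each target to (plain_ping_ok, full_mtu_ping_ok), from the local host."""
    return {ip: (reachable(ip), reachable(ip, mtu)) for ip in targets}
```

Running something like `check_all(["192.0.2.11", "192.0.2.12", "192.0.2.13", "192.0.2.1"])` from each CVM and each hypervisor in turn would cover every pair. A target that answers the plain ping but fails the full-size probe would point at an MTU mismatch somewhere on the path.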


  • Outrider
  • December 28, 2025

Honestly, it is strange. Would it be possible to try the following?

Are you able to build a single-node or two-node cluster?
Have you tried a different version of Foundation and a different AOS/AHV version?


JeroenTielen
  • Vanguard
  • December 28, 2025

@kittikhun do you have more interfaces in the nodes than the two that are in the bond? If so, try running the crashcart before creating the cluster.


JeroenTielen
  • Vanguard
  • December 28, 2025

Oh, and when the cluster is starting, grab a cup of coffee. This can take a few minutes.


  • Author
  • Adventurer
  • January 4, 2026

jamali.ahmad Thanks for the suggestion.
I have already tried multiple combinations of Foundation versions as well as different AOS/AHV versions, but I’m still hitting the same issue during cluster creation.

I haven’t tried building a 1-node or 2-node cluster yet. I wanted to confirm whether it’s a supported or recommended approach to first create a 2-node cluster and then add the third node afterward.

My concern is around RF2—would starting with two nodes cause any limitations or issues when scaling out to three nodes later?

I’d appreciate your guidance on whether this is a valid troubleshooting step.


  • Author
  • Adventurer
  • January 4, 2026

JeroenTielen Thanks for the suggestion, and just to clarify:
the interfaces are not configured with LACP.

Each node has two NICs connected, using the default active-backup bonding mode (no aggregation on the switch side). The switch ports are configured as simple VLAN trunks, not LACP.

There are no additional active interfaces beyond these two.

I’ll still double-check the interface state and try running the crashcart before the next cluster creation attempt, just to rule this out.