Question

CVM crash: faulty bridge configs, Acropolis failures... help needed :-)

  • May 8, 2026
  • 1 reply
  • 12 views

 

Good morning all,

 

I have a problem: one CVM in a 3-node cluster is driving me crazy…

 

For some reason, the cluster services on this CVM have become unresponsive…

 

  1. The CVM itself is reachable via SSH, and I'm able to log in as the nutanix user.
  2. Running cluster status on the CVM reports:

     

nutanix@NTNX-xxx-B-CVM:172.18.95.76:~$ cluster status | grep -v UP
2026-05-08 04:50:48,386Z INFO MainThread zookeeper_session.py:226 Ignoring passed host_port_list: zk1:9876,zk2:9876,zk3:9876 because the passed host_port_list appears to have been copied from the environment variable or gflag.
2026-05-08 04:50:48,387Z INFO MainThread zookeeper_session.py:296 cluster is attempting to connect to Zookeeper (unestablished session (object 0x7f6bd2b5d220)), host port list zk1:9876,zk2:9876,zk3:9876
2026-05-08 04:50:48,387Z INFO MainThread patterns.py:75 Creating a new instance for ZookeeperSession[('client_id', None), ('connection_timeout', None), ('host_port_list', 'zk1:9876,zk2:9876,zk3:9876'), ('use_zk_mt', None)]
2026-05-08 04:51:08,388Z ERROR MainThread configuration.py:168 Could not get Zookeeper connection with host_port_list: zk1:9876,zk2:9876,zk3:9876
2026-05-08 04:51:08,389Z INFO MainThread cluster:3645 Executing action status on SVMs 172.16.202.1,172.16.202.3,172.16.202.5
2026-05-08 04:51:23,396Z INFO MainThread cluster:2588 Waiting for response from 172.16.202.3
2026-05-08 04:51:38,396Z INFO MainThread cluster:2588 Waiting for response from 172.16.202.3
2026-05-08 04:51:48,406Z INFO MainThread cluster:3809 Success!
The state of the cluster: Unknown
Lockdown mode: Disabled

        CVM: 172.18.95.74 [172.16.202.1] Down

        CVM: 172.18.95.78 [172.16.202.5] Down
The state of the cluster: Zookeeper is down. Is the cluster configured?
Lockdown mode: Disabled

 

  3. Listing the interfaces is also interesting:

nutanix@NTNX-CZUD3G025D-B-CVM:172.18.95.76:~$ manage_ovs show_interfaces
Failed to fetch gflags. Acropolis service might be down: HTTPConnectionPool(host='127.0.0.1', port=2030): Max retries exceeded with url: /h/gflags?show=hypervisor_username (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7efbfe372850>: Failed to establish a new connection: [Errno 111] Connection refused')).
Failed to execute ovs-appctl command: bond/show br1-up
lacp_status is missing in output of 'ovs-appctl bond/show br1-up'
Failed to execute ovs-appctl command: bond/show br0-up
lacp_status is missing in output of 'ovs-appctl bond/show br0-up'
name  mode link speed
eth2 25000 True 25000
eth3 25000 True 25000
eth4 25000 True 25000
eth5 25000 True 25000
eth6 25000 True 25000
eth7 25000 True 25000
eth8 25000 True 25000
eth9 25000 True 25000

It seems that we have lost eth0 and eth1?

eth0 was part of br0, eth1 was part of br1, and the backplane network is/was on br1.
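
To make the comparison explicit, here is a minimal sketch (not a Nutanix tool) that checks the manage_ovs show_interfaces table above against the NICs that should be present. The function name missing_interfaces and the expected list are assumptions made for illustration only; the sample text is copied from the output above.

```python
def missing_interfaces(table_text, expected):
    """Return the expected NIC names that do not appear in the table."""
    seen = set()
    for line in table_text.splitlines():
        fields = line.split()
        # Keep only data rows: four columns, first column is an ethN name.
        # This skips the header row and the ovs-appctl error messages.
        if len(fields) == 4 and fields[0].startswith("eth"):
            seen.add(fields[0])
    return [name for name in expected if name not in seen]


# Table copied from the show_interfaces output in this post.
SAMPLE = """\
name  mode link speed
eth2 25000 True 25000
eth3 25000 True 25000
eth4 25000 True 25000
eth5 25000 True 25000
eth6 25000 True 25000
eth7 25000 True 25000
eth8 25000 True 25000
eth9 25000 True 25000
"""

# Assumption: eth0 and eth1 are the uplinks of br0 and br1, as described above.
print(missing_interfaces(SAMPLE, ["eth0", "eth1"]))  # prints ['eth0', 'eth1']
```

This only confirms that eth0 and eth1 are absent from the listing; it does not say why they are absent.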

 

Do you have any ideas?

 

Thank you for any help.

 

 

 

1 reply

jarrodl
  • Vanguard
  • May 8, 2026

If this is a production system, you should consider engaging support.