Solved

Problems connecting PE to PC after upgrade


Hi,

We just upgraded our Prism Central instance to the most current Community Edition version of Nutanix (2020.09.16), and we also upgraded our cluster successfully. To do this, I first had to unregister the cluster from Prism Central (following https://portal.nutanix.com/page/documents/kbs/details?targetId=kA00e000000XeZjCAK). After everything was updated, I tried to register the cluster in Prism Central again, but after some time I get the error message “Error: Failed to create remote connection”. I found this KB article: https://portal.nutanix.com/page/documents/kbs/details?targetId=kA00e0000009BwxCAE and executed all the steps in it (mostly steps I had already done), but the issue persists.

 

On the Prism Central VM, I can see the following entries in the “prism_gateway.log” file when trying to register the cluster:
 

WARN 2021-05-12 15:17:55,410Z http-nio-0.0.0.0-9081-exec-8 [] commands.auth.PAMAuthenticationProvider.checkFailedAttempts:395 Failed to check the Failed Attempts..
WARN 2021-05-12 15:17:55,437Z http-nio-0.0.0.0-9081-exec-8 [] util.common.ArithmosIDFServiceUtil.getMappedFilteredCriteria:130 Entity type node
INFO 2021-05-12 15:17:55,446Z http-nio-0.0.0.0-9081-exec-8 [] commands.upgrade.ValidateCompatibility.doExecute:96 Checking PE compatibility, installed PC version: 2020.09.16, target PE version: 2020.09.16
INFO 2021-05-12 15:17:55,448Z http-nio-0.0.0.0-9081-exec-8 [] upgrade.retreivers.RetrieverImpl.getHttpsConnection:378 Retrieving staging server metadata from https://release-api.nutanix.com/api/v2/prismcentral/2020.09.16?stage=GA

ERROR 2021-05-12 15:17:55,589Z http-nio-0.0.0.0-9081-exec-8 [] commands.multicluster.PortalCompatibilityMatrix.isNosVersionCompatible:66 Unable to fetch versions from portal
java.lang.NullPointerException
at com.nutanix.prism.commands.upgrade.retreivers.RetrieverImpl.getSoftwareVersions(RetrieverImpl.java:18

<further stacktrace messages>


WARN 2021-05-12 15:17:55,591Z http-nio-0.0.0.0-9081-exec-8 [] commands.multicluster.PortalCompatibilityMatrix.isNosVersionCompatible:72 falling back to zkNode compatibility for pc version 2020.09.16 and software versions from portal null
WARN 2021-05-12 15:17:55,592Z http-nio-0.0.0.0-9081-exec-8 [] commands.multicluster.ZkNodeCompatibilityMatrix.isNosVersionCompatible:61 falling back to static compatibility for pc version 2020.09.16 and software versions from zknode at path prism_central
INFO 2021-05-12 15:17:55,597Z http-nio-0.0.0.0-9081-exec-8 [] commands.multicluster.StaticCompatibilityMatrix.getCeVersionMap:239 Got CE version map with 53 entries
INFO 2021-05-12 15:17:55,598Z http-nio-0.0.0.0-9081-exec-8 [] commands.multicluster.StaticCompatibilityMatrix.getNosVersionForCE:227 getNosVersionForCE(2020.09.16) - containsKey: true, ver: 5.18.0.6
INFO 2021-05-12 15:17:55,599Z http-nio-0.0.0.0-9081-exec-8 [] commands.multicluster.StaticCompatibilityMatrix.isNosVersionCompatible:181 PC is compatible with PE of version 2020.09.16

WARN 2021-05-12 15:17:55,856Z http-nio-0.0.0.0-9081-exec-4 [] commands.auth.PAMAuthenticationProvider.checkFailedAttempts:395 Failed to check the Failed Attempts..
INFO 2021-05-12 15:17:55,861Z http-nio-0.0.0.0-9081-exec-4 [] multicluster.registration.AddClusterExternalState.prepareToExecute:114 Prism Element version is lesser than 5.10 : 2020.09.16
INFO 2021-05-12 15:17:55,864Z http-nio-0.0.0.0-9081-exec-4 [] commands.upgrade.ValidateCompatibility.doExecute:96 Checking PE compatibility, installed PC version: 2020.09.16, target PE version: 2020.09.16
INFO 2021-05-12 15:17:55,866Z http-nio-0.0.0.0-9081-exec-4 [] upgrade.retreivers.RetrieverImpl.getHttpsConnection:378 Retrieving staging server metadata from https://release-api.nutanix.com/api/v2/prismcentral/2020.09.16?stage=GA
ERROR 2021-05-12 15:17:55,989Z http-nio-0.0.0.0-9081-exec-4 [] commands.multicluster.PortalCompatibilityMatrix.isNosVersionCompatible:66 Unable to fetch versions from portal
java.lang.NullPointerException
at com.nutanix.prism.commands.upgrade.retreivers.RetrieverImpl.getSoftwareVersions(RetrieverImpl.java:188)
at com.nutanix.prism.commands.multicluster.PortalCompatibilityMatrix.isNosVersionCompatible(PortalCompatibilityMatrix.java:64)

<further stacktrace messages>


WARN 2021-05-12 15:17:55,990Z http-nio-0.0.0.0-9081-exec-4 [] commands.multicluster.PortalCompatibilityMatrix.isNosVersionCompatible:72 falling back to zkNode compatibility for pc version 2020.09.16 and software versions from portal null
WARN 2021-05-12 15:17:55,991Z http-nio-0.0.0.0-9081-exec-4 [] commands.multicluster.ZkNodeCompatibilityMatrix.isNosVersionCompatible:61 falling back to static compatibility for pc version 2020.09.16 and software versions from zknode at path prism_central
INFO 2021-05-12 15:17:55,995Z http-nio-0.0.0.0-9081-exec-4 [] commands.multicluster.StaticCompatibilityMatrix.getCeVersionMap:239 Got CE version map with 53 entries
INFO 2021-05-12 15:17:55,998Z http-nio-0.0.0.0-9081-exec-4 [] commands.multicluster.StaticCompatibilityMatrix.getNosVersionForCE:227 getNosVersionForCE(2020.09.16) - containsKey: true, ver: 5.18.0.6
INFO 2021-05-12 15:17:55,999Z http-nio-0.0.0.0-9081-exec-4 [] commands.multicluster.StaticCompatibilityMatrix.isNosVersionCompatible:181 PC is compatible with PE of version 2020.09.16
INFO 2021-05-12 15:17:56,015Z Curator-PathChildrenCache-0 [] commands.multicluster.ZkChildrenListener.childEvent:102 Cluster External State Cache Value added : 0005ad13-49c9-ed61-7f19-a4bf0167629f
INFO 2021-05-12 15:17:56,066Z http-nio-0.0.0.0-9081-exec-4 [] multicluster.registration.AddClusterExternalState.getClusterDataStateBuilder:312 the specified node does not exist
INFO 2021-05-12 15:17:56,095Z http-nio-0.0.0.0-9081-exec-4 [] multicluster.registration.AddClusterExternalState.initClusterDataState:286 Created the cluster data state entity with entity id : 0005ad13-49c9-ed61-7f19-a4bf0167629f in IDf successfully
WARN 2021-05-12 15:17:56,339Z http-nio-0.0.0.0-9081-exec-3 [] commands.auth.PAMAuthenticationProvider.checkFailedAttempts:395 Failed to check the Failed Attempts..
INFO 2021-05-12 15:17:56,342Z http-nio-0.0.0.0-9081-exec-3 [] prism.proxy.LocalGenericProxy.<init>:94 Constructing proxy to Genesis service at 127.0.0.1:2100
WARN 2021-05-12 15:17:56,462Z http-nio-0.0.0.0-9081-exec-6 [] util.common.ArithmosIDFServiceUtil.getMappedFilteredCriteria:130 Entity type node
WARN 2021-05-12 15:17:56,464Z http-nio-0.0.0.0-9081-exec-1 [] util.common.ArithmosIDFServiceUtil.getMappedFilteredCriteria:130 Entity type node
INFO 2021-05-12 15:18:01,144Z pool-11-thread-1 [] prism.init.UnregistrationFixerTask.run:158 Fixing unregistration for set: []
WARN 2021-05-12 15:18:03,625Z http-nio-0.0.0.0-9081-exec-2 [] util.common.ArithmosIDFServiceUtil.getMappedFilteredCriteria:130 Entity type node
ERROR 2021-05-12 15:18:13,936Z pool-15-thread-1 [] commands.license.LicenseManagingZkImpl.getLicenseFromLicenseFile:141 Exception while reading license file
com.nutanix.prism.base.zk.ProtobufZNodeManagementException$NoNodeException: the specified node does not exist
at com.nutanix.prism.base.zk.ProtobufManagingZookeeperImpl.read(ProtobufManagingZookeeperImpl.java:77)
at com.nutanix.prism.commands.license.LicenseManagingZkImpl.getLicenseFromLicenseFile(LicenseManagingZkImpl.java:139)
at com.nutanix.prism.background.multicluster.ZeusDataSender.buildClusterDataIncrement(ZeusDataSender.java:267)
at com.nutanix.prism.background.multicluster.ZeusDataSender.run(ZeusDataSender.java:108)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /appliance/logical/license/license_file
at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1184)
at com.nutanix.prism.base.zk.BasicConnectionManagingZkClient.getData(BasicConnectionManagingZkClient.java:157)
at com.nutanix.prism.base.zk.ProtobufManagingZookeeperImpl.read(ProtobufManagingZookeeperImpl.java:65)
... 6 more


INFO 2021-05-12 15:19:01,144Z pool-11-thread-1 [] prism.init.UnregistrationFixerTask.run:158 Fixing unregistration for set: []
WARN 2021-05-12 15:19:59,285Z http-nio-0.0.0.0-9081-exec-2 [] commands.auth.PAMAuthenticationProvider.checkFailedAttempts:395 Failed to check the Failed Attempts..
INFO 2021-05-12 15:19:59,290Z http-nio-0.0.0.0-9081-exec-2 [] commands.multicluster.DeleteClusterExternalState.doExecute:99 Unregistering Prism Element from Multicluster
INFO 2021-05-12 15:19:59,291Z http-nio-0.0.0.0-9081-exec-2 [] commands.multicluster.DeleteClusterExternalState.unregisterPrismElementFromMulticluster:122 Scheduling reap on demand.
INFO 2021-05-12 15:19:59,292Z pool-299-thread-1 [] prism.init.UnregistrationFixerTask.run:155 Invoked thread for specific cluster 0005ad13-49c9-ed61-7f19-a4bf0167629f
INFO 2021-05-12 15:19:59,293Z pool-299-thread-1 [] prism.init.UnregistrationFixerTask.deleteTempUser:319 com.nutanix.prism.init.UnregistrationFixerTask updating the /userrepository zk-node.
INFO 2021-05-12 15:19:59,336Z pool-299-thread-1 [] prism.proxy.AlertProxyImpl.log:59 AlertProxyImpl manageAlerts request {"operation": "kDelete","get_entities_with_metrics_arg": {"query": {"where_clause": {"comparison_expr": {"lhs": {"leaf": {"column": "cluster"}},"operator": "kAny","rhs": {"leaf": {"value": {"str_list": {"value_list": ["0005ad13-49c9-ed61-7f19-a4bf0167629f"]}}}}}},"group_by": {"raw_limit": {"limit": 100000}}}}}
INFO 2021-05-12 15:19:59,658Z pool-299-thread-1 [] prism.proxy.AlertProxyImpl.manageAlerts:177 AlertProxyImpl manageAlerts response num_successful_updates: 93
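
For anyone triaging a similar excerpt, here is a minimal Python sketch that counts ERROR entries per logging source, which makes it easier to see which exceptions recur. The regular expression is inferred only from the lines quoted in this post, not from any Nutanix documentation, and the names `ENTRY`, `SAMPLE`, and `summarize` are made up for illustration. It assumes each entry sits on a single line (no hard wrapping) and simply skips stack-trace lines.

```python
import re
from collections import Counter

# Assumed entry layout, inferred from the excerpt in this thread:
#   LEVEL timestamp thread [] source:line message
ENTRY = re.compile(
    r"^(?P<level>INFO|WARN|ERROR)\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}Z)\s+"
    r"(?P<thread>\S+)\s+\[[^\]]*\]\s+"
    r"(?P<source>\S+):(?P<line>\d+)\s?(?P<msg>.*)$"
)

# A few lines copied from the excerpt above, used as demo input.
SAMPLE = [
    "WARN 2021-05-12 15:17:55,437Z http-nio-0.0.0.0-9081-exec-8 [] util.common.ArithmosIDFServiceUtil.getMappedFilteredCriteria:130 Entity type node",
    "ERROR 2021-05-12 15:17:55,589Z http-nio-0.0.0.0-9081-exec-8 [] commands.multicluster.PortalCompatibilityMatrix.isNosVersionCompatible:66 Unable to fetch versions from portal",
    "java.lang.NullPointerException",
    "ERROR 2021-05-12 15:17:55,989Z http-nio-0.0.0.0-9081-exec-4 [] commands.multicluster.PortalCompatibilityMatrix.isNosVersionCompatible:66 Unable to fetch versions from portal",
    "ERROR 2021-05-12 15:18:13,936Z pool-15-thread-1 [] commands.license.LicenseManagingZkImpl.getLicenseFromLicenseFile:141 Exception while reading license file",
]


def summarize(lines, levels=("ERROR",)):
    """Count log entries per (level, source) for the requested levels."""
    counts = Counter()
    for raw in lines:
        match = ENTRY.match(raw.rstrip())
        # Lines that are not entry headers (stack traces, blanks) are ignored.
        if match and match.group("level") in levels:
            counts[(match.group("level"), match.group("source"))] += 1
    return counts


if __name__ == "__main__":
    # To scan a real file instead, pass an open file object:
    #   with open("prism_gateway.log") as fh: summarize(fh)
    for (level, source), n in summarize(SAMPLE).most_common():
        print(f"{n:3d}  {level:5s}  {source}")
```

A count like this only shows which errors are frequent, not which one is the cause of a failed registration.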

 

Does anyone have an idea how to fix this or where to look for further info?

 

Best Regards

Nick

Best answer by thirdeyenick 17 May 2021, 09:07

Thanks. We actually noticed that the remote connection could not be created because long-running tasks in Nutanix (cerberus uncharge tasks) were blocking it. They prevented the ‘create remote connection’ tasks from being started. After we killed all of those cerberus tasks, the remote connection was created without a problem. So all the exceptions that can be seen in the prism_gateway.log file are unrelated.

3 replies

If the solution presented in the knowledge base doesn’t help, I doubt anything else will work. Maybe try creating a new PC (Prism Central). Something might have gone wrong during the upgrade.

Great that you found the solution. Can you mark your reply as the answer so that people with similar issues can find it faster? Thank you.