Question

upgrade failed

  • 27 July 2019
  • 7 replies
  • 10 views

Hello,
I just tried upgrading to the latest version of Nutanix Community Edition and now I cannot access the web GUI. I can ping it and SSH to it, but the web GUI will not load and I have no idea what state it is in. Can anyone help?
I tried searching the forums, but my understanding of the CLI is basic.

admin@:192.168.200.101:~$ ncli host edit id=00056718-d111-3013-359e-94c69119bbd0::2 enable-maintenance-mode=false

Id : 00056718-d111-3013-359e-94c69119bbd0::2
Uuid : e9b4174a-3d12-4f40-946a-04a055ad865a
Name : NTNX-89900b4a-A
IPMI Address :
Controller VM Address : 192.168.200.101
Controller VM NAT Address :
Controller VM NAT PORT :
Hypervisor Address : 192.168.200.100
Hypervisor Version : Nutanix 20180123.170
Host Status : NORMAL
Oplog Disk Size : 25.36 GiB (27,229,698,293 bytes) (6.4%)
Under Maintenance Mode : false (-)
Metadata store status : Metadata store enabled on the node
Node Position : Node physical position can't be displayed for this model. Please refer to Prism UI for this information.
Node Serial (UUID) : e9b4174a-3d12-4f40-946a-04a055ad865a
Block Serial (Model) : 89900b4a (CommunityEdition)

7 replies

The state of the cluster: start
Lockdown mode: Disabled

CVM: 192.168.200.101 Up, ZeusLeader
Zeus UP [3190, 3220, 3221, 3222, 3268, 3286]
Scavenger UP [3952, 3986, 3987, 3988]
SSLTerminator UP [17232, 17275, 17279, 17280]
SecureFileSync UP [17244, 17301, 17302, 17303]
Medusa UP [17560, 17611, 17612, 17671, 18438]
DynamicRingChanger UP [19476, 19539, 19541, 19828]
Pithos UP [19481, 19588, 19589, 19702]
Mantle UP [19512, 19654, 19655, 19736]
Hera UP [19546, 19663, 19664, 19665]
Stargate UP [20856, 20887, 20888, 21017, 21018]
InsightsDB UP [22057, 22212, 22213, 22430]
InsightsDataTransfer UP [22095, 22231, 22232, 22398, 22399, 22400, 22401]
Ergon UP [22129, 22266, 22267, 22271]
Cerebro UP [22157, 22276, 22277, 23036]
Chronos UP [22190, 22302, 22303, 22551]
Curator UP [22205, 22376, 22377, 22636]
Athena UP [22244, 22381, 22382, 22383]
Prism UP [22627, 22823, 22824, 23097, 25137, 25283]
CIM UP [22717, 22947, 22948, 23074]
AlertManager UP [22827, 23044, 23045, 23176]
Arithmos UP [22989, 23106, 23107, 23482]
Catalog UP [23069, 23158, 23159, 23160]
Acropolis UP [23165, 23278, 23279, 23282]
Uhura UP [23179, 23316, 23317, 23319]
Snmp UP [23250, 23405, 23406, 23409]
SysStatCollector UP [23311, 23480, 23481, 23484]
Tunnel UP [23390, 23527, 23528]
Janus UP [23436, 23610, 23611]
NutanixGuestTools UP [23596, 23722, 23723, 23895]
MinervaCVM UP [25881, 25969, 25970, 25972]
ClusterConfig UP [25915, 26024, 26025, 26026]
Mercury UP [25939, 26014, 26015, 26085]
APLOSEngine UP [25973, 26050, 26051, 26052]
APLOS UP [26670, 26761, 26762, 26763]
Lazan UP [26684, 26794, 26795, 26797]
Delphi UP [26710, 26812, 26813, 26814]
ClusterHealth UP [26741, 26920, 26921]
2019-07-27 16:49:09 INFO cluster:2751 Success!
admin@:192.168.200.101:~$ ncli host ls | egrep "Maint| Id| Name"
Id : 00056718-d111-3013-359e-94c69119bbd0::2
Name : NTNX-89900b4a-A
Under Maintenance Mode : false (-)
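With this many services listed, a quick way to confirm that nothing is down is to filter the pasted status text. The following is a minimal sketch, assuming each service line has the shape "Name STATE [pids]" as in the output above; the DOWN line in the sample is made up for illustration and does not come from this cluster.

```python
import re

# Matches lines such as "Prism UP [22627, 22823]" from pasted cluster status output.
# The "DOWN" state is an assumption about how a stopped service would be shown.
SERVICE_LINE = re.compile(r"^\s*(\w+)\s+(UP|DOWN)\s+\[([\d,\s]*)\]\s*$")


def services_not_up(status_text):
    """Return the names of services whose state column is not UP."""
    down = []
    for line in status_text.splitlines():
        match = SERVICE_LINE.match(line)
        if match and match.group(2) != "UP":
            down.append(match.group(1))
    return down


# Two lines copied from the thread plus one hypothetical DOWN line.
sample = """\
Zeus UP [3190, 3220, 3221, 3222, 3268, 3286]
Prism UP [22627, 22823, 22824, 23097, 25137, 25283]
Stargate DOWN []
"""

print(services_not_up(sample))
```

An empty list for the real output would agree with the "Success!" line: every service, including Prism, reports UP.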
Badge
Any help would be greatly appreciated.
Badge +9
Hi,
By "will not load", do you mean that there is no response when you connect to the right IP, or do you get an error? Prism seems to be up after the cluster restart (judging from the output posted), so the issue might be, for example, firewall settings or something else preventing access to it.
Which browser do you use? Have you checked the NTP settings? Search the forum for "Prism not accessible", "prism connect" or similar to find related threads.
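One way to tell "no response" apart from a browser-side error is to test whether the Prism port accepts TCP connections at all from the machine running the browser. This is a minimal sketch, assuming Prism listens on TCP port 9440 (the usual web console port); the helper name `port_open` is made up for this example.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For the cluster in this thread the call would be `port_open("192.168.200.101", 9440)`. A result of False points toward a firewall or routing problem; True means the port is reachable and the problem is more likely in the browser, the certificate or the login itself.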
Badge
I have checked NTP. It's 1 hour out.

admin@:192.168.200.101:~$ ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+192.168.200.12  216.239.35.0     2 u    2   64  377    0.410   -0.524   0.104
*ntp2.owennelson 85.199.214.98    2 u    7   64  377   11.974   -0.152   0.195
+183.ip-51-89-15 91.121.7.182     3 u    3   64  377    8.810   -0.110   0.117
 LOCAL(0)        .LOCL.          10 l  878   64    0    0.000    0.000   0.000
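Note that the offset column of `ntpq -p` is reported in milliseconds, so the peers above are all within about half a millisecond of the CVM clock. A displayed time that is an hour out would then more plausibly come from something else, such as a time zone setting, although nothing in this thread confirms that. The following is a rough sketch for pulling the offsets out of pasted output; the function name is made up for this example.

```python
def peer_offsets_ms(ntpq_text):
    """Map each peer name in `ntpq -p` output to its offset in milliseconds."""
    offsets = {}
    for line in ntpq_text.splitlines():
        fields = line.split()
        # Peer rows have exactly 10 columns; offset is the ninth.
        if len(fields) != 10:
            continue
        try:
            offset = float(fields[8])
        except ValueError:
            continue  # header row: the ninth field is the word "offset"
        name = fields[0]
        # Drop a leading tally character such as '*' or '+'.
        # Letter tallies ('x', 'o') cannot be told apart from a hostname here.
        if not name[0].isalnum():
            name = name[1:]
        offsets[name] = offset
    return offsets


sample = """\
remote refid st t when poll reach delay offset jitter
==============================================================================
+192.168.200.12 216.239.35.0 2 u 2 64 377 0.410 -0.524 0.104
*ntp2.owennelson 85.199.214.98 2 u 7 64 377 11.974 -0.152 0.195
+183.ip-51-89-15 91.121.7.182 3 u 3 64 377 8.810 -0.110 0.117
LOCAL(0) .LOCL. 10 l 878 64 0 0.000 0.000 0.000
"""

offsets = peer_offsets_ms(sample)
worst = max(abs(value) for value in offsets.values())
print(offsets)
print("largest offset in ms:", worst)
```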
Badge
/health_checks/hypervisor_checks/ahv_read_only_fs_check [ INFO ]
----------------------------------------------------------------------------------------------------------------------------+
/health_checks/system_checks/default_password_check [ INFO ]
----------------------------------------------------------------------------------------------------------------------------+
/health_checks/stargate_checks/ondisk_dedup_enabled_check [ INFO ]
----------------------------------------------------------------------------------------------------------------------------+
/health_checks/system_checks/hostname_resolution_check [ WARN ]
----------------------------------------------------------------------------------------------------------------------------+
/health_checks/system_checks/chassis_cpus_type_check [ WARN ]
----------------------------------------------------------------------------------------------------------------------------+
/health_checks/hypervisor_checks/host_cpu_contention [ WARN ]
----------------------------------------------------------------------------------------------------------------------------+
/health_checks/network_checks/mellanox_nic_status_check [ ERR ]
----------------------------------------------------------------------------------------------------------------------------+
/health_checks/system_checks/host_cpu_frequency_check [ ERR ]
----------------------------------------------------------------------------------------------------------------------------+

Detailed information for ahv_read_only_fs_check:
Node 192.168.200.101:
INFO: Found the following read-only filesystem(s) at 192.168.200.100:
tmpfs /sys/fs/cgroup tmpfs ro,seclabel,nosuid,nodev,noexec,mode=755 0 0

Refer to KB 4897 (http://portal.nutanix.com/kb/4897) for details on ahv_read_only_fs_check or Recheck with: ncc health_checks hypervisor_checks ahv_read_only_fs_check --cvm_list=192.168.200.101

Detailed information for ondisk_dedup_enabled_check:
Node 192.168.200.101:
INFO: Cluster does not satisfy the pre-requisite for enabling deduplication.
Refer to KB 1851 (http://portal.nutanix.com/kb/1851) for details on ondisk_dedup_enabled_check or Recheck with: ncc health_checks stargate_checks ondisk_dedup_enabled_check

Detailed information for hostname_resolution_check:
Node 192.168.200.101:
WARN: Unable to get configured fqdn for host: 192.168.200.100
Refer to KB 1709 (http://portal.nutanix.com/kb/1709) for details on hostname_resolution_check or Recheck with: ncc health_checks system_checks hostname_resolution_check --cvm_list=192.168.200.101

Detailed information for chassis_cpus_type_check:
Node 192.168.200.101:
WARN: chassis 13 has non-Intel CPUs. Non-Intel CPUs are on nodes 192.168.200.101.
Refer to KB 3288 (http://portal.nutanix.com/kb/3288) for details on chassis_cpus_type_check or Recheck with: ncc health_checks system_checks chassis_cpus_type_check

Detailed information for host_cpu_contention:
Node 192.168.200.101:
WARN: High host CPU utilization on host 192.168.200.100: 98 (Threshold: 75).
Refer to KB 2797 (http://portal.nutanix.com/kb/2797) for details on host_cpu_contention or Recheck with: ncc health_checks hypervisor_checks host_cpu_contention --cvm_list=192.168.200.101

Detailed information for mellanox_nic_status_check:
Node 192.168.200.101:
ERR : node (service_vm_id: 2) : Error while trying to get NIC information
Refer to KB 4114 (http://portal.nutanix.com/kb/4114) for details on mellanox_nic_status_check or Recheck with: ncc health_checks network_checks mellanox_nic_status_check --cvm_list=192.168.200.101

Detailed information for host_cpu_frequency_check:
Node 192.168.200.101:
ERR : Error while getting host CPU frequency range. bash: cpupower: command not found

Refer to KB 5542 (http://portal.nutanix.com/kb/5542) for details on host_cpu_frequency_check or Recheck with: ncc health_checks system_checks host_cpu_frequency_check --cvm_list=192.168.200.101
+-----------------------+
| State         | Count |
+-----------------------+
| Pass          | 181   |
| Info          | 3     |
| Warning       | 3     |
| Error         | 2     |
| Total Plugins | 190   |
+-----------------------+
Plugin output written to /home/nutanix/data/logs/ncc-output-latest.log
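To see at a glance which checks need attention, the status lines at the top of the NCC output can be grouped by result. This is a minimal sketch, assuming every result line has the shape "/health_checks/... [ STATUS ]" as in the paste above; it reads pasted text only and does not call NCC itself.

```python
import re
from collections import defaultdict

# Matches lines such as "/health_checks/system_checks/foo_check [ WARN ]".
CHECK_LINE = re.compile(r"^(/health_checks/\S+)\s+\[\s*(\w+)\s*\]\s*$")


def checks_by_status(ncc_text):
    """Group NCC check paths by their reported status (INFO, WARN, ERR, ...)."""
    groups = defaultdict(list)
    for line in ncc_text.splitlines():
        match = CHECK_LINE.match(line.strip())
        if match:
            groups[match.group(2)].append(match.group(1))
    return dict(groups)


# A few result lines copied from the output above.
sample = """\
/health_checks/system_checks/default_password_check [ INFO ]
-----------------------------------------------------------+
/health_checks/system_checks/hostname_resolution_check [ WARN ]
-----------------------------------------------------------+
/health_checks/network_checks/mellanox_nic_status_check [ ERR ]
-----------------------------------------------------------+
/health_checks/system_checks/host_cpu_frequency_check [ ERR ]
"""

for status, paths in checks_by_status(sample).items():
    print(status, paths)
```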
Badge +5
Out of curiosity, are you seeing errors like these in data/logs/prism_gateway.log on the Prism leader?

WARN 2019-07-29 06:32:34,991 http-nio-127.0.0.1-9081-exec-5 commands.auth.CACAuthenticationProvider.authenticate:86 CAC not supported of user without domain name
INFO 2019-07-29 06:32:34,991 http-nio-127.0.0.1-9081-exec-5 com.nutanix.syslog.generateSyslog:16 An unsuccessful login attempt was made with username: NTNX_SESSION_META=invalid; NTNX_SVC_META=.route1 from IP: 10.47.3.33 and browser: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:67.0) Gecko/20100101 Firefox/67.0

I'm having a similar problem. I just upgraded to 20180501 and now can't get into Prism. (I'm working my way up to the latest version; I'm several revisions behind because every time I upgrade I run into problems and, as likely as not, end up having to destroy and rebuild my cluster.) I also had one host that was having major issues of its own; I finally gave up and removed it from the cluster via nCLI.

I've restarted Prism on the whole cluster several times, no dice. My VMs seem to be running more or less fine, though.
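To check whether a log contains entries like the ones quoted above, the file can be scanned for the "unsuccessful login attempt" message and the hits counted per source IP. This is only a sketch, assuming the message format matches the quoted INFO line; the function name is made up, and the log path would need to be adjusted to wherever prism_gateway.log lives on the CVM.

```python
import re
from collections import Counter

# Captures the address that follows "from IP:" in the quoted log message.
FAILED_LOGIN = re.compile(r"unsuccessful login attempt .*from IP: (\S+)")


def failed_logins_by_ip(log_lines):
    """Count unsuccessful login attempts per source IP address."""
    counts = Counter()
    for line in log_lines:
        match = FAILED_LOGIN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# The two lines quoted in the post above.
sample_lines = [
    "WARN 2019-07-29 06:32:34,991 http-nio-127.0.0.1-9081-exec-5 "
    "commands.auth.CACAuthenticationProvider.authenticate:86 "
    "CAC not supported of user without domain name",
    "INFO 2019-07-29 06:32:34,991 http-nio-127.0.0.1-9081-exec-5 "
    "com.nutanix.syslog.generateSyslog:16 An unsuccessful login attempt was made "
    "with username: NTNX_SESSION_META=invalid; NTNX_SVC_META=.route1 "
    "from IP: 10.47.3.33 and browser: Mozilla/5.0",
]

print(failed_logins_by_ip(sample_lines))
```

To run it against a real file, pass an open file object, for example `failed_logins_by_ip(open(path))`, where `path` points at the log.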
Badge
I'm running this on an Intel NUC.