Question

Problem upgrading AHV



We have upgraded an old cluster from AHV 6.5.6.6 to 6.5.6.7, but one node wasn't upgraded, and I can't perform the upgrade through the web UI. Is it possible to upgrade a single node via the CLI?

 

Moreover, can we upgrade to AHV 7?

 

Kind regards

14 replies

JeroenTielen
  • Vanguard
  • 1341 replies
  • February 4, 2025

I'm a bit confused. Do you want to upgrade AOS or AHV? AHV 7 doesn't exist; it is either AOS 7 or AHV 10.

 

Can you post the error you received? 


  • Author
  • Trailblazer
  • 17 replies
  • February 5, 2025

Hi Jeroen,

 

Yes, sorry, I think I made a mistake. :D

 

The error message is NTNX_LCM_UPGRADE_FAILURE_ALERT: Upgrade operation failed.

 

Upgrade operation failed : Error LCM failed performing action enter_host_mm in phase PreActions on ip address 192.168.59.13. Failed with error 'Timed out putting host 192.168.59.12 into maintenance mode. _submit_maintenance_mode_task failed with error code 21, error message HostEvacuationFailure: Failed to evacuate 2/2 VMs:
- 2: UncaughtException: Traceback (most recent call last):
  File "build/bdist.linux-x86_64/egg/ergon/client/legacy/base_task.py", line 533, in _resume
  File "build/bdist.linux-x86_64/egg/acropolis/hypervisor/host.py", line 80, in wrapper
  File "build/bdist.linux-x86_64/egg/acropolis/vm/migrate_task.py", line 360, in _run
  File "build/bdist.linux-x86_64/egg/acropolis/vm/migrate_task.py", line 513, in _migrate_loop
  File "/usr/lib64/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "build/bdist.linux-x86_64/egg/acropolis/vm/migrate_task.py", line 494, in _migrate_loop
  File "build/bdist.linux-x86_64/egg/acropolis/vm/migrate_task.py", line 1018, in _migrate
  File "build/bdist.linux-x86_64/egg/acropolis/hypervisor/kvm/migrate_mixin.py", line 55, in wrapper
  File "build/bdist.linux-x86_64/egg/acropolis/hypervisor/kvm/libvirt_connection.py", line 294, in wrapper
  File "build/bdist.linux-x86_64/egg/acropolis/hypervisor/kvm/migrate_mixin.py", line 342, in migrate_vm
  File "build/bdist.linux-x86_64/egg/acropolis/hypervisor/kvm/migrate_mixin.py", line 232, in _migrate_vm
  File "build/bdist.linux-x86_64/egg/acropolis/hypervisor/kvm/migrate_mixin.py", line 982, in _init_migrate_vm
  File "build/bdist.linux-x86_64/egg/acropolis/hypervisor/kvm/disk_mixin.py", line 205, in write_frodo_disk_map
  File "build/bdist.linux-x86_64/egg/acropolis/hypervisor/kvm/disk_mixin.py", line 382, in _get_disk_dev_map
  File "build/bdist.linux-x86_64/egg/acropolis/vmdisk/manager.py", line 79, in target_name
AttributeError: 'NoneType' object has no attribute 'iscsi_target_name'
.The changes made by pre-actions on node have been automatically reverted.'

 

Two nodes were updated, but the third was not.

 

Kind regards



manfred
  • Trailblazer
  • 23 replies
  • February 7, 2025

Hi,

You may try to migrate the VMs manually to see why the host evacuation fails. It can be due to resource constraints, or the VMs may be pinned to a single host by an affinity rule.
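A minimal sketch run from any CVM ("MyVM" and the host IP are placeholders for your own VM and a target host):

acli vm.get MyVM                          # inspect the VM config: affinity, disks, etc.
acli vm.migrate MyVM host=192.168.59.16   # attempt a manual live migration

If the manual migration fails, the error it prints should point at whatever is blocking the evacuation.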
 


  • Author
  • Trailblazer
  • 17 replies
  • February 9, 2025

Hi Manfred,

 

Thank you for your answer. Yes, I would like to do this, but after moving these VMs, how can I relaunch the upgrade?

 

Kind regards


Just evacuate the VMs and then start the upgrade again; it should work for you.

BTW, AOS is different from AHV: AOS is the OS that runs on top of the CVM, and AHV is the name of the Nutanix hypervisor. They are not the same.
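A quick way to tell them apart from a CVM (a minimal sketch; treat /etc/nutanix-release as an assumption about the AHV release file):

ncli cluster info                    # shows the AOS (cluster) version, e.g. 6.5.6.7
hostssh "cat /etc/nutanix-release"   # shows the AHV version on every host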


  • Author
  • Trailblazer
  • 17 replies
  • February 17, 2025

Hi Jamali,


Thank you for your answers. However, I can't run the upgrade again; I can't relaunch it in Prism Element.

 

Kind regards


What is LCM showing you? You need to provide more details.

Please connect to one of the CVMs over SSH, run the commands below, and send their output as well:

#cs | grep -v UP
#svmips -d
#nodetool -h 0 ring
#zeus_config_printer | grep is_degraded
#lcm_auto_upgrade_status
#lcm_upgrade_status
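A minimal sketch to capture all of that in one file (run from one CVM; cs is the usual alias for cluster status, and the output path is just a suggestion):

(
  cluster status | grep -v UP
  svmips -d
  nodetool -h 0 ring
  zeus_config_printer | grep is_degraded
  lcm_auto_upgrade_status
  lcm_upgrade_status
) > ~/tmp/lcm_diag.txt 2>&1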

 

 

Regards,


  • Author
  • Trailblazer
  • 17 replies
  • February 17, 2025

Hi,

#cs | grep -v UP

2025-02-17 14:34:03,429Z INFO MainThread zookeeper_session.py:191 cluster is attempting to connect to Zookeeper
2025-02-17 14:34:03,431Z INFO Dummy-1 zookeeper_session.py:625 ZK session establishment complete, sessionId=0x294adecca72fea9, negotiated timeout=20 secs
2025-02-17 14:34:03,436Z INFO MainThread cluster:2943 Executing action status on SVMs 192.168.59.13,192.168.59.16,192.168.59.19
The state of the cluster: start
Lockdown mode: Disabled

        CVM: 192.168.59.13 Up, ZeusLeader

        CVM: 192.168.59.16 Up

        CVM: 192.168.59.19 Up
2025-02-17 14:34:06,455Z INFO MainThread cluster:3104 Success!

#svmips -d

192.168.59.13 -> 6
192.168.59.16 -> 7
192.168.59.19 -> 8

#nodetool -h 0 ring

192.168.59.16   Up     Normal     16.35 GB        33.33%  
192.168.59.13   Up     Normal     16.39 GB        33.33%  
192.168.59.19   Up     Normal     18.79 GB        33.33%

 

#zeus_config_printer | grep is_degraded

(no output)

 

#lcm_auto_upgrade_status

******* LCM framework auto update status *********


It is a 3 node cluster


Mapping of svmips to node uuids:

192.168.59.13 -> 26376f3d-e64b-4806-9ec9-edb3da697b0e
192.168.59.16 -> bd9d8928-6886-4619-a8e1-3b0d30d11ac0
192.168.59.19 -> 2cb6880a-44e8-4964-ab79-79f4faf44e42

Lcm leader is at 192.168.59.13


LCM update intent:
3.1.56788

LCM version as per zknodes /appliance/logical/lcm/update/*
26376f3d-e64b-4806-9ec9-edb3da697b0e -> 3.1.56788
2cb6880a-44e8-4964-ab79-79f4faf44e42 -> 3.1.56788
bd9d8928-6886-4619-a8e1-3b0d30d11ac0 -> 3.1.56788

LCM version as per ~/cluster/config/lcm/version.txt
================== 192.168.59.16 =================
3.1.56788
================== 192.168.59.19 =================
3.1.56788
================== 192.168.59.13 =================
3.1.56788

LCM schema version:
1.48

md5sum of ~/cluster/config/lcm/idf/* on all nodes ************
================== 192.168.59.16 =================
73cb27921c8725f14b7be9bfa7cb35d7  /home/nutanix/cluster/config/lcm/idf/entity_attribute_config.proto.template
fd51fb3bd8017a98e1d6daa6fdcbb812  /home/nutanix/cluster/config/lcm/idf/entity_type_config.proto.template

================== 192.168.59.19 =================
73cb27921c8725f14b7be9bfa7cb35d7  /home/nutanix/cluster/config/lcm/idf/entity_attribute_config.proto.template
fd51fb3bd8017a98e1d6daa6fdcbb812  /home/nutanix/cluster/config/lcm/idf/entity_type_config.proto.template

================== 192.168.59.13 =================
73cb27921c8725f14b7be9bfa7cb35d7  /home/nutanix/cluster/config/lcm/idf/entity_attribute_config.proto.template
fd51fb3bd8017a98e1d6daa6fdcbb812  /home/nutanix/cluster/config/lcm/idf/entity_type_config.proto.template


******* LCM Mercury enabled? *********

True

******* LCM Mercury configuration status *********


26376f3d-e64b-4806-9ec9-edb3da697b0e -> 1.3
2cb6880a-44e8-4964-ab79-79f4faf44e42 -> 1.3
bd9d8928-6886-4619-a8e1-3b0d30d11ac0 -> 1.3

 When was LCM framework updated and genesis restarted?
================== 192.168.59.16 =================
================== 192.168.59.19 =================
================== 192.168.59.13 =================

 Genesis restarts on all nodes happened at:
================== 192.168.59.16 =================
================== 192.168.59.19 =================
================== 192.168.59.13 =================

 LCM version from logs:
================== 192.168.59.16 =================
================== 192.168.59.19 =================
================== 192.168.59.13 =================

Did LCM service start on all nodes / errors in starting LCM ?
================== 192.168.59.16 =================
================== 192.168.59.19 =================
================== 192.168.59.13 =================
Last service started on all nodes
================== 192.168.59.16 =================
================== 192.168.59.19 =================
================== 192.168.59.13 =================

 

#lcm_upgrade_status

Running python version - Python 2.7.5
Could not get the metric information for the following operation list:
 set([u'df680093-1f13-482a-491e-ff3d876b00f4'])
LCM autoupdate is not in progress
Incomplete upgrades which block 1-click:
[{'entity_class': u'Hypervisor', 'entity_model': u'AHV hypervisor'}]

Ongoing upgrades in current batch:
No upgrade is in progress

Finished upgrades:
Up to 5 previously finished upgrade batches listed in descending order of upgrade start time:
---------------------------------
Upgrade of entity Hypervisor(AHV hypervisor), on host (192.168.59.18) from version [el7.nutanix.20220304.478] to version [el7.nutanix.20220304.511] started on 2025-01-28-18:02:58, finished on 2025-01-28-18:19:51
Upgrade of entity Hypervisor(AHV hypervisor), on host (192.168.59.15) from version [el7.nutanix.20220304.478] to version [el7.nutanix.20220304.511] started on 2025-01-28-17:44:58, finished on 2025-01-28-18:00:37
Upgrade of entity LICENSING SERVICE(Licensing), on host (192.168.59.12) from version [LM.2022.2.1] to version [LM.2022.2.5] started on 2025-01-28-17:40:18, finished on 2025-01-28-17:40:33
Upgrade of entity LICENSING SERVICE(Licensing), on host (192.168.59.18) from version [LM.2022.2.1] to version [LM.2022.2.5] started on 2025-01-28-17:39:47, finished on 2025-01-28-17:40:09
Upgrade of entity LICENSING SERVICE(Licensing), on host (192.168.59.15) from version [LM.2022.2.1] to version [LM.2022.2.5] started on 2025-01-28-17:39:21, finished on 2025-01-28-17:39:39
Upgrade of entity Core Cluster(AOS), on host (192.168.59.18) from version [6.5.5.1] to version [6.5.6.6] started on 2025-01-28-16:59:30, finished on 2025-01-28-17:39:12
Upgrade of entity Core Cluster(Foundation), on host (192.168.59.18) from version [5.5] to version [5.7] started on 2025-01-28-16:52:55, finished on 2025-01-28-16:59:23
Upgrade of entity Core Cluster(FSM), on host (192.168.59.18) from version [4.4.0.1] to version [4.4.0.3] started on 2025-01-28-16:50:15, finished on 2025-01-28-16:52:48
Upgrade of entity Core Cluster(NCC), on host (192.168.59.18) from version [4.6.6.1] to version [4.6.6.3] started on 2025-01-28-16:49:13, finished on 2025-01-28-16:50:08
---------------------------------
Upgrade of entity Cluster Service(Foundation Platforms), on host (192.168.59.12) from version [2.14] to version [2.14.1] started on 2024-02-16-10:58:18, finished on 2024-02-16-10:58:49
Upgrade of entity Cluster Service(Foundation Platforms), on host (192.168.59.18) from version [2.14] to version [2.14.1] started on 2024-02-16-10:57:41, finished on 2024-02-16-10:58:11
Upgrade of entity Cluster Service(Foundation Platforms), on host (192.168.59.15) from version [2.14] to version [2.14.1] started on 2024-02-16-10:57:03, finished on 2024-02-16-10:57:34
---------------------------------
Upgrade of entity BIOS (Redfish)(BIOS Firmware Skylake X11DPT-B), on host (192.168.59.15) from version [PB60.001] to version [PB80.001] started on 2024-02-15-16:22:24, finished on 2024-02-15-16:28:52
Upgrade of entity BIOS (Redfish)(BIOS Firmware Skylake X11DPT-B), on host (192.168.59.15) from version [PB60.001] to version [PB80.001] started on 2024-02-15-16:16:54, finished on 2024-02-15-16:28:55
Upgrade of entity BIOS (Redfish)(BIOS Firmware Skylake X11DPT-B), on host (192.168.59.15) from version [PB60.001] to version [PB80.001] started on 2024-02-15-15:55:19, finished on 2024-02-15-16:09:40
Upgrade of entity BMCs (Redfish)(NX Gen11 BMC), on host (192.168.59.15) from version [07.14.01] to version [07.15.00] started on 2024-02-15-15:36:33, finished on 2024-02-15-15:55:16
Upgrade of entity BIOS (Redfish)(BIOS Firmware Skylake X11DPT-B), on host (192.168.59.18) from version [PB60.001] to version [PB80.001] started on 2024-02-15-15:28:29, finished on 2024-02-15-15:34:21
Upgrade of entity BIOS (Redfish)(BIOS Firmware Skylake X11DPT-B), on host (192.168.59.18) from version [PB60.001] to version [PB80.001] started on 2024-02-15-15:07:34, finished on 2024-02-15-15:21:39
Upgrade of entity BIOS (Redfish)(BIOS Firmware Skylake X11DPT-B), on host (192.168.59.18) from version [PB60.001] to version [PB80.001] started on 2024-02-15-15:07:34, finished on 2024-02-15-15:34:21
Upgrade of entity BMCs (Redfish)(NX Gen11 BMC), on host (192.168.59.18) from version [07.14.01] to version [07.15.00] started on 2024-02-15-14:49:02, finished on 2024-02-15-15:07:34
Upgrade of entity BIOS (Redfish)(BIOS Firmware Skylake X11DPT-B), on host (192.168.59.12) from version [PB60.001] to version [PB80.001] started on 2024-02-15-14:42:14, finished on 2024-02-15-14:48:07
Upgrade of entity BIOS (Redfish)(BIOS Firmware Skylake X11DPT-B), on host (192.168.59.12) from version [PB60.001] to version [PB80.001] started on 2024-02-15-14:20:41, finished on 2024-02-15-14:34:46
Upgrade of entity BIOS (Redfish)(BIOS Firmware Skylake X11DPT-B), on host (192.168.59.12) from version [PB60.001] to version [PB80.001] started on 2024-02-15-14:20:41, finished on 2024-02-15-14:48:07
Upgrade of entity BMCs (Redfish)(NX Gen11 BMC), on host (192.168.59.12) from version [07.14.01] to version [07.15.00] started on 2024-02-15-14:02:39, finished on 2024-02-15-14:20:40
---------------------------------
Upgrade of entity BMCs (Redfish)(NX Gen11 BMC), on host (192.168.59.15) from version [07.11.00] to version [07.14.01] started on 2024-02-15-13:28:26, finished on 2024-02-15-13:54:12
Upgrade of entity BMCs (Redfish)(NX Gen11 BMC), on host (192.168.59.18) from version [07.11.00] to version [07.14.01] started on 2024-02-15-13:02:58, finished on 2024-02-15-13:27:18
Upgrade of entity BMCs (Redfish)(NX Gen11 BMC), on host (192.168.59.12) from version [07.11.00] to version [07.14.01] started on 2024-02-15-12:37:47, finished on 2024-02-15-13:01:54


  • Adventurer
  • 7 replies
  • February 27, 2025

Hi @tgovin,

Having checked the messages above, I would recommend the following possible solutions (see the sketch after this list):

  1. VM-host affinity rule: if you have configured a VM-host affinity rule on the host —

    Solution: remove/disable it during upgrades.
     
  2. Foundation service running on node(s): if the Foundation service is running on any host(s) —

    Solution: safely stop/kill the Foundation service on the node(s).
     
  3. Node stuck in the Phoenix boot image:

    Solution: run the following command to boot the node back into the host from Phoenix:

    Command: python /phoenix/reboot_to_host.py

     
  4. Node in maintenance mode:

    Solution: exit the node from maintenance mode.
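A minimal sketch of the matching commands, run from a CVM ("MyVM" and the host IP are placeholders; item 1 assumes the affinity rule was set via acli):

acli vm.affinity_unset MyVM                     # 1. clear a VM-host affinity rule
genesis stop foundation                         # 2. stop the Foundation service on the CVMs
acli host.exit_maintenance_mode 192.168.59.12   # 4. take the host out of maintenance mode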

 

Thank you,

