Foundation installer fails at Phoenix prompt. Debug log indicates an error in the hardware layout.

  • 18 July 2017
  • 5 replies
  • 7844 views

I am trying a bare-metal install on three Dell servers with euphrates-5.0.3.1-stable (AHV).
The IPMI and hypervisor NICs are 1G capable and on the same subnet as FoundationVM_3.8 (running in VMware Player on Windows 2012). The Windows box has Bonjour installed, and IPv6 is enabled on the network.
Issues: (1) When running discovery in the Java applet, no servers are discovered. (2) When running Foundation and supplying the IPMI, hypervisor, and CVM IPs, the IPMI addresses are accepted and the three servers reboot and start the installation, but they end up at the Phoenix command prompt with the message "You may ssh into this machine as soon as you bring up a network interface". What else should I check? Please suggest.
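To double-check the "same subnet" claim, a small script along these lines could help. This is only a sketch: the addresses are the CVM, host, and IPMI IPs that appear in the debug log in this post, while the /24 prefix length and the function name `outside_subnet` are assumptions for illustration (the actual netmask is not stated anywhere in the post).

```python
import ipaddress


def outside_subnet(addresses, network):
    """Return the addresses that do NOT fall inside the given network."""
    net = ipaddress.ip_network(network, strict=False)
    return [a for a in addresses if ipaddress.ip_address(a) not in net]


# CVM, host, and IPMI addresses as they appear in the debug log.
NODE_ADDRESSES = [
    "10.224.93.75", "10.224.93.68", "10.224.93.121",
    "10.224.93.76", "10.224.93.69", "10.224.93.122",
    "10.224.93.77", "10.224.93.70", "10.224.93.123",
]

# ASSUMPTION: a /24 network; replace with the real prefix length.
ASSUMED_NETWORK = "10.224.93.0/24"

if __name__ == "__main__":
    strays = outside_subnet(NODE_ADDRESSES, ASSUMED_NETWORK)
    if strays:
        print("Outside", ASSUMED_NETWORK, ":", strays)
    else:
        print("All addresses are inside", ASSUMED_NETWORK)
```

If the real netmask is narrower than /24, some of these addresses could land in different subnets even though they look adjacent.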

Reference: debug log from the Foundation run:


20170716 10:35:32 DEBUG Log from foundation.session.20170716-103532-5.node_10.224.93.75 is logged at /home/nutanix/foundation/log/20170716-103532-5/node_10.224.93.75.log
20170716 10:35:32 INFO Validating parameters. This may take few minutes
20170716 10:35:32 DEBUG Log from foundation.session.20170716-103532-5.node_10.224.93.76 is logged at /home/nutanix/foundation/log/20170716-103532-5/node_10.224.93.76.log
20170716 10:35:32 INFO Validating parameters. This may take few minutes
20170716 10:35:32 DEBUG Log from foundation.session.20170716-103532-5.node_10.224.93.77 is logged at /home/nutanix/foundation/log/20170716-103532-5/node_10.224.93.77.log
20170716 10:35:32 INFO Validating parameters. This may take few minutes
20170716 10:35:32 INFO Quick common validations is done
20170716 10:35:32 DEBUG Clusters are: []
Nodes are: [, , ]
20170716 10:35:32 DEBUG Graph is:...
20170716 10:35:32 INFO Session id: 20170716-103532-5
20170716 10:35:32 INFO Executing imaging graph...
20170716 10:35:32 INFO Ensuring there is no Haswell, Broadwell mix in the same chassis for cluster test-NX
20170716 10:35:32 DEBUG Setting state of from PENDING to RUNNING
20170716 10:35:32 INFO Running
20170716 10:35:38 DEBUG Unable to ssh using private key
20170716 10:35:41 ERROR Unable to ssh using password
20170716 10:35:41 ERROR Unable to get a ssh session
20170716 10:35:44 ERROR Unable to ssh using password
20170716 10:35:44 ERROR Unable to get a ssh session
20170716 10:35:44 ERROR Foundation is unable to determine the current state of the node with ip 10.224.93.75...
20170716 10:36:02 DEBUG Setting state of from RUNNING to FINISHED
20170716 10:36:02 DEBUG Cache HIT: key(_()_{'global_config': })
20170716 10:36:02 DEBUG Setting state of from RUNNING to FINISHED
20170716 10:36:02 INFO Scheduling tasks in parallel []
20170716 10:36:02 INFO Completed
20170716 10:36:02 DEBUG Cache HIT: key(_()_{'global_config': })
20170716 10:36:02 INFO Completed
20170716 10:36:02 DEBUG Setting state of from PENDING to RUNNING
20170716 10:36:02 DEBUG Setting state of from RUNNING to FINISHED
20170716 10:36:02 INFO Running
20170716 10:36:02 INFO Completed
20170716 10:36:02 INFO Scheduling tasks in parallel []
20170716 10:36:02 INFO Node IP: CVM(10.224.93.75) HOST(10.224.93.68) IPMI(10.224.93.121)
20170716 10:36:02 DEBUG Cache HIT: key(_(u'/home/nutanix/foundation/nos/nutanix_installer_package-release-euphrates-5.0.3.1-stable.tar',)_{})
20170716 10:36:02 INFO NOS Version is: 5.0.3.1
20170716 10:36:02 DEBUG Setting state of from RUNNING to FINISHED
20170716 10:36:02 INFO Completed
20170716 10:36:02 DEBUG Setting state of from PENDING to RUNNING
20170716 10:36:02 INFO Running
20170716 10:36:02 INFO Scheduling tasks in parallel [, ]
20170716 10:36:02 INFO Node IP: CVM(10.224.93.76) HOST(10.224.93.69) IPMI(10.224.93.122)
20170716 10:36:02 DEBUG Cache HIT: key(_(u'/home/nutanix/foundation/nos/nutanix_installer_package-release-euphrates-5.0.3.1-stable.tar',)_{})
20170716 10:36:02 INFO NOS Version is: 5.0.3.1
20170716 10:36:02 DEBUG Setting state of from RUNNING to FINISHED
20170716 10:36:02 INFO Completed
20170716 10:36:02 DEBUG Setting state of from PENDING to RUNNING
20170716 10:36:02 INFO Running
20170716 10:36:02 INFO Node IP: CVM(10.224.93.77) HOST(10.224.93.70) IPMI(10.224.93.123)
20170716 10:36:02 DEBUG Cache HIT: key(_(u'/home/nutanix/foundation/nos/nutanix_installer_package-release-euphrates-5.0.3.1-stable.tar',)_{})
20170716 10:36:02 INFO NOS Version is: 5.0.3.1
20170716 10:36:02 DEBUG Setting state of from RUNNING to FINISHED
20170716 10:36:02 INFO Completed ...
20170716 10:36:02 DEBUG Setting state of from PENDING to RUNNING
20170716 10:36:02 INFO Running
20170716 10:36:02 INFO Attempting to detect device type on 10.224.93.123
20170716 10:36:02 DEBUG factory mode is False
20170716 10:36:02 INFO Checking if this is Quanta......
20170716 10:36:10 DEBUG Command '['ipmitool', '-U', u'itops', '-P', u'12345678', '-H', '10.224.93.122', 'fru']' returned stdout:
stderr:
Get Session Challenge command failed
Error: Unable to establish LAN session
return code: 1......
20170716 10:36:10 INFO Checking if this is a Lenovo system.
20170716 10:36:10 INFO Manufacturer ID = 674
20170716 10:36:11 INFO Manufacturer ID = 674
20170716 10:36:11 INFO Checking if this is software-only node
20170716 10:36:11 INFO Manufacturer ID = 674
20170716 10:36:11 DEBUG Starting new HTTPS connection (1): 10.224.93.121
20170716 10:36:11 INFO Checking if this is software-only node
20170716 10:36:11 DEBUG Starting new HTTPS connection (1): 10.224.93.122
20170716 10:36:11 INFO Checking if this is software-only node
20170716 10:36:11 DEBUG Starting new HTTPS connection (1): 10.224.93.123
20170716 10:36:12 DEBUG https://10.224.93.121:443 "POST /nuova HTTP/1.1" 404 223
20170716 10:36:12 INFO Checking if this is Dell.........
20170716 10:36:15 DEBUG Command '['/opt/dell/srvadmin/sbin/racadm', '-r', '10.224.93.122', '-u', u'itops', '-p', u'12345678', 'getconfig', '-g', 'idRacInfo']' returned stdout:
Security Alert: Certificate is invalid - self signed certificate
Continuing execution. Use -S option for racadm to stop execution on certificate-related errors.
# idRacType=32
# idRacProductInfo=Integrated Dell Remote Access Controller
# idRacDescriptionInfo=This system component provides a complete set of remote management functions for Dell PowerEdge Servers
# idRacVersionInfo=2.40.40.40
# idRacBuildInfo=45
# idRacName=cesin02mon02-idrac
stderr:
/opt/dell/srvadmin/sbin/racadm: line 13: printf: 0xError: invalid hex number
return code: 0
20170716 10:36:15 INFO Detected class idrac7 for node with IPMI IP 10.224.93.122
20170716 10:36:15 DEBUG Setting state of from RUNNING to FINISHED
20170716 10:36:15 INFO Completed
20170716 10:36:15 INFO Scheduling tasks in parallel []
20170716 10:36:15 DEBUG Setting state of from PENDING to RUNNING
20170716 10:36:15 INFO Running
20170716 10:36:15 DEBUG Setting state of from RUNNING to FINISHED
20170716 10:36:15 INFO Completed
20170716 10:36:15 INFO Scheduling tasks in parallel []
20170716 10:36:15 DEBUG Setting state of from PENDING to RUNNING
20170716 10:36:15 INFO Running
20170716 10:36:15 DEBUG Cache HIT: key(_(u'/home/nutanix/foundation/nos/nutanix_installer_package-release-euphrates-5.0.3.1-stable.tar',)_{})
20170716 10:36:15 INFO Preparing NOS package and making node specific Phoenix image
20170716 10:36:15 DEBUG Cache HIT: key(_()_{})
20170716 10:36:15 DEBUG Cache HIT: key(_(u'/home/nutanix/foundation/nos/nutanix_installer_package-release-euphrates-5.0.3.1-stable.tar',)_{})
20170716 10:36:15 DEBUG Cache HIT: key(_(u'/home/nutanix/foundation/nos/nutanix_installer_package-release-euphrates-5.0.3.1-stable.tar', u'5.0.3.1')_{})
20170716 10:36:15 INFO NOS version is 5.0.3.1
20170716 10:36:15 INFO Powering off node...
20170716 10:36:49 INFO /opt/dell/srvadmin/sbin/racadm -r 10.224.93.122 -u itops -p 12345678 serveraction powerup
20170716 10:36:53 INFO Waiting for remote node to boot into phoenix, this may take 15 minutes...
20170716 10:39:19 INFO /opt/dell/srvadmin/sbin/racadm -r 10.224.93.122 -u itops -p 12345678 remoteimage -d
20170716 10:39:20 INFO phoenix_callback: greetings from phoenix
20170716 10:39:20 INFO Rebooted into Phoenix successfully...
20170716 10:39:33 DEBUG Setting state of from RUNNING to FINISHED
20170716 10:39:33 INFO Scheduling tasks in parallel []
20170716 10:39:33 DEBUG Setting state of from PENDING to RUNNING
20170716 10:39:33 INFO Completed
20170716 10:39:33 INFO Running
20170716 10:39:33 INFO Rebooting into staging environment
20170716 10:39:33 INFO Node with ip 10.224.93.77 is in phoenix. Generating hardware_config.json
20170716 10:39:34 ERROR Command '/usr/bin/python /phoenix/layout/layout_finder.py local' returned error code 1
stdout:
stderr:
Traceback (most recent call last):
File "/phoenix/layout/layout_finder.py", line 206, in
write_layout("hardware_config.json", 1)
File "/phoenix/layout/layout_finder.py", line 176, in write_layout
top = get_layout(node_position)
File "/phoenix/layout/layout_finder.py", line 139, in get_layout
system_info_override, use_legacy=use_legacy)
File "/phoenix/layout/layout_finder.py", line 64, in _find_model_match
raise NoMatchingModule()
__main__.NoMatchingModule: Unable to match system information to layout module
...
20170716 10:39:34 ERROR Exception in running
Traceback (most recent call last):
File "/home/hudsonb/workspace/workspace/foundation_installer-3.8-universal/builds/build-3.8-release/foundation-python-tree/bdist.linux-x86_64/egg/foundation/imaging_step.py", line 126, in _run
File "/home/hudsonb/workspace/workspace/foundation_installer-3.8-universal/builds/build-3.8-release/foundation-python-tree/bdist.linux-x86_64/egg/foundation/decorators.py", line 76, in wrap_method
File "/home/hudsonb/workspace/workspace/foundation_installer-3.8-universal/builds/build-3.8-release/foundation-python-tree/bdist.linux-x86_64/egg/foundation/imaging_step_pre_install.py", line 303, in run
File "/home/hudsonb/workspace/workspace/foundation_installer-3.8-universal/builds/build-3.8-release/foundation-python-tree/bdist.linux-x86_64/egg/foundation/imaging_step_pre_install.py", line 89, in _log_system_info
File "/home/hudsonb/workspace/workspace/foundation_installer-3.8-universal/builds/build-3.8-release/foundation-python-tree/bdist.linux-x86_64/egg/foundation/config_manager.py", line 42, in __getattr__
AttributeError: 'NodeConfig' object has no attribute 'hardware_config'
20170716 10:39:34 DEBUG Setting state of from RUNNING to FAILED
20170716 10:39:34 INFO Scheduling tasks in parallel [, ]
20170716 10:39:34 DEBUG Setting state of from PENDING to NR
20170716 10:39:34 WARNING Skipping because dependencies not met, failed tasks: []
20170716 10:39:34 DEBUG Setting state of from PENDING to NR
20170716 10:39:34 WARNING Skipping because dependencies not met, failed tasks: [].........
20170716 10:39:39 INFO Scheduling tasks in parallel []
20170716 10:39:39 DEBUG Setting state of from PENDING to NR
20170716 10:39:39 WARNING Skipping because dependencies not met
20170716 10:39:39 INFO Scheduling tasks in parallel []
20170716 10:39:39 DEBUG Setting state of from PENDING to NR
20170716 10:39:39 INFO Scheduling tasks in parallel []
20170716 10:39:39 WARNING Skipping because dependencies not met
20170716 10:39:39 DEBUG Setting state of from PENDING to NR
20170716 10:39:39 WARNING Skipping because dependencies not met
20170716 10:39:39 INFO Scheduling tasks in parallel []
20170716 10:39:39 DEBUG Setting state of from PENDING to NR
20170716 10:39:39 WARNING Skipping because dependencies not met
20170716 10:39:39 INFO Imaging graph Executed
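For anyone trying to read a Foundation log like the one above, the ERROR entries are easier to spot when pulled out on their own. The following is a rough sketch, not an official tool: it assumes only the timestamp-plus-level prefix visible in this log (`YYYYMMDD HH:MM:SS LEVEL`), and the function names are made up for illustration. It also copes with entries that were pasted without line breaks.

```python
import re

# Zero-width match at the start of each entry: "20170716 10:35:32 INFO ..."
ENTRY_START = re.compile(
    r"(?=\d{8} \d{2}:\d{2}:\d{2}\s+(?:DEBUG|INFO|WARNING|ERROR)\b)"
)
ERROR_ENTRY = re.compile(r"\d{8} \d{2}:\d{2}:\d{2}\s+ERROR\b")


def split_entries(raw):
    """Split raw log text into entries, even if they were run together."""
    return [part.strip() for part in ENTRY_START.split(raw) if part.strip()]


def errors_only(raw):
    """Return only the entries logged at ERROR level."""
    return [entry for entry in split_entries(raw) if ERROR_ENTRY.match(entry)]


# Three entries copied from the log above, deliberately fused together.
SAMPLE = (
    "20170716 10:36:02 INFO NOS Version is: 5.0.3.1"
    "20170716 10:36:02 DEBUG Setting state of from RUNNING to FINISHED"
    "20170716 10:39:34 ERROR Command '/usr/bin/python "
    "/phoenix/layout/layout_finder.py local' returned error code 1"
)

if __name__ == "__main__":
    for entry in errors_only(SAMPLE):
        print(entry)
```

Run against the full log, this would surface the ssh failures at 10:35 and the two errors at 10:39:34 (the `NoMatchingModule` traceback from `layout_finder.py` and the follow-on `AttributeError`), which is where the imaging graph moves to FAILED.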

5 replies

Hi friend, did you get any answer for this?
I'm running into a similar issue but have not found any useful information online.
Regards,
Alex.
I am also getting a similar issue with the HX 7520.
"Also getting similar issue with the HX 7520."
Has this been fixed?
I would suggest looking at your firewall/antivirus tool. In my case it appears the Kaspersky security suite on the laptop was preventing the CVMs from communicating with the Foundation VM.
Turning the agent off or relaxing its rules resolved this. Now my nodes are successfully booting into Phoenix.
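A quick way to see whether something on the host is blocking traffic to the Foundation VM is to attempt a plain TCP connection from another machine on the same network. This is only a sketch: the function name `can_connect` is made up, and the host and port in the usage comment are placeholders, not values confirmed anywhere in this thread.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Usage (PLACEHOLDER values; substitute your Foundation VM address and
# whichever port its service listens on in your setup):
#
#     print(can_connect("192.0.2.10", 8000))
```

If this returns False with the security agent running and True with it turned off, the agent's rules are the likely culprit. A False result alone does not prove a firewall is at fault, since a wrong address or a stopped service looks the same.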

up
