The cause here is that manually updating the CVM's netmask does not update the value of 'external_subnet' in Zeus. This prevents the Data Services IP from communicating with the FSVMs, which in turn prevents the zpools from being mounted.
Note: The proper way to update CVM IP and/or subnet mask configuration is documented here.
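If you want to confirm the mismatch before fixing it, here is a minimal Python sketch (standard library only, not a Nutanix tool) that compares the netmask actually set on a CVM against the external_subnet value Zeus still holds. The sample values are hypothetical, chosen to match the scenario in this thread:

# Minimal sketch: does the CVM's IP/netmask land in the same network Zeus expects?
import ipaddress

def subnets_match(cvm_ip: str, cvm_netmask: str, zeus_external_subnet: str) -> bool:
    """Return True if the CVM's IP/netmask falls in the network recorded in Zeus."""
    cvm_net = ipaddress.ip_network(f"{cvm_ip}/{cvm_netmask}", strict=False)
    zeus_net = ipaddress.ip_network(zeus_external_subnet, strict=False)
    return cvm_net == zeus_net

# Example: netmask edited by hand on the CVM, Zeus never updated (hypothetical values).
print(subnets_match("172.16.1.32", "255.255.254.0", "172.16.1.0/255.255.255.0"))  # False
print(subnets_match("172.16.1.32", "255.255.254.0", "172.16.0.0/255.255.254.0"))  # True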
For this node I'm not able to open a case because the maintenance agreement has expired. This is a PoC asset.
I would recommend upgrading the cluster to AOS 5.10.7 or 5.11.1, because restarting Genesis on a two-node cluster running AOS 5.10.6 could lead to cluster instability.
After the upgrade, try to re-deploy the File Server using Prism and let us know.
Meanwhile I am searching for an alternate solution.
Regs.
However, during the manual upgrade I get the error below.
2020-04-14 12:03:24 WARNING preupgrade_checks.py:815 Skipping replication factor check since cluster is stopped
2020-04-14 12:03:25 INFO multihome_utils.py:146 Cluster does not have multi homed CVMs
2020-04-14 12:03:25 ERROR preupgrade_checks.py:163 Cannot upgrade two node cluster when cluster has a leader fixed. Current leader svm id: 4. Try again after some time , Please refer KB 6396
2020-04-14 12:03:25 INFO preupgrade_checks.py:978 Cluster is stopped, skipping under-replication test
2020-04-14 12:03:25 INFO preupgrade_checks.py:1849 Skipping version compatibility test
2020-04-14 12:03:25 WARNING preupgrade_checks.py:772 Cluster has less than 3 nodes. Downtime possible
2020-04-14 12:03:25 ERROR cluster_upgrade.py:352 Failure in pre-upgrade tests, errors Cannot upgrade two node cluster when cluster has a leader fixed. Current leader svm id: 4. Try again after some time , Please refer KB 6396 Signature validation Error for version 5.10.7 on svm 172.16.1.32. Error: Failed to verify NOS installer signature on svm 172.16.1.32, Please refer KB 6108
2020-04-14 12:03:25 ERROR cluster:1867 Failed to perform cluster upgrade
2020-04-14 12:03:25 ERROR cluster:2815 Operation failed
I've checked the MD5; it's correct.
The ERROR message:
Cannot upgrade two node cluster when cluster has a leader fixed…
means that the cluster is under-replicated.
Curator is responsible for kicking off replication for all extent groups that are not adequately replicated. A Curator full scan is needed to replicate the under-replicated data.
Solution:
Refer to KB 2826. Wait for cluster data to be rebalanced across nodes and Current Fault Tolerance to show 1.
Once the Curator scan has completed, run the pre-upgrade check again. It could take a couple of scans, depending on the number of under-replicated egroups.
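If you want to script the waiting, here is a rough Python sketch of the idea: keep polling until Current Fault Tolerance reaches 1, then re-run the pre-upgrade check. The get_current_fault_tolerance() helper is hypothetical; substitute whatever you use to read the value (Prism, ncli, or the REST API):

# Rough sketch: poll until Current Fault Tolerance reaches 1 before retrying the upgrade.
import time

def get_current_fault_tolerance() -> int:
    # Hypothetical stand-in: read the value from Prism, ncli, or the REST API.
    raise NotImplementedError("replace with a real lookup of Current Fault Tolerance")

def wait_for_fault_tolerance(target: int = 1, poll_seconds: int = 300) -> None:
    """Block until Current Fault Tolerance reaches the target (may take several Curator scans)."""
    while True:
        ft = get_current_fault_tolerance()
        if ft >= target:
            print(f"Fault tolerance is {ft}; safe to re-run the pre-upgrade check.")
            return
        print(f"Fault tolerance is {ft}; data still under-replicated, checking again later.")
        time.sleep(poll_seconds)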
Regs.
Antonio
Hi @AntonioG
Sorry, it just upgraded successfully.
But the file server deployment still fails with the same error.
I need more specific information regarding your cluster; could you please add the following:
Screenshot of the Create File Server Screen from Prism
From any CVM, please provide the output from the following commands:
Have you tried with a mathematically valid subnet?
You specified your network as 172.16.1.0 / 255.255.254.0
The network address cannot be x.x.1.0 with this subnet mask. The “1” in your third octet sets that octet's last bit (00000001), but with 255.255.254.0 that bit is a host bit, not a network bit: the mask is functionally aaaaaaaa.bbbbbbbb.cccccccX.XXXXXXXX (the X positions being host bits), and a valid network address must have all of those bits set to 0.
172.16.1.0 can only be the network address of a 255.255.255.0 subnet or smaller.
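You can confirm the arithmetic with Python's standard ipaddress module; this is just an illustration of the point above, using the addresses from this thread:

# 172.16.1.0 is NOT a valid network address for a 255.255.254.0 (/23) mask...
import ipaddress

try:
    ipaddress.ip_network("172.16.1.0/255.255.254.0")  # strict=True by default
except ValueError as e:
    print(e)  # "... has host bits set"

# ...because under /23 it is just a host inside the 172.16.0.0/23 network:
print(ipaddress.ip_network("172.16.1.0/255.255.254.0", strict=False))  # 172.16.0.0/23

# With a /24 mask it is a legitimate network address:
print(ipaddress.ip_network("172.16.1.0/255.255.255.0"))  # 172.16.1.0/24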
I recognize that correcting this configuration would mean shutting down the cluster and running the IP reconfiguration script. It could also mean some other adjustments to the network; I'm not sure what's going on there.
I wouldn’t be at all surprised if this is why your network validation fails in the creation process.
Please also correct the following FATAL alert that was reported:
FAIL: CVM is not uplinked to any 10Gbps nics on bridge/vSwitch br0.
Node 172.16.1.32:
FAIL: CVM is not uplinked to any 10Gbps nics on bridge/vSwitch br0.
Refer to KB 1584 (http://portal.nutanix.com/kb/1584) for details on 10gbe_check
This will give the vSwitch on the affected host (172.16.1.32) a 10 Gb uplink, as the AHV Networking Best Practices recommend.
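For reference, here is a rough Python sketch (run on the AHV host) of what that NCC check is complaining about: list the ports attached to br0 with ovs-vsctl and read each physical NIC's link speed with ethtool. The ethN naming and tool availability are assumptions, and this is only an illustration, not the actual 10gbe_check:

# Rough sketch: report the link speed of each physical uplink attached to br0.
import re
import subprocess

def br0_uplink_speeds() -> dict:
    ports = subprocess.check_output(["ovs-vsctl", "list-ports", "br0"], text=True).split()
    speeds = {}
    for port in ports:
        if not port.startswith("eth"):
            continue  # skip bonds/internal ports; physical NICs assumed to be named ethN
        out = subprocess.run(["ethtool", port], capture_output=True, text=True).stdout
        match = re.search(r"Speed:\s*(\d+)Mb/s", out)
        speeds[port] = int(match.group(1)) if match else None  # None if link down/unknown
    return speeds

if __name__ == "__main__":
    speeds = br0_uplink_speeds()
    print(speeds)
    if not any(s and s >= 10000 for s in speeds.values()):
        print("No 10Gbps uplink on br0 -- matches the NCC FAIL above.")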
Regs,
Antonio
Hi @JeremyJ, you're right. This is wrong.
First of all, this node was originally configured with a 255.255.255.0 subnet. Then our office changed the network to a /23 mask.
So I ran the cluster reconfiguration and changed the Zeus network address from 172.16.1.0/255.255.255.0 to 172.16.1.0/255.255.254.0.
I'll find a maintenance window and change the Zeus external network address to 172.16.0.0/255.255.254.0 later.
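As a quick sanity check of that planned value, the network containing the CVM IP under a 255.255.254.0 mask is indeed 172.16.0.0/23 (a small Python illustration, using the one CVM IP visible in this thread):

# Sanity check: which /23 network does the CVM IP actually belong to?
import ipaddress

cvm_ip = "172.16.1.32"  # the CVM IP visible in this thread
net = ipaddress.ip_network(f"{cvm_ip}/255.255.254.0", strict=False)
print(cvm_ip, "->", net)  # 172.16.1.32 -> 172.16.0.0/23, i.e. 172.16.0.0/255.255.254.0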
Thank you.
@AntonioG As I told you, this is a non-production cluster, just for PoC and internal testing in my office, so I only used 2x 1 Gb interfaces for connectivity.