Solved

OVS Networking - Unable to bridge eth0-4 for other networks - Error


Userlevel 3
Badge +16
What's my goal? Right. I have 3 networks that cannot talk to each other for "reasons". Mostly because we have a test system that utilizes the same IP addresses (ORACLE Reasons) and we test & demo software in Prod, Demo, & Test networks.

I just did a foundation on my system because I did an upgrade and the networking customization I had to do with OVS changed and wouldn't work anymore.

Now that I'm working on a new install, I'm having trouble configuring the network and getting issues I didn't have before.

I've created and recreated the bridge several times using different variations of the commands below, always with the same outcome. Some of the commands below are from different attempts, but the result is the same.


Error when powering on VM with virtual NIC
Operation failed: InternalException: kNetworkError: OVS error (10.1.16.103 create_local_port): Error not a valid bridge: br1


Host A - ovs show
[root@HOSTA ~]# ovs-vsctl show
48fd5ca1-d898-425e-9b11-484eaceaba94
Bridge "br0"
Port "br0-dhcp"
Interface "br0-dhcp"
type: vxlan
options: {key="1", remote_ip="10.1.16.111"}
Port "br0"
Interface "br0"
type: internal
Port "br0-up"
Interface "eth4"
Interface "eth5"
Port "vnet0"
Interface "vnet0"
Port "br0-arp"
Interface "br0-arp"
type: vxlan
options: {key="1", remote_ip="192.168.5.2"}
Bridge "br1"
Port "br1-up"
Interface "eth1"
Interface "eth0"
Port "br1"
Interface "br1"
type: internal
ovs_version: "2.5.0"
[root@HOSTA ~]#


CVM Commands:
allssh "manage_ovs --bridge_name br1 --bond_name br1-up --interfaces eth0,eth1 update_uplinks"
acli net.create test vswitch_name=br1 vlan=20


When adding to a powered on VM:
https://
https://



Steps: --------------------------------------------------------------
ovs-vsctl add-br br1
allssh "manage_ovs --bridge_name br1 --bond_name br1-up --interfaces eth0 update_uplinks"
acli net.create testing vswitch_name=br1 vlan=0



Output: HOST(s)
[root@HOSTB ~]# ovs-vsctl add-br br1

CVM:
allssh "manage_ovs --bridge_name br1 --bond_name br1-up --interfaces eth0 update_uplinks"

HOST:
[root@HOSTC ~]# ovs-vsctl show
7652eb1a-d640-4565-9bf6-d9a75ef61867
Bridge "br0"
Port "br0-dhcp"
Interface "br0-dhcp"
type: vxlan
options: {key="1", remote_ip="10.1.16.111"}
Port "vnet0"
Interface "vnet0"
Port "br0-up"
Interface "eth5"
Interface "eth4"
Port "br0"
Interface "br0"
type: internal
Port "br0-arp"
Interface "br0-arp"
type: vxlan
options: {key="1", remote_ip="192.168.5.2"}
Bridge "br1"
Port "br1"
Interface "br1"
type: internal
Port "eth0"
Interface "eth0"
ovs_version: "2.5.0"


Output: CVM(s)
nutanix@NTNX-J10AMX3-A-CVM:10.1.16.111:~$ allssh "manage_ovs --bridge_name br1 --bond_name br1-up --interfaces eth0 update_uplinks"
Executing manage_ovs --bridge_name br1 --bond_name br1-up --interfaces eth0 update_uplinks on the cluster
================== 10.1.16.111 =================
2018-06-13 18:17:22 INFO manage_ovs:363 Deleting OVS ports:
2018-06-13 18:17:22 INFO manage_ovs:373 Adding OVS port: eth0
2018-06-13 18:17:23 WARNING manage_ovs:429 Failed to get IP for br1, not sending gratuitous ARPs
Connection to 10.1.16.111 closed.
================== 10.1.16.112 =================
2018-06-13 18:17:30 INFO manage_ovs:363 Deleting OVS ports:
2018-06-13 18:17:30 INFO manage_ovs:373 Adding OVS port: eth0
2018-06-13 18:17:31 WARNING manage_ovs:429 Failed to get IP for br1, not sending gratuitous ARPs
Connection to 10.1.16.112 closed.
================== 10.1.16.113 =================
2018-06-13 18:17:39 INFO manage_ovs:363 Deleting OVS ports:
2018-06-13 18:17:39 INFO manage_ovs:373 Adding OVS port: eth0
2018-06-13 18:17:40 WARNING manage_ovs:429 Failed to get IP for br1, not sending gratuitous ARPs
Connection to 10.1.16.113 closed.
nutanix@NTNX-J10AMX3-A-CVM:10.1.16.111:~$ acli net.create testing vswitch_name=br1 vlan=0
nutanix@NTNX-J10AMX3-A-CVM:10.1.16.111:~$

When powering on VM with attached vNIC: Operation failed: InternalException: kNetworkError: OVS error (10.1.16.103 create_local_port): Error not a valid bridge: br1

Best answer by Bensation 14 June 2018, 15:06


11 replies

Userlevel 3
Badge +16
I was hoping it would be as easy as this:
https://next.nutanix.com/installation-configuration-23/acropolis-hypervisor-best-practices-guide-how-to-create-vlan-on-a-new-bridge-5821
Userlevel 7
Badge +34
Hi @srslol

Let's see if we can get some eyes on this post. Thanks
Userlevel 2
Badge +9
Can you review KB 4611: https://portal.nutanix.com/#/page/kbs/details?targetId=kA032000000bn1OCAQ?

In previous cases I've seen, following Scenario 2 resolved the issue. Also, if you are on AOS 5.5 or higher, use the command `manage_ovs --bridge_name br1 create_single_bridge` from the local CVM to create br1.
Userlevel 3
Badge +16
Hi @Bensation, I'm actually running version 5.1.3. I'm thinking it would be smart to jump up to 5.5.

This is my foundation config here.
https://
Userlevel 3
Badge +16
@Bensation --> Thanks for the link. I went through restarting the cluster and got the same result.

nutanix@NTNX-J10AMX4-A-CVM:10.1.16.112:~$ acli vm.list
VM name VM UUID
User-Sarah-W10 1253bc4b-d7f3-466e-8e8b-79e784e65445
test vlan 21 11992ba9-d1d6-489e-8a5e-9993a3dcae1c
nutanix@NTNX-J10AMX4-A-CVM:10.1.16.112:~$ acli vm.on "test vlan 21"
test vlan 21: pending
test vlan 21: kNetworkError: OVS error (10.1.16.103 create_local_port): Er[...]
----- test vlan 21 -----
kNetworkError: OVS error (10.1.16.103 create_local_port): Error not a valid bridge: br1


Ran through it again and got slightly different error this time:

nutanix@NTNX-J10AMX4-A-CVM:10.1.16.112:~$ acli vm.on "test vlan 21"
test vlan 21: pending
test vlan 21: kNetworkError: OVS error (10.1.16.103 create_local_port): Er[...]
----- test vlan 21 -----
kNetworkError: OVS error (10.1.16.103 create_local_port): Error not a valid bridge: br1
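The error text itself names both the address where the port creation failed and the bridge that was rejected, so it can help to pull those two values out before checking anything else. A minimal sketch, assuming the message always has the shape quoted in this thread; parse_ovs_error is a hypothetical helper, not a Nutanix tool:

```shell
# Sketch: extract "<address> <bridge>" from a kNetworkError line read on stdin.
# parse_ovs_error is a hypothetical helper; the pattern is based only on the
# message quoted in this thread and prints nothing for truncated lines.
parse_ovs_error() {
  sed -n 's/.*OVS error (\([0-9.]*\) [^)]*): Error not a valid bridge: \([^ ]*\).*/\1 \2/p'
}
```

Piping the acli vm.on output through parse_ovs_error prints the address and bridge on one line. Note that the address in the error (10.1.16.103) differs from the CVM addresses in the allssh output (10.1.16.111 to 10.1.16.113), so it is presumably an AHV host address, and that is the host where running ovs-vsctl list-br would show whether br1 is really there.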
Userlevel 2
Badge +9
Can you confirm the following:
  1. Confirm that br1 exists on the node that the VM is booting on: "allssh manage_ovs show_bridges"
  2. Remove the bridge completely with "ovs-vsctl del-br br1", then recreate it with "ovs-vsctl add-br br1". After that, re-apply the update_uplinks and net.create commands.
  3. Confirm that all hosts have connectivity to all CVMs. On each CVM run:
for i in `hostips`; do ping $i -c 3; done
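On a larger cluster the ping loop above produces a lot of output, so a variant that prints only the addresses that fail can be easier to read. A minimal sketch, assuming the addresses are passed as arguments; unreachable_hosts and the PING_CMD override are hypothetical names introduced here, not part of any Nutanix tooling:

```shell
# Sketch: ping each address given as an argument and print only the failures.
# PING_CMD is a hypothetical override so the loop can be tried without a network.
PING_CMD="${PING_CMD:-ping -c 3}"

unreachable_hosts() {
  for ip in "$@"; do
    # $PING_CMD is deliberately unquoted so "ping -c 3" splits into words.
    if ! $PING_CMD "$ip" >/dev/null 2>&1; then
      echo "$ip"
    fi
  done
}
```

On a CVM this would be called as unreachable_hosts `hostips`, reusing the hostips command from the loop above. Empty output means every host answered.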

If there is still an issue it would be best to have support take a look. Do you have a valid contract? If so, open a case and PM me the case number. We can hop on a WebEx to get it squared away.
Userlevel 3
Badge +16
I decided to remove my new bridge fully and do an upgrade to 5.5.2.2 from the Upgrade Software menu. I'm now upgrading Hypervisor. Once that's done, I'll retry everything from scratch using the above command "manage_ovs --bridge_name br1 create_single_bridge" and go down that path. If not, I'll open a case and do that, which I fully appreciate you offering to take a look!!!
Userlevel 7
Badge +34
Sounds good @srslol, keep us posted. If you do call support, consider sharing the solution on this post to help others in the community. 👍
Userlevel 3
Badge +16
So, this is interesting. I upgraded AOS and Hypervisor.



I added the br1 with no problems with: allssh "manage_ovs --bridge_name br1 create_single_bridge"

When I go to add eth0 to br1 (new bridge) this happens:

nutanix@NTNX-J10AMX5-A-CVM:10.1.16.113:~$ allssh "manage_ovs --bridge_name br1 --interfaces eth0 update_uplinks"
================== 10.1.16.111 =================
2018-06-14 12:02:13 INFO manage_ovs:394 Deleting OVS ports:
2018-06-14 12:02:13 INFO manage_ovs:404 Adding OVS port: eth0
2018-06-14 12:02:14 WARNING manage_ovs:460 Failed to get IP for br1, not sending gratuitous ARPs
================== 10.1.16.112 =================
2018-06-14 12:02:24 INFO manage_ovs:394 Deleting OVS ports:
2018-06-14 12:02:24 INFO manage_ovs:404 Adding OVS port: eth0
2018-06-14 12:02:25 WARNING manage_ovs:460 Failed to get IP for br1, not sending gratuitous ARPs
================== 10.1.16.113 =================
2018-06-14 12:02:35 INFO manage_ovs:394 Deleting OVS ports:
2018-06-14 12:02:35 INFO manage_ovs:404 Adding OVS port: eth0
2018-06-14 12:02:36 WARNING manage_ovs:460 Failed to get IP for br1, not sending gratuitous ARPs


Doing a show on a host, I see it added.
Bridge "br1"
Port "br1"
Interface "br1"
type: internal
Port "br1-dhcp"
Interface "br1-dhcp"
type: vxlan
options: {key="2", remote_ip="10.1.16.112"}
Port "br1-arp"
Interface "br1-arp"
type: vxlan
options: {key="2", remote_ip="192.168.5.2"}
Port "br1.u"
Interface "br1.u"
type: patch
options: {peer="br.dmx.d.br1"}
Port "eth0"
Interface "eth0"

Now to add the vNic:
nutanix@NTNX-J10AMX4-A-CVM:10.1.16.112:~$ acli net.create demo vswitch_name=br1 vlan=20

Attaches just fine to the VM.


Okay. So. Uh, this was the point where I was expecting an error and it actually worked, got the vlan DHCP just fine and now my life is complete.

So upgrading did the trick.

You guys, thank you so much. I appreciate how much you guys help us newbies out. Really.
Userlevel 3
Badge +16
For fun, here are the commands I used:
allssh "manage_ovs --bridge_name br1 create_single_bridge"
allssh "manage_ovs --bridge_name br1 --interfaces eth1 update_uplinks"
acli net.create demo20 vswitch_name=br1 vlan=20
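For repeating this on another bridge or VLAN, the same three commands can be generated from parameters. A minimal sketch that only prints the commands for review and runs nothing; bridge_setup_commands is a hypothetical helper name, and create_single_bridge needs AOS 5.5 or higher as noted earlier in the thread:

```shell
# Sketch: print the three commands for a given bridge, uplink list,
# network name, and VLAN. bridge_setup_commands is a hypothetical helper;
# it only prints the commands and does not execute them.
bridge_setup_commands() {
  bridge="$1"; uplinks="$2"; net_name="$3"; vlan="$4"
  printf 'allssh "manage_ovs --bridge_name %s create_single_bridge"\n' "$bridge"
  printf 'allssh "manage_ovs --bridge_name %s --interfaces %s update_uplinks"\n' "$bridge" "$uplinks"
  printf 'acli net.create %s vswitch_name=%s vlan=%s\n' "$net_name" "$bridge" "$vlan"
}
```

Calling bridge_setup_commands br1 eth1 demo20 20 reproduces the three lines above, which can then be pasted into a CVM session one at a time.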
Error when powering on VM with virtual NIC
Operation failed: InternalException: kNetworkError: OVS error (10.1.16.103 create_local_port): Error not a valid bridge: br1


To resolve the issue, verify that the bridge is created on all AHV hosts within the cluster. See AHV Networking Best Practices Guide.
Verify that the bridge exists on all hosts in the cluster using the following command from any CVM in the cluster.
code:
nutanix@cvm$  allssh manage_ovs show_bridges


Scenario 1
In the following output, bridge br1 does not exist on host 10.1.1.12.
Output
code:
nutanix@cvm$ allssh manage_ovs show_bridges
Executing manage_ovs show_bridges on the cluster
================== 10.1.1.11 =================
Bridges:
br1
br0
Connection to 10.1.1.11 closed.
================== 10.1.1.12 =================
Bridges:
br0
Connection to 10.1.1.12 closed.
================== 10.1.1.13 =================
Bridges:
br1
br0
Connection to 10.1.1.13 closed.


Resolution
Follow the steps listed in AHV Networking Best Practices Guide to create bridge br1 on host 10.1.1.12.
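Spotting the host with the missing bridge by eye gets harder as the cluster grows. A minimal sketch that reads the show_bridges output and prints only the hosts where a given bridge is absent, assuming the output keeps the "===== address =====" header format shown above; hosts_missing_bridge is a hypothetical helper, not a Nutanix tool:

```shell
# Sketch: list hosts whose section of the show_bridges output lacks a bridge.
# Reads the text printed by "allssh manage_ovs show_bridges" on stdin.
# hosts_missing_bridge is a hypothetical helper name.
hosts_missing_bridge() {
  awk -v want="$1" '
    # A host header looks like: ================== 10.1.1.11 =================
    /^=+ [0-9.]+ =+/ {
      if (host != "" && !found) print host
      host = $2; found = 0
      next
    }
    # A line that is exactly the bridge name marks it as present on this host.
    {
      line = $0
      gsub(/^[ \t]+|[ \t\r]+$/, "", line)
      if (line == want) found = 1
    }
    END { if (host != "" && !found) print host }
  '
}
```

On a CVM this would be used as: allssh manage_ovs show_bridges | hosts_missing_bridge br1. For the Scenario 1 output it prints 10.1.1.12, and it prints nothing when every host has the bridge.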






Scenario 2
In the following output, bridge br1 exists on all hosts. But an error message still appears indicating that the bridge does not exist when powering on a VM.
Output
code:
nutanix@NTNX-17SM12345678-A-CVM:10.1.1.11:~$ allssh manage_ovs show_bridges
Executing manage_ovs show_bridges on the cluster
================== 10.1.1.11 =================
Bridges:
br1
br0
Connection to 10.1.1.11 closed.
================== 10.1.1.12 =================
Bridges:
br1
br0
Connection to 10.1.1.12 closed.
================== 10.1.1.13 =================
Bridges:
br1
br0
Connection to 10.1.1.13 closed.


Resolution

Restart the Acropolis service on the current Acropolis master CVM. In some cases the list of bridges is not updated until a host reconnects to the Acropolis master CVM; restarting the Acropolis service on the master forces the update. This issue affects AOS versions 5.0.x and 5.1.x.

1. Determine the current Acropolis master by running the following command from any CVM in the cluster.
code:
allssh "links -dump http:0:2030 | grep Master"



2. Connect to the Acropolis master CVM through SSH.
code:
ssh nutanix@10.1.1.13



3. Verify all CVMs and cluster services are in the up state.
code:
cluster status



4. On the Acropolis master CVM, stop the Acropolis service and then restart the service.
code:
genesis stop acropolis; cluster start



Output

code:
nutanix@NTNX-17SM12345678-A-CVM:10.1.1.11:~$ allssh "links -dump http:0:2030 | grep Master"

Executing links -dump http:0:2030 | grep Master on the cluster
================== 10.1.1.11 =================
Acropolis Master: [5]10.1.1.11:2030
Connection to 10.1.1.11 closed.
================== 10.1.1.12 =================
Acropolis Master: [5]10.1.1.12:2030
Connection to 10.1.1.12 closed.
================== 10.1.1.13 =================
Connection to 10.1.1.13 closed.

nutanix@NTNX-17SM12345678-A-CVM:10.1.1.11:~$ ssh nutanix@10.1.1.13





nutanix@NTNX-17SM12345678-C-CVM:10.1.1.13:~$ cluster status | grep -v UP
The state of the cluster: start
Lockdown mode: Disabled

CVM: 10.1.1.11 Up, ZeusLeader

CVM: 10.1.1.12 Up

CVM: 10.1.1.13 Up

nutanix@NTNX-17SM12345678-C-CVM:10.1.1.13:~$ genesis stop acropolis; cluster start
2017-07-13 15:18:52.845699: Stopping acropolis (pids undefined9481, 29513, 29514, 29515])
. . .(output trimmed)
2017-07-13 15:18:57 INFO cluster:2188 Executing action start on SVMs 10.1.1.11,10.1.1.12,10.1.1.13
Waiting on 10.1.1.11 (Up, ZeusLeader) to start:
Waiting on 10.1.1.12 (Up) to start:
Waiting on 10.1.1.13 (Up) to start: Acropolis
. . .(output trimmed)
2017-07-13 15:19:11 INFO cluster:2299 Success!


nutanix@NTNX-17SM12345678-C-CVM:10.1.1.13:~$ acli vm.on TestVM1
TestVM1: pending
TestVM1: complete




5. Power on the guest VM that generated the error, either through Prism or with acli.
code:
acli vm.on <vm_name>

If the above steps do not resolve the issue, consider engaging Nutanix Support at http://portal.nutanix.com.