[Discuss] ARP monitor for OVS bridge over bonding

netdev.vger.kernel.org archive mirror

* [Discuss] ARP monitor for OVS bridge over bonding
@ 2024-09-10 10:17 Hangbin Liu
  2024-09-12 16:36 ` Jay Vosburgh
  0 siblings, 1 reply; 4+ messages in thread
From: Hangbin Liu @ 2024-09-10 10:17 UTC (permalink / raw)
  To: netdev
  Cc: Jay Vosburgh, Andy Gospodarek, David S . Miller, Jakub Kicinski,
	Paolo Abeni, Eric Dumazet, Nikolay Aleksandrov, Simon Horman,
	Aaron Conole, Ilya Maximets, Adrian Moreno, Stanislas Faye

Hi all,

Recently, one of our customers hit an issue with an OVS bridge on top of a bond, e.g.

  eth0      eth1
   |         |
   -- bond0 --
        |
      br-ex (ovs-vsctl add-port br-ex bond0; ip addr add 192.168.1.1/24 dev br-ex)

Before sending ARP probes for slave link detection, the bond needs to check
whether br-ex is in the same data path as bond0. It does so via
bond_verify_device_path(), which uses netdev_for_each_upper_dev_rcu()
to walk all upper devices. This works with a normal Linux bridge. But with an
OVS bridge, the upper device is "ovs-system" instead of br-ex.

After talking with the OVS developers, it turned out that the real OVS
topology looks like this:

              --------------------------------
              |                              |
  br-ex  -----+--      ovs-system            |
              |                              |
  br-int -----+--                            |
              |                              |
              |    bond0    eth2   veth42    |
              |      |       |       |       |
              |      |       |       |       |
              -------+-------+-------+--------
                     |       |       |
                  +--+--+  physical  |
                  |     |    link    |
                eth0  eth1          veth43

br-ex is not an upper link of bond0. Instead, ovs-system is the master
of bond0. This makes it impossible for us to verify that br-ex and bond0
are in the same datapath.
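
To make the failure concrete, here is a minimal userspace sketch of an
upper-device walk. It is a toy model, not the kernel code: struct toy_dev,
toy_path_exists() and the "br0" device are hypothetical stand-ins, and only
the shape of the recursion is meant to resemble a walk over upper devices.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a network device: a name plus its upper links. */
#define TOY_MAX_UPPERS 4

struct toy_dev {
	const char *name;
	struct toy_dev *uppers[TOY_MAX_UPPERS];	/* unused slots are NULL */
};

/* True if 'target' is reachable from 'start' by following upper links only. */
static bool toy_path_exists(const struct toy_dev *start,
			    const struct toy_dev *target)
{
	if (start == target)
		return true;
	for (size_t i = 0; i < TOY_MAX_UPPERS && start->uppers[i]; i++) {
		if (toy_path_exists(start->uppers[i], target))
			return true;
	}
	return false;
}

/* Linux bridge case (assumed topology): br0 is the upper device of bond0. */
static bool toy_linux_bridge_case(void)
{
	struct toy_dev br0 = { .name = "br0" };
	struct toy_dev bond0 = { .name = "bond0", .uppers = { &br0 } };

	return toy_path_exists(&bond0, &br0);
}

/* OVS case from the diagram: the only upper device of bond0 is ovs-system. */
static bool toy_ovs_case(void)
{
	struct toy_dev ovs_system = { .name = "ovs-system" };
	struct toy_dev br_ex = { .name = "br-ex" };
	struct toy_dev bond0 = { .name = "bond0", .uppers = { &ovs_system } };

	return toy_path_exists(&bond0, &br_ex);
}

/* Self-check: the walk finds br0 but never reaches br-ex. */
static void toy_selfcheck(void)
{
	assert(toy_linux_bridge_case());
	assert(!toy_ovs_case());
}
```

In this model the walk from bond0 stops at ovs-system, so a route whose
output device is br-ex can never be matched, whatever OpenFlow does with
the packets afterwards.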

On the other hand, as Adrián Moreno said, packets generated on br-ex
could be routed anywhere using OpenFlow rules (including to eth2 in the
diagram). The same holds for a normal bridge: with tc/netfilter rules, packets
could also be redirected to an interface other than bond0.

So the route interface check in bond_arp_send_all() is not always correct.
Stanislas suggested adding a new user-supplied parameter to bonding, something
like 'arp monitor source interface'. Then we could do something like
	if (rt->dst.dev == arp_src_iface->dev)
		goto found;

What do you think?

Thanks
Hangbin


* Re: [Discuss] ARP monitor for OVS bridge over bonding
  2024-09-10 10:17 [Discuss] ARP monitor for OVS bridge over bonding Hangbin Liu
@ 2024-09-12 16:36 ` Jay Vosburgh
  2024-09-14 10:01   ` Hangbin Liu
  2024-09-17  9:10   ` Adrián Moreno
  0 siblings, 2 replies; 4+ messages in thread
From: Jay Vosburgh @ 2024-09-12 16:36 UTC (permalink / raw)
  To: Hangbin Liu
  Cc: netdev, Andy Gospodarek, David S . Miller, Jakub Kicinski,
	Paolo Abeni, Eric Dumazet, Nikolay Aleksandrov, Simon Horman,
	Aaron Conole, Ilya Maximets, Adrian Moreno, Stanislas Faye

Hangbin Liu <liuhangbin@gmail.com> wrote:

>Hi all,
>
>Recently, our customer got an issue with OVS bridge over bonding. e.g.
>
>  eth0      eth1
>   |         |
>   -- bond0 --
>        |
>      br-ex (ovs-vsctl add-port br-ex bond0; ip addr add 192.168.1.1/24 dev br-ex)
>
>
>Before sending arp message for bond slave detecting, the bond need to check
>if the br-ex is in the same data path with bond0 via function
>bond_verify_device_path(), which using netdev_for_each_upper_dev_rcu()
>to check all upper devices. This works with normal bridge. But with ovs
>bridge, the upper device is "ovs-system" instead of br-ex.
>
>After talking with OVS developers. It turned out the real upper OVS topology
>is looks like
>
>              --------------------------------
>              |                              |
>  br-ex  -----+--      ovs-system            |
>              |                              |
>  br-int -----+--                            |
>              |                              |
>              |    bond0    eth2   veth42    |
>              |      |       |       |       |
>              |      |       |       |       |
>              -------+-------+-------+--------
>                     |       |       |
>                  +--+--+  physical  |
>                  |     |    link    |
>                eth0  eth1          veth43
>
>The br-ex is not upper link of bond0. ovs-system, instead, is the master
>of bond0. This make us unable to make sure the br-ex and bond0 is in the
>same datapath.

	I'm guessing that this is in the context of an openstack
deployment, as "br-ex" and "br-int" are names commonly chosen for the
OVS bridges in openstack.

	But, yes, OVS bridge configuration is very different from the
linux bridge, and the ARP monitor was not designed with OVS in mind.

	I'll also point out that OVS has its own bonding, although it
does not implement functionality equivalent to the ARP monitor.

	However, OVS does provide an implementation of RFC 5880 BFD
(Bidirectional Forwarding Detection).  The openstack deployments that
I'm familiar with typically use the kernel bonding in LACP mode along
with BFD.  Is there a reason that OVS + BFD is unsuitable for your
purposes?

>On the other hand, as Adrián Moreno said, the packets generated on br-ex
>could be routed anywhere using OpenFlow rules (including eth2 in the
>diagram). The same with normal bridge, with tc/netfilter rules, the packets
>could also be routed to other interface instead of bond0.

	True, and, at least in the openstack OVN/OVS deployments I'm
familiar with, heavy use of openflow rules is the usual configuration.
Those deployments also make use of tc rules for various purposes.

>So the rt interface checking in bond_arp_send_all() is not always correct.
>Stanislas suggested adding a new parameter like 'arp monitor source interface'
>to binding that the user could supply. Then we can do like
>	If (rt->dst.dev == arp_src_iface->dev)
>		goto found;
>
>What do you think?

	A single "arp_src_iface" parameter won't scale if there are
multiple ARP targets, as each target might need a different
"arp_src_iface."

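The scaling concern can be illustrated with a small userspace sketch. It
is a toy model under stated assumptions: struct toy_target, the target
addresses and the idea that a second target is routed via br-int are all
hypothetical, and the proposed check is reduced to a name comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical ARP target annotated with the device its route resolves to. */
struct toy_target {
	const char *ip;
	const char *route_dev;
};

/* The proposed check, reduced to comparing device names. */
static bool toy_src_iface_matches(const struct toy_target *t,
				  const char *arp_src_iface)
{
	return strcmp(t->route_dev, arp_src_iface) == 0;
}

/* Count how many targets a single arp_src_iface value would accept. */
static size_t toy_count_accepted(const struct toy_target *targets, size_t n,
				 const char *arp_src_iface)
{
	size_t accepted = 0;

	for (size_t i = 0; i < n; i++) {
		if (toy_src_iface_matches(&targets[i], arp_src_iface))
			accepted++;
	}
	return accepted;
}

/* Two made-up targets whose routes point at different devices. */
static const struct toy_target toy_targets[] = {
	{ "192.168.1.254", "br-ex" },
	{ "10.0.0.254", "br-int" },
};

/* Self-check: either choice of arp_src_iface covers only one target. */
static void toy_selfcheck(void)
{
	size_t n = sizeof(toy_targets) / sizeof(toy_targets[0]);

	assert(toy_count_accepted(toy_targets, n, "br-ex") == 1);
	assert(toy_count_accepted(toy_targets, n, "br-int") == 1);
}
```

With one parameter, at most one of the two targets passes the check, so
the other would never be probed.
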
	Also, the original purpose of bond_verify_device_path() is to
return VLAN tags in the device stack so that the ARP will be properly
tagged.

	I think what you're really asking for is a "I know what I'm
doing" option to bypass the checks in bond_arp_send_all().  That would
also skip the VLAN tag search, so it's not necessarily a perfect
solution.
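
The trade-off can be sketched in userspace as follows. This is a toy model,
not the bonding driver: struct toy_dev, struct toy_result, toy_check(), the
"bypass" flag and the "bond0.100" VLAN device are hypothetical, and each
device has a single upper link for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_TAGS 2

struct toy_dev {
	const char *name;
	unsigned short vlan_id;		/* 0 means "not a VLAN device" */
	struct toy_dev *upper;		/* single upper link, for brevity */
};

struct toy_result {
	bool send;				/* would a probe be sent? */
	size_t ntags;				/* number of VLAN tags found */
	unsigned short tags[TOY_MAX_TAGS];	/* tags in walk order */
};

/* Walk upward from the bond, collecting VLAN IDs until route_dev is found. */
static struct toy_result toy_check(const struct toy_dev *bond,
				   const struct toy_dev *route_dev,
				   bool bypass)
{
	struct toy_result r = { 0 };

	if (bypass) {
		r.send = true;		/* no walk happens, so no tags */
		return r;
	}
	for (const struct toy_dev *d = bond; d; d = d->upper) {
		if (d->vlan_id && r.ntags < TOY_MAX_TAGS)
			r.tags[r.ntags++] = d->vlan_id;
		if (d == route_dev) {
			r.send = true;
			return r;
		}
	}
	r.ntags = 0;			/* no path: discard partial tags */
	return r;
}

/* VLAN on top of the bond: the route points at bond0.100. */
static struct toy_result toy_vlan_case(bool bypass)
{
	struct toy_dev vlan100 = { .name = "bond0.100", .vlan_id = 100 };
	struct toy_dev bond0 = { .name = "bond0", .upper = &vlan100 };

	return toy_check(&bond0, &vlan100, bypass);
}

/* OVS case: the route points at br-ex, which is not above bond0. */
static struct toy_result toy_ovs_case(bool bypass)
{
	struct toy_dev ovs_system = { .name = "ovs-system" };
	struct toy_dev br_ex = { .name = "br-ex" };
	struct toy_dev bond0 = { .name = "bond0", .upper = &ovs_system };

	return toy_check(&bond0, &br_ex, bypass);
}

static void toy_selfcheck(void)
{
	assert(toy_vlan_case(false).ntags == 1);	/* tag 100 is found */
	assert(toy_vlan_case(true).ntags == 0);		/* tag is lost */
	assert(!toy_ovs_case(false).send);		/* probe suppressed */
	assert(toy_ovs_case(true).send);		/* probe sent untagged */
}
```

In this model the bypass makes the OVS case send a probe, but the same flag
applied to the VLAN case drops the tag that the walk would have found.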

	Before considering such a change, I'd like to know why OVS + BFD
over a kernel bond attached to the OVS bridge is unsuitable for your use
case, as that's a common configuration I've seen with OVS.

	-J

---
	-Jay Vosburgh, jv@jvosburgh.net



* Re: [Discuss] ARP monitor for OVS bridge over bonding
  2024-09-12 16:36 ` Jay Vosburgh
@ 2024-09-14 10:01   ` Hangbin Liu
  2024-09-17  9:10   ` Adrián Moreno
  1 sibling, 0 replies; 4+ messages in thread
From: Hangbin Liu @ 2024-09-14 10:01 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: netdev, Andy Gospodarek, David S . Miller, Jakub Kicinski,
	Paolo Abeni, Eric Dumazet, Nikolay Aleksandrov, Simon Horman,
	Aaron Conole, Ilya Maximets, Adrian Moreno, Stanislas Faye

On Thu, Sep 12, 2024 at 09:36:13AM -0700, Jay Vosburgh wrote:
> >
> >The br-ex is not upper link of bond0. ovs-system, instead, is the master
> >of bond0. This make us unable to make sure the br-ex and bond0 is in the
> >same datapath.
> 
> 	I'm guessing that this is in the context of an openstack
> deployment, as "br-ex" and "br-int" are names commonly chosen for the
> OVS bridges in openstack.

It's on an OCP (OpenShift Container Platform) cluster built with OVN-Kubernetes.
> 
> 	But, yes, OVS bridge configuration is very different from the
> linux bridge, and the ARP monitor was not designed with OVS in mind.
> 
> 	I'll also point out that OVS has its own bonding, although it
> does not implement functionality equivalent to the ARP monitor.
> 
> 	However, OVS does provide an implementation of RFC 5880 BFD
> (Bidirectional Forwarding Detection).  The openstack deployments that
> I'm familiar with typically use the kernel bonding in LACP mode along
> with BFD.  Is there a reason that OVS + BFD is unsuitable for your
> purposes?

LACP needs switch configuration, while the ARP monitor doesn't need any.

> 	A single "arp_src_iface" parameter won't scale if there are
> multiple ARP targets, as each target might need a different
> "arp_src_iface."
> 
> 	Also, the original purpose of bond_verify_device_path() is to
> return VLAN tags in the device stack so that the ARP will be properly
> tagged.

Ah, yes, makes sense.

> 
> 	I think what you're really asking for is a "I know what I'm
> doing" option to bypass the checks in bond_arp_send_all().  That would
> also skip the VLAN tag search, so it's not necessarily a perfect
> solution.

Yes.
 
> 	Before considering such a change, I'd like to know why OVS + BFD
> over a kernel bond attached to the OVS bridge is unsuitable for your use
> case, as that's a common configuration I've seen with OVS.

As mentioned above, this needs switch configuration.

Thanks
Hangbin


* Re: [Discuss] ARP monitor for OVS bridge over bonding
  2024-09-12 16:36 ` Jay Vosburgh
  2024-09-14 10:01   ` Hangbin Liu
@ 2024-09-17  9:10   ` Adrián Moreno
  1 sibling, 0 replies; 4+ messages in thread
From: Adrián Moreno @ 2024-09-17  9:10 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: Hangbin Liu, netdev, Andy Gospodarek, David S . Miller,
	Jakub Kicinski, Paolo Abeni, Eric Dumazet, Nikolay Aleksandrov,
	Simon Horman, Aaron Conole, Ilya Maximets, Stanislas Faye

On Thu, Sep 12, 2024 at 09:36:13AM GMT, Jay Vosburgh wrote:
> Hangbin Liu <liuhangbin@gmail.com> wrote:
>
> >Hi all,
> >
> >Recently, our customer got an issue with OVS bridge over bonding. e.g.
> >
> >  eth0      eth1
> >   |         |
> >   -- bond0 --
> >        |
> >      br-ex (ovs-vsctl add-port br-ex bond0; ip addr add 192.168.1.1/24 dev br-ex)
> >
> >
> >Before sending arp message for bond slave detecting, the bond need to check
> >if the br-ex is in the same data path with bond0 via function
> >bond_verify_device_path(), which using netdev_for_each_upper_dev_rcu()
> >to check all upper devices. This works with normal bridge. But with ovs
> >bridge, the upper device is "ovs-system" instead of br-ex.
> >
> >After talking with OVS developers. It turned out the real upper OVS topology
> >is looks like
> >
> >              --------------------------------
> >              |                              |
> >  br-ex  -----+--      ovs-system            |
> >              |                              |
> >  br-int -----+--                            |
> >              |                              |
> >              |    bond0    eth2   veth42    |
> >              |      |       |       |       |
> >              |      |       |       |       |
> >              -------+-------+-------+--------
> >                     |       |       |
> >                  +--+--+  physical  |
> >                  |     |    link    |
> >                eth0  eth1          veth43
> >
> >The br-ex is not upper link of bond0. ovs-system, instead, is the master
> >of bond0. This make us unable to make sure the br-ex and bond0 is in the
> >same datapath.
>
> 	I'm guessing that this is in the context of an openstack
> deployment, as "br-ex" and "br-int" are names commonly chosen for the
> OVS bridges in openstack.
>
> 	But, yes, OVS bridge configuration is very different from the
> linux bridge, and the ARP monitor was not designed with OVS in mind.
>
> 	I'll also point out that OVS has its own bonding, although it
> does not implement functionality equivalent to the ARP monitor.
>
> 	However, OVS does provide an implementation of RFC 5880 BFD
> (Bidirectional Forwarding Detection).  The openstack deployments that
> I'm familiar with typically use the kernel bonding in LACP mode along
> with BFD.  Is there a reason that OVS + BFD is unsuitable for your
> purposes?
>
> >On the other hand, as Adrián Moreno said, the packets generated on br-ex
> >could be routed anywhere using OpenFlow rules (including eth2 in the
> >diagram). The same with normal bridge, with tc/netfilter rules, the packets
> >could also be routed to other interface instead of bond0.
>
> 	True, and, at least in the openstack OVN/OVS deployments I'm
> familiar with, heavy use of openflow rules is the usual configuration.
> Those deployments also make use of tc rules for various purposes.
>
> >So the rt interface checking in bond_arp_send_all() is not always correct.
> >Stanislas suggested adding a new parameter like 'arp monitor source interface'
> >to binding that the user could supply. Then we can do like
> >	If (rt->dst.dev == arp_src_iface->dev)
> >		goto found;
> >
> >What do you think?
>
> 	A single "arp_src_iface" parameter won't scale if there are
> multiple ARP targets, as each target might need a different
> "arp_src_iface."
>
> 	Also, the original purpose of bond_verify_device_path() is to
> return VLAN tags in the device stack so that the ARP will be properly
> tagged.
>
> 	I think what you're really asking for is a "I know what I'm
> doing" option to bypass the checks in bond_arp_send_all().  That would
> also skip the VLAN tag search, so it's not necessarily a perfect
> solution.

I agree this is a better approach than "arp_src_iface" and that it's
still not perfect. For OVS bridges, VLAN information is in userspace
so we don't have a good way of retrieving it.

Also, this flag would apply to all ARP targets, although I cannot think
of any topology that would require monitoring addresses on both OVS and
non-OVS interfaces.

Another possible approach would be to internally encode which interface
types honor the "stacking is datapath" assumption. I also dislike
this, given the flexibility that netfilter and eBPF (and OpenFlow, for that
matter) have to create virtual datapaths independent of interface
stacking, even on bridges.

Thanks.
Adrián

>
> 	Before considering such a change, I'd like to know why OVS + BFD
> over a kernel bond attached to the OVS bridge is unsuitable for your use
> case, as that's a common configuration I've seen with OVS.
>
> 	-J
>
> ---
> 	-Jay Vosburgh, jv@jvosburgh.net
>

