netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next v1 0/1] bonding: Adding limited support for ARP monitoring with ovs
@ 2025-04-11 17:48 David J Wilder
  2025-04-11 17:48 ` [PATCH net-next v1 1/1] bonding: Adding limmited " David J Wilder
  0 siblings, 1 reply; 13+ messages in thread
From: David J Wilder @ 2025-04-11 17:48 UTC (permalink / raw)
  To: netdev; +Cc: jv, wilder, pradeeps, pradeep

Configurations with the bonding driver and openvswitch are unable to use the
bond's "ARP Monitor" feature. If an ovs bridge sits above the bonding driver
use of ARP Monitoring results in the bond flapping between slaves.
bond_verify_device_path() gathers all vlan tags between the bond and the
interface the arp is to be routed by walking the list of adjacent net_device's.
When OVS is in the stack, this process breaks since ports on OVS bridge are not
linked as they are with other configurations.
   
This patch adds limited support for the ARP Monitoring feature when OVS is
configured above the bond. When no vlan tags are configured or when the tag
is added between the bond interface and the OVS bridge arp monitoring will
function correctly. The use of tags between the OVS bridge and the routed
interfaces are not supported.
   
For example:
bond0 -> bond0.100 -> ovs-br -> ovs-port (x.x.x.x) is supported.
bond0 -> ovs-br -> ovs-port -> ovs-port.100 (x.x.x.x) is not supported.
   
We recognize that some other advance network configurations (with-out
OVS) may encounter the same issue. This is not an attempt to provide a
generic solution, it will provide a solution for known use cases with OVS
and the bonding driver as used by OpenShift with OVN-Kubernetes.

OVS bonding with BFD was evaluated as a possible solution. There are some 
limitations to adopting it.

In our environment the hypervisor manages SR-IOV interfaces. OVS bonding
requires that all slave interfaces have promiscuous mode enabled. However, for
promiscuous mode to function the hypervisor must also enable promiscuous mode
on the VF. Unfortunately, the hypervisor allows only a single VF per PF to have
promiscuous mode enabled.

This is a real customer problem, and they have expressed a strong desire to
continue to use the bonding driver to maintain backward compatibility with
their existing setup.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH net-next v1 1/1] bonding: Adding limmited support for ARP monitoring with ovs.
  2025-04-11 17:48 [PATCH net-next v1 0/1] bonding: Adding limited support for ARP monitoring with ovs David J Wilder
@ 2025-04-11 17:48 ` David J Wilder
  2025-04-11 23:57   ` Jay Vosburgh
  2025-04-15 13:58   ` Paolo Abeni
  0 siblings, 2 replies; 13+ messages in thread
From: David J Wilder @ 2025-04-11 17:48 UTC (permalink / raw)
  To: netdev; +Cc: jv, wilder, pradeeps, pradeep

Adding limited support for the ARP Monitoring feature when ovs is
configured above the bond. When no vlan tags are used in the configuration
or when the tag is added between the bond interface and the ovs bridge arp
monitoring will function correctly. The use of tags between the ovs bridge
and the routed interface are not supported.

For example:
1) bond0 -> ovs-br -> ovs-port (x.x.x.x) is supported
2) bond0 -> bond0.100 -> ovs-br -> ovs-port (x.x.x.x) is supported.
3) bond0 -> ovs-br -> ovs-port -> ovs-port.100 (x.x.x.x) is not supported.

Configurations #1 and #2 were tested and verified to function corectly.
In the second configuration the correct vlan tags were seen in the arp.

Signed-off-by: David J Wilder <wilder@us.ibm.com>
Signed-off-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
---
 drivers/net/bonding/bond_main.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 950d8e4d86f8..6f71a567ba37 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3105,7 +3105,13 @@ struct bond_vlan_tag *bond_verify_device_path(struct net_device *start_dev,
 	struct net_device *upper;
 	struct list_head  *iter;
 
-	if (start_dev == end_dev) {
+	/* If start_dev is an OVS port then we have encountered an openVswitch
+	 * bridge and can't go any further. The programming of the switch table
+	 * will determine what packets will be sent to the bond. We can make no
+	 * further assumptions about the network above the bond.
+	 */
+
+	if (start_dev == end_dev || netif_is_ovs_port(start_dev)) {
 		tags = kcalloc(level + 1, sizeof(*tags), GFP_ATOMIC);
 		if (!tags)
 			return ERR_PTR(-ENOMEM);
-- 
2.43.5


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next v1 1/1] bonding: Adding limmited support for ARP monitoring with ovs.
  2025-04-11 17:48 ` [PATCH net-next v1 1/1] bonding: Adding limmited " David J Wilder
@ 2025-04-11 23:57   ` Jay Vosburgh
  2025-04-12  0:29     ` Ilya Maximets
  2025-04-15 13:58   ` Paolo Abeni
  1 sibling, 1 reply; 13+ messages in thread
From: Jay Vosburgh @ 2025-04-11 23:57 UTC (permalink / raw)
  To: David J Wilder; +Cc: netdev, pradeeps, pradeep

David J Wilder <wilder@us.ibm.com> wrote:

>Adding limited support for the ARP Monitoring feature when ovs is
>configured above the bond. When no vlan tags are used in the configuration
>or when the tag is added between the bond interface and the ovs bridge arp
>monitoring will function correctly. The use of tags between the ovs bridge
>and the routed interface are not supported.

	Looking at the patch, it isn't really "adding support," but
rather is disabling the "is this IP address configured above the bond"
checks if the bond is a member of an OVS bridge.  It also seems like it
would permit any ARP IP target, as long as the address is configured
somewhere on the system.

	Stated another way, the route lookup in bond_arp_send_all() for
the target IP address must return a device, but the logic to match that
device to the interface stack above the bond will always succeed if the
bond is a member of any OVS bridge.

	For example, given:
	
[ eth0, eth1 ] -> bond0 -> ovs-br -> ovs-port IP=10.0.0.1
eth2 IP=20.0.0.2

	Configuring arp_ip_target=20.0.0.2 on bond0 would apparently
succeed after this patch is applied, and the bond would send ARPs for
20.0.0.2.

>For example:
>1) bond0 -> ovs-br -> ovs-port (x.x.x.x) is supported
>2) bond0 -> bond0.100 -> ovs-br -> ovs-port (x.x.x.x) is supported.
>3) bond0 -> ovs-br -> ovs-port -> ovs-port.100 (x.x.x.x) is not supported.
>
>Configurations #1 and #2 were tested and verified to function corectly.
>In the second configuration the correct vlan tags were seen in the arp.

	Assuming that I'm understanding the behavior correctly, I'm not
sure that "if OVS then do whatever" is the right way to go, particularly
since this would still exhibit mysterious failures if a VLAN is
configured within OVS itself (case 3, above).

	I understand that the architecture of OVS limits the ability to
do these sorts of checks, but I'm unconvinced that implementing this
support halfway is going to create more issues than it solves.

	Lastly, thinking out loud here, I'm generally loathe to add more
options to bonding, but I'm debating whether this would be worth an
"ovs-is-a-black-box" option somewhere, so that users would have to
opt-in to the OVS alternate realm.

	-J

>Signed-off-by: David J Wilder <wilder@us.ibm.com>
>Signed-off-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
>---
> drivers/net/bonding/bond_main.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 950d8e4d86f8..6f71a567ba37 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -3105,7 +3105,13 @@ struct bond_vlan_tag *bond_verify_device_path(struct net_device *start_dev,
> 	struct net_device *upper;
> 	struct list_head  *iter;
> 
>-	if (start_dev == end_dev) {
>+	/* If start_dev is an OVS port then we have encountered an openVswitch
>+	 * bridge and can't go any further. The programming of the switch table
>+	 * will determine what packets will be sent to the bond. We can make no
>+	 * further assumptions about the network above the bond.
>+	 */
>+
>+	if (start_dev == end_dev || netif_is_ovs_port(start_dev)) {
> 		tags = kcalloc(level + 1, sizeof(*tags), GFP_ATOMIC);
> 		if (!tags)
> 			return ERR_PTR(-ENOMEM);

---
	-Jay Vosburgh, jv@jvosburgh.net

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next v1 1/1] bonding: Adding limmited support for ARP monitoring with ovs.
  2025-04-11 23:57   ` Jay Vosburgh
@ 2025-04-12  0:29     ` Ilya Maximets
  2025-04-13  2:37       ` David Wilder
  0 siblings, 1 reply; 13+ messages in thread
From: Ilya Maximets @ 2025-04-12  0:29 UTC (permalink / raw)
  To: Jay Vosburgh, David J Wilder
  Cc: netdev, pradeeps, pradeep, Adrian Moreno, Hangbin Liu

On 4/12/25 1:57 AM, Jay Vosburgh wrote:
> David J Wilder <wilder@us.ibm.com> wrote:
> 
>> Adding limited support for the ARP Monitoring feature when ovs is
>> configured above the bond. When no vlan tags are used in the configuration
>> or when the tag is added between the bond interface and the ovs bridge arp
>> monitoring will function correctly. The use of tags between the ovs bridge
>> and the routed interface are not supported.
> 
> 	Looking at the patch, it isn't really "adding support," but
> rather is disabling the "is this IP address configured above the bond"
> checks if the bond is a member of an OVS bridge.  It also seems like it
> would permit any ARP IP target, as long as the address is configured
> somewhere on the system.
> 
> 	Stated another way, the route lookup in bond_arp_send_all() for
> the target IP address must return a device, but the logic to match that
> device to the interface stack above the bond will always succeed if the
> bond is a member of any OVS bridge.
> 
> 	For example, given:
> 	
> [ eth0, eth1 ] -> bond0 -> ovs-br -> ovs-port IP=10.0.0.1
> eth2 IP=20.0.0.2
> 
> 	Configuring arp_ip_target=20.0.0.2 on bond0 would apparently
> succeed after this patch is applied, and the bond would send ARPs for
> 20.0.0.2.
> 
>> For example:
>> 1) bond0 -> ovs-br -> ovs-port (x.x.x.x) is supported
>> 2) bond0 -> bond0.100 -> ovs-br -> ovs-port (x.x.x.x) is supported.
>> 3) bond0 -> ovs-br -> ovs-port -> ovs-port.100 (x.x.x.x) is not supported.
>>
>> Configurations #1 and #2 were tested and verified to function corectly.
>> In the second configuration the correct vlan tags were seen in the arp.
> 
> 	Assuming that I'm understanding the behavior correctly, I'm not
> sure that "if OVS then do whatever" is the right way to go, particularly
> since this would still exhibit mysterious failures if a VLAN is
> configured within OVS itself (case 3, above).

Note: vlan can also be pushed or removed by the OpenFlow pipeline within
openvswitch between the ovs-port and the bond0.  So, there is actually no
reliable way to detect the correct set of vlan tags that should be used.
And also, even if the IP is assigned to the ovs-port that is part of the
same OVS bridge, there is no guarantee that packets routed to that IP can
actually egress from the bond0, as the forwarding rules inside the OVS
datapath can be arbitrarily complex.

And all that is not limited to OVS even, as the cover letter mentions, TC
or nftables in the linux bridge or even eBPF or XDP programs are not that
different complexity-wise and can do most of the same things breaking the
assumptions bonding code makes.

> 
> 	I understand that the architecture of OVS limits the ability to
> do these sorts of checks, but I'm unconvinced that implementing this
> support halfway is going to create more issues than it solves.
> 
> 	Lastly, thinking out loud here, I'm generally loathe to add more
> options to bonding, but I'm debating whether this would be worth an
> "ovs-is-a-black-box" option somewhere, so that users would have to
> opt-in to the OVS alternate realm.

I agree that adding options is almost never a great solution.  But I had a
similar thought.  I don't think this option should be limited to OVS though,
as OVS is only one of the cases where the current verification logic is not
sufficient.

> 
> 	-J
> 
>> Signed-off-by: David J Wilder <wilder@us.ibm.com>
>> Signed-off-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
>> ---
>> drivers/net/bonding/bond_main.c | 8 +++++++-
>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>> index 950d8e4d86f8..6f71a567ba37 100644
>> --- a/drivers/net/bonding/bond_main.c
>> +++ b/drivers/net/bonding/bond_main.c
>> @@ -3105,7 +3105,13 @@ struct bond_vlan_tag *bond_verify_device_path(struct net_device *start_dev,
>> 	struct net_device *upper;
>> 	struct list_head  *iter;
>>
>> -	if (start_dev == end_dev) {
>> +	/* If start_dev is an OVS port then we have encountered an openVswitch

nit: Strange choice to capitalize 'V'.  It should be all lowercase or a full
'Open vSwitch' instead.

>> +	 * bridge and can't go any further. The programming of the switch table
>> +	 * will determine what packets will be sent to the bond. We can make no
>> +	 * further assumptions about the network above the bond.
>> +	 */
>> +
>> +	if (start_dev == end_dev || netif_is_ovs_port(start_dev)) {
>> 		tags = kcalloc(level + 1, sizeof(*tags), GFP_ATOMIC);
>> 		if (!tags)
>> 			return ERR_PTR(-ENOMEM);
> 
> ---
> 	-Jay Vosburgh, jv@jvosburgh.net

Best regards, Ilya Maximets.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH net-next v1 1/1] bonding: Adding limmited support for ARP monitoring with ovs.
  2025-04-12  0:29     ` Ilya Maximets
@ 2025-04-13  2:37       ` David Wilder
  2025-04-15 20:09         ` Jay Vosburgh
  0 siblings, 1 reply; 13+ messages in thread
From: David Wilder @ 2025-04-13  2:37 UTC (permalink / raw)
  To: Ilya Maximets, Jay Vosburgh
  Cc: netdev@vger.kernel.org, pradeeps@linux.vnet.ibm.com,
	Pradeep Satyanarayana, Adrian Moreno Zapata, Hangbin Liu



>>> Adding limited support for the ARP Monitoring feature when ovs is
>>> configured above the bond. When no vlan tags are used in the configuration
>>> or when the tag is added between the bond interface and the ovs bridge arp
>>> monitoring will function correctly. The use of tags between the ovs bridge
>>> and the routed interface are not supported.
>>
>>       Looking at the patch, it isn't really "adding support," but
>> rather is disabling the "is this IP address configured above the bond"
>> checks if the bond is a member of an OVS bridge.  It also seems like it
>> would permit any ARP IP target, as long as the address is configured
>> somewhere on the system.
>>
>>       Stated another way, the route lookup in bond_arp_send_all() for
>> the target IP address must return a device, but the logic to match that
>> device to the interface stack above the bond will always succeed if the
>> bond is a member of any OVS bridge.
>>
>>       For example, given:
>>
>> [ eth0, eth1 ] -> bond0 -> ovs-br -> ovs-port IP=10.0.0.1
>> eth2 IP=20.0.0.2
>>
>>       Configuring arp_ip_target=20.0.0.2 on bond0 would apparently
>> succeed after this patch is applied, and the bond would send ARPs for
>> 20.0.0.2.
>>
>>> For example:
>>> 1) bond0 -> ovs-br -> ovs-port (x.x.x.x) is supported
>>> 2) bond0 -> bond0.100 -> ovs-br -> ovs-port (x.x.x.x) is supported.
>>> 3) bond0 -> ovs-br -> ovs-port -> ovs-port.100 (x.x.x.x) is not supported.
>>>
>>> Configurations #1 and #2 were tested and verified to function corectly.
>>> In the second configuration the correct vlan tags were seen in the arp.
>>
>>       Assuming that I'm understanding the behavior correctly, I'm not
>> sure that "if OVS then do whatever" is the right way to go, particularly
>> since this would still exhibit mysterious failures if a VLAN is
>> configured within OVS itself (case 3, above).
>
> Note: vlan can also be pushed or removed by the OpenFlow pipeline within
> openvswitch between the ovs-port and the bond0.  So, there is actually no
> reliable way to detect the correct set of vlan tags that should be used.
> And also, even if the IP is assigned to the ovs-port that is part of the
> same OVS bridge, there is no guarantee that packets routed to that IP can
> actually egress from the bond0, as the forwarding rules inside the OVS
>datapath can be arbitrarily complex.
>
> And all that is not limited to OVS even, as the cover letter mentions, TC
> or nftables in the linux bridge or even eBPF or XDP programs are not that
> different complexity-wise and can do most of the same things breaking the
> assumptions bonding code makes.
>
>>
>>       I understand that the architecture of OVS limits the ability to
>> do these sorts of checks, but I'm unconvinced that implementing this
>> support halfway is going to create more issues than it solves.
>>
>>       Lastly, thinking out loud here, I'm generally loathe to add more
>> options to bonding, but I'm debating whether this would be worth an
>> "ovs-is-a-black-box" option somewhere, so that users would have to
>> opt-in to the OVS alternate realm.

> I agree that adding options is almost never a great solution.  But I had a
> similar thought.  I don't think this option should be limited to OVS though,
>as OVS is only one of the cases where the current verification logic is not
>sufficient.
>

What if we build on the arp_ip_target setting.  Allow for a list of vlan tags
 to be appended to each target. Something like: arp_ip_target=x.x.x.x[vlan,vlan,...].
 If a list of tags is omitted it works as before, if a list is supplied assume we know what were doing
 and use that instead of calling bond_verify_device_path(). An empty list would be valid.

>>
>>       -J
>>
>>> Signed-off-by: David J Wilder <wilder@us.ibm.com>
>>> Signed-off-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
>>> ---
>>> drivers/net/bonding/bond_main.c | 8 +++++++-
>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>>> index 950d8e4d86f8..6f71a567ba37 100644
>>> --- a/drivers/net/bonding/bond_main.c
>>> +++ b/drivers/net/bonding/bond_main.c
>>> @@ -3105,7 +3105,13 @@ struct bond_vlan_tag *bond_verify_device_path(struct net_device *start_dev,
>>>      struct net_device *upper;
>>>      struct list_head  *iter;
>>>
>>> -    if (start_dev == end_dev) {
>>> +    /* If start_dev is an OVS port then we have encountered an openVswitch
>
> nit: Strange choice to capitalize 'V'.  It should be all lowercase or a full
> 'Open vSwitch' instead.

>>> +     * bridge and can't go any further. The programming of the switch table
>>> +     * will determine what packets will be sent to the bond. We can make no
>>> +     * further assumptions about the network above the bond.
>>> +     */
>>> +
>>> +    if (start_dev == end_dev || netif_is_ovs_port(start_dev)) {
>>>              tags = kcalloc(level + 1, sizeof(*tags), GFP_ATOMIC);
>>>              if (!tags)
>>>                      return ERR_PTR(-ENOMEM);
>>
>> ---
>>       -Jay Vosburgh, jv@jvosburgh.net
>
> Best regards, Ilya Maximets.

David Wilder (wilder@us.ibm.com)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next v1 1/1] bonding: Adding limmited support for ARP monitoring with ovs.
  2025-04-11 17:48 ` [PATCH net-next v1 1/1] bonding: Adding limmited " David J Wilder
  2025-04-11 23:57   ` Jay Vosburgh
@ 2025-04-15 13:58   ` Paolo Abeni
  1 sibling, 0 replies; 13+ messages in thread
From: Paolo Abeni @ 2025-04-15 13:58 UTC (permalink / raw)
  To: David J Wilder, netdev; +Cc: jv, pradeeps, pradeep

On 4/11/25 7:48 PM, David J Wilder wrote:
> Adding limited support for the ARP Monitoring feature when ovs is
> configured above the bond. When no vlan tags are used in the configuration
> or when the tag is added between the bond interface and the ovs bridge arp
> monitoring will function correctly. The use of tags between the ovs bridge
> and the routed interface are not supported.
> 
> For example:
> 1) bond0 -> ovs-br -> ovs-port (x.x.x.x) is supported
> 2) bond0 -> bond0.100 -> ovs-br -> ovs-port (x.x.x.x) is supported.
> 3) bond0 -> ovs-br -> ovs-port -> ovs-port.100 (x.x.x.x) is not supported.
> 
> Configurations #1 and #2 were tested and verified to function corectly.
> In the second configuration the correct vlan tags were seen in the arp.

If you are adding support for new setup types, you should additionally
implement some test-cases to cover them.

Cheers,

Paolo


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next v1 1/1] bonding: Adding limmited support for ARP monitoring with ovs.
  2025-04-13  2:37       ` David Wilder
@ 2025-04-15 20:09         ` Jay Vosburgh
  2025-04-15 22:13           ` David Wilder
  0 siblings, 1 reply; 13+ messages in thread
From: Jay Vosburgh @ 2025-04-15 20:09 UTC (permalink / raw)
  To: David Wilder
  Cc: Ilya Maximets, netdev@vger.kernel.org,
	pradeeps@linux.vnet.ibm.com, Pradeep Satyanarayana,
	Adrian Moreno Zapata, Hangbin Liu

David Wilder <wilder@us.ibm.com> wrote:

>
>
>>>> Adding limited support for the ARP Monitoring feature when ovs is
>>>> configured above the bond. When no vlan tags are used in the configuration
>>>> or when the tag is added between the bond interface and the ovs bridge arp
>>>> monitoring will function correctly. The use of tags between the ovs bridge
>>>> and the routed interface are not supported.
>>>
>>>       Looking at the patch, it isn't really "adding support," but
>>> rather is disabling the "is this IP address configured above the bond"
>>> checks if the bond is a member of an OVS bridge.  It also seems like it
>>> would permit any ARP IP target, as long as the address is configured
>>> somewhere on the system.
>>>
>>>       Stated another way, the route lookup in bond_arp_send_all() for
>>> the target IP address must return a device, but the logic to match that
>>> device to the interface stack above the bond will always succeed if the
>>> bond is a member of any OVS bridge.
>>>
>>>       For example, given:
>>>
>>> [ eth0, eth1 ] -> bond0 -> ovs-br -> ovs-port IP=10.0.0.1
>>> eth2 IP=20.0.0.2
>>>
>>>       Configuring arp_ip_target=20.0.0.2 on bond0 would apparently
>>> succeed after this patch is applied, and the bond would send ARPs for
>>> 20.0.0.2.
>>>
>>>> For example:
>>>> 1) bond0 -> ovs-br -> ovs-port (x.x.x.x) is supported
>>>> 2) bond0 -> bond0.100 -> ovs-br -> ovs-port (x.x.x.x) is supported.
>>>> 3) bond0 -> ovs-br -> ovs-port -> ovs-port.100 (x.x.x.x) is not supported.
>>>>
>>>> Configurations #1 and #2 were tested and verified to function corectly.
>>>> In the second configuration the correct vlan tags were seen in the arp.
>>>
>>>       Assuming that I'm understanding the behavior correctly, I'm not
>>> sure that "if OVS then do whatever" is the right way to go, particularly
>>> since this would still exhibit mysterious failures if a VLAN is
>>> configured within OVS itself (case 3, above).
>>
>> Note: vlan can also be pushed or removed by the OpenFlow pipeline within
>> openvswitch between the ovs-port and the bond0.  So, there is actually no
>> reliable way to detect the correct set of vlan tags that should be used.
>> And also, even if the IP is assigned to the ovs-port that is part of the
>> same OVS bridge, there is no guarantee that packets routed to that IP can
>> actually egress from the bond0, as the forwarding rules inside the OVS
>>datapath can be arbitrarily complex.
>>
>> And all that is not limited to OVS even, as the cover letter mentions, TC
>> or nftables in the linux bridge or even eBPF or XDP programs are not that
>> different complexity-wise and can do most of the same things breaking the
>> assumptions bonding code makes.
>>
>>>
>>>       I understand that the architecture of OVS limits the ability to
>>> do these sorts of checks, but I'm unconvinced that implementing this
>>> support halfway is going to create more issues than it solves.

	Re-reading my comment, I clearly meant "isn't going to create
more issues" here.

>>>       Lastly, thinking out loud here, I'm generally loathe to add more
>>> options to bonding, but I'm debating whether this would be worth an
>>> "ovs-is-a-black-box" option somewhere, so that users would have to
>>> opt-in to the OVS alternate realm.
>
>> I agree that adding options is almost never a great solution.  But I had a
>> similar thought.  I don't think this option should be limited to OVS though,
>>as OVS is only one of the cases where the current verification logic is not
>>sufficient.

	Agreed; I wasn't really thinking about the not-OVS cases when I
wrote that, but whatever is changed, if anything, should be generic.

>What if we build on the arp_ip_target setting.  Allow for a list of vlan tags
> to be appended to each target. Something like: arp_ip_target=x.x.x.x[vlan,vlan,...].
> If a list of tags is omitted it works as before, if a list is supplied assume we know what were doing
> and use that instead of calling bond_verify_device_path(). An empty list would be valid.

	Hmm, that's ... not too bad; I was thinking more along the lines
of a "skip the checks" option, but this may be a much cleaner way to do
it.

	That said, are you thinking that bonding would add the VLAN
tags, or that some third party would add them?  I.e., are you trying to
accomodate the case wherein OVS, tc, or whatever, is adding a tag?

	-J

>>>
>>>       -J
>>>
>>>> Signed-off-by: David J Wilder <wilder@us.ibm.com>
>>>> Signed-off-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
>>>> ---
>>>> drivers/net/bonding/bond_main.c | 8 +++++++-
>>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>>>> index 950d8e4d86f8..6f71a567ba37 100644
>>>> --- a/drivers/net/bonding/bond_main.c
>>>> +++ b/drivers/net/bonding/bond_main.c
>>>> @@ -3105,7 +3105,13 @@ struct bond_vlan_tag *bond_verify_device_path(struct net_device *start_dev,
>>>>      struct net_device *upper;
>>>>      struct list_head  *iter;
>>>>
>>>> -    if (start_dev == end_dev) {
>>>> +    /* If start_dev is an OVS port then we have encountered an openVswitch
>>
>> nit: Strange choice to capitalize 'V'.  It should be all lowercase or a full
>> 'Open vSwitch' instead.
>
>>>> +     * bridge and can't go any further. The programming of the switch table
>>>> +     * will determine what packets will be sent to the bond. We can make no
>>>> +     * further assumptions about the network above the bond.
>>>> +     */
>>>> +
>>>> +    if (start_dev == end_dev || netif_is_ovs_port(start_dev)) {
>>>>              tags = kcalloc(level + 1, sizeof(*tags), GFP_ATOMIC);
>>>>              if (!tags)
>>>>                      return ERR_PTR(-ENOMEM);
>>>
>>> ---
>>>       -Jay Vosburgh, jv@jvosburgh.net
>>
>> Best regards, Ilya Maximets.
>
>David Wilder (wilder@us.ibm.com)

---
	-Jay Vosburgh, jv@jvosburgh.net

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH net-next v1 1/1] bonding: Adding limmited support for ARP monitoring with ovs.
  2025-04-15 20:09         ` Jay Vosburgh
@ 2025-04-15 22:13           ` David Wilder
  2025-04-16  0:35             ` Hangbin Liu
  2025-04-16  1:11             ` Jay Vosburgh
  0 siblings, 2 replies; 13+ messages in thread
From: David Wilder @ 2025-04-15 22:13 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: Ilya Maximets, netdev@vger.kernel.org,
	pradeeps@linux.vnet.ibm.com, Pradeep Satyanarayana,
	Adrian Moreno Zapata, Hangbin Liu

>>>>> Adding limited support for the ARP Monitoring feature when ovs is
>>>>> configured above the bond. When no vlan tags are used in the configuration
>>>>> or when the tag is added between the bond interface and the ovs bridge arp
>>>>> monitoring will function correctly. The use of tags between the ovs bridge
>>>>> and the routed interface are not supported.
>>>>
>>>>       Looking at the patch, it isn't really "adding support," but
>>>> rather is disabling the "is this IP address configured above the bond"
>>>> checks if the bond is a member of an OVS bridge.  It also seems like it
>>>> would permit any ARP IP target, as long as the address is configured
>>>> somewhere on the system.
>>>>
>>>>       Stated another way, the route lookup in bond_arp_send_all() for
>>>> the target IP address must return a device, but the logic to match that
>>>> device to the interface stack above the bond will always succeed if the
>>>> bond is a member of any OVS bridge.
>>>>
>>>>       For example, given:
>>>>
>>>> [ eth0, eth1 ] -> bond0 -> ovs-br -> ovs-port IP=10.0.0.1
>>>> eth2 IP=20.0.0.2
>>>>
>>>>       Configuring arp_ip_target=20.0.0.2 on bond0 would apparently
>>>> succeed after this patch is applied, and the bond would send ARPs for
>>>> 20.0.0.2.
>>>>
>>>>> For example:
>>>>> 1) bond0 -> ovs-br -> ovs-port (x.x.x.x) is supported
>>>>> 2) bond0 -> bond0.100 -> ovs-br -> ovs-port (x.x.x.x) is supported.
>>>>> 3) bond0 -> ovs-br -> ovs-port -> ovs-port.100 (x.x.x.x) is not supported.
>>>>>
>>>>> Configurations #1 and #2 were tested and verified to function corectly.
>>>>> In the second configuration the correct vlan tags were seen in the arp.
>>>>
>>>>       Assuming that I'm understanding the behavior correctly, I'm not
>>>> sure that "if OVS then do whatever" is the right way to go, particularly
>>>> since this would still exhibit mysterious failures if a VLAN is
>>>> configured within OVS itself (case 3, above).
>>>
>>> Note: vlan can also be pushed or removed by the OpenFlow pipeline within
>>> openvswitch between the ovs-port and the bond0.  So, there is actually no
>>> reliable way to detect the correct set of vlan tags that should be used.
>>> And also, even if the IP is assigned to the ovs-port that is part of the
>>> same OVS bridge, there is no guarantee that packets routed to that IP can
>>> actually egress from the bond0, as the forwarding rules inside the OVS
>>>datapath can be arbitrarily complex.
>>>
>>> And all that is not limited to OVS even, as the cover letter mentions, TC
>>> or nftables in the linux bridge or even eBPF or XDP programs are not that
>>> different complexity-wise and can do most of the same things breaking the
>>> assumptions bonding code makes.
>>>
>>>>
>>>>       I understand that the architecture of OVS limits the ability to
>>>> do these sorts of checks, but I'm unconvinced that implementing this
>>>> support halfway is going to create more issues than it solves.
>
>    Re-reading my comment, I clearly meant "isn't going to create
>    more issues" here.
>
>>>>       Lastly, thinking out loud here, I'm generally loathe to add more
>>>> options to bonding, but I'm debating whether this would be worth an
>>>> "ovs-is-a-black-box" option somewhere, so that users would have to
>>>> opt-in to the OVS alternate realm.
>>
>>> I agree that adding options is almost never a great solution.  But I had a
>>> similar thought.  I don't think this option should be limited to OVS though,
>>>as OVS is only one of the cases where the current verification logic is not
>>>sufficient.
>
>        Agreed; I wasn't really thinking about the not-OVS cases when I
>wrote that, but whatever is changed, if anything, should be generic.

>>What if we build on the arp_ip_target setting.  Allow for a list of vlan tags
>> to be appended to each target. Something like: arp_ip_target=x.x.x.x[vlan,vlan,...].
>> If a list of tags is omitted it works as before, if a list is supplied assume we know what were doing
>> and use that instead of calling bond_verify_device_path(). An empty list would be valid.

>        Hmm, that's ... not too bad; I was thinking more along the lines
>of a "skip the checks" option, but this may be a much cleaner way to do
>it.

>        That said, are you thinking that bonding would add the VLAN
>tags, or that some third party would add them?  I.e., are you trying to
>accomodate the case wherein OVS, tc, or whatever, is adding a tag?

It would be up to the administrator to add the list of tags to the arp_target list.
I am unsure how a third party could know what tags might be added by other components.
Knowing what tags to add to the list may be hard to figure out in a complicated configuration.
The upside is it should be possible to configure for any list of tags even if difficult.

A "skip the checks" option would be easier to code. If we skip the process of gathering tags
would that eliminate any configurations with any vlan tags?.
>
>        -J
>
>>>>
>>>>       -J
>>>>
>>>>> Signed-off-by: David J Wilder <wilder@us.ibm.com>
>>>>> Signed-off-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
>>>>> ---
>>>>> drivers/net/bonding/bond_main.c | 8 +++++++-
>>>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>>>>> index 950d8e4d86f8..6f71a567ba37 100644
>>>>> --- a/drivers/net/bonding/bond_main.c
>>>>> +++ b/drivers/net/bonding/bond_main.c
>>>>> @@ -3105,7 +3105,13 @@ struct bond_vlan_tag *bond_verify_device_path(struct net_device *start_dev,
>>>>>      struct net_device *upper;
>>>>>      struct list_head  *iter;
>>>>>
>>>>> -    if (start_dev == end_dev) {
>>>>> +    /* If start_dev is an OVS port then we have encountered an openVswitch
>>>
>>> nit: Strange choice to capitalize 'V'.  It should be all lowercase or a full
>>> 'Open vSwitch' instead.
>>
>>>>> +     * bridge and can't go any further. The programming of the switch table
>>>>> +     * will determine what packets will be sent to the bond. We can make no
>>>>> +     * further assumptions about the network above the bond.
>>>>> +     */
>>>>> +
>>>>> +    if (start_dev == end_dev || netif_is_ovs_port(start_dev)) {
>>>>>              tags = kcalloc(level + 1, sizeof(*tags), GFP_ATOMIC);
>>>>>              if (!tags)
>>>>>                      return ERR_PTR(-ENOMEM);
>>>>
>>>> ---
>>>>       -Jay Vosburgh, jv@jvosburgh.net
>>>
>>> Best regards, Ilya Maximets.
>>
>>David Wilder (wilder@us.ibm.com)
>
>---
>        -Jay Vosburgh, jv@jvosburgh.net

David Wilder (wilder@us.ibm.com)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next v1 1/1] bonding: Adding limmited support for ARP monitoring with ovs.
  2025-04-15 22:13           ` David Wilder
@ 2025-04-16  0:35             ` Hangbin Liu
  2025-04-16  1:11             ` Jay Vosburgh
  1 sibling, 0 replies; 13+ messages in thread
From: Hangbin Liu @ 2025-04-16  0:35 UTC (permalink / raw)
  To: David Wilder
  Cc: Jay Vosburgh, Ilya Maximets, netdev@vger.kernel.org,
	pradeeps@linux.vnet.ibm.com, Pradeep Satyanarayana,
	Adrian Moreno Zapata

On Tue, Apr 15, 2025 at 10:13:18PM +0000, David Wilder wrote:
> >>> I agree that adding options is almost never a great solution.  But I had a
> >>> similar thought.  I don't think this option should be limited to OVS though,
> >>>as OVS is only one of the cases where the current verification logic is not
> >>>sufficient.
> >
> >        Agreed; I wasn't really thinking about the not-OVS cases when I
> >wrote that, but whatever is changed, if anything, should be generic.
> 
> >>What if we build on the arp_ip_target setting.  Allow for a list of vlan tags
> >> to be appended to each target. Something like: arp_ip_target=x.x.x.x[vlan,vlan,...].
> >> If a list of tags is omitted it works as before, if a list is supplied assume we know what were doing
> >> and use that instead of calling bond_verify_device_path(). An empty list would be valid.
> 
> >        Hmm, that's ... not too bad; I was thinking more along the lines
> >of a "skip the checks" option, but this may be a much cleaner way to do
> >it.
> 
> >        That said, are you thinking that bonding would add the VLAN
> >tags, or that some third party would add them?  I.e., are you trying to
> >accomodate the case wherein OVS, tc, or whatever, is adding a tag?
> 
> It would be up to the administrator to add the list of tags to the arp_target list.
> I am unsure how a third party could know what tags might be added by other components.
> Knowing what tags to add to the list may be hard to figure out in a complicated configuration.
> The upside is it should be possible to configure for any list of tags even if difficult.
> 
> A "skip the checks" option would be easier to code. If we skip the process of gathering tags
> would that eliminate any configurations with any vlan tags?.

+1 for "skip the checks". If arp_ip_target=x.x.x.x[vlan,vlan,...] doesn't
ask bond to add vlan tags, there is no need to set the vlan in parameter.

Since OVS, tc, netfilter can add vlan or change the output ports. A "skip
the checks" setting looks more reasonable.

Thanks
Hangbin


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next v1 1/1] bonding: Adding limmited support for ARP monitoring with ovs.
  2025-04-15 22:13           ` David Wilder
  2025-04-16  0:35             ` Hangbin Liu
@ 2025-04-16  1:11             ` Jay Vosburgh
  2025-04-16 18:57               ` David Wilder
  1 sibling, 1 reply; 13+ messages in thread
From: Jay Vosburgh @ 2025-04-16  1:11 UTC (permalink / raw)
  To: David Wilder
  Cc: Ilya Maximets, netdev@vger.kernel.org,
	pradeeps@linux.vnet.ibm.com, Pradeep Satyanarayana,
	Adrian Moreno Zapata, Hangbin Liu

David Wilder <wilder@us.ibm.com> wrote:

>>>>>> Adding limited support for the ARP Monitoring feature when ovs is
>>>>>> configured above the bond. When no vlan tags are used in the configuration
>>>>>> or when the tag is added between the bond interface and the ovs bridge arp
>>>>>> monitoring will function correctly. The use of tags between the ovs bridge
>>>>>> and the routed interface are not supported.
>>>>>
>>>>>       Looking at the patch, it isn't really "adding support," but
>>>>> rather is disabling the "is this IP address configured above the bond"
>>>>> checks if the bond is a member of an OVS bridge.  It also seems like it
>>>>> would permit any ARP IP target, as long as the address is configured
>>>>> somewhere on the system.
>>>>>
>>>>>       Stated another way, the route lookup in bond_arp_send_all() for
>>>>> the target IP address must return a device, but the logic to match that
>>>>> device to the interface stack above the bond will always succeed if the
>>>>> bond is a member of any OVS bridge.
>>>>>
>>>>>       For example, given:
>>>>>
>>>>> [ eth0, eth1 ] -> bond0 -> ovs-br -> ovs-port IP=10.0.0.1
>>>>> eth2 IP=20.0.0.2
>>>>>
>>>>>       Configuring arp_ip_target=20.0.0.2 on bond0 would apparently
>>>>> succeed after this patch is applied, and the bond would send ARPs for
>>>>> 20.0.0.2.
>>>>>
>>>>>> For example:
>>>>>> 1) bond0 -> ovs-br -> ovs-port (x.x.x.x) is supported
>>>>>> 2) bond0 -> bond0.100 -> ovs-br -> ovs-port (x.x.x.x) is supported.
>>>>>> 3) bond0 -> ovs-br -> ovs-port -> ovs-port.100 (x.x.x.x) is not supported.
>>>>>>
>>>>>> Configurations #1 and #2 were tested and verified to function corectly.
>>>>>> In the second configuration the correct vlan tags were seen in the arp.
>>>>>
>>>>>       Assuming that I'm understanding the behavior correctly, I'm not
>>>>> sure that "if OVS then do whatever" is the right way to go, particularly
>>>>> since this would still exhibit mysterious failures if a VLAN is
>>>>> configured within OVS itself (case 3, above).
>>>>
>>>> Note: vlan can also be pushed or removed by the OpenFlow pipeline within
>>>> openvswitch between the ovs-port and the bond0.  So, there is actually no
>>>> reliable way to detect the correct set of vlan tags that should be used.
>>>> And also, even if the IP is assigned to the ovs-port that is part of the
>>>> same OVS bridge, there is no guarantee that packets routed to that IP can
>>>> actually egress from the bond0, as the forwarding rules inside the OVS
>>>>datapath can be arbitrarily complex.
>>>>
>>>> And all that is not limited to OVS even, as the cover letter mentions, TC
>>>> or nftables in the linux bridge or even eBPF or XDP programs are not that
>>>> different complexity-wise and can do most of the same things breaking the
>>>> assumptions bonding code makes.
>>>>
>>>>>
>>>>>       I understand that the architecture of OVS limits the ability to
>>>>> do these sorts of checks, but I'm unconvinced that implementing this
>>>>> support halfway is going to create more issues than it solves.
>>
>>    Re-reading my comment, I clearly meant "isn't going to create
>>    more issues" here.
>>
>>>>>       Lastly, thinking out loud here, I'm generally loathe to add more
>>>>> options to bonding, but I'm debating whether this would be worth an
>>>>> "ovs-is-a-black-box" option somewhere, so that users would have to
>>>>> opt-in to the OVS alternate realm.
>>>
>>>> I agree that adding options is almost never a great solution.  But I had a
>>>> similar thought.  I don't think this option should be limited to OVS though,
>>>>as OVS is only one of the cases where the current verification logic is not
>>>>sufficient.
>>
>>        Agreed; I wasn't really thinking about the not-OVS cases when I
>>wrote that, but whatever is changed, if anything, should be generic.
>
>>>What if we build on the arp_ip_target setting.  Allow for a list of vlan tags
>>> to be appended to each target. Something like: arp_ip_target=x.x.x.x[vlan,vlan,...].
>>> If a list of tags is omitted it works as before, if a list is supplied assume we know what were doing
>>> and use that instead of calling bond_verify_device_path(). An empty list would be valid.
>
>>        Hmm, that's ... not too bad; I was thinking more along the lines
>>of a "skip the checks" option, but this may be a much cleaner way to do
>>it.
>
>>        That said, are you thinking that bonding would add the VLAN
>>tags, or that some third party would add them?  I.e., are you trying to
>>accomodate the case wherein OVS, tc, or whatever, is adding a tag?
>
>It would be up to the administrator to add the list of tags to the
>arp_target list.  I am unsure how a third party could know what tags
>might be added by other components.  Knowing what tags to add to the
>list may be hard to figure out in a complicated configuration.  The
>upside is it should be possible to configure for any list of tags even
>if difficult.

	I suspect I wasn't clear in my question; what I'm asking is what
you envision for the implementation within bonding for the "[vlan,vlan]"
part, and by "third party," I mean "not bonding," so OVS, tc, etc.

	Would bonding need to add the tags in the list, or expect one of
the thiry parties to do it, or some kind of mix?

	Does bonding need to know what tags any third party adds?  I.e.,
for your case 3, above, wherein OVS adds a tag, why does that fail to
work?

>A "skip the checks" option would be easier to code. If we skip the
>process of gathering tags would that eliminate any configurations with
>any vlan tags?.

	So, yes, it would be easier to implement, and, no, I think a
simple "skip the checks" switch could be implemented such that it still
runs the device path and gathers the tags, but doesn't care if the end
of the device path matches.

	That said, such an implementation is effectively the same as
your original patch, except with an option instead of an OVS-ness check,
and I'm still undecided on whether that's better than something that
requires more specific configuration.

	-J


>>>>>> Signed-off-by: David J Wilder <wilder@us.ibm.com>
>>>>>> Signed-off-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
>>>>>> ---
>>>>>> drivers/net/bonding/bond_main.c | 8 +++++++-
>>>>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>>>>>> index 950d8e4d86f8..6f71a567ba37 100644
>>>>>> --- a/drivers/net/bonding/bond_main.c
>>>>>> +++ b/drivers/net/bonding/bond_main.c
>>>>>> @@ -3105,7 +3105,13 @@ struct bond_vlan_tag *bond_verify_device_path(struct net_device *start_dev,
>>>>>>      struct net_device *upper;
>>>>>>      struct list_head  *iter;
>>>>>>
>>>>>> -    if (start_dev == end_dev) {
>>>>>> +    /* If start_dev is an OVS port then we have encountered an openVswitch
>>>>
>>>> nit: Strange choice to capitalize 'V'.  It should be all lowercase or a full
>>>> 'Open vSwitch' instead.
>>>
>>>>>> +     * bridge and can't go any further. The programming of the switch table
>>>>>> +     * will determine what packets will be sent to the bond. We can make no
>>>>>> +     * further assumptions about the network above the bond.
>>>>>> +     */
>>>>>> +
>>>>>> +    if (start_dev == end_dev || netif_is_ovs_port(start_dev)) {
>>>>>>              tags = kcalloc(level + 1, sizeof(*tags), GFP_ATOMIC);
>>>>>>              if (!tags)
>>>>>>                      return ERR_PTR(-ENOMEM);
>>>>>
>>>>> ---
>>>>>       -Jay Vosburgh, jv@jvosburgh.net
>>>>
>>>> Best regards, Ilya Maximets.
>>>
>>>David Wilder (wilder@us.ibm.com)
>>
>>---
>>        -Jay Vosburgh, jv@jvosburgh.net
>
>David Wilder (wilder@us.ibm.com)

---
	-Jay Vosburgh, jv@jvosburgh.net

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH net-next v1 1/1] bonding: Adding limmited support for ARP monitoring with ovs.
  2025-04-16  1:11             ` Jay Vosburgh
@ 2025-04-16 18:57               ` David Wilder
  2025-04-18  3:59                 ` Jay Vosburgh
  0 siblings, 1 reply; 13+ messages in thread
From: David Wilder @ 2025-04-16 18:57 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: Ilya Maximets, netdev@vger.kernel.org,
	pradeeps@linux.vnet.ibm.com, Pradeep Satyanarayana,
	Adrian Moreno Zapata, Hangbin Liu

>>>>>>> Adding limited support for the ARP Monitoring feature when ovs is
>>>>>>> configured above the bond. When no vlan tags are used in the configuration
>>>>>>> or when the tag is added between the bond interface and the ovs bridge arp
>>>>>>> monitoring will function correctly. The use of tags between the ovs bridge
>>>>>>> and the routed interface are not supported.
>>>>>>
>>>>>>       Looking at the patch, it isn't really "adding support," but
>>>>>> rather is disabling the "is this IP address configured above the bond"
>>>>>> checks if the bond is a member of an OVS bridge.  It also seems like it
>>>>>> would permit any ARP IP target, as long as the address is configured
>>>>>> somewhere on the system.
>>>>>>
>>>>>>       Stated another way, the route lookup in bond_arp_send_all() for
>>>>>> the target IP address must return a device, but the logic to match that
>>>>>> device to the interface stack above the bond will always succeed if the
>>>>>> bond is a member of any OVS bridge.
>>>>>>
>>>>>>       For example, given:
>>>>>>
>>>>>> [ eth0, eth1 ] -> bond0 -> ovs-br -> ovs-port IP=10.0.0.1
>>>>>> eth2 IP=20.0.0.2
>>>>>>
>>>>>>       Configuring arp_ip_target=20.0.0.2 on bond0 would apparently
>>>>>> succeed after this patch is applied, and the bond would send ARPs for
>>>>>> 20.0.0.2.
>>>>>>
>>>>>>> For example:
>>>>>>> 1) bond0 -> ovs-br -> ovs-port (x.x.x.x) is supported
>>>>>>> 2) bond0 -> bond0.100 -> ovs-br -> ovs-port (x.x.x.x) is supported.
>>>>>>> 3) bond0 -> ovs-br -> ovs-port -> ovs-port.100 (x.x.x.x) is not supported.
>>>>>>>
>>>>>>> Configurations #1 and #2 were tested and verified to function corectly.
>>>>>>> In the second configuration the correct vlan tags were seen in the arp.
>>>>>>
>>>>>>       Assuming that I'm understanding the behavior correctly, I'm not
>>>>>> sure that "if OVS then do whatever" is the right way to go, particularly
>>>>>> since this would still exhibit mysterious failures if a VLAN is
>>>>>> configured within OVS itself (case 3, above).
>>>>>
>>>>> Note: vlan can also be pushed or removed by the OpenFlow pipeline within
>>>>> openvswitch between the ovs-port and the bond0.  So, there is actually no
>>>>> reliable way to detect the correct set of vlan tags that should be used.
>>>>> And also, even if the IP is assigned to the ovs-port that is part of the
>>>>> same OVS bridge, there is no guarantee that packets routed to that IP can
>>>>> actually egress from the bond0, as the forwarding rules inside the OVS
>>>>>datapath can be arbitrarily complex.
>>>>>
>>>>> And all that is not limited to OVS even, as the cover letter mentions, TC
>>>>> or nftables in the linux bridge or even eBPF or XDP programs are not that
>>>>> different complexity-wise and can do most of the same things breaking the
>>>>> assumptions bonding code makes.
>>>>>
>>>>>>
>>>>>>       I understand that the architecture of OVS limits the ability to
>>>>>> do these sorts of checks, but I'm unconvinced that implementing this
>>>>>> support halfway is going to create more issues than it solves.
>>>
>>>    Re-reading my comment, I clearly meant "isn't going to create
>>>    more issues" here.
>>>
>>>>>>       Lastly, thinking out loud here, I'm generally loathe to add more
>>>>>> options to bonding, but I'm debating whether this would be worth an
>>>>>> "ovs-is-a-black-box" option somewhere, so that users would have to
>>>>>> opt-in to the OVS alternate realm.
>>>>
>>>>> I agree that adding options is almost never a great solution.  But I had a
>>>>> similar thought.  I don't think this option should be limited to OVS though,
>>>>>as OVS is only one of the cases where the current verification logic is not
>>>>>sufficient.
>>>
>>>        Agreed; I wasn't really thinking about the not-OVS cases when I
>>>wrote that, but whatever is changed, if anything, should be generic.
>>
>>>>What if we build on the arp_ip_target setting.  Allow for a list of vlan tags
>>>> to be appended to each target. Something like: arp_ip_target=x.x.x.x[vlan,vlan,...].
>>>> If a list of tags is omitted it works as before, if a list is supplied assume we know what were doing
>>>> and use that instead of calling bond_verify_device_path(). An empty list would be valid.
>>
>>>        Hmm, that's ... not too bad; I was thinking more along the lines
>>>of a "skip the checks" option, but this may be a much cleaner way to do
>>>it.
>>
>>>        That said, are you thinking that bonding would add the VLAN
>>>tags, or that some third party would add them?  I.e., are you trying to
>>>accomodate the case wherein OVS, tc, or whatever, is adding a tag?
>>
>>It would be up to the administrator to add the list of tags to the
>>arp_target list.  I am unsure how a third party could know what tags
>>might be added by other components.  Knowing what tags to add to the
>>list may be hard to figure out in a complicated configuration.  The
>>upside is it should be possible to configure for any list of tags even
>>if difficult.
>
>  I suspect I wasn't clear in my question; what I'm asking is what
>you envision for the implementation within bonding for the "[vlan,vlan]"
>part, and by "third party," I mean "not bonding," so OVS, tc, etc.
>
>  Would bonding need to add the tags in the list, or expect one of
>the thiry parties to do it, or some kind of mix?

Bonding needs to add all the tags. I am just optionally replacing
the collection of tags by bond_verify_device_path() with a list of tags
supplied in the arp_target list. (Optional Per target).

To be clear, if you chose to supply tags manually, you need to supply
all the tags needed for that target,  not just the tags bonding could not
figure out on its own. An empty list [] would be valid and would just cause
bonding to skip tag collection.

If OVS adds a tag it would be up to the user to know that and update
the bonding configuration to include all tags for the target.  

>
>   Does bonding need to know what tags any third party adds?  I.e.,
>for your case 3, above, wherein OVS adds a tag, why does that fail to
>work?

Today it fails since bond_verify_device_path() cant see the tags
added by or above OVS.  Adding a list of tags [vlan vlan] or [] would effectively 
do the the same as the "skip the checks" option.  But allows a way to make
case 3 work.

>
>>A "skip the checks" option would be easier to code. If we skip the
>>process of gathering tags would that eliminate any configurations with
>>any vlan tags?.
>
>  So, yes, it would be easier to implement, and, no, I think a
>simple "skip the checks" switch could be implemented such that it still
>runs the device path and gathers the tags, but doesn't care if the end
>of the device path matches.
>
>  That said, such an implementation is effectively the same as
>your original patch, except with an option instead of an OVS-ness check,
>and I'm still undecided on whether that's better than something that
>requires more specific configuration.

Ah,  ok I get it.  

The "skip the checks" option has the advantage in simplicity and will
fix the problem I started out solving.  The downside is it wont solve more
complex configurations Ilya is concerned with (see case 3). I am all for starting
with the "kiss" approach, we can always pivot to something more complex later if we have
the demand.

>
>-J


>>>>>>> Signed-off-by: David J Wilder <wilder@us.ibm.com>
>>>>>>> Signed-off-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
>>>>>>> ---
>>>>>>> drivers/net/bonding/bond_main.c | 8 +++++++-
>>>>>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>>>>>>> index 950d8e4d86f8..6f71a567ba37 100644
>>>>>>> --- a/drivers/net/bonding/bond_main.c
>>>>>>> +++ b/drivers/net/bonding/bond_main.c
>>>>>>> @@ -3105,7 +3105,13 @@ struct bond_vlan_tag *bond_verify_device_path(struct net_device *start_dev,
>>>>>>>      struct net_device *upper;
>>>>>>>      struct list_head  *iter;
>>>>>>>
>>>>>>> -    if (start_dev == end_dev) {
>>>>>>> +    /* If start_dev is an OVS port then we have encountered an openVswitch
>>>>>
>>>>> nit: Strange choice to capitalize 'V'.  It should be all lowercase or a full
>>>>> 'Open vSwitch' instead.
>>>>
>>>>>>> +     * bridge and can't go any further. The programming of the switch table
>>>>>>> +     * will determine what packets will be sent to the bond. We can make no
>>>>>>> +     * further assumptions about the network above the bond.
>>>>>>> +     */
>>>>>>> +
>>>>>>> +    if (start_dev == end_dev || netif_is_ovs_port(start_dev)) {
>>>>>>>              tags = kcalloc(level + 1, sizeof(*tags), GFP_ATOMIC);
>>>>>>>              if (!tags)
>>>>>>>                      return ERR_PTR(-ENOMEM);
>>>>>>
>>>>>> ---
>>>>>>       -Jay Vosburgh, jv@jvosburgh.net
>>>>>
>>>>> Best regards, Ilya Maximets.
>>>>
>>>>David Wilder (wilder@us.ibm.com)
>>>
>>>---
>>>        -Jay Vosburgh, jv@jvosburgh.net
>>
>>David Wilder (wilder@us.ibm.com)
>
>---
>  -Jay Vosburgh, jv@jvosburgh.net


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next v1 1/1] bonding: Adding limmited support for ARP monitoring with ovs.
  2025-04-16 18:57               ` David Wilder
@ 2025-04-18  3:59                 ` Jay Vosburgh
  2025-04-18 19:20                   ` David Wilder
  0 siblings, 1 reply; 13+ messages in thread
From: Jay Vosburgh @ 2025-04-18  3:59 UTC (permalink / raw)
  To: David Wilder
  Cc: Ilya Maximets, netdev@vger.kernel.org,
	pradeeps@linux.vnet.ibm.com, Pradeep Satyanarayana,
	Adrian Moreno Zapata, Hangbin Liu

David Wilder <wilder@us.ibm.com> wrote:

>>>>>>>> Adding limited support for the ARP Monitoring feature when ovs is
>>>>>>>> configured above the bond. When no vlan tags are used in the configuration
>>>>>>>> or when the tag is added between the bond interface and the ovs bridge arp
>>>>>>>> monitoring will function correctly. The use of tags between the ovs bridge
>>>>>>>> and the routed interface are not supported.
>>>>>>>
>>>>>>>       Looking at the patch, it isn't really "adding support," but
>>>>>>> rather is disabling the "is this IP address configured above the bond"
>>>>>>> checks if the bond is a member of an OVS bridge.  It also seems like it
>>>>>>> would permit any ARP IP target, as long as the address is configured
>>>>>>> somewhere on the system.
>>>>>>>
>>>>>>>       Stated another way, the route lookup in bond_arp_send_all() for
>>>>>>> the target IP address must return a device, but the logic to match that
>>>>>>> device to the interface stack above the bond will always succeed if the
>>>>>>> bond is a member of any OVS bridge.
>>>>>>>
>>>>>>>       For example, given:
>>>>>>>
>>>>>>> [ eth0, eth1 ] -> bond0 -> ovs-br -> ovs-port IP=10.0.0.1
>>>>>>> eth2 IP=20.0.0.2
>>>>>>>
>>>>>>>       Configuring arp_ip_target=20.0.0.2 on bond0 would apparently
>>>>>>> succeed after this patch is applied, and the bond would send ARPs for
>>>>>>> 20.0.0.2.
>>>>>>>
>>>>>>>> For example:
>>>>>>>> 1) bond0 -> ovs-br -> ovs-port (x.x.x.x) is supported
>>>>>>>> 2) bond0 -> bond0.100 -> ovs-br -> ovs-port (x.x.x.x) is supported.
>>>>>>>> 3) bond0 -> ovs-br -> ovs-port -> ovs-port.100 (x.x.x.x) is not supported.
>>>>>>>>
>>>>>>>> Configurations #1 and #2 were tested and verified to function corectly.
>>>>>>>> In the second configuration the correct vlan tags were seen in the arp.
>>>>>>>
>>>>>>>       Assuming that I'm understanding the behavior correctly, I'm not
>>>>>>> sure that "if OVS then do whatever" is the right way to go, particularly
>>>>>>> since this would still exhibit mysterious failures if a VLAN is
>>>>>>> configured within OVS itself (case 3, above).
>>>>>>
>>>>>> Note: vlan can also be pushed or removed by the OpenFlow pipeline within
>>>>>> openvswitch between the ovs-port and the bond0.  So, there is actually no
>>>>>> reliable way to detect the correct set of vlan tags that should be used.
>>>>>> And also, even if the IP is assigned to the ovs-port that is part of the
>>>>>> same OVS bridge, there is no guarantee that packets routed to that IP can
>>>>>> actually egress from the bond0, as the forwarding rules inside the OVS
>>>>>>datapath can be arbitrarily complex.
>>>>>>
>>>>>> And all that is not limited to OVS even, as the cover letter mentions, TC
>>>>>> or nftables in the linux bridge or even eBPF or XDP programs are not that
>>>>>> different complexity-wise and can do most of the same things breaking the
>>>>>> assumptions bonding code makes.
>>>>>>
>>>>>>>
>>>>>>>       I understand that the architecture of OVS limits the ability to
>>>>>>> do these sorts of checks, but I'm unconvinced that implementing this
>>>>>>> support halfway is going to create more issues than it solves.
>>>>
>>>>    Re-reading my comment, I clearly meant "isn't going to create
>>>>    more issues" here.
>>>>
>>>>>>>       Lastly, thinking out loud here, I'm generally loathe to add more
>>>>>>> options to bonding, but I'm debating whether this would be worth an
>>>>>>> "ovs-is-a-black-box" option somewhere, so that users would have to
>>>>>>> opt-in to the OVS alternate realm.
>>>>>
>>>>>> I agree that adding options is almost never a great solution.  But I had a
>>>>>> similar thought.  I don't think this option should be limited to OVS though,
>>>>>>as OVS is only one of the cases where the current verification logic is not
>>>>>>sufficient.
>>>>
>>>>        Agreed; I wasn't really thinking about the not-OVS cases when I
>>>>wrote that, but whatever is changed, if anything, should be generic.
>>>
>>>>>What if we build on the arp_ip_target setting.  Allow for a list of vlan tags
>>>>> to be appended to each target. Something like: arp_ip_target=x.x.x.x[vlan,vlan,...].
>>>>> If a list of tags is omitted it works as before, if a list is supplied assume we know what were doing
>>>>> and use that instead of calling bond_verify_device_path(). An empty list would be valid.
>>>
>>>>        Hmm, that's ... not too bad; I was thinking more along the lines
>>>>of a "skip the checks" option, but this may be a much cleaner way to do
>>>>it.
>>>
>>>>        That said, are you thinking that bonding would add the VLAN
>>>>tags, or that some third party would add them?  I.e., are you trying to
>>>>accomodate the case wherein OVS, tc, or whatever, is adding a tag?
>>>
>>>It would be up to the administrator to add the list of tags to the
>>>arp_target list.  I am unsure how a third party could know what tags
>>>might be added by other components.  Knowing what tags to add to the
>>>list may be hard to figure out in a complicated configuration.  The
>>>upside is it should be possible to configure for any list of tags even
>>>if difficult.
>>
>>  I suspect I wasn't clear in my question; what I'm asking is what
>>you envision for the implementation within bonding for the "[vlan,vlan]"
>>part, and by "third party," I mean "not bonding," so OVS, tc, etc.
>>
>>  Would bonding need to add the tags in the list, or expect one of
>>the thiry parties to do it, or some kind of mix?
>
>Bonding needs to add all the tags. I am just optionally replacing
>the collection of tags by bond_verify_device_path() with a list of tags
>supplied in the arp_target list. (Optional Per target).
>
>To be clear, if you chose to supply tags manually, you need to supply
>all the tags needed for that target,  not just the tags bonding could not
>figure out on its own. An empty list [] would be valid and would just cause
>bonding to skip tag collection.
>
>If OVS adds a tag it would be up to the user to know that and update
>the bonding configuration to include all tags for the target.  

	If OVS adds a tag, and the ARP traverses through OVS, why would
bonding need to add that tag?  Wouldn't OVS add the tag again?

>>   Does bonding need to know what tags any third party adds?  I.e.,
>>for your case 3, above, wherein OVS adds a tag, why does that fail to
>>work?
>
>Today it fails since bond_verify_device_path() cant see the tags
>added by or above OVS.  Adding a list of tags [vlan vlan] or [] would effectively 
>do the the same as the "skip the checks" option.  But allows a way to make
>case 3 work.
>
>>
>>>A "skip the checks" option would be easier to code. If we skip the
>>>process of gathering tags would that eliminate any configurations with
>>>any vlan tags?.
>>
>>  So, yes, it would be easier to implement, and, no, I think a
>>simple "skip the checks" switch could be implemented such that it still
>>runs the device path and gathers the tags, but doesn't care if the end
>>of the device path matches.
>>
>>  That said, such an implementation is effectively the same as
>>your original patch, except with an option instead of an OVS-ness check,
>>and I'm still undecided on whether that's better than something that
>>requires more specific configuration.
>
>Ah,  ok I get it.  
>
>The "skip the checks" option has the advantage in simplicity and will
>fix the problem I started out solving.  The downside is it wont solve more
>complex configurations Ilya is concerned with (see case 3). I am all for starting
>with the "kiss" approach, we can always pivot to something more complex later if we have
>the demand.

	Modulo some of the implementation details from above, I'm more
inclined at the moment towards the "specify the tags" solution (specify
all the tags), rather than the "skip the checks" option.

	The reason is that, once we add an option, it generally cannot
ever be removed, and so there isn't really a "pivot" in the sense that
an existing option would ever go away.  In that case, the only way
forward would be to add another option (the "specify the tags" one), and
now we've got two different options for the same thing that work
differently.  I'd like to avoid that.

	-J

>>
>>-J
>
>
>>>>>>>> Signed-off-by: David J Wilder <wilder@us.ibm.com>
>>>>>>>> Signed-off-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
>>>>>>>> ---
>>>>>>>> drivers/net/bonding/bond_main.c | 8 +++++++-
>>>>>>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>>>>>>>> index 950d8e4d86f8..6f71a567ba37 100644
>>>>>>>> --- a/drivers/net/bonding/bond_main.c
>>>>>>>> +++ b/drivers/net/bonding/bond_main.c
>>>>>>>> @@ -3105,7 +3105,13 @@ struct bond_vlan_tag *bond_verify_device_path(struct net_device *start_dev,
>>>>>>>>      struct net_device *upper;
>>>>>>>>      struct list_head  *iter;
>>>>>>>>
>>>>>>>> -    if (start_dev == end_dev) {
>>>>>>>> +    /* If start_dev is an OVS port then we have encountered an openVswitch
>>>>>>
>>>>>> nit: Strange choice to capitalize 'V'.  It should be all lowercase or a full
>>>>>> 'Open vSwitch' instead.
>>>>>
>>>>>>>> +     * bridge and can't go any further. The programming of the switch table
>>>>>>>> +     * will determine what packets will be sent to the bond. We can make no
>>>>>>>> +     * further assumptions about the network above the bond.
>>>>>>>> +     */
>>>>>>>> +
>>>>>>>> +    if (start_dev == end_dev || netif_is_ovs_port(start_dev)) {
>>>>>>>>              tags = kcalloc(level + 1, sizeof(*tags), GFP_ATOMIC);
>>>>>>>>              if (!tags)
>>>>>>>>                      return ERR_PTR(-ENOMEM);
>>>>>>>
>>>>>>> ---
>>>>>>>       -Jay Vosburgh, jv@jvosburgh.net
>>>>>>
>>>>>> Best regards, Ilya Maximets.
>>>>>
>>>>>David Wilder (wilder@us.ibm.com)
>>>>
>>>>---
>>>>        -Jay Vosburgh, jv@jvosburgh.net
>>>
>>>David Wilder (wilder@us.ibm.com)
>>
>>---
>>  -Jay Vosburgh, jv@jvosburgh.net
>

---
	-Jay Vosburgh, jv@jvosburgh.net

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH net-next v1 1/1] bonding: Adding limmited support for ARP monitoring with ovs.
  2025-04-18  3:59                 ` Jay Vosburgh
@ 2025-04-18 19:20                   ` David Wilder
  0 siblings, 0 replies; 13+ messages in thread
From: David Wilder @ 2025-04-18 19:20 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: Ilya Maximets, netdev@vger.kernel.org,
	pradeeps@linux.vnet.ibm.com, Pradeep Satyanarayana,
	Adrian Moreno Zapata, Hangbin Liu

>>>>>>>>> Adding limited support for the ARP Monitoring feature when ovs is
>>>>>>>>> configured above the bond. When no vlan tags are used in the configuration
>>>>>>>>> or when the tag is added between the bond interface and the ovs bridge arp
>>>>>>>>> monitoring will function correctly. The use of tags between the ovs bridge
>>>>>>>>> and the routed interface are not supported.
>>>>>>>>
>>>>>>>>       Looking at the patch, it isn't really "adding support," but
>>>>>>>> rather is disabling the "is this IP address configured above the bond"
>>>>>>>> checks if the bond is a member of an OVS bridge.  It also seems like it
>>>>>>>> would permit any ARP IP target, as long as the address is configured
>>>>>>>> somewhere on the system.
>>>>>>>>
>>>>>>>>       Stated another way, the route lookup in bond_arp_send_all() for
>>>>>>>> the target IP address must return a device, but the logic to match that
>>>>>>>> device to the interface stack above the bond will always succeed if the
>>>>>>>> bond is a member of any OVS bridge.
>>>>>>>>
>>>>>>>>       For example, given:
>>>>>>>>
>>>>>>>> [ eth0, eth1 ] -> bond0 -> ovs-br -> ovs-port IP=10.0.0.1
>>>>>>>> eth2 IP=20.0.0.2
>>>>>>>>
>>>>>>>>       Configuring arp_ip_target=20.0.0.2 on bond0 would apparently
>>>>>>>> succeed after this patch is applied, and the bond would send ARPs for
>>>>>>>> 20.0.0.2.
>>>>>>>>
>>>>>>>>> For example:
>>>>>>>>> 1) bond0 -> ovs-br -> ovs-port (x.x.x.x) is supported
>>>>>>>>> 2) bond0 -> bond0.100 -> ovs-br -> ovs-port (x.x.x.x) is supported.
>>>>>>>>> 3) bond0 -> ovs-br -> ovs-port -> ovs-port.100 (x.x.x.x) is not supported.
>>>>>>>>>
>>>>>>>>> Configurations #1 and #2 were tested and verified to function corectly.
>>>>>>>>> In the second configuration the correct vlan tags were seen in the arp.
>>>>>>>>
>>>>>>>>       Assuming that I'm understanding the behavior correctly, I'm not
>>>>>>>> sure that "if OVS then do whatever" is the right way to go, particularly
>>>>>>>> since this would still exhibit mysterious failures if a VLAN is
>>>>>>>> configured within OVS itself (case 3, above).
>>>>>>>
>>>>>>> Note: vlan can also be pushed or removed by the OpenFlow pipeline within
>>>>>>> openvswitch between the ovs-port and the bond0.  So, there is actually no
>>>>>>> reliable way to detect the correct set of vlan tags that should be used.
>>>>>>> And also, even if the IP is assigned to the ovs-port that is part of the
>>>>>>> same OVS bridge, there is no guarantee that packets routed to that IP can
>>>>>>> actually egress from the bond0, as the forwarding rules inside the OVS
>>>>>>>datapath can be arbitrarily complex.
>>>>>>>
>>>>>>> And all that is not limited to OVS even, as the cover letter mentions, TC
>>>>>>> or nftables in the linux bridge or even eBPF or XDP programs are not that
>>>>>>> different complexity-wise and can do most of the same things breaking the
>>>>>>> assumptions bonding code makes.
>>>>>>>
>>>>>>>>
>>>>>>>>       I understand that the architecture of OVS limits the ability to
>>>>>>>> do these sorts of checks, but I'm unconvinced that implementing this
>>>>>>>> support halfway is going to create more issues than it solves.
>>>>>
>>>>>    Re-reading my comment, I clearly meant "isn't going to create
>>>>>    more issues" here.
>>>>>
>>>>>>>>       Lastly, thinking out loud here, I'm generally loathe to add more
>>>>>>>> options to bonding, but I'm debating whether this would be worth an
>>>>>>>> "ovs-is-a-black-box" option somewhere, so that users would have to
>>>>>>>> opt-in to the OVS alternate realm.
>>>>>>
>>>>>>> I agree that adding options is almost never a great solution.  But I had a
>>>>>>> similar thought.  I don't think this option should be limited to OVS though,
>>>>>>>as OVS is only one of the cases where the current verification logic is not
>>>>>>>sufficient.
>>>>>
>>>>>        Agreed; I wasn't really thinking about the not-OVS cases when I
>>>>>wrote that, but whatever is changed, if anything, should be generic.
>>>>
>>>>>>What if we build on the arp_ip_target setting.  Allow for a list of vlan tags
>>>>>> to be appended to each target. Something like: arp_ip_target=x.x.x.x[vlan,vlan,...].
>>>>>> If a list of tags is omitted it works as before, if a list is supplied assume we know what were doing
>>>>>> and use that instead of calling bond_verify_device_path(). An empty list would be valid.
>>>>
>>>>>        Hmm, that's ... not too bad; I was thinking more along the lines
>>>>>of a "skip the checks" option, but this may be a much cleaner way to do
>>>>>it.
>>>>
>>>>>        That said, are you thinking that bonding would add the VLAN
>>>>>tags, or that some third party would add them?  I.e., are you trying to
>>>>>accomodate the case wherein OVS, tc, or whatever, is adding a tag?
>>>>
>>>>It would be up to the administrator to add the list of tags to the
>>>>arp_target list.  I am unsure how a third party could know what tags
>>>>might be added by other components.  Knowing what tags to add to the
>>>>list may be hard to figure out in a complicated configuration.  The
>>>>upside is it should be possible to configure for any list of tags even
>>>>if difficult.
>>>
>>>  I suspect I wasn't clear in my question; what I'm asking is what
>>>you envision for the implementation within bonding for the "[vlan,vlan]"
>>>part, and by "third party," I mean "not bonding," so OVS, tc, etc.
>>>
>>>  Would bonding need to add the tags in the list, or expect one of
>>>the thiry parties to do it, or some kind of mix?
>>
>>Bonding needs to add all the tags. I am just optionally replacing
>>the collection of tags by bond_verify_device_path() with a list of tags
>>supplied in the arp_target list. (Optional Per target).
>>
>>To be clear, if you chose to supply tags manually, you need to supply
>>all the tags needed for that target,  not just the tags bonding could not
>>figure out on its own. An empty list [] would be valid and would just cause
>>bonding to skip tag collection.
>>
>>If OVS adds a tag it would be up to the user to know that and update
>>the bonding configuration to include all tags for the target.
>
> If OVS adds a tag, and the ARP traverses through OVS, why would
>bonding need to add that tag?  Wouldn't OVS add the tag again?
>

I though the arp probes are sent directly out from bonding, so it wont
pass through OVS, therefore bonding needs to add all the tags to the probes.

>>>   Does bonding need to know what tags any third party adds?  I.e.,
>>>for your case 3, above, wherein OVS adds a tag, why does that fail to
>>>work?
>>
>>Today it fails since bond_verify_device_path() cant see the tags
>>added by or above OVS.  Adding a list of tags [vlan vlan] or [] would effectively
>>do the the same as the "skip the checks" option.  But allows a way to make
>>case 3 work.
>>
>>>
>>>>A "skip the checks" option would be easier to code. If we skip the
>>>>process of gathering tags would that eliminate any configurations with
>>>>any vlan tags?.
>>>
>>>  So, yes, it would be easier to implement, and, no, I think a
>>>simple "skip the checks" switch could be implemented such that it still
>>>runs the device path and gathers the tags, but doesn't care if the end
>>>of the device path matches.
>>>
>>>  That said, such an implementation is effectively the same as
>>>your original patch, except with an option instead of an OVS-ness check,
>>>and I'm still undecided on whether that's better than something that
>>>requires more specific configuration.
>>
>>Ah,  ok I get it.
>>
>>The "skip the checks" option has the advantage in simplicity and will
>>fix the problem I started out solving.  The downside is it wont solve more
>>complex configurations Ilya is concerned with (see case 3). I am all for starting
>>with the "kiss" approach, we can always pivot to something more complex later if we have
>>the demand.
>
>   Modulo some of the implementation details from above, I'm more
>inclined at the moment towards the "specify the tags" solution (specify
>all the tags), rather than the "skip the checks" option.
>
>   The reason is that, once we add an option, it generally cannot
>ever be removed, and so there isn't really a "pivot" in the sense that
>an existing option would ever go away.  In that case, the only way
>forward would be to add another option (the "specify the tags" one), and
>now we've got two different options for the same thing that work
>differently.  I'd like to avoid that.

Good point,  Agreed, "specify the tags" is the way to go.

As a starting point I am thinking:

+struct arp_target {
+       __be32 target_ip;
+       struct bond_vlan_tag *tags;
+};
+
 struct bond_params {
        int mode;
        int xmit_policy;
dd@@ -135,7 +140,7 @@ struct bond_params {
        int ad_select;
        char primary[IFNAMSIZ];
        int primary_reselect;
-       __be32 arp_targets[BOND_MAX_ARP_TARGETS];
+       struct arp_target arp_targets[BOND_MAX_ARP_TARGETS];
        int tx_queues;
        int all_slaves_active;
        int resend_igmp;

Parse the list of address and tags into the array of struct arp_target.
Then bond_verify_device_path() will return arp_targets[i]->tags if it exists
or build its own list of tags as it always did.

>
>-J
>
>>>
>>>-J
>>
>>
>>>>>>>>> Signed-off-by: David J Wilder <wilder@us.ibm.com>
>>>>>>>>> Signed-off-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
>>>>>>>>> ---
>>>>>>>>> drivers/net/bonding/bond_main.c | 8 +++++++-
>>>>>>>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>>>>>>>>> index 950d8e4d86f8..6f71a567ba37 100644
>>>>>>>>> --- a/drivers/net/bonding/bond_main.c
>>>>>>>>> +++ b/drivers/net/bonding/bond_main.c
>>>>>>>>> @@ -3105,7 +3105,13 @@ struct bond_vlan_tag *bond_verify_device_path(struct net_device *start_dev,
>>>>>>>>>      struct net_device *upper;
>>>>>>>>>      struct list_head  *iter;
>>>>>>>>>
>>>>>>>>> -    if (start_dev == end_dev) {
>>>>>>>>> +    /* If start_dev is an OVS port then we have encountered an openVswitch
>>>>>>>
>>>>>>> nit: Strange choice to capitalize 'V'.  It should be all lowercase or a full
>>>>>>> 'Open vSwitch' instead.
>>>>>>
>>>>>>>>> +     * bridge and can't go any further. The programming of the switch table
>>>>>>>>> +     * will determine what packets will be sent to the bond. We can make no
>>>>>>>>> +     * further assumptions about the network above the bond.
>>>>>>>>> +     */
>>>>>>>>> +
>>>>>>>>> +    if (start_dev == end_dev || netif_is_ovs_port(start_dev)) {
>>>>>>>>>              tags = kcalloc(level + 1, sizeof(*tags), GFP_ATOMIC);
>>>>>>>>>              if (!tags)
>>>>>>>>>                      return ERR_PTR(-ENOMEM);
>>>>>>>>
>>>>>>>> ---
>>>>>>>>       -Jay Vosburgh, jv@jvosburgh.net
>>>>>>>
>>>>>>> Best regards, Ilya Maximets.
>>>>>>
>>>>>>David Wilder (wilder@us.ibm.com)
>>>>>
>>>>>---
>>>>>        -Jay Vosburgh, jv@jvosburgh.net
>>>>
>>>>David Wilder (wilder@us.ibm.com)
>>>
>>>---
>>>  -Jay Vosburgh, jv@jvosburgh.net
>>
>
>---
>	-Jay Vosburgh, jv@jvosburgh.net
    David Wilder (wilder@us.ibm.com)


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2025-04-18 19:20 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-11 17:48 [PATCH net-next v1 0/1] bonding: Adding limited support for ARP monitoring with ovs David J Wilder
2025-04-11 17:48 ` [PATCH net-next v1 1/1] bonding: Adding limmited " David J Wilder
2025-04-11 23:57   ` Jay Vosburgh
2025-04-12  0:29     ` Ilya Maximets
2025-04-13  2:37       ` David Wilder
2025-04-15 20:09         ` Jay Vosburgh
2025-04-15 22:13           ` David Wilder
2025-04-16  0:35             ` Hangbin Liu
2025-04-16  1:11             ` Jay Vosburgh
2025-04-16 18:57               ` David Wilder
2025-04-18  3:59                 ` Jay Vosburgh
2025-04-18 19:20                   ` David Wilder
2025-04-15 13:58   ` Paolo Abeni

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).