* [PATCH net] bonding: fix arp requests sends with isolated routes
@ 2014-02-14 15:59 François Cachereul
  2014-02-17  9:36 ` Veaceslav Falico
  2014-02-17 19:56 ` David Miller
  0 siblings, 2 replies; 6+ messages in thread
From: François Cachereul @ 2014-02-14 15:59 UTC (permalink / raw)
  To: Jay Vosburgh, Veaceslav Falico, Andy Gospodarek; +Cc: netdev
Make bond_arp_send_all() try to send ARP packets through slave devices
even if no route to arp_ip_target is found. This is useful when the
route is in an isolated routing table selected by a routing rule with
parameters like oif or iif, in which case ip_route_output() returns an
error. In that case, the ARP packet is sent without a VLAN tag and with
the bond IP address as sender.
Signed-off-by: François CACHEREUL <f.cachereul@alphalink.fr>
---
This previously worked; the problem was introduced in 2.6.35, when VLAN 0
started being added by default whenever the 8021q module is loaded.
Before that, no route lookup was done if the bond device did not have
any VLAN. The problem now exists even if the 8021q module is not loaded.
 drivers/net/bonding/bond_main.c |    9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 8676649..300e5b8 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2168,17 +2168,19 @@ static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
 	for (i = 0; i < BOND_MAX_ARP_TARGETS && targets[i]; i++) {
 		pr_debug("basa: target %pI4\n", &targets[i]);
 
+		vlan_id = 0;
+
 		/* Find out through which dev should the packet go */
 		rt = ip_route_output(dev_net(bond->dev), targets[i], 0,
 				     RTO_ONLINK, 0);
 		if (IS_ERR(rt)) {
 			pr_debug("%s: no route to arp_ip_target %pI4\n",
 				 bond->dev->name, &targets[i]);
-			continue;
+			/* no route found, trying with bond->dev */
+			addr = bond_confirm_addr(bond->dev, targets[i], 0);
+			goto rt_err_try;
 		}
 
-		vlan_id = 0;
-
 		/* bond device itself */
 		if (rt->dst.dev == bond->dev)
 			goto found;
@@ -2232,6 +2234,7 @@ static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
 found:
 		addr = bond_confirm_addr(rt->dst.dev, targets[i], 0);
 		ip_rt_put(rt);
+rt_err_try:
 		bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i],
 			      addr, vlan_id);
 	}
-- 
1.7.10.4
* Re: [PATCH net] bonding: fix arp requests sends with isolated routes
  2014-02-14 15:59 [PATCH net] bonding: fix arp requests sends with isolated routes François Cachereul
@ 2014-02-17  9:36 ` Veaceslav Falico
  2014-02-17 11:07   ` François Cachereul
  2014-02-17 19:56 ` David Miller
  1 sibling, 1 reply; 6+ messages in thread
From: Veaceslav Falico @ 2014-02-17  9:36 UTC (permalink / raw)
  To: François Cachereul; +Cc: Jay Vosburgh, Andy Gospodarek, netdev
On Fri, Feb 14, 2014 at 04:59:23PM +0100, François Cachereul wrote:
>Make bond_arp_send_all() try to send ARP packets through slave devices
>even if no route to arp_ip_target is found. This is useful when the
>route is in an isolated routing table selected by a routing rule with
>parameters like oif or iif, in which case ip_route_output() returns an
>error. In that case, the ARP packet is sent without a VLAN tag and with
>the bond IP address as sender.
I'm not sure I understand it completely; specifically, I don't really
understand the term "isolated routing table". Do you mean a routing
table other than local/main, as enabled by
CONFIG_IP_MULTIPLE_TABLES=y? I think those should all be 'caught' by
ip_route_output(), or am I missing something?
Anyway, with this fix bonding will send packets even if it doesn't find
a route AND a source address (bond_confirm_addr() can return 0 if the bond
doesn't have the required IP assigned), which isn't good at all.
If my assumption about the routing tables is correct and ip_route_output()
doesn't find the additional tables, we should at least try to fix it by
scanning those tables.
Sorry if I didn't understand you correctly...
>
>Signed-off-by: François CACHEREUL <f.cachereul@alphalink.fr>
>---
>This previously worked; the problem was introduced in 2.6.35, when VLAN 0
>started being added by default whenever the 8021q module is loaded.
>Before that, no route lookup was done if the bond device did not have
>any VLAN. The problem now exists even if the 8021q module is not loaded.
>
> drivers/net/bonding/bond_main.c |    9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 8676649..300e5b8 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -2168,17 +2168,19 @@ static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
> 	for (i = 0; i < BOND_MAX_ARP_TARGETS && targets[i]; i++) {
> 		pr_debug("basa: target %pI4\n", &targets[i]);
>
>+		vlan_id = 0;
>+
> 		/* Find out through which dev should the packet go */
> 		rt = ip_route_output(dev_net(bond->dev), targets[i], 0,
> 				     RTO_ONLINK, 0);
> 		if (IS_ERR(rt)) {
> 			pr_debug("%s: no route to arp_ip_target %pI4\n",
> 				 bond->dev->name, &targets[i]);
>-			continue;
>+			/* no route found, trying with bond->dev */
>+			addr = bond_confirm_addr(bond->dev, targets[i], 0);
>+			goto rt_err_try;
> 		}
>
>-		vlan_id = 0;
>-
> 		/* bond device itself */
> 		if (rt->dst.dev == bond->dev)
> 			goto found;
>@@ -2232,6 +2234,7 @@ static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
> found:
> 		addr = bond_confirm_addr(rt->dst.dev, targets[i], 0);
> 		ip_rt_put(rt);
>+rt_err_try:
> 		bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i],
> 			      addr, vlan_id);
> 	}
>-- 
>1.7.10.4
>
* Re: [PATCH net] bonding: fix arp requests sends with isolated routes
  2014-02-17  9:36 ` Veaceslav Falico
@ 2014-02-17 11:07   ` François Cachereul
  0 siblings, 0 replies; 6+ messages in thread
From: François Cachereul @ 2014-02-17 11:07 UTC (permalink / raw)
  To: Veaceslav Falico; +Cc: Jay Vosburgh, Andy Gospodarek, netdev
On 17/02/2014 10:36, Veaceslav Falico wrote:
> On Fri, Feb 14, 2014 at 04:59:23PM +0100, François Cachereul wrote:
>> Make bond_arp_send_all() try to send ARP packets through slave devices
>> even if no route to arp_ip_target is found. This is useful when the
>> route is in an isolated routing table selected by a routing rule with
>> parameters like oif or iif, in which case ip_route_output() returns an
>> error. In that case, the ARP packet is sent without a VLAN tag and with
>> the bond IP address as sender.
> 
> I'm not sure I understand it completely; specifically, I don't really
> understand the term "isolated routing table". Do you mean a routing
> table other than local/main, as enabled by
> CONFIG_IP_MULTIPLE_TABLES=y? I think those should all be 'caught' by
> ip_route_output(), or am I missing something?
That's what I meant, but as I said, when the rule that selects a table
contains iif or oif parameters (for example, set to the bond device),
ip_route_output() can't look up this table, because flowi4_iif is set to
LOOPBACK_IFINDEX and we set flowi4_oif to 0, since the output device is
what we are searching for.
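A minimal userspace sketch of the rule-matching step described above. It
is not the kernel code: struct flow_key, struct policy_rule, rule_matches()
and bond_lookup_key() are simplified stand-ins invented for illustration,
and the only assumption carried over from the kernel is that
LOOPBACK_IFINDEX is 1 and that a selector of 0 means "not set".

```c
#include <stdbool.h>

/* Assumed value of the kernel's LOOPBACK_IFINDEX constant. */
#define LOOPBACK_IFINDEX 1

/* Hypothetical, simplified stand-in for the IPv4 flow key. */
struct flow_key {
	int iif;	/* input interface index */
	int oif;	/* output interface index, 0 = not known yet */
};

/* Hypothetical, simplified stand-in for a policy routing rule. */
struct policy_rule {
	int iifindex;	/* 0 = rule has no "iif" selector */
	int oifindex;	/* 0 = rule has no "oif" selector */
};

/* A rule applies only if every selector it sets equals the flow key. */
static bool rule_matches(const struct policy_rule *rule,
			 const struct flow_key *fl)
{
	if (rule->iifindex && rule->iifindex != fl->iif)
		return false;
	if (rule->oifindex && rule->oifindex != fl->oif)
		return false;
	return true;
}

/* Flow key as used for the bonding lookup: iif = loopback, oif = 0. */
static struct flow_key bond_lookup_key(void)
{
	struct flow_key fl = { .iif = LOOPBACK_IFINDEX, .oif = 0 };

	return fl;
}
```

With this key, a rule without interface selectors matches, while a rule
carrying "iif bond0" or "oif bond0" (any non-loopback, non-zero index)
never does, so the table behind it is not consulted.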
> 
> Anyway, with this fix bonding will send packets even if it doesn't find
> a route AND a source address (bond_confirm_addr() can return 0 if the bond
> doesn't have the required IP assigned), which isn't good at all.
We may add a test to avoid this case, but the same test would also be
needed when the route is found, because bond_confirm_addr() could
likewise return 0.
> 
> If my assumption about the routing tables is correct and ip_route_output()
> doesn't find the additional tables, we should at least try to fix it by
> scanning those tables.
ip_route_output() scans all tables whose rule matches parameters like
src network, dst network, oif or iif. In my case, it can't match iif or oif.
If we wanted to search all tables in every configuration, we would have to
bypass the route lookup process, or at least the "rule match" part. I don't
know if that's a good idea.
Anyway, my understanding is that we search for a route only to find the
VLAN to add to the outgoing packet; we don't care about the route itself.
We just want the packet to pass filtering switches in order to test the link.
I am only proposing a default that succeeds most of the time when no route
can be found, restoring the previous behavior under which this use case
worked.
> Sorry if I didn't understand you correctly...
No problem, you understood it correctly. Thanks for the reply.
> 
>>
>> Signed-off-by: François CACHEREUL <f.cachereul@alphalink.fr>
>> ---
>> This previously worked; the problem was introduced in 2.6.35, when VLAN 0
>> started being added by default whenever the 8021q module is loaded.
>> Before that, no route lookup was done if the bond device did not have
>> any VLAN. The problem now exists even if the 8021q module is not loaded.
>>
>> drivers/net/bonding/bond_main.c |    9 ++++++---
>> 1 file changed, 6 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>> index 8676649..300e5b8 100644
>> --- a/drivers/net/bonding/bond_main.c
>> +++ b/drivers/net/bonding/bond_main.c
>> @@ -2168,17 +2168,19 @@ static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
>>     for (i = 0; i < BOND_MAX_ARP_TARGETS && targets[i]; i++) {
>>         pr_debug("basa: target %pI4\n", &targets[i]);
>>
>> +        vlan_id = 0;
>> +
>>         /* Find out through which dev should the packet go */
>>         rt = ip_route_output(dev_net(bond->dev), targets[i], 0,
>>                      RTO_ONLINK, 0);
>>         if (IS_ERR(rt)) {
>>             pr_debug("%s: no route to arp_ip_target %pI4\n",
>>                  bond->dev->name, &targets[i]);
>> -            continue;
>> +            /* no route found, trying with bond->dev */
>> +            addr = bond_confirm_addr(bond->dev, targets[i], 0);
>> +            goto rt_err_try;
>>         }
>>
>> -        vlan_id = 0;
>> -
>>         /* bond device itself */
>>         if (rt->dst.dev == bond->dev)
>>             goto found;
>> @@ -2232,6 +2234,7 @@ static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
>> found:
>>         addr = bond_confirm_addr(rt->dst.dev, targets[i], 0);
>>         ip_rt_put(rt);
>> +rt_err_try:
>>         bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i],
>>                   addr, vlan_id);
>>     }
>> -- 
>> 1.7.10.4
>>
* Re: [PATCH net] bonding: fix arp requests sends with isolated routes
  2014-02-14 15:59 [PATCH net] bonding: fix arp requests sends with isolated routes François Cachereul
  2014-02-17  9:36 ` Veaceslav Falico
@ 2014-02-17 19:56 ` David Miller
  2014-02-18  1:07   ` Jay Vosburgh
  1 sibling, 1 reply; 6+ messages in thread
From: David Miller @ 2014-02-17 19:56 UTC (permalink / raw)
  To: f.cachereul; +Cc: fubar, vfalico, andy, netdev
From: François Cachereul <f.cachereul@alphalink.fr>
Date: Fri, 14 Feb 2014 16:59:23 +0100
> Make bond_arp_send_all() try to send ARP packets through slave devices
> even if no route to arp_ip_target is found. This is useful when the
> route is in an isolated routing table selected by a routing rule with
> parameters like oif or iif, in which case ip_route_output() returns an
> error. In that case, the ARP packet is sent without a VLAN tag and with
> the bond IP address as sender.
> 
> Signed-off-by: François CACHEREUL <f.cachereul@alphalink.fr>
> ---
> This previously worked; the problem was introduced in 2.6.35, when VLAN 0
> started being added by default whenever the 8021q module is loaded.
> Before that, no route lookup was done if the bond device did not have
> any VLAN. The problem now exists even if the 8021q module is not loaded.
I don't like this at all, you're trying to paper over the fact that we
can't set the flow key correctly at this point.
Just assuming the route might be there and trying anyways is not really
acceptable in my opinion.  There's a reason we do a route lookup at all.
* Re: [PATCH net] bonding: fix arp requests sends with isolated routes
  2014-02-17 19:56 ` David Miller
@ 2014-02-18  1:07   ` Jay Vosburgh
  2014-02-18 10:35     ` François Cachereul
  0 siblings, 1 reply; 6+ messages in thread
From: Jay Vosburgh @ 2014-02-18  1:07 UTC (permalink / raw)
  To: David Miller; +Cc: f.cachereul, vfalico, andy, netdev
David Miller <davem@davemloft.net> wrote:
>From: François Cachereul <f.cachereul@alphalink.fr>
>Date: Fri, 14 Feb 2014 16:59:23 +0100
>
>> Make bond_arp_send_all() try to send ARP packets through slave devices
>> even if no route to arp_ip_target is found. This is useful when the
>> route is in an isolated routing table selected by a routing rule with
>> parameters like oif or iif, in which case ip_route_output() returns an
>> error. In that case, the ARP packet is sent without a VLAN tag and with
>> the bond IP address as sender.
>> 
>> Signed-off-by: François CACHEREUL <f.cachereul@alphalink.fr>
>> ---
>> This previously worked; the problem was introduced in 2.6.35, when VLAN 0
>> started being added by default whenever the 8021q module is loaded.
>> Before that, no route lookup was done if the bond device did not have
>> any VLAN. The problem now exists even if the 8021q module is not loaded.
>
>I don't like this at all, you're trying to paper over the fact that we
>can't set the flow key correctly at this point.
>
>Just assuming the route might be there and trying anyways is not really
>acceptable in my opinion.  There's a reason we do a route lookup at all.
	The reason for the route lookup is to get a VLAN ID for the
outgoing ARP (if VLANs are configured above the bond), so it can be
correctly tagged.
	As Francois says, older versions of the bond_arp_send_all
function would skip the route lookup entirely if there were no VLANs
configured above the bond.  E.g., the original logic from a 2.6.32-era
kernel looks like:
	for (i = 0; (i < BOND_MAX_ARP_TARGETS); i++) {
[...]
		if (!bond->vlgrp) {
			pr_debug("basa: empty vlan: arp_send\n");
			bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i],
				      bond->master_ip, 0);
			continue;
		}
		/*
		 * If VLANs are configured, we do a route lookup to
		 * determine which VLAN interface would be used, so we
		 * can tag the ARP with the proper VLAN tag.
		 */
		memset(&fl, 0, sizeof(fl));
		fl.fl4_dst = targets[i];
		fl.fl4_tos = RTO_ONLINK;
		rv = ip_route_output_key(&init_net, &rt, &fl);
[...]
	So, in the past, this particular case (oif / iif in route
selection) would "work," in the sense that an ARP would go out with no
VLAN ID, but only when there were known to be no VLANs configured above
the bond.  If any VLANs were configured above the bond, this case would
fail as we're seeing here.
	Nowadays, there is no easy way to tell if there are VLANs above
the bond, and there's generally a VID 0 configured anyway, so the route
lookup is unconditional.  In the case at issue here (the route lookup
for the arp_ip_target IP address fails), it's not possible for bonding
to determine what interface would be used, and therefore what VLAN tag
to use.
	Francois's patch would make bonding essentially take a best
guess of "no VLAN" and send an untagged ARP for any destination not
found in the regular (no iif, oif, etc, rule) routing table, which is
what used to happen for the "known no VLAN" case.
	With the patch, these ARPs may have an all-zero source IP
address (since the bond_confirm_addr call may not find a suitable source
address for something it can't find a route to).  That is a legal ARP
(used for duplicate address detection according to RFC 2131), but when
I last tried it a couple of years ago, the replies would not pass
arp_validate (as the target IP of 0.0.0.0 in the reply doesn't match any
of the bond's IP addresses), and I suspect that hasn't changed.
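A rough userspace sketch of why such replies are rejected. It only models
the check as described in the paragraph above (the reply's target IP must
be one of the bond's addresses); it is not the bonding driver's validation
code, and reply_passes_validation() and reply_target_for() are hypothetical
names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of the check: a reply is accepted only if its
 * target IP is one of the addresses configured on the bond.
 */
static bool reply_passes_validation(const uint32_t *bond_ips, size_t n,
				    uint32_t reply_target_ip)
{
	for (size_t i = 0; i < n; i++)
		if (bond_ips[i] == reply_target_ip)
			return true;
	return false;
}

/*
 * The peer answers whoever asked: the target IP of the reply is the
 * sender IP that was placed in the request.
 */
static uint32_t reply_target_for(uint32_t request_sender_ip)
{
	return request_sender_ip;
}
```

A request sent with the bond's own address as sender yields a reply that
passes; a request sent with an all-zero sender yields a reply targeted at
0.0.0.0, which matches none of the bond's addresses and is dropped.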
	In the days of yore code above, bonding kept track of what it
thought the bond's IP address was (bond->master_ip), and used that as
the source IP in the ARPs.  That wasn't always correct if the bond had
multiple IP addresses.
	So, ultimately, Francois is correct that this is a regression of
a behavior that used to work.  On the other hand, this patch isn't
really a complete restoration of the prior behavior.  It's no longer
possible to know that there aren't any VLANs above the bond, and so the
"no VLAN" guess is much less reliable than it used to be, plus the ARPs
that will be generated probably won't work with arp_validate.
	As much as I loathe adding more options to bonding, a manually
selected "force VLAN ID" for the arp_ip_target(s) would resolve this for
the minority of cases where the automatic VLAN ID selection does not
function.
	-J
---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
* Re: [PATCH net] bonding: fix arp requests sends with isolated routes
  2014-02-18  1:07   ` Jay Vosburgh
@ 2014-02-18 10:35     ` François Cachereul
  0 siblings, 0 replies; 6+ messages in thread
From: François Cachereul @ 2014-02-18 10:35 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: David Miller, vfalico, andy, netdev
On 18/02/2014 02:07, Jay Vosburgh wrote:
> David Miller <davem@davemloft.net> wrote:
> 
>> From: François Cachereul <f.cachereul@alphalink.fr>
>> Date: Fri, 14 Feb 2014 16:59:23 +0100
>>
>>> Make bond_arp_send_all() try to send ARP packets through slave devices
>>> even if no route to arp_ip_target is found. This is useful when the
>>> route is in an isolated routing table selected by a routing rule with
>>> parameters like oif or iif, in which case ip_route_output() returns an
>>> error. In that case, the ARP packet is sent without a VLAN tag and with
>>> the bond IP address as sender.
>>>
>>> Signed-off-by: François CACHEREUL <f.cachereul@alphalink.fr>
>>> ---
>>> This previously worked; the problem was introduced in 2.6.35, when VLAN 0
>>> started being added by default whenever the 8021q module is loaded.
>>> Before that, no route lookup was done if the bond device did not have
>>> any VLAN. The problem now exists even if the 8021q module is not loaded.
>>
>> I don't like this at all, you're trying to paper over the fact that we
>> can't set the flow key correctly at this point.
>>
>> Just assuming the route might be there and trying anyways is not really
>> acceptable in my opinion.  There's a reason we do a route lookup at all.
> 
> 	The reason for the route lookup is to get a VLAN ID for the
> outgoing ARP (if VLANs are configured above the bond), so it can be
> correctly tagged.
> 
> 	As Francois says, older versions of the bond_arp_send_all
> function would skip the route lookup entirely if there were no VLANs
> configured above the bond.  E.g., the original logic from a 2.6.32-era
> kernel looks like:
> 
> 	for (i = 0; (i < BOND_MAX_ARP_TARGETS); i++) {
> [...]
> 		if (!bond->vlgrp) {
> 			pr_debug("basa: empty vlan: arp_send\n");
> 			bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i],
> 				      bond->master_ip, 0);
> 			continue;
> 		}
> 
> 		/*
> 		 * If VLANs are configured, we do a route lookup to
> 		 * determine which VLAN interface would be used, so we
> 		 * can tag the ARP with the proper VLAN tag.
> 		 */
> 		memset(&fl, 0, sizeof(fl));
> 		fl.fl4_dst = targets[i];
> 		fl.fl4_tos = RTO_ONLINK;
> 
> 		rv = ip_route_output_key(&init_net, &rt, &fl);
> [...]
> 
> 	So, in the past, this particular case (oif / iif in route
> selection) would "work," in the sense that an ARP would go out with no
> VLAN ID, but only when there were known to be no VLANs configured above
> the bond.  If any VLANs were configured above the bond, this case would
> fail as we're seeing here.
> 
> 	Nowadays, there is no easy way to tell if there are VLANs above
> the bond, and there's generally a VID 0 configured anyway, so the route
> lookup is unconditional.  In the case at issue here (the route lookup
> for the arp_ip_target IP address fails), it's not possible for bonding
> to determine what interface would be used, and therefore what VLAN tag
> to use.
> 
> 	Francois's patch would make bonding essentially take a best
> guess of "no VLAN" and send an untagged ARP for any destination not
> found in the regular (no iif, oif, etc, rule) routing table, which is
> what used to happen for the "known no VLAN" case.
> 
> 	With the patch, these ARPs may have an all-zero source IP
> address (since the bond_confirm_addr call may not find a suitable source
> address for something it can't find a route to).  That is a legal ARP
> (used for duplicate address detection according to RFC 2131), but when
> last I tried it a couple of years ago, the replies won't pass
> arp_validate (as the target IP of 0.0.0.0 in the reply doesn't match any
> of the bond's IP address), and I suspect that hasn't changed.
This problem already exists when a route is found. Currently, the
bond_confirm_addr() call at the end of bond_arp_send_all() may already
fail to find a suitable source address, for example if a route and its
source address are not in the same network.
> 
> 	In the days of yore code above, bonding kept track of what it
> thought the bond's IP address was (bond->master_ip), and used that as
> the source IP in the ARPs.  That wasn't always correct if the bond had
> multiple IP addresses.
> 
> 	So, ultimately, Francois is correct that this is a regression of
> a behavior that used to work.  On the other hand, this patch isn't
> really a complete restoration of the prior behavior.  It's no longer
> possible to know that there aren't any VLANs above the bond, and so the
> "no VLAN" guess is much less reliable than it used to be, plus the ARPs
> that will be generated probably won't work with arp_validate.
> 
> 	As much as I loathe adding more options to bonding, a manually
> selected "force VLAN ID" for the arp_ip_target(s) would resolve this for
> the minority of cases where the automatic VLAN ID selection does not
> function.
I had thought about adding an option, but nothing I came up with seemed
good enough. That's why I proposed this solution.
Maybe something like "ip_target:forced_vlan_id" per IP target for the
arp_ip_target option would do the trick. ':forced_vlan_id' would be
omitted for automatic VLAN ID selection, or set to -1 for no VLAN ID
(0 doesn't seem a good idea as it's used for priority-tagged frames).
This way we keep the current behavior.
If you're OK with this, I'll submit another patch with the modified
option.
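A minimal userspace sketch of how such an "ip_target:forced_vlan_id" value
could be split, assuming the syntax proposed above. Nothing here exists in
the bonding driver: parse_arp_target(), VLAN_AUTO and VLAN_NONE are
hypothetical names, the 0..4094 range check is an assumption, and the IP
part is copied as text without being validated.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define VLAN_AUTO (-2)	/* hypothetical: no suffix, keep the route lookup */
#define VLAN_NONE (-1)	/* from the proposal: send the ARP untagged */

/*
 * Split "ip[:vlan]" into its IP text and a VLAN selector.
 * Returns 0 on success, -1 on malformed input.
 */
static int parse_arp_target(const char *s, char *ip_buf, size_t ip_len,
			    int *vlan)
{
	const char *colon = strchr(s, ':');
	size_t n = colon ? (size_t)(colon - s) : strlen(s);
	char *end;
	long v;

	/* The IP part must be non-empty and fit in the caller's buffer. */
	if (n == 0 || n >= ip_len)
		return -1;
	memcpy(ip_buf, s, n);
	ip_buf[n] = '\0';

	/* No suffix: automatic VLAN ID selection, i.e. current behavior. */
	if (!colon) {
		*vlan = VLAN_AUTO;
		return 0;
	}

	/* A bare trailing ':' is rejected rather than guessed at. */
	if (colon[1] == '\0')
		return -1;

	errno = 0;
	v = strtol(colon + 1, &end, 10);
	if (errno || *end != '\0' || v < -1 || v > 4094)
		return -1;

	*vlan = (int)v;
	return 0;
}
```

Under these assumptions, "10.0.0.1" keeps automatic selection,
"10.0.0.1:100" forces VLAN 100, and "10.0.0.1:-1" requests an untagged ARP.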
> 
> 	-J
> 
> ---
> 	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
> 