* Re: [BUG] VPN broken in net-next
2011-03-03 13:09 ` Julian Anastasov
@ 2011-03-03 17:32 ` Stephen Hemminger
2011-03-03 19:23 ` David Miller
2011-03-09 21:28 ` David Miller
2 siblings, 0 replies; 19+ messages in thread
From: Stephen Hemminger @ 2011-03-03 17:32 UTC (permalink / raw)
To: Julian Anastasov; +Cc: David Miller, netdev
> Hello,
>
> On Thu, 3 Mar 2011, Julian Anastasov wrote:
>
> > May be the problem is in inet_hash_insert(), it should
> > hash ifa_local, not ifa_address. May be they are equal for
>
> ... and of course the new __ip_dev_find should use
> ifa_local too.
>
> > the common case but not for peer addresses. In devinet_ioctl()
> > we can see they are equal initially:
> >
> > ifa->ifa_address = ifa->ifa_local = sin->sin_addr.s_addr;
> >
> > but later SIOCSIFDSTADDR can change ifa_address which
> > is destination address from the same prefix:
> >
> > ifa->ifa_address = sin->sin_addr.s_addr;
>
> While checking for ifa_address usage I see other
> two places that look suspicious:
>
> - inet_gifconf() exposes address from ifa_local but then
> devinet_ioctl() matches by ifa_address in the
> 'if (tryaddrmatch)' block. I think, we should use ifa_local.
>
> - IN_DEV_ARP_NOTIFY: announces ifa_address instead of ifa_local.
Julian you are on the right track with the ifa_local being the issue.
1. Bring up VPN
(no errors)
2. Ping one address gets connect error
3. Instrumentation triggers. I added code so if __ip_dev_find failed,
it walked the hash table.
net=ffffffff81a12d80
[ 393.224228] __ip_dev_find(ffffffff81a12d80, 10.250.0.104) hash=108 failed
[ 393.224232] 138: ffffffff81a12d80 ifa_addr=127.0.0.1 ifa_local=127.0.0.1
[ 393.224236] 150: ffffffff81a12d80 ifa_addr=192.168.1.11 ifa_local=192.168.1.11
[ 393.224239] 249: ffffffff81a12d80 ifa_addr=192.168.100.1 ifa_local=192.168.100.1
[ 393.224242] 254: ffffffff81a12d80 ifa_addr=192.168.99.1 ifa_local=192.168.99.1
[ 393.224245] 255: ffffffff81a12d80 ifa_addr=10.255.254.0 ifa_local=10.250.0.10
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [BUG] VPN broken in net-next
2011-03-03 13:09 ` Julian Anastasov
2011-03-03 17:32 ` Stephen Hemminger
@ 2011-03-03 19:23 ` David Miller
2011-03-03 21:54 ` Stephen Hemminger
2011-03-04 8:39 ` Julian Anastasov
2011-03-09 21:28 ` David Miller
2 siblings, 2 replies; 19+ messages in thread
From: David Miller @ 2011-03-03 19:23 UTC (permalink / raw)
To: ja; +Cc: shemminger, netdev
From: Julian Anastasov <ja@ssi.bg>
Date: Thu, 3 Mar 2011 15:09:22 +0200 (EET)
> On Thu, 3 Mar 2011, Julian Anastasov wrote:
>
>> May be the problem is in inet_hash_insert(), it should
>> hash ifa_local, not ifa_address. May be they are equal for
>
> ... and of course the new __ip_dev_find should use
> ifa_local too.
Thanks for looking into this Julian. I will look at the other
cases you found later.
Stephen, is this sufficient to fix your problem? I suspect it is
not because fib_add_addr() adds prefixes with RTN_LOCAL to the
local routing table too :-/
I suspect that even if we need to handle prefixes, we can still use
the hash for optimistic lookup, and fallback to a local table FIB
inspection if that fails.
--------------------
ipv4: Fix __ip_dev_find() to use ifa_local instead of ifa_address.
Reported-by: Stephen Hemminger <shemminger@vyatta.com>
Reported-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index 9038928..ff53860 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -111,7 +111,7 @@ static inline unsigned int inet_addr_hash(struct net *net, __be32 addr)
static void inet_hash_insert(struct net *net, struct in_ifaddr *ifa)
{
- unsigned int hash = inet_addr_hash(net, ifa->ifa_address);
+ unsigned int hash = inet_addr_hash(net, ifa->ifa_local);
spin_lock(&inet_addr_hash_lock);
hlist_add_head_rcu(&ifa->hash, &inet_addr_lst[hash]);
@@ -146,7 +146,7 @@ struct net_device *__ip_dev_find(struct net *net, __be32 addr, bool devref)
if (!net_eq(dev_net(dev), net))
continue;
- if (ifa->ifa_address == addr) {
+ if (ifa->ifa_local == addr) {
result = dev;
break;
}
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [BUG] VPN broken in net-next
2011-03-03 19:23 ` David Miller
@ 2011-03-03 21:54 ` Stephen Hemminger
2011-03-04 8:39 ` Julian Anastasov
1 sibling, 0 replies; 19+ messages in thread
From: Stephen Hemminger @ 2011-03-03 21:54 UTC (permalink / raw)
To: David Miller; +Cc: ja, netdev
On Thu, 03 Mar 2011 11:23:28 -0800 (PST)
David Miller <davem@davemloft.net> wrote:
> From: Julian Anastasov <ja@ssi.bg>
> Date: Thu, 3 Mar 2011 15:09:22 +0200 (EET)
>
> > On Thu, 3 Mar 2011, Julian Anastasov wrote:
> >
> >> May be the problem is in inet_hash_insert(), it should
> >> hash ifa_local, not ifa_address. May be they are equal for
> >
> > ... and of course the new __ip_dev_find should use
> > ifa_local too.
>
> Thanks for looking into this Julian. I will look at the other
> cases you found later.
>
> Stephen, is this sufficient to fix your problem? I suspect it is
> not because fib_add_addr() adds prefixes with RTN_LOCAL to the
> local routing table too :-/
>
> I suspect that even if we need to handle prefixes, we can still use
> the hash for optimistic lookup, and fallback to a local table FIB
> inspection if that fails.
>
> --------------------
> ipv4: Fix __ip_dev_find() to use ifa_local instead of ifa_address.
>
> Reported-by: Stephen Hemminger <shemminger@vyatta.com>
> Reported-by: Julian Anastasov <ja@ssi.bg>
> Signed-off-by: David S. Miller <davem@davemloft.net>
>
VPN works now with this patch on net-next.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [BUG] VPN broken in net-next
2011-03-03 19:23 ` David Miller
2011-03-03 21:54 ` Stephen Hemminger
@ 2011-03-04 8:39 ` Julian Anastasov
2011-03-23 4:56 ` David Miller
1 sibling, 1 reply; 19+ messages in thread
From: Julian Anastasov @ 2011-03-04 8:39 UTC (permalink / raw)
To: David Miller; +Cc: shemminger, netdev
Hello,
On Thu, 3 Mar 2011, David Miller wrote:
> I suspect that even if we need to handle prefixes, we can still use
> the hash for optimistic lookup, and fallback to a local table FIB
> inspection if that fails.
Yes, as ip_route_output_slow uses __ip_dev_find for
fl4_src there should be some kind of fallback to local table,
so that traffic from 127.0.0.2 to 127.0.0.3 or other local
subnets on loopback can work. Another option is to use
inet_addr_onlink but I suspect people can add many addresses
on loopback: inet_addr_onlink(loopback_indev, addr, 0)
Regards
--
Julian Anastasov <ja@ssi.bg>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [BUG] VPN broken in net-next
2011-03-04 8:39 ` Julian Anastasov
@ 2011-03-23 4:56 ` David Miller
2011-03-23 9:05 ` Julian Anastasov
2011-03-23 15:24 ` Stephen Hemminger
0 siblings, 2 replies; 19+ messages in thread
From: David Miller @ 2011-03-23 4:56 UTC (permalink / raw)
To: ja; +Cc: shemminger, netdev
From: Julian Anastasov <ja@ssi.bg>
Date: Fri, 4 Mar 2011 10:39:55 +0200 (EET)
> On Thu, 3 Mar 2011, David Miller wrote:
>
>> I suspect that even if we need to handle prefixes, we can still use
>> the hash for optimistic lookup, and fallback to a local table FIB
>> inspection if that fails.
>
> Yes, as ip_route_output_slow uses __ip_dev_find for
> fl4_src there should be some kind of fallback to local table,
> so that traffic from 127.0.0.2 to 127.0.0.3 or other local
> subnets on loopback can work. Another option is to use
> inet_addr_onlink but I suspect people can add many addresses
> on loopback: inet_addr_onlink(loopback_indev, addr, 0)
I just got back to this, sorry for taking so long :-)
Here is the patch I've come up with and will commit to
net-2.6, thanks!
--------------------
ipv4: Fallback to FIB local table in __ip_dev_find().
In commit 9435eb1cf0b76b323019cebf8d16762a50a12a19
("ipv4: Implement __ip_dev_find using new interface address hash.")
we reimplemented __ip_dev_find() so that it doesn't have to
do a full FIB table lookup.
Instead, it consults a hash table of addresses configured to
interfaces.
This works identically to the old code in all except one case,
and that is for loopback subnets.
The old code would match the loopback device for any IP address
that falls within a subnet configured to the loopback device.
Handle this corner case by doing the FIB lookup.
We could implement this via inet_addr_onlink() but:
1) Someone could configure many addresses to loopback and
inet_addr_onlink() is a simple list traversal.
2) We know the old code works.
Reported-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
net/ipv4/devinet.c | 16 ++++++++++++++++
1 files changed, 16 insertions(+), 0 deletions(-)
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index d5a4553..5345b0b 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -64,6 +64,8 @@
#include <net/rtnetlink.h>
#include <net/net_namespace.h>
+#include "fib_lookup.h"
+
static struct ipv4_devconf ipv4_devconf = {
.data = {
[IPV4_DEVCONF_ACCEPT_REDIRECTS - 1] = 1,
@@ -151,6 +153,20 @@ struct net_device *__ip_dev_find(struct net *net, __be32 addr, bool devref)
break;
}
}
+ if (!result) {
+ struct flowi4 fl4 = { .daddr = addr };
+ struct fib_result res = { 0 };
+ struct fib_table *local;
+
+ /* Fallback to FIB local table so that communication
+ * over loopback subnets work.
+ */
+ local = fib_get_table(net, RT_TABLE_LOCAL);
+ if (local &&
+ !fib_table_lookup(local, &fl4, &res, FIB_LOOKUP_NOREF) &&
+ res.type == RTN_LOCAL)
+ result = FIB_RES_DEV(res);
+ }
if (result && devref)
dev_hold(result);
rcu_read_unlock();
--
1.7.4.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [BUG] VPN broken in net-next
2011-03-23 4:56 ` David Miller
@ 2011-03-23 9:05 ` Julian Anastasov
2011-03-23 15:24 ` Stephen Hemminger
1 sibling, 0 replies; 19+ messages in thread
From: Julian Anastasov @ 2011-03-23 9:05 UTC (permalink / raw)
To: David Miller; +Cc: shemminger, netdev
Hello,
On Tue, 22 Mar 2011, David Miller wrote:
> I just got back to this, sorry for taking so long :-)
>
> Here is the patch I've come up with and will commit to
> net-2.6, thanks!
>
> --------------------
> ipv4: Fallback to FIB local table in __ip_dev_find().
>
> In commit 9435eb1cf0b76b323019cebf8d16762a50a12a19
> ("ipv4: Implement __ip_dev_find using new interface address hash.")
> we reimplemented __ip_dev_find() so that it doesn't have to
> do a full FIB table lookup.
>
> Instead, it consults a hash table of addresses configured to
> interfaces.
>
> This works identically to the old code in all except one case,
> and that is for loopback subnets.
>
> The old code would match the loopback device for any IP address
> that falls within a subnet configured to the loopback device.
>
> Handle this corner case by doing the FIB lookup.
>
> We could implement this via inet_addr_onlink() but:
>
> 1) Someone could configure many addresses to loopback and
> inet_addr_onlink() is a simple list traversal.
>
> 2) We know the old code works.
>
> Reported-by: Julian Anastasov <ja@ssi.bg>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> ---
> net/ipv4/devinet.c | 16 ++++++++++++++++
> 1 files changed, 16 insertions(+), 0 deletions(-)
>
> diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
> index d5a4553..5345b0b 100644
> --- a/net/ipv4/devinet.c
> +++ b/net/ipv4/devinet.c
> @@ -64,6 +64,8 @@
> #include <net/rtnetlink.h>
> #include <net/net_namespace.h>
>
> +#include "fib_lookup.h"
> +
> static struct ipv4_devconf ipv4_devconf = {
> .data = {
> [IPV4_DEVCONF_ACCEPT_REDIRECTS - 1] = 1,
> @@ -151,6 +153,20 @@ struct net_device *__ip_dev_find(struct net *net, __be32 addr, bool devref)
> break;
> }
> }
> + if (!result) {
> + struct flowi4 fl4 = { .daddr = addr };
> + struct fib_result res = { 0 };
> + struct fib_table *local;
> +
> + /* Fallback to FIB local table so that communication
> + * over loopback subnets work.
> + */
> + local = fib_get_table(net, RT_TABLE_LOCAL);
> + if (local &&
> + !fib_table_lookup(local, &fl4, &res, FIB_LOOKUP_NOREF) &&
> + res.type == RTN_LOCAL)
> + result = FIB_RES_DEV(res);
> + }
> if (result && devref)
> dev_hold(result);
> rcu_read_unlock();
> --
> 1.7.4.1
Not tested but looks ok for me.
Regards
--
Julian Anastasov <ja@ssi.bg>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [BUG] VPN broken in net-next
2011-03-23 4:56 ` David Miller
2011-03-23 9:05 ` Julian Anastasov
@ 2011-03-23 15:24 ` Stephen Hemminger
1 sibling, 0 replies; 19+ messages in thread
From: Stephen Hemminger @ 2011-03-23 15:24 UTC (permalink / raw)
To: David Miller; +Cc: ja, netdev
On Tue, 22 Mar 2011 21:56:55 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:
> From: Julian Anastasov <ja@ssi.bg>
> Date: Fri, 4 Mar 2011 10:39:55 +0200 (EET)
>
> > On Thu, 3 Mar 2011, David Miller wrote:
> >
> >> I suspect that even if we need to handle prefixes, we can still use
> >> the hash for optimistic lookup, and fallback to a local table FIB
> >> inspection if that fails.
> >
> > Yes, as ip_route_output_slow uses __ip_dev_find for
> > fl4_src there should be some kind of fallback to local table,
> > so that traffic from 127.0.0.2 to 127.0.0.3 or other local
> > subnets on loopback can work. Another option is to use
> > inet_addr_onlink but I suspect people can add many addresses
> > on loopback: inet_addr_onlink(loopback_indev, addr, 0)
>
> I just got back to this, sorry for taking so long :-)
>
> Here is the patch I've come up with and will commit to
> net-2.6, thanks!
>
> --------------------
> ipv4: Fallback to FIB local table in __ip_dev_find().
>
> In commit 9435eb1cf0b76b323019cebf8d16762a50a12a19
> ("ipv4: Implement __ip_dev_find using new interface address hash.")
> we reimplemented __ip_dev_find() so that it doesn't have to
> do a full FIB table lookup.
>
> Instead, it consults a hash table of addresses configured to
> interfaces.
>
> This works identically to the old code in all except one case,
> and that is for loopback subnets.
>
> The old code would match the loopback device for any IP address
> that falls within a subnet configured to the loopback device.
>
> Handle this corner case by doing the FIB lookup.
>
> We could implement this via inet_addr_onlink() but:
>
> 1) Someone could configure many addresses to loopback and
> inet_addr_onlink() is a simple list traversal.
>
> 2) We know the old code works.
>
> Reported-by: Julian Anastasov <ja@ssi.bg>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> ---
> net/ipv4/devinet.c | 16 ++++++++++++++++
> 1 files changed, 16 insertions(+), 0 deletions(-)
>
> diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
> index d5a4553..5345b0b 100644
> --- a/net/ipv4/devinet.c
> +++ b/net/ipv4/devinet.c
> @@ -64,6 +64,8 @@
> #include <net/rtnetlink.h>
> #include <net/net_namespace.h>
>
> +#include "fib_lookup.h"
> +
> static struct ipv4_devconf ipv4_devconf = {
> .data = {
> [IPV4_DEVCONF_ACCEPT_REDIRECTS - 1] = 1,
> @@ -151,6 +153,20 @@ struct net_device *__ip_dev_find(struct net *net, __be32 addr, bool devref)
> break;
> }
> }
> + if (!result) {
> + struct flowi4 fl4 = { .daddr = addr };
> + struct fib_result res = { 0 };
> + struct fib_table *local;
> +
> + /* Fallback to FIB local table so that communication
> + * over loopback subnets work.
> + */
> + local = fib_get_table(net, RT_TABLE_LOCAL);
> + if (local &&
> + !fib_table_lookup(local, &fl4, &res, FIB_LOOKUP_NOREF) &&
> + res.type == RTN_LOCAL)
> + result = FIB_RES_DEV(res);
> + }
> if (result && devref)
> dev_hold(result);
> rcu_read_unlock();
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
--
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [BUG] VPN broken in net-next
2011-03-03 13:09 ` Julian Anastasov
2011-03-03 17:32 ` Stephen Hemminger
2011-03-03 19:23 ` David Miller
@ 2011-03-09 21:28 ` David Miller
2 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2011-03-09 21:28 UTC (permalink / raw)
To: ja; +Cc: shemminger, netdev
From: Julian Anastasov <ja@ssi.bg>
Date: Thu, 3 Mar 2011 15:09:22 +0200 (EET)
> While checking for ifa_address usage I see other
> two places that look suspicious:
>
> - inet_gifconf() exposes address from ifa_local but then
> devinet_ioctl() matches by ifa_address in the
> 'if (tryaddrmatch)' block. I think, we should use ifa_local.
>
> - IN_DEV_ARP_NOTIFY: announces ifa_address instead of ifa_local.
Thanks Julian, I'll push the following into net-2.6:
--------------------
ipv4: Fix erroneous uses of ifa_address.
In usual cases ifa_address == ifa_local, but in the case where
SIOCSIFDSTADDR sets the destination address on a point-to-point
link, ifa_address gets set to that destination address.
Therefore we should use ifa_local when we want the local interface
address.
There were two cases where the selection was done incorrectly:
1) When devinet_ioctl() does matching, it checks ifa_address even
though gifconf correct reported ifa_local to the user
2) IN_DEV_ARP_NOTIFY handling sends a gratuitous ARP using
ifa_address instead of ifa_local.
Reported-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
net/ipv4/devinet.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index df4616f..036652c 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -670,7 +670,7 @@ int devinet_ioctl(struct net *net, unsigned int cmd, void __user *arg)
ifap = &ifa->ifa_next) {
if (!strcmp(ifr.ifr_name, ifa->ifa_label) &&
sin_orig.sin_addr.s_addr ==
- ifa->ifa_address) {
+ ifa->ifa_local) {
break; /* found */
}
}
@@ -1040,8 +1040,8 @@ static void inetdev_send_gratuitous_arp(struct net_device *dev,
return;
arp_send(ARPOP_REQUEST, ETH_P_ARP,
- ifa->ifa_address, dev,
- ifa->ifa_address, NULL,
+ ifa->ifa_local, dev,
+ ifa->ifa_local, NULL,
dev->dev_addr, NULL);
}
--
1.7.4.1
^ permalink raw reply related [flat|nested] 19+ messages in thread