* [PATCH] Disable rp_filter for IPsec packets
@ 2011-03-14 21:14 Michael Smith
2011-03-14 21:25 ` David Miller
0 siblings, 1 reply; 10+ messages in thread
From: Michael Smith @ 2011-03-14 21:14 UTC (permalink / raw)
To: netdev
The reverse path filter interferes with IPsec subnet-to-subnet tunnels,
especially when the link to the IPsec peer is on an interface other than
the one hosting the default route.
With dynamic routing, where the peer might be reachable through eth0
today and eth1 tomorrow, it's difficult to keep rp_filter enabled unless
fake routes to the remote subnets are configured on the interface
currently used to reach the peer.
IPsec provides a much stronger anti-spoofing policy than rp_filter, so
this patch disables the rp_filter for packets with a security path.
Signed-off-by: Michael Smith <msmith@cbnco.com>
---
include/net/ip_fib.h | 2 +-
net/ipv4/fib_frontend.c | 4 ++--
net/ipv4/route.c | 26 ++++++++++++++++++--------
3 files changed, 21 insertions(+), 11 deletions(-)
I have a feeling the approach of checking for skb->sp != NULL is pretty
naive, but it seems to work. Patch is against Linus' tree.
diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
index 07bdb5e..61d5365 100644
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -217,7 +217,7 @@ extern const struct nla_policy rtm_ipv4_policy[];
extern void ip_fib_init(void);
extern int fib_validate_source(__be32 src, __be32 dst, u8 tos, int oif,
struct net_device *dev, __be32 *spec_dst,
- u32 *itag, u32 mark);
+ u32 *itag, u32 mark, int norpf);
extern void fib_select_default(struct net *net, const struct flowi *flp,
struct fib_result *res);
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index 1d2cdd4..ae889cc 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -245,7 +245,7 @@ EXPORT_SYMBOL(inet_dev_addr_type);
*/
int fib_validate_source(__be32 src, __be32 dst, u8 tos, int oif,
struct net_device *dev, __be32 *spec_dst,
- u32 *itag, u32 mark)
+ u32 *itag, u32 mark, int norpf)
{
struct in_device *in_dev;
struct flowi fl = {
@@ -265,7 +265,7 @@ int fib_validate_source(__be32 src, __be32 dst, u8 tos, int oif,
in_dev = __in_dev_get_rcu(dev);
if (in_dev) {
no_addr = in_dev->ifa_list == NULL;
- rpf = IN_DEV_RPFILTER(in_dev);
+ rpf = norpf ? 0 : IN_DEV_RPFILTER(in_dev);
accept_local = IN_DEV_ACCEPT_LOCAL(in_dev);
if (mark && !IN_DEV_SRC_VMARK(in_dev))
fl.mark = 0;
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 6ed6603..3ada54c 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1870,7 +1870,7 @@ static int ip_route_input_mc(struct sk_buff *skb, __be32 daddr, __be32 saddr,
spec_dst = inet_select_addr(dev, 0, RT_SCOPE_LINK);
} else {
err = fib_validate_source(saddr, 0, tos, 0, dev, &spec_dst,
- &itag, 0);
+ &itag, 0, 0);
if (err < 0)
goto e_err;
}
@@ -1962,7 +1962,7 @@ static int __mkroute_input(struct sk_buff *skb,
struct fib_result *res,
struct in_device *in_dev,
__be32 daddr, __be32 saddr, u32 tos,
- struct rtable **result)
+ struct rtable **result, int norpf)
{
struct rtable *rth;
int err;
@@ -1982,7 +1982,8 @@ static int __mkroute_input(struct sk_buff *skb,
err = fib_validate_source(saddr, daddr, tos, FIB_RES_OIF(*res),
- in_dev->dev, &spec_dst, &itag, skb->mark);
+ in_dev->dev, &spec_dst, &itag, skb->mark,
+ norpf);
if (err < 0) {
ip_handle_martian_source(in_dev->dev, in_dev, skb, daddr,
saddr);
@@ -2059,7 +2060,7 @@ static int ip_mkroute_input(struct sk_buff *skb,
struct fib_result *res,
const struct flowi *fl,
struct in_device *in_dev,
- __be32 daddr, __be32 saddr, u32 tos)
+ __be32 daddr, __be32 saddr, u32 tos, int norpf)
{
struct rtable* rth = NULL;
int err;
@@ -2071,7 +2072,7 @@ static int ip_mkroute_input(struct sk_buff *skb,
#endif
/* create a routing cache entry */
- err = __mkroute_input(skb, res, in_dev, daddr, saddr, tos, &rth);
+ err = __mkroute_input(skb, res, in_dev, daddr, saddr, tos, &rth, norpf);
if (err)
return err;
@@ -2111,6 +2112,13 @@ static int ip_route_input_slow(struct sk_buff *skb, __be32 daddr, __be32 saddr,
int err = -EINVAL;
struct net * net = dev_net(dev);
+#ifdef CONFIG_XFRM
+ /* Disable the reverse path filter for IPsec-protected packets. */
+ int norpf = skb->sp != NULL;
+#else
+ int norpf = 0;
+#endif
+
/* IP on this device is disabled. */
if (!in_dev)
@@ -2154,7 +2162,8 @@ static int ip_route_input_slow(struct sk_buff *skb, __be32 daddr, __be32 saddr,
if (res.type == RTN_LOCAL) {
err = fib_validate_source(saddr, daddr, tos,
net->loopback_dev->ifindex,
- dev, &spec_dst, &itag, skb->mark);
+ dev, &spec_dst, &itag, skb->mark,
+ norpf);
if (err < 0)
goto martian_source_keep_err;
if (err)
@@ -2168,7 +2177,8 @@ static int ip_route_input_slow(struct sk_buff *skb, __be32 daddr, __be32 saddr,
if (res.type != RTN_UNICAST)
goto martian_destination;
- err = ip_mkroute_input(skb, &res, &fl, in_dev, daddr, saddr, tos);
+ err = ip_mkroute_input(skb, &res, &fl, in_dev, daddr, saddr, tos,
+ norpf);
out: return err;
brd_input:
@@ -2179,7 +2189,7 @@ brd_input:
spec_dst = inet_select_addr(dev, 0, RT_SCOPE_LINK);
else {
err = fib_validate_source(saddr, 0, tos, 0, dev, &spec_dst,
- &itag, skb->mark);
+ &itag, skb->mark, norpf);
if (err < 0)
goto martian_source_keep_err;
if (err)
--
1.6.3.2
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] Disable rp_filter for IPsec packets
2011-03-14 21:14 [PATCH] Disable rp_filter for IPsec packets Michael Smith
@ 2011-03-14 21:25 ` David Miller
2011-03-14 21:29 ` Michael Smith
0 siblings, 1 reply; 10+ messages in thread
From: David Miller @ 2011-03-14 21:25 UTC (permalink / raw)
To: msmith; +Cc: netdev
From: Michael Smith <msmith@cbnco.com>
Date: Mon, 14 Mar 2011 17:14:59 -0400
> The reverse path filter interferes with IPsec subnet-to-subnet tunnels,
> especially when the link to the IPsec peer is on an interface other than
> the one hosting the default route.
>
> With dynamic routing, where the peer might be reachable through eth0
> today and eth1 tomorrow, it's difficult to keep rp_filter enabled unless
> fake routes to the remote subnets are configured on the interface
> currently used to reach the peer.
>
> IPsec provides a much stronger anti-spoofing policy than rp_filter, so
> this patch disables the rp_filter for packets with a security path.
>
> Signed-off-by: Michael Smith <msmith@cbnco.com>
First, I'm only willing to accept a patch like this to net-next-2.6
for which all of the code you are changing is radically different.
Secondly, fib_validate_source() already takes too many damn arguments.
Find another, less costly, way to pass this information down there.
Frankly, I think RPF should be disabled completely by default. When
it doesn't do anything useful, it's making route lookups twice as
expensive as they need to be.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Disable rp_filter for IPsec packets
2011-03-14 21:25 ` David Miller
@ 2011-03-14 21:29 ` Michael Smith
2011-03-14 21:41 ` David Miller
0 siblings, 1 reply; 10+ messages in thread
From: Michael Smith @ 2011-03-14 21:29 UTC (permalink / raw)
To: netdev
David Miller wrote:
> First, I'm only willing to accept a patch like this to net-next-2.6
> for which all of the code you are changing is radically different.
OK.
> Secondly, fib_validate_source() already takes too many damn arguments.
> Find another, less costly, way to pass this information down there.
What would be a less costly way to pass it? Could I just hand it the
whole skb?
> Frankly, I think RPF should be disabled completely by default. When
> it doesn't do anything useful, it's making route lookups twice as
> expensive as they need to be.
Yeah, it's disabled by default. It's an easy way of preventing spoofing
of internal source addresses from the Internet, so I like it.
Thanks,
Mike
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Disable rp_filter for IPsec packets
2011-03-14 21:29 ` Michael Smith
@ 2011-03-14 21:41 ` David Miller
2011-03-14 22:11 ` Michael Smith
0 siblings, 1 reply; 10+ messages in thread
From: David Miller @ 2011-03-14 21:41 UTC (permalink / raw)
To: msmith; +Cc: netdev
From: Michael Smith <msmith@cbnco.com>
Date: Mon, 14 Mar 2011 17:29:43 -0400
> David Miller wrote:
>> Secondly, fib_validate_source() already takes too many damn arguments.
>> Find another, less costly, way to pass this information down there.
>
> What would be a less costly way to pass it? Could I just hand it the
> whole skb?
I don't see how passing a pointer is better than passing an interger.
In both cases you're adding an extra argument to the function.
I was trying to get you to think out of the box and come up with
something clever, but that isn't working. :-)
>> Frankly, I think RPF should be disabled completely by default. When
>> it doesn't do anything useful, it's making route lookups twice as
>> expensive as they need to be.
>
> Yeah, it's disabled by default. It's an easy way of preventing
> spoofing of internal source addresses from the Internet, so I like it.
It is not "disabled by default". fib_validate_source() still does a
limited validation of the reverse path, even with the sysctl is set to
zero.
I want it to do absolutely nothing, and instead just use inet_select_addr()
to calculate spec_dst.
Even the spec_dst calculation is spurious, necessary only in limited
situations, and even in that case only takes on special values for
multicast and broadcast addresses.
In short, fib_validate_source() is nothing but completely unnecessary
overhead in the common case.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Disable rp_filter for IPsec packets
2011-03-14 21:41 ` David Miller
@ 2011-03-14 22:11 ` Michael Smith
2011-03-14 22:14 ` David Miller
0 siblings, 1 reply; 10+ messages in thread
From: Michael Smith @ 2011-03-14 22:11 UTC (permalink / raw)
To: netdev
David Miller wrote:
>> What would be a less costly way to pass it? Could I just hand it the
>> whole skb?
>
> I don't see how passing a pointer is better than passing an interger.
> In both cases you're adding an extra argument to the function.
Yeah, I was thinking an sk_buff could replace the mark parameter,
possibly dev, maybe saddr, daddr, and tos too. On the other hand I can't
think of anything less onerous than an extra stack argument - unless
fib_validate_source() didn't exist at all.
> I was trying to get you to think out of the box and come up with
> something clever, but that isn't working. :-)
Yes, I got that, but I don't know what you are looking for, and don't
expect to succeed by trying something else at random.
> In short, fib_validate_source() is nothing but completely unnecessary
> overhead in the common case.
I'm not entitled to an opinion about fib_validate_source(). It feels
like it might be trying to do one too many things. If it were my code I
might split the RPF out from the spec_dst calculation, move the whole
lot into net/ipv4/route.c, and only do the fib_lookup() if RPF is
enabled or CONFIG_IP_ROUTE_CLASSID (i.e. we need to know the itag).
If that makes sense I'll give it a shot, but beware, I don't even know
what an itag is, and I don't see documentation for CONFIG_IP_ROUTE_CLASSID.
Thanks,
Mike
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Disable rp_filter for IPsec packets
2011-03-14 22:11 ` Michael Smith
@ 2011-03-14 22:14 ` David Miller
2011-03-14 22:23 ` Michael Smith
0 siblings, 1 reply; 10+ messages in thread
From: David Miller @ 2011-03-14 22:14 UTC (permalink / raw)
To: msmith; +Cc: netdev
From: Michael Smith <msmith@cbnco.com>
Date: Mon, 14 Mar 2011 18:11:24 -0400
> David Miller wrote:
>> I was trying to get you to think out of the box and come up with
>> something clever, but that isn't working. :-)
>
> Yes, I got that, but I don't know what you are looking for, and don't
> expect to succeed by trying something else at random.
Existing arguments might be large enough to carry more than one piece
of information :-)
>> In short, fib_validate_source() is nothing but completely unnecessary
>> overhead in the common case.
>
> I'm not entitled to an opinion about fib_validate_source(). It feels
> like it might be trying to do one too many things. If it were my code
> I might split the RPF out from the spec_dst calculation, move the
> whole lot into net/ipv4/route.c, and only do the fib_lookup() if RPF
> is enabled or CONFIG_IP_ROUTE_CLASSID (i.e. we need to know the itag).
Can't split two two things up, because spec_dst is a product of the
reverse FIB lookup, in the form of FIB_RES_PREFSRC().
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Disable rp_filter for IPsec packets
2011-03-14 22:14 ` David Miller
@ 2011-03-14 22:23 ` Michael Smith
2011-03-14 22:27 ` David Miller
0 siblings, 1 reply; 10+ messages in thread
From: Michael Smith @ 2011-03-14 22:23 UTC (permalink / raw)
To: netdev
David Miller wrote:
> Existing arguments might be large enough to carry more than one piece
> of information :-)
If it's encoded into another argument, would there be more overhead from
bit-shifting it out than you'd save by losing an argument?
>> I might split the RPF out from the spec_dst calculation, move the
>> whole lot into net/ipv4/route.c, and only do the fib_lookup() if RPF
>> is enabled or CONFIG_IP_ROUTE_CLASSID (i.e. we need to know the itag).
>
> Can't split two two things up, because spec_dst is a product of the
> reverse FIB lookup, in the form of FIB_RES_PREFSRC().
True, but the result of the reverse FIB lookup could be passed around.
Mike
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Disable rp_filter for IPsec packets
2011-03-14 22:23 ` Michael Smith
@ 2011-03-14 22:27 ` David Miller
2011-03-15 23:21 ` Michael Smith
0 siblings, 1 reply; 10+ messages in thread
From: David Miller @ 2011-03-14 22:27 UTC (permalink / raw)
To: msmith; +Cc: netdev
From: Michael Smith <msmith@cbnco.com>
Date: Mon, 14 Mar 2011 18:23:20 -0400
> David Miller wrote:
>
>> Existing arguments might be large enough to carry more than one piece
>> of information :-)
>
> If it's encoded into another argument, would there be more overhead
> from bit-shifting it out than you'd save by losing an argument?
It sure will if it's the different between the argument being passed
in a register vs. on the stack.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Disable rp_filter for IPsec packets
2011-03-14 22:27 ` David Miller
@ 2011-03-15 23:21 ` Michael Smith
2011-03-15 23:35 ` David Miller
0 siblings, 1 reply; 10+ messages in thread
From: Michael Smith @ 2011-03-15 23:21 UTC (permalink / raw)
To: David Miller; +Cc: netdev
On Mon, 14 Mar 2011, David Miller wrote:
> > David Miller wrote:
> >
> >> Existing arguments might be large enough to carry more than one piece
> >> of information :-)
> >
> > If it's encoded into another argument, would there be more overhead
> > from bit-shifting it out than you'd save by losing an argument?
>
> It sure will if it's the different between the argument being passed
> in a register vs. on the stack.
I have a patch to replace u32 mark with an sk_buff. The mark is in the
sk_buff already, and so is the secpath field I need. Would that be
acceptable? I can hold off until the merge window is over.
-extern int fib_validate_source(__be32 src, __be32 dst, u8 tos, int oif,
- struct net_device *dev, __be32 *spec_dst,
- u32 *itag, u32 mark);
+extern int fib_validate_source(struct sk_buff *skb, __be32 src, __be32 dst,
+ u8 tos, int oif, struct net_device *dev,
+ __be32 *spec_dst, u32 *itag);
(dev is also usually the same as skb->dev, but not always...)
Mike
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Disable rp_filter for IPsec packets
2011-03-15 23:21 ` Michael Smith
@ 2011-03-15 23:35 ` David Miller
0 siblings, 0 replies; 10+ messages in thread
From: David Miller @ 2011-03-15 23:35 UTC (permalink / raw)
To: msmith; +Cc: netdev
From: Michael Smith <msmith@cbnco.com>
Date: Tue, 15 Mar 2011 19:21:29 -0400 (EDT)
> On Mon, 14 Mar 2011, David Miller wrote:
>
>> > David Miller wrote:
>> >
>> >> Existing arguments might be large enough to carry more than one piece
>> >> of information :-)
>> >
>> > If it's encoded into another argument, would there be more overhead
>> > from bit-shifting it out than you'd save by losing an argument?
>>
>> It sure will if it's the different between the argument being passed
>> in a register vs. on the stack.
>
> I have a patch to replace u32 mark with an sk_buff. The mark is in the
> sk_buff already, and so is the secpath field I need. Would that be
> acceptable? I can hold off until the merge window is over.
Sounds good, and yes please wait until after the merge window.
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2011-03-15 23:35 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-03-14 21:14 [PATCH] Disable rp_filter for IPsec packets Michael Smith
2011-03-14 21:25 ` David Miller
2011-03-14 21:29 ` Michael Smith
2011-03-14 21:41 ` David Miller
2011-03-14 22:11 ` Michael Smith
2011-03-14 22:14 ` David Miller
2011-03-14 22:23 ` Michael Smith
2011-03-14 22:27 ` David Miller
2011-03-15 23:21 ` Michael Smith
2011-03-15 23:35 ` David Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).