* [RFC PATCH net-next] ip6: Do not expire uncached routes for mtu invalidation
@ 2014-09-08 8:34 Alex Gartrell
2014-09-08 10:30 ` Eric Dumazet
0 siblings, 1 reply; 4+ messages in thread
From: Alex Gartrell @ 2014-09-08 8:34 UTC (permalink / raw)
To: davem; +Cc: edumazet, netdev, kernel-team, ps, Alex Gartrell
This patch does two things: first, it won't introduce RTF_EXPIRES to
rt6i_flags unless it already exists or RTF_CACHE is set; second, in
ip6_pol_route, we'll check for expiration even without the RTF_EXPIRES bit
and, if dst.expires is set and has passed, zero out the pmtu so that we'll
fall back to the device mtu.
This fixes an issue where we were deleting local, uncached dst routes.
This would result in packets being rejected after mtu expiration.
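The lookup-path half of the change can be modeled in userspace roughly as follows. This is only a sketch: struct model_route and the model_* helpers are hypothetical stand-ins for the rt6_info/dst_entry fields and kernel helpers the patch touches, and the fallback to the device MTU is simplified to a single field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values as defined in the kernel's UAPI route headers. */
#define RTF_MODIFIED 0x0020u
#define RTF_EXPIRES  0x00400000u

/* Hypothetical stand-in for the few route fields used here. */
struct model_route {
	uint32_t flags;        /* models rt6i_flags */
	unsigned long expires; /* models dst.expires; 0 means "not set" */
	unsigned int pmtu;     /* models the RTAX_MTU metric; 0 means "unset" */
	unsigned int dev_mtu;  /* MTU of the output device */
};

/* Wraparound-safe "a is later than b", the same idea as time_after(). */
static bool model_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Drop a stale PMTU but keep the route itself. */
static void model_reset_stale_pmtu(struct model_route *rt, unsigned long now)
{
	assert(rt != NULL);
	if (!(rt->flags & RTF_EXPIRES) && rt->expires &&
	    model_time_after(now, rt->expires)) {
		rt->pmtu = 0;               /* models dst_metric_set(..., RTAX_MTU, 0) */
		rt->expires = 0;            /* models rt6_clean_expires() */
		rt->flags &= ~RTF_MODIFIED;
	}
}

/* An unset PMTU falls back to the device MTU. */
static unsigned int model_mtu(const struct model_route *rt)
{
	return rt->pmtu ? rt->pmtu : rt->dev_mtu;
}
```

With a route whose expires is 100 and pmtu is 1280, calling model_reset_stale_pmtu() at time 50 leaves the PMTU in place, while calling it at time 101 clears it so that model_mtu() reports the device MTU again.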
Here's a repro of the problem.
ip addr add dev lo face::1/128
grep ^face0000000000000000000000000001 /proc/net/ipv6_route
# The flags do not have RTF_MODIFIED | RTF_EXPIRES
ipvsadm -A -t 8.8.8.8:15213 # service not supported on first try
ipvsadm -A -t [face::1]:15213 -s rr > /dev/null
ipvsadm -a -t [face::1]:15213 -r 2401:db00:20:7017:face:0:13:0 --ipip > /dev/null
timeout 3 nc face::1 15213
grep ^face0000000000000000000000000001 /proc/net/ipv6_route
# The flags will not include RTF_MODIFIED | RTF_EXPIRES
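For reading the flags column that the grep above prints, a small decoder along these lines may help. It is a sketch under assumptions: the column order (dest, dest prefix length, source, source prefix length, next hop, metric, refcnt, use, flags, device) is assumed, and parse_route_flags() and has_pmtu_expiry() are hypothetical helper names, not existing tools.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Flag values as defined in the kernel's UAPI route headers. */
#define RTF_MODIFIED 0x0020u
#define RTF_EXPIRES  0x00400000u

/*
 * Extract the flags column (assumed to be the ninth field) from one
 * /proc/net/ipv6_route line. Returns 0 on success, -1 on a parse failure.
 */
static int parse_route_flags(const char *line, uint32_t *flags)
{
	unsigned int f;

	if (!line || !flags)
		return -1;
	/* Skip eight whitespace-separated fields, then read the hex flags. */
	if (sscanf(line, "%*s %*s %*s %*s %*s %*s %*s %*s %x", &f) != 1)
		return -1;
	*flags = f;
	return 0;
}

/* True when both bits discussed in the repro are set. */
static bool has_pmtu_expiry(uint32_t flags)
{
	return (flags & (RTF_MODIFIED | RTF_EXPIRES)) ==
	       (RTF_MODIFIED | RTF_EXPIRES);
}
```

Feeding it each line from the two grep invocations and comparing has_pmtu_expiry() before and after the nc run gives the same check as reading the hex value by eye.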
Signed-off-by: Alex Gartrell <agartrell@fb.com>
---
include/net/ip6_fib.h | 3 ++-
net/ipv6/route.c | 13 ++++++++++++-
2 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index 9bcb220..2f0d4d0 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -184,7 +184,8 @@ static inline void rt6_update_expires(struct rt6_info *rt0, int timeout)
 		rt0->dst.expires = rt->dst.expires;
 
 	dst_set_expires(&rt0->dst, timeout);
-	rt0->rt6i_flags |= RTF_EXPIRES;
+	if (rt0->rt6i_flags & (RTF_CACHE | RTF_EXPIRES))
+		rt0->rt6i_flags |= RTF_EXPIRES;
 }
 
 static inline void rt6_set_from(struct rt6_info *rt, struct rt6_info *from)
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index f74b041..a509a06 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -947,8 +947,19 @@ restart:
 		nrt = rt6_alloc_cow(rt, &fl6->daddr, &fl6->saddr);
 	else if (!(rt->dst.flags & DST_HOST))
 		nrt = rt6_alloc_clone(rt, &fl6->daddr);
-	else
+	else {
+		if (!(rt->rt6i_flags & RTF_EXPIRES) && rt->dst.expires &&
+		    time_after(jiffies, rt->dst.expires)) {
+			/* Uncached routes may have expires set if we
+			 * intend to expire the MTU but not the dest
+			 * itself. In that case, we should reset the mtu
+			 * before handing it back */
+			dst_metric_set(&rt->dst, RTAX_MTU, 0);
+			rt6_clean_expires(rt);
+			rt->rt6i_flags &= ~RTF_MODIFIED;
+		}
 		goto out2;
+	}
 
 	ip6_rt_put(rt);
 	rt = nrt ? : net->ipv6.ip6_null_entry;
--
1.8.1
* Re: [RFC PATCH net-next] ip6: Do not expire uncached routes for mtu invalidation
2014-09-08 8:34 [RFC PATCH net-next] ip6: Do not expire uncached routes for mtu invalidation Alex Gartrell
@ 2014-09-08 10:30 ` Eric Dumazet
2014-09-08 17:12 ` Alex Gartrell
0 siblings, 1 reply; 4+ messages in thread
From: Eric Dumazet @ 2014-09-08 10:30 UTC (permalink / raw)
To: Alex Gartrell; +Cc: davem, edumazet, netdev, kernel-team, ps
On Mon, 2014-09-08 at 01:34 -0700, Alex Gartrell wrote:
> This patch does two things: first, it won't introduce RTF_EXPIRES to
> rt6i_flags unless it already exists or RTF_CACHE is set; second, in
> ip6_pol_route, we'll check for expiration even without the RTF_EXPIRES bit
> and, if dst.expires is set and has passed, zero out the pmtu so that we'll
> fall back to the device mtu.
>
> This fixes an issue where we were deleting local, uncached dst routes.
> This would result in packets being rejected after mtu expiration.
>
> Here's a repro of the problem.
>
> ip addr add dev lo face::1/128
> grep ^face0000000000000000000000000001 /proc/net/ipv6_route
> # The flags do not have RTF_MODIFIED | RTF_EXPIRES
>
> ipvsadm -A -t 8.8.8.8:15213 # service not supported on first try
> ipvsadm -A -t [face::1]:15213 -s rr > /dev/null
> ipvsadm -a -t [face::1]:15213 -r 2401:db00:20:7017:face:0:13:0 --ipip > /dev/null
>
> timeout 3 nc face::1 15213
>
> grep ^face0000000000000000000000000001 /proc/net/ipv6_route
> # The flags will not include RTF_MODIFIED | RTF_EXPIRES
>
> Signed-off-by: Alex Gartrell <agartrell@fb.com>
> ---
> include/net/ip6_fib.h | 3 ++-
> net/ipv6/route.c | 13 ++++++++++++-
> 2 files changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
> index 9bcb220..2f0d4d0 100644
> --- a/include/net/ip6_fib.h
> +++ b/include/net/ip6_fib.h
> @@ -184,7 +184,8 @@ static inline void rt6_update_expires(struct rt6_info *rt0, int timeout)
> rt0->dst.expires = rt->dst.expires;
>
> dst_set_expires(&rt0->dst, timeout);
> - rt0->rt6i_flags |= RTF_EXPIRES;
> + if (rt0->rt6i_flags & (RTF_CACHE | RTF_EXPIRES))
> + rt0->rt6i_flags |= RTF_EXPIRES;
This looks wrong. What could be the point of setting RTF_EXPIRES if it's
already set?
> }
>
> static inline void rt6_set_from(struct rt6_info *rt, struct rt6_info *from)
> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> index f74b041..a509a06 100644
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -947,8 +947,19 @@ restart:
> nrt = rt6_alloc_cow(rt, &fl6->daddr, &fl6->saddr);
> else if (!(rt->dst.flags & DST_HOST))
> nrt = rt6_alloc_clone(rt, &fl6->daddr);
> - else
> + else {
> + if (!(rt->rt6i_flags & RTF_EXPIRES) && rt->dst.expires &&
> + time_after(jiffies, rt->dst.expires)) {
> + /* Uncached routes may have expires set if we
> + * intend to expire the MTU but not the dest
> + * itself. In that case, we should reset the mtu
> + * before handing it back */
> + dst_metric_set(&rt->dst, RTAX_MTU, 0);
> + rt6_clean_expires(rt);
> + rt->rt6i_flags &= ~RTF_MODIFIED;
Many CPUs can perform this at the same time on the same route; this looks
racy.
> + }
> goto out2;
> + }
>
> ip6_rt_put(rt);
> rt = nrt ? : net->ipv6.ip6_null_entry;
* Re: [RFC PATCH net-next] ip6: Do not expire uncached routes for mtu invalidation
2014-09-08 10:30 ` Eric Dumazet
@ 2014-09-08 17:12 ` Alex Gartrell
2014-09-08 17:20 ` Eric Dumazet
0 siblings, 1 reply; 4+ messages in thread
From: Alex Gartrell @ 2014-09-08 17:12 UTC (permalink / raw)
To: Eric Dumazet; +Cc: davem, edumazet, netdev, kernel-team, ps
Thank you for taking a look, Eric.
I'll admit that I have a distinct lack of confidence that I've got the
right solution to the problem here, but I've made it about as far as I
can without getting your collective comments, so it's much appreciated.
On 9/8/14 3:30 AM, Eric Dumazet wrote:
> On Mon, 2014-09-08 at 01:34 -0700, Alex Gartrell wrote:
>> diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
>> index 9bcb220..2f0d4d0 100644
>> --- a/include/net/ip6_fib.h
>> +++ b/include/net/ip6_fib.h
>> @@ -184,7 +184,8 @@ static inline void rt6_update_expires(struct rt6_info *rt0, int timeout)
>> rt0->dst.expires = rt->dst.expires;
>>
>> dst_set_expires(&rt0->dst, timeout);
>> - rt0->rt6i_flags |= RTF_EXPIRES;
>> + if (rt0->rt6i_flags & (RTF_CACHE | RTF_EXPIRES))
>> + rt0->rt6i_flags |= RTF_EXPIRES;
>
> This looks wrong. What could be the point of setting RTF_EXPIRES if it's
> already set?
>
This is a good point. It was clearer to me at the time to include it
(more similar to the old implementation which set the bit
unconditionally), but I don't really care.
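To make the point about the redundant test concrete, here is a userspace sketch comparing the condition as posted with a version that tests RTF_CACHE only. The function names are hypothetical and the flags are passed by value rather than read from an rt6_info; for every combination of the two bits both variants produce the same result.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values as defined in the kernel's UAPI route headers. */
#define RTF_EXPIRES 0x00400000u
#define RTF_CACHE   0x01000000u

/* Condition as posted in the patch. */
static uint32_t set_expires_as_posted(uint32_t flags)
{
	if (flags & (RTF_CACHE | RTF_EXPIRES))
		flags |= RTF_EXPIRES;
	return flags;
}

/* Simplified condition: only cached routes gain RTF_EXPIRES. */
static uint32_t set_expires_simplified(uint32_t flags)
{
	if (flags & RTF_CACHE)
		flags |= RTF_EXPIRES;
	return flags;
}

/* Walk all four combinations of the two bits and compare the results. */
static void check_equivalence(void)
{
	const uint32_t combos[] = {
		0, RTF_CACHE, RTF_EXPIRES, RTF_CACHE | RTF_EXPIRES,
	};

	for (unsigned int i = 0; i < sizeof(combos) / sizeof(combos[0]); i++)
		assert(set_expires_as_posted(combos[i]) ==
		       set_expires_simplified(combos[i]));
}
```

The only input where the extra RTF_EXPIRES test fires on its own is one where the bit is already set, so ORing it in again changes nothing.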
>> }
>>
>> static inline void rt6_set_from(struct rt6_info *rt, struct rt6_info *from)
>> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
>> index f74b041..a509a06 100644
>> --- a/net/ipv6/route.c
>> +++ b/net/ipv6/route.c
>> @@ -947,8 +947,19 @@ restart:
>> nrt = rt6_alloc_cow(rt, &fl6->daddr, &fl6->saddr);
>> else if (!(rt->dst.flags & DST_HOST))
>> nrt = rt6_alloc_clone(rt, &fl6->daddr);
>> - else
>> + else {
>> + if (!(rt->rt6i_flags & RTF_EXPIRES) && rt->dst.expires &&
>> + time_after(jiffies, rt->dst.expires)) {
>> + /* Uncached routes may have expires set if we
>> + * intend to expire the MTU but not the dest
>> + * itself. In that case, we should reset the mtu
>> + * before handing it back */
>> + dst_metric_set(&rt->dst, RTAX_MTU, 0);
>> + rt6_clean_expires(rt);
>> + rt->rt6i_flags &= ~RTF_MODIFIED;
>
> Many CPUs can perform this at the same time on the same route; this looks
> racy.
Initially I was just going to agree with you here, but taking another
look at ip_vs_xmit at least, there doesn't appear to be any special
locking before invoking ->update_pmtu, which is playing with rt6i_flags
and dst.expires as well. Is that racy as well or is there something
else I'm missing here?
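For reference, the kind of lost update being discussed can be replayed deterministically in userspace. This is only an illustration with hypothetical names: it assumes two CPUs each do a plain load, modify and store on the same flags word, and it says nothing about whether the real paths can actually interleave this way.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values as defined in the kernel's UAPI route headers. */
#define RTF_MODIFIED 0x0020u
#define RTF_EXPIRES  0x00400000u

/* Both updates applied one after the other, with no overlap. */
static uint32_t apply_serially(uint32_t shared)
{
	shared &= ~RTF_MODIFIED;   /* lookup path clears MODIFIED */
	shared |= RTF_EXPIRES;     /* PMTU update sets EXPIRES */
	return shared;
}

/* One bad interleaving of the same two read-modify-write updates. */
static uint32_t replay_lost_update(uint32_t shared)
{
	uint32_t cpu_a = shared;   /* CPU A loads the flags */
	uint32_t cpu_b = shared;   /* CPU B loads the same value */

	cpu_a &= ~RTF_MODIFIED;    /* A: clear MODIFIED in its private copy */
	cpu_b |= RTF_EXPIRES;      /* B: set EXPIRES in its private copy */

	shared = cpu_a;            /* A stores first */
	shared = cpu_b;            /* B stores last, overwriting A's clear */
	return shared;
}
```

Starting from RTF_MODIFIED alone, the serial order ends with only RTF_EXPIRES set, while the replayed interleaving ends with both bits set because the clear is overwritten.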
There are other ways to skin this particular cat though, and I've got no
specific attachment to any of them. The most logical thing to do IMO is
clone the route when it may be necessary to do so, but given the fact
that that was very deliberately undone in 7343ff3 "ipv6: Don't create
clones of host routes," I'm not sure that it's the right thing to do or
that it won't require major surgery.
Thanks again,
--
Alex Gartrell <agartrell@fb.com>
* Re: [RFC PATCH net-next] ip6: Do not expire uncached routes for mtu invalidation
2014-09-08 17:12 ` Alex Gartrell
@ 2014-09-08 17:20 ` Eric Dumazet
0 siblings, 0 replies; 4+ messages in thread
From: Eric Dumazet @ 2014-09-08 17:20 UTC (permalink / raw)
To: Alex Gartrell; +Cc: davem, edumazet, netdev, kernel-team, ps
On Mon, 2014-09-08 at 10:12 -0700, Alex Gartrell wrote:
> Thank you for taking a look, Eric.
>
> I'll admit that I have a distinct lack of confidence that I've got the
> right solution to the problem here, but I've made it about as far as I
> can without getting your collective comments, so it's much appreciated.
..
> >>
> >> static inline void rt6_set_from(struct rt6_info *rt, struct rt6_info *from)
> >> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> >> index f74b041..a509a06 100644
> >> --- a/net/ipv6/route.c
> >> +++ b/net/ipv6/route.c
> >> @@ -947,8 +947,19 @@ restart:
> >> nrt = rt6_alloc_cow(rt, &fl6->daddr, &fl6->saddr);
> >> else if (!(rt->dst.flags & DST_HOST))
> >> nrt = rt6_alloc_clone(rt, &fl6->daddr);
> >> - else
> >> + else {
> >> + if (!(rt->rt6i_flags & RTF_EXPIRES) && rt->dst.expires &&
> >> + time_after(jiffies, rt->dst.expires)) {
> >> + /* Uncached routes may have expires set if we
> >> + * intend to expire the MTU but not the dest
> >> + * itself. In that case, we should reset the mtu
> >> + * before handing it back */
> >> + dst_metric_set(&rt->dst, RTAX_MTU, 0);
> >> + rt6_clean_expires(rt);
> >> + rt->rt6i_flags &= ~RTF_MODIFIED;
> >
> > Many CPUs can perform this at the same time on the same route; this looks
> > racy.
>
> Initially I was just going to agree with you here, but taking another
> look at ip_vs_xmit at least, there doesn't appear to be any special
> locking before invoking ->update_pmtu, which is playing with rt6i_flags
> and dst.expires as well. Is that racy as well or is there something
> else I'm missing here?
>
> There are other ways to skin this particular cat though, and I've got no
> specific attachment to any of them. The most logical thing to do IMO is
> clone the route when it may be necessary to do so, but given the fact
> that that was very deliberately undone in 7343ff3 "ipv6: Don't create
> clones of host routes," I'm not sure that it's the right thing to do or
> that it won't require major surgery.
Have you followed the thread started yesterday?
https://patchwork.ozlabs.org/patch/386739/
Reverting 7343ff3 "ipv6: Don't create clones of host routes" was
considered, as a matter of fact, when I replied:
"This means we have to clone all routes."