* [stable 2.6.32.y PATCH 0/6] net: fixes for cached dsts are never invalidated
@ 2012-10-19 19:13 Benjamin LaHaise
2012-10-19 19:21 ` [PATCH 1/6] ipv4: check rt_genid in dst_check Benjamin LaHaise
` (6 more replies)
0 siblings, 7 replies; 19+ messages in thread
From: Benjamin LaHaise @ 2012-10-19 19:13 UTC (permalink / raw)
To: Willy Tarreau; +Cc: David Miller, stable, netdev
This is v2 of an attempt to pull in the relevant fixes for a problem in
v2.6.32 kernels where invalid cached routes are retained even after changes
to the routing table have been made. A simple test case can be found at
http://marc.info/?l=linux-netdev&m=135015076708950&w=2 . Based on feedback
from David Miller, additional changes have been pulled in, including fixes
for the same issue in IPv6. Most of the patches required some rework owing
to the large differences in the networking stack between 2.6.32 and 3.6.
I have performed basic tests to confirm that the cases I was hitting are
now fixed, including a couple of tests with IPv4 and IPv6. Comments?
Thanks again to David for the pointers to the additional fixes required in
this area.
Benjamin LaHaise (6):
ipv4: check_rt_genid in dst_check
net: Document dst->obsolete better.
ipv6: use DST_* macro to set obselete field
netns: move net->ipv4.rt_genid to net->rt_genid
ipv6: use net->rt_genid to check dst validity
xfrm: invalidate dst on policy insertion/deletion
include/net/dst.h | 14 +++++++++++++-
include/net/ip6_fib.h | 4 +---
include/net/net_namespace.h | 12 ++++++++++++
include/net/netns/ipv4.h | 1 -
net/core/dst.c | 4 ++--
net/ipv4/route.c | 28 +++++++++++++++-------------
net/ipv6/inet6_connection_sock.c | 24 +-----------------------
net/ipv6/route.c | 35 ++++++++++++++++++++++++-----------
net/sctp/output.c | 2 +-
net/xfrm/xfrm_policy.c | 22 ++++++++++++----------
security/selinux/include/xfrm.h | 1 +
11 files changed, 82 insertions(+), 65 deletions(-)
--
"Thought is the essence of where you are now."
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 1/6] ipv4: check rt_genid in dst_check
2012-10-19 19:13 [stable 2.6.32.y PATCH 0/6] net: fixes for cached dsts are never invalidated Benjamin LaHaise
@ 2012-10-19 19:21 ` Benjamin LaHaise
2012-10-19 19:21 ` [PATCH 2/6] net: Document dst->obsolete better Benjamin LaHaise
` (5 subsequent siblings)
6 siblings, 0 replies; 19+ messages in thread
From: Benjamin LaHaise @ 2012-10-19 19:21 UTC (permalink / raw)
To: Willy Tarreau; +Cc: David Miller, stable, netdev
commit d11a4dc18bf41719c9f0d7ed494d295dd2973b92
Author: Timo Teräs <timo.teras@iki.fi>
Date: Thu Mar 18 23:20:20 2010 +0000
ipv4: check rt_genid in dst_check
Xfrm_dst keeps a reference to ipv4 rtable entries on each
cached bundle. The only way to renew xfrm_dst when the underlying
route has changed, is to implement dst_check for this. This is
what ipv6 side does too.
The problems started after 87c1e12b5eeb7b30b4b41291bef8e0b41fc3dde9
("ipsec: Fix bogus bundle flowi") which fixed a bug causing xfrm_dst
to not get reused, until that all lookups always generated new
xfrm_dst with new route reference and path mtu worked. But after the
fix, the old routes started to get reused even after they were expired
causing pmtu to break (well it would occationally work if the rtable
gc had run recently and marked the route obsolete causing dst_check to
get called).
Signed-off-by: Timo Teras <timo.teras@iki.fi>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit is based on the above, with the addition of verifying blackhole
routes in the same manner.
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
---
net/ipv4/route.c | 17 ++++++++++++-----
1 files changed, 12 insertions(+), 5 deletions(-)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 58f141b..f16d19b 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1412,7 +1412,7 @@ void ip_rt_redirect(__be32 old_gw, __be32 daddr, __be32 new_gw,
dev_hold(rt->u.dst.dev);
if (rt->idev)
in_dev_hold(rt->idev);
- rt->u.dst.obsolete = 0;
+ rt->u.dst.obsolete = -1;
rt->u.dst.lastuse = jiffies;
rt->u.dst.path = &rt->u.dst;
rt->u.dst.neighbour = NULL;
@@ -1477,7 +1477,7 @@ static struct dst_entry *ipv4_negative_advice(struct dst_entry *dst)
struct dst_entry *ret = dst;
if (rt) {
- if (dst->obsolete) {
+ if (dst->obsolete > 0) {
ip_rt_put(rt);
ret = NULL;
} else if ((rt->rt_flags & RTCF_REDIRECTED) ||
@@ -1700,7 +1700,9 @@ static void ip_rt_update_pmtu(struct dst_entry *dst, u32 mtu)
static struct dst_entry *ipv4_dst_check(struct dst_entry *dst, u32 cookie)
{
- return NULL;
+ if (rt_is_expired((struct rtable *)dst))
+ return NULL;
+ return dst;
}
static void ipv4_dst_destroy(struct dst_entry *dst)
@@ -1862,7 +1864,8 @@ static int ip_route_input_mc(struct sk_buff *skb, __be32 daddr, __be32 saddr,
if (!rth)
goto e_nobufs;
- rth->u.dst.output= ip_rt_bug;
+ rth->u.dst.output = ip_rt_bug;
+ rth->u.dst.obsolete = -1;
atomic_set(&rth->u.dst.__refcnt, 1);
rth->u.dst.flags= DST_HOST;
@@ -2023,6 +2026,7 @@ static int __mkroute_input(struct sk_buff *skb,
rth->fl.oif = 0;
rth->rt_spec_dst= spec_dst;
+ rth->u.dst.obsolete = -1;
rth->u.dst.input = ip_forward;
rth->u.dst.output = ip_output;
rth->rt_genid = rt_genid(dev_net(rth->u.dst.dev));
@@ -2187,6 +2191,7 @@ local_input:
goto e_nobufs;
rth->u.dst.output= ip_rt_bug;
+ rth->u.dst.obsolete = -1;
rth->rt_genid = rt_genid(net);
atomic_set(&rth->u.dst.__refcnt, 1);
@@ -2411,7 +2416,8 @@ static int __mkroute_output(struct rtable **result,
rth->rt_gateway = fl->fl4_dst;
rth->rt_spec_dst= fl->fl4_src;
- rth->u.dst.output=ip_output;
+ rth->u.dst.output = ip_output;
+ rth->u.dst.obsolete = -1;
rth->rt_genid = rt_genid(dev_net(dev_out));
RT_CACHE_STAT_INC(out_slow_tot);
@@ -2741,6 +2747,7 @@ static int ipv4_dst_blackhole(struct net *net, struct rtable **rp, struct flowi
if (rt) {
struct dst_entry *new = &rt->u.dst;
+ new->obsolete = -1;
atomic_set(&new->__refcnt, 1);
new->__use = 1;
new->input = dst_discard;
--
1.7.1
--
"Thought is the essence of where you are now."
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 2/6] net: Document dst->obsolete better.
2012-10-19 19:13 [stable 2.6.32.y PATCH 0/6] net: fixes for cached dsts are never invalidated Benjamin LaHaise
2012-10-19 19:21 ` [PATCH 1/6] ipv4: check rt_genid in dst_check Benjamin LaHaise
@ 2012-10-19 19:21 ` Benjamin LaHaise
2012-10-19 19:21 ` [PATCH 3/6] ipv6: use DST_* macro to set obselete field Benjamin LaHaise
` (4 subsequent siblings)
6 siblings, 0 replies; 19+ messages in thread
From: Benjamin LaHaise @ 2012-10-19 19:21 UTC (permalink / raw)
To: Willy Tarreau; +Cc: David Miller, stable, netdev
commit f5b0a8743601a4477419171f5046bd07d1c080a0
Author: David S. Miller <davem@davemloft.net>
Date: Thu Jul 19 12:31:33 2012 -0700
net: Document dst->obsolete better.
Add a big comment explaining how the field works, and use defines
instead of magic constants for the values assigned to it.
Suggested by Joe Perches.
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a slightly modified backport of the above commit to cover additional
locationsi of dst.u.obsolete = -1.
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
---
include/net/dst.h | 14 +++++++++++++-
net/core/dst.c | 4 ++--
net/ipv4/route.c | 12 ++++++------
net/sctp/output.c | 2 +-
net/xfrm/xfrm_policy.c | 21 +++++++++++----------
5 files changed, 33 insertions(+), 20 deletions(-)
diff --git a/include/net/dst.h b/include/net/dst.h
index 5a900dd..938b525 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -42,7 +42,19 @@ struct dst_entry
struct dst_entry *child;
struct net_device *dev;
short error;
+
+ /* A non-zero value of dst->obsolete forces by-hand validation
+ * of the route entry. Positive values are set by the generic
+ * dst layer to indicate that the entry has been forcefully
+ * destroyed.
+ *
+ * Negative values are used by the implementation layer code to
+ * force invocation of the dst_ops->check() method.
+ */
short obsolete;
+#define DST_OBSOLETE_NONE 0
+#define DST_OBSOLETE_DEAD 2
+#define DST_OBSOLETE_FORCE_CHK -1
int flags;
#define DST_HOST 1
#define DST_NOXFRM 2
@@ -200,7 +212,7 @@ extern struct dst_entry *dst_destroy(struct dst_entry * dst);
static inline void dst_free(struct dst_entry * dst)
{
- if (dst->obsolete > 1)
+ if (dst->obsolete > 0)
return;
if (!atomic_read(&dst->__refcnt)) {
dst = dst_destroy(dst);
diff --git a/net/core/dst.c b/net/core/dst.c
index cb1b348..5eb9929 100644
--- a/net/core/dst.c
+++ b/net/core/dst.c
@@ -99,7 +99,7 @@ loop:
* But we do not have state "obsoleted, but
* referenced by parent", so it is right.
*/
- if (dst->obsolete > 1)
+ if (dst->obsolete > 0)
continue;
___dst_free(dst);
@@ -193,7 +193,7 @@ static void ___dst_free(struct dst_entry * dst)
if (dst->dev == NULL || !(dst->dev->flags&IFF_UP)) {
dst->input = dst->output = dst_discard;
}
- dst->obsolete = 2;
+ dst->obsolete = DST_OBSOLETE_DEAD;
}
void __dst_free(struct dst_entry * dst)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index f16d19b..f79a9a8 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1412,7 +1412,7 @@ void ip_rt_redirect(__be32 old_gw, __be32 daddr, __be32 new_gw,
dev_hold(rt->u.dst.dev);
if (rt->idev)
in_dev_hold(rt->idev);
- rt->u.dst.obsolete = -1;
+ rt->u.dst.obsolete = DST_OBSOLETE_FORCE_CHK;
rt->u.dst.lastuse = jiffies;
rt->u.dst.path = &rt->u.dst;
rt->u.dst.neighbour = NULL;
@@ -1865,7 +1865,7 @@ static int ip_route_input_mc(struct sk_buff *skb, __be32 daddr, __be32 saddr,
goto e_nobufs;
rth->u.dst.output = ip_rt_bug;
- rth->u.dst.obsolete = -1;
+ rth->u.dst.obsolete = DST_OBSOLETE_FORCE_CHK;
atomic_set(&rth->u.dst.__refcnt, 1);
rth->u.dst.flags= DST_HOST;
@@ -2026,7 +2026,7 @@ static int __mkroute_input(struct sk_buff *skb,
rth->fl.oif = 0;
rth->rt_spec_dst= spec_dst;
- rth->u.dst.obsolete = -1;
+ rth->u.dst.obsolete = DST_OBSOLETE_FORCE_CHK;
rth->u.dst.input = ip_forward;
rth->u.dst.output = ip_output;
rth->rt_genid = rt_genid(dev_net(rth->u.dst.dev));
@@ -2191,7 +2191,7 @@ local_input:
goto e_nobufs;
rth->u.dst.output= ip_rt_bug;
- rth->u.dst.obsolete = -1;
+ rth->u.dst.obsolete = DST_OBSOLETE_FORCE_CHK;
rth->rt_genid = rt_genid(net);
atomic_set(&rth->u.dst.__refcnt, 1);
@@ -2417,7 +2417,7 @@ static int __mkroute_output(struct rtable **result,
rth->rt_spec_dst= fl->fl4_src;
rth->u.dst.output = ip_output;
- rth->u.dst.obsolete = -1;
+ rth->u.dst.obsolete = DST_OBSOLETE_FORCE_CHK;
rth->rt_genid = rt_genid(dev_net(dev_out));
RT_CACHE_STAT_INC(out_slow_tot);
@@ -2747,7 +2747,7 @@ static int ipv4_dst_blackhole(struct net *net, struct rtable **rp, struct flowi
if (rt) {
struct dst_entry *new = &rt->u.dst;
- new->obsolete = -1;
+ new->obsolete = DST_OBSOLETE_FORCE_CHK;
atomic_set(&new->__refcnt, 1);
new->__use = 1;
new->input = dst_discard;
diff --git a/net/sctp/output.c b/net/sctp/output.c
index d494100..a6fc861 100644
--- a/net/sctp/output.c
+++ b/net/sctp/output.c
@@ -375,7 +375,7 @@ int sctp_packet_transmit(struct sctp_packet *packet)
skb_set_owner_w(nskb, sk);
/* The 'obsolete' field of dst is set to 2 when a dst is freed. */
- if (!dst || (dst->obsolete > 1)) {
+ if (!dst || dst->obsolete) {
dst_release(dst);
sctp_transport_route(tp, NULL, sctp_sk(sk));
if (asoc && (asoc->param_flags & SPP_PMTUD_ENABLE)) {
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index cb81ca3..1ae61bd 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -1421,7 +1421,7 @@ static struct dst_entry *xfrm_bundle_create(struct xfrm_policy *policy,
dst1->xfrm = xfrm[i];
xdst->genid = xfrm[i]->genid;
- dst1->obsolete = -1;
+ dst1->obsolete = DST_OBSOLETE_FORCE_CHK;
dst1->flags |= DST_HOST;
dst1->lastuse = now;
@@ -2049,12 +2049,13 @@ EXPORT_SYMBOL(__xfrm_route_forward);
static struct dst_entry *xfrm_dst_check(struct dst_entry *dst, u32 cookie)
{
/* Code (such as __xfrm4_bundle_create()) sets dst->obsolete
- * to "-1" to force all XFRM destinations to get validated by
- * dst_ops->check on every use. We do this because when a
- * normal route referenced by an XFRM dst is obsoleted we do
- * not go looking around for all parent referencing XFRM dsts
- * so that we can invalidate them. It is just too much work.
- * Instead we make the checks here on every use. For example:
+ * to DST_OBSOLETE_FORCE_CHK to force all XFRM destinations to
+ * get validated by dst_ops->check on every use. We do this
+ * because when a normal route referenced by an XFRM dst is
+ * obsoleted we do not go looking around for all parent
+ * referencing XFRM dsts so that we can invalidate them. It
+ * is just too much work. Instead we make the checks here on
+ * every use. For example:
*
* XFRM dst A --> IPv4 dst X
*
@@ -2064,9 +2065,9 @@ static struct dst_entry *xfrm_dst_check(struct dst_entry *dst, u32 cookie)
* stale_bundle() check.
*
* When a policy's bundle is pruned, we dst_free() the XFRM
- * dst which causes it's ->obsolete field to be set to a
- * positive non-zero integer. If an XFRM dst has been pruned
- * like this, we want to force a new route lookup.
+ * dst which causes it's ->obsolete field to be set to
+ * DST_OBSOLETE_DEAD. If an XFRM dst has been pruned like
+ * this, we want to force a new route lookup.
*/
if (dst->obsolete < 0 && !stale_bundle(dst))
return dst;
--
1.7.1
--
"Thought is the essence of where you are now."
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 3/6] ipv6: use DST_* macro to set obselete field
2012-10-19 19:13 [stable 2.6.32.y PATCH 0/6] net: fixes for cached dsts are never invalidated Benjamin LaHaise
2012-10-19 19:21 ` [PATCH 1/6] ipv4: check rt_genid in dst_check Benjamin LaHaise
2012-10-19 19:21 ` [PATCH 2/6] net: Document dst->obsolete better Benjamin LaHaise
@ 2012-10-19 19:21 ` Benjamin LaHaise
2012-10-19 19:21 ` [PATCH 4/6] netns: move net->ipv4.rt_genid to net->rt_genid Benjamin LaHaise
` (3 subsequent siblings)
6 siblings, 0 replies; 19+ messages in thread
From: Benjamin LaHaise @ 2012-10-19 19:21 UTC (permalink / raw)
To: Willy Tarreau; +Cc: David Miller, stable, netdev
commit 2c20cbd7e3aa6e9dddc07975d3f3a89fe1f69c00
Author: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Date: Mon Sep 10 22:09:47 2012 +0000
ipv6: use DST_* macro to set obselete field
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a slightly modified backport of the above commit that covers
additional locations setting dst->u.obsolete = -1.
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
---
net/ipv6/route.c | 10 +++++-----
1 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index e307517..b420ea9 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -129,7 +129,7 @@ static struct rt6_info ip6_null_entry_template = {
.dst = {
.__refcnt = ATOMIC_INIT(1),
.__use = 1,
- .obsolete = -1,
+ .obsolete = DST_OBSOLETE_FORCE_CHK,
.error = -ENETUNREACH,
.metrics = { [RTAX_HOPLIMIT - 1] = 255, },
.input = ip6_pkt_discard,
@@ -152,7 +152,7 @@ static struct rt6_info ip6_prohibit_entry_template = {
.dst = {
.__refcnt = ATOMIC_INIT(1),
.__use = 1,
- .obsolete = -1,
+ .obsolete = DST_OBSOLETE_FORCE_CHK,
.error = -EACCES,
.metrics = { [RTAX_HOPLIMIT - 1] = 255, },
.input = ip6_pkt_prohibit,
@@ -170,7 +170,7 @@ static struct rt6_info ip6_blk_hole_entry_template = {
.dst = {
.__refcnt = ATOMIC_INIT(1),
.__use = 1,
- .obsolete = -1,
+ .obsolete = DST_OBSOLETE_FORCE_CHK,
.error = -EINVAL,
.metrics = { [RTAX_HOPLIMIT - 1] = 255, },
.input = dst_discard,
@@ -1161,7 +1161,7 @@ int ip6_route_add(struct fib6_config *cfg)
goto out;
}
- rt->u.dst.obsolete = -1;
+ rt->u.dst.obsolete = DST_OBSOLETE_FORCE_CHK;
rt->rt6i_expires = (cfg->fc_flags & RTF_EXPIRES) ?
jiffies + clock_t_to_jiffies(cfg->fc_expires) :
0;
@@ -1960,7 +1960,7 @@ struct rt6_info *addrconf_dst_alloc(struct inet6_dev *idev,
rt->u.dst.metrics[RTAX_MTU-1] = ipv6_get_mtu(rt->rt6i_dev);
rt->u.dst.metrics[RTAX_ADVMSS-1] = ipv6_advmss(net, dst_mtu(&rt->u.dst));
rt->u.dst.metrics[RTAX_HOPLIMIT-1] = -1;
- rt->u.dst.obsolete = -1;
+ rt->u.dst.obsolete = DST_OBSOLETE_FORCE_CHK;
rt->rt6i_flags = RTF_UP | RTF_NONEXTHOP;
if (anycast)
--
1.7.1
--
"Thought is the essence of where you are now."
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 4/6] netns: move net->ipv4.rt_genid to net->rt_genid
2012-10-19 19:13 [stable 2.6.32.y PATCH 0/6] net: fixes for cached dsts are never invalidated Benjamin LaHaise
` (2 preceding siblings ...)
2012-10-19 19:21 ` [PATCH 3/6] ipv6: use DST_* macro to set obselete field Benjamin LaHaise
@ 2012-10-19 19:21 ` Benjamin LaHaise
2012-10-19 19:22 ` [PATCH 5/6] ipv6: use net->rt_genid to check dst validity Benjamin LaHaise
` (2 subsequent siblings)
6 siblings, 0 replies; 19+ messages in thread
From: Benjamin LaHaise @ 2012-10-19 19:21 UTC (permalink / raw)
To: Willy Tarreau; +Cc: David Miller, stable, netdev
commit b42664f898c976247f7f609b8bb9c94d7475ca10
Author: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Date: Mon Sep 10 22:09:44 2012 +0000
netns: move net->ipv4.rt_genid to net->rt_genid
This commit prepares the use of rt_genid by both IPv4 and IPv6.
Initialization is left in IPv4 part.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since 2.6.32 still has the routing cache, this patch has been modified to
retain the IPv4 method for updating rt_genid in rt_cache_invalidate() to
perturb the route cache's hash function.
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
---
include/net/net_namespace.h | 12 ++++++++++++
include/net/netns/ipv4.h | 1 -
net/ipv4/route.c | 11 +++--------
3 files changed, 15 insertions(+), 9 deletions(-)
diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index a120284..b810fa1 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -84,6 +84,7 @@ struct net {
struct sk_buff_head wext_nlevents;
#endif
struct net_generic *gen;
+ atomic_t rt_genid;
};
@@ -272,4 +273,15 @@ extern struct ctl_table_header *register_net_sysctl_rotable(
const struct ctl_path *path, struct ctl_table *table);
extern void unregister_net_sysctl_table(struct ctl_table_header *header);
+static inline int rt_genid(struct net *net)
+{
+ return atomic_read(&net->rt_genid);
+}
+
+extern void rt_cache_invalidate(struct net *net);
+static inline void rt_genid_bump(struct net *net)
+{
+ rt_cache_invalidate(net);
+}
+
#endif /* __NET_NET_NAMESPACE_H */
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index 9a4b8b7..5f0d561 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -54,7 +54,6 @@ struct netns_ipv4 {
int current_rt_cache_rebuild_count;
struct timer_list rt_secret_timer;
- atomic_t rt_genid;
#ifdef CONFIG_IP_MROUTE
struct sock *mroute_sk;
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index f79a9a8..83f16e3 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -267,11 +267,6 @@ static inline unsigned int rt_hash(__be32 daddr, __be32 saddr, int idx,
& rt_hash_mask;
}
-static inline int rt_genid(struct net *net)
-{
- return atomic_read(&net->ipv4.rt_genid);
-}
-
#ifdef CONFIG_PROC_FS
struct rt_cache_iter_state {
struct seq_net_private p;
@@ -884,12 +879,12 @@ static void rt_worker_func(struct work_struct *work)
* many times (2^24) without giving recent rt_genid.
* Jenkins hash is strong enough that litle changes of rt_genid are OK.
*/
-static void rt_cache_invalidate(struct net *net)
+void rt_cache_invalidate(struct net *net)
{
unsigned char shuffle;
get_random_bytes(&shuffle, sizeof(shuffle));
- atomic_add(shuffle + 1U, &net->ipv4.rt_genid);
+ atomic_add(shuffle + 1U, &net->rt_genid);
}
/*
@@ -3364,7 +3359,7 @@ static __net_initdata struct pernet_operations sysctl_route_ops = {
static __net_init int rt_secret_timer_init(struct net *net)
{
- atomic_set(&net->ipv4.rt_genid,
+ atomic_set(&net->rt_genid,
(int) ((num_physpages ^ (num_physpages>>8)) ^
(jiffies ^ (jiffies >> 7))));
--
1.7.1
--
"Thought is the essence of where you are now."
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 5/6] ipv6: use net->rt_genid to check dst validity
2012-10-19 19:13 [stable 2.6.32.y PATCH 0/6] net: fixes for cached dsts are never invalidated Benjamin LaHaise
` (3 preceding siblings ...)
2012-10-19 19:21 ` [PATCH 4/6] netns: move net->ipv4.rt_genid to net->rt_genid Benjamin LaHaise
@ 2012-10-19 19:22 ` Benjamin LaHaise
2012-10-19 19:22 ` [PATCH 6/6] xfrm: invalidate dst on policy insertion/deletion Benjamin LaHaise
2012-10-19 19:48 ` [stable 2.6.32.y PATCH 0/6] net: fixes for cached dsts are never invalidated Willy Tarreau
6 siblings, 0 replies; 19+ messages in thread
From: Benjamin LaHaise @ 2012-10-19 19:22 UTC (permalink / raw)
To: Willy Tarreau; +Cc: David Miller, stable, netdev
commit 6f3118b571b8a4c06c7985dc3172c3526cb86253
Author: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Date: Mon Sep 10 22:09:46 2012 +0000
ipv6: use net->rt_genid to check dst validity
IPv6 dst should take care of rt_genid too. When a xfrm policy is inserted or
deleted, all dst should be invalidated.
To force the validation, dst entries should be created with ->obsolete set t
o
DST_OBSOLETE_FORCE_CHK. This was already the case for all functions calling
ip6_dst_alloc(), except for ip6_rt_copy().
As a consequence, we can remove the specific code in inet6_connection_sock.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This version of the above commit is slightly modified to compensate for
differences in ip6_dst_alloc().
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
---
include/net/ip6_fib.h | 4 +---
net/ipv6/inet6_connection_sock.c | 24 +-----------------------
net/ipv6/route.c | 29 +++++++++++++++++++++--------
3 files changed, 23 insertions(+), 34 deletions(-)
diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index 15b492a..c6672d6 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -115,9 +115,7 @@ struct rt6_info
struct rt6key rt6i_dst;
-#ifdef CONFIG_XFRM
- u32 rt6i_flow_cache_genid;
-#endif
+ u32 rt6i_genid;
struct rt6key rt6i_src;
};
diff --git a/net/ipv6/inet6_connection_sock.c b/net/ipv6/inet6_connection_sock.c
index cc4797d..835bfe4 100644
--- a/net/ipv6/inet6_connection_sock.c
+++ b/net/ipv6/inet6_connection_sock.c
@@ -148,34 +148,12 @@ void __inet6_csk_dst_store(struct sock *sk, struct dst_entry *dst,
struct in6_addr *daddr, struct in6_addr *saddr)
{
__ip6_dst_store(sk, dst, daddr, saddr);
-
-#ifdef CONFIG_XFRM
- {
- struct rt6_info *rt = (struct rt6_info *)dst;
- rt->rt6i_flow_cache_genid = atomic_read(&flow_cache_genid);
- }
-#endif
}
static inline
struct dst_entry *__inet6_csk_dst_check(struct sock *sk, u32 cookie)
{
- struct dst_entry *dst;
-
- dst = __sk_dst_check(sk, cookie);
-
-#ifdef CONFIG_XFRM
- if (dst) {
- struct rt6_info *rt = (struct rt6_info *)dst;
- if (rt->rt6i_flow_cache_genid != atomic_read(&flow_cache_genid)) {
- sk->sk_dst_cache = NULL;
- dst_release(dst);
- dst = NULL;
- }
- }
-#endif
-
- return dst;
+ return __sk_dst_check(sk, cookie);
}
int inet6_csk_xmit(struct sk_buff *skb, int ipfragok)
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index b420ea9..80ab9cd 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -186,9 +186,17 @@ static struct rt6_info ip6_blk_hole_entry_template = {
#endif
/* allocate dst with ip6_dst_ops */
-static inline struct rt6_info *ip6_dst_alloc(struct dst_ops *ops)
+static inline struct rt6_info *ip6_dst_alloc(struct net *net,
+ struct dst_ops *ops)
{
- return (struct rt6_info *)dst_alloc(ops);
+ struct rt6_info *rt = (struct rt6_info *)dst_alloc(ops);
+
+ if (rt) {
+ rt->u.dst.obsolete = DST_OBSOLETE_FORCE_CHK;
+ rt->rt6i_genid = rt_genid(net);
+ }
+
+ return rt;
}
static void ip6_dst_destroy(struct dst_entry *dst)
@@ -886,6 +894,13 @@ static struct dst_entry *ip6_dst_check(struct dst_entry *dst, u32 cookie)
rt = (struct rt6_info *) dst;
+ /* All IPV6 dsts are created with ->obsolete set to the value
+ * DST_OBSOLETE_FORCE_CHK which forces validation calls down
+ * into this function always.
+ */
+ if (rt->rt6i_genid != rt_genid(dev_net(rt->u.dst.dev)))
+ return NULL;
+
if (rt && rt->rt6i_node && (rt->rt6i_node->fn_sernum == cookie))
return dst;
@@ -970,7 +985,7 @@ struct dst_entry *icmp6_dst_alloc(struct net_device *dev,
if (unlikely(idev == NULL))
return NULL;
- rt = ip6_dst_alloc(&net->ipv6.ip6_dst_ops);
+ rt = ip6_dst_alloc(net, &net->ipv6.ip6_dst_ops);
if (unlikely(rt == NULL)) {
in6_dev_put(idev);
goto out;
@@ -1154,14 +1169,13 @@ int ip6_route_add(struct fib6_config *cfg)
goto out;
}
- rt = ip6_dst_alloc(&net->ipv6.ip6_dst_ops);
+ rt = ip6_dst_alloc(net, &net->ipv6.ip6_dst_ops);
if (rt == NULL) {
err = -ENOMEM;
goto out;
}
- rt->u.dst.obsolete = DST_OBSOLETE_FORCE_CHK;
rt->rt6i_expires = (cfg->fc_flags & RTF_EXPIRES) ?
jiffies + clock_t_to_jiffies(cfg->fc_expires) :
0;
@@ -1663,7 +1677,7 @@ void rt6_pmtu_discovery(struct in6_addr *daddr, struct in6_addr *saddr,
static struct rt6_info * ip6_rt_copy(struct rt6_info *ort)
{
struct net *net = dev_net(ort->rt6i_dev);
- struct rt6_info *rt = ip6_dst_alloc(&net->ipv6.ip6_dst_ops);
+ struct rt6_info *rt = ip6_dst_alloc(net, &net->ipv6.ip6_dst_ops);
if (rt) {
rt->u.dst.input = ort->u.dst.input;
@@ -1943,7 +1957,7 @@ struct rt6_info *addrconf_dst_alloc(struct inet6_dev *idev,
int anycast)
{
struct net *net = dev_net(idev->dev);
- struct rt6_info *rt = ip6_dst_alloc(&net->ipv6.ip6_dst_ops);
+ struct rt6_info *rt = ip6_dst_alloc(net, &net->ipv6.ip6_dst_ops);
struct neighbour *neigh;
if (rt == NULL)
@@ -1960,7 +1974,6 @@ struct rt6_info *addrconf_dst_alloc(struct inet6_dev *idev,
rt->u.dst.metrics[RTAX_MTU-1] = ipv6_get_mtu(rt->rt6i_dev);
rt->u.dst.metrics[RTAX_ADVMSS-1] = ipv6_advmss(net, dst_mtu(&rt->u.dst));
rt->u.dst.metrics[RTAX_HOPLIMIT-1] = -1;
- rt->u.dst.obsolete = DST_OBSOLETE_FORCE_CHK;
rt->rt6i_flags = RTF_UP | RTF_NONEXTHOP;
if (anycast)
--
1.7.1
--
"Thought is the essence of where you are now."
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 6/6] xfrm: invalidate dst on policy insertion/deletion
2012-10-19 19:13 [stable 2.6.32.y PATCH 0/6] net: fixes for cached dsts are never invalidated Benjamin LaHaise
` (4 preceding siblings ...)
2012-10-19 19:22 ` [PATCH 5/6] ipv6: use net->rt_genid to check dst validity Benjamin LaHaise
@ 2012-10-19 19:22 ` Benjamin LaHaise
2012-10-19 19:48 ` [stable 2.6.32.y PATCH 0/6] net: fixes for cached dsts are never invalidated Willy Tarreau
6 siblings, 0 replies; 19+ messages in thread
From: Benjamin LaHaise @ 2012-10-19 19:22 UTC (permalink / raw)
To: Willy Tarreau; +Cc: David Miller, stable, netdev
commit ee8372dd1989287c5eedb69d44bac43f69e496f1
Author: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Date: Mon Sep 10 22:09:45 2012 +0000
xfrm: invalidate dst on policy insertion/deletion
When a policy is inserted or deleted, all dst should be recalculated.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
---
net/xfrm/xfrm_policy.c | 1 +
security/selinux/include/xfrm.h | 1 +
2 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 1ae61bd..3aa00e1 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -587,6 +587,7 @@ int xfrm_policy_insert(int dir, struct xfrm_policy *policy, int excl)
xfrm_pol_hold(policy);
net->xfrm.policy_count[dir]++;
atomic_inc(&flow_cache_genid);
+ rt_genid_bump(net);
if (delpol)
__xfrm_policy_unlink(delpol, dir);
policy->index = delpol ? delpol->index : xfrm_gen_index(net, dir);
diff --git a/security/selinux/include/xfrm.h b/security/selinux/include/xfrm.h
index 13128f9..9acf6fa 100644
--- a/security/selinux/include/xfrm.h
+++ b/security/selinux/include/xfrm.h
@@ -49,6 +49,7 @@ int selinux_xfrm_decode_session(struct sk_buff *skb, u32 *sid, int ckall);
static inline void selinux_xfrm_notify_policyload(void)
{
atomic_inc(&flow_cache_genid);
+ rt_genid_bump(&init_net);
}
#else
static inline int selinux_xfrm_enabled(void)
--
1.7.1
--
"Thought is the essence of where you are now."
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [stable 2.6.32.y PATCH 0/6] net: fixes for cached dsts are never invalidated
2012-10-19 19:13 [stable 2.6.32.y PATCH 0/6] net: fixes for cached dsts are never invalidated Benjamin LaHaise
` (5 preceding siblings ...)
2012-10-19 19:22 ` [PATCH 6/6] xfrm: invalidate dst on policy insertion/deletion Benjamin LaHaise
@ 2012-10-19 19:48 ` Willy Tarreau
2012-10-19 19:49 ` David Miller
6 siblings, 1 reply; 19+ messages in thread
From: Willy Tarreau @ 2012-10-19 19:48 UTC (permalink / raw)
To: Benjamin LaHaise; +Cc: David Miller, stable, netdev
Hi Ben,
On Fri, Oct 19, 2012 at 03:13:48PM -0400, Benjamin LaHaise wrote:
> This is v2 of an attempt to pull in the relevant fixes for a problem in
> v2.6.32 kernels where invalid cached routes are retained even after changes
> to the routing table have been made. A simple test case can be found at
> http://marc.info/?l=linux-netdev&m=135015076708950&w=2 . Based on feedback
> from David Miller, additional changes have been pulled in, including fixes
> for the same issue in IPv6. Most of the patches required some rework owing
> to the large differences in the networking stack between 2.6.32 and 3.6.
>
> I have performed basic tests to confirm that the cases I was hitting are
> now fixed, including a couple of tests with IPv4 and IPv6. Comments?
> Thanks again to David for the pointers to the additional fixes required in
> this area.
Looks like you've done an amazing work. I'm not the best person to judge
if these changes are valid or not, so I'll trust you and David on this.
I just have two questions: since you had to backport them from 3.6 to 2.6.32,
I'm assuming that the equivalent was not ready in closer LTS kernels (2.6.34,
3.0, 3.2, 3.4). Does this mean they're affected by the same issue too or are
they immune ? And if they're affected, are you going to proceed with forward
porting of your work for the closest ones (.34 and 3.0 come to mind) ? And if
they aren't affected, do you think that picking some changes there would have
been easier ?
As you probably know, we want to ensure that people who upgrade from an old
LTS to a newer one won't experience regressions due to problems which were
fixed only in the older version.
Thanks !
Willy
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [stable 2.6.32.y PATCH 0/6] net: fixes for cached dsts are never invalidated
2012-10-19 19:48 ` [stable 2.6.32.y PATCH 0/6] net: fixes for cached dsts are never invalidated Willy Tarreau
@ 2012-10-19 19:49 ` David Miller
2012-10-19 19:55 ` Willy Tarreau
0 siblings, 1 reply; 19+ messages in thread
From: David Miller @ 2012-10-19 19:49 UTC (permalink / raw)
To: w; +Cc: bcrl, stable, netdev
How about checking if these changes are already in 3.0/3.2/etc. or not
before asking such questions?
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [stable 2.6.32.y PATCH 0/6] net: fixes for cached dsts are never invalidated
2012-10-19 19:49 ` David Miller
@ 2012-10-19 19:55 ` Willy Tarreau
2012-10-19 20:01 ` David Miller
0 siblings, 1 reply; 19+ messages in thread
From: Willy Tarreau @ 2012-10-19 19:55 UTC (permalink / raw)
To: David Miller; +Cc: bcrl, stable, netdev
On Fri, Oct 19, 2012 at 03:49:30PM -0400, David Miller wrote:
>
> How about checking if these changes are already in 3.0/3.2/etc. or not
> before asking such questions?
Because I didn't find the patches in 3.0 and Ben said he backported them
from 3.6, I think these are two valid reasons to ask, no ?
Willy
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [stable 2.6.32.y PATCH 0/6] net: fixes for cached dsts are never invalidated
2012-10-19 19:55 ` Willy Tarreau
@ 2012-10-19 20:01 ` David Miller
2012-10-19 20:03 ` Willy Tarreau
0 siblings, 1 reply; 19+ messages in thread
From: David Miller @ 2012-10-19 20:01 UTC (permalink / raw)
To: w; +Cc: bcrl, stable, netdev
From: Willy Tarreau <w@1wt.eu>
Date: Fri, 19 Oct 2012 21:55:57 +0200
> On Fri, Oct 19, 2012 at 03:49:30PM -0400, David Miller wrote:
>>
>> How about checking if these changes are already in 3.0/3.2/etc. or not
>> before asking such questions?
>
> Because I didn't find the patches in 3.0 and Ben said he backported them
> from 3.6, I think these are two valid reasons to ask, no ?
Well, the thing is, I personally don't consider them appropriate for
3.x.y -stable backports, and that's why I haven't submitted them.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [stable 2.6.32.y PATCH 0/6] net: fixes for cached dsts are never invalidated
2012-10-19 20:01 ` David Miller
@ 2012-10-19 20:03 ` Willy Tarreau
2012-10-19 20:07 ` David Miller
2012-10-19 20:18 ` Benjamin LaHaise
0 siblings, 2 replies; 19+ messages in thread
From: Willy Tarreau @ 2012-10-19 20:03 UTC (permalink / raw)
To: David Miller; +Cc: bcrl, stable, netdev
On Fri, Oct 19, 2012 at 04:01:04PM -0400, David Miller wrote:
> From: Willy Tarreau <w@1wt.eu>
> Date: Fri, 19 Oct 2012 21:55:57 +0200
>
> > On Fri, Oct 19, 2012 at 03:49:30PM -0400, David Miller wrote:
> >>
> >> How about checking if these changes are already in 3.0/3.2/etc. or not
> >> before asking such questions?
> >
> > Because I didn't find the patches in 3.0 and Ben said he backported them
> > from 3.6, I think these are two valid reasons to ask, no ?
>
> Well, the thing is, I personally don't consider them appropriate for
> 3.x.y -stable backports, and that's why I haven't submitted them.
OK. Is is because the issue is less important there or because the fix are
more risky than the issues they fix (or any other reason) ?
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [stable 2.6.32.y PATCH 0/6] net: fixes for cached dsts are never invalidated
2012-10-19 20:03 ` Willy Tarreau
@ 2012-10-19 20:07 ` David Miller
2012-10-19 20:14 ` Willy Tarreau
2012-10-19 20:18 ` Benjamin LaHaise
1 sibling, 1 reply; 19+ messages in thread
From: David Miller @ 2012-10-19 20:07 UTC (permalink / raw)
To: w; +Cc: bcrl, stable, netdev
From: Willy Tarreau <w@1wt.eu>
Date: Fri, 19 Oct 2012 22:03:18 +0200
> On Fri, Oct 19, 2012 at 04:01:04PM -0400, David Miller wrote:
>> From: Willy Tarreau <w@1wt.eu>
>> Date: Fri, 19 Oct 2012 21:55:57 +0200
>>
>> > On Fri, Oct 19, 2012 at 03:49:30PM -0400, David Miller wrote:
>> >>
>> >> How about checking if these changes are already in 3.0/3.2/etc. or not
>> >> before asking such questions?
>> >
>> > Because I didn't find the patches in 3.0 and Ben said he backported them
>> > from 3.6, I think these are two valid reasons to ask, no ?
>>
>> Well, the thing is, I personally don't consider them appropriate for
>> 3.x.y -stable backports, and that's why I haven't submitted them.
>
> OK. Is is because the issue is less important there or because the fix are
> more risky than the issues they fix (or any other reason) ?
I have a different opinion about the risk/benefit ratio than Ben does.
I do not think these cases are important enough to enough people to
justify -stable inclusion at all.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [stable 2.6.32.y PATCH 0/6] net: fixes for cached dsts are never invalidated
2012-10-19 20:07 ` David Miller
@ 2012-10-19 20:14 ` Willy Tarreau
2012-10-19 20:22 ` Benjamin LaHaise
0 siblings, 1 reply; 19+ messages in thread
From: Willy Tarreau @ 2012-10-19 20:14 UTC (permalink / raw)
To: David Miller; +Cc: bcrl, stable, netdev
On Fri, Oct 19, 2012 at 04:07:11PM -0400, David Miller wrote:
> From: Willy Tarreau <w@1wt.eu>
> Date: Fri, 19 Oct 2012 22:03:18 +0200
>
> > On Fri, Oct 19, 2012 at 04:01:04PM -0400, David Miller wrote:
> >> From: Willy Tarreau <w@1wt.eu>
> >> Date: Fri, 19 Oct 2012 21:55:57 +0200
> >>
> >> > On Fri, Oct 19, 2012 at 03:49:30PM -0400, David Miller wrote:
> >> >>
> >> >> How about checking if these changes are already in 3.0/3.2/etc. or not
> >> >> before asking such questions?
> >> >
> >> > Because I didn't find the patches in 3.0 and Ben said he backported them
> >> > from 3.6, I think these are two valid reasons to ask, no ?
> >>
> >> Well, the thing is, I personally don't consider them appropriate for
> >> 3.x.y -stable backports, and that's why I haven't submitted them.
> >
> > OK. Is is because the issue is less important there or because the fix are
> > more risky than the issues they fix (or any other reason) ?
>
> I have a different opinion about the risk/benefit ratio than Ben does.
>
> I do not think these cases are important enough to enough people to
> justify -stable inclusion at all.
OK, thanks for the precision.
So maybe in the end we should just merge d11a4dc18 that Ben found to be
the least invasive one fixing the issues, and we'd be in sync with the
rest of the stable branches, even if, as you noted a few days ago, it's
only a partial fix for the issue.
Ben, what's your opinion on this ? I know it's never fun to do backports
and not merge them later, but I trust David more than anyone else on the
network part, so if he decided that while incomplete, the patch above
was all that was needed for other stable branches, maybe we should just
stay on the safe side and do the same ?
Willy
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [stable 2.6.32.y PATCH 0/6] net: fixes for cached dsts are never invalidated
2012-10-19 20:03 ` Willy Tarreau
2012-10-19 20:07 ` David Miller
@ 2012-10-19 20:18 ` Benjamin LaHaise
1 sibling, 0 replies; 19+ messages in thread
From: Benjamin LaHaise @ 2012-10-19 20:18 UTC (permalink / raw)
To: Willy Tarreau; +Cc: David Miller, stable, netdev
On Fri, Oct 19, 2012 at 10:03:18PM +0200, Willy Tarreau wrote:
> OK. Is is because the issue is less important there or because the fix are
> more risky than the issues they fix (or any other reason) ?
The behaviour of 2.6.32.x is certainly a regression from previous stable
kernels used by various distributions (ie 2.6.18), and is quite serious
for those of us running routers on top of these releases.
-ben
--
"Thought is the essence of where you are now."
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [stable 2.6.32.y PATCH 0/6] net: fixes for cached dsts are never invalidated
2012-10-19 20:14 ` Willy Tarreau
@ 2012-10-19 20:22 ` Benjamin LaHaise
2012-10-19 20:53 ` Willy Tarreau
0 siblings, 1 reply; 19+ messages in thread
From: Benjamin LaHaise @ 2012-10-19 20:22 UTC (permalink / raw)
To: Willy Tarreau; +Cc: David Miller, stable, netdev
On Fri, Oct 19, 2012 at 10:14:31PM +0200, Willy Tarreau wrote:
...
> So maybe in the end we should just merge d11a4dc18 that Ben found to be
> the least invasive one fixing the issues, and we'd be in sync with the
> rest of the stable branches, even if, as you noted a few days ago, it's
> only a partial fix for the issue.
>
> Ben, what's your opinion on this ? I know it's never fun to do backports
> and not merge them later, but I trust David more than anyone else on the
> network part, so if he decided that while incomplete, the patch above
> was all that was needed for other stable branches, maybe we should just
> stay on the safe side and do the same ?
There is a caveat to the minimally invasive fix: doing so will result in
cached routes always being lookup up when the check occurs. This could
potentially result in a performance regression from some users. The
tradeoffs here are really murky.
-ben
--
"Thought is the essence of where you are now."
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [stable 2.6.32.y PATCH 0/6] net: fixes for cached dsts are never invalidated
2012-10-19 20:22 ` Benjamin LaHaise
@ 2012-10-19 20:53 ` Willy Tarreau
2012-10-19 21:03 ` Benjamin LaHaise
0 siblings, 1 reply; 19+ messages in thread
From: Willy Tarreau @ 2012-10-19 20:53 UTC (permalink / raw)
To: Benjamin LaHaise; +Cc: David Miller, stable, netdev
On Fri, Oct 19, 2012 at 04:22:44PM -0400, Benjamin LaHaise wrote:
> On Fri, Oct 19, 2012 at 10:14:31PM +0200, Willy Tarreau wrote:
> ...
> > So maybe in the end we should just merge d11a4dc18 that Ben found to be
> > the least invasive one fixing the issues, and we'd be in sync with the
> > rest of the stable branches, even if, as you noted a few days ago, it's
> > only a partial fix for the issue.
> >
> > Ben, what's your opinion on this ? I know it's never fun to do backports
> > and not merge them later, but I trust David more than anyone else on the
> > network part, so if he decided that while incomplete, the patch above
> > was all that was needed for other stable branches, maybe we should just
> > stay on the safe side and do the same ?
>
> There is a caveat to the minimally invasive fix: doing so will result in
> cached routes always being lookup up when the check occurs. This could
> potentially result in a performance regression from some users. The
> tradeoffs here are really murky.
I see. Then users will have the same issue when upgrading from 2.6.32 to 3.0.
OK let's bisect the situation :
- current 2.6.32 users are facing a routing bug that needs be fixed.
- 2.6.34 and onwards do not have this bug but are probably affected by
lower performance due to the minimal fix.
- if we apply the minimal fix alone, some 2.6.32 users will suddenly
experience a performance regression (what order of magnitude would
be nice to estimate).
- if we apply all the fixes, 2.6.32 will experience the performance
regression only when the upgrade to newer kernels which don't have
these fixes.
- the fixes come with a risk of regression (as usual).
Note that everything is tied to the level of performance regression to
expect from the fix when jumping from current 2.6.32 to next LTS. For
example, if some other improvements in 3.0 compensate for the performance
loss caused by the fix, it might be OK to have the other fixes only in
2.6.32. But if it is an important enough performance drop, would the
following plan be acceptable to you and David ?
- we merge the minimal fix now in order to get rid of this nasty issue
and to be in sync with other stable kernels ; a performance regression
will happen, but users will face it anyway when upgrading.
- some of us confident enough in the other fixes use them in their
environment for some time to ensure they don't cause unexpected
regressions.
- if past some delay (weeks, months ?) no issue is uncovered, patches
are proposed for inclusion in more recent branches and we merge them
into older branches.
An alternative solution could be the following one if the performance
regression is supposed not to be that important :
- we only merge the minimal patch now to fix the issue and cross fingers
hoping nobody complains ;
- if someone complains, we submit him the other patches then merge them
if they address the regression and do not cause a new one ;
- depending on the importance of the observed performance regression,
the patches may or may not be adopted by more recent kernels.
BTW, do you have an idea of how to test the performance drop due to the
patch on a test platform ? I mean, I can inject packets and change routes,
but doing so randomly can prove nothing.
Thanks,
Willy
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [stable 2.6.32.y PATCH 0/6] net: fixes for cached dsts are never invalidated
2012-10-19 20:53 ` Willy Tarreau
@ 2012-10-19 21:03 ` Benjamin LaHaise
2012-10-19 21:22 ` Willy Tarreau
0 siblings, 1 reply; 19+ messages in thread
From: Benjamin LaHaise @ 2012-10-19 21:03 UTC (permalink / raw)
To: Willy Tarreau; +Cc: David Miller, stable, netdev
On Fri, Oct 19, 2012 at 10:53:00PM +0200, Willy Tarreau wrote:
> I see. Then users will have the same issue when upgrading from 2.6.32 to 3.0.
>
> OK let's bisect the situation :
> - current 2.6.32 users are facing a routing bug that needs be fixed.
>
> - 2.6.34 and onwards do not have this bug but are probably affected by
> lower performance due to the minimal fix.
Argh, sorry (too many patches in flight), it looks like just cherry picking
d11a4dc18bf41719c9f0d7ed494d295dd2973b92 is okay. It does have the
rt_is_expired() changes, so it should not have much of a performance
regression. So going with only this change is probably the way to go
for now.
-ben
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [stable 2.6.32.y PATCH 0/6] net: fixes for cached dsts are never invalidated
2012-10-19 21:03 ` Benjamin LaHaise
@ 2012-10-19 21:22 ` Willy Tarreau
0 siblings, 0 replies; 19+ messages in thread
From: Willy Tarreau @ 2012-10-19 21:22 UTC (permalink / raw)
To: Benjamin LaHaise; +Cc: David Miller, stable, netdev
On Fri, Oct 19, 2012 at 05:03:33PM -0400, Benjamin LaHaise wrote:
> On Fri, Oct 19, 2012 at 10:53:00PM +0200, Willy Tarreau wrote:
> > I see. Then users will have the same issue when upgrading from 2.6.32 to 3.0.
> >
> > OK let's bisect the situation :
> > - current 2.6.32 users are facing a routing bug that needs be fixed.
> >
> > - 2.6.34 and onwards do not have this bug but are probably affected by
> > lower performance due to the minimal fix.
>
> Argh, sorry (too many patches in flight), it looks like just cherry picking
> d11a4dc18bf41719c9f0d7ed494d295dd2973b92 is okay. It does have the
> rt_is_expired() changes, so it should not have much of a performance
> regression. So going with only this change is probably the way to go
> for now.
Perfect then! I'm queuing it.
Thanks very much for your work !
Willy
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2012-10-19 21:22 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-10-19 19:13 [stable 2.6.32.y PATCH 0/6] net: fixes for cached dsts are never invalidated Benjamin LaHaise
2012-10-19 19:21 ` [PATCH 1/6] ipv4: check rt_genid in dst_check Benjamin LaHaise
2012-10-19 19:21 ` [PATCH 2/6] net: Document dst->obsolete better Benjamin LaHaise
2012-10-19 19:21 ` [PATCH 3/6] ipv6: use DST_* macro to set obselete field Benjamin LaHaise
2012-10-19 19:21 ` [PATCH 4/6] netns: move net->ipv4.rt_genid to net->rt_genid Benjamin LaHaise
2012-10-19 19:22 ` [PATCH 5/6] ipv6: use net->rt_genid to check dst validity Benjamin LaHaise
2012-10-19 19:22 ` [PATCH 6/6] xfrm: invalidate dst on policy insertion/deletion Benjamin LaHaise
2012-10-19 19:48 ` [stable 2.6.32.y PATCH 0/6] net: fixes for cached dsts are never invalidated Willy Tarreau
2012-10-19 19:49 ` David Miller
2012-10-19 19:55 ` Willy Tarreau
2012-10-19 20:01 ` David Miller
2012-10-19 20:03 ` Willy Tarreau
2012-10-19 20:07 ` David Miller
2012-10-19 20:14 ` Willy Tarreau
2012-10-19 20:22 ` Benjamin LaHaise
2012-10-19 20:53 ` Willy Tarreau
2012-10-19 21:03 ` Benjamin LaHaise
2012-10-19 21:22 ` Willy Tarreau
2012-10-19 20:18 ` Benjamin LaHaise
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).