* [PATCH 1/3] vti6: flush x-netns xfrm cache when vti interface is removed
2016-11-25 6:57 pull request (net): ipsec 2016-11-25 Steffen Klassert
@ 2016-11-25 6:57 ` Steffen Klassert
2016-11-25 6:57 ` [PATCH 2/3] xfrm: unbreak xfrm_sk_policy_lookup Steffen Klassert
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Steffen Klassert @ 2016-11-25 6:57 UTC (permalink / raw)
To: David Miller; +Cc: Herbert Xu, Steffen Klassert, netdev
From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
This is the same fix than commit a5d0dc810abf ("vti: flush x-netns xfrm
cache when vti interface is removed")
This patch fixes a refcnt problem when a x-netns vti6 interface is removed:
unregister_netdevice: waiting for vti6_test to become free. Usage count = 1
Here is a script to reproduce the problem:
ip link set dev ntfp2 up
ip addr add dev ntfp2 2001::1/64
ip link add vti6_test type vti6 local 2001::1 remote 2001::2 key 1
ip netns add secure
ip link set vti6_test netns secure
ip netns exec secure ip link set vti6_test up
ip netns exec secure ip link s lo up
ip netns exec secure ip addr add dev vti6_test 2003::1/64
ip -6 xfrm policy add dir out tmpl src 2001::1 dst 2001::2 proto esp \
mode tunnel mark 1
ip -6 xfrm policy add dir in tmpl src 2001::2 dst 2001::1 proto esp \
mode tunnel mark 1
ip xfrm state add src 2001::1 dst 2001::2 proto esp spi 1 mode tunnel \
enc des3_ede 0x112233445566778811223344556677881122334455667788 mark 1
ip xfrm state add src 2001::2 dst 2001::1 proto esp spi 1 mode tunnel \
enc des3_ede 0x112233445566778811223344556677881122334455667788 mark 1
ip netns exec secure ping6 -c 4 2003::2
ip netns del secure
CC: Lance Richardson <lrichard@redhat.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Lance Richardson <lrichard@redhat.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
net/ipv6/ip6_vti.c | 31 +++++++++++++++++++++++++++++++
1 file changed, 31 insertions(+)
diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
index 8a02ca8..c299c1e 100644
--- a/net/ipv6/ip6_vti.c
+++ b/net/ipv6/ip6_vti.c
@@ -1138,6 +1138,33 @@ static struct xfrm6_protocol vti_ipcomp6_protocol __read_mostly = {
.priority = 100,
};
+static bool is_vti6_tunnel(const struct net_device *dev)
+{
+ return dev->netdev_ops == &vti6_netdev_ops;
+}
+
+static int vti6_device_event(struct notifier_block *unused,
+ unsigned long event, void *ptr)
+{
+ struct net_device *dev = netdev_notifier_info_to_dev(ptr);
+ struct ip6_tnl *t = netdev_priv(dev);
+
+ if (!is_vti6_tunnel(dev))
+ return NOTIFY_DONE;
+
+ switch (event) {
+ case NETDEV_DOWN:
+ if (!net_eq(t->net, dev_net(dev)))
+ xfrm_garbage_collect(t->net);
+ break;
+ }
+ return NOTIFY_DONE;
+}
+
+static struct notifier_block vti6_notifier_block __read_mostly = {
+ .notifier_call = vti6_device_event,
+};
+
/**
* vti6_tunnel_init - register protocol and reserve needed resources
*
@@ -1148,6 +1175,8 @@ static int __init vti6_tunnel_init(void)
const char *msg;
int err;
+ register_netdevice_notifier(&vti6_notifier_block);
+
msg = "tunnel device";
err = register_pernet_device(&vti6_net_ops);
if (err < 0)
@@ -1180,6 +1209,7 @@ xfrm_proto_ah_failed:
xfrm_proto_esp_failed:
unregister_pernet_device(&vti6_net_ops);
pernet_dev_failed:
+ unregister_netdevice_notifier(&vti6_notifier_block);
pr_err("vti6 init: failed to register %s\n", msg);
return err;
}
@@ -1194,6 +1224,7 @@ static void __exit vti6_tunnel_cleanup(void)
xfrm6_protocol_deregister(&vti_ah6_protocol, IPPROTO_AH);
xfrm6_protocol_deregister(&vti_esp6_protocol, IPPROTO_ESP);
unregister_pernet_device(&vti6_net_ops);
+ unregister_netdevice_notifier(&vti6_notifier_block);
}
module_init(vti6_tunnel_init);
--
1.9.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH 2/3] xfrm: unbreak xfrm_sk_policy_lookup
2016-11-25 6:57 pull request (net): ipsec 2016-11-25 Steffen Klassert
2016-11-25 6:57 ` [PATCH 1/3] vti6: flush x-netns xfrm cache when vti interface is removed Steffen Klassert
@ 2016-11-25 6:57 ` Steffen Klassert
2016-11-25 6:58 ` [PATCH 3/3] flowcache: Increase threshold for refusing new allocations Steffen Klassert
2016-11-28 1:22 ` pull request (net): ipsec 2016-11-25 David Miller
3 siblings, 0 replies; 5+ messages in thread
From: Steffen Klassert @ 2016-11-25 6:57 UTC (permalink / raw)
To: David Miller; +Cc: Herbert Xu, Steffen Klassert, netdev
From: Florian Westphal <fw@strlen.de>
if we succeed grabbing the refcount, then
if (err && !xfrm_pol_hold_rcu)
will evaluate to false so this hits last else branch which then
sets policy to ERR_PTR(0).
Fixes: ae33786f73a7ce ("xfrm: policy: only use rcu in xfrm_sk_policy_lookup")
Reported-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Tested-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
net/xfrm/xfrm_policy.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index fd69866..5bf7e1bf 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -1268,12 +1268,14 @@ static struct xfrm_policy *xfrm_sk_policy_lookup(const struct sock *sk, int dir,
err = security_xfrm_policy_lookup(pol->security,
fl->flowi_secid,
policy_to_flow_dir(dir));
- if (!err && !xfrm_pol_hold_rcu(pol))
- goto again;
- else if (err == -ESRCH)
+ if (!err) {
+ if (!xfrm_pol_hold_rcu(pol))
+ goto again;
+ } else if (err == -ESRCH) {
pol = NULL;
- else
+ } else {
pol = ERR_PTR(err);
+ }
} else
pol = NULL;
}
--
1.9.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH 3/3] flowcache: Increase threshold for refusing new allocations
2016-11-25 6:57 pull request (net): ipsec 2016-11-25 Steffen Klassert
2016-11-25 6:57 ` [PATCH 1/3] vti6: flush x-netns xfrm cache when vti interface is removed Steffen Klassert
2016-11-25 6:57 ` [PATCH 2/3] xfrm: unbreak xfrm_sk_policy_lookup Steffen Klassert
@ 2016-11-25 6:58 ` Steffen Klassert
2016-11-28 1:22 ` pull request (net): ipsec 2016-11-25 David Miller
3 siblings, 0 replies; 5+ messages in thread
From: Steffen Klassert @ 2016-11-25 6:58 UTC (permalink / raw)
To: David Miller; +Cc: Herbert Xu, Steffen Klassert, netdev
From: Miroslav Urbanek <mu@miroslavurbanek.com>
The threshold for OOM protection is too small for systems with large
number of CPUs. Applications report ENOBUFs on connect() every 10
minutes.
The problem is that the variable net->xfrm.flow_cache_gc_count is a
global counter while the variable fc->high_watermark is a per-CPU
constant. Take the number of CPUs into account as well.
Fixes: 6ad3122a08e3 ("flowcache: Avoid OOM condition under preasure")
Reported-by: Lukáš Koldrt <lk@excello.cz>
Tested-by: Jan Hejl <jh@excello.cz>
Signed-off-by: Miroslav Urbanek <mu@miroslavurbanek.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
net/core/flow.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/net/core/flow.c b/net/core/flow.c
index 3937b1b..18e8893 100644
--- a/net/core/flow.c
+++ b/net/core/flow.c
@@ -95,7 +95,6 @@ static void flow_cache_gc_task(struct work_struct *work)
list_for_each_entry_safe(fce, n, &gc_list, u.gc_list) {
flow_entry_kill(fce, xfrm);
atomic_dec(&xfrm->flow_cache_gc_count);
- WARN_ON(atomic_read(&xfrm->flow_cache_gc_count) < 0);
}
}
@@ -236,9 +235,8 @@ flow_cache_lookup(struct net *net, const struct flowi *key, u16 family, u8 dir,
if (fcp->hash_count > fc->high_watermark)
flow_cache_shrink(fc, fcp);
- if (fcp->hash_count > 2 * fc->high_watermark ||
- atomic_read(&net->xfrm.flow_cache_gc_count) > fc->high_watermark) {
- atomic_inc(&net->xfrm.flow_cache_genid);
+ if (atomic_read(&net->xfrm.flow_cache_gc_count) >
+ 2 * num_online_cpus() * fc->high_watermark) {
flo = ERR_PTR(-ENOBUFS);
goto ret_object;
}
--
1.9.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: pull request (net): ipsec 2016-11-25
2016-11-25 6:57 pull request (net): ipsec 2016-11-25 Steffen Klassert
` (2 preceding siblings ...)
2016-11-25 6:58 ` [PATCH 3/3] flowcache: Increase threshold for refusing new allocations Steffen Klassert
@ 2016-11-28 1:22 ` David Miller
3 siblings, 0 replies; 5+ messages in thread
From: David Miller @ 2016-11-28 1:22 UTC (permalink / raw)
To: steffen.klassert; +Cc: herbert, netdev
From: Steffen Klassert <steffen.klassert@secunet.com>
Date: Fri, 25 Nov 2016 07:57:57 +0100
> 1) Fix a refcount leak in vti6.
> From Nicolas Dichtel.
>
> 2) Fix a wrong if statement in xfrm_sk_policy_lookup.
> From Florian Westphal.
>
> 3) The flowcache watermarks are per cpu. Take this into
> account when comparing to the threshold where we
> refusing new allocations. From Miroslav Urbanek.
>
> Please pull or let me know if there are problems.
Pulled, thanks Steffen!
^ permalink raw reply [flat|nested] 5+ messages in thread