* [PATCH] netfilter: fix ->nfnl NULL oops @ 2011-11-08 22:16 Alexey Dobriyan 2011-11-08 22:50 ` Alex Bligh 2011-11-09 14:34 ` Pablo Neira Ayuso 0 siblings, 2 replies; 13+ messages in thread From: Alexey Dobriyan @ 2011-11-08 22:16 UTC (permalink / raw) To: alex, pablo; +Cc: kaber, netfilter-devel Sorry for delay. I recall myself writing that net->nfnl NULL check is racy or something like that (but I can't find this email in archives). I've read the code once again, and I'm quite sure, NULL ->nfnl check is correct if RCU precautions are made. Regarding ->report check, I think it's bogus. If there are no listeners, there are NO listeners and whether to report back to userspace doesn't matter. I'm sure I'm missing something obvious here. Please, review. ------------------------------------------------------------------ [PATCH] netfilter: fix ->nfnl NULL oops Base on bugreport and patch by Alex Bligh. nfnetlink and nfconntrack modules can be loaded in either order. If nfnetlink is loaded after nfconntrack, code clearing ->nfnl socket will be called after code which does conntrack cleanup. However, nfconntrack "calls" into nfnetlink by dereferencing socket pointer, and thus can't survive (see netlink_has_listeners()). Check for NULL ->nfnl socket. This is OK, because a) RCU b) dying netns has refcount 0, and no userspace which can access netlink socket directly or indirectly. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> --- net/netfilter/nf_conntrack_netlink.c | 4 ++-- net/netfilter/nfnetlink.c | 11 ++++++++++- 2 files changed, 12 insertions(+), 3 deletions(-) --- a/net/netfilter/nf_conntrack_netlink.c +++ b/net/netfilter/nf_conntrack_netlink.c @@ -570,7 +570,7 @@ ctnetlink_conntrack_event(unsigned int events, struct nf_ct_event *item) return 0; net = nf_ct_net(ct); - if (!item->report && !nfnetlink_has_listeners(net, group)) + if (!nfnetlink_has_listeners(net, group)) return 0; skb = nlmsg_new(ctnetlink_nlmsg_size(ct), GFP_ATOMIC); @@ -1723,7 +1723,7 @@ ctnetlink_expect_event(unsigned int events, struct nf_exp_event *item) } else return 0; - if (!item->report && !nfnetlink_has_listeners(net, group)) + if (!nfnetlink_has_listeners(net, group)) return 0; skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_ATOMIC); --- a/net/netfilter/nfnetlink.c +++ b/net/netfilter/nfnetlink.c @@ -99,7 +99,16 @@ nfnetlink_find_client(u_int16_t type, const struct nfnetlink_subsystem *ss) int nfnetlink_has_listeners(struct net *net, unsigned int group) { - return netlink_has_listeners(net->nfnl, group); + struct sock *nfnl; + + /* + * nfnetlink and nfconntrack can be loaded legitimately in either order, + * thus can die legitimately in either order. + * But, no socket -- no listeners! + * Refcount of dying netns is 0, there is no userspace waiting for events. + */ + nfnl = rcu_dereference(net->nfnl); + return nfnl && netlink_has_listeners(nfnl, group); } EXPORT_SYMBOL_GPL(nfnetlink_has_listeners); ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] netfilter: fix ->nfnl NULL oops 2011-11-08 22:16 [PATCH] netfilter: fix ->nfnl NULL oops Alexey Dobriyan @ 2011-11-08 22:50 ` Alex Bligh 2011-11-09 14:35 ` Pablo Neira Ayuso 2011-11-09 14:34 ` Pablo Neira Ayuso 1 sibling, 1 reply; 13+ messages in thread From: Alex Bligh @ 2011-11-08 22:50 UTC (permalink / raw) To: Alexey Dobriyan, pablo; +Cc: kaber, netfilter-devel, Alex Bligh --On 9 November 2011 01:16:35 +0300 Alexey Dobriyan <adobriyan@gmail.com> wrote: > Sorry for delay. > > I recall myself writing that net->nfnl NULL check is racy or > something like that (but I can't find this email in archives). > > I've read the code once again, and I'm quite sure, > NULL ->nfnl check is correct if RCU precautions are made. Your patch looks better than mine in this respect (as far as I can tell from code reading rather than testing) > Regarding ->report check, I think it's bogus. All I can tell is that net->nfnl == NULL is a condition that happens in practice. If that is read, race free, and treated as something that causes nfnetlink_has_listeners to return 0 irrespective of item_report, then the oops will not occur. So from my point of view it seems right. Note that in the mean time I think my patch made -stable. -- Alex Bligh ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] netfilter: fix ->nfnl NULL oops 2011-11-08 22:50 ` Alex Bligh @ 2011-11-09 14:35 ` Pablo Neira Ayuso 2011-11-09 19:04 ` Alex Bligh 0 siblings, 1 reply; 13+ messages in thread From: Pablo Neira Ayuso @ 2011-11-09 14:35 UTC (permalink / raw) To: Alex Bligh; +Cc: Alexey Dobriyan, kaber, netfilter-devel On Tue, Nov 08, 2011 at 10:50:05PM +0000, Alex Bligh wrote: > --On 9 November 2011 01:16:35 +0300 Alexey Dobriyan > <adobriyan@gmail.com> wrote: > > >Sorry for delay. > > > >I recall myself writing that net->nfnl NULL check is racy or > >something like that (but I can't find this email in archives). > > > >I've read the code once again, and I'm quite sure, > >NULL ->nfnl check is correct if RCU precautions are made. > > Your patch looks better than mine in this respect (as far as I > can tell from code reading rather than testing) > > >Regarding ->report check, I think it's bogus. > > All I can tell is that net->nfnl == NULL is a condition that happens > in practice. If that is read, race free, and treated as something > that causes nfnetlink_has_listeners to return 0 irrespective of > item_report, then the oops will not occur. So from my point of view > it seems right. > > Note that in the mean time I think my patch made -stable. That's news. I personally NACKed that patch. Can you confirm that, please? ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] netfilter: fix ->nfnl NULL oops 2011-11-09 14:35 ` Pablo Neira Ayuso @ 2011-11-09 19:04 ` Alex Bligh 0 siblings, 0 replies; 13+ messages in thread From: Alex Bligh @ 2011-11-09 19:04 UTC (permalink / raw) To: Pablo Neira Ayuso; +Cc: Alexey Dobriyan, kaber, netfilter-devel, Alex Bligh Pablo, --On 9 November 2011 15:35:10 +0100 Pablo Neira Ayuso <pablo@netfilter.org> wrote: >> Note that in the mean time I think my patch made -stable. > > That's news. > > I personally NACKed that patch. > > Can you confirm that, please? I meant -mm. See below. We do need something to fix the Oops even if it isn't the right way. The patch can never do any harm, as far as I can tell, save waste a couple of cycles on container destroy, as if net->nfnl == NULL, it's always going to Oops. -- Alex Bligh ---------- Forwarded Message ---------- Date: 31 October 2011 12:52:22 -0700 From: akpm@linux-foundation.org To: kaber@trash.net CC: davem@davemloft.net, akpm@linux-foundation.org, alex@alex.org.uk, stable@kernel.org, stable@vger.kernel.org Subject: [patch 1/1] net/netfilter/nf_conntrack_netlink.c: fix Oops on container destroy From: Alex Bligh <alex@alex.org.uk> Subject: net/netfilter/nf_conntrack_netlink.c: fix Oops on container destroy Problem: A repeatable Oops can be caused if a container with networking unshared is destroyed when it has nf_conntrack entries yet to expire. A copy of the oops follows below. A perl program generating the oops repeatably is attached inline below. Analysis: The oops is called from cleanup_net when the namespace is destroyed. conntrack iterates through outstanding events and calls death_by_timeout on each of them, which in turn produces a call to ctnetlink_conntrack_event. This calls nf_netlink_has_listeners, which oopses because net->nfnl is NULL. The perl program generates the container through fork() then clone(NS_NEWNET). I does not explicitly set up netlink explicitly set up netlink, but I presume it was set up else net->nfnl would have been NULL earlier (i.e. when an earlier connection timed out). This would thus suggest that net->nfnl is made NULL during the destruction of the container, which I think is done by nfnetlink_net_exit_batch. I can see that the various subsystems are deinitialised in the opposite order to which the relevant register_pernet_subsys calls are called, and both nf_conntrack and nfnetlink_net_ops register their relevant subsystems. If nfnetlink_net_ops registered later than nfconntrack, then its exit routine would have been called first, which would cause the oops described. I am not sure there is anything to prevent this happening in a container environment. Whilst there's perhaps a more complex problem revolving around ordering of subsystem deinit, it seems to me that missing a netlink event on a container that is dying is not a disaster. An early check for net->nfnl being non-NULL in ctnetlink_conntrack_event appears to fix this. There may remain a potential race condition if it becomes NULL immediately after being checked (I am not sure any lock is held at this point or how synchronisation for subsystem deinitialization works). Patch: The patch attached should apply on everything from 2.6.26 (if not before) onwards; it appears to be a problem on all kernels. This was taken against Ubuntu-3.0.0-11.17 which is very close to 3.0.4. I have torture-tested it with the above perl script for 15 minutes or so; the perl script hung the machine within 20 seconds without this patch. Applicability: If this is the right solution, it should be applied to all stable kernels as well as head. Apart from the minor overhead of checking one variable against NULL, it can never 'do the wrong thing', because if net->nfnl is NULL, an oops will inevitably result. Therefore, checking is a reasonable thing to do unless it can be proven than net->nfnl will never be NULL. Check net->nfnl for NULL in ctnetlink_conntrack_event to avoid Oops on container destroy Signed-off-by: Alex Bligh <alex@alex.org.uk> Cc: Patrick McHardy <kaber@trash.net> Cc: David Miller <davem@davemloft.net> Cc: <stable@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> --- net/netfilter/nf_conntrack_netlink.c | 5 +++++ 1 file changed, 5 insertions(+) diff -puN net/netfilter/nf_conntrack_netlink.c~net-netfilter-nf_conntrack_netlinkc-fi x-oops-on-container-destroy net/netfilter/nf_conntrack_netlink.c --- a/net/netfilter/nf_conntrack_netlink.c~net-netfilter-nf_conntrack_netlinkc- fix-oops-on-container-destroy +++ a/net/netfilter/nf_conntrack_netlink.c @@ -570,6 +570,11 @@ ctnetlink_conntrack_event(unsigned int e return 0; net = nf_ct_net(ct); + + /* container deinit, netlink may have died before death_by_timeout */ + if (!net->nfnl) + return 0; + if (!item->report && !nfnetlink_has_listeners(net, group)) return 0; _ ---------- End Forwarded Message ---------- -- Alex Bligh ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] netfilter: fix ->nfnl NULL oops 2011-11-08 22:16 [PATCH] netfilter: fix ->nfnl NULL oops Alexey Dobriyan 2011-11-08 22:50 ` Alex Bligh @ 2011-11-09 14:34 ` Pablo Neira Ayuso 2011-11-09 19:06 ` Alex Bligh 2011-11-15 9:56 ` Pablo Neira Ayuso 1 sibling, 2 replies; 13+ messages in thread From: Pablo Neira Ayuso @ 2011-11-09 14:34 UTC (permalink / raw) To: Alexey Dobriyan; +Cc: alex, kaber, netfilter-devel On Wed, Nov 09, 2011 at 01:16:35AM +0300, Alexey Dobriyan wrote: > Sorry for delay. > > I recall myself writing that net->nfnl NULL check is racy or > something like that (but I can't find this email in archives). > > I've read the code once again, and I'm quite sure, > NULL ->nfnl check is correct if RCU precautions are made. > > Regarding ->report check, I think it's bogus. > > If there are no listeners, there are NO listeners > and whether to report back to userspace doesn't matter. > > I'm sure I'm missing something obvious here. > > Please, review. Alexey. This is a workaround. We have to make ctnl_notifier container-aware which is the real problem. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] netfilter: fix ->nfnl NULL oops 2011-11-09 14:34 ` Pablo Neira Ayuso @ 2011-11-09 19:06 ` Alex Bligh 2011-11-15 9:56 ` Pablo Neira Ayuso 1 sibling, 0 replies; 13+ messages in thread From: Alex Bligh @ 2011-11-09 19:06 UTC (permalink / raw) To: Pablo Neira Ayuso, Alexey Dobriyan; +Cc: kaber, netfilter-devel, Alex Bligh --On 9 November 2011 15:34:23 +0100 Pablo Neira Ayuso <pablo@netfilter.org> wrote: >> I recall myself writing that net->nfnl NULL check is racy or >> something like that (but I can't find this email in archives). >> >> I've read the code once again, and I'm quite sure, >> NULL ->nfnl check is correct if RCU precautions are made. >> >> Regarding ->report check, I think it's bogus. >> >> If there are no listeners, there are NO listeners >> and whether to report back to userspace doesn't matter. >> >> I'm sure I'm missing something obvious here. >> >> Please, review. > > Alexey. This is a workaround. We have to make ctnl_notifier > container-aware which is the real problem. It is indeed a workaround. However, until we have a real solution, we need a workaround, or we can trivially generate fatal oopses. I am all for fixing it properly, but please can we leave the workaround in place until we have the proper answer, as without this we see machines die frequently. -- Alex Bligh ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] netfilter: fix ->nfnl NULL oops 2011-11-09 14:34 ` Pablo Neira Ayuso 2011-11-09 19:06 ` Alex Bligh @ 2011-11-15 9:56 ` Pablo Neira Ayuso 2011-11-15 10:13 ` Alex Bligh 1 sibling, 1 reply; 13+ messages in thread From: Pablo Neira Ayuso @ 2011-11-15 9:56 UTC (permalink / raw) To: Alexey Dobriyan; +Cc: alex, kaber, netfilter-devel [-- Attachment #1: Type: text/plain, Size: 1050 bytes --] On Wed, Nov 09, 2011 at 03:34:23PM +0100, Pablo Neira Ayuso wrote: > On Wed, Nov 09, 2011 at 01:16:35AM +0300, Alexey Dobriyan wrote: > > Sorry for delay. > > > > I recall myself writing that net->nfnl NULL check is racy or > > something like that (but I can't find this email in archives). > > > > I've read the code once again, and I'm quite sure, > > NULL ->nfnl check is correct if RCU precautions are made. > > > > Regarding ->report check, I think it's bogus. > > > > If there are no listeners, there are NO listeners > > and whether to report back to userspace doesn't matter. > > > > I'm sure I'm missing something obvious here. > > > > Please, review. > > Alexey. This is a workaround. We have to make ctnl_notifier > container-aware which is the real problem. I made this patch. It makes the ctnl_notifier container-aware. I'm trying to set some lxc container to test it (it's not that straight forward in debian). btw: if it's fine, i'll add the credits (reported-by and other tags before final submission, this is still a RFC). [-- Attachment #2: ct.patch --] [-- Type: text/x-diff, Size: 11300 bytes --] netfilter: make ctnetlink event callback registration netns aware From: Pablo Neira Ayuso <pablo@netfilter.org> This fixes an oops with the following recipe: 1) container is created. 2) nf_conntrack_netlink is enabled and there are some entries in the conntrack table. 3) one container is destroyed, oops because the conntrack table cleanup tries to report the destroy event to user-space but the nfnetlink socket has already gone (as the nfnetlink socket is net_namespace-aware). To fix this situation, we make the ctnl_notifier netns aware so the callback is registered/unregistered if the container is created/destroyed. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> --- include/net/netfilter/nf_conntrack_ecache.h | 19 ++++--- include/net/netns/conntrack.h | 2 + net/netfilter/nf_conntrack_ecache.c | 37 +++++++------- net/netfilter/nf_conntrack_netlink.c | 73 +++++++++++++++++++-------- 4 files changed, 82 insertions(+), 49 deletions(-) diff --git a/include/net/netfilter/nf_conntrack_ecache.h b/include/net/netfilter/nf_conntrack_ecache.h index 4283508..a88fb69 100644 --- a/include/net/netfilter/nf_conntrack_ecache.h +++ b/include/net/netfilter/nf_conntrack_ecache.h @@ -67,18 +67,18 @@ struct nf_ct_event_notifier { int (*fcn)(unsigned int events, struct nf_ct_event *item); }; -extern struct nf_ct_event_notifier __rcu *nf_conntrack_event_cb; -extern int nf_conntrack_register_notifier(struct nf_ct_event_notifier *nb); -extern void nf_conntrack_unregister_notifier(struct nf_ct_event_notifier *nb); +extern int nf_conntrack_register_notifier(struct net *net, struct nf_ct_event_notifier *nb); +extern void nf_conntrack_unregister_notifier(struct net *net, struct nf_ct_event_notifier *nb); extern void nf_ct_deliver_cached_events(struct nf_conn *ct); static inline void nf_conntrack_event_cache(enum ip_conntrack_events event, struct nf_conn *ct) { + struct net *net = nf_ct_net(ct); struct nf_conntrack_ecache *e; - if (nf_conntrack_event_cb == NULL) + if (net->ct.nf_conntrack_event_cb == NULL) return; e = nf_ct_ecache_find(ct); @@ -95,11 +95,12 @@ nf_conntrack_eventmask_report(unsigned int eventmask, int report) { int ret = 0; + struct net *net = nf_ct_net(ct); struct nf_ct_event_notifier *notify; struct nf_conntrack_ecache *e; rcu_read_lock(); - notify = rcu_dereference(nf_conntrack_event_cb); + notify = rcu_dereference(net->ct.nf_conntrack_event_cb); if (notify == NULL) goto out_unlock; @@ -164,9 +165,8 @@ struct nf_exp_event_notifier { int (*fcn)(unsigned int events, struct nf_exp_event *item); }; -extern struct nf_exp_event_notifier __rcu *nf_expect_event_cb; -extern int nf_ct_expect_register_notifier(struct nf_exp_event_notifier *nb); -extern void nf_ct_expect_unregister_notifier(struct nf_exp_event_notifier *nb); +extern int nf_ct_expect_register_notifier(struct net *net, struct nf_exp_event_notifier *nb); +extern void nf_ct_expect_unregister_notifier(struct net *net, struct nf_exp_event_notifier *nb); static inline void nf_ct_expect_event_report(enum ip_conntrack_expect_events event, @@ -174,11 +174,12 @@ nf_ct_expect_event_report(enum ip_conntrack_expect_events event, u32 pid, int report) { + struct net *net = nf_ct_exp_net(exp); struct nf_exp_event_notifier *notify; struct nf_conntrack_ecache *e; rcu_read_lock(); - notify = rcu_dereference(nf_expect_event_cb); + notify = rcu_dereference(net->ct.nf_expect_event_cb); if (notify == NULL) goto out_unlock; diff --git a/include/net/netns/conntrack.h b/include/net/netns/conntrack.h index 0249399..7a911ec 100644 --- a/include/net/netns/conntrack.h +++ b/include/net/netns/conntrack.h @@ -18,6 +18,8 @@ struct netns_ct { struct hlist_nulls_head unconfirmed; struct hlist_nulls_head dying; struct ip_conntrack_stat __percpu *stat; + struct nf_ct_event_notifier __rcu *nf_conntrack_event_cb; + struct nf_exp_event_notifier __rcu *nf_expect_event_cb; int sysctl_events; unsigned int sysctl_events_retry_timeout; int sysctl_acct; diff --git a/net/netfilter/nf_conntrack_ecache.c b/net/netfilter/nf_conntrack_ecache.c index 3add994..362b3fa 100644 --- a/net/netfilter/nf_conntrack_ecache.c +++ b/net/netfilter/nf_conntrack_ecache.c @@ -26,22 +26,17 @@ static DEFINE_MUTEX(nf_ct_ecache_mutex); -struct nf_ct_event_notifier __rcu *nf_conntrack_event_cb __read_mostly; -EXPORT_SYMBOL_GPL(nf_conntrack_event_cb); - -struct nf_exp_event_notifier __rcu *nf_expect_event_cb __read_mostly; -EXPORT_SYMBOL_GPL(nf_expect_event_cb); - /* deliver cached events and clear cache entry - must be called with locally * disabled softirqs */ void nf_ct_deliver_cached_events(struct nf_conn *ct) { + struct net *net = nf_ct_net(ct); unsigned long events; struct nf_ct_event_notifier *notify; struct nf_conntrack_ecache *e; rcu_read_lock(); - notify = rcu_dereference(nf_conntrack_event_cb); + notify = rcu_dereference(net->ct.nf_conntrack_event_cb); if (notify == NULL) goto out_unlock; @@ -82,19 +77,20 @@ out_unlock: } EXPORT_SYMBOL_GPL(nf_ct_deliver_cached_events); -int nf_conntrack_register_notifier(struct nf_ct_event_notifier *new) +int nf_conntrack_register_notifier(struct net *net, + struct nf_ct_event_notifier *new) { int ret = 0; struct nf_ct_event_notifier *notify; mutex_lock(&nf_ct_ecache_mutex); - notify = rcu_dereference_protected(nf_conntrack_event_cb, + notify = rcu_dereference_protected(net->ct.nf_conntrack_event_cb, lockdep_is_held(&nf_ct_ecache_mutex)); if (notify != NULL) { ret = -EBUSY; goto out_unlock; } - RCU_INIT_POINTER(nf_conntrack_event_cb, new); + RCU_INIT_POINTER(net->ct.nf_conntrack_event_cb, new); mutex_unlock(&nf_ct_ecache_mutex); return ret; @@ -104,32 +100,34 @@ out_unlock: } EXPORT_SYMBOL_GPL(nf_conntrack_register_notifier); -void nf_conntrack_unregister_notifier(struct nf_ct_event_notifier *new) +void nf_conntrack_unregister_notifier(struct net *net, + struct nf_ct_event_notifier *new) { struct nf_ct_event_notifier *notify; mutex_lock(&nf_ct_ecache_mutex); - notify = rcu_dereference_protected(nf_conntrack_event_cb, + notify = rcu_dereference_protected(net->ct.nf_conntrack_event_cb, lockdep_is_held(&nf_ct_ecache_mutex)); BUG_ON(notify != new); - RCU_INIT_POINTER(nf_conntrack_event_cb, NULL); + RCU_INIT_POINTER(net->ct.nf_conntrack_event_cb, NULL); mutex_unlock(&nf_ct_ecache_mutex); } EXPORT_SYMBOL_GPL(nf_conntrack_unregister_notifier); -int nf_ct_expect_register_notifier(struct nf_exp_event_notifier *new) +int nf_ct_expect_register_notifier(struct net *net, + struct nf_exp_event_notifier *new) { int ret = 0; struct nf_exp_event_notifier *notify; mutex_lock(&nf_ct_ecache_mutex); - notify = rcu_dereference_protected(nf_expect_event_cb, + notify = rcu_dereference_protected(net->ct.nf_expect_event_cb, lockdep_is_held(&nf_ct_ecache_mutex)); if (notify != NULL) { ret = -EBUSY; goto out_unlock; } - RCU_INIT_POINTER(nf_expect_event_cb, new); + RCU_INIT_POINTER(net->ct.nf_expect_event_cb, new); mutex_unlock(&nf_ct_ecache_mutex); return ret; @@ -139,15 +137,16 @@ out_unlock: } EXPORT_SYMBOL_GPL(nf_ct_expect_register_notifier); -void nf_ct_expect_unregister_notifier(struct nf_exp_event_notifier *new) +void nf_ct_expect_unregister_notifier(struct net *net, + struct nf_exp_event_notifier *new) { struct nf_exp_event_notifier *notify; mutex_lock(&nf_ct_ecache_mutex); - notify = rcu_dereference_protected(nf_expect_event_cb, + notify = rcu_dereference_protected(net->ct.nf_expect_event_cb, lockdep_is_held(&nf_ct_ecache_mutex)); BUG_ON(notify != new); - RCU_INIT_POINTER(nf_expect_event_cb, NULL); + RCU_INIT_POINTER(net->ct.nf_expect_event_cb, NULL); mutex_unlock(&nf_ct_ecache_mutex); } EXPORT_SYMBOL_GPL(nf_ct_expect_unregister_notifier); diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c index e58aa9b..ef21b22 100644 --- a/net/netfilter/nf_conntrack_netlink.c +++ b/net/netfilter/nf_conntrack_netlink.c @@ -4,7 +4,7 @@ * (C) 2001 by Jay Schulist <jschlst@samba.org> * (C) 2002-2006 by Harald Welte <laforge@gnumonks.org> * (C) 2003 by Patrick Mchardy <kaber@trash.net> - * (C) 2005-2008 by Pablo Neira Ayuso <pablo@netfilter.org> + * (C) 2005-2011 by Pablo Neira Ayuso <pablo@netfilter.org> * * Initial connection tracking via netlink development funded and * generally made possible by Network Robots, Inc. (www.networkrobots.com) @@ -2163,6 +2163,54 @@ MODULE_ALIAS("ip_conntrack_netlink"); MODULE_ALIAS_NFNL_SUBSYS(NFNL_SUBSYS_CTNETLINK); MODULE_ALIAS_NFNL_SUBSYS(NFNL_SUBSYS_CTNETLINK_EXP); +static int __net_init ctnetlink_net_init(struct net *net) +{ +#ifdef CONFIG_NF_CONNTRACK_EVENTS + int ret; + + ret = nf_conntrack_register_notifier(net, &ctnl_notifier); + if (ret < 0) { + pr_err("ctnetlink_init: cannot register notifier.\n"); + goto err_out; + } + + ret = nf_ct_expect_register_notifier(net, &ctnl_notifier_exp); + if (ret < 0) { + pr_err("ctnetlink_init: cannot expect register notifier.\n"); + goto err_unreg_notifier; + } +#endif + return 0; + +#ifdef CONFIG_NF_CONNTRACK_EVENTS +err_unreg_notifier: + nf_conntrack_unregister_notifier(net, &ctnl_notifier); +err_out: + return ret; +#endif +} + +static void ctnetlink_net_exit(struct net *net) +{ +#ifdef CONFIG_NF_CONNTRACK_EVENTS + nf_ct_expect_unregister_notifier(net, &ctnl_notifier_exp); + nf_conntrack_unregister_notifier(net, &ctnl_notifier); +#endif +} + +static void __net_exit ctnetlink_net_exit_batch(struct list_head *net_exit_list) +{ + struct net *net; + + list_for_each_entry(net, net_exit_list, exit_list) + ctnetlink_net_exit(net); +} + +static struct pernet_operations ctnetlink_net_ops = { + .init = ctnetlink_net_init, + .exit_batch = ctnetlink_net_exit_batch, +}; + static int __init ctnetlink_init(void) { int ret; @@ -2180,28 +2228,15 @@ static int __init ctnetlink_init(void) goto err_unreg_subsys; } -#ifdef CONFIG_NF_CONNTRACK_EVENTS - ret = nf_conntrack_register_notifier(&ctnl_notifier); - if (ret < 0) { - pr_err("ctnetlink_init: cannot register notifier.\n"); + if (register_pernet_subsys(&ctnetlink_net_ops)) { + pr_err("ctnetlink_init: cannot register pernet operations\n"); goto err_unreg_exp_subsys; } - ret = nf_ct_expect_register_notifier(&ctnl_notifier_exp); - if (ret < 0) { - pr_err("ctnetlink_init: cannot expect register notifier.\n"); - goto err_unreg_notifier; - } -#endif - return 0; -#ifdef CONFIG_NF_CONNTRACK_EVENTS -err_unreg_notifier: - nf_conntrack_unregister_notifier(&ctnl_notifier); err_unreg_exp_subsys: nfnetlink_subsys_unregister(&ctnl_exp_subsys); -#endif err_unreg_subsys: nfnetlink_subsys_unregister(&ctnl_subsys); err_out: @@ -2213,11 +2248,7 @@ static void __exit ctnetlink_exit(void) pr_info("ctnetlink: unregistering from nfnetlink.\n"); nf_ct_remove_userspace_expectations(); -#ifdef CONFIG_NF_CONNTRACK_EVENTS - nf_ct_expect_unregister_notifier(&ctnl_notifier_exp); - nf_conntrack_unregister_notifier(&ctnl_notifier); -#endif - + unregister_pernet_subsys(&ctnetlink_net_ops); nfnetlink_subsys_unregister(&ctnl_exp_subsys); nfnetlink_subsys_unregister(&ctnl_subsys); } ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH] netfilter: fix ->nfnl NULL oops 2011-11-15 9:56 ` Pablo Neira Ayuso @ 2011-11-15 10:13 ` Alex Bligh 2011-11-15 14:09 ` Pablo Neira Ayuso 2011-11-21 23:39 ` Pablo Neira Ayuso 0 siblings, 2 replies; 13+ messages in thread From: Alex Bligh @ 2011-11-15 10:13 UTC (permalink / raw) To: Pablo Neira Ayuso, Alexey Dobriyan; +Cc: kaber, netfilter-devel, Alex Bligh Pablo, Have you tried this patch (without the ->nfnl NULL check change) with the perl program I wrote that reliably replicates the original bug? https://bugs.launchpad.net/ubuntu/+source/linux-lts-backport-natty/+bug/843892 specifically the attachment in comment #6: https://bugs.launchpad.net/ubuntu/+source/linux-lts-backport-natty/+bug/843892/+attachment/2382526/+files/testns.pl Alex --On 15 November 2011 10:56:42 +0100 Pablo Neira Ayuso <pablo@netfilter.org> wrote: > On Wed, Nov 09, 2011 at 03:34:23PM +0100, Pablo Neira Ayuso wrote: >> On Wed, Nov 09, 2011 at 01:16:35AM +0300, Alexey Dobriyan wrote: >> > Sorry for delay. >> > >> > I recall myself writing that net->nfnl NULL check is racy or >> > something like that (but I can't find this email in archives). >> > >> > I've read the code once again, and I'm quite sure, >> > NULL ->nfnl check is correct if RCU precautions are made. >> > >> > Regarding ->report check, I think it's bogus. >> > >> > If there are no listeners, there are NO listeners >> > and whether to report back to userspace doesn't matter. >> > >> > I'm sure I'm missing something obvious here. >> > >> > Please, review. >> >> Alexey. This is a workaround. We have to make ctnl_notifier >> container-aware which is the real problem. > > I made this patch. It makes the ctnl_notifier container-aware. > > I'm trying to set some lxc container to test it (it's not that > straight forward in debian). > > btw: if it's fine, i'll add the credits (reported-by and other tags > before final submission, this is still a RFC). -- Alex Bligh ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] netfilter: fix ->nfnl NULL oops 2011-11-15 10:13 ` Alex Bligh @ 2011-11-15 14:09 ` Pablo Neira Ayuso 2011-11-21 23:39 ` Pablo Neira Ayuso 1 sibling, 0 replies; 13+ messages in thread From: Pablo Neira Ayuso @ 2011-11-15 14:09 UTC (permalink / raw) To: Alex Bligh; +Cc: Alexey Dobriyan, kaber, netfilter-devel On Tue, Nov 15, 2011 at 10:13:58AM +0000, Alex Bligh wrote: > Pablo, > > Have you tried this patch (without the ->nfnl NULL check change) with > the perl program I wrote that reliably replicates the original bug? > > https://bugs.launchpad.net/ubuntu/+source/linux-lts-backport-natty/+bug/843892 > > specifically the attachment in comment #6: > > https://bugs.launchpad.net/ubuntu/+source/linux-lts-backport-natty/+bug/843892/+attachment/2382526/+files/testns.pl I didn't test my patch yet, I just wanted to ping you that I was looking into this. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] netfilter: fix ->nfnl NULL oops 2011-11-15 10:13 ` Alex Bligh 2011-11-15 14:09 ` Pablo Neira Ayuso @ 2011-11-21 23:39 ` Pablo Neira Ayuso 2011-11-21 23:42 ` Pablo Neira Ayuso 2011-11-21 23:50 ` Alex Bligh 1 sibling, 2 replies; 13+ messages in thread From: Pablo Neira Ayuso @ 2011-11-21 23:39 UTC (permalink / raw) To: Alex Bligh; +Cc: Alexey Dobriyan, kaber, netfilter-devel Hi Alex, On Tue, Nov 15, 2011 at 10:13:58AM +0000, Alex Bligh wrote: > Pablo, > > Have you tried this patch (without the ->nfnl NULL check change) with > the perl program I wrote that reliably replicates the original bug? > > https://bugs.launchpad.net/ubuntu/+source/linux-lts-backport-natty/+bug/843892 > > specifically the attachment in comment #6: > > https://bugs.launchpad.net/ubuntu/+source/linux-lts-backport-natty/+bug/843892/+attachment/2382526/+files/testns.pl I didn't try with this script, but the problem can be easily triggered with: 0) make sure nf_conntrack_netlink and nf_conntrack_ipv4 are loaded. 1) container is started. 2) connect to it via lxc-console. 3) generate some traffic with the container to create some conntrack entries in its table. 4) stop the container: hit the oops. I've been testing the patch that I proposed with this recipe, now it works fine. I'll pass my patch for 3.2-rc soon, in case you want to make further testing of it. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] netfilter: fix ->nfnl NULL oops 2011-11-21 23:39 ` Pablo Neira Ayuso @ 2011-11-21 23:42 ` Pablo Neira Ayuso 2011-11-21 23:50 ` Alex Bligh 1 sibling, 0 replies; 13+ messages in thread From: Pablo Neira Ayuso @ 2011-11-21 23:42 UTC (permalink / raw) To: Alex Bligh; +Cc: Alexey Dobriyan, kaber, netfilter-devel On Tue, Nov 22, 2011 at 12:39:42AM +0100, Pablo Neira Ayuso wrote: > I've been testing the patch that I proposed with this recipe, now it > works fine. > > I'll pass my patch for 3.2-rc soon, in case you want to make further > testing of it. the patch is applied to my nf branch, btw: http://1984.lsi.us.es/git/?p=net/.git;a=shortlog;h=refs/heads/nf ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] netfilter: fix ->nfnl NULL oops 2011-11-21 23:39 ` Pablo Neira Ayuso 2011-11-21 23:42 ` Pablo Neira Ayuso @ 2011-11-21 23:50 ` Alex Bligh 2011-11-22 14:23 ` Pablo Neira Ayuso 1 sibling, 1 reply; 13+ messages in thread From: Alex Bligh @ 2011-11-21 23:50 UTC (permalink / raw) To: Pablo Neira Ayuso; +Cc: Alexey Dobriyan, kaber, netfilter-devel, Alex Bligh --On 22 November 2011 00:39:42 +0100 Pablo Neira Ayuso <pablo@netfilter.org> wrote: > I didn't try with this script, but the problem can be easily > triggered with: > > 0) make sure nf_conntrack_netlink and nf_conntrack_ipv4 are loaded. > 1) container is started. > 2) connect to it via lxc-console. > 3) generate some traffic with the container to create some conntrack > entries in its table. > 4) stop the container: hit the oops. > > I've been testing the patch that I proposed with this recipe, now it > works fine. > > I'll pass my patch for 3.2-rc soon, in case you want to make further > testing of it. I will try to, when I have a minute. Occasionally I was getting a double oops before. My theory at the time was that this was to do with passing traffic /as/ the container was being destroyed. I have little to substantiate that, but that was the reason for the perl script (which really just runs a ping as the container is being destroyed, which ensures there are conntrack entries). -- Alex Bligh ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] netfilter: fix ->nfnl NULL oops 2011-11-21 23:50 ` Alex Bligh @ 2011-11-22 14:23 ` Pablo Neira Ayuso 0 siblings, 0 replies; 13+ messages in thread From: Pablo Neira Ayuso @ 2011-11-22 14:23 UTC (permalink / raw) To: Alex Bligh; +Cc: Alexey Dobriyan, kaber, netfilter-devel On Mon, Nov 21, 2011 at 11:50:36PM +0000, Alex Bligh wrote: > > > --On 22 November 2011 00:39:42 +0100 Pablo Neira Ayuso > <pablo@netfilter.org> wrote: > > >I didn't try with this script, but the problem can be easily > >triggered with: > > > >0) make sure nf_conntrack_netlink and nf_conntrack_ipv4 are loaded. > >1) container is started. > >2) connect to it via lxc-console. > >3) generate some traffic with the container to create some conntrack > > entries in its table. > >4) stop the container: hit the oops. > > > >I've been testing the patch that I proposed with this recipe, now it > >works fine. > > > >I'll pass my patch for 3.2-rc soon, in case you want to make further > >testing of it. > > I will try to, when I have a minute. Occasionally I was getting a > double oops before. My theory at the time was that this was to do > with passing traffic /as/ the container was being destroyed. I have > little to substantiate that, but that was the reason for the perl > script (which really just runs a ping as the container is being > destroyed, which ensures there are conntrack entries). I think my fix is OK but more validation is always welcome, so I'd appreciate if you validate this with your script ;-). ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2011-11-22 14:24 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-11-08 22:16 [PATCH] netfilter: fix ->nfnl NULL oops Alexey Dobriyan 2011-11-08 22:50 ` Alex Bligh 2011-11-09 14:35 ` Pablo Neira Ayuso 2011-11-09 19:04 ` Alex Bligh 2011-11-09 14:34 ` Pablo Neira Ayuso 2011-11-09 19:06 ` Alex Bligh 2011-11-15 9:56 ` Pablo Neira Ayuso 2011-11-15 10:13 ` Alex Bligh 2011-11-15 14:09 ` Pablo Neira Ayuso 2011-11-21 23:39 ` Pablo Neira Ayuso 2011-11-21 23:42 ` Pablo Neira Ayuso 2011-11-21 23:50 ` Alex Bligh 2011-11-22 14:23 ` Pablo Neira Ayuso
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).