From: "billbonaparte" <programme110@gmail.com>
To: <fw@strlen.de>
Cc: <linux-kernel@vger.kernel.org>,
"Pablo Neira Ayuso" <pablo@netfilter.org>,
"Patrick McHardy" <kaber@trash.net>, <kadlec@blackhole.kfki.hu>,
<davem@davemloft.net>, "Changli Gao" <xiaosuo@gmail.com>,
"Jesper Dangaard Brouer" <brouer@redhat.com>,
"Andrey Vagin" <avagin@openvz.org>
Subject: Re: netfilter: nf_conntrack: there maybe a bug in __nf_conntrack_confirm, when it race against get_next_corpse
Date: Tue, 4 Nov 2014 09:48:32 +0800 [thread overview]
Message-ID: <012601cff7d1$7ce2d620$76a88260$@gmail.com> (raw)
[-- Attachment #1: Type: text/plain, Size: 1541 bytes --]
(sorry to send this e-mail again, last mail is rejected by server due to
non-acceptable content)
Florian Westphal [mailto:fw@strlen.de] wrote:
>Correct. This is broken since the central spin lock removal, since
>nf_conntrack_lock no longer protects both get_next_corpse and
>conntrack_confirm.
>
>Please send a patch, moving dying check after removal of conntrack from
>the percpu list,
Since unconfirmed conntrack is stored in unconfirmed-list which is per-cpu
list and protected by per-cpu spin-lock, we can remove it from
uncomfirmed-list and insert it into ct-hash-table separately. that is to
say, we can remove it from uncomfirmed-list without holding corresponding
hash-lock, then check if it is dying.
if it is dying, we add it to the dying-list, then quit
__nf_conntrack_confirm. we do this to follow the rules that the conntrack
must alternatively at unconfirmed-list or dying-list when it is abort to be
destroyed.
>> 2. operation on ct->status should be atomic, because it race aginst
>> get_next_corpse.
>
>Alternatively we could also get rid of the unconfirmed list handling in
get_next_corpse,
>it looks to me as if its simply not worth the trouble to also caring
>about
unconfirmed lists.
yes, I think so.
if there is a race at operating ct->status, there will be in alternative
case:
1) IPS_DYING bit which set in get_next_corpse override other bits (e.g.
IPS_SRC_NAT_DONE_BIT), or
2) other bits (e.g. IPS_SRC_NAT_DONE_BIT) which set in nf_nat_setup_info
override IPS_DYING bit.
but, any case seems to be okay.
[-- Attachment #2: fix_conntrack_confirm_race.patch --]
[-- Type: application/octet-stream, Size: 2691 bytes --]
>From c454ca5a96f5b6f815fe29cc2c91c92d719d7b95 Mon Sep 17 00:00:00 2001
From: bill bonaparte <programme110@gmail.com>
Date: Mon, 3 Nov 2014 17:13:51 +0800
Subject: [PATCH] netfilter: nf_conntrack: fix a race in
__nf_conntrack_confirm against nf_ct_get_next_corpse
After we remove central spinlock nf_conntrack_lock, we get the race against nf_ct_get_next_corpse again,
to get rid of this race, we should remove the conntrack from the unconfirmed-list (which is per-cpu list) firstly,
then check if it is dying.
---
net/netfilter/nf_conntrack_core.c | 28 ++++++++++++++++------------
1 files changed, 16 insertions(+), 12 deletions(-)
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 5016a69..5d54a18 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -589,6 +589,22 @@ __nf_conntrack_confirm(struct sk_buff *skb)
zone = nf_ct_zone(ct);
local_bh_disable();
+
+ /* We have to check the DYING flag after unlink the conntrack
+ to prevent a race against nf_ct_get_next_corpse() possibly
+ called from user context, else we insert an already 'dead' hash,
+ blocking further use of that particular connection -JM */
+ nf_ct_del_from_dying_or_unconfirmed_list(ct);
+
+ if (unlikely(nf_ct_is_dying(ct))) {
+ /* let's follow the rules that the conntrack must
+ alternatively at the unconfirmed-list or dying-list
+ when it is abort to be destoryed
+ */
+ nf_ct_add_to_dying_list(ct);
+ local_bh_enable();
+ return NF_ACCEPT;
+ }
do {
sequence = read_seqcount_begin(&net->ct.generation);
@@ -611,16 +627,6 @@ __nf_conntrack_confirm(struct sk_buff *skb)
*/
NF_CT_ASSERT(!nf_ct_is_confirmed(ct));
pr_debug("Confirming conntrack %p\n", ct);
- /* We have to check the DYING flag inside the lock to prevent
- a race against nf_ct_get_next_corpse() possibly called from
- user context, else we insert an already 'dead' hash, blocking
- further use of that particular connection -JM */
-
- if (unlikely(nf_ct_is_dying(ct))) {
- nf_conntrack_double_unlock(hash, reply_hash);
- local_bh_enable();
- return NF_ACCEPT;
- }
/* See if there's one in the list already, including reverse:
NAT could have grabbed it without realizing, since we're
@@ -636,8 +642,6 @@ __nf_conntrack_confirm(struct sk_buff *skb)
zone == nf_ct_zone(nf_ct_tuplehash_to_ctrack(h)))
goto out;
- nf_ct_del_from_dying_or_unconfirmed_list(ct);
-
/* Timer relative to confirmation time, not original
setting time, otherwise we'd get timer wrap in
weird delay cases. */
--
1.7.5.4
next reply other threads:[~2014-11-04 1:49 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-11-04 1:48 billbonaparte [this message]
2014-11-06 13:00 ` netfilter: nf_conntrack: there maybe a bug in __nf_conntrack_confirm, when it race against get_next_corpse Jesper Dangaard Brouer
2014-11-06 13:36 ` [PATCH nf] netfilter: conntrack: fix race in __nf_conntrack_confirm " Jesper Dangaard Brouer
2014-11-10 16:54 ` Pablo Neira Ayuso
2014-11-12 7:35 ` Jesper Dangaard Brouer
2014-11-12 10:57 ` Jörg Marx
2014-11-13 12:08 ` Pablo Neira Ayuso
2014-11-13 14:33 ` Jörg Marx
2014-11-14 16:40 ` Pablo Neira Ayuso
-- strict thread matches above, loose matches on Subject: below --
2014-11-07 6:47 netfilter: nf_conntrack: there maybe a bug in __nf_conntrack_confirm, when it race " Bill Bonaparte
2014-11-04 1:52 billbonaparte
[not found] <02ef01cff25f$29887f60$7c997e20$@gmail.com>
2014-10-28 3:37 ` billbonaparte
2014-10-28 9:46 ` Florian Westphal
2014-10-28 10:11 ` Jesper Dangaard Brouer
2014-10-28 3:27 billbonaparte
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='012601cff7d1$7ce2d620$76a88260$@gmail.com' \
--to=programme110@gmail.com \
--cc=avagin@openvz.org \
--cc=brouer@redhat.com \
--cc=davem@davemloft.net \
--cc=fw@strlen.de \
--cc=kaber@trash.net \
--cc=kadlec@blackhole.kfki.hu \
--cc=linux-kernel@vger.kernel.org \
--cc=pablo@netfilter.org \
--cc=xiaosuo@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.