public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "billbonaparte" <programme110@gmail.com>
To: <fw@strlen.de>
Cc: <linux-kernel@vger.kernel.org>,
	"Pablo Neira Ayuso" <pablo@netfilter.org>,
	"Patrick McHardy" <kaber@trash.net>, <kadlec@blackhole.kfki.hu>,
	<davem@davemloft.net>, "Changli Gao" <xiaosuo@gmail.com>,
	"Jesper Dangaard Brouer" <brouer@redhat.com>,
	"Andrey Vagin" <avagin@openvz.org>
Subject: Re: netfilter: nf_conntrack: there maybe a bug in __nf_conntrack_confirm, when it race against get_next_corpse
Date: Tue, 4 Nov 2014 09:48:32 +0800	[thread overview]
Message-ID: <012601cff7d1$7ce2d620$76a88260$@gmail.com> (raw)

[-- Attachment #1: Type: text/plain, Size: 1541 bytes --]

(sorry to send this e-mail again, last mail is rejected by server due to
non-acceptable content)

Florian Westphal [mailto:fw@strlen.de] wrote:
>Correct.  This is broken since the central spin lock removal, since 
>nf_conntrack_lock no longer protects both get_next_corpse and 
>conntrack_confirm.
> 
>Please send a patch, moving dying check after removal of conntrack from 
>the percpu list,
Since unconfirmed conntrack is stored in unconfirmed-list which is per-cpu
list and protected by per-cpu spin-lock, we can remove it from
uncomfirmed-list and insert it into ct-hash-table separately. that is to
say, we can remove it from uncomfirmed-list without holding corresponding
hash-lock, then check if it is dying.
if it is dying, we add it to the dying-list, then quit
__nf_conntrack_confirm. we do this to follow the rules that the conntrack
must alternatively at unconfirmed-list or dying-list when it is abort to be
destroyed.

>> 	2. operation on ct->status should be atomic, because it race aginst 
>> get_next_corpse.
>
>Alternatively we could also get rid of the unconfirmed list handling in
get_next_corpse, 
>it looks to me as if its simply not worth the trouble to also caring 
>about
unconfirmed lists.

yes, I think so. 
if there is a race at operating ct->status, there will be in alternative
case:
1) IPS_DYING bit which set in get_next_corpse override other bits (e.g.
IPS_SRC_NAT_DONE_BIT), or
2) other bits (e.g. IPS_SRC_NAT_DONE_BIT) which set in nf_nat_setup_info
override IPS_DYING bit.
but, any case seems to be okay.

[-- Attachment #2: fix_conntrack_confirm_race.patch --]
[-- Type: application/octet-stream, Size: 2691 bytes --]

>From c454ca5a96f5b6f815fe29cc2c91c92d719d7b95 Mon Sep 17 00:00:00 2001
From: bill bonaparte <programme110@gmail.com>
Date: Mon, 3 Nov 2014 17:13:51 +0800
Subject: [PATCH] netfilter: nf_conntrack: fix a race in
 __nf_conntrack_confirm against nf_ct_get_next_corpse

After we remove central spinlock nf_conntrack_lock, we get the race against nf_ct_get_next_corpse again,
to get rid of this race, we should remove the conntrack from the unconfirmed-list (which is per-cpu list) firstly,
then check if it is dying.
---
 net/netfilter/nf_conntrack_core.c |   28 ++++++++++++++++------------
 1 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 5016a69..5d54a18 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -589,6 +589,22 @@ __nf_conntrack_confirm(struct sk_buff *skb)
 
 	zone = nf_ct_zone(ct);
 	local_bh_disable();
+	
+	/* We have to check the DYING flag after unlink the conntrack
+	   to prevent a race against nf_ct_get_next_corpse() possibly 
+	   called from user context, else we insert an already 'dead' hash, 
+	   blocking further use of that particular connection -JM */
+	nf_ct_del_from_dying_or_unconfirmed_list(ct);
+	
+	if (unlikely(nf_ct_is_dying(ct))) {
+	    /* let's follow the rules that the conntrack must
+	       alternatively at the unconfirmed-list or dying-list
+	       when it is abort to be destoryed 
+	     */
+		nf_ct_add_to_dying_list(ct);
+		local_bh_enable();
+		return NF_ACCEPT;
+	}
 
 	do {
 		sequence = read_seqcount_begin(&net->ct.generation);
@@ -611,16 +627,6 @@ __nf_conntrack_confirm(struct sk_buff *skb)
 	 */
 	NF_CT_ASSERT(!nf_ct_is_confirmed(ct));
 	pr_debug("Confirming conntrack %p\n", ct);
-	/* We have to check the DYING flag inside the lock to prevent
-	   a race against nf_ct_get_next_corpse() possibly called from
-	   user context, else we insert an already 'dead' hash, blocking
-	   further use of that particular connection -JM */
-
-	if (unlikely(nf_ct_is_dying(ct))) {
-		nf_conntrack_double_unlock(hash, reply_hash);
-		local_bh_enable();
-		return NF_ACCEPT;
-	}
 
 	/* See if there's one in the list already, including reverse:
 	   NAT could have grabbed it without realizing, since we're
@@ -636,8 +642,6 @@ __nf_conntrack_confirm(struct sk_buff *skb)
 		    zone == nf_ct_zone(nf_ct_tuplehash_to_ctrack(h)))
 			goto out;
 
-	nf_ct_del_from_dying_or_unconfirmed_list(ct);
-
 	/* Timer relative to confirmation time, not original
 	   setting time, otherwise we'd get timer wrap in
 	   weird delay cases. */
-- 
1.7.5.4


             reply	other threads:[~2014-11-04  1:49 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-11-04  1:48 billbonaparte [this message]
2014-11-06 13:00 ` netfilter: nf_conntrack: there maybe a bug in __nf_conntrack_confirm, when it race against get_next_corpse Jesper Dangaard Brouer
  -- strict thread matches above, loose matches on Subject: below --
2014-11-07  6:47 Bill Bonaparte
     [not found] <02ef01cff25f$29887f60$7c997e20$@gmail.com>
2014-10-28  3:37 ` billbonaparte
2014-10-28  9:46   ` Florian Westphal
2014-10-28 10:11   ` Jesper Dangaard Brouer
2014-10-28  3:27 billbonaparte

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='012601cff7d1$7ce2d620$76a88260$@gmail.com' \
    --to=programme110@gmail.com \
    --cc=avagin@openvz.org \
    --cc=brouer@redhat.com \
    --cc=davem@davemloft.net \
    --cc=fw@strlen.de \
    --cc=kaber@trash.net \
    --cc=kadlec@blackhole.kfki.hu \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pablo@netfilter.org \
    --cc=xiaosuo@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox