netfilter: nf_conntrack: there maybe a bug in __nf_conntrack_confirm, when it race against get_next

public inbox for linux-kernel@vger.kernel.org

* netfilter: nf_conntrack: there maybe a bug in __nf_conntrack_confirm, when it race against get_next_corpse
       [not found] <02ef01cff25f$29887f60$7c997e20$@gmail.com>
@ 2014-10-28  3:37 ` billbonaparte
  2014-10-28  9:46   ` Florian Westphal
  2014-10-28 10:11   ` Jesper Dangaard Brouer
From: billbonaparte @ 2014-10-28  3:37 UTC
  To: linux-kernel, 'Netfilter Developer Mailing List',
	'Pablo Neira Ayuso', 'Patrick McHardy', kadlec,
	davem
  Cc: 'Changli Gao', 'Jozsef Kadlecsik',
	'Jesper Dangaard Brouer', 'Andrey Vagin'

Hi, all:
	Sorry for sending this mail again; the last mail didn't show the text
clearly.
	In __nf_conntrack_confirm, we check whether the conntrack is
already dead before inserting it into the hash table.
	We do this because inserting an already 'dead' entry into the hash
would block further use of that particular connection.
	But we don't do that right.
	Let's consider the following case:
	
	cpu1                                              cpu2
__nf_conntrack_confirm                          get_next_corpse
   lock corresponding hash-list                        ....
   check nf_ct_is_dying(ct)
for_each_possible_cpu(cpu) {
		......
spin_lock_bh(&pcpu->lock);
		......
set_bit(IPS_DYING_BIT, &ct->status);
   nf_ct_del_from_dying_or_unconfirmed_list(ct);
spin_unlock_bh(&pcpu->lock);
   add_timer(&ct->timeout);                          }	
   ct->status |= IPS_CONFIRMED;
   __nf_conntrack_hash_insert(ct);   /* the conntrack has already been set as dying */


	The above case reveals two problems:
	1. We may insert a dead conntrack into the hash table, which will
block further use of that particular connection.
	2. Operations on ct->status should be atomic, because they race
against get_next_corpse.
	  For the same reason, the operation on ct->status in
nf_nat_setup_info should be atomic as well.

	To resolve the first problem, we must first delete the unconfirmed
conntrack from the unconfirmed list, and only then check whether it is
already dead.
	Am I right to do this?
	I would appreciate any comments and replies.
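
The interleaving described above can be replayed in a small single-threaded C
model. This is only a sketch under assumptions: the demo_* names and the
boolean fields are hypothetical stand-ins for the real conntrack state, the
per-cpu and hash locks are not modelled, and the other cpu's scan is simply
called at the point where it would win the race.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one conntrack entry; not a kernel structure. */
struct demo_ct {
	bool dying;          /* stands in for IPS_DYING_BIT in ct->status */
	bool on_unconfirmed; /* still linked on the per-cpu unconfirmed list */
	bool in_hash;        /* inserted into the conntrack hash table */
};

/* cpu2: the corpse scan can only mark entries it can still find on the list. */
static void demo_corpse_scan(struct demo_ct *ct)
{
	if (ct->on_unconfirmed)
		ct->dying = true;
}

/* Current order: check dying, (cpu2 runs here), unlink, insert. */
static bool demo_confirm_check_first(struct demo_ct *ct)
{
	if (ct->dying)
		return false;
	demo_corpse_scan(ct);        /* race window: cpu2 marks ct as dying */
	ct->on_unconfirmed = false;
	ct->in_hash = true;          /* a dead entry lands in the hash table */
	return true;
}

/* Proposed order: unlink first, then check dying. */
static bool demo_confirm_unlink_first(struct demo_ct *ct, bool scan_before_unlink)
{
	if (scan_before_unlink)
		demo_corpse_scan(ct);
	ct->on_unconfirmed = false;
	if (!scan_before_unlink)
		demo_corpse_scan(ct);    /* too late: ct is no longer on the list */
	if (ct->dying)
		return false;            /* caught: the entry is not inserted */
	ct->in_hash = true;
	return true;
}
```

In this model the check-first order ends with an entry that is both dying and
in the hash table, while the unlink-first order either rejects the entry or
inserts one that the scan could no longer reach.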



* Re: netfilter: nf_conntrack: there maybe a bug in __nf_conntrack_confirm, when it race against get_next_corpse
  2014-10-28  3:37 ` netfilter: nf_conntrack: there maybe a bug in __nf_conntrack_confirm, when it race against get_next_corpse billbonaparte
@ 2014-10-28  9:46   ` Florian Westphal
  2014-10-28 10:11   ` Jesper Dangaard Brouer
From: Florian Westphal @ 2014-10-28  9:46 UTC
  To: billbonaparte
  Cc: linux-kernel, 'Netfilter Developer Mailing List',
	'Pablo Neira Ayuso', 'Patrick McHardy', kadlec,
	davem, 'Changli Gao', 'Jesper Dangaard Brouer',
	'Andrey Vagin'

billbonaparte <programme110@gmail.com> wrote:
> 	In function __nf_conntrack_confirm, we check whether the conntrack is
> already dead before inserting it into the hash table.
> 	We do this because inserting an already 'dead' entry into the hash
> would block further use of that particular connection.
> 	but we don't do that right.

Correct.  This is broken since the central spin lock removal, since
nf_conntrack_lock no longer protects both get_next_corpse and
conntrack_confirm.

Please send a patch, moving dying check after removal of conntrack from
the percpu list, and add

Fixes: 93bb0ceb75be2 (netfilter: conntrack: remove central spinlock nf_conntrack_lock)

tag to patch.

> 	The above case reveals two problems:
> 	1. We may insert a dead conntrack into the hash table, which will
> block further use of that particular connection.

Yes.

> 	2. Operations on ct->status should be atomic, because they race
> against get_next_corpse.

Alternatively, we could get rid of the unconfirmed list handling in
get_next_corpse; it looks to me as if it's simply not worth the trouble
to also care about unconfirmed lists.


* Re: netfilter: nf_conntrack: there maybe a bug in __nf_conntrack_confirm, when it race against get_next_corpse
  2014-10-28  3:37 ` netfilter: nf_conntrack: there maybe a bug in __nf_conntrack_confirm, when it race against get_next_corpse billbonaparte
  2014-10-28  9:46   ` Florian Westphal
@ 2014-10-28 10:11   ` Jesper Dangaard Brouer
From: Jesper Dangaard Brouer @ 2014-10-28 10:11 UTC
  To: billbonaparte
  Cc: linux-kernel, 'Netfilter Developer Mailing List',
	'Pablo Neira Ayuso', 'Patrick McHardy', kadlec,
	davem, 'Changli Gao', 'Andrey Vagin', brouer,
	netdev@vger.kernel.org


On Tue, 28 Oct 2014 11:37:31 +0800 "billbonaparte" <programme110@gmail.com> wrote:

> Hi, all:
> sorry for sending this mail again, the last mail doesn't show text
> clearly.

This one also mangles the text, so I cannot follow the race you are
describing.  I'll try to reconstruct...

> In function __nf_conntrack_confirm, we check whether the conntrack is
> already dead before inserting it into the hash table.
> We do this because inserting an already 'dead' entry into the hash
> would block further use of that particular connection.

Have you run into this problem in practice, or is this based on a
theory?

> but we don't do that right.
> let's consider the following case:
> 
[tried to reconstruct]

> 	cpu1                             cpu2
> __nf_conntrack_confirm             get_next_corpse
>   lock corresponding hash-list      ....
>   check nf_ct_is_dying(ct)          ....
>    .....                           for_each_possible_cpu(cpu) {
>    .....                           (processing &pcpu->unconfirmed)
>    .....                           spin_lock_bh(&pcpu->lock);
>    .....                           set_bit(IPS_DYING_BIT, &ct->status);
>    .....                           spin_unlock_bh(&pcpu->lock);
>  spin_lock_bh(&pcpu->lock);
>  nf_ct_del_from_dying_or_unconfirmed_list(ct);
>  spin_unlock_bh(&pcpu->lock);
>
>  add_timer(&ct->timeout);
>  ct->status |= IPS_CONFIRMED;
>  __nf_conntrack_hash_insert(ct);
>   /* the conntrack has already been set as dying */

Yes, I think you are correct.  There is a race, as we are modifying
ct->status without holding the hash bucket lock.


> The above case reveals two problems:
> 	1. We may insert a dead conntrack into the hash table, which will
> block further use of that particular connection.
> 	2. Operations on ct->status should be atomic, because they race
> against get_next_corpse.
> 	  For the same reason, the operation on ct->status in
> nf_nat_setup_info should be atomic as well.
> 
> 	To resolve the first problem, we must first delete the unconfirmed
> conntrack from the unconfirmed list, and only then check whether it is
> already dead.

Guess that would be one approach.

> 	Am I right to do this ?
> 	Appreciate any comments and reply.

Perhaps we could get rid of unconfirmed list handling in get_next_corpse?

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer


* re: netfilter: nf_conntrack: there maybe a bug in __nf_conntrack_confirm, when it race against get_next_corpse
@ 2014-11-07  6:47 Bill Bonaparte
From: Bill Bonaparte @ 2014-11-07  6:47 UTC
  To: 'Jesper Dangaard Brouer'
  Cc: fw, linux-kernel, 'Pablo Neira Ayuso',
	'Patrick McHardy', kadlec, davem, 'Changli Gao',
	'Andrey Vagin', netfilter-devel, netdev


On Tue, 6 Nov 2014 21:01:00 
"Jesper" <brouter@redhat.com> wrote:
> There are several issues with your submission.  I'll take care of
> resubmitting a patch in your name (so you will get credit in the git log).
>
> If you care to know, the issues are:
>  1. you are not sending to the appropriate mailing lists,
>  2. the patch is sent as an attachment (it should be inlined),
>  3. the patch has style and white-space issues.

Thanks, Jesper. This is my first time submitting a patch, and I don't know
much about the rules.  I will learn them soon.

>> If there is a race when operating on ct->status, one of two cases
>> will occur:
>> 1) the IPS_DYING bit set in get_next_corpse overrides other bits (e.g.
>> IPS_SRC_NAT_DONE_BIT), or
>> 2) other bits (e.g. IPS_SRC_NAT_DONE_BIT) set in nf_nat_setup_info
>> override the IPS_DYING bit.
>
> Notice the set_bit() is atomic, so we don't have these issues (of bits
> getting overridden).

In most cases, we operate on ct->status atomically (with set_bit), but
in nf_nat_setup_info we assume that an unconfirmed ct is always held by
the current cpu and cannot race against other cpus, so we don't use
set_bit.
The following code is extracted from nf_nat_setup_info:
	/* Non-atomic: we own this at the moment. */
	if (maniptype == NF_NAT_MANIP_SRC)
		ct->status |= IPS_SRC_NAT;
	else
		ct->status |= IPS_DST_NAT;
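
The lost update that such a non-atomic `|=` could suffer can be replayed
deterministically in userspace C. This is a sketch under assumptions: the
DEMO_* bit values are illustrative rather than copied from kernel headers, and
the other cpu's set_bit() is modelled as a plain statement placed between the
load and the store that make up `|=`.

```c
#include <assert.h>

/* Illustrative bit values for this sketch; not copied from kernel headers. */
#define DEMO_SRC_NAT (1UL << 4)
#define DEMO_DYING   (1UL << 9)

/*
 * Replays "status |= DEMO_SRC_NAT" split into its load and store halves,
 * with another cpu setting DEMO_DYING in between.
 */
static unsigned long demo_lost_update(unsigned long status)
{
	unsigned long snapshot = status;   /* cpu1: load the old value */

	status |= DEMO_DYING;              /* cpu2: sets the dying bit */
	status = snapshot | DEMO_SRC_NAT;  /* cpu1: store overwrites cpu2's bit */
	return status;
}
```

Starting from a status of 0, the result carries DEMO_SRC_NAT only; the dying
bit set in the middle is gone, which is case 2) from the earlier mail.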

--
Best regards,
  Bill Bonaparte




* Re: netfilter: nf_conntrack: there maybe a bug in __nf_conntrack_confirm, when it race against get_next_corpse
@ 2014-11-04  1:48 billbonaparte
  2014-11-06 13:00 ` Jesper Dangaard Brouer
From: billbonaparte @ 2014-11-04  1:48 UTC
  To: fw
  Cc: linux-kernel, Pablo Neira Ayuso, Patrick McHardy, kadlec, davem,
	Changli Gao, Jesper Dangaard Brouer, Andrey Vagin

[-- Attachment #1: Type: text/plain, Size: 1541 bytes --]

(Sorry for sending this e-mail again; the last mail was rejected by the
server due to non-acceptable content.)

Florian Westphal [mailto:fw@strlen.de] wrote:
>Correct.  This is broken since the central spin lock removal, since 
>nf_conntrack_lock no longer protects both get_next_corpse and 
>conntrack_confirm.
> 
>Please send a patch, moving dying check after removal of conntrack from 
>the percpu list,
Since an unconfirmed conntrack is stored in the unconfirmed list, which is a
per-cpu list protected by a per-cpu spinlock, we can remove it from the
unconfirmed list and insert it into the ct hash table as separate steps. That
is to say, we can remove it from the unconfirmed list without holding the
corresponding hash lock, and then check whether it is dying.
If it is dying, we add it to the dying list and then leave
__nf_conntrack_confirm. We do this to follow the rule that a conntrack
must be on either the unconfirmed list or the dying list when it is about
to be destroyed.

>> 	2. Operations on ct->status should be atomic, because they race
>> against get_next_corpse.
>
> Alternatively, we could get rid of the unconfirmed list handling in
> get_next_corpse; it looks to me as if it's simply not worth the trouble
> to also care about unconfirmed lists.

Yes, I think so.
If there is a race when operating on ct->status, one of two cases will
occur:
1) the IPS_DYING bit set in get_next_corpse overrides other bits (e.g.
IPS_SRC_NAT_DONE_BIT), or
2) other bits (e.g. IPS_SRC_NAT_DONE_BIT) set in nf_nat_setup_info
override the IPS_DYING bit.
But either case seems to be okay.

[-- Attachment #2: fix_conntrack_confirm_race.patch --]
[-- Type: application/octet-stream, Size: 2691 bytes --]

>From c454ca5a96f5b6f815fe29cc2c91c92d719d7b95 Mon Sep 17 00:00:00 2001
From: bill bonaparte <programme110@gmail.com>
Date: Mon, 3 Nov 2014 17:13:51 +0800
Subject: [PATCH] netfilter: nf_conntrack: fix a race in
 __nf_conntrack_confirm against nf_ct_get_next_corpse

After the central spinlock nf_conntrack_lock was removed, the race against
nf_ct_get_next_corpse is back. To get rid of this race, we should first
remove the conntrack from the unconfirmed list (which is a per-cpu list),
and only then check whether it is dying.
---
 net/netfilter/nf_conntrack_core.c |   28 ++++++++++++++++------------
 1 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 5016a69..5d54a18 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -589,6 +589,22 @@ __nf_conntrack_confirm(struct sk_buff *skb)
 
 	zone = nf_ct_zone(ct);
 	local_bh_disable();
+	
+	/* We have to check the DYING flag after unlink the conntrack
+	   to prevent a race against nf_ct_get_next_corpse() possibly 
+	   called from user context, else we insert an already 'dead' hash, 
+	   blocking further use of that particular connection -JM */
+	nf_ct_del_from_dying_or_unconfirmed_list(ct);
+	
+	if (unlikely(nf_ct_is_dying(ct))) {
+	    /* let's follow the rule that the conntrack must be on
+	       either the unconfirmed-list or the dying-list
+	       when it is about to be destroyed
+	     */
+		nf_ct_add_to_dying_list(ct);
+		local_bh_enable();
+		return NF_ACCEPT;
+	}
 
 	do {
 		sequence = read_seqcount_begin(&net->ct.generation);
@@ -611,16 +627,6 @@ __nf_conntrack_confirm(struct sk_buff *skb)
 	 */
 	NF_CT_ASSERT(!nf_ct_is_confirmed(ct));
 	pr_debug("Confirming conntrack %p\n", ct);
-	/* We have to check the DYING flag inside the lock to prevent
-	   a race against nf_ct_get_next_corpse() possibly called from
-	   user context, else we insert an already 'dead' hash, blocking
-	   further use of that particular connection -JM */
-
-	if (unlikely(nf_ct_is_dying(ct))) {
-		nf_conntrack_double_unlock(hash, reply_hash);
-		local_bh_enable();
-		return NF_ACCEPT;
-	}
 
 	/* See if there's one in the list already, including reverse:
 	   NAT could have grabbed it without realizing, since we're
@@ -636,8 +642,6 @@ __nf_conntrack_confirm(struct sk_buff *skb)
 		    zone == nf_ct_zone(nf_ct_tuplehash_to_ctrack(h)))
 			goto out;
 
-	nf_ct_del_from_dying_or_unconfirmed_list(ct);
-
 	/* Timer relative to confirmation time, not original
 	   setting time, otherwise we'd get timer wrap in
 	   weird delay cases. */
-- 
1.7.5.4



* Re: netfilter: nf_conntrack: there maybe a bug in __nf_conntrack_confirm, when it race against get_next_corpse
  2014-11-04  1:48 billbonaparte
@ 2014-11-06 13:00 ` Jesper Dangaard Brouer
From: Jesper Dangaard Brouer @ 2014-11-06 13:00 UTC
  To: billbonaparte
  Cc: fw, linux-kernel, Pablo Neira Ayuso, Patrick McHardy, kadlec,
	davem, Changli Gao, Andrey Vagin, brouer,
	netfilter-devel@vger.kernel.org, netdev@vger.kernel.org

On Tue, 4 Nov 2014 09:48:32 +0800
"billbonaparte" <programme110@gmail.com> wrote:

> (sorry to send this e-mail again, last mail is rejected by server due to
> non-acceptable content)

There are several issues with your submission.  I'll take care of
resubmitting a patch in your name (so you will get credit in the git
log).

If you care to know, the issues are:
 1. you are not sending to the appropriate mailing lists,
 2. the patch is sent as an attachment (it should be inlined),
 3. the patch has style and white-space issues.


> Florian Westphal [mailto:fw@strlen.de] wrote:
> >Correct.  This is broken since the central spin lock removal, since 
> >nf_conntrack_lock no longer protects both get_next_corpse and 
> >conntrack_confirm.
> > 
> >Please send a patch, moving dying check after removal of conntrack from 
> >the percpu list,
>
> Since an unconfirmed conntrack is stored in the unconfirmed list, which is a
> per-cpu list protected by a per-cpu spinlock, we can remove it from the
> unconfirmed list and insert it into the ct hash table as separate steps. That
> is to say, we can remove it from the unconfirmed list without holding the
> corresponding hash lock, and then check whether it is dying.
> If it is dying, we add it to the dying list and then leave
> __nf_conntrack_confirm. We do this to follow the rule that a conntrack
> must be on either the unconfirmed list or the dying list when it is about
> to be destroyed.

In the resubmit, I'll take a slightly more conservative approach by
keeping the DYING check under the hash-lock, as it is currently.  I
guess we could do it without holding the hash-lock, but I want to keep
the fix as simple as possible.


> >> 	2. Operations on ct->status should be atomic, because they race
> >> against get_next_corpse.
[...]
> If there is a race when operating on ct->status, one of two cases will
> occur:
> 1) the IPS_DYING bit set in get_next_corpse overrides other bits (e.g.
> IPS_SRC_NAT_DONE_BIT), or
> 2) other bits (e.g. IPS_SRC_NAT_DONE_BIT) set in nf_nat_setup_info
> override the IPS_DYING bit.

Notice the set_bit() is atomic, so we don't have these issues (of bits
getting overridden).
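
As a rough userspace analogue of that point, an atomic OR is a single
indivisible read-modify-write, so an update from one side cannot erase a bit
set by the other. This is a sketch under assumptions: demo_set_bit() is a
hypothetical stand-in built on C11 atomic_fetch_or, not the kernel's
set_bit(), and the bit numbers are illustrative.

```c
#include <assert.h>
#include <stdatomic.h>

/* Illustrative bit numbers for this sketch; not taken from kernel headers. */
enum { DEMO_SRC_NAT_BIT = 4, DEMO_DYING_BIT = 9 };

/* Userspace stand-in for set_bit(): one indivisible read-modify-write. */
static void demo_set_bit(int nr, atomic_ulong *word)
{
	atomic_fetch_or(word, 1UL << nr);
}

/* Sets both bits one after the other and returns the resulting word. */
static unsigned long demo_both_bits(void)
{
	atomic_ulong status = 0;

	demo_set_bit(DEMO_DYING_BIT, &status);
	demo_set_bit(DEMO_SRC_NAT_BIT, &status);
	return atomic_load(&status);
}
```

The guarantee only holds when every writer uses the atomic operation; a plain
`|=` on the same word from another context is not covered by it.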

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer


* netfilter: nf_conntrack: there maybe a bug in __nf_conntrack_confirm, when it race against get_next_corpse
@ 2014-10-28  3:27 billbonaparte
From: billbonaparte @ 2014-10-28  3:27 UTC
  To: linux-kernel, Netfilter Developer Mailing List,
	'Pablo Neira Ayuso', 'Patrick McHardy', kadlec,
	davem
  Cc: 'Changli Gao', 'Jozsef Kadlecsik',
	'Jesper Dangaard Brouer', 'Andrey Vagin'

Hi, all:
	In __nf_conntrack_confirm, we check whether the conntrack is
already dead before inserting it into the hash table.
	We do this because inserting an already 'dead' entry into the hash
would block further use of that particular connection.
	But we don't do that right.
	Let's consider the following case:
	
	cpu1
cpu2
	__nf_conntrack_confirm
get_next_corpse
   		lock corresponding hash-list
....
		check nf_ct_is_dying(ct)
for_each_possible_cpu(cpu) {
		......
spin_lock_bh(&pcpu->lock);
		......
set_bit(IPS_DYING_BIT, &ct->status);
		nf_ct_del_from_dying_or_unconfirmed_list(ct);
spin_unlock_bh(&pcpu->lock);
		add_timer(&ct->timeout);
}	
		ct->status |= IPS_CONFIRMED;
		__nf_conntrack_hash_insert(ct);


	
	The above case reveals two problems:
	1. We may insert a dead conntrack into the hash table, which will
block further use of that particular connection.
	2. Operations on ct->status should be atomic, because they race
against get_next_corpse.
	  For the same reason, the operation on ct->status in
nf_nat_setup_info should be atomic as well.

	To resolve the first problem, we must first delete the unconfirmed
conntrack from the unconfirmed list, and only then check whether it is
already dead.
	Am I right to do this?
	I would appreciate any comments and replies.


