Re: [PATCH] nf_conntrack_core.c: fix for dead connection after flushing conntrack cache

netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* Re: [PATCH] nf_conntrack_core.c: fix for dead connection after flushing conntrack cache
       [not found] <4BE3D31F.6000607@secunet.com>
@ 2010-05-09 22:57 ` Pablo Neira Ayuso
  2010-05-10 15:23   ` Patrick McHardy
  0 siblings, 1 reply; 6+ messages in thread
From: Pablo Neira Ayuso @ 2010-05-09 22:57 UTC (permalink / raw)
  To: Joerg Marx
  Cc: Netfilter Development Mailinglist, Patrick McHardy,
	Mail List - Netfilter

Hi Joerg,

I'm cc'ing netfilter-devel instead, that is used for development stuff.

Joerg Marx wrote:
> Because nobody commented this so far on the 'linux-netdev' list, now a
> re-post on the 'netfilter' list with a kind request for confirmation
> and/or discussion...
> 
> ###################################################################
> 
> 
> Hi,
> 
> we encountered a weird problem of a 'stalled' connection when using
> 'conntrack -F' on a box with heavy network load. 'conntrack -L' gave us
> sometimes a [UNREPLIED] entry for the traffic in question, but no traffic
> flow, no matching of packets in or out, only the timer went down from 600
> to 0 (it was ESP traffic - the default generic timeout = 600 seconds).
> After the entry vanished by timeout (or doing a 'conntrack -F' once more),
> all worked normal again.
> 
> The reason we finally found, is a race window in 'nf_conntrack_confirm'
> when calling '__nf_conntrack_confirm':
> 
> In 'nf_conntrack_confirm' is checked (without holding a lock), if the entry
> to be confirmed is possibly dying: !nf_ct_is_dying(ct).
> If not, then __nf_conntrack_confirm will do some sanity checking, grab a
> spin_lock_bh and insert the 'ct' into the lookup cache.
> 
> Now consider the following scenario:
> 
> 1. a connection has seen the first packet already -> state is UNREPLIED
> 2. now the answer is to be sent, conntrack wants to confirm the connection
> 3. the !nf_ct_is_dying(ct) check is passed, __nf_conntrack_confirm is just
> started
> 4. in a user context a 'conntrack -F' command is running right now e.g. on
> another CPU
> 5. this will flag all unconfirmed connections as 'dying' in
> get_next_corpse(...), including the entry going to be confirmed!
> 6. now the already 'dying' entry is included into the hash cache in
> __nf_conntrack_confirm - BOOM!
> 
> After this step the connection in question is dead, because no packets are
> forwarded until the entry is purged from hash cache. This was a big blocker
> for us, because each dead IPsec tunnel is a dead branch network for 10
> minutes...
> 
> For every packet from now on 'nf_conntrack_find_get' will ignore the entry,
> because it is dying and because __nf_conntrack_confirm finds the hash in
> the cache already, it will NF_DROP the packet.
> 
> The key for finding this was 'NF_CT_STAT_INC(net, insert_failed)' in
> __nf_conntrack_confirm.
> 
> The suggested solution is to check for '!nf_ct_is_dying(ct)' again, _after_
> the spin_lock_bh is grabbed in __nf_conntrack_confirm. So it is clear, that
> no other softirq or user context can set that 'evil' dying flag ;-)
> The return value in this case should be NF_ACCEPT, so we loose no packets
> then, this is also important for us.
> 
> 
> ---
>  net/netfilter/nf_conntrack_core.c |    5 +++++
>  1 files changed, 5 insertions(+), 0 deletions(-)
> 
> diff --git a/net/netfilter/nf_conntrack_core.c
> b/net/netfilter/nf_conntrack_core.c
> index 1374179..e2c8bfe 100644
> --- a/net/netfilter/nf_conntrack_core.c
> +++ b/net/netfilter/nf_conntrack_core.c
> @@ -413,6 +413,11 @@ __nf_conntrack_confirm(struct sk_buff *skb)
> 
>  	spin_lock_bh(&nf_conntrack_lock);
> 
> +	if (unlikely(nf_ct_is_dying(ct))) {
> +		spin_unlock_bh(&nf_conntrack_lock);
> +		return NF_ACCEPT;
> +	}
> +
>  	/* See if there's one in the list already, including reverse:
>  	   NAT could have grabbed it without realizing, since we're
>  	   not in the hash.  If there is, we lost race. */
> -- 1.5.6.5

I think this patch is fine. Patrick, would you apply it and pass it to
-stable, please? Thanks!

Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] nf_conntrack_core.c: fix for dead connection after flushing conntrack cache
  2010-05-09 22:57 ` [PATCH] nf_conntrack_core.c: fix for dead connection after flushing conntrack cache Pablo Neira Ayuso
@ 2010-05-10 15:23   ` Patrick McHardy
  2010-05-10 17:08     ` Joerg Marx
  0 siblings, 1 reply; 6+ messages in thread
From: Patrick McHardy @ 2010-05-10 15:23 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: Joerg Marx, Netfilter Development Mailinglist,
	Mail List - Netfilter

Pablo Neira Ayuso wrote:
>> we encountered a weird problem of a 'stalled' connection when using
>> 'conntrack -F' on a box with heavy network load. 'conntrack -L' gave us
>> sometimes a [UNREPLIED] entry for the traffic in question, but no traffic
>> flow, no matching of packets in or out, only the timer went down from 600
>> to 0 (it was ESP traffic - the default generic timeout = 600 seconds).
>> After the entry vanished by timeout (or doing a 'conntrack -F' once more),
>> all worked normal again.
>>
>> The reason we finally found, is a race window in 'nf_conntrack_confirm'
>> when calling '__nf_conntrack_confirm':
>>
>> In 'nf_conntrack_confirm' is checked (without holding a lock), if the entry
>> to be confirmed is possibly dying: !nf_ct_is_dying(ct).
>> If not, then __nf_conntrack_confirm will do some sanity checking, grab a
>> spin_lock_bh and insert the 'ct' into the lookup cache.
>>
>> Now consider the following scenario:
>>
>> 1. a connection has seen the first packet already -> state is UNREPLIED
>> 2. now the answer is to be sent, conntrack wants to confirm the connection
>> 3. the !nf_ct_is_dying(ct) check is passed, __nf_conntrack_confirm is just
>> started
>> 4. in a user context a 'conntrack -F' command is running right now e.g. on
>> another CPU
>> 5. this will flag all unconfirmed connections as 'dying' in
>> get_next_corpse(...), including the entry going to be confirmed!
>> 6. now the already 'dying' entry is included into the hash cache in
>> __nf_conntrack_confirm - BOOM!

Just for the record: the conntrack will be confirmed once the first
packet passed through all the hooks, so the incorrect hash insertion
will happen on the first packet, but the race condition you describe
is correct.

>> After this step the connection in question is dead, because no packets are
>> forwarded until the entry is purged from hash cache. This was a big blocker
>> for us, because each dead IPsec tunnel is a dead branch network for 10
>> minutes...
>>
>> For every packet from now on 'nf_conntrack_find_get' will ignore the entry,
>> because it is dying and because __nf_conntrack_confirm finds the hash in
>> the cache already, it will NF_DROP the packet.
>>
>> The key for finding this was 'NF_CT_STAT_INC(net, insert_failed)' in
>> __nf_conntrack_confirm.
>>
>> The suggested solution is to check for '!nf_ct_is_dying(ct)' again, _after_
>> the spin_lock_bh is grabbed in __nf_conntrack_confirm. So it is clear, that
>> no other softirq or user context can set that 'evil' dying flag ;-)
>> The return value in this case should be NF_ACCEPT, so we loose no packets
>> then, this is also important for us.

I think this should be fine since the race you describe only affects
unconfirmed conntracks, but it took me a while to realize that all
the other spots where the DYING bit is set are fine without holding
the conntrack lock.

Could you please add a comment to the check in __nf_conntrack_confirm()
stating that the dying check is supposed to prevent races against
nf_ct_get_next_corpse()? The semantic of the DYING bit is unfortunately
a bit overloaded.

Also, since the condition unconfirmed + dying in nf_conntrack_confirm()
is highly unlikely, I'd suggest to remove the dying check there and only
perform it in __nf_conntrack_confirm().

>> diff --git a/net/netfilter/nf_conntrack_core.c
>> b/net/netfilter/nf_conntrack_core.c
>> index 1374179..e2c8bfe 100644
>> --- a/net/netfilter/nf_conntrack_core.c
>> +++ b/net/netfilter/nf_conntrack_core.c
>> @@ -413,6 +413,11 @@ __nf_conntrack_confirm(struct sk_buff *skb)
>>
>>  	spin_lock_bh(&nf_conntrack_lock);
>>
>> +	if (unlikely(nf_ct_is_dying(ct))) {
>> +		spin_unlock_bh(&nf_conntrack_lock);
>> +		return NF_ACCEPT;
>> +	}
>> +
>>  	/* See if there's one in the list already, including reverse:
>>  	   NAT could have grabbed it without realizing, since we're
>>  	   not in the hash.  If there is, we lost race. */
>> -- 1.5.6.5
> 
> I think this patch is fine. Patrick, would you apply it and pass it to
> -stable, please? Thanks!
> 
> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] nf_conntrack_core.c: fix for dead connection after flushing conntrack cache
  2010-05-10 15:23   ` Patrick McHardy
@ 2010-05-10 17:08     ` Joerg Marx
  2010-05-14 11:05       ` Patrick McHardy
  0 siblings, 1 reply; 6+ messages in thread
From: Joerg Marx @ 2010-05-10 17:08 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: Pablo Neira Ayuso, Netfilter Development Mailinglist,
	Mail List - Netfilter

On 05/10/2010 05:23 PM, Patrick McHardy wrote:

...
> 
> I think this should be fine since the race you describe only affects
> unconfirmed conntracks, but it took me a while to realize that all
> the other spots where the DYING bit is set are fine without holding
> the conntrack lock.
> 
> Could you please add a comment to the check in __nf_conntrack_confirm()
> stating that the dying check is supposed to prevent races against
> nf_ct_get_next_corpse()? The semantic of the DYING bit is unfortunately
> a bit overloaded.
> 
> Also, since the condition unconfirmed + dying in nf_conntrack_confirm()
> is highly unlikely, I'd suggest to remove the dying check there and only
> perform it in __nf_conntrack_confirm().
> 

I hope the comment is clearly pointing to the (solved) problem now. I
also removed the obsolete check in nf_conntrack_confirm.

diff --git a/include/net/netfilter/nf_conntrack_core.h
b/include/net/netfilter/nf_conntrack_core.h
index dffde8e..3d7524f 100644
--- a/include/net/netfilter/nf_conntrack_core.h
+++ b/include/net/netfilter/nf_conntrack_core.h
@@ -61,7 +61,7 @@ static inline int nf_conntrack_confirm(struct sk_buff
*skb)
 	int ret = NF_ACCEPT;

 	if (ct && ct != &nf_conntrack_untracked) {
-		if (!nf_ct_is_confirmed(ct) && !nf_ct_is_dying(ct))
+		if (!nf_ct_is_confirmed(ct))
 			ret = __nf_conntrack_confirm(skb);
 		if (likely(ret == NF_ACCEPT))
 			nf_ct_deliver_cached_events(ct);
diff --git a/net/netfilter/nf_conntrack_core.c
b/net/netfilter/nf_conntrack_core.c
index 0c9bbe9..7ff9a40 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -422,6 +422,16 @@ __nf_conntrack_confirm(struct sk_buff *skb)

 	spin_lock_bh(&nf_conntrack_lock);

+	/* We have to check the DYING flag inside the lock to prevent
+	   a race against nf_ct_get_next_corpse() possibly called from
+	   user context, else we insert an already 'dead' hash, blocking
+	   further use of that particular connection -JM */
+
+	if (unlikely(nf_ct_is_dying(ct))) {
+		spin_unlock_bh(&nf_conntrack_lock);
+		return NF_ACCEPT;
+	}
+
 	/* See if there's one in the list already, including reverse:
 	   NAT could have grabbed it without realizing, since we're
 	   not in the hash.  If there is, we lost race. */



-- 


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] nf_conntrack_core.c: fix for dead connection after flushing conntrack cache
  2010-05-10 17:08     ` Joerg Marx
@ 2010-05-14 11:05       ` Patrick McHardy
  2010-05-17  8:48         ` Joerg Marx
  0 siblings, 1 reply; 6+ messages in thread
From: Patrick McHardy @ 2010-05-14 11:05 UTC (permalink / raw)
  To: Joerg Marx
  Cc: Pablo Neira Ayuso, Netfilter Development Mailinglist,
	Mail List - Netfilter

Joerg Marx wrote:
> On 05/10/2010 05:23 PM, Patrick McHardy wrote:
> 
> ...
>> I think this should be fine since the race you describe only affects
>> unconfirmed conntracks, but it took me a while to realize that all
>> the other spots where the DYING bit is set are fine without holding
>> the conntrack lock.
>>
>> Could you please add a comment to the check in __nf_conntrack_confirm()
>> stating that the dying check is supposed to prevent races against
>> nf_ct_get_next_corpse()? The semantic of the DYING bit is unfortunately
>> a bit overloaded.
>>
>> Also, since the condition unconfirmed + dying in nf_conntrack_confirm()
>> is highly unlikely, I'd suggest to remove the dying check there and only
>> perform it in __nf_conntrack_confirm().
>>
> 
> I hope the comment is clearly pointing to the (solved) problem now. I
> also removed the obsolete check in nf_conntrack_confirm.

Thanks, this looks fine. But I need a formal submission, including
a changelog and Signed-off-by: line. Thanks!

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] nf_conntrack_core.c: fix for dead connection after flushing conntrack cache
  2010-05-14 11:05       ` Patrick McHardy
@ 2010-05-17  8:48         ` Joerg Marx
  2010-05-20 13:56           ` Patrick McHardy
  0 siblings, 1 reply; 6+ messages in thread
From: Joerg Marx @ 2010-05-17  8:48 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: Pablo Neira Ayuso, Netfilter Development Mailinglist,
	Mail List - Netfilter

On 05/14/2010 01:05 PM, Patrick McHardy wrote:

...
>>
>> I hope the comment is clearly pointing to the (solved) problem now. I
>> also removed the obsolete check in nf_conntrack_confirm.
> 
> Thanks, this looks fine. But I need a formal submission, including
> a changelog and Signed-off-by: line. Thanks!

>From 97913752bc65b2aa4a091a73c3569dc8221a1f25 Mon Sep 17 00:00:00 2001
From: Joerg Marx <joerg.marx@secunet.com>
Date: Mon, 10 May 2010 18:39:47 +0200
Subject: [PATCH] Fix a race in __nf_conntrack_confirm against nf_ct_get_next_corpse()

This race was triggered by a 'conntrack -F' command running in parallel
to the insertion of a hash for a new connection.
Losing this race led to a dead conntrack entry effectively blocking
traffic for a particular connection until timeout or flushing the conntrack
hashes again.
Now the check for an already dying connection is done inside the lock.

Signed-off-by: Joerg Marx <joerg.marx@secunet.com>
---
 include/net/netfilter/nf_conntrack_core.h |    2 +-
 net/netfilter/nf_conntrack_core.c         |   10 ++++++++++
 2 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack_core.h b/include/net/netfilter/nf_conntrack_core.h
index dffde8e..3d7524f 100644
--- a/include/net/netfilter/nf_conntrack_core.h
+++ b/include/net/netfilter/nf_conntrack_core.h
@@ -61,7 +61,7 @@ static inline int nf_conntrack_confirm(struct sk_buff *skb)
 	int ret = NF_ACCEPT;
 
 	if (ct && ct != &nf_conntrack_untracked) {
-		if (!nf_ct_is_confirmed(ct) && !nf_ct_is_dying(ct))
+		if (!nf_ct_is_confirmed(ct))
 			ret = __nf_conntrack_confirm(skb);
 		if (likely(ret == NF_ACCEPT))
 			nf_ct_deliver_cached_events(ct);
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 0c9bbe9..7ff9a40 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -422,6 +422,16 @@ __nf_conntrack_confirm(struct sk_buff *skb)
 
 	spin_lock_bh(&nf_conntrack_lock);
 
+	/* We have to check the DYING flag inside the lock to prevent
+	   a race against nf_ct_get_next_corpse() possibly called from
+	   user context, else we insert an already 'dead' hash, blocking
+	   further use of that particular connection -JM */
+
+	if (unlikely(nf_ct_is_dying(ct))) {
+		spin_unlock_bh(&nf_conntrack_lock);
+		return NF_ACCEPT;
+	}
+
 	/* See if there's one in the list already, including reverse:
 	   NAT could have grabbed it without realizing, since we're
 	   not in the hash.  If there is, we lost race. */
-- 
1.5.6.5



-- 

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] nf_conntrack_core.c: fix for dead connection after flushing conntrack cache
  2010-05-17  8:48         ` Joerg Marx
@ 2010-05-20 13:56           ` Patrick McHardy
  0 siblings, 0 replies; 6+ messages in thread
From: Patrick McHardy @ 2010-05-20 13:56 UTC (permalink / raw)
  To: Joerg Marx
  Cc: Pablo Neira Ayuso, Netfilter Development Mailinglist,
	Mail List - Netfilter

Appologies for the delay, I once again had a mail outage :|

Joerg Marx wrote:
> Subject: [PATCH] Fix a race in __nf_conntrack_confirm against nf_ct_get_next_corpse()
> 
> This race was triggered by a 'conntrack -F' command running in parallel
> to the insertion of a hash for a new connection.
> Losing this race led to a dead conntrack entry effectively blocking
> traffic for a particular connection until timeout or flushing the conntrack
> hashes again.
> Now the check for an already dying connection is done inside the lock.

Applied, thanks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2010-05-20 13:56 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <4BE3D31F.6000607@secunet.com>
2010-05-09 22:57 ` [PATCH] nf_conntrack_core.c: fix for dead connection after flushing conntrack cache Pablo Neira Ayuso
2010-05-10 15:23   ` Patrick McHardy
2010-05-10 17:08     ` Joerg Marx
2010-05-14 11:05       ` Patrick McHardy
2010-05-17  8:48         ` Joerg Marx
2010-05-20 13:56           ` Patrick McHardy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).