Re: [PATCH] nf_conntrack_core.c: fix for dead connection after flushing conntrack cache

netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Pablo Neira Ayuso <pablo@netfilter.org>
To: Joerg Marx <joerg.marx@secunet.com>
Cc: Netfilter Development Mailinglist
	<netfilter-devel@vger.kernel.org>,
	Patrick McHardy <kaber@trash.net>,
	Mail List - Netfilter <netfilter@vger.kernel.org>
Subject: Re: [PATCH] nf_conntrack_core.c: fix for dead connection after flushing conntrack cache
Date: Mon, 10 May 2010 00:57:39 +0200	[thread overview]
Message-ID: <4BE73DE3.6080304@netfilter.org> (raw)
In-Reply-To: <4BE3D31F.6000607@secunet.com>

Hi Joerg,

I'm cc'ing netfilter-devel instead, that is used for development stuff.

Joerg Marx wrote:
> Because nobody commented this so far on the 'linux-netdev' list, now a
> re-post on the 'netfilter' list with a kind request for confirmation
> and/or discussion...
> 
> ###################################################################
> 
> 
> Hi,
> 
> we encountered a weird problem of a 'stalled' connection when using
> 'conntrack -F' on a box with heavy network load. 'conntrack -L' gave us
> sometimes a [UNREPLIED] entry for the traffic in question, but no traffic
> flow, no matching of packets in or out, only the timer went down from 600
> to 0 (it was ESP traffic - the default generic timeout = 600 seconds).
> After the entry vanished by timeout (or doing a 'conntrack -F' once more),
> all worked normal again.
> 
> The reason we finally found, is a race window in 'nf_conntrack_confirm'
> when calling '__nf_conntrack_confirm':
> 
> In 'nf_conntrack_confirm' is checked (without holding a lock), if the entry
> to be confirmed is possibly dying: !nf_ct_is_dying(ct).
> If not, then __nf_conntrack_confirm will do some sanity checking, grab a
> spin_lock_bh and insert the 'ct' into the lookup cache.
> 
> Now consider the following scenario:
> 
> 1. a connection has seen the first packet already -> state is UNREPLIED
> 2. now the answer is to be sent, conntrack wants to confirm the connection
> 3. the !nf_ct_is_dying(ct) check is passed, __nf_conntrack_confirm is just
> started
> 4. in a user context a 'conntrack -F' command is running right now e.g. on
> another CPU
> 5. this will flag all unconfirmed connections as 'dying' in
> get_next_corpse(...), including the entry going to be confirmed!
> 6. now the already 'dying' entry is included into the hash cache in
> __nf_conntrack_confirm - BOOM!
> 
> After this step the connection in question is dead, because no packets are
> forwarded until the entry is purged from hash cache. This was a big blocker
> for us, because each dead IPsec tunnel is a dead branch network for 10
> minutes...
> 
> For every packet from now on 'nf_conntrack_find_get' will ignore the entry,
> because it is dying and because __nf_conntrack_confirm finds the hash in
> the cache already, it will NF_DROP the packet.
> 
> The key for finding this was 'NF_CT_STAT_INC(net, insert_failed)' in
> __nf_conntrack_confirm.
> 
> The suggested solution is to check for '!nf_ct_is_dying(ct)' again, _after_
> the spin_lock_bh is grabbed in __nf_conntrack_confirm. So it is clear, that
> no other softirq or user context can set that 'evil' dying flag ;-)
> The return value in this case should be NF_ACCEPT, so we loose no packets
> then, this is also important for us.
> 
> 
> ---
>  net/netfilter/nf_conntrack_core.c |    5 +++++
>  1 files changed, 5 insertions(+), 0 deletions(-)
> 
> diff --git a/net/netfilter/nf_conntrack_core.c
> b/net/netfilter/nf_conntrack_core.c
> index 1374179..e2c8bfe 100644
> --- a/net/netfilter/nf_conntrack_core.c
> +++ b/net/netfilter/nf_conntrack_core.c
> @@ -413,6 +413,11 @@ __nf_conntrack_confirm(struct sk_buff *skb)
> 
>  	spin_lock_bh(&nf_conntrack_lock);
> 
> +	if (unlikely(nf_ct_is_dying(ct))) {
> +		spin_unlock_bh(&nf_conntrack_lock);
> +		return NF_ACCEPT;
> +	}
> +
>  	/* See if there's one in the list already, including reverse:
>  	   NAT could have grabbed it without realizing, since we're
>  	   not in the hash.  If there is, we lost race. */
> -- 1.5.6.5

I think this patch is fine. Patrick, would you apply it and pass it to
-stable, please? Thanks!

Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>

next      parent reply	other threads:[~2010-05-09 22:57 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <4BE3D31F.6000607@secunet.com>
2010-05-09 22:57 ` Pablo Neira Ayuso [this message]
2010-05-10 15:23   ` [PATCH] nf_conntrack_core.c: fix for dead connection after flushing conntrack cache Patrick McHardy
2010-05-10 17:08     ` Joerg Marx
2010-05-14 11:05       ` Patrick McHardy
2010-05-17  8:48         ` Joerg Marx
2010-05-20 13:56           ` Patrick McHardy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4BE73DE3.6080304@netfilter.org \
    --to=pablo@netfilter.org \
    --cc=joerg.marx@secunet.com \
    --cc=kaber@trash.net \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=netfilter@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).