From: Patrick McHardy <kaber@trash.net>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>, David Miller <davem@davemloft.net>,
Thomas Gleixner <tglx@linutronix.de>,
torvalds@linux-foundation.org, akpm@linux-foundation.org,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [bug] __nf_ct_refresh_acct(): WARNING: at lib/list_debug.c:30 __list_add+0x7d/0xad()
Date: Wed, 17 Jun 2009 13:51:45 +0200
Message-ID: <4A38D8D1.6060004@trash.net>
In-Reply-To: <4A38D5BD.2040502@trash.net>
[-- Attachment #1: Type: text/plain, Size: 846 bytes --]
Patrick McHardy wrote:
> Eric Dumazet wrote:
>> IPS_CONFIRMED_BIT is set under nf_conntrack_lock (in
>> __nf_conntrack_confirm()),
>> we probably want to add a synchronisation under ct->lock as well,
>> or __nf_ct_refresh_acct() could set ct->timeout.expires to extra_jiffies,
>> while a different cpu could confirm the conntrack.
>
> Before the conntrack is confirmed, it is exclusively handled by a
> single CPU. I agree that we need to make sure the IPS_CONFIRMED_BIT
> is visible before we add the conntrack to the hash table since the
> lookup is lockless, but simply moving the set_bit before the hash
> insertion should be fine I think.
>
A slightly changed version which moves hash insertion to the end
and adds a comment about ordering. This makes sure the timer is
actually running before the conntrack can be found by other CPUs.
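
For readers less familiar with the pattern, the ordering can be sketched
in userspace C11. This is only an illustration and not the kernel code:
struct entry, slot, confirm_and_publish() and lookup() are hypothetical
names, and a release/acquire pair on an atomic pointer stands in for the
barriers provided by the RCU hash insertion and lookup.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct nf_conn; not the kernel structure. */
struct entry {
	bool timer_running;	/* stands in for add_timer(&ct->timeout) */
	bool confirmed;		/* stands in for IPS_CONFIRMED_BIT */
};

/* Single-slot "hash table" that readers scan without taking a lock. */
static _Atomic(struct entry *) slot;

/* Writer: finish every store to the entry, then publish the pointer.
 * The release store orders the earlier plain stores before the pointer
 * becomes visible to readers. */
static void confirm_and_publish(struct entry *e)
{
	e->timer_running = true;
	e->confirmed = true;
	atomic_store_explicit(&slot, e, memory_order_release);
}

/* Lockless reader: the acquire load pairs with the release store, so a
 * non-NULL result implies the entry was fully set up before publication. */
static struct entry *lookup(void)
{
	struct entry *e = atomic_load_explicit(&slot, memory_order_acquire);

	if (e)
		assert(e->timer_running && e->confirmed);
	return e;
}
```

If the pointer were published first, as in the old ordering, a reader
could see the entry before the two flags were set, which is the window
the patch closes.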
[-- Attachment #2: x --]
[-- Type: text/plain, Size: 1174 bytes --]
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 5f72b94..e2cc707 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -425,7 +425,6 @@ __nf_conntrack_confirm(struct sk_buff *skb)
 	/* Remove from unconfirmed list */
 	hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
 
-	__nf_conntrack_hash_insert(ct, hash, repl_hash);
 	/* Timer relative to confirmation time, not original
 	   setting time, otherwise we'd get timer wrap in
 	   weird delay cases. */
@@ -433,8 +432,15 @@ __nf_conntrack_confirm(struct sk_buff *skb)
 	add_timer(&ct->timeout);
 	atomic_inc(&ct->ct_general.use);
 	set_bit(IPS_CONFIRMED_BIT, &ct->status);
+
+	/* Since the lookup is lockless, hash insertion must be after starting the
+	 * timer and setting the CONFIRMED bit. The RCU barriers guarantee that no
+	 * other CPU can find the conntrack before the above stores are visible.
+	 */
+	__nf_conntrack_hash_insert(ct, hash, repl_hash);
 	NF_CT_STAT_INC(net, insert);
 	spin_unlock_bh(&nf_conntrack_lock);
+
 	help = nfct_help(ct);
 	if (help && help->helper)
 		nf_conntrack_event_cache(IPCT_HELPER, ct);