From: Eric Dumazet <eric.dumazet@gmail.com>
To: kapil dakhane <kdakhane@gmail.com>
Cc: David Miller <davem@davemloft.net>,
netdev@vger.kernel.org, netfilter@vger.kernel.org,
Evgeniy Polyakov <zbr@ioremap.net>
Subject: [PATCH] tcp: fix a timewait refcnt race
Date: Thu, 03 Dec 2009 11:49:01 +0100
Message-ID: <4B17979D.4040301@gmail.com>
In-Reply-To: <99d458640912021843y21ad07a4j724003328da07e9@mail.gmail.com>
kapil dakhane wrote:
> Either there are more places for race condition, or the fix didn't
> address the issue effectively.
Thanks a lot for all these details! They are definitely very useful for
localizing problems.

I believe I found another timewait problem. I am not sure it is
what makes your test fail, but we are making progress :)

I cooked a patch against the latest net-next-2.6 plus my previous patch
(2nd take of [PATCH net-next-2.6] tcp: connect() race with timewait reuse).

[PATCH net-next-2.6] tcp: fix a timewait refcnt race

After the TCP RCU conversion, tw->tw_refcnt should not be set to 1 in
inet_twsk_alloc(). Doing so allows an RCU reader to get this timewait
socket while we have not yet stabilized it.

The only choice we have is to set tw_refcnt to 0 in inet_twsk_alloc(),
then atomic_add() to it later, once everything is done.

The location of this atomic_add() is tricky, because we don't want another
writer to find this timewait in ehash while tw_refcnt is still zero!

Thanks to Kapil Dakhane for his tests and reports.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
net/ipv4/inet_timewait_sock.c | 19 ++++++++++++++++---
1 file changed, 16 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/inet_timewait_sock.c b/net/ipv4/inet_timewait_sock.c
index 11380e6..91680ec 100644
--- a/net/ipv4/inet_timewait_sock.c
+++ b/net/ipv4/inet_timewait_sock.c
@@ -109,7 +109,6 @@ void __inet_twsk_hashdance(struct inet_timewait_sock *tw, struct sock *sk,
 	tw->tw_tb = icsk->icsk_bind_hash;
 	WARN_ON(!icsk->icsk_bind_hash);
 	inet_twsk_add_bind_node(tw, &tw->tw_tb->owners);
-	atomic_inc(&tw->tw_refcnt);
 	spin_unlock(&bhead->lock);
 
 	spin_lock(lock);
@@ -119,13 +118,22 @@ void __inet_twsk_hashdance(struct inet_timewait_sock *tw, struct sock *sk,
 	 * Should be done before removing sk from established chain
 	 * because readers are lockless and search established first.
 	 */
-	atomic_inc(&tw->tw_refcnt);
 	inet_twsk_add_node_rcu(tw, &ehead->twchain);
 
 	/* Step 3: Remove SK from established hash. */
 	if (__sk_nulls_del_node_init_rcu(sk))
 		sock_prot_inuse_add(sock_net(sk), sk->sk_prot, -1);
 
+	/*
+	 * Notes :
+	 * - We initially set tw_refcnt to 0 in inet_twsk_alloc()
+	 * - We add one reference for the bhash link
+	 * - We add one reference for the ehash link
+	 * - We want this refcnt update done before allowing other
+	 *   threads to find this tw in ehash chain.
+	 */
+	atomic_add(1 + 1 + 1, &tw->tw_refcnt);
+
 	spin_unlock(lock);
 }
 
@@ -157,7 +165,12 @@ struct inet_timewait_sock *inet_twsk_alloc(const struct sock *sk, const int stat
 		tw->tw_transparent	= inet->transparent;
 		tw->tw_prot		= sk->sk_prot_creator;
 		twsk_net_set(tw, hold_net(sock_net(sk)));
-		atomic_set(&tw->tw_refcnt, 1);
+		/*
+		 * Because we use RCU lookups, we should not set tw_refcnt
+		 * to a non null value before everything is setup for this
+		 * timewait socket.
+		 */
+		atomic_set(&tw->tw_refcnt, 0);
 		inet_twsk_dead_node_init(tw);
 		__module_get(tw->tw_prot->owner);
 	}
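
For readers following the thread, the idea behind the patch can be sketched
in user-space C11. This is only an illustration, not kernel code: the
tw_sketch type and its helpers are hypothetical names, and the reader side
assumes the usual "take a reference only if the count is non-zero" lookup
pattern. The third reference in the final add is presumably the one that
inet_twsk_alloc() used to hand to the caller via atomic_set(..., 1).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct inet_timewait_sock. */
struct tw_sketch {
	atomic_int refcnt;
};

/* Mirrors the patched inet_twsk_alloc(): start with no references. */
static void tw_sketch_init(struct tw_sketch *tw)
{
	atomic_init(&tw->refcnt, 0);
}

/*
 * Lockless reader side (assumed pattern): take a reference only if the
 * count is already non-zero, so a half-built object is never acquired.
 */
static bool tw_sketch_tryget(struct tw_sketch *tw)
{
	int old = atomic_load(&tw->refcnt);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&tw->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/*
 * Writer side: once the object is fully set up and linked, publish all
 * references in one step (bhash link + ehash link + assumed caller ref).
 */
static void tw_sketch_publish(struct tw_sketch *tw)
{
	atomic_fetch_add(&tw->refcnt, 1 + 1 + 1);
}
```

With a count of zero, tw_sketch_tryget() fails and the reader simply skips
the object; after tw_sketch_publish() it succeeds. In the kernel patch the
publish step additionally runs under the ehash lock, so other writers
cannot observe the zero count either.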