netdev.vger.kernel.org archive mirror
* [PATCH net] tcp/dccp: block bh before arming time_wait timer
@ 2017-12-01 18:06 Eric Dumazet
  2017-12-01 18:48 ` Maciej Żenczykowski
  2017-12-01 20:12 ` David Miller
  0 siblings, 2 replies; 4+ messages in thread
From: Eric Dumazet @ 2017-12-01 18:06 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Maciej Żenczykowski

From: Eric Dumazet <edumazet@google.com>

Maciej Żenczykowski reported some panics in tcp_twsk_destructor()
that might be caused by the following bug.

The timewait timer is pinned to the CPU because we want to transition
the timewait refcount from 0 to 4 in one go, once everything has been
initialized.

At the time commit ed2e92394589 ("tcp/dccp: fix timewait races in timer
handling") was merged, TCP was always running from the BH handler.

After commit 5413d1babe8f ("net: do not block BH while processing
socket backlog") we definitely can run tcp_time_wait() from process
context.

We need to block BH in the critical section so that the pinned timer
still serves its purpose.

This bug is more likely to happen under stress and when very small RTOs
are used in datacenter flows.

Fixes: 5413d1babe8f ("net: do not block BH while processing socket backlog")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Maciej Żenczykowski <maze@google.com>
---
 net/dccp/minisocks.c     |    6 ++++++
 net/ipv4/tcp_minisocks.c |    6 ++++++
 2 files changed, 12 insertions(+)


diff --git a/net/dccp/minisocks.c b/net/dccp/minisocks.c
index abd07a443219853b022bef41cb072e90ff8f07f0..178bb9833311f83205317b07fe64cb2e45a9f734 100644
--- a/net/dccp/minisocks.c
+++ b/net/dccp/minisocks.c
@@ -57,10 +57,16 @@ void dccp_time_wait(struct sock *sk, int state, int timeo)
 		if (state == DCCP_TIME_WAIT)
 			timeo = DCCP_TIMEWAIT_LEN;
 
+		/* tw_timer is pinned, so we need to make sure BH are disabled
+		 * in following section, otherwise timer handler could run before
+		 * we complete the initialization.
+		 */
+		local_bh_disable();
 		inet_twsk_schedule(tw, timeo);
 		/* Linkage updates. */
 		__inet_twsk_hashdance(tw, sk, &dccp_hashinfo);
 		inet_twsk_put(tw);
+		local_bh_enable();
 	} else {
 		/* Sorry, if we're out of memory, just CLOSE this
 		 * socket up.  We've got bigger problems than
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index e36eff0403f4e80c4f7291a70614f40125652133..b079b619b60ca577d5ef20a5065fce87acecd96c 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -310,10 +310,16 @@ void tcp_time_wait(struct sock *sk, int state, int timeo)
 		if (state == TCP_TIME_WAIT)
 			timeo = TCP_TIMEWAIT_LEN;
 
+		/* tw_timer is pinned, so we need to make sure BH are disabled
+		 * in following section, otherwise timer handler could run before
+		 * we complete the initialization.
+		 */
+		local_bh_disable();
 		inet_twsk_schedule(tw, timeo);
 		/* Linkage updates. */
 		__inet_twsk_hashdance(tw, sk, &tcp_hashinfo);
 		inet_twsk_put(tw);
+		local_bh_enable();
 	} else {
 		/* Sorry, if we're out of memory, just CLOSE this
 		 * socket up.  We've got bigger problems than
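
The ordering problem described in the commit message can be sketched in
userspace. This is a toy single-CPU model, not kernel code: every toy_*
name is hypothetical, and the timer that expires as soon as it is armed
is an assumption standing in for a very small timeout. It only shows why
a timer pinned to the local CPU cannot run its handler between arming
and the refcount transition while BH is blocked.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one CPU: a BH-disable depth and one pinned timer. */
struct toy_cpu {
	int  bh_disable_depth;     /* > 0 means softirqs are deferred */
	bool timer_pending;        /* timer armed and already expired */
	int  refcnt_seen_by_timer; /* -1 until the handler has run */
};

/* Toy timewait object: refcount stays 0 until fully initialized. */
struct toy_tw {
	int refcnt;
};

/* Run the expired timer only if BH is not blocked on this CPU. */
static void toy_run_softirq(struct toy_cpu *cpu, const struct toy_tw *tw)
{
	if (cpu->bh_disable_depth == 0 && cpu->timer_pending) {
		cpu->timer_pending = false;
		cpu->refcnt_seen_by_timer = tw->refcnt;
	}
}

static void toy_bh_disable(struct toy_cpu *cpu)
{
	cpu->bh_disable_depth++;
}

static void toy_bh_enable(struct toy_cpu *cpu, const struct toy_tw *tw)
{
	assert(cpu->bh_disable_depth > 0);
	cpu->bh_disable_depth--;
	toy_run_softirq(cpu, tw); /* deferred work runs once BH is enabled */
}

/* Assumption: the timeout is so small that the timer expires at once. */
static void toy_arm_timer(struct toy_cpu *cpu, const struct toy_tw *tw)
{
	cpu->timer_pending = true;
	toy_run_softirq(cpu, tw);
}

/* Returns the refcount the timer handler observed when it ran. */
int toy_time_wait(bool block_bh)
{
	struct toy_cpu cpu = { 0, false, -1 };
	struct toy_tw tw = { 0 };

	if (block_bh)
		toy_bh_disable(&cpu);

	toy_arm_timer(&cpu, &tw); /* stands in for inet_twsk_schedule() */
	tw.refcnt = 4;            /* 0 -> 4 in one go, as in the hashdance */
	tw.refcnt--;              /* stands in for inet_twsk_put() */

	if (block_bh)
		toy_bh_enable(&cpu, &tw);

	return cpu.refcnt_seen_by_timer;
}
```

In this model, toy_time_wait(false) lets the handler run while the
refcount is still 0, whereas toy_time_wait(true) defers it until the
object is fully set up. The real kernel paths involve locking and RCU
lookups that the sketch leaves out.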


* Re: [PATCH net] tcp/dccp: block bh before arming time_wait timer
  2017-12-01 18:06 [PATCH net] tcp/dccp: block bh before arming time_wait timer Eric Dumazet
@ 2017-12-01 18:48 ` Maciej Żenczykowski
  2017-12-01 20:12 ` David Miller
  1 sibling, 0 replies; 4+ messages in thread
From: Maciej Żenczykowski @ 2017-12-01 18:48 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev

Acked-by: Maciej Żenczykowski <maze@google.com>


* Re: [PATCH net] tcp/dccp: block bh before arming time_wait timer
  2017-12-01 18:06 [PATCH net] tcp/dccp: block bh before arming time_wait timer Eric Dumazet
  2017-12-01 18:48 ` Maciej Żenczykowski
@ 2017-12-01 20:12 ` David Miller
  2017-12-01 20:51   ` Eric Dumazet
  1 sibling, 1 reply; 4+ messages in thread
From: David Miller @ 2017-12-01 20:12 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev, maze

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 01 Dec 2017 10:06:56 -0800

> From: Eric Dumazet <edumazet@google.com>
> 
> Maciej Żenczykowski reported some panics in tcp_twsk_destructor()
> that might be caused by the following bug.
> 
> The timewait timer is pinned to the CPU because we want to transition
> the timewait refcount from 0 to 4 in one go, once everything has been
> initialized.
> 
> At the time commit ed2e92394589 ("tcp/dccp: fix timewait races in timer
> handling") was merged, TCP was always running from the BH handler.
> 
> After commit 5413d1babe8f ("net: do not block BH while processing
> socket backlog") we definitely can run tcp_time_wait() from process
> context.
> 
> We need to block BH in the critical section so that the pinned timer
> still serves its purpose.
> 
> This bug is more likely to happen under stress and when very small RTOs
> are used in datacenter flows.
> 
> Fixes: 5413d1babe8f ("net: do not block BH while processing socket backlog")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Maciej Żenczykowski <maze@google.com>

Applied and queued up for -stable, thanks Eric.


* Re: [PATCH net] tcp/dccp: block bh before arming time_wait timer
  2017-12-01 20:12 ` David Miller
@ 2017-12-01 20:51   ` Eric Dumazet
  0 siblings, 0 replies; 4+ messages in thread
From: Eric Dumazet @ 2017-12-01 20:51 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, maze

On Fri, 2017-12-01 at 15:12 -0500, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Fri, 01 Dec 2017 10:06:56 -0800
> 
> > From: Eric Dumazet <edumazet@google.com>
> > 
> > Maciej Żenczykowski reported some panics in tcp_twsk_destructor()
> > that might be caused by the following bug.
> > 
> > The timewait timer is pinned to the CPU because we want to transition
> > the timewait refcount from 0 to 4 in one go, once everything has been
> > initialized.
> > 
> > At the time commit ed2e92394589 ("tcp/dccp: fix timewait races in timer
> > handling") was merged, TCP was always running from the BH handler.
> > 
> > After commit 5413d1babe8f ("net: do not block BH while processing
> > socket backlog") we definitely can run tcp_time_wait() from process
> > context.
> > 
> > We need to block BH in the critical section so that the pinned timer
> > still serves its purpose.
> > 
> > This bug is more likely to happen under stress and when very small RTOs
> > are used in datacenter flows.
> > 
> > Fixes: 5413d1babe8f ("net: do not block BH while processing socket backlog")
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > Reported-by: Maciej Żenczykowski <maze@google.com>
> 
> Applied and queued up for -stable, thanks Eric.

It just occurred to me that we can now revert commit 614bdd4d6e61d26
("tcp: must block bh in __inet_twsk_hashdance()").

