netdev.vger.kernel.org archive mirror
From: Ingo Molnar <mingo@elte.hu>
To: David Miller <davem@davemloft.net>
Cc: kuznet@ms2.inr.ac.ru, vgusev@openvz.org, mcmanus@ducksong.com,
	xemul@openvz.org, netdev@vger.kernel.org,
	ilpo.jarvinen@helsinki.fi, linux-kernel@vger.kernel.org
Subject: Re: [TCP]: TCP_DEFER_ACCEPT causes leak sockets
Date: Fri, 13 Jun 2008 08:30:37 +0200
Message-ID: <20080613063037.GA16943@elte.hu>
In-Reply-To: <20080612.163212.148965080.davem@davemloft.net>


* David Miller <davem@davemloft.net> wrote:

> From: David Miller <davem@davemloft.net>
> Date: Wed, 11 Jun 2008 16:52:55 -0700 (PDT)
> 
> > More and more, the arguments are mounting to completely revert the 
> > established code path changes, and frankly that is likely what I am 
> > going to do by the end of today.
> 
> Here is the revert patch I intend to send to Linus:
> 
> tcp: Revert 'process defer accept as established' changes.
> 
> This reverts two changesets, ec3c0982a2dd1e671bad8e9d26c28dcba0039d87
> ("[TCP]: TCP_DEFER_ACCEPT updates - process as established") and
> the follow-on bug fix 9ae27e0adbf471c7a6b80102e38e1d5a346b3b38
> ("tcp: Fix slab corruption with ipv6 and tcp6fuzz").
> 
> This change causes several problems, first reported by Ingo Molnar
> as a distcc-over-loopback regression where connections were getting
> stuck.
> 
> Ilpo Järvinen first spotted the locking problems.  The new function
> added by this code, tcp_defer_accept_check(), only has the
> child socket locked, yet it is modifying state of the parent
> listening socket.
> 
> Fixing that is non-trivial at best, because we can't simply just grab
> the parent listening socket lock at this point, because it would
> create an ABBA deadlock.  The normal ordering is parent listening
> socket --> child socket, but this code path would require the
> reverse lock ordering.
> 
> Next is a problem noticed by Vitaliy Gusev, who noted:
> 
> ----------------------------------------
> >--- a/net/ipv4/tcp_timer.c
> >+++ b/net/ipv4/tcp_timer.c
> >@@ -481,6 +481,11 @@ static void tcp_keepalive_timer (unsigned long data)
> > 		goto death;
> > 	}
> >
> >+	if (tp->defer_tcp_accept.request && sk->sk_state == TCP_ESTABLISHED) {
> >+		tcp_send_active_reset(sk, GFP_ATOMIC);
> >+		goto death;
> 
> Here socket sk is not attached to listening socket's request queue. tcp_done()
> will not call inet_csk_destroy_sock() (and tcp_v4_destroy_sock() which should
> release this sk) as socket is not DEAD. Therefore socket sk will be lost for
> freeing.
> ----------------------------------------
> 
> Finally, Alexey Kuznetsov argues that there might not even be any
> real value or advantage to these new semantics even if we fix all
> of the bugs:
> 
> ----------------------------------------
> Hiding sockets that hold only out-of-order data from accept()
> is the only thing which is impossible with the old approach. Is this really
> so valuable? My opinion: no, this is nothing but a new loophole
> to consume memory without control.
> ----------------------------------------
> 
> So revert this thing for now.
> 
> Signed-off-by: David S. Miller <davem@davemloft.net>

the 3 reverts have been extensively tested in -tip via:

 # tip/out-of-tree: 9e5b6ca: tcp: revert DEFER_ACCEPT modifications

and the distcc problems are fixed. (The locking fix alone did not fix it 
conclusively in my testing, possibly due to the follow-on observations 
outlined in your description.)

Tested-by: Ingo Molnar <mingo@elte.hu>

	Ingo
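
For readers unfamiliar with the ABBA ordering problem described in the
quoted changelog, here is a minimal userspace sketch using POSIX mutexes.
It is not kernel code and not the fix that was considered: struct
demo_sock and all function names are hypothetical, and the kernel's
socket locks behave differently from pthread mutexes. It only shows why
a path that already holds the child lock cannot simply block on the
parent lock, and what the usual workaround costs.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a socket: a lock plus the state it protects. */
struct demo_sock {
	pthread_mutex_t lock;
	int pending;
};

/* The normal ordering from the changelog: parent first, then child. */
static void lock_parent_then_child(struct demo_sock *parent,
				   struct demo_sock *child)
{
	pthread_mutex_lock(&parent->lock);
	pthread_mutex_lock(&child->lock);
}

/* Release in the reverse order of acquisition. */
static void unlock_both(struct demo_sock *parent, struct demo_sock *child)
{
	int rc;

	rc = pthread_mutex_unlock(&child->lock);
	assert(rc == 0);
	rc = pthread_mutex_unlock(&parent->lock);
	assert(rc == 0);
	(void)rc;
}

/*
 * Called with child->lock already held (the problematic path).
 * Blocking on parent->lock here would be child --> parent, the reverse
 * of the normal order, so two threads could deadlock (ABBA).
 * Instead: try the parent lock without blocking; if that fails, drop
 * the child lock and reacquire both in the normal order.
 *
 * Returns true if the child lock was dropped, in which case the caller
 * must revalidate any child state it looked at earlier.
 */
static bool child_path_lock_parent(struct demo_sock *parent,
				   struct demo_sock *child)
{
	if (pthread_mutex_trylock(&parent->lock) == 0)
		return false;

	pthread_mutex_unlock(&child->lock);
	lock_parent_then_child(parent, child);
	return true;
}
```

On return from child_path_lock_parent() both locks are held. The
"true" return value is the expensive part: once the child lock has been
dropped, anything the caller learned about the child may be stale, which
is one reason retrofitting this kind of scheme onto an existing code
path tends to be non-trivial.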
