From: Willy Tarreau <w@1wt.eu>
To: David Miller <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Subject: Re: TCP: orphans broken by RFC 2525 #2.17
Date: Mon, 27 Sep 2010 09:34:43 +0200
Message-ID: <20100927073443.GR12373@1wt.eu>
In-Reply-To: <20100926.234202.241938788.davem@davemloft.net>
On Sun, Sep 26, 2010 at 11:42:02PM -0700, David Miller wrote:
> From: Willy Tarreau <w@1wt.eu>
> Date: Mon, 27 Sep 2010 07:39:01 +0200
>
> > On Sun, Sep 26, 2010 at 06:12:02PM -0700, David Miller wrote:
> >> From: Willy Tarreau <w@1wt.eu>
> >> Date: Mon, 27 Sep 2010 01:25:30 +0200
> >>
> >> > Agreed. But that's not a reason for killing outgoing data that is
> >> > being sent when there are some data left in the rcv buffer.
> >>
> >> What alternative notification to the peer do you suggest other than a
> >> reset, then? TCP gives us no other.
> >
> > I know, and I agree to send the reset, but after the data are correctly
> > transferred. This reset's purpose is only to inform the other side that
> > the data it sent were destroyed. It is not a requirement to tell it they
> > were destroyed earlier or later. What matters is that it's informed they
> > were destroyed.
>
> So you want us to hold on to the full connection state for however
> long it takes to send the pending data
Not for however long it takes: only for as long as we already keep orphans
right now, nothing more, nothing less.
> just because your application
> doesn't want to wait around to sink a pending newline character?
It's not that it *doesn't want* to wait for the pending newline character;
it's that this character has no reason to be there and cannot be predicted,
and even when you find it, nothing tells the application that it's the last
one.
> Is that what this boils down to?
No, it's the opposite in fact. The goal is to ensure we can reliably
release the whole connection ASAP, instead of being forced to sink any
data that may still arrive on it, data that will neither be consumed nor
lead to a reset. Look:

case A (current behaviour):
    We send the response to the client from an orphaned connection.
    Most of the time, the client won't have any issue and will get the
    response. In some rare circumstances, data sent by the client
    after the response causes an RST to be emitted, which may destroy
    in-flight data. Those issues are extremely rare, but they do
    happen.

case B (my proposal, and the behaviour before the RFC 2525 fix):
    We send the response to the client.
    It ACKs it.
    We send an RST. End of the transfer. Total time: 50 ms (avg RTT over ADSL).

case C (alternative):
    We send the response to the client.
    The application can't know that the client has ACKed it, and must
    keep the connection open for however long it takes to get the only
    form of ACK the application can detect: the FIN from the client,
    which takes 6 minutes on my ADSL line for 10 meg.

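
For illustration, case C amounts to a "lingering close" done by the
application itself. What follows is only a minimal sketch, in Python for
brevity: the function name lingering_close, the 5-second default timeout and
the buffer size are hypothetical choices made for this example, not taken
from any real server.

```python
import socket
import time


def lingering_close(sock, timeout=5.0, bufsize=65536):
    """Half-close `sock`, then sink incoming bytes until the peer's FIN.

    Returns the number of bytes that were read and thrown away.
    The timeout is an assumption of this sketch: it bounds how long
    the descriptor is kept when the peer never closes.
    """
    discarded = 0
    deadline = time.monotonic() + timeout
    try:
        # Send our FIN; data already queued for sending still goes out first.
        sock.shutdown(socket.SHUT_WR)
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break                      # give up waiting for the FIN
            sock.settimeout(remaining)
            try:
                chunk = sock.recv(bufsize)
            except socket.timeout:
                break                      # peer stayed silent too long
            if not chunk:
                break                      # empty read: the peer's FIN arrived
            discarded += len(chunk)        # unwanted data, sunk and dropped
    except OSError:
        pass                               # e.g. connection reset by the peer
    finally:
        sock.close()
    return discarded
```

A server would call this instead of a plain close() once the response has
been written. Everything the client keeps sending has to be received and
discarded in the meantime, and if the timeout expires the socket is closed
anyway, which brings back the risk described in case A.
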
In case C, not only does the state remain *a lot* longer, but the bandwidth
usage is much worse, and in the end the client does not even get the reset
that we're trying to ensure it gets to indicate that the data were dropped.
So while case C is a reliable workaround, it's the least efficient method
and the most expensive one in terms of memory, CPU, network bandwidth,
socket usage, file descriptor usage and perceived time.
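
To put rough numbers on that comparison, here is a back-of-the-envelope
calculation. It assumes that "10 meg" means 10^7 bytes uploaded by the
client and that the 6 minutes are spent entirely on that upload; both are
interpretations of the figures above, and the helper name is made up for
the example.

```python
def implied_rate_kbps(size_bytes, seconds):
    """Average rate in kbit/s needed to move size_bytes in the given time."""
    return size_bytes * 8 / seconds / 1000


RTT_S = 0.050                  # case B: one average ADSL round trip (50 ms)
DRAIN_S = 6 * 60               # case C: 6 minutes until the client's FIN
UPLOAD_BYTES = 10 * 10**6      # assumption: "10 meg" read as 10^7 bytes

# Uplink rate implied by draining the upload in 6 minutes.
print(round(implied_rate_kbps(UPLOAD_BYTES, DRAIN_S)))   # 222

# How many times longer the connection state is held in case C.
print(round(DRAIN_S / RTT_S))                            # 7200
```

Under those assumptions the connection stays allocated several thousand
times longer in case C than in case B, and the whole upload crosses the
link only to be discarded.
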
You see, I'm not trying to do dirty, dangerous things to save a few
lines of code. I'm even OK with having a lot of Linux-specific code to make
use of the features the Linux stack provides that make it more efficient
than other implementations. I'm just seeking reliability.
Willy