From: John Heffner <jheffner@psc.edu>
To: David Miller <davem@davemloft.net>
Cc: ilpo.jarvinen@helsinki.fi, lachlan.andrew@gmail.com,
netdev@vger.kernel.org, quetchen@caltech.edu
Subject: Re: SACK scoreboard
Date: Tue, 08 Jan 2008 11:51:53 -0500
Message-ID: <4783AA29.3080406@psc.edu>
In-Reply-To: <20080107.233617.203640686.davem@davemloft.net>
David Miller wrote:
> Ilpo, just trying to keep an old conversation from dying off.
>
> Did you happen to read a recent blog posting of mine?
>
> http://vger.kernel.org/~davem/cgi-bin/blog.cgi/2007/12/31#tcp_overhead
>
> I've been thinking more and more and I think we might be able
> to get away with enforcing that SACKs are always increasing in
> coverage.
>
> I doubt there are any real systems out there that drop out of order
> packets that are properly formed and are in window, even though the
> SACK specification (foolishly, in my opinion) allows this.
>
> If we could free packets as SACK blocks cover them, all the problems
> go away.
>
> For one thing, this will allow the retransmit queue liberation during
> loss recovery to be spread out over the event, instead of batched up
> like crazy to the point where the cumulative ACK finally moves and
> releases an entire window's worth of data.
>
> Next, it would simplify all of this scanning code trying to figure out
> which holes to fill during recovery.
>
> And for SACK scoreboard marking, the RB trie would become very nearly
> unnecessary as far as I can tell.
>
> I would not even entertain this kind of crazy idea unless I thought
> the fundamental complexity simplification payback was enormous. And
> in this case I think it is.
>
> What we could do is put some experimental hack in there for developers
> to start playing with, which would enforce that SACKs always increase
> in coverage. If violated, the connection is reset and a verbose log
> message is logged so we can analyze any cases that occur.
>
> Sounds crazy, but maybe has potential. What do you think?
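
To make the quoted proposal concrete, here is a minimal sketch of one way a sender might notice that SACK coverage has shrunk. The quoted text does not say how the check would be done, so everything here is an assumption: the name sack_violation, the representation of the scoreboard as a sorted list of disjoint half-open (start, end) ranges, and the heuristic itself (the cumulative ACK coming to rest inside data that was previously reported as SACKed). Sequence-number wraparound is ignored.

```python
from bisect import bisect_right


def sack_violation(sacked, snd_una):
    """Return True if snd_una lies inside a range previously SACKed.

    sacked  -- sorted list of disjoint half-open (start, end) ranges
               (hypothetical scoreboard representation)
    snd_una -- lowest unacknowledged sequence number after an ACK

    If the receiver still held the SACKed data, the cumulative ACK
    would have jumped past the end of the range instead of stopping
    inside it. Wraparound is ignored in this sketch.
    """
    starts = [start for start, _ in sacked]
    i = bisect_right(starts, snd_una) - 1  # last range starting <= snd_una
    return i >= 0 and sacked[i][0] <= snd_una < sacked[i][1]


if __name__ == "__main__":
    board = [(2000, 3000), (5000, 6000)]
    print(sack_violation(board, 1000))  # hole before the first range
    print(sack_violation(board, 2000))  # ACK stopped at SACKed data
```

What to do on a violation (reset, log, or fall back) is left to the caller.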
Linux has a code path where this can happen under memory over-commit, in
tcp_prune_queue(). Also, I think one of the motivations for making SACK
strictly advisory is that there was some concern about buggy SACK
implementations. Keeping data in your retransmit queue allows you to
fall back to timeout and go-back-n if things completely fall apart. For
better or worse, we have to deal with the spec the way it is.

Even if you made this assumption of "hard" SACKs, you still have to
worry about large ACKs if SACK is disabled, though I guess you could say
people running with large windows without SACK deserve what they get. :)

I haven't thought about this too hard, but can we approximate this by
moving SACKed data into a sacked queue, then if something bad happens
merge this back into the retransmit queue? The code will have to deal
with non-contiguous data in the retransmit queue; I'm not sure offhand
if that violates any assumptions. You still have a single expensive ACK
at the end of recovery, though I wonder how much this really hurts. If
you want to ameliorate this, you could save this sacked queue to be
batch processed later, in application context for instance.
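
A rough sketch of the sacked-queue idea, to show the moving parts. This is an illustration under simplifying assumptions, not kernel code: the names Segment and SendQueues and their methods are hypothetical, segments are only moved when a SACK block covers them completely, and sequence-number wraparound is ignored.

```python
from dataclasses import dataclass


@dataclass(order=True)
class Segment:
    seq: int  # first sequence number in the segment
    end: int  # one past the last sequence number


class SendQueues:
    """Retransmit queue plus a side queue holding SACKed segments."""

    def __init__(self, segments):
        self.rtx = sorted(segments)  # may become non-contiguous
        self.sacked = []             # set aside, not yet freed

    def on_sack(self, start, end):
        """Move segments fully covered by [start, end) to the sacked queue."""
        keep = []
        for seg in self.rtx:
            if start <= seg.seq and seg.end <= end:
                self.sacked.append(seg)
            else:
                keep.append(seg)
        self.rtx = keep

    def on_renege(self):
        """Something bad happened: merge SACKed data back for retransmit."""
        self.rtx = sorted(self.rtx + self.sacked)
        self.sacked = []

    def on_cumulative_ack(self, ack):
        """Free everything below ack from both queues; return count freed."""
        before = len(self.rtx) + len(self.sacked)
        self.rtx = [s for s in self.rtx if s.end > ack]
        self.sacked = [s for s in self.sacked if s.end > ack]
        return before - len(self.rtx) - len(self.sacked)


if __name__ == "__main__":
    q = SendQueues([Segment(i, i + 1000) for i in range(0, 4000, 1000)])
    q.on_sack(1000, 3000)
    print([s.seq for s in q.rtx], [s.seq for s in q.sacked])
```

In this sketch the hole-scanning code only ever walks rtx, which is the part that gets cheaper. The single large release in on_cumulative_ack is still there; the sacked list is the piece that could be handed off for batch processing later.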
-John