netdev.vger.kernel.org archive mirror
From: "Marian Ďurkovič" <md@bts.sk>
To: netdev@vger.kernel.org
Subject: Re: TCP rx window autotuning harmful at LAN context
Date: Mon, 9 Mar 2009 21:05:05 +0100
Message-ID: <20090309200505.GA58375@bts.sk>
In-Reply-To: <1e41a3230903091101u536a3b3bv7f0dd9da6891781e@mail.gmail.com>

On Mon, 9 Mar 2009 11:01:52 -0700, John Heffner wrote
> On Mon, Mar 9, 2009 at 4:25 AM, Marian Ďurkovič <md@bts.sk> wrote:
> >   As rx window autotuning is enabled in all recent kernels and with 1 GB
> > of RAM the maximum tcp_rmem becomes 4 MB, this problem is spreading rapidly
> > and we believe it needs urgent attention. As demonstrated above, such a huge
> > rx window (which is at least 100*BDP of the example above) does not deliver
> > any performance gain but instead it seriously harms other hosts and/or
> > applications. It should also be noted that a host with autotuning enabled
> > steals an unfair share of the total available bandwidth, which might look
> > like a "better" performing TCP stack at first sight - however such behaviour
> > is not appropriate (RFC2914, section 3.2).
>
> It's well known that "standard" TCP fills all available drop-tail
> buffers, and that this behavior is not desirable.

Well, in practice that was always limited by the receive window size, which
was 64 kB by default on most operating systems. So this undesirable behavior
was limited to hosts where the receive window had been manually increased to
huge values.

Today, the real effect of autotuning is the same as changing the receive window
size to 4 MB on *all* hosts, since there's no mechanism to prevent it from
growing the window to the maximum even on low-RTT paths.
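
To put rough numbers on this, here is a small sketch. The 100 Mbit/s link and
1 ms unloaded RTT are illustrative assumptions only, not measurements from
this thread, and the function names are made up for the example. It compares
the traditional 64 kB window and the 4 MB autotuning maximum against the BDP,
and estimates the queueing delay a full window could add at the bottleneck.

```python
def bdp_bytes(bandwidth_bps, rtt_s):
    """Bandwidth-delay product in bytes."""
    return bandwidth_bps / 8.0 * rtt_s


def drain_time_s(queued_bytes, bandwidth_bps):
    """Time needed to drain queued_bytes through a link of bandwidth_bps."""
    return queued_bytes * 8.0 / bandwidth_bps


# Assumed example values (hypothetical, not taken from the thread).
LINK_BPS = 100e6      # 100 Mbit/s LAN link
BASE_RTT_S = 0.001    # 1 ms unloaded RTT

OLD_WINDOW = 64 * 1024          # traditional 64 kB default window
MAX_WINDOW = 4 * 1024 * 1024    # 4 MB autotuning maximum

if __name__ == "__main__":
    bdp = bdp_bytes(LINK_BPS, BASE_RTT_S)
    print(f"BDP: {bdp:.0f} bytes")
    for name, win in (("64 kB", OLD_WINDOW), ("4 MB", MAX_WINDOW)):
        # Worst case: the whole window sits in the bottleneck queue.
        delay_ms = drain_time_s(win, LINK_BPS) * 1000
        print(f"{name}: {win / bdp:.1f} x BDP, "
              f"up to {delay_ms:.1f} ms of queueing")
```

Under these assumptions the 4 MB window is several hundred times the BDP and
can add roughly a third of a second of queueing delay, while a 64 kB window
stays in the range of a few milliseconds.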

> The situation you describe is exactly what congestion control (the
> topic of RFC2914) should fix.  It is not the role of receive window
> (flow control).  It is really the sender's job to detect and react to
> this, not the receiver's.  (We have had this discussion before on
> netdev.)

According to pure theory, whose job it is may be clear, but that is not of
high importance here. What matters is that autotuning introduced a serious
problem in the LAN context by removing any possibility of reacting properly
to increasing RTT. Again, it's not important whether this functionality was
there by design or by coincidence, but it kept the system well balanced for
many years.

Now, as autotuning is enabled by default in the stock kernel, this problem is
spreading into LANs without users even knowing what's going on. Therefore
I'd like to suggest looking for a decent fix which could be implemented
in a relatively short time frame. My proposal is this:

- measure RTT during the initial phase of the TCP connection (first X segments)
- compute the maximum receive window size from the measured RTT, using a
  configurable constant representing the bandwidth part of the BDP
- let autotuning do its work up to that limit.
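
The three steps above could be sketched roughly as follows. This is only an
illustration of the idea, not kernel code: the names rtt_window_cap,
clamp_autotuned, assumed_bandwidth_bps and initial_segments are hypothetical,
taking the minimum of the early RTT samples is one possible choice, and the
64 kB floor is an added assumption so that the cap never drops below the
traditional default.

```python
def rtt_window_cap(rtt_samples_s, assumed_bandwidth_bps,
                   initial_segments=10,
                   min_window=64 * 1024,
                   max_window=4 * 1024 * 1024):
    """Upper limit for the autotuned receive window, in bytes.

    rtt_samples_s         -- RTT measurements in seconds, oldest first
    assumed_bandwidth_bps -- configurable constant: the bandwidth part of BDP
    initial_segments      -- the "first X segments" used for measurement
    """
    early = rtt_samples_s[:initial_segments]
    if not early:
        # No RTT measurement yet: fall back to the existing maximum.
        return max_window
    rtt = min(early)  # assumption: the smallest sample is closest to base RTT
    cap = int(assumed_bandwidth_bps / 8.0 * rtt)
    # Keep the cap between the assumed floor and the existing maximum.
    return max(min_window, min(cap, max_window))


def clamp_autotuned(requested_window, cap):
    """Let autotuning grow the window, but only up to the RTT-derived cap."""
    return min(requested_window, cap)


if __name__ == "__main__":
    # Hypothetical paths, with the constant set to 1 Gbit/s.
    for label, samples in (("LAN, ~0.2 ms", [0.0002, 0.0005]),
                           ("WAN, ~10 ms", [0.012, 0.010, 0.011]),
                           ("long path, ~100 ms", [0.1])):
        cap = rtt_window_cap(samples, 1e9)
        print(f"{label}: cap {cap} bytes, "
              f"4 MB request clamped to {clamp_autotuned(4 * 1024 * 1024, cap)}")
```

With such a limit, a low-RTT LAN path stays near the old 64 kB behaviour,
while high-RTT paths can still grow to the full tcp_rmem maximum.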

  With kind regards,

        M. 

