From: Brian Bloniarz <bmb@athenacr.com>
To: dormando <dormando@rydia.net>
Cc: netdev@vger.kernel.org
Subject: Re: 3 packet TCP window limit?
Date: Wed, 05 May 2010 09:26:04 -0400
Message-ID: <4BE171EC.20904@athenacr.com>
In-Reply-To: <alpine.LNX.2.00.1005050210230.8544@d>

dormando wrote:
> Hey,
>
> Noticed in Linux that no matter what sysctl variable I twiddle, or what
> TCP congestion algorithm is running, TCP will wait for remote acks after
> sending the first 3 packets. After that it's normal.
>
> Apologies, it's hard to describe:
>
> Linux server listening.
>
> Remote -> SYN
> (RTT wait)
> Linux -> SYN/ACK
> Remote -> ACK
> Remote -> Packet (small HTTP request)
> (RTT wait)
> Linux -> Packet (x 3)
> Remote -> (returning acks per packet)
> (RTT wait)
> Linux -> More packets (up to window size)
>
> If the request response fits in 3 packets or less, that third RTT wait
> never happens. The remote client gets all its data, and sends back all the
> FIN/ACK packets for closing the connection.
>
> What's bizarre is that this 3 packet/4 packet barrier is regardless of how
> much data there is to send. I can cause the extra RTT to flip on or off by
> sending exactly +/- 1 byte to cause an extra packet.
>
> Holding the connection open and repeating the request any number of times
> runs just fine, after the initial request.
>
> You can pretty easily see this by:
> tc qdisc add dev eth0 root netem delay 100ms
> ... then fetching a 3k file, then a 4k file from an HTTP server running
> Linux. Well, at least I can see this easily. I tried on a half dozen boxes
> (2.6.11 through 2.6.32).
>
> I'm trying to track down where in the code this is, or why my sysctl
> tuning isn't affecting it. I can't discern its purpose. The lag it causes
> is pretty awful for far away clients; adding 300ms of latency will make a
> small request take a full second, instead of 600ms.
>
> I'm slogging through the code but any insight would be greatly
> appreciated!
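
This sounds like TCP slow start: the sender begins with a small
congestion window and has to wait for ACKs before it is allowed to
send more.
http://en.wikipedia.org/wiki/Slow-start

As far as tunables go, you might want to play with the initcwnd route
flag (see "ip route help").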
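
To make the 3-vs-4 packet boundary concrete, here is a rough Python
sketch of idealized slow start. It is only a model, not the kernel's
logic: it assumes an initial congestion window of 3 segments (what
RFC 3390 gives for a 1460-byte MSS), a window that doubles every RTT,
no loss, no delayed ACKs, and no receiver-window limit. The names
rtts_to_send and fetch_time_ms are made up for illustration, and the
300 ms figure is treated as a round-trip time.

```python
def rtts_to_send(num_packets, initcwnd=3):
    """Count send rounds (one per RTT) under idealized slow start.

    Assumption: cwnd starts at `initcwnd` segments and doubles each
    round because every segment of the previous round was ACKed.
    """
    if initcwnd < 1:
        raise ValueError("initcwnd must be at least 1")
    cwnd = initcwnd
    sent = 0
    rounds = 0
    while sent < num_packets:
        sent += cwnd   # send a full window, then wait for the ACKs
        cwnd *= 2      # idealized growth: window doubles per RTT
        rounds += 1
    return rounds


def fetch_time_ms(num_packets, rtt_ms, initcwnd=3):
    """Time until the client has the whole response, in milliseconds.

    Assumption: one RTT for SYN -> SYN/ACK, then one RTT per send
    round (the request rides along right after the handshake ACK).
    """
    return (1 + rtts_to_send(num_packets, initcwnd)) * rtt_ms


if __name__ == "__main__":
    for n in (3, 4):
        print(n, "packets:", fetch_time_ms(n, rtt_ms=300), "ms")
```

Under these assumptions a 3-packet response takes 600 ms and a
4-packet response takes 900 ms at a 300 ms RTT, which is roughly the
jump you describe. Passing a larger initcwnd to the model (4 or more)
removes the extra round for the 4-packet case.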
>
> -Dormando