From: Rick Jones <rick.jones2@hp.com>
To: netdev@oss.sgi.com
Subject: Re: on the wire behaviour of TSO on/off is supposed to be the same yes?
Date: Fri, 21 Jan 2005 14:00:30 -0800
Message-ID: <41F17B7E.2020002@hp.com>
In-Reply-To: <20050121124441.76cbbfb9.davem@davemloft.net>
David S. Miller wrote:
> Don't set tcp_tso_win_divisor to such a low value, that's why
> TCP is being so bursty in your case. The default value
> of "8" keeps TCP reasonable well ACK clocked, thus avoiding
> the throughput lossage you are seeing with it set to "1".
If my only interest were bulk throughput then that would be fine, but I'm also concerned about shorter-lived, request/response
sorts of workloads. The netperf TCP_STREAM test was simply a convenient vehicle. If it would be better, I could switch to a
different netperf test.
> With a value of "1", TCP will wait for the entire congestion
> window to be ACK'd before it will spit out a huge TSO frame.
It looks, though, like it is then not spitting out a full congestion window. Here is the opening from the TSO-on case:
000031 IP 192.168.13.223.33287 > 192.168.13.1.64632: S 2243249440:2243249440(0) win 5840 <mss 1460,sackOK,timestamp 168858934 0,nop,wscale 2>
000095 IP 192.168.13.1.64632 > 192.168.13.223.33287: S 3684332982:3684332982(0) ack 2243249441 win 65535 <mss 1460,nop,nop,sackOK,wscale 2,nop,nop,nop,timestamp 960528547 168858934>
000014 IP 192.168.13.223.33287 > 192.168.13.1.64632: . ack 1 win 1460 <nop,nop,timestamp 168858934 960528547>
000118 IP 192.168.13.223.33287 > 192.168.13.1.64632: . 1:4345(4344) ack 1 win 1460 <nop,nop,timestamp 168858934 960528547>
000117 IP 192.168.13.1.64632 > 192.168.13.223.33287: . ack 1449 win 32768 <nop,nop,timestamp 960528547 168858934>
000002 IP 192.168.13.1.64632 > 192.168.13.223.33287: . ack 4345 win 32768 <nop,nop,timestamp 960528547 168858934>
000248 IP 192.168.13.223.33287 > 192.168.13.1.64632: . 4345:8689(4344) ack 1 win 1460 <nop,nop,timestamp 168858935 960528547>
Indeed, it waited for the ACK of 4345, but then shouldn't it have emitted 4344+1448 = 5792 bytes, or perhaps 7240 (since there were two
ACKs)?
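To spell out the arithmetic behind that question, here is a rough sketch. It assumes 1448 bytes of payload per segment (the 1460-byte MSS less 12 bytes for the padded timestamp option seen in the trace), an initial cwnd of three segments (the first flight was 4344 bytes), and slow start growing cwnd by one segment per ACK received. Those growth rules are assumptions for illustration, not a claim about what the stack actually does.

```python
# Sketch only: naive slow-start arithmetic for the TSO-on trace above.
MSS = 1460                    # from the SYN options in the trace
TIMESTAMP_OPTION_BYTES = 12   # 10-byte timestamp option plus two NOPs
SEGMENT = MSS - TIMESTAMP_OPTION_BYTES   # payload bytes per segment

INITIAL_CWND_SEGMENTS = 3     # assumed from the 4344-byte first flight


def cwnd_bytes(acks_received: int) -> int:
    """cwnd in bytes, assuming it grows by one segment per ACK received."""
    return (INITIAL_CWND_SEGMENTS + acks_received) * SEGMENT


first_flight = cwnd_bytes(0)   # what was sent before any ACK
after_one_ack = cwnd_bytes(1)  # if the two ACKs counted as one
after_two_acks = cwnd_bytes(2) # if each ACK counted separately

print(SEGMENT, first_flight, after_one_ack, after_two_acks)
# prints: 1448 4344 5792 7240
```

Under either reading the second flight would be larger than the 4344 bytes seen in the TSO-on trace.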
(this is a hacked tcpdump to treat an IP length field of zero as a TSO segment and use the other reported length - a patch went to
tcpdump-workers, not sure if they will like it or not...)
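For anyone wondering what that hack amounts to, the idea can be sketched roughly as below. This is not the patch itself; it assumes "the other reported length" means the length the capture layer reports for the packet, and the function and parameter names are made up for illustration.

```python
import struct


def ip_datagram_length(ip_header: bytes, capture_reported_len: int) -> int:
    """Return the IPv4 datagram length, with a fallback for TSO frames.

    capture_reported_len is a hypothetical parameter: the number of bytes
    the capture layer reports from the start of the IP header onward.
    """
    # The IPv4 total-length field is the 16-bit big-endian value at offset 2.
    total_length = struct.unpack("!H", ip_header[2:4])[0]
    if total_length == 0:
        # Assumed to be a TSO frame captured before the NIC segments it.
        return capture_reported_len
    return total_length


def make_header(total_length: int) -> bytes:
    """Build a minimal 20-byte IPv4 header with only the length filled in."""
    return struct.pack("!BBH", 0x45, 0, total_length) + bytes(16)


print(ip_datagram_length(make_header(1500), 1500))  # prints: 1500
print(ip_datagram_length(make_header(0), 4396))     # prints: 4396
```

The 4396 in the second call is 4344 bytes of payload plus 20 bytes of IP header and 32 bytes of TCP header with the timestamp option.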
In the TSO off case it does send a full cwnd:
000031 IP 192.168.13.223.33289 > 192.168.13.1.64633: S 2252401705:2252401705(0) win 5840 <mss 1460,sackOK,timestamp 168870470 0,nop,wscale 2>
000099 IP 192.168.13.1.64633 > 192.168.13.223.33289: S 3685848941:3685848941(0) ack 2252401706 win 65535 <mss 1460,nop,nop,sackOK,wscale 2,nop,nop,nop,timestamp 960529700 168870470>
000014 IP 192.168.13.223.33289 > 192.168.13.1.64633: . ack 1 win 1460 <nop,nop,timestamp 168870470 960529700>
000080 IP 192.168.13.223.33289 > 192.168.13.1.64633: . 1:1449(1448) ack 1 win 1460 <nop,nop,timestamp 168870470 960529700>
000009 IP 192.168.13.223.33289 > 192.168.13.1.64633: . 1449:2897(1448) ack 1 win 1460 <nop,nop,timestamp 168870470 960529700>
000010 IP 192.168.13.223.33289 > 192.168.13.1.64633: . 2897:4345(1448) ack 1 win 1460 <nop,nop,timestamp 168870470 960529700>
000145 IP 192.168.13.1.64633 > 192.168.13.223.33289: . ack 1449 win 32768 <nop,nop,timestamp 960529700 168870470>
000001 IP 192.168.13.1.64633 > 192.168.13.223.33289: . ack 4345 win 32768 <nop,nop,timestamp 960529700 168870470>
000190 IP 192.168.13.223.33289 > 192.168.13.1.64633: . 4345:5793(1448) ack 1 win 1460 <nop,nop,timestamp 168870470 960529700>
000006 IP 192.168.13.223.33289 > 192.168.13.1.64633: . 5793:7241(1448) ack 1 win 1460 <nop,nop,timestamp 168870470 960529700>
000013 IP 192.168.13.223.33289 > 192.168.13.1.64633: . 7241:8689(1448) ack 1 win 1460 <nop,nop,timestamp 168870470 960529700>
000005 IP 192.168.13.223.33289 > 192.168.13.1.64633: . 8689:10137(1448) ack 1 win 1460 <nop,nop,timestamp 168870470 960529700>
000004 IP 192.168.13.223.33289 > 192.168.13.1.64633: . 10137:11585(1448) ack 1 win 1460 <nop,nop,timestamp 168870470 960529700>
Given the relative timestamps (tcpdump -ttt... taken on the sender) it _seems_ that even in the TSO-off case it was waiting for the
full cwnd to be ACKed, but then once ACKed, it sent the full 5 segment cwnd. (Although that seeming to wait would really need to be
confirmed by an intra-stack trace I suppose...)
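The per-flight byte counts can be pulled out of the traces mechanically. The sketch below is only an illustration: the regular expression is written against the tcpdump output format shown in this message (with the TCP options trimmed from the sample lines), and the helper name is hypothetical. A "flight" here just means a run of consecutive data segments from the sender with no other packet in between.

```python
import re

# Matches data lines such as:
# 000080 IP a.b.c.d.port > e.f.g.h.port: . 1:1449(1448) ack 1 win 1460
DATA_RE = re.compile(r"^(\d+) IP (\S+) > (\S+): \S+ (\d+):(\d+)\((\d+)\)")


def flights(lines, sender_ip):
    """Sum payload bytes over each run of consecutive sender data segments."""
    result, current = [], 0
    for line in lines:
        m = DATA_RE.match(line)
        is_data = (m is not None
                   and m.group(2).startswith(sender_ip + ".")
                   and int(m.group(6)) > 0)
        if is_data:
            current += int(m.group(6))
        elif current:
            result.append(current)   # an ACK (or other packet) ends the run
            current = 0
    if current:
        result.append(current)       # last run; may be cut short by the trace
    return result


SND = "192.168.13.223.33289 > 192.168.13.1.64633:"
RCV = "192.168.13.1.64633 > 192.168.13.223.33289:"
tso_off = [
    f"000080 IP {SND} . 1:1449(1448) ack 1 win 1460",
    f"000009 IP {SND} . 1449:2897(1448) ack 1 win 1460",
    f"000010 IP {SND} . 2897:4345(1448) ack 1 win 1460",
    f"000145 IP {RCV} . ack 1449 win 32768",
    f"000001 IP {RCV} . ack 4345 win 32768",
    f"000190 IP {SND} . 4345:5793(1448) ack 1 win 1460",
    f"000006 IP {SND} . 5793:7241(1448) ack 1 win 1460",
    f"000013 IP {SND} . 7241:8689(1448) ack 1 win 1460",
    f"000005 IP {SND} . 8689:10137(1448) ack 1 win 1460",
    f"000004 IP {SND} . 10137:11585(1448) ack 1 win 1460",
]
tso_on = [
    "000118 IP 192.168.13.223.33287 > 192.168.13.1.64632: . 1:4345(4344) ack 1 win 1460",
    "000117 IP 192.168.13.1.64632 > 192.168.13.223.33287: . ack 1449 win 32768",
    "000002 IP 192.168.13.1.64632 > 192.168.13.223.33287: . ack 4345 win 32768",
    "000248 IP 192.168.13.223.33287 > 192.168.13.1.64632: . 4345:8689(4344) ack 1 win 1460",
]

print(flights(tso_off, "192.168.13.223"))  # prints: [4344, 7240]
print(flights(tso_on, "192.168.13.223"))   # prints: [4344, 4344]
```

Since both traces stop where the excerpts above stop, the final number in each list is only a lower bound on that flight.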
rick jones