From: Rick Jones <rick.jones2@hp.com>
To: netdev@oss.sgi.com
Subject: Re: on the wire behaviour of TSO on/off is supposed to be the same yes?
Date: Fri, 21 Jan 2005 14:48:08 -0800
Message-ID: <41F186A8.9030805@hp.com>
In-Reply-To: <20050121141820.7d59a2d1.davem@davemloft.net>

David S. Miller wrote:
> On Fri, 21 Jan 2005 14:00:30 -0800 Rick Jones <rick.jones2@hp.com> wrote:
> 
> 
>> Indeed, it waited for the ACK 4335, but then shouldn't it have emitted
>> 4344+1448 or 5792 bytes or perhaps 7240 (since there were two ACKs)?
> 
> 
> The tcp_tso_win_divisor calculation occurs on the congestion window at the 
> time of the user request, not at the time of the ACK.

Ah, _that_ explains why in so many of my traces it stays at one value for sooo
long, and why in some places it seemed to jump by fairly large quantities.  I
thought it was related to the window size.  But in a netperf TCP_STREAM test,
unless the sender sets the -m option, the send size is taken from the
getsockopt() that follows the setsockopt() from the -s.  Since -S was 128K, and
since Linux doubles that on the getsockopt(), that explains the O(200K) bit
before any sends larger than 1448 bytes when the divisor was set to 8.
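
To put rough numbers on that, here is a minimal sketch of the arithmetic. It is a simplified model rather than the kernel's code: the tso_frame_bytes() helper, the cwnd/divisor rule with a floor of one segment, and the cwnd of 3 segments at the time of the send are all assumptions for illustration. The 1448-byte segment size and the divisor of 8 are the figures from the trace.

```python
MSS = 1448  # payload bytes per segment, as seen in the trace

def tso_frame_bytes(cwnd_segs, divisor, mss=MSS):
    """Hypothetical model: bytes per TSO frame is cwnd/divisor segments,
    never less than one segment."""
    return max(cwnd_segs // divisor, 1) * mss

requested = 128 * 1024      # socket buffer size passed to setsockopt()
send_size = 2 * requested   # getsockopt() reports double; netperf uses it as the send size

# Assumed cwnd of 3 segments at the moment the send is queued.
frame = tso_frame_bytes(cwnd_segs=3, divisor=8)

print("bytes queued by one send:", send_size)
print("TSO frame size fixed for that send:", frame)
```

In this model the whole send is carved into single-segment frames, because the frame size was chosen from the cwnd as it stood when the send was queued.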

> That's an interesting observation actually, thanks for showing it.

My pleasure.

> It means that ideally we might want to try and find a way to either:
> 
> 1) defer the TSO window size calculation to some later moment, ie. at
> tcp_write_xmit() time
> 
> 2) use an optimistic TSO size calculation at the same moment we compute it
> now, and later if it is found to be too aggressive we chop up the TSO frame
> and resegment the transmit queue to accommodate
> 
> Neither is easy to implement as far as I can tell, but it should fix all the
> problems IBM and others are trying to work around by setting the
> tcp_tso_win_divisor really small.

Indeed, it seems that one would want to decide about TSO when one is about to
transmit, not when the user does a send, since otherwise you penalize users
doing larger sends.  Someone doing, say, a sendfile() of a large file would be
pretty much precluded from getting benefit from TSO the way things are now, right?
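
A toy comparison of the two moments at which the frame size could be chosen, for one large send. This is only a sketch under stated assumptions: both function names are made up, the cwnd/divisor rule is a simplified model, and the congestion window simply doubles each round trip up to an arbitrary cap of 64 segments. It does not model ACK clocking or the real tcp_write_xmit() path.

```python
def frames_sized_at_send(total_segs, cwnd, divisor):
    """Frame size chosen once, from cwnd when the send is queued."""
    per_frame = max(cwnd // divisor, 1)
    frames = []
    while total_segs > 0:
        n = min(per_frame, total_segs)
        frames.append(n)
        total_segs -= n
    return frames

def frames_sized_at_xmit(total_segs, cwnd, divisor, max_cwnd=64):
    """Frame size recomputed each round trip as cwnd grows
    (toy slow start: cwnd doubles per round, capped at max_cwnd)."""
    frames = []
    while total_segs > 0:
        per_frame = max(cwnd // divisor, 1)
        budget = min(cwnd, total_segs)  # segments allowed out this round
        while budget > 0:
            n = min(per_frame, budget)
            frames.append(n)
            budget -= n
            total_segs -= n
        cwnd = min(cwnd * 2, max_cwnd)
    return frames

# 182 segments is roughly one 256K send at 1448 bytes per segment.
at_send = frames_sized_at_send(182, cwnd=3, divisor=8)
at_xmit = frames_sized_at_xmit(182, cwnd=3, divisor=8)
print("frames when sized at send time:", len(at_send))
print("frames when sized at transmit time:", len(at_xmit))
```

Sizing at send time leaves every frame at one segment for the whole send, while sizing at transmit time lets later frames grow with the window.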

(There is a netperf TCP_SENDFILE test, but it defaults the send size to the 
socket buffer size just like TCP_STREAM)

And I suspect that is the case for some of the (un)spoken workloads of interest 
among the system vendors.  That's not to say that we still won't have incentive 
to set tcp_tso_win_divisor (shouldn't that really be tcp_tso_cwnd_divisor?) to 1 
:)  I suspect we will still want that initial "4380" cwnd bytes to be a single 
TSO transmission... every cycle's sacred, every cycle's great... :)
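
For the initial-window case, a small sketch of why a divisor of 1 is the only setting that would let those 4380 bytes go out as one frame. It assumes a 1460-byte MSS (4380 / 3) and uses a made-up frame_bytes() helper with a simplified cwnd/divisor rule, not the kernel's actual computation.

```python
INITIAL_CWND_BYTES = 4380   # figure quoted above
MSS = 1460                  # assumed: 4380 / 3
cwnd_segs = INITIAL_CWND_BYTES // MSS

def frame_bytes(cwnd_segs, divisor, mss=MSS):
    """Hypothetical model: cwnd/divisor segments per frame, at least one."""
    return max(cwnd_segs // divisor, 1) * mss

by_divisor = {d: frame_bytes(cwnd_segs, d) for d in (1, 2, 8)}
for d, size in by_divisor.items():
    print("divisor", d, "->", size, "bytes per frame")
```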

rick jones

BTW, has the whole "reply-to" question already been thrashed about on this list? 
  Is it an open or closed list?  I ask because I keep getting two copies of 
everyone's replies - one to me, one to the list... just a nit...
