From: "Michael S. Tsirkin" <mst@mellanox.co.il>
To: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org, rdreier@cisco.com, rick.jones2@hp.com,
	linux-kernel@vger.kernel.org, openib-general@openib.org
Subject: Re: TSO and IPoIB performance degradation
Date: Mon, 20 Mar 2006 11:06:29 +0200
Message-ID: <20060320090629.GA11352@mellanox.co.il>
In-Reply-To: <20060309.232301.77550306.davem@davemloft.net>

Quoting r. David S. Miller <davem@davemloft.net>:
> > well, there are stacks which do "stretch acks" (after a fashion) that 
> > make sure when they see packet loss to "do the right thing" wrt sending 
> > enough acks to allow cwnds to open again in a timely fashion.
> 
> Once a loss happens, it's too late to stop doing the stretch ACKs, the
> damage is done already.  It is going to take you at least one
> extra RTT to recover from the loss compared to if you were not doing
> stretch ACKs.
> 
> You have to keep giving consistent well spaced ACKs back to the
> receiver in order to recover from loss optimally.

Is it the case, then, that this requirement is less essential on
networks such as IP over InfiniBand, which are very low latency
and essentially lossless (with explicit congestion notifications
in hardware)?

> The ACK every 2 full sized frames behavior of TCP is absolutely
> essential.

Interestingly, I was pointed towards the following RFC draft
http://www.ietf.org/internet-drafts/draft-ietf-tcpm-rfc2581bis-00.txt

    The requirement that an ACK "SHOULD" be generated for at least every
    second full-sized segment is listed in [RFC1122] in one place as a
    SHOULD and another as a MUST.  Here we unambiguously state it is a
    SHOULD.  We also emphasize that this is a SHOULD, meaning that an
    implementor should indeed only deviate from this requirement after
    careful consideration of the implications.

And as Matt Leininger's research appears to show, stretch ACKs
are good for performance in the case of IP over InfiniBand.

Given all this, would it make sense to add a per-netdevice (or per-neighbour)
flag to re-enable the trick for these net devices (as was done before
314324121f9b94b2ca657a494cf2b9cb0e4a28cc)?
The IP over InfiniBand driver would then simply set this flag.
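
To make the idea concrete, here is a rough user-space sketch of the kind of
per-device opt-in I have in mind. It is not kernel code and not a patch:
every name in it (SKETCH_DEV_STRETCH_ACK, struct sketch_netdev,
struct sketch_rcv_state, sketch_ack_now) is hypothetical, and the receive
path is reduced to a byte counter. It assumes the default policy is "ACK at
least every second full-sized segment" and that an opted-in device may defer
the ACK while more data is still queued.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device feature bit; NOT an existing kernel flag. */
#define SKETCH_DEV_STRETCH_ACK (1u << 0)

/* Hypothetical stand-in for a net device. */
struct sketch_netdev {
	unsigned int features;	/* bitmask of SKETCH_DEV_* bits */
};

/* Hypothetical, heavily simplified receiver state. */
struct sketch_rcv_state {
	size_t unacked_bytes;	/* bytes received since the last ACK */
	size_t mss;		/* full-sized segment length */
};

/*
 * Account for len newly received bytes and return true if an ACK
 * should be sent now. The delayed-ACK timer is not modelled.
 */
static bool sketch_ack_now(const struct sketch_netdev *dev,
			   struct sketch_rcv_state *rs, size_t len,
			   bool more_data_queued)
{
	rs->unacked_bytes += len;

	/* Opted-in device with more data waiting: defer (stretch ACK). */
	if ((dev->features & SKETCH_DEV_STRETCH_ACK) && more_data_queued)
		return false;

	/* Default: ACK at least every second full-sized segment. */
	if (rs->unacked_bytes >= 2 * rs->mss) {
		rs->unacked_bytes = 0;
		return true;
	}
	return false;
}
```

In this model a device with features == 0 gets one ACK per two full-sized
segments, exactly as today. A device that sets the flag gets a single ACK
once its queue drains, provided at least two segments' worth of data is
pending.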

David, would you accept such a patch? It would be nice to get 2.6.17
back to within at least 10% of 2.6.11.

-- 
Michael S. Tsirkin
Staff Engineer, Mellanox Technologies
