netdev.vger.kernel.org archive mirror
From: Rick Jones <rick.jones2@hp.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Joris van Rantwijk <joris@jorisvr.nl>, netdev@vger.kernel.org
Subject: Re: Question about LRO/GRO and TCP acknowledgements
Date: Mon, 13 Jun 2011 10:55:26 -0700
Message-ID: <1307987726.8149.3312.camel@tardy>
In-Reply-To: <1307890657.2872.158.camel@edumazet-laptop>

On Sun, 2011-06-12 at 16:57 +0200, Eric Dumazet wrote:
> On Sunday, 12 June 2011 at 13:24 +0200, Joris van Rantwijk wrote:
> > On 2011-06-12, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > > So your concern is more a Sender side implementation missing this
> > > recommendation, not GRO per se...
> > 
> > Not really. The same RFC says:
> >   Specifically, an ACK SHOULD be generated for at least every
> >   second full-sized segment, ...
> > 
> 
> Well, SHOULD is not MUST.
> 
> 
> > I can see how the world may have been a better place if every sender
> > implemented Appropriate Byte Counting and TCP receivers were allowed to
> > send fewer ACKs. However, current reality is that ABC is optional,
> > disabled by default in Linux, and receivers are recommended to send one
> > ACK per two segments.
> > 
> 
> ABC might be nice for stacks that use byte counters for cwnd. We use
> segments.
> 
> > I suspect that GRO currently hurts throughput of isolated TCP
> > connections. This is based on a purely theoretic argument. I may be
> > wrong and I have absolutely no data to confirm my suspicion.
> > 
> > If you can point out the flaw in my reasoning, I would be greatly
> > relieved. Until then, I remain concerned that there may be something
> > wrong with GRO and TCP ACKs.
> 
> Think of GRO as a receiver facility against stress/load, typically in
> a datacenter.
> 
> GRO kicks in only when the receiver is overloaded; it can then coalesce
> several frames so that the TCP stack handles them in one run.

How is that affected by interrupt coalescing in the NIC and the sending
side doing TSO (and so, ostensibly sending back-to-back frames)?  Are we
assured that a NIC is updating its completion pointer on the rx ring
continuously rather than just before a coalesced interrupt?

Does GRO "never" kick in over a 1GbE link (making the handwaving
assumption that cores today are much faster than a 1GbE link on a bulk
transfer)?

It was just a quick and dirty test, but there does seem to be a small
benefit from enabling GRO on a 1GbE link on a system with "fast
processors":

raj@tardy:~/netperf2_trunk$ sudo ethtool -K eth1 gro off
raj@tardy:~/netperf2_trunk$ src/netperf -t TCP_MAERTS -H 192.168.1.3 -i 10,3 -c -- -k foo
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.3 (192.168.1.3) port 0 AF_INET : +/-2.500% @ 99% conf.  : histogram : demo
THROUGHPUT=935.07
LOCAL_INTERFACE_NAME=eth1
LOCAL_CPU_UTIL=16.64
LOCAL_SD=5.830
raj@tardy:~/netperf2_trunk$ sudo ethtool -K eth1 gro on
raj@tardy:~/netperf2_trunk$ src/netperf -t TCP_MAERTS -H 192.168.1.3 -i 10,3 -c -- -k foo
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.3 (192.168.1.3) port 0 AF_INET : +/-2.500% @ 99% conf.  : histogram : demo
THROUGHPUT=934.81
LOCAL_INTERFACE_NAME=eth1
LOCAL_CPU_UTIL=16.21
LOCAL_SD=5.684
raj@tardy:~/netperf2_trunk$ uname -a
Linux tardy 2.6.35-28-generic #50-Ubuntu SMP Fri Mar 18 18:42:20 UTC 2011 x86_64 GNU/Linux

The receiver system here has a 3.07 GHz W3550 in it and eth1 is a port
on an Intel 82571EB-based four-port card.

raj@tardy:~/netperf2_trunk$ ethtool -i eth1
driver: e1000e
version: 1.0.2-k4
firmware-version: 5.10-2
bus-info: 0000:2a:00.0
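
To put the two runs side by side, here is a minimal sketch that computes
the relative change from "gro off" to "gro on" for the three figures
netperf printed. The numbers are copied from the output above; the
helper name relative_change is only illustrative. Note that the test
banner reports +/-2.500% at 99% confidence, so differences of about that
size are close to the stated bound and should be read cautiously.

```python
# Figures copied from the two netperf TCP_MAERTS runs above (eth1).
gro_off = {"THROUGHPUT": 935.07, "LOCAL_CPU_UTIL": 16.64, "LOCAL_SD": 5.830}
gro_on = {"THROUGHPUT": 934.81, "LOCAL_CPU_UTIL": 16.21, "LOCAL_SD": 5.684}


def relative_change(before, after):
    """Percent change going from `before` to `after`."""
    return (after - before) / before * 100.0


# One entry per metric: negative means the value dropped with GRO enabled.
changes = {name: relative_change(gro_off[name], gro_on[name]) for name in gro_off}

for name, pct in changes.items():
    print(f"{name}: {pct:+.2f}%")
```

With these inputs, throughput is essentially unchanged (about 0.03%
lower), while CPU utilization and service demand are each roughly 2.5%
lower with GRO on.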

> If receiver is so loaded that more than 2 frames are coalesced in a NAPI
> run, it certainly helps to not allow sender to increase its cwnd more
> than one SMSS. We probably are right before packet drops anyway.

If we are indeed statistically certain that we are right before packet
drops (or, I suppose, before asserting pause), then shouldn't ECN get
set by the GRO code?

rick
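
For readers following the ACK-counting argument quoted earlier in this
message, the sketch below is a deliberately simplified model, not the
Linux implementation. It assumes a sender in slow start and compares
per-ACK counting (cwnd grows by one segment per ACK, however much data
that ACK covers) with RFC 3465 Appropriate Byte Counting using a limit
of L = 2 segments per ACK. The function name, the initial window of 4
segments, and the burst of 16 segments are all hypothetical values
chosen for illustration.

```python
def slow_start_cwnd(initial_cwnd, segments_delivered, segments_per_ack, abc_limit=None):
    """Return cwnd (in segments) after `segments_delivered` segments are ACKed.

    The receiver sends one cumulative ACK per `segments_per_ack` segments.
    abc_limit=None  -> per-ACK counting: +1 segment per ACK (assumed model).
    abc_limit=L     -> RFC 3465 style: +min(segments acked, L) per ACK.
    """
    cwnd = initial_cwnd
    remaining = segments_delivered
    while remaining > 0:
        acked = min(segments_per_ack, remaining)  # data covered by this ACK
        remaining -= acked
        if abc_limit is None:
            cwnd += 1
        else:
            cwnd += min(acked, abc_limit)
    return cwnd


# Hypothetical scenario: initial window of 4 segments, 16 segments delivered.
per_ack_delayed = slow_start_cwnd(4, 16, segments_per_ack=2)          # ACK every 2
per_ack_coalesced = slow_start_cwnd(4, 16, segments_per_ack=8)        # 8 coalesced
abc_delayed = slow_start_cwnd(4, 16, segments_per_ack=2, abc_limit=2)
abc_coalesced = slow_start_cwnd(4, 16, segments_per_ack=8, abc_limit=2)

print(per_ack_delayed, per_ack_coalesced, abc_delayed, abc_coalesced)
```

In this toy model, fewer ACKs always mean slower window growth: a
per-ACK sender grows by one segment per coalesced ACK, and ABC with
L = 2 grows by at most two. The model ignores round trips, delayed-ACK
timers, and congestion avoidance.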


