From: Rick Jones <rick.jones2@hpe.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Yuval Mintz <Yuval.Mintz@qlogic.com>,
	Alexander Duyck <alexander.duyck@gmail.com>,
	Manish Chopra <manish.chopra@qlogic.com>,
	David Miller <davem@davemloft.net>,
	netdev <netdev@vger.kernel.org>,
	Ariel Elior <Ariel.Elior@qlogic.com>,
	Tom Herbert <tom@herbertland.com>,
	Hannes Frederic Sowa <hannes@redhat.com>
Subject: Re: [PATCH net-next 0/5] qed/qede: Tunnel hardware GRO support
Date: Wed, 22 Jun 2016 16:10:32 -0700
Message-ID: <576B1AE8.6030309@hpe.com>
In-Reply-To: <1466635664.6850.90.camel@edumazet-glaptop3.roam.corp.google.com>

On 06/22/2016 03:47 PM, Eric Dumazet wrote:
> On Wed, 2016-06-22 at 14:52 -0700, Rick Jones wrote:
>> On 06/22/2016 11:22 AM, Yuval Mintz wrote:
>>> But seriously, this isn't really anything new but rather a step forward in
>>> the direction we've already taken - bnx2x/qede are already performing
>>> the same for non-encapsulated TCP.
>>
>> Since you mention bnx2x...   I would argue that the NIC firmware on
>> those NICs driven by bnx2x is doing it badly.  Not so much from a
>> functional standpoint I suppose, but from a performance one.  The
>> NIC-firmware GRO done there has this rather unfortunate assumption about
>> "all MSSes will be directly driven by my own physical MTU" and when it
>> sees segments of a size other than would be suggested by the physical
>> MTU, will coalesce only two segments together.  They then do not get
>> further coalesced in the stack.
>>
>> Suffice it to say this does not do well from a performance standpoint.
>>
>> One can disable LRO via ethtool for these NICs, but what that does is
>> disable old-school LRO, not GRO-in-the-NIC.  To get that disabled, one
>> must also get the bnx2x module loaded with "disable-tpa=1" so the Linux
>> stack GRO gets used instead.
>>
>> Had the bnx2x-driven NICs' firmware not had that rather unfortunate
>> assumption about MSSes I probably would never have noticed.
>
> I do not see this behavior on my bnx2x nics ?
>
> ip ro add 10.246.11.52 via 10.246.11.254 dev eth0 mtu 1000
> lpk51:~# ./netperf -H 10.246.11.52 -l 1000
> MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
> 10.246.11.52 () port 0 AF_INET

I first saw this with VMs which themselves had 1400-byte MTUs on their
vNICs, speaking through bnx2x-driven NICs with a 1500-byte MTU, but I did
later reproduce it by setting the MTU of my sending-side NIC to
something like 1400 bytes and running a "bare iron" netperf.  I believe
you may be able to achieve the same thing by having netperf set a
smaller MSS via the test-specific -G option.
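
To make the mismatch concrete, here is a rough sketch of the per-segment
payload each MTU implies.  The header sizes are assumptions (IPv4 without
options, TCP with the timestamp option), and payload_per_segment is a
hypothetical helper, not anything from the driver or firmware.  How the
firmware actually derives its expected segment size is not stated in this
thread.

```python
# Assumed header sizes; none of these constants come from the bnx2x code.
IPV4_HEADER = 20     # bytes, IPv4 header without options
TCP_HEADER = 20      # bytes, base TCP header
TCP_TIMESTAMPS = 12  # bytes, 10-byte timestamp option plus two NOPs


def payload_per_segment(mtu, timestamps=True):
    """Return the TCP payload bytes that fit in one packet of the given MTU."""
    overhead = IPV4_HEADER + TCP_HEADER
    if timestamps:
        overhead += TCP_TIMESTAMPS
    return mtu - overhead


if __name__ == "__main__":
    nic_mtu = 1500     # MTU of the bnx2x-driven NIC
    sender_mtu = 1400  # MTU of the VM's vNIC (or the tweaked sending NIC)
    print("payload implied by NIC MTU:   ", payload_per_segment(nic_mtu))
    print("payload actually on the wire: ", payload_per_segment(sender_mtu))
```

Under these assumptions the segments arriving at the NIC carry 1348
bytes each rather than the 1448 a 1500-byte MTU would suggest, which is
the kind of size difference described above.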

My systems are presently in the midst of an install but I should be able
to demonstrate it in the morning (US Pacific time, modulo the shuttle
service of a car repair place).

> On receiver :

Paranoid question, but is LRO disabled on the receiver?  I don't know
that LRO exhibits the behaviour; I have only seen it with GRO-in-the-NIC.

rick

>
> 15:46:08.296241 IP 10.246.11.52.46907 > 10.246.11.51.34131: Flags [.],
> ack 303360, win 8192, options [nop,nop,TS val 1245217243 ecr
> 1245306446], length 0
> 15:46:08.296430 IP 10.246.11.51.34131 > 10.246.11.52.46907: Flags [.],
> seq 303360:327060, ack 1, win 229, options [nop,nop,TS val 1245306446
> ecr 1245217242], length 23700
> 15:46:08.296441 IP 10.246.11.52.46907 > 10.246.11.51.34131: Flags [.],
> ack 327060, win 8192, options [nop,nop,TS val 1245217243 ecr
> 1245306446], length 0
> 15:46:08.296644 IP 10.246.11.51.34131 > 10.246.11.52.46907: Flags [.],
> seq 327060:350760, ack 1, win 229, options [nop,nop,TS val 1245306446
> ecr 1245217242], length 23700
> 15:46:08.296655 IP 10.246.11.52.46907 > 10.246.11.51.34131: Flags [.],
> ack 350760, win 8192, options [nop,nop,TS val 1245217244 ecr
> 1245306446], length 0
> 15:46:08.296854 IP 10.246.11.51.34131 > 10.246.11.52.46907: Flags [.],
> seq 350760:374460, ack 1, win 229, options [nop,nop,TS val 1245306446
> ecr 1245217242], length 23700
> 15:46:08.296897 IP 10.246.11.52.46907 > 10.246.11.51.34131: Flags [.],
> ack 374460, win 8192, options [nop,nop,TS val 1245217244 ecr
> 1245306446], length 0
> 15:46:08.297054 IP 10.246.11.51.34131 > 10.246.11.52.46907: Flags [.],
> seq 374460:398160, ack 1, win 229, options [nop,nop,TS val 1245306446
> ecr 1245217242], length 23700
> 15:46:08.297099 IP 10.246.11.52.46907 > 10.246.11.51.34131: Flags [.],
> ack 398160, win 8192, options [nop,nop,TS val 1245217244 ecr
> 1245306446], length 0
> 15:46:08.297258 IP 10.246.11.51.34131 > 10.246.11.52.46907: Flags [.],
> seq 398160:420912, ack 1, win 229, options [nop,nop,TS val 1245306446
> ecr 1245217242], length 22752
> 15:46:08.297301 IP 10.246.11.52.46907 > 10.246.11.51.34131: Flags [.],
> ack 420912, win 8192, options [nop,nop,TS val 1245217244 ecr
> 1245306446], length 0
>
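
One way to read the quoted trace is to divide each data packet's length
by the per-segment payload.  The sketch below does that for two lines
abbreviated from the trace.  It assumes a 948-byte payload (route MTU
1000 minus 20 bytes of IPv4 header, 20 bytes of TCP header and 12 bytes
of timestamp option); segments_per_aggregate is a hypothetical helper
name.

```python
import re

# Abbreviated from the quoted tcpdump output; the lengths are unchanged.
TRACE = """\
15:46:08.296241 IP 10.246.11.52.46907 > 10.246.11.51.34131: Flags [.], ack 303360, length 0
15:46:08.296430 IP 10.246.11.51.34131 > 10.246.11.52.46907: Flags [.], seq 303360:327060, length 23700
15:46:08.297258 IP 10.246.11.51.34131 > 10.246.11.52.46907: Flags [.], seq 398160:420912, length 22752
"""

ASSUMED_MSS = 948  # assumption: 1000 - 20 (IPv4) - 20 (TCP) - 12 (timestamps)


def segments_per_aggregate(trace, mss):
    """Return, for each data packet in the trace, how many MSS-sized
    segments its length corresponds to (pure ACKs of length 0 are skipped)."""
    lengths = [int(n) for n in re.findall(r"length (\d+)", trace)]
    return [length / mss for length in lengths if length > 0]


if __name__ == "__main__":
    for count in segments_per_aggregate(TRACE, ASSUMED_MSS):
        print(count)
```

With that assumed payload size, the 23700-byte and 22752-byte packets
correspond to 25 and 24 segments respectively, so the aggregates in this
particular capture are well beyond two segments.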
