Re: [RFC] GRO scalability

From: Eric Dumazet <eric.dumazet@gmail.com>
To: Rick Jones <rick.jones2@hp.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>,
	David Miller <davem@davemloft.net>,
	netdev <netdev@vger.kernel.org>, Jesse Gross <jesse@nicira.com>
Subject: Re: [RFC] GRO scalability
Date: Mon, 08 Oct 2012 18:59:21 +0200
Message-ID: <1349715561.21172.3463.camel@edumazet-glaptop>
In-Reply-To: <50730217.6020206@hp.com>

On Mon, 2012-10-08 at 09:40 -0700, Rick Jones wrote:
> On 10/05/2012 01:06 PM, Eric Dumazet wrote:
> > On Fri, 2012-10-05 at 12:35 -0700, Rick Jones wrote:
> >
> >> Just how much code path is there between NAPI and the socket?? (And I
> >> guess just how much combining are you hoping for?)
> >>
> >
> > When GRO correctly works, you can save about 30% of cpu cycles, it
> > depends...
> >
> > Doubling MAX_SKB_FRAGS (allowing 32+1 MSS per GRO skb instead of 16+1)
> > gives an improvement as well...
> 
> OK, but how much of that 30% come from where?  Each coalesced segment is 
> saving the cycles between NAPI and the socket.  Each avoided ACK is 
> saving the cycles from TCP to the bottom of the driver and a (share of) 
> transmit completion.

It comes from the fact that there is less contention on the socket lock
between the bottom half handler and the application, on top of the
savings in all the layers we have to cross (IP, netfilter, ...).

Each time a TCP packet is delivered while the socket is owned by the
user, the packet is placed on a special 'backlog queue', and the
application has to process it right before releasing the socket lock.
It sucks because it adds latency, and more frames get queued to the
backlog while the application processes it (very expensive because of
cache line misses).

So GRO really makes this kind of event less probable.
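
To make the backlog argument concrete, here is a minimal toy model, not
kernel code. The class `Sock`, its methods, and `backlog_work` are
hypothetical names invented for this sketch, and the segment counts are
illustrative only. It assumes every packet arrives while the application
owns the socket, and counts how many packets the application must drain
from the backlog before releasing the lock, with and without coalescing.

```python
from collections import deque


class Sock:
    """Toy model of a TCP socket receive path (hypothetical, not kernel code)."""

    def __init__(self):
        self.owned_by_user = False
        self.receive_queue = deque()
        self.backlog = deque()

    def lock(self):
        # The application takes ownership of the socket.
        self.owned_by_user = True

    def softirq_rcv(self, pkt):
        # Bottom half: if the application owns the socket, defer the
        # packet to the backlog; otherwise queue it directly.
        if self.owned_by_user:
            self.backlog.append(pkt)
        else:
            self.receive_queue.append(pkt)

    def release(self):
        # The application must drain the backlog before giving up
        # ownership. Returns the number of packets it had to process.
        processed = 0
        while self.backlog:
            self.receive_queue.append(self.backlog.popleft())
            processed += 1
        self.owned_by_user = False
        return processed


def backlog_work(num_segments, segs_per_packet):
    """Packets drained at release time when num_segments MSS-sized
    segments arrive, coalesced segs_per_packet at a time, while the
    application owns the socket."""
    sk = Sock()
    sk.lock()
    for first in range(0, num_segments, segs_per_packet):
        nsegs = min(segs_per_packet, num_segments - first)
        sk.softirq_rcv((first, nsegs))
    return sk.release()
```

Under these assumptions, `backlog_work(51, 1)` drains 51 packets while
`backlog_work(51, 17)` drains 3, so the application spends far less time
in the backlog loop per lock release when segments arrive coalesced.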

> 
> When I say shuffle I mean something along the lines of interleave.  So, 
> if we have four flows, 1-4, a perfect shuffle of their segments would be 
> something like:
> 
> 1 2 3 4 1 2 3 4 1 2 3 4
> 
> but not well shuffled might look like
> 
> 1 1 3 2 3 2 4 4 4 1 3 2
> 

If all these packets are delivered in the same NAPI run, and correctly
aggregated, their order doesn't matter.

In the first case, we will deliver B1, B2, B3, B4 (each B being a GRO
packet of 3 MSS)

In the second case we will deliver

B1 B3 B2 B4
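
The two cases above can be reproduced with a small simulation. This is
only a sketch under simplifying assumptions: `gro_napi_run` is a
hypothetical helper, each segment is represented by its flow number
alone (real GRO also checks headers and sequence continuity), held
packets are flushed at the end of the run in the order their flows first
appeared, and `max_segs=17` stands in for the 16+1 MSS limit mentioned
earlier in the thread.

```python
def gro_napi_run(arrivals, max_segs=17):
    """Coalesce the segments of one NAPI run per flow.

    arrivals is the sequence of flow ids in arrival order. Returns a
    list of (flow, segment_count) tuples in delivery order.
    """
    held = {}       # flow id -> segments held so far (insertion ordered)
    delivered = []
    for flow in arrivals:
        held[flow] = held.get(flow, 0) + 1
        if held[flow] == max_segs:
            # The held packet is full: deliver it immediately.
            delivered.append((flow, held.pop(flow)))
    # End of the NAPI run: flush whatever is still held.
    delivered.extend(held.items())
    return delivered


perfect = [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4]
mixed = [1, 1, 3, 2, 3, 2, 4, 4, 4, 1, 3, 2]

print(gro_napi_run(perfect))  # [(1, 3), (2, 3), (3, 3), (4, 3)]
print(gro_napi_run(mixed))    # [(1, 3), (3, 3), (2, 3), (4, 3)]
```

Both interleavings produce four aggregated packets of 3 MSS each; only
the delivery order across flows differs.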

Thread overview: 41+ messages
2012-09-27 12:48 [PATCH net-next 3/3] ipv4: gre: add GRO capability Eric Dumazet
2012-09-27 17:52 ` Jesse Gross
2012-09-27 18:08   ` Eric Dumazet
2012-09-27 18:19     ` Eric Dumazet
2012-09-27 22:03       ` Jesse Gross
2012-09-28 14:04         ` Eric Dumazet
2012-10-01 20:56           ` Jesse Gross
2012-10-05 14:52             ` [RFC] GRO scalability Eric Dumazet
2012-10-05 18:16               ` Rick Jones
2012-10-05 19:00                 ` Eric Dumazet
2012-10-05 19:35                   ` Rick Jones
2012-10-05 20:06                     ` Eric Dumazet
2012-10-08 16:40                       ` Rick Jones
2012-10-08 16:59                         ` Eric Dumazet [this message]
2012-10-08 17:49                           ` Rick Jones
2012-10-08 17:55                             ` Eric Dumazet
2012-10-08 17:56                               ` Eric Dumazet
2012-10-08 18:58                                 ` [RFC] napi: limit GRO latency Stephen Hemminger
2012-10-08 19:10                                   ` David Miller
2012-10-08 19:12                                     ` Stephen Hemminger
2012-10-08 19:30                                       ` Eric Dumazet
2012-10-08 19:40                                         ` Stephen Hemminger
2012-10-08 19:46                                           ` Eric Dumazet
2012-10-08 19:21                                   ` Eric Dumazet
2012-10-08 18:21                               ` [RFC] GRO scalability Rick Jones
2012-10-08 18:28                                 ` Eric Dumazet
2012-10-06  4:11               ` Herbert Xu
2012-10-06  5:08                 ` Eric Dumazet
2012-10-06  5:14                   ` Herbert Xu
2012-10-06  6:22                     ` Eric Dumazet
2012-10-06  7:00                       ` Eric Dumazet
2012-10-06 10:56                         ` Herbert Xu
2012-10-06 18:08                           ` [PATCH] net: gro: selective flush of packets Eric Dumazet
2012-10-07  0:32                             ` Herbert Xu
2012-10-07  5:29                               ` Eric Dumazet
2012-10-08  7:39                                 ` Eric Dumazet
2012-10-08 16:42                                   ` Rick Jones
2012-10-08 17:10                                     ` Eric Dumazet
2012-10-08 18:52                             ` David Miller
2012-09-27 22:03     ` [PATCH net-next 3/3] ipv4: gre: add GRO capability Jesse Gross
2012-10-01 21:04 ` David Miller
