From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Paolo Abeni <pabeni@redhat.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
	Willem de Bruijn <willemdebruijn.kernel@gmail.com>,
	netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>,
	Tariq Toukan <tariqt@mellanox.com>,
	brouer@redhat.com
Subject: Re: [PATCH net-next 2/2] udp: implement and use per cpu rx skbs cache
Date: Mon, 23 Apr 2018 10:52:03 +0200	[thread overview]
Message-ID: <20180423105203.53600545@redhat.com> (raw)
In-Reply-To: <1524396178.10317.18.camel@redhat.com>

On Sun, 22 Apr 2018 13:22:58 +0200
Paolo Abeni <pabeni@redhat.com> wrote:

> On Fri, 2018-04-20 at 15:48 +0200, Jesper Dangaard Brouer wrote:
> > On Thu, 19 Apr 2018 06:47:10 -0700 Eric Dumazet <eric.dumazet@gmail.com> wrote:  
> > > On 04/19/2018 12:40 AM, Paolo Abeni wrote:  
> > > > On Wed, 2018-04-18 at 12:21 -0700, Eric Dumazet wrote:    
> > > > > On 04/18/2018 10:15 AM, Paolo Abeni wrote:  
> > 
> > [...]  
> > > > 
> > > > Any suggestions for better results are more than welcome!    
> > > 
> > > Yes, remote skb freeing. I mentioned this idea to Jesper and Tariq in
> > > Seoul (netdev conference). Not tied to UDP, but a generic solution.  
> > 
> > Yes, I remember.  I think... was it the idea, where you basically
> > wanted to queue back SKBs to the CPU that allocated them, right?
> > 
> > Freeing an SKB on the same CPU that allocated it has multiple
> > advantages: (1) the SLUB allocator can use a non-atomic
> > "cpu-local" (double) cmpxchg, (2) the 4 memset-cleared cache-lines of
> > the SKB stay local, and (3) the atomic SKB refcnt/users stays local.  
> 
> By the time the skb is returned to the ingress cpu, isn't that skb most
> probably out of the cache?

This is too simplistic a view.  You have to look at the cache
coherence state [1] of the individual cache-lines (the SKB consists of
4 cache-lines).  Newer Intel CPUs [2] can also "Forward (F)"
cache-lines between caches.  The SKB cache-line holding the atomic
refcnt/users is the important one to analyze (the Read For Ownership
(RFO) case).  Analyzing the other cache-lines is actually more
complicated, due to techniques like "Store Buffers" and "Invalidate
Queues".

[1] https://en.wikipedia.org/wiki/MESI_protocol
[2] https://en.wikipedia.org/wiki/MESIF_protocol
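
To illustrate the RFO point, here is a minimal sketch (the helper name
is made up, it is not an existing kernel function): the atomic
decrement of skb->users is a read-modify-write, so the freeing CPU has
to own that cache-line exclusively, no matter where the SKB was
allocated.

#include <linux/skbuff.h>

/* Hypothetical helper, for illustration only -- the real work is done
 * by skb_unref() in include/linux/skbuff.h. */
static inline bool sketch_remote_skb_unref(struct sk_buff *skb)
{
	/* Atomic RMW on skb->users: the freeing CPU must take this
	 * cache-line in Exclusive/Modified state (RFO), invalidating
	 * any copy in the allocating CPU's cache.  Plain reads of the
	 * other SKB cache-lines can be served in Shared/Forward state. */
	return refcount_dec_and_test(&skb->users);
}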

There is also a lot of detail behind point (1), about how the SLUB
allocator works internally and how it avoids bouncing the struct-page
cache-line.  Some of the performance benefit of your current patch
also comes from this...
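
For reference, a rough sketch of that free-path split (loosely modeled
on do_slab_free() in mm/slub.c -- the function name and the empty
branches are placeholders, not real code):

#include <linux/mm.h>
#include <linux/slab.h>

static void sketch_slub_free(struct kmem_cache *s, void *object)
{
	struct page *page = virt_to_head_page(object);
	struct kmem_cache_cpu *c = raw_cpu_ptr(s->cpu_slab);

	if (page == c->page) {
		/* Object belongs to this CPU's active slab: freeing uses
		 * a cpu-local (non LOCK-prefixed) cmpxchg_double on the
		 * per-cpu freelist, and the struct-page cache-line is
		 * not touched. */
	} else {
		/* Remote free: needs an atomic cmpxchg_double on the
		 * struct-page freelist/counters, bouncing that
		 * cache-line between CPUs. */
	}
}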


> > We just have to make sure the queue-back-SKBs mechanism doesn't cost
> > more than the operations we expect to save.  Bulk transfer is an
> > obvious approach.  For storing SKBs until they are returned, we already
> > have a fast mechanism: see napi_consume_skb calling _kfree_skb_defer,
> > which uses SLUB/SLAB bulk-free to amortize cost (1).
> > 
> > I guess, the missing information is that we don't know what CPU the SKB
> > were created on...
> > 
> > Where to store this CPU info?
> > 
> > (a) In struct sk_buff, in a cache-line that is already read on remote
> > CPU in UDP code?
> > 
> > (b) In struct page, as SLUB alloc hand-out objects/SKBs on a per page
> > basis, we could have SLUB store a hint about the CPU it was allocated
> > on, and bet on returning to that CPU ? (might be bad to read the
> > struct-page cache-line)  
> 
> Bulking would be doable only for connected sockets, elsewhere would be
> difficult to assemble a burst long enough to amortize the handshake
> with the remote CPU (spinlock + ipi needed ?!?)

We obviously need some level of bulking.

I would likely try to avoid any explicit IPI calls, and instead use a
queue like ptr_ring, because it has good separation between the
cache-lines used by the consumer and the producer (though it might be
overkill for this use-case).
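
As a rough illustration of what I have in mind (names, ring size and
the fallback policy are all made up, nothing like this exists in the
tree today): each CPU owns a ptr_ring, remote CPUs produce SKBs into
the ring of the allocating CPU, and the owner drains it from its local
processing loop, where the frees hit the cheap cpu-local SLUB path.

#include <linux/ptr_ring.h>
#include <linux/skbuff.h>
#include <linux/percpu.h>

#define SKB_RETURN_RING_SIZE 256		/* made-up size */

/* One return-queue per CPU; ptr_ring_init() at boot is omitted here. */
static DEFINE_PER_CPU(struct ptr_ring, skb_return_ring);

/* Remote CPU: instead of freeing, hand the SKB back to the CPU that
 * allocated it (alloc_cpu would come from wherever we stash the hint). */
static void sketch_queue_skb_back(struct sk_buff *skb, int alloc_cpu)
{
	struct ptr_ring *r = per_cpu_ptr(&skb_return_ring, alloc_cpu);

	/* The producer only dirties the producer cache-lines of the
	 * ring.  If the ring is full, just free normally. */
	if (ptr_ring_produce_any(r, skb))
		kfree_skb(skb);
}

/* Owning CPU: drain the ring from e.g. the NAPI poll loop.  A real
 * implementation would batch these into a bulk free instead of
 * calling __kfree_skb() one by one. */
static void sketch_drain_returned_skbs(void)
{
	struct ptr_ring *r = this_cpu_ptr(&skb_return_ring);
	struct sk_buff *skb;

	while ((skb = ptr_ring_consume_any(r)) != NULL)
		__kfree_skb(skb);
}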

 
> Would be good enough for unconnected sockets sending a whole skb burst
> back to one of the (several) ingress CPU? e.g. peeking the CPU
> associated with the first skb inside the burst, we would somewhat
> balance the load between the ingress CPUs.

See Willem de Bruijn's suggestions...

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


Thread overview: 14+ messages
2018-04-18 10:22 [PATCH net-next 0/2] UDP: introduce RX skb cache Paolo Abeni
2018-04-18 10:22 ` [PATCH net-next 1/2] udp: if the rx queue is full, free the skb in __udp_enqueue_schedule_skb() Paolo Abeni
2018-04-18 10:22 ` [PATCH net-next 2/2] udp: implement and use per cpu rx skbs cache Paolo Abeni
2018-04-18 16:56   ` Eric Dumazet
2018-04-18 17:15     ` Paolo Abeni
2018-04-18 19:21       ` Eric Dumazet
2018-04-19  7:40         ` Paolo Abeni
2018-04-19 13:47           ` Eric Dumazet
2018-04-20 13:48             ` Jesper Dangaard Brouer
2018-04-21 15:54               ` Willem de Bruijn
2018-04-21 16:45                 ` Eric Dumazet
2018-04-22 11:22               ` Paolo Abeni
2018-04-23  8:52                 ` Jesper Dangaard Brouer [this message]
2018-04-23  8:13               ` Tariq Toukan
