All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jakub Kicinski <kuba@kernel.org>
To: Jesper Dangaard Brouer <jbrouer@redhat.com>
Cc: davem@davemloft.net, brouer@redhat.com, netdev@vger.kernel.org,
	edumazet@google.com, pabeni@redhat.com, hawk@kernel.org,
	ilias.apalodimas@linaro.org
Subject: Re: [RFC net-next 1/2] page_pool: allow caching from safely localized NAPI
Date: Fri, 31 Mar 2023 12:06:43 -0700	[thread overview]
Message-ID: <20230331120643.09ce0e59@kernel.org> (raw)
In-Reply-To: <c646d51c-4e91-bd86-9a5b-97b5a1ce33d0@redhat.com>

On Fri, 31 Mar 2023 11:31:31 +0200 Jesper Dangaard Brouer wrote:
> On 31/03/2023 06.39, Jakub Kicinski wrote:
> > With that and patched bnxt (XDP enabled to engage the page pool, sigh,
> > bnxt really needs page pool work :() I see a 2.6% perf boost with
> > a TCP stream test (app on a different physical core than softirq).
> > 
> > The CPU use of relevant functions decreases as expected:
> > 
> >    page_pool_refill_alloc_cache   1.17% -> 0%
> >    _raw_spin_lock                 2.41% -> 0.98%
> > 
> > Only consider lockless path to be safe when NAPI is scheduled
> > - in practice this should cover majority if not all of steady state
> > workloads. It's usually the NAPI kicking in that causes the skb flush.
> 
> Make sense, but do read the comment above struct pp_alloc_cache.
> The sizing of pp_alloc_cache is important for this trick/heuristic to
> work, meaning the pp_alloc_cache have enough room.
> It is definitely on purpose that pp_alloc_cache have 128 elements and is
> only refill'ed with 64 elements, which leaves room for this kind of
> trick.  But if Eric's deferred skb freeing have more than 64 pages to
> free, then we will likely fallback to ptr_ring recycling.
> 
> Code wise, I suggest that you/we change page_pool_put_page_bulk() to
> have a variant that 'allow_direct' (looking at code below, you might
> already do this as this patch over-steer 'allow_direct').  Using the
> bulk API, would then bulk into ptr_ring in the cases we cannot use
> direct cache.

Interesting point, let me re-run some tests with the statistics enabled.
For a simple stream test I think it may just be too steady to trigger
over/underflow. Each skb will carry at most 18 pages, and driver should
only produce 64 packets / consume 64 pages. Each NAPI cycle will start
by flushing the deferred free. So unless there is a hiccup either at the
app or NAPI side - the flows of pages in each direction should be steady
enough to do well with just 128 cache entries. Let me get the data and
report back.

> >   /* If the page refcnt == 1, this will try to recycle the page.
> >    * if PP_FLAG_DMA_SYNC_DEV is set, we'll try to sync the DMA area for
> >    * the configured size min(dma_sync_size, pool->max_len).
> > @@ -570,6 +583,9 @@ __page_pool_put_page(struct page_pool *pool, struct page *page,
> >   			page_pool_dma_sync_for_device(pool, page,
> >   						      dma_sync_size);
> >   
> > +		if (!allow_direct)
> > +			allow_direct = page_pool_safe_producer(pool);
> > +  
> 
> I remember some use-case for veth, that explicitly disables
> "allow_direct".  I cannot remember why exactly, but we need to make sure
> that doesn't break something (as this code can undo the allow_direct).

I can't find anything in veth :( Trying to grep drivers for
page_pool_put / page_pool_recycle problems best I could find is commit
e38553bdc377 ("net: fec: Use page_pool_put_full_page when freeing rx buffers").

  reply	other threads:[~2023-03-31 19:06 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-31  4:39 [RFC net-next 1/2] page_pool: allow caching from safely localized NAPI Jakub Kicinski
2023-03-31  4:39 ` [RFC net-next 2/2] bnxt: hook NAPIs to page pools Jakub Kicinski
2023-03-31  5:15 ` [RFC net-next 1/2] page_pool: allow caching from safely localized NAPI Jakub Kicinski
2023-03-31  9:31 ` Jesper Dangaard Brouer
2023-03-31 19:06   ` Jakub Kicinski [this message]
2023-03-31 22:17     ` Jakub Kicinski
2023-04-03  9:16 ` Ilias Apalodimas
2023-04-03 15:05   ` Jakub Kicinski
2023-04-03 15:30     ` Ilias Apalodimas
2023-04-03 17:12       ` Jakub Kicinski
2023-04-05 17:04         ` Dragos Tatulea
2023-04-04  0:53 ` Yunsheng Lin
2023-04-04  1:45   ` Jakub Kicinski
2023-04-04  3:18     ` Yunsheng Lin
2023-04-04  4:21   ` Eric Dumazet
2023-04-04 10:50     ` Yunsheng Lin
2023-04-05 17:11       ` Eric Dumazet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230331120643.09ce0e59@kernel.org \
    --to=kuba@kernel.org \
    --cc=brouer@redhat.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=hawk@kernel.org \
    --cc=ilias.apalodimas@linaro.org \
    --cc=jbrouer@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.