From: Ilias Apalodimas <ilias.apalodimas@linaro.org>
To: Jakub Kicinski <kuba@kernel.org>
Cc: davem@davemloft.net, netdev@vger.kernel.org, edumazet@google.com,
pabeni@redhat.com, hawk@kernel.org
Subject: Re: [RFC net-next 1/2] page_pool: allow caching from safely localized NAPI
Date: Mon, 3 Apr 2023 12:16:04 +0300
Message-ID: <ZCqZVNvhjLqBh2cv@hera>
In-Reply-To: <20230331043906.3015706-1-kuba@kernel.org>
Hi Jakub
On Thu, Mar 30, 2023 at 09:39:05PM -0700, Jakub Kicinski wrote:
> Recent patches to mlx5 mentioned a regression when moving from
> driver local page pool to only using the generic page pool code.
> Page pool has two recycling paths (1) direct one, which runs in
> safe NAPI context (basically consumer context, so producing
> can be lockless); and (2) via a ptr_ring, which takes a spin
> lock because the freeing can happen from any CPU; producer
> and consumer may run concurrently.
>
> Since the page pool code was added, Eric introduced a revised version
> of deferred skb freeing. TCP skbs are now usually returned to the CPU
> which allocated them, and freed in softirq context. This places the
> freeing (producing of pages back to the pool) enticingly close to
> the allocation (consumer).
>
> If we can prove that we're freeing in the same softirq context in which
> the consumer NAPI will run - lockless use of the cache is perfectly fine,
> no need for the lock.
>
> Let drivers link the page pool to a NAPI instance. If the NAPI instance
> is scheduled on the same CPU on which we're freeing - place the pages
> in the direct cache.
>
> With that and patched bnxt (XDP enabled to engage the page pool, sigh,
> bnxt really needs page pool work :() I see a 2.6% perf boost with
> a TCP stream test (app on a different physical core than softirq).
>
> The CPU use of relevant functions decreases as expected:
>
> page_pool_refill_alloc_cache 1.17% -> 0%
> _raw_spin_lock 2.41% -> 0.98%
>
> Only consider lockless path to be safe when NAPI is scheduled
> - in practice this should cover majority if not all of steady state
> workloads. It's usually the NAPI kicking in that causes the skb flush.
>
> The main case we'll miss out on is when application runs on the same
> CPU as NAPI. In that case we don't use the deferred skb free path.
> We could disable softirq on that path, too... maybe?
This whole thing makes a lot of sense to me.
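A minimal userspace sketch of the idea, not kernel code. All model_* names,
the MODEL_NO_OWNER sentinel and the explicit this_cpu argument are
assumptions made for illustration. The ptr_ring is reduced to a counter, and
the in_softirq() requirement is not modelled at all:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_CACHE_SIZE 4
#define MODEL_NO_OWNER   (-1)	/* assumed sentinel: NAPI not scheduled anywhere */

/* Hypothetical stand-in for the NAPI instance linked to the pool. */
struct model_napi {
	int list_owner;		/* CPU that has the NAPI scheduled */
};

struct model_pool {
	struct model_napi *napi;	/* NULL when the driver linked no NAPI */
	void *cache[MODEL_CACHE_SIZE];	/* lockless, consumer-side cache */
	size_t cache_count;
	size_t ring_count;		/* pages that took the locked ring path */
};

/* Mirrors the check in the patch: safe only if the linked NAPI is
 * scheduled on the CPU doing the freeing.
 */
static bool model_safe_producer(const struct model_pool *pool, int this_cpu)
{
	return pool->napi != NULL && pool->napi->list_owner == this_cpu;
}

/* Returns true if the page went into the lockless cache. */
static bool model_recycle(struct model_pool *pool, void *page, int this_cpu)
{
	assert(pool != NULL && page != NULL);

	if (model_safe_producer(pool, this_cpu) &&
	    pool->cache_count < MODEL_CACHE_SIZE) {
		pool->cache[pool->cache_count++] = page;
		return true;
	}

	/* Real code would take the ptr_ring producer lock here. */
	pool->ring_count++;
	return false;
}
```

In this model a free on the CPU that owns the NAPI lands in the cache, while
a free from any other CPU, or while the NAPI is not scheduled, falls back to
the ring.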
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> CC: hawk@kernel.org
> CC: ilias.apalodimas@linaro.org
[...]
> return true;
> }
>
> +/* If caller didn't allow direct recycling check if we have other reasons
> + * to believe that the producer and consumer can't race.
> + *
> + * Result is only meaningful in softirq context.
> + */
> +static bool page_pool_safe_producer(struct page_pool *pool)
> +{
> + struct napi_struct *napi = pool->p.napi;
> +
> + return napi && READ_ONCE(napi->list_owner) == smp_processor_id();
> +}
> +
> /* If the page refcnt == 1, this will try to recycle the page.
> * if PP_FLAG_DMA_SYNC_DEV is set, we'll try to sync the DMA area for
> * the configured size min(dma_sync_size, pool->max_len).
> @@ -570,6 +583,9 @@ __page_pool_put_page(struct page_pool *pool, struct page *page,
> page_pool_dma_sync_for_device(pool, page,
> dma_sync_size);
>
> + if (!allow_direct)
> + allow_direct = page_pool_safe_producer(pool);
> +
Do we want to hide the decision in __page_pool_put_page()? IOW, wouldn't it
be better for this function to honor whatever allow_direct dictates and
have the allow_direct = page_pool_safe_producer(pool); assignment in the callers?
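Roughly the split in question, as a userspace mock-up rather than a patch.
The sketch_* names, the counters and the explicit cpu argument are all
hypothetical, and which real callers would carry the check is left open:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for napi_struct and page_pool. */
struct sketch_napi {
	int list_owner;
};

struct sketch_pool {
	struct sketch_napi *napi;
	unsigned int direct_hits;	/* pages placed in the direct cache */
	unsigned int ring_hits;		/* pages sent through the ptr_ring */
};

static bool sketch_safe_producer(const struct sketch_pool *pool, int cpu)
{
	return pool->napi != NULL && pool->napi->list_owner == cpu;
}

/* Core helper: no hidden policy, it does exactly what allow_direct says. */
static void sketch_put_page(struct sketch_pool *pool, bool allow_direct)
{
	assert(pool != NULL);

	if (allow_direct)
		pool->direct_hits++;
	else
		pool->ring_hits++;
}

/* Hypothetical caller: upgrades allow_direct itself before calling in. */
static void sketch_caller_put(struct sketch_pool *pool, bool allow_direct,
			      int cpu)
{
	if (!allow_direct)
		allow_direct = sketch_safe_producer(pool, cpu);

	sketch_put_page(pool, allow_direct);
}
```

With this shape a caller that passes false to sketch_put_page() always gets
the ring path, and only callers that opt in see the upgrade.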
Thanks
/Ilias
> if (allow_direct && in_softirq() &&
> page_pool_recycle_in_cache(page, pool))
> return NULL;
> --
> 2.39.2
>
Thread overview: 17+ messages
2023-03-31 4:39 [RFC net-next 1/2] page_pool: allow caching from safely localized NAPI Jakub Kicinski
2023-03-31 4:39 ` [RFC net-next 2/2] bnxt: hook NAPIs to page pools Jakub Kicinski
2023-03-31 5:15 ` [RFC net-next 1/2] page_pool: allow caching from safely localized NAPI Jakub Kicinski
2023-03-31 9:31 ` Jesper Dangaard Brouer
2023-03-31 19:06 ` Jakub Kicinski
2023-03-31 22:17 ` Jakub Kicinski
2023-04-03 9:16 ` Ilias Apalodimas [this message]
2023-04-03 15:05 ` Jakub Kicinski
2023-04-03 15:30 ` Ilias Apalodimas
2023-04-03 17:12 ` Jakub Kicinski
2023-04-05 17:04 ` Dragos Tatulea
2023-04-04 0:53 ` Yunsheng Lin
2023-04-04 1:45 ` Jakub Kicinski
2023-04-04 3:18 ` Yunsheng Lin
2023-04-04 4:21 ` Eric Dumazet
2023-04-04 10:50 ` Yunsheng Lin
2023-04-05 17:11 ` Eric Dumazet