netdev.vger.kernel.org archive mirror
From: Ilias Apalodimas <ilias.apalodimas@linaro.org>
To: Jakub Kicinski <kuba@kernel.org>
Cc: davem@davemloft.net, netdev@vger.kernel.org, edumazet@google.com,
	pabeni@redhat.com, hawk@kernel.org
Subject: Re: [RFC net-next 1/2] page_pool: allow caching from safely localized NAPI
Date: Mon, 3 Apr 2023 12:16:04 +0300
Message-ID: <ZCqZVNvhjLqBh2cv@hera>
In-Reply-To: <20230331043906.3015706-1-kuba@kernel.org>

Hi Jakub

On Thu, Mar 30, 2023 at 09:39:05PM -0700, Jakub Kicinski wrote:
> Recent patches to mlx5 mentioned a regression when moving from a
> driver-local page pool to only using the generic page pool code.
> Page pool has two recycling paths (1) direct one, which runs in
> safe NAPI context (basically consumer context, so producing
> can be lockless); and (2) via a ptr_ring, which takes a spin
> lock because the freeing can happen from any CPU; producer
> and consumer may run concurrently.
>
> Since the page pool code was added, Eric introduced a revised version
> of deferred skb freeing. TCP skbs are now usually returned to the CPU
> which allocated them, and freed in softirq context. This places the
> freeing (producing of pages back to the pool) enticingly close to
> the allocation (consumer).
>
> If we can prove that we're freeing in the same softirq context in which
> the consumer NAPI will run - lockless use of the cache is perfectly fine,
> no need for the lock.
>
> Let drivers link the page pool to a NAPI instance. If the NAPI instance
> is scheduled on the same CPU on which we're freeing - place the pages
> in the direct cache.
>
> With that and patched bnxt (XDP enabled to engage the page pool, sigh,
> bnxt really needs page pool work :() I see a 2.6% perf boost with
> a TCP stream test (app on a different physical core than softirq).
>
> The CPU use of relevant functions decreases as expected:
>
>   page_pool_refill_alloc_cache   1.17% -> 0%
>   _raw_spin_lock                 2.41% -> 0.98%
>
> Only consider the lockless path to be safe when NAPI is scheduled
> - in practice this should cover the majority, if not all, of steady-state
> workloads. It's usually the NAPI kicking in that causes the skb flush.
>
> The main case we'll miss out on is when the application runs on the same
> CPU as NAPI. In that case we don't use the deferred skb free path.
> We could disable softirqs on that path, too... maybe?

This whole thing makes a lot of sense to me.
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> CC: hawk@kernel.org
> CC: ilias.apalodimas@linaro.org

[...]

>  	return true;
>  }
>
> +/* If caller didn't allow direct recycling check if we have other reasons
> + * to believe that the producer and consumer can't race.
> + *
> + * Result is only meaningful in softirq context.
> + */
> +static bool page_pool_safe_producer(struct page_pool *pool)
> +{
> +	struct napi_struct *napi = pool->p.napi;
> +
> +	return napi && READ_ONCE(napi->list_owner) == smp_processor_id();
> +}
> +
>  /* If the page refcnt == 1, this will try to recycle the page.
>   * if PP_FLAG_DMA_SYNC_DEV is set, we'll try to sync the DMA area for
>   * the configured size min(dma_sync_size, pool->max_len).
> @@ -570,6 +583,9 @@ __page_pool_put_page(struct page_pool *pool, struct page *page,
>  			page_pool_dma_sync_for_device(pool, page,
>  						      dma_sync_size);
>
> +		if (!allow_direct)
> +			allow_direct = page_pool_safe_producer(pool);
> +

Do we want to hide the decision in __page_pool_put_page()?  IOW, wouldn't it
be better for this function to honor whatever allow_direct dictates and
have the allow_direct = page_pool_safe_producer(pool); call in the callers?

Thanks
/Ilias
>  		if (allow_direct && in_softirq() &&
>  		    page_pool_recycle_in_cache(page, pool))
>  			return NULL;
> --
> 2.39.2
>


Thread overview: 17+ messages
2023-03-31  4:39 [RFC net-next 1/2] page_pool: allow caching from safely localized NAPI Jakub Kicinski
2023-03-31  4:39 ` [RFC net-next 2/2] bnxt: hook NAPIs to page pools Jakub Kicinski
2023-03-31  5:15 ` [RFC net-next 1/2] page_pool: allow caching from safely localized NAPI Jakub Kicinski
2023-03-31  9:31 ` Jesper Dangaard Brouer
2023-03-31 19:06   ` Jakub Kicinski
2023-03-31 22:17     ` Jakub Kicinski
2023-04-03  9:16 ` Ilias Apalodimas [this message]
2023-04-03 15:05   ` Jakub Kicinski
2023-04-03 15:30     ` Ilias Apalodimas
2023-04-03 17:12       ` Jakub Kicinski
2023-04-05 17:04         ` Dragos Tatulea
2023-04-04  0:53 ` Yunsheng Lin
2023-04-04  1:45   ` Jakub Kicinski
2023-04-04  3:18     ` Yunsheng Lin
2023-04-04  4:21   ` Eric Dumazet
2023-04-04 10:50     ` Yunsheng Lin
2023-04-05 17:11       ` Eric Dumazet
