From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Alexander Lobakin <alobakin@pm.me>
Cc: brouer@redhat.com, "David S. Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	Ilias Apalodimas <ilias.apalodimas@linaro.org>,
	Matteo Croce <mcroce@linux.microsoft.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH net-next] page_pool: let the compiler optimize and inline core functions
Date: Tue, 23 Mar 2021 10:01:38 +0100	[thread overview]
Message-ID: <20210323100138.791a77ce@carbon> (raw)
In-Reply-To: <20210322183047.10768-1-alobakin@pm.me>

On Mon, 22 Mar 2021 18:30:55 +0000
Alexander Lobakin <alobakin@pm.me> wrote:

> As per the discussion in the Page Pool bulk allocator thread [0],
> there are two functions in the Page Pool core code that are marked
> 'noinline'. The reason for this is not so clear, and even if it
> was done to reduce hotpath overhead, in fact it only makes things
> worse.
> As both of these functions are called only once in the code, they
> could be inlined/folded into the non-static entry point. However,
> the 'noinline' marks effectively prevent that and induce totally
> unneeded fragmentation (baseline -> after removal):
> 
> add/remove: 0/3 grow/shrink: 1/0 up/down: 1024/-1096 (-72)
> Function                                     old     new   delta
> page_pool_alloc_pages                        100    1124   +1024
> page_pool_dma_map                            164       -    -164
> page_pool_refill_alloc_cache                 332       -    -332
> __page_pool_alloc_pages_slow                 600       -    -600
> 
> (taken from Mel's branch, hence the factored-out page_pool_dma_map())

I see that the refactoring of page_pool_dma_map() caused it to be
uninlined; that was a mistake.  Thanks for highlighting that again,
as I had forgotten about it (even though I think Alex Duyck did point
this out earlier).
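
Just to make the mechanics concrete for anyone following along, below
is a minimal, self-contained userspace sketch (not page_pool code; the
function names are made up) of what a 'noinline' mark does to a static
helper with a single call site.  Without the attribute the compiler
will typically fold the helper into its caller at -O2; with it, the
helper stays a separate symbol (which is also what makes it visible in
perf-report and objdump), at the cost of an extra call on every
invocation:

#include <stdio.h>

/* Userspace stand-in for the kernel's noinline macro. */
#define noinline __attribute__((__noinline__))

/* Static helper with exactly one caller.  Without 'noinline' GCC/clang
 * will typically fold it into fill(); with 'noinline' it remains a
 * distinct function (and symbol) in the object file.
 */
static noinline int fill_slow(int *buf, int n)
{
	int i;

	for (i = 0; i < n; i++)
		buf[i] = i;
	return n;
}

static int fill(int *buf, int n)
{
	if (n <= 0)
		return 0;		/* "fast" path: nothing to do */
	return fill_slow(buf, n);	/* the single call site */
}

int main(void)
{
	int buf[8];

	printf("filled %d entries\n", fill(buf, 8));
	return 0;
}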

I am considering whether we should allow the compiler to inline
page_pool_refill_alloc_cache + __page_pool_alloc_pages_slow for the
sake of performance, even though I would lose the ability to diagnose
the behavior from perf-report.  Mind that page_pool avoids stats for
the sake of performance, but these noinline marks make it possible to
diagnose the behavior anyway.

> 
> 1124 is a normal hotpath frame size, but these jumps between the tiny
> page_pool_alloc_pages(), page_pool_refill_alloc_cache() and
> __page_pool_alloc_pages_slow() are really redundant and harmful
> to performance.

Well, I disagree. (this is a NACK)

If pages are recycled, then the code never has to visit
__page_pool_alloc_pages_slow().  And today, without the bulk page-alloc
(which we are working on adding together with Mel), we have to visit
__page_pool_alloc_pages_slow() every time, which is a bad design, but
I'm trying to fix that.
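
To make the fast-path/slow-path point concrete, here is a tiny,
self-contained userspace model (not kernel code; every name in it is
made up): as long as buffers get recycled, the "slow" allocator is
visited once and then never again, while without recycling every
single request falls through to it.

#include <stdio.h>
#include <stdlib.h>

struct toy_pool {
	void *cache[64];		/* recycled buffers waiting for reuse */
	int count;
	unsigned long slow_hits;	/* how often we took the slow path */
};

static void *toy_alloc_slow(struct toy_pool *p)
{
	p->slow_hits++;			/* stand-in for the real allocator */
	return malloc(4096);
}

static void *toy_alloc(struct toy_pool *p)
{
	if (p->count)			/* fast path: recycled buffer ready */
		return p->cache[--p->count];
	return toy_alloc_slow(p);	/* slow path */
}

static void toy_recycle(struct toy_pool *p, void *buf)
{
	if (p->count < 64)
		p->cache[p->count++] = buf;
	else
		free(buf);
}

int main(void)
{
	struct toy_pool p = { .count = 0, .slow_hits = 0 };
	int i;

	for (i = 0; i < 1000; i++)
		toy_recycle(&p, toy_alloc(&p));	/* recycling keeps us fast */

	/* Prints 1; it would print 1000 if nothing was ever recycled. */
	printf("slow-path hits: %lu\n", p.slow_hits);

	while (p.count)
		free(p.cache[--p.count]);
	return 0;
}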

Matteo is working on recycling here[1]:
 [1] https://lore.kernel.org/netdev/20210322170301.26017-1-mcroce@linux.microsoft.com/

It would be really great if you could try out his patchset, as it will
help your driver avoid the slow path of the page_pool.  Given you are
very detail oriented, I do want to point out that Matteo's patchset is
only the first step: to really improve performance for page_pool, we
need to bulk-return these page_pool pages (that requires some
restructuring of the core code, which would be confusing at this point).


> This simple removal of the 'noinline' keywords bumps the throughput
> of XDP_PASS + napi_build_skb() + napi_gro_receive() by 25+ Mbps
> on a 1G embedded NIC.
> 
> [0] https://lore.kernel.org/netdev/20210317222506.1266004-1-alobakin@pm.me
> 
> Signed-off-by: Alexander Lobakin <alobakin@pm.me>
> ---
>  net/core/page_pool.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index ad8b0707af04..589e4df6ef2b 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -102,7 +102,6 @@ EXPORT_SYMBOL(page_pool_create);
> 
>  static void page_pool_return_page(struct page_pool *pool, struct page *page);
> 
> -noinline
>  static struct page *page_pool_refill_alloc_cache(struct page_pool *pool)
>  {
>  	struct ptr_ring *r = &pool->ring;
> @@ -181,7 +180,6 @@ static void page_pool_dma_sync_for_device(struct page_pool *pool,
>  }
> 
>  /* slow path */
> -noinline
>  static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
>  						 gfp_t _gfp)
>  {
> --
> 2.31.0
> 
> 



-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

