From: Alexander H Duyck <alexander.duyck@gmail.com>
To: Jesper Dangaard Brouer <hawk@kernel.org>,
	Ratheesh Kannoth <rkannoth@marvell.com>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com,
	 Ilias Apalodimas <ilias.apalodimas@linaro.org>,
	Alexander Lobakin <aleksander.lobakin@intel.com>,
	Yunsheng Lin <linyunsheng@huawei.com>
Subject: Re: [PATCH net-next] page_pool: Clamp ring size to 32K
Date: Mon, 07 Aug 2023 07:18:21 -0700	[thread overview]
Message-ID: <0aa395ee0386b4b470c152b95cc8a0517ee2d2cd.camel@gmail.com> (raw)
In-Reply-To: <b8eb926e-cfc9-b082-5bb9-719be3937c5d@kernel.org>

On Mon, 2023-08-07 at 13:42 +0200, Jesper Dangaard Brouer wrote:
> 
> On 07/08/2023 05.49, Ratheesh Kannoth wrote:
> > https://lore.kernel.org/netdev/20230804133512.4dbbbc16@kernel.org/T/
> > Capping the recycle ring to 32K instead of returning an error.
> > 
> 
> Page pool (PP) is just a cache of pages.  The driver octeontx2 (in the
> link) is creating an excessively large cache of pages.  The driver's
> RX descriptor ring size should be independent of the PP ptr_ring size,
> as the latter is just a cache that grows as a function of the
> in-flight packet workload; it functions as a "shock absorber".
> 
> 32768 pages (4 KiB each) is 128 MiB, and this will be per RX-queue.
> 
> The RX-desc ring (obviously) pins down these pages immediately, but
> the PP ring starts empty.  As the workload varies, the "shock
> absorber" effect will let more pages into the system, and these will
> travel the PP ptr_ring.  As all pages originating from the same PP
> instance get recycled, the in-flight pages in the "system" (PP
> ptr_ring) will grow over time.
> 
> The PP design has the problem that it never releases or reduces the
> pages in this "closed" shock-absorber system.  (Cc. PP people/devel)
> We should consider implementing an MM shrinker callback
> (include/linux/shrinker.h).
> 
> Are the systems using the octeontx2 driver ready to handle 128 MiB of
> memory per RX-queue getting pinned down over time?  (This could lead
> to some strange-to-debug situations if the memory is not sufficient.)
> 
> --Jesper
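
For reference, a rough sketch of the shrinker hookup being suggested
above.  Only struct shrinker, register_shrinker() and DEFAULT_SEEKS are
real API from include/linux/shrinker.h; the pp_* helpers are made-up
placeholders for walking the pool's caches:

  #include <linux/shrinker.h>

  /* Placeholder helpers, not existing page_pool API. */
  static unsigned long pp_shrinker_count(struct shrinker *s,
                                         struct shrink_control *sc)
  {
          /* Report how many pages sit idle in PP caches. */
          return pp_idle_cached_pages();
  }

  static unsigned long pp_shrinker_scan(struct shrinker *s,
                                        struct shrink_control *sc)
  {
          /* Return up to sc->nr_to_scan idle pages to the page
           * allocator, reporting how many were actually freed.
           */
          return pp_release_idle_pages(sc->nr_to_scan);
  }

  static struct shrinker pp_shrinker = {
          .count_objects = pp_shrinker_count,
          .scan_objects  = pp_shrinker_scan,
          .seeks         = DEFAULT_SEEKS,
  };

  /* At init time: register_shrinker(&pp_shrinker, "page_pool"); */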

I'm with Jesper on this. It doesn't make sense to be tying the
page_pool size strictly to the ring size. The amount of recycling you
get will depend on how long the packets are on the stack, not in the
driver.

For example, in the case of something like a software router or bridge
that is just taking the Rx packets and routing them to Tx, you could
theoretically get away with a small multiple of NAPI_POLL_WEIGHT. You
would likely never need much more than that, as the Tx would likely be
cleaned about as fast as the Rx can consume the pages.
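
To make that concrete: with the existing page_pool_create() API a
driver can already size the pool independently of the ring. A sketch,
where pp_setup() and the 4x multiplier are just illustrations:

  #include <linux/netdevice.h>
  #include <net/page_pool.h>

  static struct page_pool *pp_setup(struct device *dev)
  {
          struct page_pool_params pp_params = {
                  .flags     = PP_FLAG_DMA_MAP,
                  .order     = 0,
                  /* Sized for the workload, not the RX ring size. */
                  .pool_size = 4 * NAPI_POLL_WEIGHT,
                  .nid       = NUMA_NO_NODE,
                  .dev       = dev,   /* device doing the DMA mapping */
                  .dma_dir   = DMA_FROM_DEVICE,
          };

          /* Returns a valid pool or an ERR_PTR() to check with IS_ERR(). */
          return page_pool_create(&pp_params);
  }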

Rather than overriding the size here, wouldn't it make more sense to do
it in the octeontx2 driver? With that, at least you would know that you
were the one that limited the size, instead of having the value
modified out from underneath you.

That said, one change that might help enable this would be to add a
#define, so that this value isn't just a magic number and is visible to
the drivers should it ever be changed in the future.
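
Something like the below, say, so the limit has one visible home
instead of being a bare magic number (the name is made up here, and
requested_size stands in for whatever count the driver computes):

  /* include/net/page_pool.h */
  #define PAGE_POOL_MAX_RING_SIZE        32768

  /* octeontx2 (or any driver) can then clamp up front: */
  pp_params.pool_size = min_t(u32, requested_size,
                              PAGE_POOL_MAX_RING_SIZE);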


Thread overview: 7+ messages
2023-08-07  3:49 [PATCH net-next] page_pool: Clamp ring size to 32K Ratheesh Kannoth
2023-08-07 11:42 ` Jesper Dangaard Brouer
2023-08-07 14:18   ` Alexander H Duyck [this message]
2023-08-07 17:20     ` Jakub Kicinski
2023-08-07 20:11       ` Jesper Dangaard Brouer
2023-08-08  2:26         ` [EXT] " Ratheesh Kannoth
2023-08-08 13:29         ` Alexander Lobakin
