From: Jakub Kicinski <kuba@kernel.org>
To: Jesper Dangaard Brouer <hawk@kernel.org>
Cc: Leon Hwang <leon.hwang@linux.dev>,
netdev@vger.kernel.org,
Ilias Apalodimas <ilias.apalodimas@linaro.org>,
Steven Rostedt <rostedt@goodmis.org>,
Masami Hiramatsu <mhiramat@kernel.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
"David S . Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Paolo Abeni <pabeni@redhat.com>, Simon Horman <horms@kernel.org>,
kerneljasonxing@gmail.com, lance.yang@linux.dev,
jiayuan.chen@linux.dev, linux-kernel@vger.kernel.org,
linux-trace-kernel@vger.kernel.org,
Leon Huang Fu <leon.huangfu@shopee.com>,
Dragos Tatulea <dtatulea@nvidia.com>,
kernel-team <kernel-team@cloudflare.com>,
Yan Zhai <yan@cloudflare.com>
Subject: Re: [PATCH net-next v3] page_pool: Add page_pool_release_stalled tracepoint
Date: Sun, 4 Jan 2026 08:43:47 -0800
Message-ID: <20260104084347.5de3a537@kernel.org>
In-Reply-To: <011ca15e-107b-4679-8203-f5f821f27900@kernel.org>
On Fri, 2 Jan 2026 12:43:46 +0100 Jesper Dangaard Brouer wrote:
> On 02/01/2026 08.17, Leon Hwang wrote:
> > Introduce a new tracepoint to track stalled page pool releases,
> > providing better observability for page pool lifecycle issues.
>
> In general I like/support adding this tracepoint for "debugability" of
> page pool lifecycle issues.
>
> For "observability" @Kuba added a netlink scheme[1][2] for page_pool[3],
> which gives us the ability to get events and list page_pools from userspace.
> I've not used this myself (yet) so I need input from others if this is
> something that others have been using for page pool lifecycle issues?
My input here is the least valuable (since one may expect the person
who added the code uses it) - but FWIW yes, we do use the PP stats to
monitor PP lifecycle issues at Meta. That said - we only monitor for
accumulation of leaked memory from orphaned pages, as the whole reason
for adding this code was that in practice the page may be sitting in
a socket rx queue (or defer free queue etc.). IOW, a PP which is not
getting destroyed for a long time is not necessarily a kernel issue.
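
A rough illustration of that kind of monitoring, as a sketch only and not
the actual tooling: it assumes each sample is a list of dicts shaped like
a page-pool-get dump, that a detached (orphaned) pool carries a
"detach-time" attribute, and that "inflight-mem" is the memory it still
holds. The function names and the threshold are hypothetical.

```python
def orphaned_inflight_bytes(pools):
    """Sum 'inflight-mem' over pools that carry a 'detach-time' attribute.

    Assumption: 'detach-time' is only present once the driver has
    detached the pool, so its presence marks the pool as orphaned.
    """
    return sum(p.get("inflight-mem", 0) for p in pools if "detach-time" in p)


def leak_is_accumulating(samples, min_growth_bytes):
    """Return True if orphaned memory never shrinks across the samples
    and grows by at least min_growth_bytes from first to last sample.

    A pool that merely lingers with a stable footprint is not flagged,
    since pages may legitimately sit in a socket rx queue for a while.
    """
    totals = [orphaned_inflight_bytes(s) for s in samples]
    if len(totals) < 2:
        return False
    never_shrinks = all(b >= a for a, b in zip(totals, totals[1:]))
    return never_shrinks and totals[-1] - totals[0] >= min_growth_bytes
```

The point of comparing totals over time rather than alerting on any
detached pool is the one made above: a long-lived detached pool alone is
expected, steady growth of memory held by such pools is not.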
> Need input from @Kuba/others as the "page-pool-get"[4] state that "Only
> Page Pools associated with a net_device can be listed". Don't we want
> the ability to list "invisible" page_pool's to allow debugging issues?
>
> [1] https://docs.kernel.org/userspace-api/netlink/intro-specs.html
> [2] https://docs.kernel.org/userspace-api/netlink/index.html
> [3] https://docs.kernel.org/netlink/specs/netdev.html
> [4] https://docs.kernel.org/netlink/specs/netdev.html#page-pool-get
The documentation should probably be updated :(
I think what I meant is that most _drivers_ didn't link their PP to the
netdev via params when the API was added. So if the user doesn't see the
page pools - the driver is probably not well maintained.
In practice the only page pools which are not accessible / visible via
the API are page pools from already destroyed network namespaces (assuming
their netdevs were also destroyed and not re-parented to init_net).
Which I'd think is a rare case?
> Looking at the code, I see that NETDEV_CMD_PAGE_POOL_CHANGE_NTF netlink
> notification is only generated once (in page_pool_destroy) and not when
> we retry in page_pool_release_retry (like this patch). In that sense,
> this patch/tracepoint is catching something more than netlink provides.
> First I thought we could add a netlink notification, but I can imagine
> cases this could generate too many netlink messages e.g. a netdev with
> 128 RX queues generating these every second for every RX queue.
FWIW yes, we can add more notifications. Tho, as I mentioned at the
start of my reply - the expectation is that page pools waiting for
a long time to be destroyed is something that _will_ happen in
production.
> Guess, I've talked myself into liking this change, what do other
> maintainers think? (e.g. netlink scheme and debugging balance)
We added the Netlink API to mute the pr_warn() in all practical cases.
If Xiang Mei is seeing the pr_warn() I think we should start by asking
what kernel and driver they are using, and what the usage pattern is :(
As I mentioned, most commonly the pr_warn() will trigger because the
driver doesn't link the PP to a netdev.