From: Jakub Kicinski <kuba@kernel.org>
To: davem@davemloft.net
Cc: netdev@vger.kernel.org, edumazet@google.com, pabeni@redhat.com,
hawk@kernel.org, ilias.apalodimas@linaro.org, dsahern@gmail.com,
dtatulea@nvidia.com, willemb@google.com, almasrymina@google.com,
shakeelb@google.com, Jakub Kicinski <kuba@kernel.org>
Subject: [PATCH net-next v4 00/13] net: page_pool: add netlink-based introspection
Date: Sun, 26 Nov 2023 15:07:27 -0800
Message-ID: <20231126230740.2148636-1-kuba@kernel.org>
We recently started to deploy newer kernels / drivers at Meta,
making significant use of page pools for the first time.
We immediately ran into page pool leaks, both real ones and
false-positive warnings. As Eric pointed out / predicted, there's
no guarantee that applications will read / close their sockets,
so a page pool page may be stuck in a socket (but not leaked)
forever. This happens a lot in our fleet. Most of these are
obviously due to application bugs, but we should not be printing
kernel warnings due to minor application resource leaks.
Conversely, page pool memory may get leaked at runtime, and
we have no way to detect / track that unless someone reconfigures
the NIC and destroys the page pools which leaked the pages.
The solution presented here is to expose the memory use of page
pools via netlink. This allows for continuous monitoring of memory
used by page pools, regardless of whether they were destroyed.
The sample in patch 13 can print the memory use and recycling
efficiency:
 $ ./page-pool
    eth0[2]	page pools: 10 (zombies: 0)
		refs: 41984 bytes: 171966464 (refs: 0 bytes: 0)
		recycling: 90.3% (alloc: 656:397681 recycle: 89652:270201)
v4:
- use dev_net(netdev)->loopback_dev
- extend inflight doc
v3: https://lore.kernel.org/all/20231122034420.1158898-1-kuba@kernel.org/
- ID is still here, can't decide if it matters
- rename destroyed -> detach-time, good enough?
- fix build for netsec
v2: https://lore.kernel.org/r/20231121000048.789613-1-kuba@kernel.org
- hopefully fix build with PAGE_POOL=n
v1: https://lore.kernel.org/all/20231024160220.3973311-1-kuba@kernel.org/
- The main change compared to the RFC is that the API now exposes
  outstanding references and byte counts even for "live" page pools.
  The warning is no longer printed if the page pool is accessible
  via netlink.
RFC: https://lore.kernel.org/all/20230816234303.3786178-1-kuba@kernel.org/
Jakub Kicinski (13):
  net: page_pool: factor out uninit
  net: page_pool: id the page pools
  net: page_pool: record pools per netdev
  net: page_pool: stash the NAPI ID for easier access
  eth: link netdev to page_pools in drivers
  net: page_pool: add nlspec for basic access to page pools
  net: page_pool: implement GET in the netlink API
  net: page_pool: add netlink notifications for state changes
  net: page_pool: report amount of memory held by page pools
  net: page_pool: report when page pool was destroyed
  net: page_pool: expose page pool stats via netlink
  net: page_pool: mute the periodic warning for visible page pools
  tools: ynl: add sample for getting page-pool information
 Documentation/netlink/specs/netdev.yaml       | 172 +++++++
 Documentation/networking/page_pool.rst        |  10 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt.c     |   1 +
 .../net/ethernet/mellanox/mlx5/core/en_main.c |   1 +
 drivers/net/ethernet/microsoft/mana/mana_en.c |   1 +
 drivers/net/ethernet/socionext/netsec.c       |   2 +
 include/linux/list.h                          |  20 +
 include/linux/netdevice.h                     |   4 +
 include/linux/poison.h                        |   2 +
 include/net/page_pool/helpers.h               |   8 +-
 include/net/page_pool/types.h                 |  10 +
 include/uapi/linux/netdev.h                   |  36 ++
 net/core/Makefile                             |   2 +-
 net/core/netdev-genl-gen.c                    |  60 +++
 net/core/netdev-genl-gen.h                    |  11 +
 net/core/page_pool.c                          |  69 ++-
 net/core/page_pool_priv.h                     |  12 +
 net/core/page_pool_user.c                     | 408 +++++++++++++++++
 tools/include/uapi/linux/netdev.h             |  36 ++
 tools/net/ynl/generated/netdev-user.c         | 419 ++++++++++++++++++
 tools/net/ynl/generated/netdev-user.h         | 171 +++++++
 tools/net/ynl/lib/ynl.h                       |   2 +-
 tools/net/ynl/samples/.gitignore              |   1 +
 tools/net/ynl/samples/Makefile                |   2 +-
 tools/net/ynl/samples/page-pool.c             | 147 ++++++
 25 files changed, 1574 insertions(+), 33 deletions(-)
 create mode 100644 net/core/page_pool_priv.h
 create mode 100644 net/core/page_pool_user.c
 create mode 100644 tools/net/ynl/samples/page-pool.c
--
2.42.0
Thread overview: 24+ messages
2023-11-26 23:07 Jakub Kicinski [this message]
2023-11-26 23:07 ` [PATCH net-next v4 01/13] net: page_pool: factor out uninit Jakub Kicinski
2023-11-27 6:56 ` Shakeel Butt
2023-11-26 23:07 ` [PATCH net-next v4 02/13] net: page_pool: id the page pools Jakub Kicinski
2023-11-27 7:07 ` Shakeel Butt
2023-11-28 14:47 ` Paolo Abeni
2023-11-26 23:07 ` [PATCH net-next v4 03/13] net: page_pool: record pools per netdev Jakub Kicinski
2023-11-26 23:07 ` [PATCH net-next v4 04/13] net: page_pool: stash the NAPI ID for easier access Jakub Kicinski
2023-11-26 23:07 ` [PATCH net-next v4 05/13] eth: link netdev to page_pools in drivers Jakub Kicinski
2023-11-26 23:07 ` [PATCH net-next v4 06/13] net: page_pool: add nlspec for basic access to page pools Jakub Kicinski
2023-11-26 23:07 ` [PATCH net-next v4 07/13] net: page_pool: implement GET in the netlink API Jakub Kicinski
2023-11-26 23:07 ` [PATCH net-next v4 08/13] net: page_pool: add netlink notifications for state changes Jakub Kicinski
2023-11-26 23:07 ` [PATCH net-next v4 09/13] net: page_pool: report amount of memory held by page pools Jakub Kicinski
2023-11-26 23:07 ` [PATCH net-next v4 10/13] net: page_pool: report when page pool was destroyed Jakub Kicinski
2023-11-26 23:07 ` [PATCH net-next v4 11/13] net: page_pool: expose page pool stats via netlink Jakub Kicinski
2023-11-28 15:01 ` Ilias Apalodimas
2023-11-26 23:07 ` [PATCH net-next v4 12/13] net: page_pool: mute the periodic warning for visible page pools Jakub Kicinski
2023-11-28 15:00 ` Ilias Apalodimas
2023-11-26 23:07 ` [PATCH net-next v4 13/13] tools: ynl: add sample for getting page-pool information Jakub Kicinski
2023-11-28 15:10 ` [PATCH net-next v4 00/13] net: page_pool: add netlink-based introspection patchwork-bot+netdevbpf
2023-11-29 20:10 ` Daniel Golle
2023-11-29 21:12 ` Eric Dumazet
2023-11-30 0:03 ` Jakub Kicinski
2023-11-30 2:09 ` Daniel Golle