* [PATCH net-next 00/15] net: page_pool: add netlink-based introspection
@ 2023-10-24 16:02 Jakub Kicinski
From: Jakub Kicinski @ 2023-10-24 16:02 UTC
  To: davem
  Cc: netdev, edumazet, pabeni, almasrymina, hawk, ilias.apalodimas,
	Jakub Kicinski

This is a new revision of the RFC posted in August:
https://lore.kernel.org/all/20230816234303.3786178-1-kuba@kernel.org/
There have been a handful of fixes and tweaks, but the overall
architecture is unchanged.

As a reminder, the RFC was posted as the first step towards
an API which could configure the page pools (GET API as a stepping
stone for a SET API to come later). I wasn't sure whether we should
commit to the GET API before the SET takes shape, hence the large
delay between versions.

Unfortunately, real deployment experience made this series much more
urgent. We recently started to deploy newer kernels / drivers
at Meta, making significant use of page pools for the first time.
We immediately ran into page pool leak warnings, both real and false
positives. As Eric pointed out / predicted, there's no guarantee that
applications will read / close their sockets so a page pool page
may be stuck in a socket (but not leaked) forever. This happens
a lot in our fleet. Most of these are obviously due to application
bugs but we should not be printing kernel warnings due to minor
application resource leaks.

Conversely, the page pool memory may get leaked at runtime, and
we have no way to detect / track that, unless someone reconfigures
the NIC and destroys the page pools which leaked the pages.

The solution presented here is to expose the memory use of page
pools via netlink. This allows for continuous monitoring of memory
used by page pools, regardless of whether they were destroyed or not.
The sample in patch 15 can print the memory use and recycling
efficiency:

$ ./page-pool
    eth0[2]	page pools: 10 (zombies: 0)
		refs: 41984 bytes: 171966464 (refs: 0 bytes: 0)
		recycling: 90.3% (alloc: 656:397681 recycle: 89652:270201)
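
For readers who want to sanity-check the numbers above, here is a minimal
sketch of one reading that reproduces them. It assumes (this is not stated
in the cover letter) that the recycling percentage is the sum of the two
"recycle" counters divided by the sum of the two "alloc" counters, and that
each outstanding reference accounts for one 4096-byte page. The function
names are hypothetical and are not taken from the sample in patch 15.

```python
# Hypothetical helpers; the formulas are assumptions that happen to
# reproduce the sample output, not code from the patch series.

def recycling_pct(alloc_counts, recycle_counts):
    """Share of allocations served by recycling, in percent.

    Both arguments are pairs of counters as printed by the sample
    ("alloc: a:b", "recycle: c:d"). Which member of each pair is which
    counter does not matter here because only the sums are used.
    """
    allocs = sum(alloc_counts)
    if allocs == 0:
        return 0.0  # avoid dividing by zero for an idle pool
    return 100.0 * sum(recycle_counts) / allocs


def bytes_per_ref(refs, nbytes):
    """Average bytes held per outstanding reference (0 if no refs)."""
    return nbytes // refs if refs else 0


if __name__ == "__main__":
    # Values copied from the sample output above.
    print(f"recycling: {recycling_pct((656, 397681), (89652, 270201)):.1f}%")
    print(f"bytes per ref: {bytes_per_ref(41984, 171966464)}")
```

With the values from the sample output this gives 90.3% and 4096 bytes per
reference, which is consistent with one reference per 4 KiB page.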

The main change compared to the RFC is that the API now exposes
outstanding references and byte counts even for "live" page pools.
The warning is no longer printed if the page pool is accessible via netlink.

Jakub Kicinski (15):
  net: page_pool: split the page_pool_params into fast and slow
  net: page_pool: avoid touching slow on the fastpath
  net: page_pool: factor out uninit
  net: page_pool: id the page pools
  net: page_pool: record pools per netdev
  net: page_pool: stash the NAPI ID for easier access
  eth: link netdev to page_pools in drivers
  net: page_pool: add nlspec for basic access to page pools
  net: page_pool: implement GET in the netlink API
  net: page_pool: add netlink notifications for state changes
  net: page_pool: report amount of memory held by page pools
  net: page_pool: report when page pool was destroyed
  net: page_pool: expose page pool stats via netlink
  net: page_pool: mute the periodic warning for visible page pools
  tools: ynl: add sample for getting page-pool information

 Documentation/netlink/specs/netdev.yaml       | 161 +++++++
 Documentation/networking/page_pool.rst        |  10 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt.c     |   1 +
 .../net/ethernet/mellanox/mlx5/core/en_main.c |   1 +
 drivers/net/ethernet/microsoft/mana/mana_en.c |   1 +
 include/linux/list.h                          |  20 +
 include/linux/netdevice.h                     |   4 +
 include/linux/poison.h                        |   2 +
 include/net/page_pool/helpers.h               |   8 +-
 include/net/page_pool/types.h                 |  43 +-
 include/uapi/linux/netdev.h                   |  36 ++
 net/core/Makefile                             |   2 +-
 net/core/netdev-genl-gen.c                    |  52 +++
 net/core/netdev-genl-gen.h                    |  11 +
 net/core/page_pool.c                          |  78 ++--
 net/core/page_pool_priv.h                     |  12 +
 net/core/page_pool_user.c                     | 414 +++++++++++++++++
 tools/include/uapi/linux/netdev.h             |  36 ++
 tools/net/ynl/generated/netdev-user.c         | 419 ++++++++++++++++++
 tools/net/ynl/generated/netdev-user.h         | 171 +++++++
 tools/net/ynl/lib/ynl.h                       |   2 +-
 tools/net/ynl/samples/.gitignore              |   1 +
 tools/net/ynl/samples/Makefile                |   2 +-
 tools/net/ynl/samples/page-pool.c             | 147 ++++++
 24 files changed, 1586 insertions(+), 48 deletions(-)
 create mode 100644 net/core/page_pool_priv.h
 create mode 100644 net/core/page_pool_user.c
 create mode 100644 tools/net/ynl/samples/page-pool.c

-- 
2.41.0

