linux-kernel.vger.kernel.org archive mirror
* [RFC v3 00/18] Split netmem from struct page
@ 2025-05-29  3:10 Byungchul Park
  2025-05-29  3:10 ` [RFC v3 01/18] netmem: introduce struct netmem_desc mirroring " Byungchul Park
                   ` (18 more replies)
  0 siblings, 19 replies; 28+ messages in thread
From: Byungchul Park @ 2025-05-29  3:10 UTC (permalink / raw)
  To: willy, netdev
  Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
	ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
	andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
	saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
	vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
	vishal.moola

The MM subsystem is trying to reduce struct page to a single pointer.
The first step towards that is splitting struct page by its individual
users, as has already been done with folio and slab.  This patchset does
that for netmem which is used for page pools.

Matthew Wilcox attempted the same work before and stopped; see:

   https://lore.kernel.org/linux-mm/20230111042214.907030-1-willy@infradead.org/

Mina Almasry has already done a lot of the prerequisite work ("by
luck", he said :)).  I stacked my patches on top of his work, i.e.
netmem.

I focused on removing the page pool members in struct page this time,
not moving the allocation code of page pool from net to mm.  It can be
done later if needed.

The final patch removing the page pool fields will be submitted once
all the page-to-netmem conversion work is done:

   1. conversion of libeth_fqe by Tony Nguyen.
   2. conversion of mlx5 by Tariq Toukan.
   3. conversion of prueth_swdata (on me).
   4. conversion of the freescale driver (on me).

For our discussion, I'm sharing below what the final patch will look
like.

	Byungchul
--8<--
commit 86be39ea488df859cff6bc398a364f1dc486f2f9
Author: Byungchul Park <byungchul@sk.com>
Date:   Wed May 28 20:44:55 2025 +0900

    mm, netmem: remove the page pool members in struct page
    
    Now that all the users of the page pool members in struct page are
    gone, the members can be removed from struct page.
    
    However, since struct netmem_desc still uses the space in struct page,
    the important offsets should be checked properly, until struct
    netmem_desc has its own instance from slab.
    
    Remove the page pool members from struct page and adjust the static
    checkers for the offsets.
    
    Signed-off-by: Byungchul Park <byungchul@sk.com>

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 56d07edd01f9..5a7864eb9d76 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -119,17 +119,6 @@ struct page {
 			 */
 			unsigned long private;
 		};
-		struct {	/* page_pool used by netstack */
-			/**
-			 * @pp_magic: magic value to avoid recycling non
-			 * page_pool allocated pages.
-			 */
-			unsigned long pp_magic;
-			struct page_pool *pp;
-			unsigned long _pp_mapping_pad;
-			unsigned long dma_addr;
-			atomic_long_t pp_ref_count;
-		};
 		struct {	/* Tail pages of compound page */
 			unsigned long compound_head;	/* Bit zero is set */
 		};
diff --git a/include/net/netmem.h b/include/net/netmem.h
index 9e4ed3530788..e88e299dd0f0 100644
--- a/include/net/netmem.h
+++ b/include/net/netmem.h
@@ -39,11 +39,8 @@ struct netmem_desc {
 	static_assert(offsetof(struct page, pg) == \
 		      offsetof(struct netmem_desc, desc))
 NETMEM_DESC_ASSERT_OFFSET(flags, _flags);
-NETMEM_DESC_ASSERT_OFFSET(pp_magic, pp_magic);
-NETMEM_DESC_ASSERT_OFFSET(pp, pp);
-NETMEM_DESC_ASSERT_OFFSET(_pp_mapping_pad, _pp_mapping_pad);
-NETMEM_DESC_ASSERT_OFFSET(dma_addr, dma_addr);
-NETMEM_DESC_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
+NETMEM_DESC_ASSERT_OFFSET(lru, pp_magic);
+NETMEM_DESC_ASSERT_OFFSET(mapping, _pp_mapping_pad);
 #undef NETMEM_DESC_ASSERT_OFFSET
 
 /*
---
Changes from v2:
	1. Introduce a netmem API, virt_to_head_netmem(), and use it
	   when it's needed.
	2. Introduce struct netmem_desc as a new struct and union'ed
	   with the existing fields in struct net_iov.
	3. Make page_pool_page_is_pp() access ->pp_magic through struct
	   netmem_desc instead of struct page.
	4. Move netmem alloc APIs from include/net/netmem.h to
	   net/core/netmem_priv.h.
	5. Apply trivial feedback, thanks to Mina, Pavel, and Toke.
	6. Add given 'Reviewed-by's, thanks to Mina.

Changes from v1:
	1. Rebase on net-next's main as of May 26.
	2. Run checkpatch.pl, as suggested by SJ Park.
	3. Add the page-to-netmem conversion in mt76.
	4. Revert 'mlx5: use netmem descriptor and APIs for page pool'
	   since it's on-going by Tariq Toukan.  I will wait for his
	   work to be done.
	5. Revert 'page_pool: use netmem APIs to access page->pp_magic
	   in page_pool_page_is_pp()' since we need more discussion.
	6. Revert 'mm, netmem: remove the page pool members in struct
	   page' since there is some prerequisite work needed to remove
	   the page pool fields from struct page.  I can submit this
	   patch separately later.
	7. Cancel relocating a page pool member in struct page.
	8. Modify the static asserts for the offsets and size of
	   struct netmem_desc.

Changes from rfc:
	1. Rebase on net-next's main branch.
	   https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/
	2. Fix a build error reported by kernel test robot.
	   https://lore.kernel.org/all/202505100932.uzAMBW1y-lkp@intel.com/
	3. Add given 'Reviewed-by's, thanks to Mina and Ilias.
	4. Do static_assert() on the size of struct netmem_desc instead
	   of placing a place-holder in struct page, as suggested by
	   Matthew.
	5. Do struct_group_tagged(netmem_desc) on struct net_iov instead
	   of wholly renaming it to struct netmem_desc, as suggested by
	   Mina and Pavel.

Byungchul Park (18):
  netmem: introduce struct netmem_desc mirroring struct page
  netmem: introduce netmem alloc APIs to wrap page alloc APIs
  page_pool: use netmem alloc/put APIs in __page_pool_alloc_page_order()
  page_pool: rename __page_pool_alloc_page_order() to
    __page_pool_alloc_netmem_order()
  page_pool: use netmem alloc/put APIs in __page_pool_alloc_pages_slow()
  page_pool: rename page_pool_return_page() to page_pool_return_netmem()
  page_pool: use netmem put API in page_pool_return_netmem()
  page_pool: rename __page_pool_release_page_dma() to
    __page_pool_release_netmem_dma()
  page_pool: rename __page_pool_put_page() to __page_pool_put_netmem()
  page_pool: rename __page_pool_alloc_pages_slow() to
    __page_pool_alloc_netmems_slow()
  mlx4: use netmem descriptor and APIs for page pool
  netmem: use _Generic to cover const casting for page_to_netmem()
  netmem: remove __netmem_get_pp()
  page_pool: make page_pool_get_dma_addr() just wrap
    page_pool_get_dma_addr_netmem()
  netdevsim: use netmem descriptor and APIs for page pool
  netmem: introduce a netmem API, virt_to_head_netmem()
  mt76: use netmem descriptor and APIs for page pool
  page_pool: access ->pp_magic through struct netmem_desc in
    page_pool_page_is_pp()

 drivers/net/ethernet/mellanox/mlx4/en_rx.c    |  48 +++---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c    |   8 +-
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h  |   4 +-
 drivers/net/netdevsim/netdev.c                |  19 +--
 drivers/net/netdevsim/netdevsim.h             |   2 +-
 drivers/net/wireless/mediatek/mt76/dma.c      |   6 +-
 drivers/net/wireless/mediatek/mt76/mt76.h     |  12 +-
 .../net/wireless/mediatek/mt76/sdio_txrx.c    |  24 +--
 drivers/net/wireless/mediatek/mt76/usb.c      |  10 +-
 include/linux/mm.h                            |  12 --
 include/net/netmem.h                          | 145 +++++++++++++-----
 include/net/page_pool/helpers.h               |   7 +-
 mm/page_alloc.c                               |   1 +
 net/core/netmem_priv.h                        |  14 ++
 net/core/page_pool.c                          | 101 ++++++------
 15 files changed, 239 insertions(+), 174 deletions(-)


base-commit: d09a8a4ab57849d0401d7c0bc6583e367984d9f7
-- 
2.17.1



* [RFC v3 01/18] netmem: introduce struct netmem_desc mirroring struct page
  2025-05-29  3:10 [RFC v3 00/18] Split netmem from struct page Byungchul Park
@ 2025-05-29  3:10 ` Byungchul Park
  2025-05-29 16:31   ` Mina Almasry
  2025-05-29  3:10 ` [RFC v3 02/18] netmem: introduce netmem alloc APIs to wrap page alloc APIs Byungchul Park
                   ` (17 subsequent siblings)
  18 siblings, 1 reply; 28+ messages in thread
From: Byungchul Park @ 2025-05-29  3:10 UTC (permalink / raw)
  To: willy, netdev
  Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
	ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
	andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
	saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
	vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
	vishal.moola

To simplify struct page, the page pool members of struct page should be
moved elsewhere, allowing them to be removed from struct page.

Introduce a network memory descriptor, struct netmem_desc, to store
those members, and union it with the existing fields in struct net_iov,
which allows the fields of struct net_iov to be reorganized.  The final
form of struct net_iov should look like:

	struct net_iov {
		struct netmem_desc;
		net_field1; /* e.g. struct net_iov_area *owner; */
		net_field2;
		...
	};

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/net/netmem.h | 101 +++++++++++++++++++++++++++++++++----------
 1 file changed, 79 insertions(+), 22 deletions(-)

diff --git a/include/net/netmem.h b/include/net/netmem.h
index 386164fb9c18..d52f86082271 100644
--- a/include/net/netmem.h
+++ b/include/net/netmem.h
@@ -12,6 +12,47 @@
 #include <linux/mm.h>
 #include <net/net_debug.h>
 
+/* These fields in struct page are used by the page_pool and net stack:
+ *
+ *        struct {
+ *                unsigned long pp_magic;
+ *                struct page_pool *pp;
+ *                unsigned long _pp_mapping_pad;
+ *                unsigned long dma_addr;
+ *                atomic_long_t pp_ref_count;
+ *        };
+ *
+ * We mirror the page_pool fields here so the page_pool can access these
+ * fields without worrying whether the underlying fields belong to a
+ * page or netmem_desc.
+ */
+struct netmem_desc {
+	unsigned long _flags;
+	unsigned long pp_magic;
+	struct page_pool *pp;
+	unsigned long _pp_mapping_pad;
+	unsigned long dma_addr;
+	atomic_long_t pp_ref_count;
+};
+
+#define NETMEM_DESC_ASSERT_OFFSET(pg, desc)        \
+	static_assert(offsetof(struct page, pg) == \
+		      offsetof(struct netmem_desc, desc))
+NETMEM_DESC_ASSERT_OFFSET(flags, _flags);
+NETMEM_DESC_ASSERT_OFFSET(pp_magic, pp_magic);
+NETMEM_DESC_ASSERT_OFFSET(pp, pp);
+NETMEM_DESC_ASSERT_OFFSET(_pp_mapping_pad, _pp_mapping_pad);
+NETMEM_DESC_ASSERT_OFFSET(dma_addr, dma_addr);
+NETMEM_DESC_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
+#undef NETMEM_DESC_ASSERT_OFFSET
+
+/*
+ * Since struct netmem_desc uses the space in struct page, the size
+ * should be checked, until struct netmem_desc has its own instance from
+ * slab, to avoid conflicting with other members within struct page.
+ */
+static_assert(sizeof(struct netmem_desc) <= offsetof(struct page, _refcount));
+
 /* net_iov */
 
 DECLARE_STATIC_KEY_FALSE(page_pool_mem_providers);
@@ -31,12 +72,33 @@ enum net_iov_type {
 };
 
 struct net_iov {
-	enum net_iov_type type;
-	unsigned long pp_magic;
-	struct page_pool *pp;
-	struct net_iov_area *owner;
-	unsigned long dma_addr;
-	atomic_long_t pp_ref_count;
+	union {
+		struct netmem_desc desc;
+
+		/* XXX: The following part should be removed once all
+		 * the references to them are converted so as to be
+		 * accessed via netmem_desc e.g. niov->desc.pp instead
+		 * of niov->pp.
+		 *
+		 * Plus, once struct netmem_desc has it own instance
+		 * from slab, network's fields of the following can be
+		 * moved out of struct netmem_desc like:
+		 *
+		 *    struct net_iov {
+		 *       struct netmem_desc desc;
+		 *       struct net_iov_area *owner;
+		 *       ...
+		 *    };
+		 */
+		struct {
+			enum net_iov_type type;
+			unsigned long pp_magic;
+			struct page_pool *pp;
+			struct net_iov_area *owner;
+			unsigned long dma_addr;
+			atomic_long_t pp_ref_count;
+		};
+	};
 };
 
 struct net_iov_area {
@@ -48,27 +110,22 @@ struct net_iov_area {
 	unsigned long base_virtual;
 };
 
-/* These fields in struct page are used by the page_pool and net stack:
+/* net_iov is union'ed with struct netmem_desc mirroring struct page, so
+ * the page_pool can access these fields without worrying whether the
+ * underlying fields are accessed via netmem_desc or directly via
+ * net_iov, until all the references to them are converted so as to be
+ * accessed via netmem_desc e.g. niov->desc.pp instead of niov->pp.
  *
- *        struct {
- *                unsigned long pp_magic;
- *                struct page_pool *pp;
- *                unsigned long _pp_mapping_pad;
- *                unsigned long dma_addr;
- *                atomic_long_t pp_ref_count;
- *        };
- *
- * We mirror the page_pool fields here so the page_pool can access these fields
- * without worrying whether the underlying fields belong to a page or net_iov.
- *
- * The non-net stack fields of struct page are private to the mm stack and must
- * never be mirrored to net_iov.
+ * The non-net stack fields of struct page are private to the mm stack
+ * and must never be mirrored to net_iov.
  */
-#define NET_IOV_ASSERT_OFFSET(pg, iov)             \
-	static_assert(offsetof(struct page, pg) == \
+#define NET_IOV_ASSERT_OFFSET(desc, iov)                    \
+	static_assert(offsetof(struct netmem_desc, desc) == \
 		      offsetof(struct net_iov, iov))
+NET_IOV_ASSERT_OFFSET(_flags, type);
 NET_IOV_ASSERT_OFFSET(pp_magic, pp_magic);
 NET_IOV_ASSERT_OFFSET(pp, pp);
+NET_IOV_ASSERT_OFFSET(_pp_mapping_pad, owner);
 NET_IOV_ASSERT_OFFSET(dma_addr, dma_addr);
 NET_IOV_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
 #undef NET_IOV_ASSERT_OFFSET
-- 
2.17.1



* [RFC v3 02/18] netmem: introduce netmem alloc APIs to wrap page alloc APIs
  2025-05-29  3:10 [RFC v3 00/18] Split netmem from struct page Byungchul Park
  2025-05-29  3:10 ` [RFC v3 01/18] netmem: introduce struct netmem_desc mirroring " Byungchul Park
@ 2025-05-29  3:10 ` Byungchul Park
  2025-05-29  3:10 ` [RFC v3 03/18] page_pool: use netmem alloc/put APIs in __page_pool_alloc_page_order() Byungchul Park
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 28+ messages in thread
From: Byungchul Park @ 2025-05-29  3:10 UTC (permalink / raw)
  To: willy, netdev
  Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
	ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
	andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
	saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
	vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
	vishal.moola

To eliminate the use of struct page in the page pool, the page pool
code should use the netmem descriptor and APIs instead.

As part of that work, introduce netmem alloc APIs so the code can use
them rather than the existing APIs for struct page.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 net/core/netmem_priv.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/net/core/netmem_priv.h b/net/core/netmem_priv.h
index cd95394399b4..32e390908bb2 100644
--- a/net/core/netmem_priv.h
+++ b/net/core/netmem_priv.h
@@ -59,4 +59,18 @@ static inline void netmem_set_dma_index(netmem_ref netmem,
 	magic = netmem_get_pp_magic(netmem) | (id << PP_DMA_INDEX_SHIFT);
 	__netmem_clear_lsb(netmem)->pp_magic = magic;
 }
+
+static inline netmem_ref alloc_netmems_node(int nid, gfp_t gfp_mask,
+					    unsigned int order)
+{
+	return page_to_netmem(alloc_pages_node(nid, gfp_mask, order));
+}
+
+static inline unsigned long alloc_netmems_bulk_node(gfp_t gfp, int nid,
+						    unsigned long nr_netmems,
+						    netmem_ref *netmem_array)
+{
+	return alloc_pages_bulk_node(gfp, nid, nr_netmems,
+			(struct page **)netmem_array);
+}
 #endif
-- 
2.17.1



* [RFC v3 03/18] page_pool: use netmem alloc/put APIs in __page_pool_alloc_page_order()
  2025-05-29  3:10 [RFC v3 00/18] Split netmem from struct page Byungchul Park
  2025-05-29  3:10 ` [RFC v3 01/18] netmem: introduce struct netmem_desc mirroring " Byungchul Park
  2025-05-29  3:10 ` [RFC v3 02/18] netmem: introduce netmem alloc APIs to wrap page alloc APIs Byungchul Park
@ 2025-05-29  3:10 ` Byungchul Park
  2025-05-29  3:10 ` [RFC v3 04/18] page_pool: rename __page_pool_alloc_page_order() to __page_pool_alloc_netmem_order() Byungchul Park
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 28+ messages in thread
From: Byungchul Park @ 2025-05-29  3:10 UTC (permalink / raw)
  To: willy, netdev
  Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
	ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
	andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
	saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
	vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
	vishal.moola

Use the netmem alloc/put APIs instead of the page alloc/put APIs in
__page_pool_alloc_page_order(), and make it return netmem_ref instead
of struct page *.

Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
---
 net/core/page_pool.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 974f3eef2efa..e101c39d65c7 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -518,29 +518,29 @@ static bool page_pool_dma_map(struct page_pool *pool, netmem_ref netmem, gfp_t g
 	return false;
 }
 
-static struct page *__page_pool_alloc_page_order(struct page_pool *pool,
-						 gfp_t gfp)
+static netmem_ref __page_pool_alloc_page_order(struct page_pool *pool,
+					       gfp_t gfp)
 {
-	struct page *page;
+	netmem_ref netmem;
 
 	gfp |= __GFP_COMP;
-	page = alloc_pages_node(pool->p.nid, gfp, pool->p.order);
-	if (unlikely(!page))
-		return NULL;
+	netmem = alloc_netmems_node(pool->p.nid, gfp, pool->p.order);
+	if (unlikely(!netmem))
+		return 0;
 
-	if (pool->dma_map && unlikely(!page_pool_dma_map(pool, page_to_netmem(page), gfp))) {
-		put_page(page);
-		return NULL;
+	if (pool->dma_map && unlikely(!page_pool_dma_map(pool, netmem, gfp))) {
+		put_netmem(netmem);
+		return 0;
 	}
 
 	alloc_stat_inc(pool, slow_high_order);
-	page_pool_set_pp_info(pool, page_to_netmem(page));
+	page_pool_set_pp_info(pool, netmem);
 
 	/* Track how many pages are held 'in-flight' */
 	pool->pages_state_hold_cnt++;
-	trace_page_pool_state_hold(pool, page_to_netmem(page),
+	trace_page_pool_state_hold(pool, netmem,
 				   pool->pages_state_hold_cnt);
-	return page;
+	return netmem;
 }
 
 /* slow path */
@@ -555,7 +555,7 @@ static noinline netmem_ref __page_pool_alloc_pages_slow(struct page_pool *pool,
 
 	/* Don't support bulk alloc for high-order pages */
 	if (unlikely(pp_order))
-		return page_to_netmem(__page_pool_alloc_page_order(pool, gfp));
+		return __page_pool_alloc_page_order(pool, gfp);
 
 	/* Unnecessary as alloc cache is empty, but guarantees zero count */
 	if (unlikely(pool->alloc.count > 0))
-- 
2.17.1



* [RFC v3 04/18] page_pool: rename __page_pool_alloc_page_order() to __page_pool_alloc_netmem_order()
  2025-05-29  3:10 [RFC v3 00/18] Split netmem from struct page Byungchul Park
                   ` (2 preceding siblings ...)
  2025-05-29  3:10 ` [RFC v3 03/18] page_pool: use netmem alloc/put APIs in __page_pool_alloc_page_order() Byungchul Park
@ 2025-05-29  3:10 ` Byungchul Park
  2025-05-29  3:10 ` [RFC v3 05/18] page_pool: use netmem alloc/put APIs in __page_pool_alloc_pages_slow() Byungchul Park
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 28+ messages in thread
From: Byungchul Park @ 2025-05-29  3:10 UTC (permalink / raw)
  To: willy, netdev
  Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
	ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
	andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
	saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
	vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
	vishal.moola

Now that __page_pool_alloc_page_order() uses netmem alloc/put APIs, not
page alloc/put APIs, rename it to __page_pool_alloc_netmem_order() to
reflect what it does.

Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
---
 net/core/page_pool.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index e101c39d65c7..a44acdb16a9a 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -518,8 +518,8 @@ static bool page_pool_dma_map(struct page_pool *pool, netmem_ref netmem, gfp_t g
 	return false;
 }
 
-static netmem_ref __page_pool_alloc_page_order(struct page_pool *pool,
-					       gfp_t gfp)
+static netmem_ref __page_pool_alloc_netmem_order(struct page_pool *pool,
+						 gfp_t gfp)
 {
 	netmem_ref netmem;
 
@@ -555,7 +555,7 @@ static noinline netmem_ref __page_pool_alloc_pages_slow(struct page_pool *pool,
 
 	/* Don't support bulk alloc for high-order pages */
 	if (unlikely(pp_order))
-		return __page_pool_alloc_page_order(pool, gfp);
+		return __page_pool_alloc_netmem_order(pool, gfp);
 
 	/* Unnecessary as alloc cache is empty, but guarantees zero count */
 	if (unlikely(pool->alloc.count > 0))
-- 
2.17.1



* [RFC v3 05/18] page_pool: use netmem alloc/put APIs in __page_pool_alloc_pages_slow()
  2025-05-29  3:10 [RFC v3 00/18] Split netmem from struct page Byungchul Park
                   ` (3 preceding siblings ...)
  2025-05-29  3:10 ` [RFC v3 04/18] page_pool: rename __page_pool_alloc_page_order() to __page_pool_alloc_netmem_order() Byungchul Park
@ 2025-05-29  3:10 ` Byungchul Park
  2025-05-29  3:10 ` [RFC v3 06/18] page_pool: rename page_pool_return_page() to page_pool_return_netmem() Byungchul Park
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 28+ messages in thread
From: Byungchul Park @ 2025-05-29  3:10 UTC (permalink / raw)
  To: willy, netdev
  Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
	ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
	andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
	saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
	vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
	vishal.moola

Use netmem alloc/put APIs instead of page alloc/put APIs in
__page_pool_alloc_pages_slow().

While at it, improve some comments.

Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
---
 net/core/page_pool.c | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index a44acdb16a9a..0e7a336aafdf 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -551,7 +551,7 @@ static noinline netmem_ref __page_pool_alloc_pages_slow(struct page_pool *pool,
 	unsigned int pp_order = pool->p.order;
 	bool dma_map = pool->dma_map;
 	netmem_ref netmem;
-	int i, nr_pages;
+	int i, nr_netmems;
 
 	/* Don't support bulk alloc for high-order pages */
 	if (unlikely(pp_order))
@@ -561,21 +561,21 @@ static noinline netmem_ref __page_pool_alloc_pages_slow(struct page_pool *pool,
 	if (unlikely(pool->alloc.count > 0))
 		return pool->alloc.cache[--pool->alloc.count];
 
-	/* Mark empty alloc.cache slots "empty" for alloc_pages_bulk */
+	/* Mark empty alloc.cache slots "empty" for alloc_netmems_bulk_node() */
 	memset(&pool->alloc.cache, 0, sizeof(void *) * bulk);
 
-	nr_pages = alloc_pages_bulk_node(gfp, pool->p.nid, bulk,
-					 (struct page **)pool->alloc.cache);
-	if (unlikely(!nr_pages))
+	nr_netmems = alloc_netmems_bulk_node(gfp, pool->p.nid, bulk,
+					     pool->alloc.cache);
+	if (unlikely(!nr_netmems))
 		return 0;
 
-	/* Pages have been filled into alloc.cache array, but count is zero and
-	 * page element have not been (possibly) DMA mapped.
+	/* Netmems have been filled into alloc.cache array, but count is
+	 * zero and elements have not been (possibly) DMA mapped.
 	 */
-	for (i = 0; i < nr_pages; i++) {
+	for (i = 0; i < nr_netmems; i++) {
 		netmem = pool->alloc.cache[i];
 		if (dma_map && unlikely(!page_pool_dma_map(pool, netmem, gfp))) {
-			put_page(netmem_to_page(netmem));
+			put_netmem(netmem);
 			continue;
 		}
 
@@ -587,7 +587,7 @@ static noinline netmem_ref __page_pool_alloc_pages_slow(struct page_pool *pool,
 					   pool->pages_state_hold_cnt);
 	}
 
-	/* Return last page */
+	/* Return the last netmem */
 	if (likely(pool->alloc.count > 0)) {
 		netmem = pool->alloc.cache[--pool->alloc.count];
 		alloc_stat_inc(pool, slow);
@@ -595,7 +595,9 @@ static noinline netmem_ref __page_pool_alloc_pages_slow(struct page_pool *pool,
 		netmem = 0;
 	}
 
-	/* When page just alloc'ed is should/must have refcnt 1. */
+	/* When a netmem has been just allocated, it should/must have
+	 * refcnt 1.
+	 */
 	return netmem;
 }
 
-- 
2.17.1



* [RFC v3 06/18] page_pool: rename page_pool_return_page() to page_pool_return_netmem()
  2025-05-29  3:10 [RFC v3 00/18] Split netmem from struct page Byungchul Park
                   ` (4 preceding siblings ...)
  2025-05-29  3:10 ` [RFC v3 05/18] page_pool: use netmem alloc/put APIs in __page_pool_alloc_pages_slow() Byungchul Park
@ 2025-05-29  3:10 ` Byungchul Park
  2025-05-29  3:10 ` [RFC v3 07/18] page_pool: use netmem put API in page_pool_return_netmem() Byungchul Park
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 28+ messages in thread
From: Byungchul Park @ 2025-05-29  3:10 UTC (permalink / raw)
  To: willy, netdev
  Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
	ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
	andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
	saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
	vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
	vishal.moola

Now that page_pool_return_page() returns netmem, not struct page,
rename it to page_pool_return_netmem() to reflect what it does.

Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
---
 net/core/page_pool.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 0e7a336aafdf..633e10196de5 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -371,7 +371,7 @@ struct page_pool *page_pool_create(const struct page_pool_params *params)
 }
 EXPORT_SYMBOL(page_pool_create);
 
-static void page_pool_return_page(struct page_pool *pool, netmem_ref netmem);
+static void page_pool_return_netmem(struct page_pool *pool, netmem_ref netmem);
 
 static noinline netmem_ref page_pool_refill_alloc_cache(struct page_pool *pool)
 {
@@ -409,7 +409,7 @@ static noinline netmem_ref page_pool_refill_alloc_cache(struct page_pool *pool)
 			 * (2) break out to fallthrough to alloc_pages_node.
 			 * This limit stress on page buddy alloactor.
 			 */
-			page_pool_return_page(pool, netmem);
+			page_pool_return_netmem(pool, netmem);
 			alloc_stat_inc(pool, waive);
 			netmem = 0;
 			break;
@@ -714,7 +714,7 @@ static __always_inline void __page_pool_release_page_dma(struct page_pool *pool,
  * a regular page (that will eventually be returned to the normal
  * page-allocator via put_page).
  */
-void page_pool_return_page(struct page_pool *pool, netmem_ref netmem)
+static void page_pool_return_netmem(struct page_pool *pool, netmem_ref netmem)
 {
 	int count;
 	bool put;
@@ -831,7 +831,7 @@ __page_pool_put_page(struct page_pool *pool, netmem_ref netmem,
 	 * will be invoking put_page.
 	 */
 	recycle_stat_inc(pool, released_refcnt);
-	page_pool_return_page(pool, netmem);
+	page_pool_return_netmem(pool, netmem);
 
 	return 0;
 }
@@ -874,7 +874,7 @@ void page_pool_put_unrefed_netmem(struct page_pool *pool, netmem_ref netmem,
 	if (netmem && !page_pool_recycle_in_ring(pool, netmem)) {
 		/* Cache full, fallback to free pages */
 		recycle_stat_inc(pool, ring_full);
-		page_pool_return_page(pool, netmem);
+		page_pool_return_netmem(pool, netmem);
 	}
 }
 EXPORT_SYMBOL(page_pool_put_unrefed_netmem);
@@ -917,7 +917,7 @@ static void page_pool_recycle_ring_bulk(struct page_pool *pool,
 	 * since put_page() with refcnt == 1 can be an expensive operation.
 	 */
 	for (; i < bulk_len; i++)
-		page_pool_return_page(pool, bulk[i]);
+		page_pool_return_netmem(pool, bulk[i]);
 }
 
 /**
@@ -1000,7 +1000,7 @@ static netmem_ref page_pool_drain_frag(struct page_pool *pool,
 		return netmem;
 	}
 
-	page_pool_return_page(pool, netmem);
+	page_pool_return_netmem(pool, netmem);
 	return 0;
 }
 
@@ -1014,7 +1014,7 @@ static void page_pool_free_frag(struct page_pool *pool)
 	if (!netmem || page_pool_unref_netmem(netmem, drain_count))
 		return;
 
-	page_pool_return_page(pool, netmem);
+	page_pool_return_netmem(pool, netmem);
 }
 
 netmem_ref page_pool_alloc_frag_netmem(struct page_pool *pool,
@@ -1081,7 +1081,7 @@ static void page_pool_empty_ring(struct page_pool *pool)
 			pr_crit("%s() page_pool refcnt %d violation\n",
 				__func__, netmem_ref_count(netmem));
 
-		page_pool_return_page(pool, netmem);
+		page_pool_return_netmem(pool, netmem);
 	}
 }
 
@@ -1114,7 +1114,7 @@ static void page_pool_empty_alloc_cache_once(struct page_pool *pool)
 	 */
 	while (pool->alloc.count) {
 		netmem = pool->alloc.cache[--pool->alloc.count];
-		page_pool_return_page(pool, netmem);
+		page_pool_return_netmem(pool, netmem);
 	}
 }
 
@@ -1254,7 +1254,7 @@ void page_pool_update_nid(struct page_pool *pool, int new_nid)
 	/* Flush pool alloc cache, as refill will check NUMA node */
 	while (pool->alloc.count) {
 		netmem = pool->alloc.cache[--pool->alloc.count];
-		page_pool_return_page(pool, netmem);
+		page_pool_return_netmem(pool, netmem);
 	}
 }
 EXPORT_SYMBOL(page_pool_update_nid);
-- 
2.17.1



* [RFC v3 07/18] page_pool: use netmem put API in page_pool_return_netmem()
  2025-05-29  3:10 [RFC v3 00/18] Split netmem from struct page Byungchul Park
                   ` (5 preceding siblings ...)
  2025-05-29  3:10 ` [RFC v3 06/18] page_pool: rename page_pool_return_page() to page_pool_return_netmem() Byungchul Park
@ 2025-05-29  3:10 ` Byungchul Park
  2025-05-29  3:10 ` [RFC v3 08/18] page_pool: rename __page_pool_release_page_dma() to __page_pool_release_netmem_dma() Byungchul Park
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 28+ messages in thread
From: Byungchul Park @ 2025-05-29  3:10 UTC (permalink / raw)
  To: willy, netdev
  Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
	ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
	andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
	saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
	vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
	vishal.moola

Use the netmem put API, put_netmem(), instead of put_page() in
page_pool_return_netmem().

While at it, delete #include <linux/mm.h> since the last put_page()
call in page_pool.c has just been removed by this patch.

Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
---
 net/core/page_pool.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 633e10196de5..4368beda1e08 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -20,7 +20,6 @@
 #include <linux/dma-direction.h>
 #include <linux/dma-mapping.h>
 #include <linux/page-flags.h>
-#include <linux/mm.h> /* for put_page() */
 #include <linux/poison.h>
 #include <linux/ethtool.h>
 #include <linux/netdevice.h>
@@ -712,7 +711,7 @@ static __always_inline void __page_pool_release_page_dma(struct page_pool *pool,
 /* Disconnects a page (from a page_pool).  API users can have a need
  * to disconnect a page (from a page_pool), to allow it to be used as
  * a regular page (that will eventually be returned to the normal
- * page-allocator via put_page).
+ * page-allocator via put_netmem()).
  */
 static void page_pool_return_netmem(struct page_pool *pool, netmem_ref netmem)
 {
@@ -733,7 +732,7 @@ static void page_pool_return_netmem(struct page_pool *pool, netmem_ref netmem)
 
 	if (put) {
 		page_pool_clear_pp_info(netmem);
-		put_page(netmem_to_page(netmem));
+		put_netmem(netmem);
 	}
 	/* An optimization would be to call __free_pages(page, pool->p.order)
 	 * knowing page is not part of page-cache (thus avoiding a
-- 
2.17.1



* [RFC v3 08/18] page_pool: rename __page_pool_release_page_dma() to __page_pool_release_netmem_dma()
  2025-05-29  3:10 [RFC v3 00/18] Split netmem from struct page Byungchul Park
                   ` (6 preceding siblings ...)
  2025-05-29  3:10 ` [RFC v3 07/18] page_pool: use netmem put API in page_pool_return_netmem() Byungchul Park
@ 2025-05-29  3:10 ` Byungchul Park
  2025-05-29  3:10 ` [RFC v3 09/18] page_pool: rename __page_pool_put_page() to __page_pool_put_netmem() Byungchul Park
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 28+ messages in thread
From: Byungchul Park @ 2025-05-29  3:10 UTC (permalink / raw)
  To: willy, netdev
  Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
	ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
	andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
	saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
	vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
	vishal.moola

Now that __page_pool_release_page_dma() is for releasing netmem, not
struct page, rename it to __page_pool_release_netmem_dma() to reflect
what it does.

Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
---
 net/core/page_pool.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 4368beda1e08..0a5e008df744 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -674,8 +674,8 @@ void page_pool_clear_pp_info(netmem_ref netmem)
 	netmem_set_pp(netmem, NULL);
 }
 
-static __always_inline void __page_pool_release_page_dma(struct page_pool *pool,
-							 netmem_ref netmem)
+static __always_inline void __page_pool_release_netmem_dma(struct page_pool *pool,
+							   netmem_ref netmem)
 {
 	struct page *old, *page = netmem_to_page(netmem);
 	unsigned long id;
@@ -722,7 +722,7 @@ static void page_pool_return_netmem(struct page_pool *pool, netmem_ref netmem)
 	if (static_branch_unlikely(&page_pool_mem_providers) && pool->mp_ops)
 		put = pool->mp_ops->release_netmem(pool, netmem);
 	else
-		__page_pool_release_page_dma(pool, netmem);
+		__page_pool_release_netmem_dma(pool, netmem);
 
 	/* This may be the last page returned, releasing the pool, so
 	 * it is not safe to reference pool afterwards.
@@ -1140,7 +1140,7 @@ static void page_pool_scrub(struct page_pool *pool)
 		}
 
 		xa_for_each(&pool->dma_mapped, id, ptr)
-			__page_pool_release_page_dma(pool, page_to_netmem(ptr));
+			__page_pool_release_netmem_dma(pool, page_to_netmem((struct page *)ptr));
 	}
 
 	/* No more consumers should exist, but producers could still
-- 
2.17.1



* [RFC v3 09/18] page_pool: rename __page_pool_put_page() to __page_pool_put_netmem()
  2025-05-29  3:10 [RFC v3 00/18] Split netmem from struct page Byungchul Park
                   ` (7 preceding siblings ...)
  2025-05-29  3:10 ` [RFC v3 08/18] page_pool: rename __page_pool_release_page_dma() to __page_pool_release_netmem_dma() Byungchul Park
@ 2025-05-29  3:10 ` Byungchul Park
  2025-05-29  3:10 ` [RFC v3 10/18] page_pool: rename __page_pool_alloc_pages_slow() to __page_pool_alloc_netmems_slow() Byungchul Park
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 28+ messages in thread
From: Byungchul Park @ 2025-05-29  3:10 UTC (permalink / raw)
  To: willy, netdev
  Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
	ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
	andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
	saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
	vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
	vishal.moola

Now that __page_pool_put_page() puts netmem, not struct page, rename it
to __page_pool_put_netmem() to reflect what it does.

Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
---
 net/core/page_pool.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 0a5e008df744..9eae57e47112 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -790,8 +790,8 @@ static bool __page_pool_page_can_be_recycled(netmem_ref netmem)
  * subsystem.
  */
 static __always_inline netmem_ref
-__page_pool_put_page(struct page_pool *pool, netmem_ref netmem,
-		     unsigned int dma_sync_size, bool allow_direct)
+__page_pool_put_netmem(struct page_pool *pool, netmem_ref netmem,
+		       unsigned int dma_sync_size, bool allow_direct)
 {
 	lockdep_assert_no_hardirq();
 
@@ -850,7 +850,7 @@ static bool page_pool_napi_local(const struct page_pool *pool)
 	/* Allow direct recycle if we have reasons to believe that we are
 	 * in the same context as the consumer would run, so there's
 	 * no possible race.
-	 * __page_pool_put_page() makes sure we're not in hardirq context
+	 * __page_pool_put_netmem() makes sure we're not in hardirq context
 	 * and interrupts are enabled prior to accessing the cache.
 	 */
 	cpuid = smp_processor_id();
@@ -869,7 +869,7 @@ void page_pool_put_unrefed_netmem(struct page_pool *pool, netmem_ref netmem,
 		allow_direct = page_pool_napi_local(pool);
 
 	netmem =
-		__page_pool_put_page(pool, netmem, dma_sync_size, allow_direct);
+		__page_pool_put_netmem(pool, netmem, dma_sync_size, allow_direct);
 	if (netmem && !page_pool_recycle_in_ring(pool, netmem)) {
 		/* Cache full, fallback to free pages */
 		recycle_stat_inc(pool, ring_full);
@@ -970,8 +970,8 @@ void page_pool_put_netmem_bulk(netmem_ref *data, u32 count)
 				continue;
 			}
 
-			netmem = __page_pool_put_page(pool, netmem, -1,
-						      allow_direct);
+			netmem = __page_pool_put_netmem(pool, netmem, -1,
+							allow_direct);
 			/* Approved for bulk recycling in ptr_ring cache */
 			if (netmem)
 				bulk[bulk_len++] = netmem;
-- 
2.17.1



* [RFC v3 10/18] page_pool: rename __page_pool_alloc_pages_slow() to __page_pool_alloc_netmems_slow()
  2025-05-29  3:10 [RFC v3 00/18] Split netmem from struct page Byungchul Park
                   ` (8 preceding siblings ...)
  2025-05-29  3:10 ` [RFC v3 09/18] page_pool: rename __page_pool_put_page() to __page_pool_put_netmem() Byungchul Park
@ 2025-05-29  3:10 ` Byungchul Park
  2025-05-29  3:10 ` [RFC v3 11/18] mlx4: use netmem descriptor and APIs for page pool Byungchul Park
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 28+ messages in thread
From: Byungchul Park @ 2025-05-29  3:10 UTC (permalink / raw)
  To: willy, netdev
  Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
	ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
	andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
	saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
	vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
	vishal.moola

Now that __page_pool_alloc_pages_slow() is for allocating netmem, not
struct page, rename it to __page_pool_alloc_netmems_slow() to reflect
what it does.

Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
---
 net/core/page_pool.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 9eae57e47112..11d759302d19 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -543,8 +543,8 @@ static netmem_ref __page_pool_alloc_netmem_order(struct page_pool *pool,
 }
 
 /* slow path */
-static noinline netmem_ref __page_pool_alloc_pages_slow(struct page_pool *pool,
-							gfp_t gfp)
+static noinline netmem_ref __page_pool_alloc_netmems_slow(struct page_pool *pool,
+							  gfp_t gfp)
 {
 	const int bulk = PP_ALLOC_CACHE_REFILL;
 	unsigned int pp_order = pool->p.order;
@@ -616,7 +616,7 @@ netmem_ref page_pool_alloc_netmems(struct page_pool *pool, gfp_t gfp)
 	if (static_branch_unlikely(&page_pool_mem_providers) && pool->mp_ops)
 		netmem = pool->mp_ops->alloc_netmems(pool, gfp);
 	else
-		netmem = __page_pool_alloc_pages_slow(pool, gfp);
+		netmem = __page_pool_alloc_netmems_slow(pool, gfp);
 	return netmem;
 }
 EXPORT_SYMBOL(page_pool_alloc_netmems);
-- 
2.17.1



* [RFC v3 11/18] mlx4: use netmem descriptor and APIs for page pool
  2025-05-29  3:10 [RFC v3 00/18] Split netmem from struct page Byungchul Park
                   ` (9 preceding siblings ...)
  2025-05-29  3:10 ` [RFC v3 10/18] page_pool: rename __page_pool_alloc_pages_slow() to __page_pool_alloc_netmems_slow() Byungchul Park
@ 2025-05-29  3:10 ` Byungchul Park
  2025-05-29  3:10 ` [RFC v3 12/18] netmem: use _Generic to cover const casting for page_to_netmem() Byungchul Park
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 28+ messages in thread
From: Byungchul Park @ 2025-05-29  3:10 UTC (permalink / raw)
  To: willy, netdev
  Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
	ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
	andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
	saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
	vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
	vishal.moola

To simplify struct page, each of its users needs its own descriptor
split out from struct page, and the work for page pool is ongoing.

Use netmem descriptor and APIs for page pool in mlx4 code.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_rx.c   | 48 +++++++++++---------
 drivers/net/ethernet/mellanox/mlx4/en_tx.c   |  8 ++--
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h |  4 +-
 3 files changed, 32 insertions(+), 28 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index b33285d755b9..7cf0d2dc5011 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -62,18 +62,18 @@ static int mlx4_en_alloc_frags(struct mlx4_en_priv *priv,
 	int i;
 
 	for (i = 0; i < priv->num_frags; i++, frags++) {
-		if (!frags->page) {
-			frags->page = page_pool_alloc_pages(ring->pp, gfp);
-			if (!frags->page) {
+		if (!frags->netmem) {
+			frags->netmem = page_pool_alloc_netmems(ring->pp, gfp);
+			if (!frags->netmem) {
 				ring->alloc_fail++;
 				return -ENOMEM;
 			}
-			page_pool_fragment_page(frags->page, 1);
+			page_pool_fragment_netmem(frags->netmem, 1);
 			frags->page_offset = priv->rx_headroom;
 
 			ring->rx_alloc_pages++;
 		}
-		dma = page_pool_get_dma_addr(frags->page);
+		dma = page_pool_get_dma_addr_netmem(frags->netmem);
 		rx_desc->data[i].addr = cpu_to_be64(dma + frags->page_offset);
 	}
 	return 0;
@@ -83,10 +83,10 @@ static void mlx4_en_free_frag(const struct mlx4_en_priv *priv,
 			      struct mlx4_en_rx_ring *ring,
 			      struct mlx4_en_rx_alloc *frag)
 {
-	if (frag->page)
-		page_pool_put_full_page(ring->pp, frag->page, false);
+	if (frag->netmem)
+		page_pool_put_full_netmem(ring->pp, frag->netmem, false);
 	/* We need to clear all fields, otherwise a change of priv->log_rx_info
-	 * could lead to see garbage later in frag->page.
+	 * could lead to see garbage later in frag->netmem.
 	 */
 	memset(frag, 0, sizeof(*frag));
 }
@@ -440,29 +440,33 @@ static int mlx4_en_complete_rx_desc(struct mlx4_en_priv *priv,
 	unsigned int truesize = 0;
 	bool release = true;
 	int nr, frag_size;
-	struct page *page;
+	netmem_ref netmem;
 	dma_addr_t dma;
 
 	/* Collect used fragments while replacing them in the HW descriptors */
 	for (nr = 0;; frags++) {
 		frag_size = min_t(int, length, frag_info->frag_size);
 
-		page = frags->page;
-		if (unlikely(!page))
+		netmem = frags->netmem;
+		if (unlikely(!netmem))
 			goto fail;
 
-		dma = page_pool_get_dma_addr(page);
+		dma = page_pool_get_dma_addr_netmem(netmem);
 		dma_sync_single_range_for_cpu(priv->ddev, dma, frags->page_offset,
 					      frag_size, priv->dma_dir);
 
-		__skb_fill_page_desc(skb, nr, page, frags->page_offset,
-				     frag_size);
+		__skb_fill_netmem_desc(skb, nr, netmem, frags->page_offset,
+				       frag_size);
 
 		truesize += frag_info->frag_stride;
 		if (frag_info->frag_stride == PAGE_SIZE / 2) {
+			struct page *page = netmem_to_page(netmem);
+			atomic_long_t *pp_ref_count =
+				netmem_get_pp_ref_count_ref(netmem);
+
 			frags->page_offset ^= PAGE_SIZE / 2;
 			release = page_count(page) != 1 ||
-				  atomic_long_read(&page->pp_ref_count) != 1 ||
+				  atomic_long_read(pp_ref_count) != 1 ||
 				  page_is_pfmemalloc(page) ||
 				  page_to_nid(page) != numa_mem_id();
 		} else if (!priv->rx_headroom) {
@@ -476,9 +480,9 @@ static int mlx4_en_complete_rx_desc(struct mlx4_en_priv *priv,
 			release = frags->page_offset + frag_info->frag_size > PAGE_SIZE;
 		}
 		if (release) {
-			frags->page = NULL;
+			frags->netmem = 0;
 		} else {
-			page_pool_ref_page(page);
+			page_pool_ref_netmem(netmem);
 		}
 
 		nr++;
@@ -719,7 +723,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 		int nr;
 
 		frags = ring->rx_info + (index << priv->log_rx_info);
-		va = page_address(frags[0].page) + frags[0].page_offset;
+		va = netmem_address(frags[0].netmem) + frags[0].page_offset;
 		net_prefetchw(va);
 		/*
 		 * make sure we read the CQE after we read the ownership bit
@@ -748,7 +752,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 			/* Get pointer to first fragment since we haven't
 			 * skb yet and cast it to ethhdr struct
 			 */
-			dma = page_pool_get_dma_addr(frags[0].page);
+			dma = page_pool_get_dma_addr_netmem(frags[0].netmem);
 			dma += frags[0].page_offset;
 			dma_sync_single_for_cpu(priv->ddev, dma, sizeof(*ethh),
 						DMA_FROM_DEVICE);
@@ -788,7 +792,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 			void *orig_data;
 			u32 act;
 
-			dma = page_pool_get_dma_addr(frags[0].page);
+			dma = page_pool_get_dma_addr_netmem(frags[0].netmem);
 			dma += frags[0].page_offset;
 			dma_sync_single_for_cpu(priv->ddev, dma,
 						priv->frag_info[0].frag_size,
@@ -818,7 +822,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 				if (likely(!xdp_do_redirect(dev, &mxbuf.xdp, xdp_prog))) {
 					ring->xdp_redirect++;
 					xdp_redir_flush = true;
-					frags[0].page = NULL;
+					frags[0].netmem = 0;
 					goto next;
 				}
 				ring->xdp_redirect_fail++;
@@ -828,7 +832,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 				if (likely(!mlx4_en_xmit_frame(ring, frags, priv,
 							length, cq_ring,
 							&doorbell_pending))) {
-					frags[0].page = NULL;
+					frags[0].netmem = 0;
 					goto next;
 				}
 				trace_xdp_exception(dev, xdp_prog, act);
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 87f35bcbeff8..b564a953da09 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -354,7 +354,7 @@ u32 mlx4_en_recycle_tx_desc(struct mlx4_en_priv *priv,
 	struct page_pool *pool = ring->recycle_ring->pp;
 
 	/* Note that napi_mode = 0 means ndo_close() path, not budget = 0 */
-	page_pool_put_full_page(pool, tx_info->page, !!napi_mode);
+	page_pool_put_full_netmem(pool, tx_info->netmem, !!napi_mode);
 
 	return tx_info->nr_txbb;
 }
@@ -1191,10 +1191,10 @@ netdev_tx_t mlx4_en_xmit_frame(struct mlx4_en_rx_ring *rx_ring,
 	tx_desc = ring->buf + (index << LOG_TXBB_SIZE);
 	data = &tx_desc->data;
 
-	dma = page_pool_get_dma_addr(frame->page);
+	dma = page_pool_get_dma_addr_netmem(frame->netmem);
 
-	tx_info->page = frame->page;
-	frame->page = NULL;
+	tx_info->netmem = frame->netmem;
+	frame->netmem = 0;
 	tx_info->map0_dma = dma;
 	tx_info->nr_bytes = max_t(unsigned int, length, ETH_ZLEN);
 
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index ad0d91a75184..3ef9a0a1f783 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -213,7 +213,7 @@ enum cq_type {
 struct mlx4_en_tx_info {
 	union {
 		struct sk_buff *skb;
-		struct page *page;
+		netmem_ref netmem;
 	};
 	dma_addr_t	map0_dma;
 	u32		map0_byte_count;
@@ -246,7 +246,7 @@ struct mlx4_en_tx_desc {
 #define MLX4_EN_CX3_HIGH_ID	0x1005
 
 struct mlx4_en_rx_alloc {
-	struct page	*page;
+	netmem_ref	netmem;
 	u32		page_offset;
 };
 
-- 
2.17.1



* [RFC v3 12/18] netmem: use _Generic to cover const casting for page_to_netmem()
  2025-05-29  3:10 [RFC v3 00/18] Split netmem from struct page Byungchul Park
                   ` (10 preceding siblings ...)
  2025-05-29  3:10 ` [RFC v3 11/18] mlx4: use netmem descriptor and APIs for page pool Byungchul Park
@ 2025-05-29  3:10 ` Byungchul Park
  2025-05-29  3:10 ` [RFC v3 13/18] netmem: remove __netmem_get_pp() Byungchul Park
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 28+ messages in thread
From: Byungchul Park @ 2025-05-29  3:10 UTC (permalink / raw)
  To: willy, netdev
  Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
	ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
	andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
	saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
	vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
	vishal.moola

The current page_to_netmem() doesn't handle const casting: attempting
to convert a const struct page * to a const netmem_ref fails.

To cover that case, convert page_to_netmem() to a macro using _Generic.

Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
---
 include/net/netmem.h | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/include/net/netmem.h b/include/net/netmem.h
index d52f86082271..74f269c6815d 100644
--- a/include/net/netmem.h
+++ b/include/net/netmem.h
@@ -200,10 +200,9 @@ static inline netmem_ref net_iov_to_netmem(struct net_iov *niov)
 	return (__force netmem_ref)((unsigned long)niov | NET_IOV);
 }
 
-static inline netmem_ref page_to_netmem(struct page *page)
-{
-	return (__force netmem_ref)page;
-}
+#define page_to_netmem(p)	(_Generic((p),			\
+	const struct page * :	(__force const netmem_ref)(p),	\
+	struct page * :		(__force netmem_ref)(p)))
 
 /**
  * virt_to_netmem - convert virtual memory pointer to a netmem reference
-- 
2.17.1



* [RFC v3 13/18] netmem: remove __netmem_get_pp()
  2025-05-29  3:10 [RFC v3 00/18] Split netmem from struct page Byungchul Park
                   ` (11 preceding siblings ...)
  2025-05-29  3:10 ` [RFC v3 12/18] netmem: use _Generic to cover const casting for page_to_netmem() Byungchul Park
@ 2025-05-29  3:10 ` Byungchul Park
  2025-05-29  3:10 ` [RFC v3 14/18] page_pool: make page_pool_get_dma_addr() just wrap page_pool_get_dma_addr_netmem() Byungchul Park
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 28+ messages in thread
From: Byungchul Park @ 2025-05-29  3:10 UTC (permalink / raw)
  To: willy, netdev
  Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
	ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
	andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
	saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
	vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
	vishal.moola

There are no users of __netmem_get_pp().  Remove it.

Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
---
 include/net/netmem.h | 16 ----------------
 1 file changed, 16 deletions(-)

diff --git a/include/net/netmem.h b/include/net/netmem.h
index 74f269c6815d..ef639b5c70ec 100644
--- a/include/net/netmem.h
+++ b/include/net/netmem.h
@@ -239,22 +239,6 @@ static inline struct net_iov *__netmem_clear_lsb(netmem_ref netmem)
 	return (struct net_iov *)((__force unsigned long)netmem & ~NET_IOV);
 }
 
-/**
- * __netmem_get_pp - unsafely get pointer to the &page_pool backing @netmem
- * @netmem: netmem reference to get the pointer from
- *
- * Unsafe version of netmem_get_pp(). When @netmem is always page-backed,
- * e.g. when it's a header buffer, performs faster and generates smaller
- * object code (avoids clearing the LSB). When @netmem points to IOV,
- * provokes invalid memory access.
- *
- * Return: pointer to the &page_pool (garbage if @netmem is not page-backed).
- */
-static inline struct page_pool *__netmem_get_pp(netmem_ref netmem)
-{
-	return __netmem_to_page(netmem)->pp;
-}
-
 static inline struct page_pool *netmem_get_pp(netmem_ref netmem)
 {
 	return __netmem_clear_lsb(netmem)->pp;
-- 
2.17.1



* [RFC v3 14/18] page_pool: make page_pool_get_dma_addr() just wrap page_pool_get_dma_addr_netmem()
  2025-05-29  3:10 [RFC v3 00/18] Split netmem from struct page Byungchul Park
                   ` (12 preceding siblings ...)
  2025-05-29  3:10 ` [RFC v3 13/18] netmem: remove __netmem_get_pp() Byungchul Park
@ 2025-05-29  3:10 ` Byungchul Park
  2025-05-29  3:10 ` [RFC v3 15/18] netdevsim: use netmem descriptor and APIs for page pool Byungchul Park
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 28+ messages in thread
From: Byungchul Park @ 2025-05-29  3:10 UTC (permalink / raw)
  To: willy, netdev
  Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
	ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
	andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
	saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
	vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
	vishal.moola

The page pool members in struct page cannot be removed while anything
still accesses them via struct page.

Stop accessing page->dma_addr directly in page_pool_get_dma_addr();
instead, make it a safe wrapper around page_pool_get_dma_addr_netmem().

Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
---
 include/net/page_pool/helpers.h | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
index 93f2c31baf9b..387913b6c8bf 100644
--- a/include/net/page_pool/helpers.h
+++ b/include/net/page_pool/helpers.h
@@ -437,12 +437,7 @@ static inline dma_addr_t page_pool_get_dma_addr_netmem(netmem_ref netmem)
  */
 static inline dma_addr_t page_pool_get_dma_addr(const struct page *page)
 {
-	dma_addr_t ret = page->dma_addr;
-
-	if (PAGE_POOL_32BIT_ARCH_WITH_64BIT_DMA)
-		ret <<= PAGE_SHIFT;
-
-	return ret;
+	return page_pool_get_dma_addr_netmem(page_to_netmem(page));
 }
 
 static inline void __page_pool_dma_sync_for_cpu(const struct page_pool *pool,
-- 
2.17.1



* [RFC v3 15/18] netdevsim: use netmem descriptor and APIs for page pool
  2025-05-29  3:10 [RFC v3 00/18] Split netmem from struct page Byungchul Park
                   ` (13 preceding siblings ...)
  2025-05-29  3:10 ` [RFC v3 14/18] page_pool: make page_pool_get_dma_addr() just wrap page_pool_get_dma_addr_netmem() Byungchul Park
@ 2025-05-29  3:10 ` Byungchul Park
  2025-05-29  3:10 ` [RFC v3 16/18] netmem: introduce a netmem API, virt_to_head_netmem() Byungchul Park
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 28+ messages in thread
From: Byungchul Park @ 2025-05-29  3:10 UTC (permalink / raw)
  To: willy, netdev
  Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
	ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
	andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
	saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
	vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
	vishal.moola

To simplify struct page, each of its users needs its own descriptor
split out from struct page, and the work for page pool is ongoing.

Use netmem descriptor and APIs for page pool in netdevsim code.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 drivers/net/netdevsim/netdev.c    | 19 ++++++++++---------
 drivers/net/netdevsim/netdevsim.h |  2 +-
 2 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c
index af545d42961c..d134a6195bfa 100644
--- a/drivers/net/netdevsim/netdev.c
+++ b/drivers/net/netdevsim/netdev.c
@@ -821,7 +821,7 @@ nsim_pp_hold_read(struct file *file, char __user *data,
 	struct netdevsim *ns = file->private_data;
 	char buf[3] = "n\n";
 
-	if (ns->page)
+	if (ns->netmem)
 		buf[0] = 'y';
 
 	return simple_read_from_buffer(data, count, ppos, buf, 2);
@@ -841,18 +841,19 @@ nsim_pp_hold_write(struct file *file, const char __user *data,
 
 	rtnl_lock();
 	ret = count;
-	if (val == !!ns->page)
+	if (val == !!ns->netmem)
 		goto exit;
 
 	if (!netif_running(ns->netdev) && val) {
 		ret = -ENETDOWN;
 	} else if (val) {
-		ns->page = page_pool_dev_alloc_pages(ns->rq[0]->page_pool);
-		if (!ns->page)
+		ns->netmem = page_pool_alloc_netmems(ns->rq[0]->page_pool,
+						     GFP_ATOMIC | __GFP_NOWARN);
+		if (!ns->netmem)
 			ret = -ENOMEM;
 	} else {
-		page_pool_put_full_page(ns->page->pp, ns->page, false);
-		ns->page = NULL;
+		page_pool_put_full_netmem(netmem_get_pp(ns->netmem), ns->netmem, false);
+		ns->netmem = 0;
 	}
 
 exit:
@@ -1077,9 +1078,9 @@ void nsim_destroy(struct netdevsim *ns)
 		nsim_exit_netdevsim(ns);
 
 	/* Put this intentionally late to exercise the orphaning path */
-	if (ns->page) {
-		page_pool_put_full_page(ns->page->pp, ns->page, false);
-		ns->page = NULL;
+	if (ns->netmem) {
+		page_pool_put_full_netmem(netmem_get_pp(ns->netmem), ns->netmem, false);
+		ns->netmem = 0;
 	}
 
 	free_netdev(dev);
diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h
index d04401f0bdf7..1dc51468a50c 100644
--- a/drivers/net/netdevsim/netdevsim.h
+++ b/drivers/net/netdevsim/netdevsim.h
@@ -138,7 +138,7 @@ struct netdevsim {
 		struct debugfs_u32_array dfs_ports[2];
 	} udp_ports;
 
-	struct page *page;
+	netmem_ref netmem;
 	struct dentry *pp_dfs;
 	struct dentry *qr_dfs;
 
-- 
2.17.1



* [RFC v3 16/18] netmem: introduce a netmem API, virt_to_head_netmem()
  2025-05-29  3:10 [RFC v3 00/18] Split netmem from struct page Byungchul Park
                   ` (14 preceding siblings ...)
  2025-05-29  3:10 ` [RFC v3 15/18] netdevsim: use netmem descriptor and APIs for page pool Byungchul Park
@ 2025-05-29  3:10 ` Byungchul Park
  2025-05-29  3:10 ` [RFC v3 17/18] mt76: use netmem descriptor and APIs for page pool Byungchul Park
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 28+ messages in thread
From: Byungchul Park @ 2025-05-29  3:10 UTC (permalink / raw)
  To: willy, netdev
  Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
	ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
	andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
	saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
	vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
	vishal.moola

To eliminate the use of struct page in page pool, the page pool code
should use the netmem descriptor and APIs instead.

As part of that work, introduce a netmem API, virt_to_head_netmem(),
that converts a virtual address to its head netmem, so code can use it
in place of the struct page API, virt_to_head_page().

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/net/netmem.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/include/net/netmem.h b/include/net/netmem.h
index ef639b5c70ec..f05a8b008d00 100644
--- a/include/net/netmem.h
+++ b/include/net/netmem.h
@@ -270,6 +270,13 @@ static inline netmem_ref netmem_compound_head(netmem_ref netmem)
 	return page_to_netmem(compound_head(netmem_to_page(netmem)));
 }
 
+static inline netmem_ref virt_to_head_netmem(const void *x)
+{
+	netmem_ref netmem = virt_to_netmem(x);
+
+	return netmem_compound_head(netmem);
+}
+
 /**
  * __netmem_address - unsafely get pointer to the memory backing @netmem
  * @netmem: netmem reference to get the pointer for
-- 
2.17.1



* [RFC v3 17/18] mt76: use netmem descriptor and APIs for page pool
  2025-05-29  3:10 [RFC v3 00/18] Split netmem from struct page Byungchul Park
                   ` (15 preceding siblings ...)
  2025-05-29  3:10 ` [RFC v3 16/18] netmem: introduce a netmem API, virt_to_head_netmem() Byungchul Park
@ 2025-05-29  3:10 ` Byungchul Park
  2025-05-29  3:10 ` [RFC v3 18/18] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp() Byungchul Park
  2025-05-29  3:29 ` [RFC v3 00/18] Split netmem from struct page Byungchul Park
  18 siblings, 0 replies; 28+ messages in thread
From: Byungchul Park @ 2025-05-29  3:10 UTC (permalink / raw)
  To: willy, netdev
  Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
	ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
	andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
	saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
	vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
	vishal.moola

To simplify struct page, each of its users needs its own descriptor
split out from struct page, and the work for page pool is ongoing.

Use netmem descriptor and APIs for page pool in mt76 code.

Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
---
 drivers/net/wireless/mediatek/mt76/dma.c      |  6 ++---
 drivers/net/wireless/mediatek/mt76/mt76.h     | 12 +++++-----
 .../net/wireless/mediatek/mt76/sdio_txrx.c    | 24 +++++++++----------
 drivers/net/wireless/mediatek/mt76/usb.c      | 10 ++++----
 4 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/drivers/net/wireless/mediatek/mt76/dma.c b/drivers/net/wireless/mediatek/mt76/dma.c
index 35b4ec91979e..41b529b95877 100644
--- a/drivers/net/wireless/mediatek/mt76/dma.c
+++ b/drivers/net/wireless/mediatek/mt76/dma.c
@@ -820,10 +820,10 @@ mt76_add_fragment(struct mt76_dev *dev, struct mt76_queue *q, void *data,
 	int nr_frags = shinfo->nr_frags;
 
 	if (nr_frags < ARRAY_SIZE(shinfo->frags)) {
-		struct page *page = virt_to_head_page(data);
-		int offset = data - page_address(page) + q->buf_offset;
+		netmem_ref netmem = virt_to_head_netmem(data);
+		int offset = data - netmem_address(netmem) + q->buf_offset;
 
-		skb_add_rx_frag(skb, nr_frags, page, offset, len, q->buf_size);
+		skb_add_rx_frag_netmem(skb, nr_frags, netmem, offset, len, q->buf_size);
 	} else {
 		mt76_put_page_pool_buf(data, allow_direct);
 	}
diff --git a/drivers/net/wireless/mediatek/mt76/mt76.h b/drivers/net/wireless/mediatek/mt76/mt76.h
index 5f8d81cda6cd..16d09b6d8270 100644
--- a/drivers/net/wireless/mediatek/mt76/mt76.h
+++ b/drivers/net/wireless/mediatek/mt76/mt76.h
@@ -1795,21 +1795,21 @@ int mt76_rx_token_consume(struct mt76_dev *dev, void *ptr,
 int mt76_create_page_pool(struct mt76_dev *dev, struct mt76_queue *q);
 static inline void mt76_put_page_pool_buf(void *buf, bool allow_direct)
 {
-	struct page *page = virt_to_head_page(buf);
+	netmem_ref netmem = virt_to_head_netmem(buf);
 
-	page_pool_put_full_page(page->pp, page, allow_direct);
+	page_pool_put_full_netmem(netmem_get_pp(netmem), netmem, allow_direct);
 }
 
 static inline void *
 mt76_get_page_pool_buf(struct mt76_queue *q, u32 *offset, u32 size)
 {
-	struct page *page;
+	netmem_ref netmem;
 
-	page = page_pool_dev_alloc_frag(q->page_pool, offset, size);
-	if (!page)
+	netmem = page_pool_dev_alloc_netmem(q->page_pool, offset, &size);
+	if (!netmem)
 		return NULL;
 
-	return page_address(page) + *offset;
+	return netmem_address(netmem) + *offset;
 }
 
 static inline void mt76_set_tx_blocked(struct mt76_dev *dev, bool blocked)
diff --git a/drivers/net/wireless/mediatek/mt76/sdio_txrx.c b/drivers/net/wireless/mediatek/mt76/sdio_txrx.c
index 0a927a7313a6..b1d89b6f663d 100644
--- a/drivers/net/wireless/mediatek/mt76/sdio_txrx.c
+++ b/drivers/net/wireless/mediatek/mt76/sdio_txrx.c
@@ -68,14 +68,14 @@ mt76s_build_rx_skb(void *data, int data_len, int buf_len)
 
 	skb_put_data(skb, data, len);
 	if (data_len > len) {
-		struct page *page;
+		netmem_ref netmem;
 
 		data += len;
-		page = virt_to_head_page(data);
-		skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags,
-				page, data - page_address(page),
-				data_len - len, buf_len);
-		get_page(page);
+		netmem = virt_to_head_netmem(data);
+		skb_add_rx_frag_netmem(skb, skb_shinfo(skb)->nr_frags,
+				       netmem, data - netmem_address(netmem),
+				       data_len - len, buf_len);
+		get_netmem(netmem);
 	}
 
 	return skb;
@@ -88,7 +88,7 @@ mt76s_rx_run_queue(struct mt76_dev *dev, enum mt76_rxq_id qid,
 	struct mt76_queue *q = &dev->q_rx[qid];
 	struct mt76_sdio *sdio = &dev->sdio;
 	int len = 0, err, i;
-	struct page *page;
+	netmem_ref netmem;
 	u8 *buf, *end;
 
 	for (i = 0; i < intr->rx.num[qid]; i++)
@@ -100,11 +100,11 @@ mt76s_rx_run_queue(struct mt76_dev *dev, enum mt76_rxq_id qid,
 	if (len > sdio->func->cur_blksize)
 		len = roundup(len, sdio->func->cur_blksize);
 
-	page = __dev_alloc_pages(GFP_KERNEL, get_order(len));
-	if (!page)
+	netmem = page_to_netmem(__dev_alloc_pages(GFP_KERNEL, get_order(len)));
+	if (!netmem)
 		return -ENOMEM;
 
-	buf = page_address(page);
+	buf = netmem_address(netmem);
 
 	sdio_claim_host(sdio->func);
 	err = sdio_readsb(sdio->func, buf, MCR_WRDR(qid), len);
@@ -112,7 +112,7 @@ mt76s_rx_run_queue(struct mt76_dev *dev, enum mt76_rxq_id qid,
 
 	if (err < 0) {
 		dev_err(dev->dev, "sdio read data failed:%d\n", err);
-		put_page(page);
+		put_netmem(netmem);
 		return err;
 	}
 
@@ -140,7 +140,7 @@ mt76s_rx_run_queue(struct mt76_dev *dev, enum mt76_rxq_id qid,
 		}
 		buf += round_up(len + 4, 4);
 	}
-	put_page(page);
+	put_netmem(netmem);
 
 	spin_lock_bh(&q->lock);
 	q->head = (q->head + i) % q->ndesc;
diff --git a/drivers/net/wireless/mediatek/mt76/usb.c b/drivers/net/wireless/mediatek/mt76/usb.c
index f9e67b8c3b3c..1ea80c87a839 100644
--- a/drivers/net/wireless/mediatek/mt76/usb.c
+++ b/drivers/net/wireless/mediatek/mt76/usb.c
@@ -478,7 +478,7 @@ mt76u_build_rx_skb(struct mt76_dev *dev, void *data,
 
 	head_room = drv_flags & MT_DRV_RX_DMA_HDR ? 0 : MT_DMA_HDR_LEN;
 	if (SKB_WITH_OVERHEAD(buf_size) < head_room + len) {
-		struct page *page;
+		netmem_ref netmem;
 
 		/* slow path, not enough space for data and
 		 * skb_shared_info
@@ -489,10 +489,10 @@ mt76u_build_rx_skb(struct mt76_dev *dev, void *data,
 
 		skb_put_data(skb, data + head_room, MT_SKB_HEAD_LEN);
 		data += head_room + MT_SKB_HEAD_LEN;
-		page = virt_to_head_page(data);
-		skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags,
-				page, data - page_address(page),
-				len - MT_SKB_HEAD_LEN, buf_size);
+		netmem = virt_to_head_netmem(data);
+		skb_add_rx_frag_netmem(skb, skb_shinfo(skb)->nr_frags,
+				       netmem, data - netmem_address(netmem),
+				       len - MT_SKB_HEAD_LEN, buf_size);
 
 		return skb;
 	}
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [RFC v3 18/18] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp()
  2025-05-29  3:10 [RFC v3 00/18] Split netmem from struct page Byungchul Park
                   ` (16 preceding siblings ...)
  2025-05-29  3:10 ` [RFC v3 17/18] mt76: use netmem descriptor and APIs for page pool Byungchul Park
@ 2025-05-29  3:10 ` Byungchul Park
  2025-05-29 19:54   ` Mina Almasry
  2025-05-29  3:29 ` [RFC v3 00/18] Split netmem from struct page Byungchul Park
  18 siblings, 1 reply; 28+ messages in thread
From: Byungchul Park @ 2025-05-29  3:10 UTC (permalink / raw)
  To: willy, netdev
  Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
	ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
	andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
	saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
	vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
	vishal.moola

To simplify struct page, each of its users needs a dedicated descriptor
separated out of struct page; the work for page pool is ongoing.

To achieve that, all the code should avoid directly accessing page pool
members of struct page.

Access ->pp_magic through struct netmem_desc instead of directly
accessing it through struct page in page_pool_page_is_pp().  Plus, move
page_pool_page_is_pp() from mm.h to netmem.h to use struct netmem_desc
without header dependency issues.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/mm.h   | 12 ------------
 include/net/netmem.h | 14 ++++++++++++++
 mm/page_alloc.c      |  1 +
 3 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8dc012e84033..de10ad386592 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -4311,16 +4311,4 @@ int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status);
  */
 #define PP_MAGIC_MASK ~(PP_DMA_INDEX_MASK | 0x3UL)
 
-#ifdef CONFIG_PAGE_POOL
-static inline bool page_pool_page_is_pp(struct page *page)
-{
-	return (page->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
-}
-#else
-static inline bool page_pool_page_is_pp(struct page *page)
-{
-	return false;
-}
-#endif
-
 #endif /* _LINUX_MM_H */
diff --git a/include/net/netmem.h b/include/net/netmem.h
index f05a8b008d00..9e4ed3530788 100644
--- a/include/net/netmem.h
+++ b/include/net/netmem.h
@@ -53,6 +53,20 @@ NETMEM_DESC_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
  */
 static_assert(sizeof(struct netmem_desc) <= offsetof(struct page, _refcount));
 
+#ifdef CONFIG_PAGE_POOL
+static inline bool page_pool_page_is_pp(struct page *page)
+{
+	struct netmem_desc *desc = (__force struct netmem_desc *)page;
+
+	return (desc->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
+}
+#else
+static inline bool page_pool_page_is_pp(struct page *page)
+{
+	return false;
+}
+#endif
+
 /* net_iov */
 
 DECLARE_STATIC_KEY_FALSE(page_pool_mem_providers);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8c75433ff9a4..40f956cee2d8 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -55,6 +55,7 @@
 #include <linux/delayacct.h>
 #include <linux/cacheinfo.h>
 #include <linux/pgalloc_tag.h>
+#include <net/netmem.h>
 #include <asm/div64.h>
 #include "internal.h"
 #include "shuffle.h"
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [RFC v3 00/18] Split netmem from struct page
  2025-05-29  3:10 [RFC v3 00/18] Split netmem from struct page Byungchul Park
                   ` (17 preceding siblings ...)
  2025-05-29  3:10 ` [RFC v3 18/18] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp() Byungchul Park
@ 2025-05-29  3:29 ` Byungchul Park
  2025-05-30 15:04   ` Alexander Lobakin
  18 siblings, 1 reply; 28+ messages in thread
From: Byungchul Park @ 2025-05-29  3:29 UTC (permalink / raw)
  To: willy, netdev
  Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
	ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
	andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
	saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
	vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
	vishal.moola, anthony.l.nguyen

On Thu, May 29, 2025 at 12:10:29PM +0900, Byungchul Park wrote:
> The MM subsystem is trying to reduce struct page to a single pointer.
> The first step towards that is splitting struct page by its individual
> users, as has already been done with folio and slab.  This patchset does
> that for netmem which is used for page pools.
> 
> Matthew Wilcox previously attempted the same work and stopped; see:
> 
>    https://lore.kernel.org/linux-mm/20230111042214.907030-1-willy@infradead.org/
> 
> Mina Almasry has already done a lot of the prerequisite work, by luck
> he said :).  I stacked my patches on top of his work, i.e. netmem.
> 
> I focused on removing the page pool members in struct page this time,
> not moving the allocation code of page pool from net to mm.  It can be
> done later if needed.
> 
> The final patch removing the page pool fields will be submitted once
> all the page-to-netmem conversion work is done:
> 
>    1. converting of libeth_fqe by Tony Nguyen.
>    2. converting of mlx5 by Tariq Toukan.

+cc Tony Nguyen
+cc Tariq Toukan

	Byungchul

>    3. converting of prueth_swdata (on me).
>    4. converting of freescale driver (on me).
> 
> For our discussion, I'm sharing below what the final patch looks
> like.
> 
> 	Byungchul
> --8<--
> commit 86be39ea488df859cff6bc398a364f1dc486f2f9
> Author: Byungchul Park <byungchul@sk.com>
> Date:   Wed May 28 20:44:55 2025 +0900
> 
>     mm, netmem: remove the page pool members in struct page
>     
>     Now that all the users of the page pool members in struct page are
>     gone, the members can be removed from struct page.
>     
>     However, since struct netmem_desc still uses the space in struct page,
>     the important offsets should be checked properly, until struct
>     netmem_desc has its own instance from slab.
>     
>     Remove the page pool members in struct page and modify static checkers
>     for the offsets.
>     
>     Signed-off-by: Byungchul Park <byungchul@sk.com>
> 
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 56d07edd01f9..5a7864eb9d76 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -119,17 +119,6 @@ struct page {
>  			 */
>  			unsigned long private;
>  		};
> -		struct {	/* page_pool used by netstack */
> -			/**
> -			 * @pp_magic: magic value to avoid recycling non
> -			 * page_pool allocated pages.
> -			 */
> -			unsigned long pp_magic;
> -			struct page_pool *pp;
> -			unsigned long _pp_mapping_pad;
> -			unsigned long dma_addr;
> -			atomic_long_t pp_ref_count;
> -		};
>  		struct {	/* Tail pages of compound page */
>  			unsigned long compound_head;	/* Bit zero is set */
>  		};
> diff --git a/include/net/netmem.h b/include/net/netmem.h
> index 9e4ed3530788..e88e299dd0f0 100644
> --- a/include/net/netmem.h
> +++ b/include/net/netmem.h
> @@ -39,11 +39,8 @@ struct netmem_desc {
>  	static_assert(offsetof(struct page, pg) == \
>  		      offsetof(struct netmem_desc, desc))
>  NETMEM_DESC_ASSERT_OFFSET(flags, _flags);
> -NETMEM_DESC_ASSERT_OFFSET(pp_magic, pp_magic);
> -NETMEM_DESC_ASSERT_OFFSET(pp, pp);
> -NETMEM_DESC_ASSERT_OFFSET(_pp_mapping_pad, _pp_mapping_pad);
> -NETMEM_DESC_ASSERT_OFFSET(dma_addr, dma_addr);
> -NETMEM_DESC_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
> +NETMEM_DESC_ASSERT_OFFSET(lru, pp_magic);
> +NETMEM_DESC_ASSERT_OFFSET(mapping, _pp_mapping_pad);
>  #undef NETMEM_DESC_ASSERT_OFFSET
>  
>  /*
> ---
> Changes from v2:
> 	1. Introduce a netmem API, virt_to_head_netmem(), and use it
> 	   when it's needed.
> 	2. Introduce struct netmem_desc as a new struct and union'ed
> 	   with the existing fields in struct net_iov.
> 	3. Make page_pool_page_is_pp() access ->pp_magic through struct
> 	   netmem_desc instead of struct page.
> 	4. Move netmem alloc APIs from include/net/netmem.h to
> 	   net/core/netmem_priv.h.
> 	5. Apply trivial feedbacks, thanks to Mina, Pavel, and Toke.
> 	6. Add given 'Reviewed-by's, thanks to Mina.
> 
> Changes from v1:
> 	1. Rebase on net-next's main as of May 26.
> 	2. Run checkpatch.pl, as suggested by SJ Park.
> 	3. Add converting of page to netmem in mt76.
> 	4. Revert 'mlx5: use netmem descriptor and APIs for page pool'
> 	   since it's on-going by Tariq Toukan.  I will wait for his
> 	   work to be done.
> 	5. Revert 'page_pool: use netmem APIs to access page->pp_magic
> 	   in page_pool_page_is_pp()' since we need more discussion.
> 	6. Revert 'mm, netmem: remove the page pool members in struct
> 	   page' since there is some prerequisite work to remove the
> 	   page pool fields from struct page.  I can submit this patch
> 	   separately later.
> 	7. Cancel relocating a page pool member in struct page.
> 	8. Modify static asserts for offsets and size of struct
> 	   netmem_desc.
> 
> Changes from rfc:
> 	1. Rebase on net-next's main branch.
> 	   https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/
> 	2. Fix a build error reported by kernel test robot.
> 	   https://lore.kernel.org/all/202505100932.uzAMBW1y-lkp@intel.com/
> 	3. Add given 'Reviewed-by's, thanks to Mina and Ilias.
> 	4. Do static_assert() on the size of struct netmem_desc instead
> 	   of placing a place-holder in struct page, as suggested by
> 	   Matthew.
> 	5. Do struct_group_tagged(netmem_desc) on struct net_iov instead
> 	   of wholly renaming it to struct netmem_desc, as suggested by
> 	   Mina and Pavel.
> 
> Byungchul Park (18):
>   netmem: introduce struct netmem_desc mirroring struct page
>   netmem: introduce netmem alloc APIs to wrap page alloc APIs
>   page_pool: use netmem alloc/put APIs in __page_pool_alloc_page_order()
>   page_pool: rename __page_pool_alloc_page_order() to
>     __page_pool_alloc_netmem_order()
>   page_pool: use netmem alloc/put APIs in __page_pool_alloc_pages_slow()
>   page_pool: rename page_pool_return_page() to page_pool_return_netmem()
>   page_pool: use netmem put API in page_pool_return_netmem()
>   page_pool: rename __page_pool_release_page_dma() to
>     __page_pool_release_netmem_dma()
>   page_pool: rename __page_pool_put_page() to __page_pool_put_netmem()
>   page_pool: rename __page_pool_alloc_pages_slow() to
>     __page_pool_alloc_netmems_slow()
>   mlx4: use netmem descriptor and APIs for page pool
>   netmem: use _Generic to cover const casting for page_to_netmem()
>   netmem: remove __netmem_get_pp()
>   page_pool: make page_pool_get_dma_addr() just wrap
>     page_pool_get_dma_addr_netmem()
>   netdevsim: use netmem descriptor and APIs for page pool
>   netmem: introduce a netmem API, virt_to_head_netmem()
>   mt76: use netmem descriptor and APIs for page pool
>   page_pool: access ->pp_magic through struct netmem_desc in
>     page_pool_page_is_pp()
> 
>  drivers/net/ethernet/mellanox/mlx4/en_rx.c    |  48 +++---
>  drivers/net/ethernet/mellanox/mlx4/en_tx.c    |   8 +-
>  drivers/net/ethernet/mellanox/mlx4/mlx4_en.h  |   4 +-
>  drivers/net/netdevsim/netdev.c                |  19 +--
>  drivers/net/netdevsim/netdevsim.h             |   2 +-
>  drivers/net/wireless/mediatek/mt76/dma.c      |   6 +-
>  drivers/net/wireless/mediatek/mt76/mt76.h     |  12 +-
>  .../net/wireless/mediatek/mt76/sdio_txrx.c    |  24 +--
>  drivers/net/wireless/mediatek/mt76/usb.c      |  10 +-
>  include/linux/mm.h                            |  12 --
>  include/net/netmem.h                          | 145 +++++++++++++-----
>  include/net/page_pool/helpers.h               |   7 +-
>  mm/page_alloc.c                               |   1 +
>  net/core/netmem_priv.h                        |  14 ++
>  net/core/page_pool.c                          | 101 ++++++------
>  15 files changed, 239 insertions(+), 174 deletions(-)
> 
> 
> base-commit: d09a8a4ab57849d0401d7c0bc6583e367984d9f7
> -- 
> 2.17.1

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC v3 01/18] netmem: introduce struct netmem_desc mirroring struct page
  2025-05-29  3:10 ` [RFC v3 01/18] netmem: introduce struct netmem_desc mirroring " Byungchul Park
@ 2025-05-29 16:31   ` Mina Almasry
  2025-05-30  1:10     ` Byungchul Park
  0 siblings, 1 reply; 28+ messages in thread
From: Mina Almasry @ 2025-05-29 16:31 UTC (permalink / raw)
  To: Byungchul Park
  Cc: willy, netdev, linux-kernel, linux-mm, kernel_team, kuba,
	ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
	andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
	saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
	vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
	vishal.moola

On Wed, May 28, 2025 at 8:11 PM Byungchul Park <byungchul@sk.com> wrote:
>
> To simplify struct page, the page pool members of struct page should be
> moved elsewhere, allowing these members to be removed from struct page.
>
> Introduce a network memory descriptor, struct netmem_desc, to store the
> members, and union it with the existing fields in struct net_iov,
> allowing the fields of struct net_iov to be organized.  The final
> look of struct net_iov should be like:
>
>         struct net_iov {
>                 struct netmem_desc;
>                 net_field1; /* e.g. struct net_iov_area *owner; */
>                 net_field2;
>                 ...
>         };
>
> Signed-off-by: Byungchul Park <byungchul@sk.com>

This version looks much better from the networking POV, and I think
with small adjustments we can make it work. I'll leave it to Matthew to
confirm whether it's aligned with the memdesc plans.

> ---
>  include/net/netmem.h | 101 +++++++++++++++++++++++++++++++++----------
>  1 file changed, 79 insertions(+), 22 deletions(-)
>
> diff --git a/include/net/netmem.h b/include/net/netmem.h
> index 386164fb9c18..d52f86082271 100644
> --- a/include/net/netmem.h
> +++ b/include/net/netmem.h
> @@ -12,6 +12,47 @@
>  #include <linux/mm.h>
>  #include <net/net_debug.h>
>
> +/* These fields in struct page are used by the page_pool and net stack:
> + *
> + *        struct {
> + *                unsigned long pp_magic;
> + *                struct page_pool *pp;
> + *                unsigned long _pp_mapping_pad;
> + *                unsigned long dma_addr;
> + *                atomic_long_t pp_ref_count;
> + *        };
> + *
> + * We mirror the page_pool fields here so the page_pool can access these
> + * fields without worrying whether the underlying fields belong to a
> + * page or netmem_desc.
> + */
> +struct netmem_desc {
> +       unsigned long _flags;
> +       unsigned long pp_magic;
> +       struct page_pool *pp;
> +       unsigned long _pp_mapping_pad;
> +       unsigned long dma_addr;
> +       atomic_long_t pp_ref_count;
> +};
> +
> +#define NETMEM_DESC_ASSERT_OFFSET(pg, desc)        \
> +       static_assert(offsetof(struct page, pg) == \
> +                     offsetof(struct netmem_desc, desc))
> +NETMEM_DESC_ASSERT_OFFSET(flags, _flags);
> +NETMEM_DESC_ASSERT_OFFSET(pp_magic, pp_magic);
> +NETMEM_DESC_ASSERT_OFFSET(pp, pp);
> +NETMEM_DESC_ASSERT_OFFSET(_pp_mapping_pad, _pp_mapping_pad);
> +NETMEM_DESC_ASSERT_OFFSET(dma_addr, dma_addr);
> +NETMEM_DESC_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
> +#undef NETMEM_DESC_ASSERT_OFFSET
> +
> +/*
> + * Since struct netmem_desc uses the space in struct page, the size
> + * should be checked, until struct netmem_desc has its own instance from
> + * slab, to avoid conflicting with other members within struct page.
> + */
> +static_assert(sizeof(struct netmem_desc) <= offsetof(struct page, _refcount));
> +
>  /* net_iov */
>
>  DECLARE_STATIC_KEY_FALSE(page_pool_mem_providers);
> @@ -31,12 +72,33 @@ enum net_iov_type {
>  };
>
>  struct net_iov {
> -       enum net_iov_type type;
> -       unsigned long pp_magic;
> -       struct page_pool *pp;
> -       struct net_iov_area *owner;
> -       unsigned long dma_addr;
> -       atomic_long_t pp_ref_count;
> +       union {
> +               struct netmem_desc desc;
> +
> +               /* XXX: The following part should be removed once all
> +                * the references to them are converted so as to be
> +                * accessed via netmem_desc e.g. niov->desc.pp instead
> +                * of niov->pp.
> +                *
> +                * Plus, once struct netmem_desc has it own instance
> +                * from slab, network's fields of the following can be
> +                * moved out of struct netmem_desc like:
> +                *
> +                *    struct net_iov {
> +                *       struct netmem_desc desc;
> +                *       struct net_iov_area *owner;
> +                *       ...
> +                *    };
> +                */

We do not need to wait until netmem_desc has its own instance from
slab to move the net_iov-specific fields out of netmem_desc. We can do
that now, because there are no size restrictions on net_iov.

So, I recommend change this to:

struct net_iov {
  /* Union for anonymous aliasing: */
  union {
    struct netmem_desc desc;
    struct {
       unsigned long _flags;
       unsigned long pp_magic;
       struct page_pool *pp;
       unsigned long _pp_mapping_pad;
       unsigned long dma_addr;
       atomic_long_t pp_ref_count;
    };
    struct net_iov_area *owner;
    enum net_iov_type type;
};

>
>  struct net_iov_area {
> @@ -48,27 +110,22 @@ struct net_iov_area {
>         unsigned long base_virtual;
>  };
>
> -/* These fields in struct page are used by the page_pool and net stack:
> +/* net_iov is union'ed with struct netmem_desc mirroring struct page, so
> + * the page_pool can access these fields without worrying whether the
> + * underlying fields are accessed via netmem_desc or directly via
> + * net_iov, until all the references to them are converted so as to be
> + * accessed via netmem_desc e.g. niov->desc.pp instead of niov->pp.
>   *
> - *        struct {
> - *                unsigned long pp_magic;
> - *                struct page_pool *pp;
> - *                unsigned long _pp_mapping_pad;
> - *                unsigned long dma_addr;
> - *                atomic_long_t pp_ref_count;
> - *        };
> - *
> - * We mirror the page_pool fields here so the page_pool can access these fields
> - * without worrying whether the underlying fields belong to a page or net_iov.
> - *
> - * The non-net stack fields of struct page are private to the mm stack and must
> - * never be mirrored to net_iov.
> + * The non-net stack fields of struct page are private to the mm stack
> + * and must never be mirrored to net_iov.
>   */
> -#define NET_IOV_ASSERT_OFFSET(pg, iov)             \
> -       static_assert(offsetof(struct page, pg) == \
> +#define NET_IOV_ASSERT_OFFSET(desc, iov)                    \
> +       static_assert(offsetof(struct netmem_desc, desc) == \
>                       offsetof(struct net_iov, iov))
> +NET_IOV_ASSERT_OFFSET(_flags, type);

Remove this assertion.

>  NET_IOV_ASSERT_OFFSET(pp_magic, pp_magic);
>  NET_IOV_ASSERT_OFFSET(pp, pp);
> +NET_IOV_ASSERT_OFFSET(_pp_mapping_pad, owner);

And this one.

(_flags, type) and (_pp_mapping_pad, owner) have very different
semantics and usage, so we should not assert that they sit at the same
offset. However, (pp, pp) and (pp_magic, pp_magic) have the same
semantics and usage, so we do assert they are at the same offset.

Code is allowed to access __netmem_clear_lsb(netmem)->pp or
__netmem_clear_lsb(netmem)->pp_magic without caring what's the
underlying memory type because both fields have the same semantics and
usage.

Code should *not* assume it can access
__netmem_clear_lsb(netmem)->owner or __netmem_clear_lsb(netmem)->type
without doing a check whether the underlying memory is
page/netmem_desc or net_iov. These fields are only usable for net_iov,
so let's explicitly move them to a different place.

-- 
Thanks,
Mina

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC v3 18/18] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp()
  2025-05-29  3:10 ` [RFC v3 18/18] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp() Byungchul Park
@ 2025-05-29 19:54   ` Mina Almasry
  2025-05-29 20:49     ` Pavel Begunkov
  2025-05-30  1:16     ` Byungchul Park
  0 siblings, 2 replies; 28+ messages in thread
From: Mina Almasry @ 2025-05-29 19:54 UTC (permalink / raw)
  To: Byungchul Park
  Cc: willy, netdev, linux-kernel, linux-mm, kernel_team, kuba,
	ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
	andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
	saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
	vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
	vishal.moola

On Wed, May 28, 2025 at 8:11 PM Byungchul Park <byungchul@sk.com> wrote:
>
> To simplify struct page, each of its users needs a dedicated descriptor
> separated out of struct page; the work for page pool is ongoing.
>
> To achieve that, all the code should avoid directly accessing page pool
> members of struct page.
>
> Access ->pp_magic through struct netmem_desc instead of directly
> accessing it through struct page in page_pool_page_is_pp().  Plus, move
> page_pool_page_is_pp() from mm.h to netmem.h to use struct netmem_desc
> without header dependency issue.
>
> Signed-off-by: Byungchul Park <byungchul@sk.com>
> ---
>  include/linux/mm.h   | 12 ------------
>  include/net/netmem.h | 14 ++++++++++++++
>  mm/page_alloc.c      |  1 +
>  3 files changed, 15 insertions(+), 12 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 8dc012e84033..de10ad386592 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -4311,16 +4311,4 @@ int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status);
>   */
>  #define PP_MAGIC_MASK ~(PP_DMA_INDEX_MASK | 0x3UL)
>
> -#ifdef CONFIG_PAGE_POOL
> -static inline bool page_pool_page_is_pp(struct page *page)
> -{
> -       return (page->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
> -}
> -#else
> -static inline bool page_pool_page_is_pp(struct page *page)
> -{
> -       return false;
> -}
> -#endif
> -
>  #endif /* _LINUX_MM_H */
> diff --git a/include/net/netmem.h b/include/net/netmem.h
> index f05a8b008d00..9e4ed3530788 100644
> --- a/include/net/netmem.h
> +++ b/include/net/netmem.h
> @@ -53,6 +53,20 @@ NETMEM_DESC_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
>   */
>  static_assert(sizeof(struct netmem_desc) <= offsetof(struct page, _refcount));
>
> +#ifdef CONFIG_PAGE_POOL
> +static inline bool page_pool_page_is_pp(struct page *page)
> +{
> +       struct netmem_desc *desc = (__force struct netmem_desc *)page;
> +

Is it expected that page can be cast to netmem_desc freely? I know it
works now since netmem_desc and page have the same layout, but how is
it going to continue to work when page is shrunk and no longer has
'pp_magic' inside it? Is that series going to fix up all the places
where casts are done?

Is it also allowed to statically cast a netmem_desc back to a page?

Consider creating a netmem_desc_page() helper like ptdesc_page().

I'm not sure the __force is needed either.

-- 
Thanks,
Mina

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC v3 18/18] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp()
  2025-05-29 19:54   ` Mina Almasry
@ 2025-05-29 20:49     ` Pavel Begunkov
  2025-05-30  1:16     ` Byungchul Park
  1 sibling, 0 replies; 28+ messages in thread
From: Pavel Begunkov @ 2025-05-29 20:49 UTC (permalink / raw)
  To: Mina Almasry, Byungchul Park
  Cc: willy, netdev, linux-kernel, linux-mm, kernel_team, kuba,
	ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
	andrew+netdev, toke, tariqt, edumazet, pabeni, saeedm, leon, ast,
	daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka, rppt,
	surenb, mhocko, horms, linux-rdma, bpf, vishal.moola

On 5/29/25 20:54, Mina Almasry wrote:
...>>   #endif /* _LINUX_MM_H */
>> diff --git a/include/net/netmem.h b/include/net/netmem.h
>> index f05a8b008d00..9e4ed3530788 100644
>> --- a/include/net/netmem.h
>> +++ b/include/net/netmem.h
>> @@ -53,6 +53,20 @@ NETMEM_DESC_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
>>    */
>>   static_assert(sizeof(struct netmem_desc) <= offsetof(struct page, _refcount));
>>
>> +#ifdef CONFIG_PAGE_POOL
>> +static inline bool page_pool_page_is_pp(struct page *page)
>> +{
>> +       struct netmem_desc *desc = (__force struct netmem_desc *)page;
>> +
> 
> Is it expected that page can be cast to netmem_desc freely? I know it
> works now since netmem_desc and page have the same layout, but how is
> it going to continue to work when page is shrunk and no longer has
> 'pp_magic' inside of it? Is that series going to fixup all the places
> where casts are done?

It's expected that struct page will have a type field once it's shrunk.


> Is it also allowed that we can static cast netmem_desc to page?
> 
> Consider creating netmem_desc_page helper like ptdesc_page.
> 
> I'm not sure the __force is needed too.
> 

-- 
Pavel Begunkov


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC v3 01/18] netmem: introduce struct netmem_desc mirroring struct page
  2025-05-29 16:31   ` Mina Almasry
@ 2025-05-30  1:10     ` Byungchul Park
  2025-05-30 17:50       ` Mina Almasry
  0 siblings, 1 reply; 28+ messages in thread
From: Byungchul Park @ 2025-05-30  1:10 UTC (permalink / raw)
  To: Mina Almasry
  Cc: willy, netdev, linux-kernel, linux-mm, kernel_team, kuba,
	ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
	andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
	saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
	vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
	vishal.moola

On Thu, May 29, 2025 at 09:31:40AM -0700, Mina Almasry wrote:
> On Wed, May 28, 2025 at 8:11 PM Byungchul Park <byungchul@sk.com> wrote:
> >  struct net_iov {
> > -       enum net_iov_type type;
> > -       unsigned long pp_magic;
> > -       struct page_pool *pp;
> > -       struct net_iov_area *owner;
> > -       unsigned long dma_addr;
> > -       atomic_long_t pp_ref_count;
> > +       union {
> > +               struct netmem_desc desc;
> > +
> > +               /* XXX: The following part should be removed once all
> > +                * the references to them are converted so as to be
> > +                * accessed via netmem_desc e.g. niov->desc.pp instead
> > +                * of niov->pp.
> > +                *
> > +                * Plus, once struct netmem_desc has it own instance
> > +                * from slab, network's fields of the following can be
> > +                * moved out of struct netmem_desc like:
> > +                *
> > +                *    struct net_iov {
> > +                *       struct netmem_desc desc;
> > +                *       struct net_iov_area *owner;
> > +                *       ...
> > +                *    };
> > +                */
> 
> We do not need to wait until netmem_desc has its own instance from
> slab to move the net_iov-specific fields out of netmem_desc. We can do
> that now, because there are no size restrictions on net_iov.

Got it.  Thanks for the explanation.

> So, I recommend change this to:
> 
> struct net_iov {
>   /* Union for anonymous aliasing: */
>   union {
>     struct netmem_desc desc;
>     struct {
>        unsigned long _flags;
>        unsigned long pp_magic;
>        struct page_pool *pp;
>        unsigned long _pp_mapping_pad;
>        unsigned long dma_addr;
>        atomic_long_t pp_ref_count;
>     };
>     struct net_iov_area *owner;
>     enum net_iov_type type;
> };

Do you mean?

  struct net_iov {
    /* Union for anonymous aliasing: */
    union {
      struct netmem_desc desc;
      struct {
         unsigned long _flags;
         unsigned long pp_magic;
         struct page_pool *pp;
         unsigned long _pp_mapping_pad;
         unsigned long dma_addr;
         atomic_long_t pp_ref_count;
      };
    };
    struct net_iov_area *owner;
    enum net_iov_type type;
  };

Right?  If so, I will.

> >  struct net_iov_area {
> > @@ -48,27 +110,22 @@ struct net_iov_area {
> >         unsigned long base_virtual;
> >  };
> >
> > -/* These fields in struct page are used by the page_pool and net stack:
> > +/* net_iov is union'ed with struct netmem_desc mirroring struct page, so
> > + * the page_pool can access these fields without worrying whether the
> > + * underlying fields are accessed via netmem_desc or directly via
> > + * net_iov, until all the references to them are converted so as to be
> > + * accessed via netmem_desc e.g. niov->desc.pp instead of niov->pp.
> >   *
> > - *        struct {
> > - *                unsigned long pp_magic;
> > - *                struct page_pool *pp;
> > - *                unsigned long _pp_mapping_pad;
> > - *                unsigned long dma_addr;
> > - *                atomic_long_t pp_ref_count;
> > - *        };
> > - *
> > - * We mirror the page_pool fields here so the page_pool can access these fields
> > - * without worrying whether the underlying fields belong to a page or net_iov.
> > - *
> > - * The non-net stack fields of struct page are private to the mm stack and must
> > - * never be mirrored to net_iov.
> > + * The non-net stack fields of struct page are private to the mm stack
> > + * and must never be mirrored to net_iov.
> >   */
> > -#define NET_IOV_ASSERT_OFFSET(pg, iov)             \
> > -       static_assert(offsetof(struct page, pg) == \
> > +#define NET_IOV_ASSERT_OFFSET(desc, iov)                    \
> > +       static_assert(offsetof(struct netmem_desc, desc) == \
> >                       offsetof(struct net_iov, iov))
> > +NET_IOV_ASSERT_OFFSET(_flags, type);
> 
> Remove this assertion.

I will.

> 
> >  NET_IOV_ASSERT_OFFSET(pp_magic, pp_magic);
> >  NET_IOV_ASSERT_OFFSET(pp, pp);
> > +NET_IOV_ASSERT_OFFSET(_pp_mapping_pad, owner);
> 
> And this one.

I will.

> (_flags, type) and (_pp_mapping_pad, owner) have very different
> semantics and usage, so we should not assert that they are at the same
> offset.  However, (pp, pp) and (pp_magic, pp_magic) have the same
> semantics and usage, so we do assert that they are at the same offset.
> 
> Code is allowed to access __netmem_clear_lsb(netmem)->pp or
> __netmem_clear_lsb(netmem)->pp_magic without caring what's the
> underlying memory type because both fields have the same semantics and
> usage.
> 
> Code should *not* assume it can access
> __netmem_clear_lsb(netmem)->owner or __netmem_clear_lsb(netmem)->type
> without doing a check whether the underlying memory is
> page/netmem_desc or net_iov. These fields are only usable for net_iov,

Sounds good.  Thanks.

	Byungchul

> so let's explicitly move them to a different place.
> 
> -- 
> Thanks,
> Mina

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC v3 18/18] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp()
  2025-05-29 19:54   ` Mina Almasry
  2025-05-29 20:49     ` Pavel Begunkov
@ 2025-05-30  1:16     ` Byungchul Park
  1 sibling, 0 replies; 28+ messages in thread
From: Byungchul Park @ 2025-05-30  1:16 UTC (permalink / raw)
  To: Mina Almasry
  Cc: willy, netdev, linux-kernel, linux-mm, kernel_team, kuba,
	ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
	andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
	saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
	vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
	vishal.moola

On Thu, May 29, 2025 at 12:54:31PM -0700, Mina Almasry wrote:
> On Wed, May 28, 2025 at 8:11 PM Byungchul Park <byungchul@sk.com> wrote:
> >
> > To simplify struct page, each of its users needs its own descriptor
> > separated out of struct page; for the page pool, that work is ongoing.
> >
> > To achieve that, all the code should avoid directly accessing page pool
> > members of struct page.
> >
> > Access ->pp_magic through struct netmem_desc instead of directly
> > accessing it through struct page in page_pool_page_is_pp().  Also, move
> > page_pool_page_is_pp() from mm.h to netmem.h so it can use struct
> > netmem_desc without header dependency issues.
> >
> > Signed-off-by: Byungchul Park <byungchul@sk.com>
> > ---
> >  include/linux/mm.h   | 12 ------------
> >  include/net/netmem.h | 14 ++++++++++++++
> >  mm/page_alloc.c      |  1 +
> >  3 files changed, 15 insertions(+), 12 deletions(-)
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 8dc012e84033..de10ad386592 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -4311,16 +4311,4 @@ int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status);
> >   */
> >  #define PP_MAGIC_MASK ~(PP_DMA_INDEX_MASK | 0x3UL)
> >
> > -#ifdef CONFIG_PAGE_POOL
> > -static inline bool page_pool_page_is_pp(struct page *page)
> > -{
> > -       return (page->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
> > -}
> > -#else
> > -static inline bool page_pool_page_is_pp(struct page *page)
> > -{
> > -       return false;
> > -}
> > -#endif
> > -
> >  #endif /* _LINUX_MM_H */
> > diff --git a/include/net/netmem.h b/include/net/netmem.h
> > index f05a8b008d00..9e4ed3530788 100644
> > --- a/include/net/netmem.h
> > +++ b/include/net/netmem.h
> > @@ -53,6 +53,20 @@ NETMEM_DESC_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
> >   */
> >  static_assert(sizeof(struct netmem_desc) <= offsetof(struct page, _refcount));
> >
> > +#ifdef CONFIG_PAGE_POOL
> > +static inline bool page_pool_page_is_pp(struct page *page)
> > +{
> > +       struct netmem_desc *desc = (__force struct netmem_desc *)page;
> > +
> 
> Is it expected that page can be cast to netmem_desc freely? I know it
> works now since netmem_desc and page have the same layout, but how is
> it going to continue to work when page is shrunk and no longer has

This cast will be updated once struct netmem_desc has its own instance
allocated from slab.  As Pavel mentioned, it will have to be done another
way at that point.

> 'pp_magic' inside of it? Is that series going to fixup all the places
> where casts are done?
> 
> Is it also allowed that we can static cast netmem_desc to page?
> 
> Consider creating a netmem_desc_page helper like ptdesc_page.

Do we need to cast netmem_desc to page?

> 
> I'm not sure the __force is needed too.

Ah, I will remove it.

	Byungchul
> 
> -- 
> Thanks,
> Mina

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC v3 00/18] Split netmem from struct page
  2025-05-29  3:29 ` [RFC v3 00/18] Split netmem from struct page Byungchul Park
@ 2025-05-30 15:04   ` Alexander Lobakin
  0 siblings, 0 replies; 28+ messages in thread
From: Alexander Lobakin @ 2025-05-30 15:04 UTC (permalink / raw)
  To: Byungchul Park, anthony.l.nguyen
  Cc: willy, netdev, linux-kernel, linux-mm, kernel_team, kuba,
	almasrymina, ilias.apalodimas, harry.yoo, hawk, akpm, davem,
	john.fastabend, andrew+netdev, asml.silence, toke, tariqt,
	edumazet, pabeni, saeedm, leon, ast, daniel, david,
	lorenzo.stoakes, Liam.Howlett, vbabka, rppt, surenb, mhocko,
	horms, linux-rdma, bpf, vishal.moola

From: Byungchul Park <byungchul@sk.com>
Date: Thu, 29 May 2025 12:29:03 +0900

> On Thu, May 29, 2025 at 12:10:29PM +0900, Byungchul Park wrote:
>> The MM subsystem is trying to reduce struct page to a single pointer.
>> The first step towards that is splitting struct page by its individual
>> users, as has already been done with folio and slab.  This patchset does
>> that for netmem which is used for page pools.
>>
>> Matthew Wilcox tried and stopped the same work, you can see in:
>>
>>    https://lore.kernel.org/linux-mm/20230111042214.907030-1-willy@infradead.org/
>>
>> Mina Almasry has already done a lot of the prerequisite work by luck, he
>> said :).  I stacked my patches on top of his work, i.e. netmem.
>>
>> This time, I focused on removing the page pool members from struct page,
>> not on moving the page pool allocation code from net to mm.  That can be
>> done later if needed.
>>
>> The final patch removing the page pool fields will be submitted once
>> all the page-to-netmem conversion work is done:
>>
>>    1. conversion of libeth_fqe by Tony Nguyen.

libeth_fqe will be fully converted to netmem once this PR is accepted:
[1], see the first patch of the series.
It didn't make it into this window as Jakub had a couple of last-minute
questions, but I hope it will be merged to net-next within the first
couple of weeks after net-next reopens.

>>    2. conversion of mlx5 by Tariq Toukan.

[1]
https://lore.kernel.org/netdev/20250520205920.2134829-1-anthony.l.nguyen@intel.com

Thanks,
Olek

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC v3 01/18] netmem: introduce struct netmem_desc mirroring struct page
  2025-05-30  1:10     ` Byungchul Park
@ 2025-05-30 17:50       ` Mina Almasry
  2025-06-03  9:22         ` Stefan Metzmacher
  0 siblings, 1 reply; 28+ messages in thread
From: Mina Almasry @ 2025-05-30 17:50 UTC (permalink / raw)
  To: Byungchul Park
  Cc: willy, netdev, linux-kernel, linux-mm, kernel_team, kuba,
	ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
	andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
	saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
	vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
	vishal.moola

On Thu, May 29, 2025 at 6:10 PM Byungchul Park <byungchul@sk.com> wrote:
>
> On Thu, May 29, 2025 at 09:31:40AM -0700, Mina Almasry wrote:
> > On Wed, May 28, 2025 at 8:11 PM Byungchul Park <byungchul@sk.com> wrote:
> > >  struct net_iov {
> > > -       enum net_iov_type type;
> > > -       unsigned long pp_magic;
> > > -       struct page_pool *pp;
> > > -       struct net_iov_area *owner;
> > > -       unsigned long dma_addr;
> > > -       atomic_long_t pp_ref_count;
> > > +       union {
> > > +               struct netmem_desc desc;
> > > +
> > > +               /* XXX: The following part should be removed once all
> > > +                * the references to them are converted so as to be
> > > +                * accessed via netmem_desc e.g. niov->desc.pp instead
> > > +                * of niov->pp.
> > > +                *
> > > +                * Plus, once struct netmem_desc has its own instance
> > > +                * from slab, network's fields of the following can be
> > > +                * moved out of struct netmem_desc like:
> > > +                *
> > > +                *    struct net_iov {
> > > +                *       struct netmem_desc desc;
> > > +                *       struct net_iov_area *owner;
> > > +                *       ...
> > > +                *    };
> > > +                */
> >
> > We do not need to wait until netmem_desc has its own instance from
> > slab to move the net_iov-specific fields out of netmem_desc. We can do
> > that now, because there are no size restrictions on net_iov.
>
> Got it.  Thanks for the explanation.
>
> > So, I recommend change this to:
> >
> > struct net_iov {
> >   /* Union for anonymous aliasing: */
> >   union {
> >     struct netmem_desc desc;
> >     struct {
> >        unsigned long _flags;
> >        unsigned long pp_magic;
> >        struct page_pool *pp;
> >        unsigned long _pp_mapping_pad;
> >        unsigned long dma_addr;
> >        atomic_long_t pp_ref_count;
> >     };
> >     struct net_iov_area *owner;
> >     enum net_iov_type type;
> > };
>
> Do you mean?
>
>   struct net_iov {
>     /* Union for anonymous aliasing: */
>     union {
>       struct netmem_desc desc;
>       struct {
>          unsigned long _flags;
>          unsigned long pp_magic;
>          struct page_pool *pp;
>          unsigned long _pp_mapping_pad;
>          unsigned long dma_addr;
>          atomic_long_t pp_ref_count;
>       };
>     };
>     struct net_iov_area *owner;
>     enum net_iov_type type;
>   };
>
> Right?  If so, I will.
>

Yes, sounds good.

Also, having a union with the same fields for anonymous aliasing can be
error-prone if someone updates netmem_desc and forgets to update the
mirror in struct net_iov.  If you can think of a way to deal with that,
great; if not, let's maybe put a comment on top of struct netmem_desc:

/* Do not update the fields in netmem_desc without also updating the
anonymous aliasing union in struct net_iov */.

Or something like that.

-- 
Thanks,
Mina

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC v3 01/18] netmem: introduce struct netmem_desc mirroring struct page
  2025-05-30 17:50       ` Mina Almasry
@ 2025-06-03  9:22         ` Stefan Metzmacher
  0 siblings, 0 replies; 28+ messages in thread
From: Stefan Metzmacher @ 2025-06-03  9:22 UTC (permalink / raw)
  To: Mina Almasry, Byungchul Park
  Cc: willy, netdev, linux-kernel, linux-mm, kernel_team, kuba,
	ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
	andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
	saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
	vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
	vishal.moola

Hi Mina,

>> Do you mean?
>>
>>    struct net_iov {
>>      /* Union for anonymous aliasing: */
>>      union {
>>        struct netmem_desc desc;
>>        struct {
>>           unsigned long _flags;
>>           unsigned long pp_magic;
>>           struct page_pool *pp;
>>           unsigned long _pp_mapping_pad;
>>           unsigned long dma_addr;
>>           atomic_long_t pp_ref_count;
>>        };
>>      };
>>      struct net_iov_area *owner;
>>      enum net_iov_type type;
>>    };
>>
>> Right?  If so, I will.
>>
> 
> Yes, sounds good.
> 
> Also, having a union with the same fields for anonymous aliasing can be
> error-prone if someone updates netmem_desc and forgets to update the
> mirror in struct net_iov.  If you can think of a way to deal with that,
> great; if not, let's maybe put a comment on top of struct netmem_desc:

I haven't looked at the patch in detail, but to me it sounds
a bit like the checks io_uring_init is doing.

I hope this is in some way helpful here :-)

metze


^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2025-06-03  9:23 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-05-29  3:10 [RFC v3 00/18] Split netmem from struct page Byungchul Park
2025-05-29  3:10 ` [RFC v3 01/18] netmem: introduce struct netmem_desc mirroring " Byungchul Park
2025-05-29 16:31   ` Mina Almasry
2025-05-30  1:10     ` Byungchul Park
2025-05-30 17:50       ` Mina Almasry
2025-06-03  9:22         ` Stefan Metzmacher
2025-05-29  3:10 ` [RFC v3 02/18] netmem: introduce netmem alloc APIs to wrap page alloc APIs Byungchul Park
2025-05-29  3:10 ` [RFC v3 03/18] page_pool: use netmem alloc/put APIs in __page_pool_alloc_page_order() Byungchul Park
2025-05-29  3:10 ` [RFC v3 04/18] page_pool: rename __page_pool_alloc_page_order() to __page_pool_alloc_netmem_order() Byungchul Park
2025-05-29  3:10 ` [RFC v3 05/18] page_pool: use netmem alloc/put APIs in __page_pool_alloc_pages_slow() Byungchul Park
2025-05-29  3:10 ` [RFC v3 06/18] page_pool: rename page_pool_return_page() to page_pool_return_netmem() Byungchul Park
2025-05-29  3:10 ` [RFC v3 07/18] page_pool: use netmem put API in page_pool_return_netmem() Byungchul Park
2025-05-29  3:10 ` [RFC v3 08/18] page_pool: rename __page_pool_release_page_dma() to __page_pool_release_netmem_dma() Byungchul Park
2025-05-29  3:10 ` [RFC v3 09/18] page_pool: rename __page_pool_put_page() to __page_pool_put_netmem() Byungchul Park
2025-05-29  3:10 ` [RFC v3 10/18] page_pool: rename __page_pool_alloc_pages_slow() to __page_pool_alloc_netmems_slow() Byungchul Park
2025-05-29  3:10 ` [RFC v3 11/18] mlx4: use netmem descriptor and APIs for page pool Byungchul Park
2025-05-29  3:10 ` [RFC v3 12/18] netmem: use _Generic to cover const casting for page_to_netmem() Byungchul Park
2025-05-29  3:10 ` [RFC v3 13/18] netmem: remove __netmem_get_pp() Byungchul Park
2025-05-29  3:10 ` [RFC v3 14/18] page_pool: make page_pool_get_dma_addr() just wrap page_pool_get_dma_addr_netmem() Byungchul Park
2025-05-29  3:10 ` [RFC v3 15/18] netdevsim: use netmem descriptor and APIs for page pool Byungchul Park
2025-05-29  3:10 ` [RFC v3 16/18] netmem: introduce a netmem API, virt_to_head_netmem() Byungchul Park
2025-05-29  3:10 ` [RFC v3 17/18] mt76: use netmem descriptor and APIs for page pool Byungchul Park
2025-05-29  3:10 ` [RFC v3 18/18] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp() Byungchul Park
2025-05-29 19:54   ` Mina Almasry
2025-05-29 20:49     ` Pavel Begunkov
2025-05-30  1:16     ` Byungchul Park
2025-05-29  3:29 ` [RFC v3 00/18] Split netmem from struct page Byungchul Park
2025-05-30 15:04   ` Alexander Lobakin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).