* [RFC v4 00/18] Split netmem from struct page
@ 2025-06-04 2:52 Byungchul Park
2025-06-04 2:52 ` [RFC v4 01/18] netmem: introduce struct netmem_desc mirroring " Byungchul Park
` (18 more replies)
0 siblings, 19 replies; 65+ messages in thread
From: Byungchul Park @ 2025-06-04 2:52 UTC (permalink / raw)
To: willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola
The MM subsystem is trying to reduce struct page to a single pointer.
The first step towards that is splitting struct page by its individual
users, as has already been done with folio and slab. This patchset does
that for netmem which is used for page pools.
Matthew Wilcox previously attempted the same work but stopped; see:
https://lore.kernel.org/linux-mm/20230111042214.907030-1-willy@infradead.org/
Fortunately, Mina Almasry has already done a lot of the prerequisite
work. I stacked my patches on top of his work, i.e. netmem.
This time, I focused on removing the page pool members from struct
page, not on moving the page pool allocation code from net to mm. That
can be done later if needed.
The final patch removing the page pool fields will be submitted once
all the page-to-netmem conversion work is done:
1. conversion of libeth_fqe by Tony Nguyen.
2. conversion of mlx5 by Tariq Toukan.
3. conversion of prueth_swdata (on me).
4. conversion of freescale driver (on me).
For discussion, I'm sharing below what the final patch looks like.
Byungchul
--8<--
commit 1847d9890f798456b21ccb27aac7545303048492
Author: Byungchul Park <byungchul@sk.com>
Date: Wed May 28 20:44:55 2025 +0900
mm, netmem: remove the page pool members in struct page
Now that all the users of the page pool members in struct page are
gone, the members can be removed from struct page.
However, since struct netmem_desc still uses the space in struct page,
the important offsets should be checked until struct netmem_desc gets
its own instance from slab.
Remove the page pool members in struct page and modify static checkers
for the offsets.
Signed-off-by: Byungchul Park <byungchul@sk.com>
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 32ba5126e221..db2fe0d0ebbf 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -120,17 +120,6 @@ struct page {
*/
unsigned long private;
};
- struct { /* page_pool used by netstack */
- /**
- * @pp_magic: magic value to avoid recycling non
- * page_pool allocated pages.
- */
- unsigned long pp_magic;
- struct page_pool *pp;
- unsigned long _pp_mapping_pad;
- unsigned long dma_addr;
- atomic_long_t pp_ref_count;
- };
struct { /* Tail pages of compound page */
unsigned long compound_head; /* Bit zero is set */
};
diff --git a/include/net/netmem.h b/include/net/netmem.h
index 8f354ae7d5c3..3414f184d018 100644
--- a/include/net/netmem.h
+++ b/include/net/netmem.h
@@ -42,11 +42,8 @@ struct netmem_desc {
static_assert(offsetof(struct page, pg) == \
offsetof(struct netmem_desc, desc))
NETMEM_DESC_ASSERT_OFFSET(flags, _flags);
-NETMEM_DESC_ASSERT_OFFSET(pp_magic, pp_magic);
-NETMEM_DESC_ASSERT_OFFSET(pp, pp);
-NETMEM_DESC_ASSERT_OFFSET(_pp_mapping_pad, _pp_mapping_pad);
-NETMEM_DESC_ASSERT_OFFSET(dma_addr, dma_addr);
-NETMEM_DESC_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
+NETMEM_DESC_ASSERT_OFFSET(lru, pp_magic);
+NETMEM_DESC_ASSERT_OFFSET(mapping, _pp_mapping_pad);
#undef NETMEM_DESC_ASSERT_OFFSET
/*
---
Changes from v3:
1. Relocate ->owner and ->type of net_iov out of netmem_desc
and make them net_iov specific.
2. Remove __force when casting struct page to struct netmem_desc.
Changes from v2:
1. Introduce a netmem API, virt_to_head_netmem(), and use it
where needed.
2. Introduce struct netmem_desc as a new struct and union'ed
with the existing fields in struct net_iov.
3. Make page_pool_page_is_pp() access ->pp_magic through struct
netmem_desc instead of struct page.
4. Move netmem alloc APIs from include/net/netmem.h to
net/core/netmem_priv.h.
5. Apply trivial feedbacks, thanks to Mina, Pavel, and Toke.
6. Add given 'Reviewed-by's, thanks to Mina.
Changes from v1:
1. Rebase on net-next's main as of May 26.
2. Run checkpatch.pl, as suggested by SJ Park.
3. Add page-to-netmem conversion in mt76.
4. Revert 'mlx5: use netmem descriptor and APIs for page pool'
since it's on-going by Tariq Toukan. I will wait for his
work to be done.
5. Revert 'page_pool: use netmem APIs to access page->pp_magic
in page_pool_page_is_pp()' since we need more discussion.
6. Revert 'mm, netmem: remove the page pool members in struct
page' since there are some prerequisite works to remove the
page pool fields from struct page. I can submit this patch
separately later.
7. Cancel relocating a page pool member in struct page.
8. Modify the static asserts for the offsets and size of struct
netmem_desc.
Changes from rfc:
1. Rebase on net-next's main branch.
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/
2. Fix a build error reported by kernel test robot.
https://lore.kernel.org/all/202505100932.uzAMBW1y-lkp@intel.com/
3. Add given 'Reviewed-by's, thanks to Mina and Ilias.
4. Do static_assert() on the size of struct netmem_desc instead
of placing a place-holder in struct page, as suggested by
Matthew.
5. Do struct_group_tagged(netmem_desc) on struct net_iov instead
of wholly renaming it to struct netmem_desc, as suggested by
Mina and Pavel.
Byungchul Park (18):
netmem: introduce struct netmem_desc mirroring struct page
netmem: introduce netmem alloc APIs to wrap page alloc APIs
page_pool: use netmem alloc/put APIs in __page_pool_alloc_page_order()
page_pool: rename __page_pool_alloc_page_order() to
__page_pool_alloc_netmem_order()
page_pool: use netmem alloc/put APIs in __page_pool_alloc_pages_slow()
page_pool: rename page_pool_return_page() to page_pool_return_netmem()
page_pool: use netmem put API in page_pool_return_netmem()
page_pool: rename __page_pool_release_page_dma() to
__page_pool_release_netmem_dma()
page_pool: rename __page_pool_put_page() to __page_pool_put_netmem()
page_pool: rename __page_pool_alloc_pages_slow() to
__page_pool_alloc_netmems_slow()
mlx4: use netmem descriptor and APIs for page pool
netmem: use _Generic to cover const casting for page_to_netmem()
netmem: remove __netmem_get_pp()
page_pool: make page_pool_get_dma_addr() just wrap
page_pool_get_dma_addr_netmem()
netdevsim: use netmem descriptor and APIs for page pool
netmem: introduce a netmem API, virt_to_head_netmem()
mt76: use netmem descriptor and APIs for page pool
page_pool: access ->pp_magic through struct netmem_desc in
page_pool_page_is_pp()
drivers/net/ethernet/mellanox/mlx4/en_rx.c | 48 +++---
drivers/net/ethernet/mellanox/mlx4/en_tx.c | 8 +-
drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 4 +-
drivers/net/netdevsim/netdev.c | 19 +--
drivers/net/netdevsim/netdevsim.h | 2 +-
drivers/net/wireless/mediatek/mt76/dma.c | 6 +-
drivers/net/wireless/mediatek/mt76/mt76.h | 12 +-
.../net/wireless/mediatek/mt76/sdio_txrx.c | 24 +--
drivers/net/wireless/mediatek/mt76/usb.c | 10 +-
include/linux/mm.h | 12 --
include/net/netmem.h | 138 ++++++++++++------
include/net/page_pool/helpers.h | 7 +-
mm/page_alloc.c | 1 +
net/core/netmem_priv.h | 14 ++
net/core/page_pool.c | 103 ++++++-------
15 files changed, 234 insertions(+), 174 deletions(-)
base-commit: 90b83efa6701656e02c86e7df2cb1765ea602d07
--
2.17.1
^ permalink raw reply related [flat|nested] 65+ messages in thread
* [RFC v4 01/18] netmem: introduce struct netmem_desc mirroring struct page
2025-06-04 2:52 [RFC v4 00/18] Split netmem from struct page Byungchul Park
@ 2025-06-04 2:52 ` Byungchul Park
2025-06-04 16:53 ` Toke Høiland-Jørgensen
` (2 more replies)
2025-06-04 2:52 ` [RFC v4 02/18] netmem: introduce netmem alloc APIs to wrap page alloc APIs Byungchul Park
` (17 subsequent siblings)
18 siblings, 3 replies; 65+ messages in thread
From: Byungchul Park @ 2025-06-04 2:52 UTC (permalink / raw)
To: willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola
To simplify struct page, the page pool members of struct page should be
moved elsewhere, allowing them to be removed from struct page.
Introduce a network memory descriptor, struct netmem_desc, to store
those members, and union it with the existing fields in struct net_iov,
allowing the fields of struct net_iov to be organized.
Signed-off-by: Byungchul Park <byungchul@sk.com>
---
include/net/netmem.h | 94 ++++++++++++++++++++++++++++++++++----------
1 file changed, 73 insertions(+), 21 deletions(-)
diff --git a/include/net/netmem.h b/include/net/netmem.h
index 386164fb9c18..2687c8051ca5 100644
--- a/include/net/netmem.h
+++ b/include/net/netmem.h
@@ -12,6 +12,50 @@
#include <linux/mm.h>
#include <net/net_debug.h>
+/* These fields in struct page are used by the page_pool and net stack:
+ *
+ * struct {
+ * unsigned long pp_magic;
+ * struct page_pool *pp;
+ * unsigned long _pp_mapping_pad;
+ * unsigned long dma_addr;
+ * atomic_long_t pp_ref_count;
+ * };
+ *
+ * We mirror the page_pool fields here so the page_pool can access these
+ * fields without worrying whether the underlying fields belong to a
+ * page or netmem_desc.
+ *
+ * CAUTION: Do not update the fields in netmem_desc without also
+ * updating the anonymous aliasing union in struct net_iov.
+ */
+struct netmem_desc {
+ unsigned long _flags;
+ unsigned long pp_magic;
+ struct page_pool *pp;
+ unsigned long _pp_mapping_pad;
+ unsigned long dma_addr;
+ atomic_long_t pp_ref_count;
+};
+
+#define NETMEM_DESC_ASSERT_OFFSET(pg, desc) \
+ static_assert(offsetof(struct page, pg) == \
+ offsetof(struct netmem_desc, desc))
+NETMEM_DESC_ASSERT_OFFSET(flags, _flags);
+NETMEM_DESC_ASSERT_OFFSET(pp_magic, pp_magic);
+NETMEM_DESC_ASSERT_OFFSET(pp, pp);
+NETMEM_DESC_ASSERT_OFFSET(_pp_mapping_pad, _pp_mapping_pad);
+NETMEM_DESC_ASSERT_OFFSET(dma_addr, dma_addr);
+NETMEM_DESC_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
+#undef NETMEM_DESC_ASSERT_OFFSET
+
+/*
+ * Since struct netmem_desc uses the space in struct page, the size
+ * should be checked, until struct netmem_desc has its own instance from
+ * slab, to avoid conflicting with other members within struct page.
+ */
+static_assert(sizeof(struct netmem_desc) <= offsetof(struct page, _refcount));
+
/* net_iov */
DECLARE_STATIC_KEY_FALSE(page_pool_mem_providers);
@@ -31,12 +75,25 @@ enum net_iov_type {
};
struct net_iov {
- enum net_iov_type type;
- unsigned long pp_magic;
- struct page_pool *pp;
+ union {
+ struct netmem_desc desc;
+
+ /* XXX: The following part should be removed once all
+ * the references to them are converted so as to be
+ * accessed via netmem_desc e.g. niov->desc.pp instead
+ * of niov->pp.
+ */
+ struct {
+ unsigned long _flags;
+ unsigned long pp_magic;
+ struct page_pool *pp;
+ unsigned long _pp_mapping_pad;
+ unsigned long dma_addr;
+ atomic_long_t pp_ref_count;
+ };
+ };
struct net_iov_area *owner;
- unsigned long dma_addr;
- atomic_long_t pp_ref_count;
+ enum net_iov_type type;
};
struct net_iov_area {
@@ -48,27 +105,22 @@ struct net_iov_area {
unsigned long base_virtual;
};
-/* These fields in struct page are used by the page_pool and net stack:
+/* net_iov is union'ed with struct netmem_desc mirroring struct page, so
+ * the page_pool can access these fields without worrying whether the
+ * underlying fields are accessed via netmem_desc or directly via
+ * net_iov, until all the references to them are converted so as to be
+ * accessed via netmem_desc e.g. niov->desc.pp instead of niov->pp.
*
- * struct {
- * unsigned long pp_magic;
- * struct page_pool *pp;
- * unsigned long _pp_mapping_pad;
- * unsigned long dma_addr;
- * atomic_long_t pp_ref_count;
- * };
- *
- * We mirror the page_pool fields here so the page_pool can access these fields
- * without worrying whether the underlying fields belong to a page or net_iov.
- *
- * The non-net stack fields of struct page are private to the mm stack and must
- * never be mirrored to net_iov.
+ * The non-net stack fields of struct page are private to the mm stack
+ * and must never be mirrored to net_iov.
*/
-#define NET_IOV_ASSERT_OFFSET(pg, iov) \
- static_assert(offsetof(struct page, pg) == \
+#define NET_IOV_ASSERT_OFFSET(desc, iov) \
+ static_assert(offsetof(struct netmem_desc, desc) == \
offsetof(struct net_iov, iov))
+NET_IOV_ASSERT_OFFSET(_flags, _flags);
NET_IOV_ASSERT_OFFSET(pp_magic, pp_magic);
NET_IOV_ASSERT_OFFSET(pp, pp);
+NET_IOV_ASSERT_OFFSET(_pp_mapping_pad, _pp_mapping_pad);
NET_IOV_ASSERT_OFFSET(dma_addr, dma_addr);
NET_IOV_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
#undef NET_IOV_ASSERT_OFFSET
--
2.17.1
* [RFC v4 02/18] netmem: introduce netmem alloc APIs to wrap page alloc APIs
2025-06-04 2:52 [RFC v4 00/18] Split netmem from struct page Byungchul Park
2025-06-04 2:52 ` [RFC v4 01/18] netmem: introduce struct netmem_desc mirroring " Byungchul Park
@ 2025-06-04 2:52 ` Byungchul Park
2025-06-04 15:14 ` Suren Baghdasaryan
2025-06-05 10:05 ` Pavel Begunkov
2025-06-04 2:52 ` [RFC v4 03/18] page_pool: use netmem alloc/put APIs in __page_pool_alloc_page_order() Byungchul Park
` (16 subsequent siblings)
18 siblings, 2 replies; 65+ messages in thread
From: Byungchul Park @ 2025-06-04 2:52 UTC (permalink / raw)
To: willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola
To eliminate the use of struct page in page pool, the page pool code
should use the netmem descriptor and APIs instead.
As part of this work, introduce netmem alloc APIs so the code can use
them rather than the existing struct page APIs.
Signed-off-by: Byungchul Park <byungchul@sk.com>
---
net/core/netmem_priv.h | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/net/core/netmem_priv.h b/net/core/netmem_priv.h
index cd95394399b4..32e390908bb2 100644
--- a/net/core/netmem_priv.h
+++ b/net/core/netmem_priv.h
@@ -59,4 +59,18 @@ static inline void netmem_set_dma_index(netmem_ref netmem,
magic = netmem_get_pp_magic(netmem) | (id << PP_DMA_INDEX_SHIFT);
__netmem_clear_lsb(netmem)->pp_magic = magic;
}
+
+static inline netmem_ref alloc_netmems_node(int nid, gfp_t gfp_mask,
+ unsigned int order)
+{
+ return page_to_netmem(alloc_pages_node(nid, gfp_mask, order));
+}
+
+static inline unsigned long alloc_netmems_bulk_node(gfp_t gfp, int nid,
+ unsigned long nr_netmems,
+ netmem_ref *netmem_array)
+{
+ return alloc_pages_bulk_node(gfp, nid, nr_netmems,
+ (struct page **)netmem_array);
+}
#endif
--
2.17.1
* [RFC v4 03/18] page_pool: use netmem alloc/put APIs in __page_pool_alloc_page_order()
2025-06-04 2:52 [RFC v4 00/18] Split netmem from struct page Byungchul Park
2025-06-04 2:52 ` [RFC v4 01/18] netmem: introduce struct netmem_desc mirroring " Byungchul Park
2025-06-04 2:52 ` [RFC v4 02/18] netmem: introduce netmem alloc APIs to wrap page alloc APIs Byungchul Park
@ 2025-06-04 2:52 ` Byungchul Park
2025-06-04 16:54 ` Toke Høiland-Jørgensen
2025-06-05 10:26 ` Pavel Begunkov
2025-06-04 2:52 ` [RFC v4 04/18] page_pool: rename __page_pool_alloc_page_order() to __page_pool_alloc_netmem_order() Byungchul Park
` (15 subsequent siblings)
18 siblings, 2 replies; 65+ messages in thread
From: Byungchul Park @ 2025-06-04 2:52 UTC (permalink / raw)
To: willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola
In __page_pool_alloc_page_order(), use the netmem alloc/put APIs
instead of the page alloc/put APIs, and return netmem_ref instead of
struct page *.
Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
---
net/core/page_pool.c | 26 +++++++++++++-------------
1 file changed, 13 insertions(+), 13 deletions(-)
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 4011eb305cee..523354f2db1c 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -518,29 +518,29 @@ static bool page_pool_dma_map(struct page_pool *pool, netmem_ref netmem, gfp_t g
return false;
}
-static struct page *__page_pool_alloc_page_order(struct page_pool *pool,
- gfp_t gfp)
+static netmem_ref __page_pool_alloc_page_order(struct page_pool *pool,
+ gfp_t gfp)
{
- struct page *page;
+ netmem_ref netmem;
gfp |= __GFP_COMP;
- page = alloc_pages_node(pool->p.nid, gfp, pool->p.order);
- if (unlikely(!page))
- return NULL;
+ netmem = alloc_netmems_node(pool->p.nid, gfp, pool->p.order);
+ if (unlikely(!netmem))
+ return 0;
- if (pool->dma_map && unlikely(!page_pool_dma_map(pool, page_to_netmem(page), gfp))) {
- put_page(page);
- return NULL;
+ if (pool->dma_map && unlikely(!page_pool_dma_map(pool, netmem, gfp))) {
+ put_netmem(netmem);
+ return 0;
}
alloc_stat_inc(pool, slow_high_order);
- page_pool_set_pp_info(pool, page_to_netmem(page));
+ page_pool_set_pp_info(pool, netmem);
/* Track how many pages are held 'in-flight' */
pool->pages_state_hold_cnt++;
- trace_page_pool_state_hold(pool, page_to_netmem(page),
+ trace_page_pool_state_hold(pool, netmem,
pool->pages_state_hold_cnt);
- return page;
+ return netmem;
}
/* slow path */
@@ -555,7 +555,7 @@ static noinline netmem_ref __page_pool_alloc_pages_slow(struct page_pool *pool,
/* Don't support bulk alloc for high-order pages */
if (unlikely(pp_order))
- return page_to_netmem(__page_pool_alloc_page_order(pool, gfp));
+ return __page_pool_alloc_page_order(pool, gfp);
/* Unnecessary as alloc cache is empty, but guarantees zero count */
if (unlikely(pool->alloc.count > 0))
--
2.17.1
* [RFC v4 04/18] page_pool: rename __page_pool_alloc_page_order() to __page_pool_alloc_netmem_order()
2025-06-04 2:52 [RFC v4 00/18] Split netmem from struct page Byungchul Park
` (2 preceding siblings ...)
2025-06-04 2:52 ` [RFC v4 03/18] page_pool: use netmem alloc/put APIs in __page_pool_alloc_page_order() Byungchul Park
@ 2025-06-04 2:52 ` Byungchul Park
2025-06-04 16:54 ` Toke Høiland-Jørgensen
2025-06-05 10:28 ` Pavel Begunkov
2025-06-04 2:52 ` [RFC v4 05/18] page_pool: use netmem alloc/put APIs in __page_pool_alloc_pages_slow() Byungchul Park
` (14 subsequent siblings)
18 siblings, 2 replies; 65+ messages in thread
From: Byungchul Park @ 2025-06-04 2:52 UTC (permalink / raw)
To: willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola
Now that __page_pool_alloc_page_order() uses netmem alloc/put APIs, not
page alloc/put APIs, rename it to __page_pool_alloc_netmem_order() to
reflect what it does.
Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
---
net/core/page_pool.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 523354f2db1c..ff3d0d31263c 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -518,8 +518,8 @@ static bool page_pool_dma_map(struct page_pool *pool, netmem_ref netmem, gfp_t g
return false;
}
-static netmem_ref __page_pool_alloc_page_order(struct page_pool *pool,
- gfp_t gfp)
+static netmem_ref __page_pool_alloc_netmem_order(struct page_pool *pool,
+ gfp_t gfp)
{
netmem_ref netmem;
@@ -555,7 +555,7 @@ static noinline netmem_ref __page_pool_alloc_pages_slow(struct page_pool *pool,
/* Don't support bulk alloc for high-order pages */
if (unlikely(pp_order))
- return __page_pool_alloc_page_order(pool, gfp);
+ return __page_pool_alloc_netmem_order(pool, gfp);
/* Unnecessary as alloc cache is empty, but guarantees zero count */
if (unlikely(pool->alloc.count > 0))
--
2.17.1
* [RFC v4 05/18] page_pool: use netmem alloc/put APIs in __page_pool_alloc_pages_slow()
2025-06-04 2:52 [RFC v4 00/18] Split netmem from struct page Byungchul Park
` (3 preceding siblings ...)
2025-06-04 2:52 ` [RFC v4 04/18] page_pool: rename __page_pool_alloc_page_order() to __page_pool_alloc_netmem_order() Byungchul Park
@ 2025-06-04 2:52 ` Byungchul Park
2025-06-04 17:02 ` Toke Høiland-Jørgensen
2025-06-05 10:30 ` Pavel Begunkov
2025-06-04 2:52 ` [RFC v4 06/18] page_pool: rename page_pool_return_page() to page_pool_return_netmem() Byungchul Park
` (13 subsequent siblings)
18 siblings, 2 replies; 65+ messages in thread
From: Byungchul Park @ 2025-06-04 2:52 UTC (permalink / raw)
To: willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola
Use netmem alloc/put APIs instead of page alloc/put APIs in
__page_pool_alloc_pages_slow().
While at it, improve some comments.
Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
---
net/core/page_pool.c | 24 +++++++++++++-----------
1 file changed, 13 insertions(+), 11 deletions(-)
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index ff3d0d31263c..e80a637f0fa4 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -551,7 +551,7 @@ static noinline netmem_ref __page_pool_alloc_pages_slow(struct page_pool *pool,
unsigned int pp_order = pool->p.order;
bool dma_map = pool->dma_map;
netmem_ref netmem;
- int i, nr_pages;
+ int i, nr_netmems;
/* Don't support bulk alloc for high-order pages */
if (unlikely(pp_order))
@@ -561,21 +561,21 @@ static noinline netmem_ref __page_pool_alloc_pages_slow(struct page_pool *pool,
if (unlikely(pool->alloc.count > 0))
return pool->alloc.cache[--pool->alloc.count];
- /* Mark empty alloc.cache slots "empty" for alloc_pages_bulk */
+ /* Mark empty alloc.cache slots "empty" for alloc_netmems_bulk_node() */
memset(&pool->alloc.cache, 0, sizeof(void *) * bulk);
- nr_pages = alloc_pages_bulk_node(gfp, pool->p.nid, bulk,
- (struct page **)pool->alloc.cache);
- if (unlikely(!nr_pages))
+ nr_netmems = alloc_netmems_bulk_node(gfp, pool->p.nid, bulk,
+ pool->alloc.cache);
+ if (unlikely(!nr_netmems))
return 0;
- /* Pages have been filled into alloc.cache array, but count is zero and
- * page element have not been (possibly) DMA mapped.
+ /* Netmems have been filled into alloc.cache array, but count is
+ * zero and elements have not been (possibly) DMA mapped.
*/
- for (i = 0; i < nr_pages; i++) {
+ for (i = 0; i < nr_netmems; i++) {
netmem = pool->alloc.cache[i];
if (dma_map && unlikely(!page_pool_dma_map(pool, netmem, gfp))) {
- put_page(netmem_to_page(netmem));
+ put_netmem(netmem);
continue;
}
@@ -587,7 +587,7 @@ static noinline netmem_ref __page_pool_alloc_pages_slow(struct page_pool *pool,
pool->pages_state_hold_cnt);
}
- /* Return last page */
+ /* Return the last netmem */
if (likely(pool->alloc.count > 0)) {
netmem = pool->alloc.cache[--pool->alloc.count];
alloc_stat_inc(pool, slow);
@@ -595,7 +595,9 @@ static noinline netmem_ref __page_pool_alloc_pages_slow(struct page_pool *pool,
netmem = 0;
}
- /* When page just alloc'ed is should/must have refcnt 1. */
+ /* When a netmem has been just allocated, it should/must have
+ * refcnt 1.
+ */
return netmem;
}
--
2.17.1
* [RFC v4 06/18] page_pool: rename page_pool_return_page() to page_pool_return_netmem()
2025-06-04 2:52 [RFC v4 00/18] Split netmem from struct page Byungchul Park
` (4 preceding siblings ...)
2025-06-04 2:52 ` [RFC v4 05/18] page_pool: use netmem alloc/put APIs in __page_pool_alloc_pages_slow() Byungchul Park
@ 2025-06-04 2:52 ` Byungchul Park
2025-06-04 17:03 ` Toke Høiland-Jørgensen
2025-06-05 10:31 ` Pavel Begunkov
2025-06-04 2:52 ` [RFC v4 07/18] page_pool: use netmem put API in page_pool_return_netmem() Byungchul Park
` (12 subsequent siblings)
18 siblings, 2 replies; 65+ messages in thread
From: Byungchul Park @ 2025-06-04 2:52 UTC (permalink / raw)
To: willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola
Now that page_pool_return_page() is for returning netmem, not struct
page, rename it to page_pool_return_netmem() to reflect what it does.
Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
---
net/core/page_pool.c | 22 +++++++++++-----------
1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index e80a637f0fa4..b7680dcb83e4 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -371,7 +371,7 @@ struct page_pool *page_pool_create(const struct page_pool_params *params)
}
EXPORT_SYMBOL(page_pool_create);
-static void page_pool_return_page(struct page_pool *pool, netmem_ref netmem);
+static void page_pool_return_netmem(struct page_pool *pool, netmem_ref netmem);
static noinline netmem_ref page_pool_refill_alloc_cache(struct page_pool *pool)
{
@@ -409,7 +409,7 @@ static noinline netmem_ref page_pool_refill_alloc_cache(struct page_pool *pool)
* (2) break out to fallthrough to alloc_pages_node.
* This limit stress on page buddy alloactor.
*/
- page_pool_return_page(pool, netmem);
+ page_pool_return_netmem(pool, netmem);
alloc_stat_inc(pool, waive);
netmem = 0;
break;
@@ -714,7 +714,7 @@ static __always_inline void __page_pool_release_page_dma(struct page_pool *pool,
* a regular page (that will eventually be returned to the normal
* page-allocator via put_page).
*/
-void page_pool_return_page(struct page_pool *pool, netmem_ref netmem)
+static void page_pool_return_netmem(struct page_pool *pool, netmem_ref netmem)
{
int count;
bool put;
@@ -831,7 +831,7 @@ __page_pool_put_page(struct page_pool *pool, netmem_ref netmem,
* will be invoking put_page.
*/
recycle_stat_inc(pool, released_refcnt);
- page_pool_return_page(pool, netmem);
+ page_pool_return_netmem(pool, netmem);
return 0;
}
@@ -874,7 +874,7 @@ void page_pool_put_unrefed_netmem(struct page_pool *pool, netmem_ref netmem,
if (netmem && !page_pool_recycle_in_ring(pool, netmem)) {
/* Cache full, fallback to free pages */
recycle_stat_inc(pool, ring_full);
- page_pool_return_page(pool, netmem);
+ page_pool_return_netmem(pool, netmem);
}
}
EXPORT_SYMBOL(page_pool_put_unrefed_netmem);
@@ -917,7 +917,7 @@ static void page_pool_recycle_ring_bulk(struct page_pool *pool,
* since put_page() with refcnt == 1 can be an expensive operation.
*/
for (; i < bulk_len; i++)
- page_pool_return_page(pool, bulk[i]);
+ page_pool_return_netmem(pool, bulk[i]);
}
/**
@@ -1000,7 +1000,7 @@ static netmem_ref page_pool_drain_frag(struct page_pool *pool,
return netmem;
}
- page_pool_return_page(pool, netmem);
+ page_pool_return_netmem(pool, netmem);
return 0;
}
@@ -1014,7 +1014,7 @@ static void page_pool_free_frag(struct page_pool *pool)
if (!netmem || page_pool_unref_netmem(netmem, drain_count))
return;
- page_pool_return_page(pool, netmem);
+ page_pool_return_netmem(pool, netmem);
}
netmem_ref page_pool_alloc_frag_netmem(struct page_pool *pool,
@@ -1081,7 +1081,7 @@ static void page_pool_empty_ring(struct page_pool *pool)
pr_crit("%s() page_pool refcnt %d violation\n",
__func__, netmem_ref_count(netmem));
- page_pool_return_page(pool, netmem);
+ page_pool_return_netmem(pool, netmem);
}
}
@@ -1114,7 +1114,7 @@ static void page_pool_empty_alloc_cache_once(struct page_pool *pool)
*/
while (pool->alloc.count) {
netmem = pool->alloc.cache[--pool->alloc.count];
- page_pool_return_page(pool, netmem);
+ page_pool_return_netmem(pool, netmem);
}
}
@@ -1254,7 +1254,7 @@ void page_pool_update_nid(struct page_pool *pool, int new_nid)
/* Flush pool alloc cache, as refill will check NUMA node */
while (pool->alloc.count) {
netmem = pool->alloc.cache[--pool->alloc.count];
- page_pool_return_page(pool, netmem);
+ page_pool_return_netmem(pool, netmem);
}
}
EXPORT_SYMBOL(page_pool_update_nid);
--
2.17.1
* [RFC v4 07/18] page_pool: use netmem put API in page_pool_return_netmem()
2025-06-04 2:52 [RFC v4 00/18] Split netmem from struct page Byungchul Park
` (5 preceding siblings ...)
2025-06-04 2:52 ` [RFC v4 06/18] page_pool: rename page_pool_return_page() to page_pool_return_netmem() Byungchul Park
@ 2025-06-04 2:52 ` Byungchul Park
2025-06-04 16:54 ` Toke Høiland-Jørgensen
2025-06-05 10:33 ` Pavel Begunkov
2025-06-04 2:52 ` [RFC v4 08/18] page_pool: rename __page_pool_release_page_dma() to __page_pool_release_netmem_dma() Byungchul Park
` (11 subsequent siblings)
18 siblings, 2 replies; 65+ messages in thread
From: Byungchul Park @ 2025-06-04 2:52 UTC (permalink / raw)
To: willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola
Use netmem put API, put_netmem(), instead of put_page() in
page_pool_return_netmem().
While at it, delete #include <linux/mm.h>, since this patch removes the
last put_page() call in page_pool.c.
Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
---
net/core/page_pool.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index b7680dcb83e4..dab89bc69f10 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -20,7 +20,6 @@
#include <linux/dma-direction.h>
#include <linux/dma-mapping.h>
#include <linux/page-flags.h>
-#include <linux/mm.h> /* for put_page() */
#include <linux/poison.h>
#include <linux/ethtool.h>
#include <linux/netdevice.h>
@@ -712,7 +711,7 @@ static __always_inline void __page_pool_release_page_dma(struct page_pool *pool,
/* Disconnects a page (from a page_pool). API users can have a need
* to disconnect a page (from a page_pool), to allow it to be used as
* a regular page (that will eventually be returned to the normal
- * page-allocator via put_page).
+ * page-allocator via put_netmem()).
*/
static void page_pool_return_netmem(struct page_pool *pool, netmem_ref netmem)
{
@@ -733,7 +732,7 @@ static void page_pool_return_netmem(struct page_pool *pool, netmem_ref netmem)
if (put) {
page_pool_clear_pp_info(netmem);
- put_page(netmem_to_page(netmem));
+ put_netmem(netmem);
}
/* An optimization would be to call __free_pages(page, pool->p.order)
* knowing page is not part of page-cache (thus avoiding a
--
2.17.1
* [RFC v4 08/18] page_pool: rename __page_pool_release_page_dma() to __page_pool_release_netmem_dma()
2025-06-04 2:52 [RFC v4 00/18] Split netmem from struct page Byungchul Park
` (6 preceding siblings ...)
2025-06-04 2:52 ` [RFC v4 07/18] page_pool: use netmem put API in page_pool_return_netmem() Byungchul Park
@ 2025-06-04 2:52 ` Byungchul Park
2025-06-04 16:55 ` Toke Høiland-Jørgensen
2025-06-05 10:34 ` Pavel Begunkov
2025-06-04 2:52 ` [RFC v4 09/18] page_pool: rename __page_pool_put_page() to __page_pool_put_netmem() Byungchul Park
` (10 subsequent siblings)
18 siblings, 2 replies; 65+ messages in thread
From: Byungchul Park @ 2025-06-04 2:52 UTC (permalink / raw)
To: willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola
Now that __page_pool_release_page_dma() is for releasing netmem, not
struct page, rename it to __page_pool_release_netmem_dma() to reflect
what it does.
Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
---
net/core/page_pool.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index dab89bc69f10..c31a35621b24 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -674,8 +674,8 @@ void page_pool_clear_pp_info(netmem_ref netmem)
netmem_set_pp(netmem, NULL);
}
-static __always_inline void __page_pool_release_page_dma(struct page_pool *pool,
- netmem_ref netmem)
+static __always_inline void __page_pool_release_netmem_dma(struct page_pool *pool,
+ netmem_ref netmem)
{
struct page *old, *page = netmem_to_page(netmem);
unsigned long id;
@@ -722,7 +722,7 @@ static void page_pool_return_netmem(struct page_pool *pool, netmem_ref netmem)
if (static_branch_unlikely(&page_pool_mem_providers) && pool->mp_ops)
put = pool->mp_ops->release_netmem(pool, netmem);
else
- __page_pool_release_page_dma(pool, netmem);
+ __page_pool_release_netmem_dma(pool, netmem);
/* This may be the last page returned, releasing the pool, so
* it is not safe to reference pool afterwards.
@@ -1140,7 +1140,7 @@ static void page_pool_scrub(struct page_pool *pool)
}
xa_for_each(&pool->dma_mapped, id, ptr)
- __page_pool_release_page_dma(pool, page_to_netmem(ptr));
+ __page_pool_release_netmem_dma(pool, page_to_netmem((struct page *)ptr));
}
/* No more consumers should exist, but producers could still
--
2.17.1
* [RFC v4 09/18] page_pool: rename __page_pool_put_page() to __page_pool_put_netmem()
2025-06-04 2:52 [RFC v4 00/18] Split netmem from struct page Byungchul Park
` (7 preceding siblings ...)
2025-06-04 2:52 ` [RFC v4 08/18] page_pool: rename __page_pool_release_page_dma() to __page_pool_release_netmem_dma() Byungchul Park
@ 2025-06-04 2:52 ` Byungchul Park
2025-06-04 16:55 ` Toke Høiland-Jørgensen
2025-06-05 10:35 ` Pavel Begunkov
2025-06-04 2:52 ` [RFC v4 10/18] page_pool: rename __page_pool_alloc_pages_slow() to __page_pool_alloc_netmems_slow() Byungchul Park
` (9 subsequent siblings)
18 siblings, 2 replies; 65+ messages in thread
From: Byungchul Park @ 2025-06-04 2:52 UTC (permalink / raw)
To: willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola
Now that __page_pool_put_page() puts netmem, not struct page, rename it
to __page_pool_put_netmem() to reflect what it does.
Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
---
net/core/page_pool.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index c31a35621b24..0d6a72a71745 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -790,8 +790,8 @@ static bool __page_pool_page_can_be_recycled(netmem_ref netmem)
* subsystem.
*/
static __always_inline netmem_ref
-__page_pool_put_page(struct page_pool *pool, netmem_ref netmem,
- unsigned int dma_sync_size, bool allow_direct)
+__page_pool_put_netmem(struct page_pool *pool, netmem_ref netmem,
+ unsigned int dma_sync_size, bool allow_direct)
{
lockdep_assert_no_hardirq();
@@ -850,7 +850,7 @@ static bool page_pool_napi_local(const struct page_pool *pool)
/* Allow direct recycle if we have reasons to believe that we are
* in the same context as the consumer would run, so there's
* no possible race.
- * __page_pool_put_page() makes sure we're not in hardirq context
+ * __page_pool_put_netmem() makes sure we're not in hardirq context
* and interrupts are enabled prior to accessing the cache.
*/
cpuid = smp_processor_id();
@@ -868,8 +868,8 @@ void page_pool_put_unrefed_netmem(struct page_pool *pool, netmem_ref netmem,
if (!allow_direct)
allow_direct = page_pool_napi_local(pool);
- netmem = __page_pool_put_page(pool, netmem, dma_sync_size,
- allow_direct);
+ netmem = __page_pool_put_netmem(pool, netmem, dma_sync_size,
+ allow_direct);
if (netmem && !page_pool_recycle_in_ring(pool, netmem)) {
/* Cache full, fallback to free pages */
recycle_stat_inc(pool, ring_full);
@@ -970,8 +970,8 @@ void page_pool_put_netmem_bulk(netmem_ref *data, u32 count)
continue;
}
- netmem = __page_pool_put_page(pool, netmem, -1,
- allow_direct);
+ netmem = __page_pool_put_netmem(pool, netmem, -1,
+ allow_direct);
/* Approved for bulk recycling in ptr_ring cache */
if (netmem)
bulk[bulk_len++] = netmem;
--
2.17.1
* [RFC v4 10/18] page_pool: rename __page_pool_alloc_pages_slow() to __page_pool_alloc_netmems_slow()
2025-06-04 2:52 [RFC v4 00/18] Split netmem from struct page Byungchul Park
` (8 preceding siblings ...)
2025-06-04 2:52 ` [RFC v4 09/18] page_pool: rename __page_pool_put_page() to __page_pool_put_netmem() Byungchul Park
@ 2025-06-04 2:52 ` Byungchul Park
2025-06-04 17:03 ` Toke Høiland-Jørgensen
2025-06-04 2:52 ` [RFC v4 11/18] mlx4: use netmem descriptor and APIs for page pool Byungchul Park
` (8 subsequent siblings)
18 siblings, 1 reply; 65+ messages in thread
From: Byungchul Park @ 2025-06-04 2:52 UTC (permalink / raw)
To: willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola
Now that __page_pool_alloc_pages_slow() is for allocating netmem, not
struct page, rename it to __page_pool_alloc_netmems_slow() to reflect
what it does.
Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
---
net/core/page_pool.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 0d6a72a71745..47cec631f598 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -543,8 +543,8 @@ static netmem_ref __page_pool_alloc_netmem_order(struct page_pool *pool,
}
/* slow path */
-static noinline netmem_ref __page_pool_alloc_pages_slow(struct page_pool *pool,
- gfp_t gfp)
+static noinline netmem_ref __page_pool_alloc_netmems_slow(struct page_pool *pool,
+ gfp_t gfp)
{
const int bulk = PP_ALLOC_CACHE_REFILL;
unsigned int pp_order = pool->p.order;
@@ -616,7 +616,7 @@ netmem_ref page_pool_alloc_netmems(struct page_pool *pool, gfp_t gfp)
if (static_branch_unlikely(&page_pool_mem_providers) && pool->mp_ops)
netmem = pool->mp_ops->alloc_netmems(pool, gfp);
else
- netmem = __page_pool_alloc_pages_slow(pool, gfp);
+ netmem = __page_pool_alloc_netmems_slow(pool, gfp);
return netmem;
}
EXPORT_SYMBOL(page_pool_alloc_netmems);
--
2.17.1
* [RFC v4 11/18] mlx4: use netmem descriptor and APIs for page pool
2025-06-04 2:52 [RFC v4 00/18] Split netmem from struct page Byungchul Park
` (9 preceding siblings ...)
2025-06-04 2:52 ` [RFC v4 10/18] page_pool: rename __page_pool_alloc_pages_slow() to __page_pool_alloc_netmems_slow() Byungchul Park
@ 2025-06-04 2:52 ` Byungchul Park
2025-06-04 2:52 ` [RFC v4 12/18] netmem: use _Generic to cover const casting for page_to_netmem() Byungchul Park
` (7 subsequent siblings)
18 siblings, 0 replies; 65+ messages in thread
From: Byungchul Park @ 2025-06-04 2:52 UTC (permalink / raw)
To: willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola
To simplify struct page, its users need their own descriptors split out
from struct page, and this work is ongoing for page pool.
Use netmem descriptor and APIs for page pool in mlx4 code.
Signed-off-by: Byungchul Park <byungchul@sk.com>
---
drivers/net/ethernet/mellanox/mlx4/en_rx.c | 48 +++++++++++---------
drivers/net/ethernet/mellanox/mlx4/en_tx.c | 8 ++--
drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 4 +-
3 files changed, 32 insertions(+), 28 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index b33285d755b9..7cf0d2dc5011 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -62,18 +62,18 @@ static int mlx4_en_alloc_frags(struct mlx4_en_priv *priv,
int i;
for (i = 0; i < priv->num_frags; i++, frags++) {
- if (!frags->page) {
- frags->page = page_pool_alloc_pages(ring->pp, gfp);
- if (!frags->page) {
+ if (!frags->netmem) {
+ frags->netmem = page_pool_alloc_netmems(ring->pp, gfp);
+ if (!frags->netmem) {
ring->alloc_fail++;
return -ENOMEM;
}
- page_pool_fragment_page(frags->page, 1);
+ page_pool_fragment_netmem(frags->netmem, 1);
frags->page_offset = priv->rx_headroom;
ring->rx_alloc_pages++;
}
- dma = page_pool_get_dma_addr(frags->page);
+ dma = page_pool_get_dma_addr_netmem(frags->netmem);
rx_desc->data[i].addr = cpu_to_be64(dma + frags->page_offset);
}
return 0;
@@ -83,10 +83,10 @@ static void mlx4_en_free_frag(const struct mlx4_en_priv *priv,
struct mlx4_en_rx_ring *ring,
struct mlx4_en_rx_alloc *frag)
{
- if (frag->page)
- page_pool_put_full_page(ring->pp, frag->page, false);
+ if (frag->netmem)
+ page_pool_put_full_netmem(ring->pp, frag->netmem, false);
/* We need to clear all fields, otherwise a change of priv->log_rx_info
- * could lead to see garbage later in frag->page.
+ * could lead to see garbage later in frag->netmem.
*/
memset(frag, 0, sizeof(*frag));
}
@@ -440,29 +440,33 @@ static int mlx4_en_complete_rx_desc(struct mlx4_en_priv *priv,
unsigned int truesize = 0;
bool release = true;
int nr, frag_size;
- struct page *page;
+ netmem_ref netmem;
dma_addr_t dma;
/* Collect used fragments while replacing them in the HW descriptors */
for (nr = 0;; frags++) {
frag_size = min_t(int, length, frag_info->frag_size);
- page = frags->page;
- if (unlikely(!page))
+ netmem = frags->netmem;
+ if (unlikely(!netmem))
goto fail;
- dma = page_pool_get_dma_addr(page);
+ dma = page_pool_get_dma_addr_netmem(netmem);
dma_sync_single_range_for_cpu(priv->ddev, dma, frags->page_offset,
frag_size, priv->dma_dir);
- __skb_fill_page_desc(skb, nr, page, frags->page_offset,
- frag_size);
+ __skb_fill_netmem_desc(skb, nr, netmem, frags->page_offset,
+ frag_size);
truesize += frag_info->frag_stride;
if (frag_info->frag_stride == PAGE_SIZE / 2) {
+ struct page *page = netmem_to_page(netmem);
+ atomic_long_t *pp_ref_count =
+ netmem_get_pp_ref_count_ref(netmem);
+
frags->page_offset ^= PAGE_SIZE / 2;
release = page_count(page) != 1 ||
- atomic_long_read(&page->pp_ref_count) != 1 ||
+ atomic_long_read(pp_ref_count) != 1 ||
page_is_pfmemalloc(page) ||
page_to_nid(page) != numa_mem_id();
} else if (!priv->rx_headroom) {
@@ -476,9 +480,9 @@ static int mlx4_en_complete_rx_desc(struct mlx4_en_priv *priv,
release = frags->page_offset + frag_info->frag_size > PAGE_SIZE;
}
if (release) {
- frags->page = NULL;
+ frags->netmem = 0;
} else {
- page_pool_ref_page(page);
+ page_pool_ref_netmem(netmem);
}
nr++;
@@ -719,7 +723,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
int nr;
frags = ring->rx_info + (index << priv->log_rx_info);
- va = page_address(frags[0].page) + frags[0].page_offset;
+ va = netmem_address(frags[0].netmem) + frags[0].page_offset;
net_prefetchw(va);
/*
* make sure we read the CQE after we read the ownership bit
@@ -748,7 +752,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
/* Get pointer to first fragment since we haven't
* skb yet and cast it to ethhdr struct
*/
- dma = page_pool_get_dma_addr(frags[0].page);
+ dma = page_pool_get_dma_addr_netmem(frags[0].netmem);
dma += frags[0].page_offset;
dma_sync_single_for_cpu(priv->ddev, dma, sizeof(*ethh),
DMA_FROM_DEVICE);
@@ -788,7 +792,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
void *orig_data;
u32 act;
- dma = page_pool_get_dma_addr(frags[0].page);
+ dma = page_pool_get_dma_addr_netmem(frags[0].netmem);
dma += frags[0].page_offset;
dma_sync_single_for_cpu(priv->ddev, dma,
priv->frag_info[0].frag_size,
@@ -818,7 +822,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
if (likely(!xdp_do_redirect(dev, &mxbuf.xdp, xdp_prog))) {
ring->xdp_redirect++;
xdp_redir_flush = true;
- frags[0].page = NULL;
+ frags[0].netmem = 0;
goto next;
}
ring->xdp_redirect_fail++;
@@ -828,7 +832,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
if (likely(!mlx4_en_xmit_frame(ring, frags, priv,
length, cq_ring,
&doorbell_pending))) {
- frags[0].page = NULL;
+ frags[0].netmem = 0;
goto next;
}
trace_xdp_exception(dev, xdp_prog, act);
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 87f35bcbeff8..b564a953da09 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -354,7 +354,7 @@ u32 mlx4_en_recycle_tx_desc(struct mlx4_en_priv *priv,
struct page_pool *pool = ring->recycle_ring->pp;
/* Note that napi_mode = 0 means ndo_close() path, not budget = 0 */
- page_pool_put_full_page(pool, tx_info->page, !!napi_mode);
+ page_pool_put_full_netmem(pool, tx_info->netmem, !!napi_mode);
return tx_info->nr_txbb;
}
@@ -1191,10 +1191,10 @@ netdev_tx_t mlx4_en_xmit_frame(struct mlx4_en_rx_ring *rx_ring,
tx_desc = ring->buf + (index << LOG_TXBB_SIZE);
data = &tx_desc->data;
- dma = page_pool_get_dma_addr(frame->page);
+ dma = page_pool_get_dma_addr_netmem(frame->netmem);
- tx_info->page = frame->page;
- frame->page = NULL;
+ tx_info->netmem = frame->netmem;
+ frame->netmem = 0;
tx_info->map0_dma = dma;
tx_info->nr_bytes = max_t(unsigned int, length, ETH_ZLEN);
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index ad0d91a75184..3ef9a0a1f783 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -213,7 +213,7 @@ enum cq_type {
struct mlx4_en_tx_info {
union {
struct sk_buff *skb;
- struct page *page;
+ netmem_ref netmem;
};
dma_addr_t map0_dma;
u32 map0_byte_count;
@@ -246,7 +246,7 @@ struct mlx4_en_tx_desc {
#define MLX4_EN_CX3_HIGH_ID 0x1005
struct mlx4_en_rx_alloc {
- struct page *page;
+ netmem_ref netmem;
u32 page_offset;
};
--
2.17.1
* [RFC v4 12/18] netmem: use _Generic to cover const casting for page_to_netmem()
2025-06-04 2:52 [RFC v4 00/18] Split netmem from struct page Byungchul Park
` (10 preceding siblings ...)
2025-06-04 2:52 ` [RFC v4 11/18] mlx4: use netmem descriptor and APIs for page pool Byungchul Park
@ 2025-06-04 2:52 ` Byungchul Park
2025-06-04 16:55 ` Toke Høiland-Jørgensen
2025-06-05 10:40 ` Pavel Begunkov
2025-06-04 2:52 ` [RFC v4 13/18] netmem: remove __netmem_get_pp() Byungchul Park
` (6 subsequent siblings)
18 siblings, 2 replies; 65+ messages in thread
From: Byungchul Park @ 2025-06-04 2:52 UTC (permalink / raw)
To: willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola
The current page_to_netmem() does not handle const casting, so trying
to cast a const struct page * to a const netmem_ref fails to compile.
To cover that case, turn page_to_netmem() into a macro that uses
_Generic.
Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
---
include/net/netmem.h | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/include/net/netmem.h b/include/net/netmem.h
index 2687c8051ca5..65bb87835664 100644
--- a/include/net/netmem.h
+++ b/include/net/netmem.h
@@ -195,10 +195,9 @@ static inline netmem_ref net_iov_to_netmem(struct net_iov *niov)
return (__force netmem_ref)((unsigned long)niov | NET_IOV);
}
-static inline netmem_ref page_to_netmem(struct page *page)
-{
- return (__force netmem_ref)page;
-}
+#define page_to_netmem(p) (_Generic((p), \
+ const struct page * : (__force const netmem_ref)(p), \
+ struct page * : (__force netmem_ref)(p)))
/**
* virt_to_netmem - convert virtual memory pointer to a netmem reference
--
2.17.1
* [RFC v4 13/18] netmem: remove __netmem_get_pp()
2025-06-04 2:52 [RFC v4 00/18] Split netmem from struct page Byungchul Park
` (11 preceding siblings ...)
2025-06-04 2:52 ` [RFC v4 12/18] netmem: use _Generic to cover const casting for page_to_netmem() Byungchul Park
@ 2025-06-04 2:52 ` Byungchul Park
2025-06-04 16:56 ` Toke Høiland-Jørgensen
2025-06-05 10:41 ` Pavel Begunkov
2025-06-04 2:52 ` [RFC v4 14/18] page_pool: make page_pool_get_dma_addr() just wrap page_pool_get_dma_addr_netmem() Byungchul Park
` (5 subsequent siblings)
18 siblings, 2 replies; 65+ messages in thread
From: Byungchul Park @ 2025-06-04 2:52 UTC (permalink / raw)
To: willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola
There are no users of __netmem_get_pp(). Remove it.
Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
---
include/net/netmem.h | 16 ----------------
1 file changed, 16 deletions(-)
diff --git a/include/net/netmem.h b/include/net/netmem.h
index 65bb87835664..d4066fcb1fee 100644
--- a/include/net/netmem.h
+++ b/include/net/netmem.h
@@ -234,22 +234,6 @@ static inline struct net_iov *__netmem_clear_lsb(netmem_ref netmem)
return (struct net_iov *)((__force unsigned long)netmem & ~NET_IOV);
}
-/**
- * __netmem_get_pp - unsafely get pointer to the &page_pool backing @netmem
- * @netmem: netmem reference to get the pointer from
- *
- * Unsafe version of netmem_get_pp(). When @netmem is always page-backed,
- * e.g. when it's a header buffer, performs faster and generates smaller
- * object code (avoids clearing the LSB). When @netmem points to IOV,
- * provokes invalid memory access.
- *
- * Return: pointer to the &page_pool (garbage if @netmem is not page-backed).
- */
-static inline struct page_pool *__netmem_get_pp(netmem_ref netmem)
-{
- return __netmem_to_page(netmem)->pp;
-}
-
static inline struct page_pool *netmem_get_pp(netmem_ref netmem)
{
return __netmem_clear_lsb(netmem)->pp;
--
2.17.1
* [RFC v4 14/18] page_pool: make page_pool_get_dma_addr() just wrap page_pool_get_dma_addr_netmem()
2025-06-04 2:52 [RFC v4 00/18] Split netmem from struct page Byungchul Park
` (12 preceding siblings ...)
2025-06-04 2:52 ` [RFC v4 13/18] netmem: remove __netmem_get_pp() Byungchul Park
@ 2025-06-04 2:52 ` Byungchul Park
2025-06-04 16:57 ` Toke Høiland-Jørgensen
2025-06-05 10:45 ` Pavel Begunkov
2025-06-04 2:52 ` [RFC v4 15/18] netdevsim: use netmem descriptor and APIs for page pool Byungchul Park
` (4 subsequent siblings)
18 siblings, 2 replies; 65+ messages in thread
From: Byungchul Park @ 2025-06-04 2:52 UTC (permalink / raw)
To: willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola
The page pool members in struct page cannot be removed while any code
is still allowed to access them via struct page.
Stop accessing 'page->dma_addr' directly in page_pool_get_dma_addr();
instead, make it a safe wrapper around page_pool_get_dma_addr_netmem().
Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
---
include/net/page_pool/helpers.h | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
index 93f2c31baf9b..387913b6c8bf 100644
--- a/include/net/page_pool/helpers.h
+++ b/include/net/page_pool/helpers.h
@@ -437,12 +437,7 @@ static inline dma_addr_t page_pool_get_dma_addr_netmem(netmem_ref netmem)
*/
static inline dma_addr_t page_pool_get_dma_addr(const struct page *page)
{
- dma_addr_t ret = page->dma_addr;
-
- if (PAGE_POOL_32BIT_ARCH_WITH_64BIT_DMA)
- ret <<= PAGE_SHIFT;
-
- return ret;
+ return page_pool_get_dma_addr_netmem(page_to_netmem(page));
}
static inline void __page_pool_dma_sync_for_cpu(const struct page_pool *pool,
--
2.17.1
* [RFC v4 15/18] netdevsim: use netmem descriptor and APIs for page pool
2025-06-04 2:52 [RFC v4 00/18] Split netmem from struct page Byungchul Park
` (13 preceding siblings ...)
2025-06-04 2:52 ` [RFC v4 14/18] page_pool: make page_pool_get_dma_addr() just wrap page_pool_get_dma_addr_netmem() Byungchul Park
@ 2025-06-04 2:52 ` Byungchul Park
2025-06-04 2:52 ` [RFC v4 16/18] netmem: introduce a netmem API, virt_to_head_netmem() Byungchul Park
` (3 subsequent siblings)
18 siblings, 0 replies; 65+ messages in thread
From: Byungchul Park @ 2025-06-04 2:52 UTC (permalink / raw)
To: willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola
To simplify struct page, its users need their own descriptors split out
from struct page, and this work is ongoing for page pool.
Use netmem descriptor and APIs for page pool in netdevsim code.
Signed-off-by: Byungchul Park <byungchul@sk.com>
---
drivers/net/netdevsim/netdev.c | 19 ++++++++++---------
drivers/net/netdevsim/netdevsim.h | 2 +-
2 files changed, 11 insertions(+), 10 deletions(-)
diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c
index af545d42961c..d134a6195bfa 100644
--- a/drivers/net/netdevsim/netdev.c
+++ b/drivers/net/netdevsim/netdev.c
@@ -821,7 +821,7 @@ nsim_pp_hold_read(struct file *file, char __user *data,
struct netdevsim *ns = file->private_data;
char buf[3] = "n\n";
- if (ns->page)
+ if (ns->netmem)
buf[0] = 'y';
return simple_read_from_buffer(data, count, ppos, buf, 2);
@@ -841,18 +841,19 @@ nsim_pp_hold_write(struct file *file, const char __user *data,
rtnl_lock();
ret = count;
- if (val == !!ns->page)
+ if (val == !!ns->netmem)
goto exit;
if (!netif_running(ns->netdev) && val) {
ret = -ENETDOWN;
} else if (val) {
- ns->page = page_pool_dev_alloc_pages(ns->rq[0]->page_pool);
- if (!ns->page)
+ ns->netmem = page_pool_alloc_netmems(ns->rq[0]->page_pool,
+ GFP_ATOMIC | __GFP_NOWARN);
+ if (!ns->netmem)
ret = -ENOMEM;
} else {
- page_pool_put_full_page(ns->page->pp, ns->page, false);
- ns->page = NULL;
+ page_pool_put_full_netmem(netmem_get_pp(ns->netmem), ns->netmem, false);
+ ns->netmem = 0;
}
exit:
@@ -1077,9 +1078,9 @@ void nsim_destroy(struct netdevsim *ns)
nsim_exit_netdevsim(ns);
/* Put this intentionally late to exercise the orphaning path */
- if (ns->page) {
- page_pool_put_full_page(ns->page->pp, ns->page, false);
- ns->page = NULL;
+ if (ns->netmem) {
+ page_pool_put_full_netmem(netmem_get_pp(ns->netmem), ns->netmem, false);
+ ns->netmem = 0;
}
free_netdev(dev);
diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h
index d04401f0bdf7..1dc51468a50c 100644
--- a/drivers/net/netdevsim/netdevsim.h
+++ b/drivers/net/netdevsim/netdevsim.h
@@ -138,7 +138,7 @@ struct netdevsim {
struct debugfs_u32_array dfs_ports[2];
} udp_ports;
- struct page *page;
+ netmem_ref netmem;
struct dentry *pp_dfs;
struct dentry *qr_dfs;
--
2.17.1
* [RFC v4 16/18] netmem: introduce a netmem API, virt_to_head_netmem()
2025-06-04 2:52 [RFC v4 00/18] Split netmem from struct page Byungchul Park
` (14 preceding siblings ...)
2025-06-04 2:52 ` [RFC v4 15/18] netdevsim: use netmem descriptor and APIs for page pool Byungchul Park
@ 2025-06-04 2:52 ` Byungchul Park
2025-06-04 16:59 ` Toke Høiland-Jørgensen
` (2 more replies)
2025-06-04 2:52 ` [RFC v4 17/18] mt76: use netmem descriptor and APIs for page pool Byungchul Park
` (2 subsequent siblings)
18 siblings, 3 replies; 65+ messages in thread
From: Byungchul Park @ 2025-06-04 2:52 UTC (permalink / raw)
To: willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola
To eliminate the use of struct page in page pool, the page pool code
should use the netmem descriptor and APIs instead.
As part of that work, introduce a netmem API, virt_to_head_netmem(),
that converts a virtual address to a head netmem, so callers can use it
in place of the existing struct page API, virt_to_head_page().
Signed-off-by: Byungchul Park <byungchul@sk.com>
---
include/net/netmem.h | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/include/net/netmem.h b/include/net/netmem.h
index d4066fcb1fee..d84ab624b489 100644
--- a/include/net/netmem.h
+++ b/include/net/netmem.h
@@ -265,6 +265,13 @@ static inline netmem_ref netmem_compound_head(netmem_ref netmem)
return page_to_netmem(compound_head(netmem_to_page(netmem)));
}
+static inline netmem_ref virt_to_head_netmem(const void *x)
+{
+ netmem_ref netmem = virt_to_netmem(x);
+
+ return netmem_compound_head(netmem);
+}
+
/**
* __netmem_address - unsafely get pointer to the memory backing @netmem
* @netmem: netmem reference to get the pointer for
--
2.17.1
* [RFC v4 17/18] mt76: use netmem descriptor and APIs for page pool
2025-06-04 2:52 [RFC v4 00/18] Split netmem from struct page Byungchul Park
` (15 preceding siblings ...)
2025-06-04 2:52 ` [RFC v4 16/18] netmem: introduce a netmem API, virt_to_head_netmem() Byungchul Park
@ 2025-06-04 2:52 ` Byungchul Park
2025-06-04 2:52 ` [RFC v4 18/18] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp() Byungchul Park
2025-06-04 3:23 ` [RFC v4 00/18] Split netmem from struct page Byungchul Park
18 siblings, 0 replies; 65+ messages in thread
From: Byungchul Park @ 2025-06-04 2:52 UTC (permalink / raw)
To: willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola
To simplify struct page, its users need their own descriptors split out
from struct page, and this work is ongoing for page pool.
Use netmem descriptor and APIs for page pool in mt76 code.
Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
---
drivers/net/wireless/mediatek/mt76/dma.c | 6 ++---
drivers/net/wireless/mediatek/mt76/mt76.h | 12 +++++-----
.../net/wireless/mediatek/mt76/sdio_txrx.c | 24 +++++++++----------
drivers/net/wireless/mediatek/mt76/usb.c | 10 ++++----
4 files changed, 26 insertions(+), 26 deletions(-)
diff --git a/drivers/net/wireless/mediatek/mt76/dma.c b/drivers/net/wireless/mediatek/mt76/dma.c
index 35b4ec91979e..41b529b95877 100644
--- a/drivers/net/wireless/mediatek/mt76/dma.c
+++ b/drivers/net/wireless/mediatek/mt76/dma.c
@@ -820,10 +820,10 @@ mt76_add_fragment(struct mt76_dev *dev, struct mt76_queue *q, void *data,
int nr_frags = shinfo->nr_frags;
if (nr_frags < ARRAY_SIZE(shinfo->frags)) {
- struct page *page = virt_to_head_page(data);
- int offset = data - page_address(page) + q->buf_offset;
+ netmem_ref netmem = virt_to_head_netmem(data);
+ int offset = data - netmem_address(netmem) + q->buf_offset;
- skb_add_rx_frag(skb, nr_frags, page, offset, len, q->buf_size);
+ skb_add_rx_frag_netmem(skb, nr_frags, netmem, offset, len, q->buf_size);
} else {
mt76_put_page_pool_buf(data, allow_direct);
}
diff --git a/drivers/net/wireless/mediatek/mt76/mt76.h b/drivers/net/wireless/mediatek/mt76/mt76.h
index 5f8d81cda6cd..16d09b6d8270 100644
--- a/drivers/net/wireless/mediatek/mt76/mt76.h
+++ b/drivers/net/wireless/mediatek/mt76/mt76.h
@@ -1795,21 +1795,21 @@ int mt76_rx_token_consume(struct mt76_dev *dev, void *ptr,
int mt76_create_page_pool(struct mt76_dev *dev, struct mt76_queue *q);
static inline void mt76_put_page_pool_buf(void *buf, bool allow_direct)
{
- struct page *page = virt_to_head_page(buf);
+ netmem_ref netmem = virt_to_head_netmem(buf);
- page_pool_put_full_page(page->pp, page, allow_direct);
+ page_pool_put_full_netmem(netmem_get_pp(netmem), netmem, allow_direct);
}
static inline void *
mt76_get_page_pool_buf(struct mt76_queue *q, u32 *offset, u32 size)
{
- struct page *page;
+ netmem_ref netmem;
- page = page_pool_dev_alloc_frag(q->page_pool, offset, size);
- if (!page)
+ netmem = page_pool_dev_alloc_netmem(q->page_pool, offset, &size);
+ if (!netmem)
return NULL;
- return page_address(page) + *offset;
+ return netmem_address(netmem) + *offset;
}
static inline void mt76_set_tx_blocked(struct mt76_dev *dev, bool blocked)
diff --git a/drivers/net/wireless/mediatek/mt76/sdio_txrx.c b/drivers/net/wireless/mediatek/mt76/sdio_txrx.c
index 0a927a7313a6..b1d89b6f663d 100644
--- a/drivers/net/wireless/mediatek/mt76/sdio_txrx.c
+++ b/drivers/net/wireless/mediatek/mt76/sdio_txrx.c
@@ -68,14 +68,14 @@ mt76s_build_rx_skb(void *data, int data_len, int buf_len)
skb_put_data(skb, data, len);
if (data_len > len) {
- struct page *page;
+ netmem_ref netmem;
data += len;
- page = virt_to_head_page(data);
- skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags,
- page, data - page_address(page),
- data_len - len, buf_len);
- get_page(page);
+ netmem = virt_to_head_netmem(data);
+ skb_add_rx_frag_netmem(skb, skb_shinfo(skb)->nr_frags,
+ netmem, data - netmem_address(netmem),
+ data_len - len, buf_len);
+ get_netmem(netmem);
}
return skb;
@@ -88,7 +88,7 @@ mt76s_rx_run_queue(struct mt76_dev *dev, enum mt76_rxq_id qid,
struct mt76_queue *q = &dev->q_rx[qid];
struct mt76_sdio *sdio = &dev->sdio;
int len = 0, err, i;
- struct page *page;
+ netmem_ref netmem;
u8 *buf, *end;
for (i = 0; i < intr->rx.num[qid]; i++)
@@ -100,11 +100,11 @@ mt76s_rx_run_queue(struct mt76_dev *dev, enum mt76_rxq_id qid,
if (len > sdio->func->cur_blksize)
len = roundup(len, sdio->func->cur_blksize);
- page = __dev_alloc_pages(GFP_KERNEL, get_order(len));
- if (!page)
+ netmem = page_to_netmem(__dev_alloc_pages(GFP_KERNEL, get_order(len)));
+ if (!netmem)
return -ENOMEM;
- buf = page_address(page);
+ buf = netmem_address(netmem);
sdio_claim_host(sdio->func);
err = sdio_readsb(sdio->func, buf, MCR_WRDR(qid), len);
@@ -112,7 +112,7 @@ mt76s_rx_run_queue(struct mt76_dev *dev, enum mt76_rxq_id qid,
if (err < 0) {
dev_err(dev->dev, "sdio read data failed:%d\n", err);
- put_page(page);
+ put_netmem(netmem);
return err;
}
@@ -140,7 +140,7 @@ mt76s_rx_run_queue(struct mt76_dev *dev, enum mt76_rxq_id qid,
}
buf += round_up(len + 4, 4);
}
- put_page(page);
+ put_netmem(netmem);
spin_lock_bh(&q->lock);
q->head = (q->head + i) % q->ndesc;
diff --git a/drivers/net/wireless/mediatek/mt76/usb.c b/drivers/net/wireless/mediatek/mt76/usb.c
index f9e67b8c3b3c..1ea80c87a839 100644
--- a/drivers/net/wireless/mediatek/mt76/usb.c
+++ b/drivers/net/wireless/mediatek/mt76/usb.c
@@ -478,7 +478,7 @@ mt76u_build_rx_skb(struct mt76_dev *dev, void *data,
head_room = drv_flags & MT_DRV_RX_DMA_HDR ? 0 : MT_DMA_HDR_LEN;
if (SKB_WITH_OVERHEAD(buf_size) < head_room + len) {
- struct page *page;
+ netmem_ref netmem;
/* slow path, not enough space for data and
* skb_shared_info
@@ -489,10 +489,10 @@ mt76u_build_rx_skb(struct mt76_dev *dev, void *data,
skb_put_data(skb, data + head_room, MT_SKB_HEAD_LEN);
data += head_room + MT_SKB_HEAD_LEN;
- page = virt_to_head_page(data);
- skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags,
- page, data - page_address(page),
- len - MT_SKB_HEAD_LEN, buf_size);
+ netmem = virt_to_head_netmem(data);
+ skb_add_rx_frag_netmem(skb, skb_shinfo(skb)->nr_frags,
+ netmem, data - netmem_address(netmem),
+ len - MT_SKB_HEAD_LEN, buf_size);
return skb;
}
--
2.17.1
^ permalink raw reply related [flat|nested] 65+ messages in thread
* [RFC v4 18/18] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp()
2025-06-04 2:52 [RFC v4 00/18] Split netmem from struct page Byungchul Park
` (16 preceding siblings ...)
2025-06-04 2:52 ` [RFC v4 17/18] mt76: use netmem descriptor and APIs for page pool Byungchul Park
@ 2025-06-04 2:52 ` Byungchul Park
2025-06-04 16:59 ` Toke Høiland-Jørgensen
2025-06-05 10:56 ` Pavel Begunkov
2025-06-04 3:23 ` [RFC v4 00/18] Split netmem from struct page Byungchul Park
18 siblings, 2 replies; 65+ messages in thread
From: Byungchul Park @ 2025-06-04 2:52 UTC (permalink / raw)
To: willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola
To simplify struct page, each of its users needs its own descriptor
separated out of struct page, and the work for page pool is ongoing.
To achieve that, all the code should avoid directly accessing page pool
members of struct page.
Access ->pp_magic through struct netmem_desc instead of directly
accessing it through struct page in page_pool_page_is_pp(). Plus, move
page_pool_page_is_pp() from mm.h to netmem.h to use struct netmem_desc
without header dependency issues.
Signed-off-by: Byungchul Park <byungchul@sk.com>
---
include/linux/mm.h | 12 ------------
include/net/netmem.h | 14 ++++++++++++++
mm/page_alloc.c | 1 +
3 files changed, 15 insertions(+), 12 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index e51dba8398f7..f23560853447 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -4311,16 +4311,4 @@ int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status);
*/
#define PP_MAGIC_MASK ~(PP_DMA_INDEX_MASK | 0x3UL)
-#ifdef CONFIG_PAGE_POOL
-static inline bool page_pool_page_is_pp(struct page *page)
-{
- return (page->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
-}
-#else
-static inline bool page_pool_page_is_pp(struct page *page)
-{
- return false;
-}
-#endif
-
#endif /* _LINUX_MM_H */
diff --git a/include/net/netmem.h b/include/net/netmem.h
index d84ab624b489..8f354ae7d5c3 100644
--- a/include/net/netmem.h
+++ b/include/net/netmem.h
@@ -56,6 +56,20 @@ NETMEM_DESC_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
*/
static_assert(sizeof(struct netmem_desc) <= offsetof(struct page, _refcount));
+#ifdef CONFIG_PAGE_POOL
+static inline bool page_pool_page_is_pp(struct page *page)
+{
+ struct netmem_desc *desc = (struct netmem_desc *)page;
+
+ return (desc->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
+}
+#else
+static inline bool page_pool_page_is_pp(struct page *page)
+{
+ return false;
+}
+#endif
+
/* net_iov */
DECLARE_STATIC_KEY_FALSE(page_pool_mem_providers);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4f29e393f6af..be0752c0ac92 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -55,6 +55,7 @@
#include <linux/delayacct.h>
#include <linux/cacheinfo.h>
#include <linux/pgalloc_tag.h>
+#include <net/netmem.h>
#include <asm/div64.h>
#include "internal.h"
#include "shuffle.h"
--
2.17.1
^ permalink raw reply related [flat|nested] 65+ messages in thread
* Re: [RFC v4 00/18] Split netmem from struct page
2025-06-04 2:52 [RFC v4 00/18] Split netmem from struct page Byungchul Park
` (17 preceding siblings ...)
2025-06-04 2:52 ` [RFC v4 18/18] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp() Byungchul Park
@ 2025-06-04 3:23 ` Byungchul Park
2025-06-05 19:55 ` Mina Almasry
18 siblings, 1 reply; 65+ messages in thread
From: Byungchul Park @ 2025-06-04 3:23 UTC (permalink / raw)
To: willy, almasrymina
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola, netdev
On Wed, Jun 04, 2025 at 11:52:28AM +0900, Byungchul Park wrote:
> The MM subsystem is trying to reduce struct page to a single pointer.
> The first step towards that is splitting struct page by its individual
> users, as has already been done with folio and slab. This patchset does
> that for netmem which is used for page pools.
>
> Matthew Wilcox tried and stopped the same work, you can see in:
>
> https://lore.kernel.org/linux-mm/20230111042214.907030-1-willy@infradead.org/
>
> Mina Almasry has fortunately already done a lot of the prerequisite
> work. I stacked my patches on top of his work, i.e. netmem.
>
> I focused on removing the page pool members in struct page this time,
> not moving the allocation code of page pool from net to mm. It can be
> done later if needed.
>
> The final patch removing the page pool fields will be submitted once
> all the page-to-netmem conversion work is done:
>
> 1. converting of libeth_fqe by Tony Nguyen.
> 2. converting of mlx5 by Tariq Toukan.
> 3. converting of prueth_swdata (on me).
> 4. converting of freescale driver (on me).
>
> For our discussion, I'm sharing below what the final patch looks
> like.
To Willy and Mina,
I believe this version might be the final one. Please check whether the
direction matches what you intended, so that we can proceed with confidence.
As I mentioned above, the final patch will be submitted later once all
the required driver work is done, but you can see what it looks like in
the patch embedded in this cover letter.
Byungchul
> Byungchul
> --8<--
> commit 1847d9890f798456b21ccb27aac7545303048492
> Author: Byungchul Park <byungchul@sk.com>
> Date: Wed May 28 20:44:55 2025 +0900
>
> mm, netmem: remove the page pool members in struct page
>
> Now that all the users of the page pool members in struct page are
> gone, the members can be removed from struct page.
>
> However, since struct netmem_desc still uses the space in struct page,
> the important offsets should be checked properly, until struct
> netmem_desc has its own instance from slab.
>
> Remove the page pool members in struct page and modify static checkers
> for the offsets.
>
> Signed-off-by: Byungchul Park <byungchul@sk.com>
>
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 32ba5126e221..db2fe0d0ebbf 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -120,17 +120,6 @@ struct page {
> */
> unsigned long private;
> };
> - struct { /* page_pool used by netstack */
> - /**
> - * @pp_magic: magic value to avoid recycling non
> - * page_pool allocated pages.
> - */
> - unsigned long pp_magic;
> - struct page_pool *pp;
> - unsigned long _pp_mapping_pad;
> - unsigned long dma_addr;
> - atomic_long_t pp_ref_count;
> - };
> struct { /* Tail pages of compound page */
> unsigned long compound_head; /* Bit zero is set */
> };
> diff --git a/include/net/netmem.h b/include/net/netmem.h
> index 8f354ae7d5c3..3414f184d018 100644
> --- a/include/net/netmem.h
> +++ b/include/net/netmem.h
> @@ -42,11 +42,8 @@ struct netmem_desc {
> static_assert(offsetof(struct page, pg) == \
> offsetof(struct netmem_desc, desc))
> NETMEM_DESC_ASSERT_OFFSET(flags, _flags);
> -NETMEM_DESC_ASSERT_OFFSET(pp_magic, pp_magic);
> -NETMEM_DESC_ASSERT_OFFSET(pp, pp);
> -NETMEM_DESC_ASSERT_OFFSET(_pp_mapping_pad, _pp_mapping_pad);
> -NETMEM_DESC_ASSERT_OFFSET(dma_addr, dma_addr);
> -NETMEM_DESC_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
> +NETMEM_DESC_ASSERT_OFFSET(lru, pp_magic);
> +NETMEM_DESC_ASSERT_OFFSET(mapping, _pp_mapping_pad);
> #undef NETMEM_DESC_ASSERT_OFFSET
>
> /*
> ---
> Changes from v3:
> 1. Relocates ->owner and ->type of net_iov out of netmem_desc
> and make them be net_iov specific.
> 2. Remove __force when casting struct page to struct netmem_desc.
>
> Changes from v2:
> 1. Introduce a netmem API, virt_to_head_netmem(), and use it
> when it's needed.
> 2. Introduce struct netmem_desc as a new struct and union'ed
> with the existing fields in struct net_iov.
> 3. Make page_pool_page_is_pp() access ->pp_magic through struct
> netmem_desc instead of struct page.
> 4. Move netmem alloc APIs from include/net/netmem.h to
> net/core/netmem_priv.h.
> 5. Apply trivial feedbacks, thanks to Mina, Pavel, and Toke.
> 6. Add given 'Reviewed-by's, thanks to Mina.
>
> Changes from v1:
> 1. Rebase on net-next's main as of May 26.
> 2. Check checkpatch.pl, feedbacked by SJ Park.
> 3. Add converting of page to netmem in mt76.
> 4. Revert 'mlx5: use netmem descriptor and APIs for page pool'
> since it's on-going by Tariq Toukan. I will wait for his
> work to be done.
> 5. Revert 'page_pool: use netmem APIs to access page->pp_magic
> in page_pool_page_is_pp()' since we need more discussion.
> 6. Revert 'mm, netmem: remove the page pool members in struct
> page' since there are some prerequisite works to remove the
> page pool fields from struct page. I can submit this patch
> separately later.
> 7. Cancel relocating a page pool member in struct page.
> 8. Modify static assert for offsets and size of struct
> netmem_desc.
>
> Changes from rfc:
> 1. Rebase on net-next's main branch.
> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/
> 2. Fix a build error reported by kernel test robot.
> https://lore.kernel.org/all/202505100932.uzAMBW1y-lkp@intel.com/
> 3. Add given 'Reviewed-by's, thanks to Mina and Ilias.
> 4. Do static_assert() on the size of struct netmem_desc instead
> of placing a place-holder in struct page, as suggested by
> Matthew.
> 5. Do struct_group_tagged(netmem_desc) on struct net_iov instead
> of wholly renaming it to struct netmem_desc, as suggested by
> Mina and Pavel.
>
> Byungchul Park (18):
> netmem: introduce struct netmem_desc mirroring struct page
> netmem: introduce netmem alloc APIs to wrap page alloc APIs
> page_pool: use netmem alloc/put APIs in __page_pool_alloc_page_order()
> page_pool: rename __page_pool_alloc_page_order() to
> __page_pool_alloc_netmem_order()
> page_pool: use netmem alloc/put APIs in __page_pool_alloc_pages_slow()
> page_pool: rename page_pool_return_page() to page_pool_return_netmem()
> page_pool: use netmem put API in page_pool_return_netmem()
> page_pool: rename __page_pool_release_page_dma() to
> __page_pool_release_netmem_dma()
> page_pool: rename __page_pool_put_page() to __page_pool_put_netmem()
> page_pool: rename __page_pool_alloc_pages_slow() to
> __page_pool_alloc_netmems_slow()
> mlx4: use netmem descriptor and APIs for page pool
> netmem: use _Generic to cover const casting for page_to_netmem()
> netmem: remove __netmem_get_pp()
> page_pool: make page_pool_get_dma_addr() just wrap
> page_pool_get_dma_addr_netmem()
> netdevsim: use netmem descriptor and APIs for page pool
> netmem: introduce a netmem API, virt_to_head_netmem()
> mt76: use netmem descriptor and APIs for page pool
> page_pool: access ->pp_magic through struct netmem_desc in
> page_pool_page_is_pp()
>
> drivers/net/ethernet/mellanox/mlx4/en_rx.c | 48 +++---
> drivers/net/ethernet/mellanox/mlx4/en_tx.c | 8 +-
> drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 4 +-
> drivers/net/netdevsim/netdev.c | 19 +--
> drivers/net/netdevsim/netdevsim.h | 2 +-
> drivers/net/wireless/mediatek/mt76/dma.c | 6 +-
> drivers/net/wireless/mediatek/mt76/mt76.h | 12 +-
> .../net/wireless/mediatek/mt76/sdio_txrx.c | 24 +--
> drivers/net/wireless/mediatek/mt76/usb.c | 10 +-
> include/linux/mm.h | 12 --
> include/net/netmem.h | 138 ++++++++++++------
> include/net/page_pool/helpers.h | 7 +-
> mm/page_alloc.c | 1 +
> net/core/netmem_priv.h | 14 ++
> net/core/page_pool.c | 103 ++++++-------
> 15 files changed, 234 insertions(+), 174 deletions(-)
>
>
> base-commit: 90b83efa6701656e02c86e7df2cb1765ea602d07
> --
> 2.17.1
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [RFC v4 02/18] netmem: introduce netmem alloc APIs to wrap page alloc APIs
2025-06-04 2:52 ` [RFC v4 02/18] netmem: introduce netmem alloc APIs to wrap page alloc APIs Byungchul Park
@ 2025-06-04 15:14 ` Suren Baghdasaryan
2025-06-05 0:53 ` Byungchul Park
2025-06-05 10:05 ` Pavel Begunkov
1 sibling, 1 reply; 65+ messages in thread
From: Suren Baghdasaryan @ 2025-06-04 15:14 UTC (permalink / raw)
To: Byungchul Park
Cc: willy, netdev, linux-kernel, linux-mm, kernel_team, kuba,
almasrymina, ilias.apalodimas, harry.yoo, hawk, akpm, davem,
john.fastabend, andrew+netdev, asml.silence, toke, tariqt,
edumazet, pabeni, saeedm, leon, ast, daniel, david,
lorenzo.stoakes, Liam.Howlett, vbabka, rppt, mhocko, horms,
linux-rdma, bpf, vishal.moola
On Tue, Jun 3, 2025 at 7:53 PM Byungchul Park <byungchul@sk.com> wrote:
>
> To eliminate the use of struct page in page pool, the page pool code
> should use netmem descriptor and APIs instead.
>
> As part of the work, introduce netmem alloc APIs allowing the code to
> use them rather than the existing APIs for struct page.
>
> Signed-off-by: Byungchul Park <byungchul@sk.com>
> ---
> net/core/netmem_priv.h | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/net/core/netmem_priv.h b/net/core/netmem_priv.h
> index cd95394399b4..32e390908bb2 100644
> --- a/net/core/netmem_priv.h
> +++ b/net/core/netmem_priv.h
> @@ -59,4 +59,18 @@ static inline void netmem_set_dma_index(netmem_ref netmem,
> magic = netmem_get_pp_magic(netmem) | (id << PP_DMA_INDEX_SHIFT);
> __netmem_clear_lsb(netmem)->pp_magic = magic;
> }
> +
> +static inline netmem_ref alloc_netmems_node(int nid, gfp_t gfp_mask,
> + unsigned int order)
> +{
> + return page_to_netmem(alloc_pages_node(nid, gfp_mask, order));
> +}
> +
> +static inline unsigned long alloc_netmems_bulk_node(gfp_t gfp, int nid,
> + unsigned long nr_netmems,
> + netmem_ref *netmem_array)
> +{
> + return alloc_pages_bulk_node(gfp, nid, nr_netmems,
> + (struct page **)netmem_array);
> +}
Note: if you want these allocations to be reported in a separate line
inside /proc/allocinfo you need to use alloc_hooks() like this:
static inline unsigned long alloc_netmems_bulk_node_noprof(gfp_t gfp, int nid,
unsigned long nr_netmems,
netmem_ref *netmem_array)
{
return alloc_pages_bulk_node_noprof(gfp, nid, nr_netmems,
(struct page **)netmem_array);
}
#define alloc_netmems_bulk_node(...) \
alloc_hooks(alloc_netmems_bulk_node_noprof(__VA_ARGS__))
> #endif
> --
> 2.17.1
>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [RFC v4 01/18] netmem: introduce struct netmem_desc mirroring struct page
2025-06-04 2:52 ` [RFC v4 01/18] netmem: introduce struct netmem_desc mirroring " Byungchul Park
@ 2025-06-04 16:53 ` Toke Høiland-Jørgensen
2025-06-05 10:03 ` Pavel Begunkov
2025-06-05 19:34 ` Mina Almasry
2 siblings, 0 replies; 65+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-06-04 16:53 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, tariqt, edumazet, pabeni, saeedm,
leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka,
rppt, surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
Byungchul Park <byungchul@sk.com> writes:
> To simplify struct page, the page pool members of struct page should be
> moved elsewhere, allowing these members to be removed from struct page.
>
> Introduce a network memory descriptor to store the members, struct
> netmem_desc, and make it union'ed with the existing fields in struct
> net_iov, making it possible to organize the fields of struct net_iov.
>
> Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [RFC v4 03/18] page_pool: use netmem alloc/put APIs in __page_pool_alloc_page_order()
2025-06-04 2:52 ` [RFC v4 03/18] page_pool: use netmem alloc/put APIs in __page_pool_alloc_page_order() Byungchul Park
@ 2025-06-04 16:54 ` Toke Høiland-Jørgensen
2025-06-05 10:26 ` Pavel Begunkov
1 sibling, 0 replies; 65+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-06-04 16:54 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, tariqt, edumazet, pabeni, saeedm,
leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka,
rppt, surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
Byungchul Park <byungchul@sk.com> writes:
> Use netmem alloc/put APIs instead of page alloc/put APIs and make it
> return netmem_ref instead of struct page * in
> __page_pool_alloc_page_order().
>
> Signed-off-by: Byungchul Park <byungchul@sk.com>
> Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [RFC v4 04/18] page_pool: rename __page_pool_alloc_page_order() to __page_pool_alloc_netmem_order()
2025-06-04 2:52 ` [RFC v4 04/18] page_pool: rename __page_pool_alloc_page_order() to __page_pool_alloc_netmem_order() Byungchul Park
@ 2025-06-04 16:54 ` Toke Høiland-Jørgensen
2025-06-05 10:28 ` Pavel Begunkov
1 sibling, 0 replies; 65+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-06-04 16:54 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, tariqt, edumazet, pabeni, saeedm,
leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka,
rppt, surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
Byungchul Park <byungchul@sk.com> writes:
> Now that __page_pool_alloc_page_order() uses netmem alloc/put APIs, not
> page alloc/put APIs, rename it to __page_pool_alloc_netmem_order() to
> reflect what it does.
>
> Signed-off-by: Byungchul Park <byungchul@sk.com>
> Reviewed-by: Mina Almasry <almasrymina@google.com>
I think it would be OK to squash this with the preceding patch; but
regardless:
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [RFC v4 07/18] page_pool: use netmem put API in page_pool_return_netmem()
2025-06-04 2:52 ` [RFC v4 07/18] page_pool: use netmem put API in page_pool_return_netmem() Byungchul Park
@ 2025-06-04 16:54 ` Toke Høiland-Jørgensen
2025-06-05 10:33 ` Pavel Begunkov
1 sibling, 0 replies; 65+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-06-04 16:54 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, tariqt, edumazet, pabeni, saeedm,
leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka,
rppt, surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
Byungchul Park <byungchul@sk.com> writes:
> Use netmem put API, put_netmem(), instead of put_page() in
> page_pool_return_netmem().
>
> While at it, delete #include <linux/mm.h> since the last put_page() in
> page_pool.c has just been removed by this patch.
>
> Signed-off-by: Byungchul Park <byungchul@sk.com>
> Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [RFC v4 08/18] page_pool: rename __page_pool_release_page_dma() to __page_pool_release_netmem_dma()
2025-06-04 2:52 ` [RFC v4 08/18] page_pool: rename __page_pool_release_page_dma() to __page_pool_release_netmem_dma() Byungchul Park
@ 2025-06-04 16:55 ` Toke Høiland-Jørgensen
2025-06-05 10:34 ` Pavel Begunkov
1 sibling, 0 replies; 65+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-06-04 16:55 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, tariqt, edumazet, pabeni, saeedm,
leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka,
rppt, surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
Byungchul Park <byungchul@sk.com> writes:
> Now that __page_pool_release_page_dma() is for releasing netmem, not
> struct page, rename it to __page_pool_release_netmem_dma() to reflect
> what it does.
>
> Signed-off-by: Byungchul Park <byungchul@sk.com>
> Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [RFC v4 09/18] page_pool: rename __page_pool_put_page() to __page_pool_put_netmem()
2025-06-04 2:52 ` [RFC v4 09/18] page_pool: rename __page_pool_put_page() to __page_pool_put_netmem() Byungchul Park
@ 2025-06-04 16:55 ` Toke Høiland-Jørgensen
2025-06-05 10:35 ` Pavel Begunkov
1 sibling, 0 replies; 65+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-06-04 16:55 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, tariqt, edumazet, pabeni, saeedm,
leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka,
rppt, surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
Byungchul Park <byungchul@sk.com> writes:
> Now that __page_pool_put_page() puts netmem, not struct page, rename it
> to __page_pool_put_netmem() to reflect what it does.
>
> Signed-off-by: Byungchul Park <byungchul@sk.com>
> Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [RFC v4 12/18] netmem: use _Generic to cover const casting for page_to_netmem()
2025-06-04 2:52 ` [RFC v4 12/18] netmem: use _Generic to cover const casting for page_to_netmem() Byungchul Park
@ 2025-06-04 16:55 ` Toke Høiland-Jørgensen
2025-06-05 10:40 ` Pavel Begunkov
1 sibling, 0 replies; 65+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-06-04 16:55 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, tariqt, edumazet, pabeni, saeedm,
leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka,
rppt, surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
Byungchul Park <byungchul@sk.com> writes:
> The current page_to_netmem() doesn't cover const casting, so trying to
> cast a const struct page * to a const netmem_ref fails.
>
> To cover the case, change page_to_netmem() to use macro and _Generic.
>
> Signed-off-by: Byungchul Park <byungchul@sk.com>
> Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [RFC v4 13/18] netmem: remove __netmem_get_pp()
2025-06-04 2:52 ` [RFC v4 13/18] netmem: remove __netmem_get_pp() Byungchul Park
@ 2025-06-04 16:56 ` Toke Høiland-Jørgensen
2025-06-05 10:41 ` Pavel Begunkov
1 sibling, 0 replies; 65+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-06-04 16:56 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, tariqt, edumazet, pabeni, saeedm,
leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka,
rppt, surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
Byungchul Park <byungchul@sk.com> writes:
> There are no users of __netmem_get_pp(). Remove it.
>
> Signed-off-by: Byungchul Park <byungchul@sk.com>
> Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [RFC v4 14/18] page_pool: make page_pool_get_dma_addr() just wrap page_pool_get_dma_addr_netmem()
2025-06-04 2:52 ` [RFC v4 14/18] page_pool: make page_pool_get_dma_addr() just wrap page_pool_get_dma_addr_netmem() Byungchul Park
@ 2025-06-04 16:57 ` Toke Høiland-Jørgensen
2025-06-05 10:45 ` Pavel Begunkov
1 sibling, 0 replies; 65+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-06-04 16:57 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, tariqt, edumazet, pabeni, saeedm,
leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka,
rppt, surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
Byungchul Park <byungchul@sk.com> writes:
> The page pool members in struct page cannot be removed as long as any
> code is still allowed to access them via struct page.
>
> Do not access 'page->dma_addr' directly in page_pool_get_dma_addr() but
> just wrap page_pool_get_dma_addr_netmem() safely.
>
> Signed-off-by: Byungchul Park <byungchul@sk.com>
> Reviewed-by: Mina Almasry <almasrymina@google.com>
> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [RFC v4 16/18] netmem: introduce a netmem API, virt_to_head_netmem()
2025-06-04 2:52 ` [RFC v4 16/18] netmem: introduce a netmem API, virt_to_head_netmem() Byungchul Park
@ 2025-06-04 16:59 ` Toke Høiland-Jørgensen
2025-06-05 10:45 ` Pavel Begunkov
2025-06-05 19:43 ` Mina Almasry
2 siblings, 0 replies; 65+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-06-04 16:59 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, tariqt, edumazet, pabeni, saeedm,
leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka,
rppt, surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
Byungchul Park <byungchul@sk.com> writes:
> To eliminate the use of struct page in page pool, the page pool code
> should use netmem descriptor and APIs instead.
>
> As part of the work, introduce a netmem API to convert a virtual address
> to a head netmem, allowing the code to use it rather than the existing
> struct page API, virt_to_head_page().
>
> Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [RFC v4 18/18] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp()
2025-06-04 2:52 ` [RFC v4 18/18] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp() Byungchul Park
@ 2025-06-04 16:59 ` Toke Høiland-Jørgensen
2025-06-05 10:56 ` Pavel Begunkov
1 sibling, 0 replies; 65+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-06-04 16:59 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, tariqt, edumazet, pabeni, saeedm,
leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka,
rppt, surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
Byungchul Park <byungchul@sk.com> writes:
> To simplify struct page, each of its users needs its own descriptor
> separated out of struct page, and the work for page pool is ongoing.
>
> To achieve that, all the code should avoid directly accessing page pool
> members of struct page.
>
> Access ->pp_magic through struct netmem_desc instead of directly
> accessing it through struct page in page_pool_page_is_pp(). Plus, move
> page_pool_page_is_pp() from mm.h to netmem.h to use struct netmem_desc
> without header dependency issues.
>
> Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [RFC v4 05/18] page_pool: use netmem alloc/put APIs in __page_pool_alloc_pages_slow()
2025-06-04 2:52 ` [RFC v4 05/18] page_pool: use netmem alloc/put APIs in __page_pool_alloc_pages_slow() Byungchul Park
@ 2025-06-04 17:02 ` Toke Høiland-Jørgensen
2025-06-05 10:30 ` Pavel Begunkov
1 sibling, 0 replies; 65+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-06-04 17:02 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, tariqt, edumazet, pabeni, saeedm,
leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka,
rppt, surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
Byungchul Park <byungchul@sk.com> writes:
> Use netmem alloc/put APIs instead of page alloc/put APIs in
> __page_pool_alloc_pages_slow().
>
> While at it, improve some comments.
>
> Signed-off-by: Byungchul Park <byungchul@sk.com>
> Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [RFC v4 06/18] page_pool: rename page_pool_return_page() to page_pool_return_netmem()
2025-06-04 2:52 ` [RFC v4 06/18] page_pool: rename page_pool_return_page() to page_pool_return_netmem() Byungchul Park
@ 2025-06-04 17:03 ` Toke Høiland-Jørgensen
2025-06-05 10:31 ` Pavel Begunkov
1 sibling, 0 replies; 65+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-06-04 17:03 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, tariqt, edumazet, pabeni, saeedm,
leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka,
rppt, surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
Byungchul Park <byungchul@sk.com> writes:
> Now that page_pool_return_page() is for returning netmem, not struct
> page, rename it to page_pool_return_netmem() to reflect what it does.
>
> Signed-off-by: Byungchul Park <byungchul@sk.com>
> Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
* Re: [RFC v4 10/18] page_pool: rename __page_pool_alloc_pages_slow() to __page_pool_alloc_netmems_slow()
2025-06-04 2:52 ` [RFC v4 10/18] page_pool: rename __page_pool_alloc_pages_slow() to __page_pool_alloc_netmems_slow() Byungchul Park
@ 2025-06-04 17:03 ` Toke Høiland-Jørgensen
0 siblings, 0 replies; 65+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-06-04 17:03 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, tariqt, edumazet, pabeni, saeedm,
leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka,
rppt, surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
Byungchul Park <byungchul@sk.com> writes:
> Now that __page_pool_alloc_pages_slow() is for allocating netmem, not
> struct page, rename it to __page_pool_alloc_netmems_slow() to reflect
> what it does.
>
> Signed-off-by: Byungchul Park <byungchul@sk.com>
> Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
* Re: [RFC v4 02/18] netmem: introduce netmem alloc APIs to wrap page alloc APIs
2025-06-04 15:14 ` Suren Baghdasaryan
@ 2025-06-05 0:53 ` Byungchul Park
0 siblings, 0 replies; 65+ messages in thread
From: Byungchul Park @ 2025-06-05 0:53 UTC (permalink / raw)
To: Suren Baghdasaryan
Cc: willy, netdev, linux-kernel, linux-mm, kernel_team, kuba,
almasrymina, ilias.apalodimas, harry.yoo, hawk, akpm, davem,
john.fastabend, andrew+netdev, asml.silence, toke, tariqt,
edumazet, pabeni, saeedm, leon, ast, daniel, david,
lorenzo.stoakes, Liam.Howlett, vbabka, rppt, mhocko, horms,
linux-rdma, bpf, vishal.moola
On Wed, Jun 04, 2025 at 08:14:18AM -0700, Suren Baghdasaryan wrote:
> On Tue, Jun 3, 2025 at 7:53 PM Byungchul Park <byungchul@sk.com> wrote:
> >
> > To eliminate the use of struct page in page pool, the page pool code
> > should use netmem descriptor and APIs instead.
> >
> > As part of the work, introduce netmem alloc APIs allowing the code to
> > use them rather than the existing APIs for struct page.
> >
> > Signed-off-by: Byungchul Park <byungchul@sk.com>
> > ---
> > net/core/netmem_priv.h | 14 ++++++++++++++
> > 1 file changed, 14 insertions(+)
> >
> > diff --git a/net/core/netmem_priv.h b/net/core/netmem_priv.h
> > index cd95394399b4..32e390908bb2 100644
> > --- a/net/core/netmem_priv.h
> > +++ b/net/core/netmem_priv.h
> > @@ -59,4 +59,18 @@ static inline void netmem_set_dma_index(netmem_ref netmem,
> > magic = netmem_get_pp_magic(netmem) | (id << PP_DMA_INDEX_SHIFT);
> > __netmem_clear_lsb(netmem)->pp_magic = magic;
> > }
> > +
> > +static inline netmem_ref alloc_netmems_node(int nid, gfp_t gfp_mask,
> > + unsigned int order)
> > +{
> > + return page_to_netmem(alloc_pages_node(nid, gfp_mask, order));
> > +}
> > +
> > +static inline unsigned long alloc_netmems_bulk_node(gfp_t gfp, int nid,
> > + unsigned long nr_netmems,
> > + netmem_ref *netmem_array)
> > +{
> > + return alloc_pages_bulk_node(gfp, nid, nr_netmems,
> > + (struct page **)netmem_array);
> > +}
>
> Note: if you want these allocations to be reported on a separate line
> inside /proc/allocinfo, you need to use alloc_hooks() like this:
Ah, it looks better to use alloc_hooks(). Thanks.
Byungchul
>
> static inline unsigned long alloc_netmems_bulk_node_noprof(gfp_t gfp, int nid,
> unsigned long nr_netmems,
> netmem_ref *netmem_array)
> {
> return alloc_pages_bulk_node_noprof(gfp, nid, nr_netmems,
> (struct page **)netmem_array);
> }
>
> #define alloc_netmems_bulk_node(...) \
> alloc_hooks(alloc_netmems_bulk_node_noprof(__VA_ARGS__))
>
>
>
> > #endif
> > --
> > 2.17.1
> >
* Re: [RFC v4 01/18] netmem: introduce struct netmem_desc mirroring struct page
2025-06-04 2:52 ` [RFC v4 01/18] netmem: introduce struct netmem_desc mirroring " Byungchul Park
2025-06-04 16:53 ` Toke Høiland-Jørgensen
@ 2025-06-05 10:03 ` Pavel Begunkov
2025-06-05 10:04 ` Pavel Begunkov
2025-06-05 19:34 ` Mina Almasry
2 siblings, 1 reply; 65+ messages in thread
From: Pavel Begunkov @ 2025-06-05 10:03 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, toke, tariqt, edumazet, pabeni, saeedm, leon, ast,
daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka, rppt,
surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
On 6/4/25 03:52, Byungchul Park wrote:
> To simplify struct page, the page pool members of struct page should be
> moved elsewhere, allowing these members to be removed from struct page.
>
> Introduce a network memory descriptor, struct netmem_desc, to store
> these members, and union it with the existing fields in struct net_iov,
> allowing the fields of struct net_iov to be organized.
Pavel Begunkov <asml.silence@gmail.com>
--
Pavel Begunkov
* Re: [RFC v4 01/18] netmem: introduce struct netmem_desc mirroring struct page
2025-06-05 10:03 ` Pavel Begunkov
@ 2025-06-05 10:04 ` Pavel Begunkov
0 siblings, 0 replies; 65+ messages in thread
From: Pavel Begunkov @ 2025-06-05 10:04 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, toke, tariqt, edumazet, pabeni, saeedm, leon, ast,
daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka, rppt,
surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
On 6/5/25 11:03, Pavel Begunkov wrote:
> On 6/4/25 03:52, Byungchul Park wrote:
>> To simplify struct page, the page pool members of struct page should be
>> moved elsewhere, allowing these members to be removed from struct page.
>>
>> Introduce a network memory descriptor, struct netmem_desc, to store
>> these members, and union it with the existing fields in struct net_iov,
>> allowing the fields of struct net_iov to be organized.
>
> Pavel Begunkov <asml.silence@gmail.com>
Oops, it should be
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
--
Pavel Begunkov
* Re: [RFC v4 02/18] netmem: introduce netmem alloc APIs to wrap page alloc APIs
2025-06-04 2:52 ` [RFC v4 02/18] netmem: introduce netmem alloc APIs to wrap page alloc APIs Byungchul Park
2025-06-04 15:14 ` Suren Baghdasaryan
@ 2025-06-05 10:05 ` Pavel Begunkov
1 sibling, 0 replies; 65+ messages in thread
From: Pavel Begunkov @ 2025-06-05 10:05 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, toke, tariqt, edumazet, pabeni, saeedm, leon, ast,
daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka, rppt,
surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
On 6/4/25 03:52, Byungchul Park wrote:
> To eliminate the use of struct page in page pool, the page pool code
> should use netmem descriptor and APIs instead.
>
> As part of the work, introduce netmem alloc APIs allowing the code to
> use them rather than the existing APIs for struct page.
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
--
Pavel Begunkov
* Re: [RFC v4 03/18] page_pool: use netmem alloc/put APIs in __page_pool_alloc_page_order()
2025-06-04 2:52 ` [RFC v4 03/18] page_pool: use netmem alloc/put APIs in __page_pool_alloc_page_order() Byungchul Park
2025-06-04 16:54 ` Toke Høiland-Jørgensen
@ 2025-06-05 10:26 ` Pavel Begunkov
2025-06-05 19:39 ` Mina Almasry
1 sibling, 1 reply; 65+ messages in thread
From: Pavel Begunkov @ 2025-06-05 10:26 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, toke, tariqt, edumazet, pabeni, saeedm, leon, ast,
daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka, rppt,
surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
On 6/4/25 03:52, Byungchul Park wrote:
> Use netmem alloc/put APIs instead of page alloc/put APIs and make it
> return netmem_ref instead of struct page * in
> __page_pool_alloc_page_order().
>
> Signed-off-by: Byungchul Park <byungchul@sk.com>
> Reviewed-by: Mina Almasry <almasrymina@google.com>
> ---
> net/core/page_pool.c | 26 +++++++++++++-------------
> 1 file changed, 13 insertions(+), 13 deletions(-)
>
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index 4011eb305cee..523354f2db1c 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -518,29 +518,29 @@ static bool page_pool_dma_map(struct page_pool *pool, netmem_ref netmem, gfp_t g
> return false;
> }
>
> -static struct page *__page_pool_alloc_page_order(struct page_pool *pool,
> - gfp_t gfp)
> +static netmem_ref __page_pool_alloc_page_order(struct page_pool *pool,
> + gfp_t gfp)
> {
> - struct page *page;
> + netmem_ref netmem;
>
> gfp |= __GFP_COMP;
> - page = alloc_pages_node(pool->p.nid, gfp, pool->p.order);
> - if (unlikely(!page))
> - return NULL;
> + netmem = alloc_netmems_node(pool->p.nid, gfp, pool->p.order);
> + if (unlikely(!netmem))
> + return 0;
>
> - if (pool->dma_map && unlikely(!page_pool_dma_map(pool, page_to_netmem(page), gfp))) {
> - put_page(page);
> - return NULL;
> + if (pool->dma_map && unlikely(!page_pool_dma_map(pool, netmem, gfp))) {
> + put_netmem(netmem);
It's a bad idea to have {put,get}_netmem in page pool's code; it has
different semantics from what page pool expects for net_iov. I.e.
instead of releasing the netmem and allowing it to be reallocated by
page pool, put_netmem(niov) will drop a memory provider reference and
leak the net_iov. Depending on the implementation it might even underflow
mp refs if a net_iov is ever passed here.
The second problem is that you pass it to page_pool_dma_map(), which
works only with struct page and not net_iov, and so it just
unconditionally casts it back to struct page. Which is, to be fair,
a pre-existing issue.
This function deals with pages, so can we just use pages instead and
cast to netmem when needed? Something like this pseudo code:
netmem_ref __page_pool_alloc_page_order()
{
	struct page *p = alloc_pages_order();
	netmem_ref netmem;

	netmem = page_to_netmem(p);
	if (!page_pool_dma_map(netmem)) {
		put_page(p);
		return 0;
	}
	return netmem;
}
--
Pavel Begunkov
* Re: [RFC v4 04/18] page_pool: rename __page_pool_alloc_page_order() to __page_pool_alloc_netmem_order()
2025-06-04 2:52 ` [RFC v4 04/18] page_pool: rename __page_pool_alloc_page_order() to __page_pool_alloc_netmem_order() Byungchul Park
2025-06-04 16:54 ` Toke Høiland-Jørgensen
@ 2025-06-05 10:28 ` Pavel Begunkov
1 sibling, 0 replies; 65+ messages in thread
From: Pavel Begunkov @ 2025-06-05 10:28 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, toke, tariqt, edumazet, pabeni, saeedm, leon, ast,
daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka, rppt,
surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
On 6/4/25 03:52, Byungchul Park wrote:
> Now that __page_pool_alloc_page_order() uses netmem alloc/put APIs, not
> page alloc/put APIs, rename it to __page_pool_alloc_netmem_order() to
> reflect what it does.
FWIW, I think the current name is better, the function allocates pages,
even if it wraps them as netmems.
--
Pavel Begunkov
* Re: [RFC v4 05/18] page_pool: use netmem alloc/put APIs in __page_pool_alloc_pages_slow()
2025-06-04 2:52 ` [RFC v4 05/18] page_pool: use netmem alloc/put APIs in __page_pool_alloc_pages_slow() Byungchul Park
2025-06-04 17:02 ` Toke Høiland-Jørgensen
@ 2025-06-05 10:30 ` Pavel Begunkov
1 sibling, 0 replies; 65+ messages in thread
From: Pavel Begunkov @ 2025-06-05 10:30 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, toke, tariqt, edumazet, pabeni, saeedm, leon, ast,
daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka, rppt,
surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
On 6/4/25 03:52, Byungchul Park wrote:
> Use netmem alloc/put APIs instead of page alloc/put APIs in
> __page_pool_alloc_pages_slow().
>
> While at it, improve some comments.
Same comment as with Patch 3
--
Pavel Begunkov
* Re: [RFC v4 06/18] page_pool: rename page_pool_return_page() to page_pool_return_netmem()
2025-06-04 2:52 ` [RFC v4 06/18] page_pool: rename page_pool_return_page() to page_pool_return_netmem() Byungchul Park
2025-06-04 17:03 ` Toke Høiland-Jørgensen
@ 2025-06-05 10:31 ` Pavel Begunkov
1 sibling, 0 replies; 65+ messages in thread
From: Pavel Begunkov @ 2025-06-05 10:31 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, toke, tariqt, edumazet, pabeni, saeedm, leon, ast,
daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka, rppt,
surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
On 6/4/25 03:52, Byungchul Park wrote:
> Now that page_pool_return_page() is for returning netmem, not struct
> page, rename it to page_pool_return_netmem() to reflect what it does.
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
--
Pavel Begunkov
* Re: [RFC v4 07/18] page_pool: use netmem put API in page_pool_return_netmem()
2025-06-04 2:52 ` [RFC v4 07/18] page_pool: use netmem put API in page_pool_return_netmem() Byungchul Park
2025-06-04 16:54 ` Toke Høiland-Jørgensen
@ 2025-06-05 10:33 ` Pavel Begunkov
1 sibling, 0 replies; 65+ messages in thread
From: Pavel Begunkov @ 2025-06-05 10:33 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, toke, tariqt, edumazet, pabeni, saeedm, leon, ast,
daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka, rppt,
surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
On 6/4/25 03:52, Byungchul Park wrote:
> Use netmem put API, put_netmem(), instead of put_page() in
> page_pool_return_netmem().
>
> While at it, delete #include <linux/mm.h> since the last put_page() in
> page_pool.c has just been removed by this patch.
>
> Signed-off-by: Byungchul Park <byungchul@sk.com>
> Reviewed-by: Mina Almasry <almasrymina@google.com>
> ---
> net/core/page_pool.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index b7680dcb83e4..dab89bc69f10 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -20,7 +20,6 @@
> #include <linux/dma-direction.h>
> #include <linux/dma-mapping.h>
> #include <linux/page-flags.h>
> -#include <linux/mm.h> /* for put_page() */
> #include <linux/poison.h>
> #include <linux/ethtool.h>
> #include <linux/netdevice.h>
> @@ -712,7 +711,7 @@ static __always_inline void __page_pool_release_page_dma(struct page_pool *pool,
> /* Disconnects a page (from a page_pool). API users can have a need
> * to disconnect a page (from a page_pool), to allow it to be used as
> * a regular page (that will eventually be returned to the normal
> - * page-allocator via put_page).
> + * page-allocator via put_netmem()).
> */
> static void page_pool_return_netmem(struct page_pool *pool, netmem_ref netmem)
> {
> @@ -733,7 +732,7 @@ static void page_pool_return_netmem(struct page_pool *pool, netmem_ref netmem)
>
> if (put) {
> page_pool_clear_pp_info(netmem);
> - put_page(netmem_to_page(netmem));
> + put_netmem(netmem);
Same comment as well. I guess we shouldn't even be returning "put"
from memory providers.
--
Pavel Begunkov
* Re: [RFC v4 08/18] page_pool: rename __page_pool_release_page_dma() to __page_pool_release_netmem_dma()
2025-06-04 2:52 ` [RFC v4 08/18] page_pool: rename __page_pool_release_page_dma() to __page_pool_release_netmem_dma() Byungchul Park
2025-06-04 16:55 ` Toke Høiland-Jørgensen
@ 2025-06-05 10:34 ` Pavel Begunkov
1 sibling, 0 replies; 65+ messages in thread
From: Pavel Begunkov @ 2025-06-05 10:34 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, toke, tariqt, edumazet, pabeni, saeedm, leon, ast,
daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka, rppt,
surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
On 6/4/25 03:52, Byungchul Park wrote:
> Now that __page_pool_release_page_dma() is for releasing netmem, not
> struct page, rename it to __page_pool_release_netmem_dma() to reflect
> what it does.
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
--
Pavel Begunkov
* Re: [RFC v4 09/18] page_pool: rename __page_pool_put_page() to __page_pool_put_netmem()
2025-06-04 2:52 ` [RFC v4 09/18] page_pool: rename __page_pool_put_page() to __page_pool_put_netmem() Byungchul Park
2025-06-04 16:55 ` Toke Høiland-Jørgensen
@ 2025-06-05 10:35 ` Pavel Begunkov
2025-06-05 10:39 ` Pavel Begunkov
1 sibling, 1 reply; 65+ messages in thread
From: Pavel Begunkov @ 2025-06-05 10:35 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, toke, tariqt, edumazet, pabeni, saeedm, leon, ast,
daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka, rppt,
surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
On 6/4/25 03:52, Byungchul Park wrote:
> Now that __page_pool_put_page() puts netmem, not struct page, rename it
> to __page_pool_put_netmem() to reflect what it does.
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
--
Pavel Begunkov
* Re: [RFC v4 09/18] page_pool: rename __page_pool_put_page() to __page_pool_put_netmem()
2025-06-05 10:35 ` Pavel Begunkov
@ 2025-06-05 10:39 ` Pavel Begunkov
0 siblings, 0 replies; 65+ messages in thread
From: Pavel Begunkov @ 2025-06-05 10:39 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, toke, tariqt, edumazet, pabeni, saeedm, leon, ast,
daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka, rppt,
surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
On 6/5/25 11:35, Pavel Begunkov wrote:
> On 6/4/25 03:52, Byungchul Park wrote:
>> Now that __page_pool_put_page() puts netmem, not struct page, rename it
>> to __page_pool_put_netmem() to reflect what it does.
>
> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Actually, the function is for non-mp struct pages only, so it would make
sense to use struct page instead. I'd even argue that it's better
to change the argument to struct page *.
--
Pavel Begunkov
* Re: [RFC v4 12/18] netmem: use _Generic to cover const casting for page_to_netmem()
2025-06-04 2:52 ` [RFC v4 12/18] netmem: use _Generic to cover const casting for page_to_netmem() Byungchul Park
2025-06-04 16:55 ` Toke Høiland-Jørgensen
@ 2025-06-05 10:40 ` Pavel Begunkov
1 sibling, 0 replies; 65+ messages in thread
From: Pavel Begunkov @ 2025-06-05 10:40 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, toke, tariqt, edumazet, pabeni, saeedm, leon, ast,
daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka, rppt,
surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
On 6/4/25 03:52, Byungchul Park wrote:
> The current page_to_netmem() doesn't cover const casting, so trying to
> cast a const struct page * to a const netmem_ref fails.
>
> To cover the case, change page_to_netmem() to use macro and _Generic.
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
--
Pavel Begunkov
* Re: [RFC v4 13/18] netmem: remove __netmem_get_pp()
2025-06-04 2:52 ` [RFC v4 13/18] netmem: remove __netmem_get_pp() Byungchul Park
2025-06-04 16:56 ` Toke Høiland-Jørgensen
@ 2025-06-05 10:41 ` Pavel Begunkov
1 sibling, 0 replies; 65+ messages in thread
From: Pavel Begunkov @ 2025-06-05 10:41 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, toke, tariqt, edumazet, pabeni, saeedm, leon, ast,
daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka, rppt,
surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
On 6/4/25 03:52, Byungchul Park wrote:
> There are no users of __netmem_get_pp(). Remove it.
>
> Signed-off-by: Byungchul Park <byungchul@sk.com>
> Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
--
Pavel Begunkov
* Re: [RFC v4 14/18] page_pool: make page_pool_get_dma_addr() just wrap page_pool_get_dma_addr_netmem()
2025-06-04 2:52 ` [RFC v4 14/18] page_pool: make page_pool_get_dma_addr() just wrap page_pool_get_dma_addr_netmem() Byungchul Park
2025-06-04 16:57 ` Toke Høiland-Jørgensen
@ 2025-06-05 10:45 ` Pavel Begunkov
1 sibling, 0 replies; 65+ messages in thread
From: Pavel Begunkov @ 2025-06-05 10:45 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, toke, tariqt, edumazet, pabeni, saeedm, leon, ast,
daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka, rppt,
surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
On 6/4/25 03:52, Byungchul Park wrote:
> The page pool members in struct page cannot be removed unless no code
> is allowed to access any of them via struct page.
>
> Do not access 'page->dma_addr' directly in page_pool_get_dma_addr() but
> just wrap page_pool_get_dma_addr_netmem() safely.
FWIW, it adds a small extra cost to the function, but that should be fine.
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
--
Pavel Begunkov
* Re: [RFC v4 16/18] netmem: introduce a netmem API, virt_to_head_netmem()
2025-06-04 2:52 ` [RFC v4 16/18] netmem: introduce a netmem API, virt_to_head_netmem() Byungchul Park
2025-06-04 16:59 ` Toke Høiland-Jørgensen
@ 2025-06-05 10:45 ` Pavel Begunkov
2025-06-05 19:43 ` Mina Almasry
2 siblings, 0 replies; 65+ messages in thread
From: Pavel Begunkov @ 2025-06-05 10:45 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, toke, tariqt, edumazet, pabeni, saeedm, leon, ast,
daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka, rppt,
surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
On 6/4/25 03:52, Byungchul Park wrote:
> To eliminate the use of struct page in page pool, the page pool code
> should use netmem descriptor and APIs instead.
>
> As part of the work, introduce a netmem API to convert a virtual address
> to a head netmem allowing the code to use it rather than the existing
> API, virt_to_head_page() for struct page.
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
--
Pavel Begunkov
* Re: [RFC v4 18/18] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp()
2025-06-04 2:52 ` [RFC v4 18/18] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp() Byungchul Park
2025-06-04 16:59 ` Toke Høiland-Jørgensen
@ 2025-06-05 10:56 ` Pavel Begunkov
2025-06-05 11:49 ` Harry Yoo
1 sibling, 1 reply; 65+ messages in thread
From: Pavel Begunkov @ 2025-06-05 10:56 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, toke, tariqt, edumazet, pabeni, saeedm, leon, ast,
daniel, david, lorenzo.stoakes, Liam.Howlett, vbabka, rppt,
surenb, mhocko, horms, linux-rdma, bpf, vishal.moola
On 6/4/25 03:52, Byungchul Park wrote:
> To simplify struct page, the effort to separate its own descriptor from
> struct page is required, and the work for page pool is ongoing.
>
> To achieve that, all the code should avoid directly accessing page pool
> members of struct page.
Just to clarify, are we leaving the corresponding struct page fields
for now until the final memdesc conversion is done? If so, it might be
better to leave the access in page_pool_page_is_pp() to be "page->pp_magic",
so that once removed the build fails until the helper is fixed up to
use the page->type or so.
--
Pavel Begunkov
* Re: [RFC v4 18/18] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp()
2025-06-05 10:56 ` Pavel Begunkov
@ 2025-06-05 11:49 ` Harry Yoo
2025-06-05 12:17 ` Harry Yoo
2025-06-05 19:47 ` Mina Almasry
0 siblings, 2 replies; 65+ messages in thread
From: Harry Yoo @ 2025-06-05 11:49 UTC (permalink / raw)
To: Pavel Begunkov
Cc: Byungchul Park, willy, netdev, linux-kernel, linux-mm,
kernel_team, kuba, almasrymina, ilias.apalodimas, hawk, akpm,
davem, john.fastabend, andrew+netdev, toke, tariqt, edumazet,
pabeni, saeedm, leon, ast, daniel, david, lorenzo.stoakes,
Liam.Howlett, vbabka, rppt, surenb, mhocko, horms, linux-rdma,
bpf, vishal.moola
On Thu, Jun 05, 2025 at 11:56:14AM +0100, Pavel Begunkov wrote:
> On 6/4/25 03:52, Byungchul Park wrote:
> > To simplify struct page, the effort to separate its own descriptor from
> > struct page is required, and the work for page pool is ongoing.
> >
> > To achieve that, all the code should avoid directly accessing page pool
> > members of struct page.
>
> Just to clarify, are we leaving the corresponding struct page fields
> for now until the final memdesc conversion is done?
Yes, that's correct.
> If so, it might be better to leave the access in page_pool_page_is_pp()
> to be "page->pp_magic", so that once removed the build fails until
> the helper is fixed up to use the page->type or so.
When we truly separate netmem from struct page, we won't have the 'lru'
field in memdesc (because not all types of memory are on an LRU list),
so NETMEM_DESC_ASSERT_OFFSET(lru, pp_magic) should fail.
And then page_pool_page_is_pp() should be changed to check lower bits
of memdesc pointer to identify its type.
https://kernelnewbies.org/MatthewWilcox/Memdescs/Path
--
Cheers,
Harry / Hyeonggon
* Re: [RFC v4 18/18] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp()
2025-06-05 11:49 ` Harry Yoo
@ 2025-06-05 12:17 ` Harry Yoo
2025-06-05 13:28 ` Pavel Begunkov
2025-06-05 19:47 ` Mina Almasry
1 sibling, 1 reply; 65+ messages in thread
From: Harry Yoo @ 2025-06-05 12:17 UTC (permalink / raw)
To: Pavel Begunkov
Cc: Byungchul Park, willy, netdev, linux-kernel, linux-mm,
kernel_team, kuba, almasrymina, ilias.apalodimas, hawk, akpm,
davem, john.fastabend, andrew+netdev, toke, tariqt, edumazet,
pabeni, saeedm, leon, ast, daniel, david, lorenzo.stoakes,
Liam.Howlett, vbabka, rppt, surenb, mhocko, horms, linux-rdma,
bpf, vishal.moola
On Thu, Jun 05, 2025 at 08:49:07PM +0900, Harry Yoo wrote:
> On Thu, Jun 05, 2025 at 11:56:14AM +0100, Pavel Begunkov wrote:
> > On 6/4/25 03:52, Byungchul Park wrote:
> > > To simplify struct page, the effort to separate its own descriptor from
> > > struct page is required, and the work for page pool is ongoing.
> > >
> > > To achieve that, all the code should avoid directly accessing page pool
> > > members of struct page.
> >
> > Just to clarify, are we leaving the corresponding struct page fields
> > for now until the final memdesc conversion is done?
>
> Yes, that's correct.
Oops, looks like I misread it. If by "leaving the corresponding struct page
fields" you meant "leaving netmem fields in struct page", no.
They'll be removed.
> > If so, it might be better to leave the access in page_pool_page_is_pp()
> > to be "page->pp_magic", so that once removed the build fails until
> > the helper is fixed up to use the page->type or so.
>
> When we truly separate netmem from struct page, we won't have 'lru' field
> in memdesc (because not all types of memory are on LRU list),
> so NETMEM_DESC_ASSERT_OFFSET(lru, pp_magic) should fail.
>
> And then page_pool_page_is_pp() should be changed to check lower bits
> of memdesc pointer to identify its type.
>
> https://kernelnewbies.org/MatthewWilcox/Memdescs/Path
>
> --
> Cheers,
> Harry / Hyeonggon
>
--
Cheers,
Harry / Hyeonggon
* Re: [RFC v4 18/18] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp()
2025-06-05 12:17 ` Harry Yoo
@ 2025-06-05 13:28 ` Pavel Begunkov
0 siblings, 0 replies; 65+ messages in thread
From: Pavel Begunkov @ 2025-06-05 13:28 UTC (permalink / raw)
To: Harry Yoo
Cc: Byungchul Park, willy, netdev, linux-kernel, linux-mm,
kernel_team, kuba, almasrymina, ilias.apalodimas, hawk, akpm,
davem, john.fastabend, andrew+netdev, toke, tariqt, edumazet,
pabeni, saeedm, leon, ast, daniel, david, lorenzo.stoakes,
Liam.Howlett, vbabka, rppt, surenb, mhocko, horms, linux-rdma,
bpf, vishal.moola
On 6/5/25 13:17, Harry Yoo wrote:
> On Thu, Jun 05, 2025 at 08:49:07PM +0900, Harry Yoo wrote:
>> On Thu, Jun 05, 2025 at 11:56:14AM +0100, Pavel Begunkov wrote:
>>> On 6/4/25 03:52, Byungchul Park wrote:
>>>> To simplify struct page, the effort to separate its own descriptor from
>>>> struct page is required, and the work for page pool is ongoing.
>>>>
>>>> To achieve that, all the code should avoid directly accessing page pool
>>>> members of struct page.
>>>
>>> Just to clarify, are we leaving the corresponding struct page fields
>>> for now until the final memdesc conversion is done?
>>
>> Yes, that's correct.
>
> Oops, looks like misread it. If by "leaving the corresponding struct page
> fields" you meant "leaving netmem fields in struct page", no.
> It'll be removed.
I see; in that case we might instead want to leave a reminder
in page_pool_page_is_pp in the form of a build warning, but the
patch looks fine either way.
--
Pavel Begunkov
* Re: [RFC v4 01/18] netmem: introduce struct netmem_desc mirroring struct page
2025-06-04 2:52 ` [RFC v4 01/18] netmem: introduce struct netmem_desc mirroring " Byungchul Park
2025-06-04 16:53 ` Toke Høiland-Jørgensen
2025-06-05 10:03 ` Pavel Begunkov
@ 2025-06-05 19:34 ` Mina Almasry
2 siblings, 0 replies; 65+ messages in thread
From: Mina Almasry @ 2025-06-05 19:34 UTC (permalink / raw)
To: Byungchul Park
Cc: willy, netdev, linux-kernel, linux-mm, kernel_team, kuba,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola
On Tue, Jun 3, 2025 at 7:53 PM Byungchul Park <byungchul@sk.com> wrote:
>
> To simplify struct page, the page pool members of struct page should be
> moved elsewhere, allowing these members to be removed from struct page.
>
> Introduce a network memory descriptor to store the members, struct
> netmem_desc, and union it with the existing fields in struct
> net_iov, allowing the fields of struct net_iov to be organized.
>
> Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
But if you want this merged via net-next, please follow the netdev rules:
https://docs.kernel.org/process/maintainer-netdev.html
In particular, the series needs to target the net-next tree via the
[PATCH net-next ...] prefix. And net-next is currently closed, so
resend once it reopens as non-RFC.
--
Thanks,
Mina
* Re: [RFC v4 03/18] page_pool: use netmem alloc/put APIs in __page_pool_alloc_page_order()
2025-06-05 10:26 ` Pavel Begunkov
@ 2025-06-05 19:39 ` Mina Almasry
2025-06-05 20:27 ` Pavel Begunkov
0 siblings, 1 reply; 65+ messages in thread
From: Mina Almasry @ 2025-06-05 19:39 UTC (permalink / raw)
To: Pavel Begunkov
Cc: Byungchul Park, willy, netdev, linux-kernel, linux-mm,
kernel_team, kuba, ilias.apalodimas, harry.yoo, hawk, akpm, davem,
john.fastabend, andrew+netdev, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola
On Thu, Jun 5, 2025 at 3:25 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>
> On 6/4/25 03:52, Byungchul Park wrote:
> > Use netmem alloc/put APIs instead of page alloc/put APIs and make it
> > return netmem_ref instead of struct page * in
> > __page_pool_alloc_page_order().
> >
> > Signed-off-by: Byungchul Park <byungchul@sk.com>
> > Reviewed-by: Mina Almasry <almasrymina@google.com>
> > ---
> > net/core/page_pool.c | 26 +++++++++++++-------------
> > 1 file changed, 13 insertions(+), 13 deletions(-)
> >
> > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > index 4011eb305cee..523354f2db1c 100644
> > --- a/net/core/page_pool.c
> > +++ b/net/core/page_pool.c
> > @@ -518,29 +518,29 @@ static bool page_pool_dma_map(struct page_pool *pool, netmem_ref netmem, gfp_t g
> > return false;
> > }
> >
> > -static struct page *__page_pool_alloc_page_order(struct page_pool *pool,
> > - gfp_t gfp)
> > +static netmem_ref __page_pool_alloc_page_order(struct page_pool *pool,
> > + gfp_t gfp)
> > {
> > - struct page *page;
> > + netmem_ref netmem;
> >
> > gfp |= __GFP_COMP;
> > - page = alloc_pages_node(pool->p.nid, gfp, pool->p.order);
> > - if (unlikely(!page))
> > - return NULL;
> > + netmem = alloc_netmems_node(pool->p.nid, gfp, pool->p.order);
> > + if (unlikely(!netmem))
> > + return 0;
> >
> > - if (pool->dma_map && unlikely(!page_pool_dma_map(pool, page_to_netmem(page), gfp))) {
> > - put_page(page);
> > - return NULL;
> > + if (pool->dma_map && unlikely(!page_pool_dma_map(pool, netmem, gfp))) {
> > + put_netmem(netmem);
>
> It's a bad idea to have {put,get}_netmem in page pool's code; it has
> different semantics from what page pool expects for net_iov. I.e.
> instead of releasing the netmem and allowing it to be reallocated by
> page pool, put_netmem(niov) will drop a memory provider reference and
> leak the net_iov. Depending on implementation it might even underflow
> mp refs if a net_iov is ever passed here.
>
Hmm, put_netmem (I hope) is designed and implemented to do the right
thing no matter what netmem you pass it (and it needs to, because we
can't predict what netmem will be passed to it):
- For non-pp pages, it drops a page ref.
- For pp pages, it drops a pp ref.
- For non-pp net_iovs (devmem TX), it drops a net_iov ref (which for
devmem net_iovs is a binding ref)
- For pp net_iovs, it drops a niov->pp ref (the same for both iouring
and devmem).
In my estimation it should be safe to use put_netmem here, but
I'm not opposed to reverting to put_page, since we're sure it's a
page in this call path anyway.
--
Thanks,
Mina
* Re: [RFC v4 16/18] netmem: introduce a netmem API, virt_to_head_netmem()
2025-06-04 2:52 ` [RFC v4 16/18] netmem: introduce a netmem API, virt_to_head_netmem() Byungchul Park
2025-06-04 16:59 ` Toke Høiland-Jørgensen
2025-06-05 10:45 ` Pavel Begunkov
@ 2025-06-05 19:43 ` Mina Almasry
2 siblings, 0 replies; 65+ messages in thread
From: Mina Almasry @ 2025-06-05 19:43 UTC (permalink / raw)
To: Byungchul Park
Cc: willy, netdev, linux-kernel, linux-mm, kernel_team, kuba,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola
On Tue, Jun 3, 2025 at 7:53 PM Byungchul Park <byungchul@sk.com> wrote:
>
> To eliminate the use of struct page in page pool, the page pool code
> should use the netmem descriptor and APIs instead.
>
> As part of the work, introduce a netmem API to convert a virtual address
> to a head netmem, allowing the code to use it rather than
> virt_to_head_page(), the existing API for struct page.
>
> Signed-off-by: Byungchul Park <byungchul@sk.com>
> ---
> include/net/netmem.h | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/include/net/netmem.h b/include/net/netmem.h
> index d4066fcb1fee..d84ab624b489 100644
> --- a/include/net/netmem.h
> +++ b/include/net/netmem.h
> @@ -265,6 +265,13 @@ static inline netmem_ref netmem_compound_head(netmem_ref netmem)
> return page_to_netmem(compound_head(netmem_to_page(netmem)));
> }
>
> +static inline netmem_ref virt_to_head_netmem(const void *x)
> +{
> + netmem_ref netmem = virt_to_netmem(x);
> +
> + return netmem_compound_head(netmem);
> +}
> +
I would squash this with the patch that first calls it, to shrink the
series, but anyway:
Reviewed-by: Mina Almasry <almasrymina@google.com>
--
Thanks,
Mina
* Re: [RFC v4 18/18] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp()
2025-06-05 11:49 ` Harry Yoo
2025-06-05 12:17 ` Harry Yoo
@ 2025-06-05 19:47 ` Mina Almasry
2025-06-05 20:16 ` Pavel Begunkov
1 sibling, 1 reply; 65+ messages in thread
From: Mina Almasry @ 2025-06-05 19:47 UTC (permalink / raw)
To: Harry Yoo
Cc: Pavel Begunkov, Byungchul Park, willy, netdev, linux-kernel,
linux-mm, kernel_team, kuba, ilias.apalodimas, hawk, akpm, davem,
john.fastabend, andrew+netdev, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola
On Thu, Jun 5, 2025 at 4:49 AM Harry Yoo <harry.yoo@oracle.com> wrote:
>
> On Thu, Jun 05, 2025 at 11:56:14AM +0100, Pavel Begunkov wrote:
> > On 6/4/25 03:52, Byungchul Park wrote:
> > > To simplify struct page, the effort to separate its own descriptor from
> > > struct page is required, and the work for page pool is ongoing.
> > >
> > > To achieve that, all the code should avoid directly accessing page pool
> > > members of struct page.
> >
> > Just to clarify, are we leaving the corresponding struct page fields
> > for now until the final memdesc conversion is done?
>
> Yes, that's correct.
>
> > If so, it might be better to leave the access in page_pool_page_is_pp()
> > to be "page->pp_magic", so that once removed the build fails until
> > the helper is fixed up to use the page->type or so.
>
> When we truly separate netmem from struct page, we won't have 'lru' field
> in memdesc (because not all types of memory are on LRU list),
> so NETMEM_DESC_ASSERT_OFFSET(lru, pp_magic) should fail.
>
> And then page_pool_page_is_pp() should be changed to check lower bits
> of memdesc pointer to identify its type.
>
Oh boy, I'm not sure that works. We already do LSB tricks with
netmem_ref to tell what kind of ref it is. I think the LSB pointer
tricks with netmem_ref and netmem_desc may trample each other's toes.
I guess we'll cross that bridge when we get to it...
--
Thanks,
Mina
* Re: [RFC v4 00/18] Split netmem from struct page
2025-06-04 3:23 ` [RFC v4 00/18] Split netmem from struct page Byungchul Park
@ 2025-06-05 19:55 ` Mina Almasry
2025-06-09 4:22 ` Byungchul Park
0 siblings, 1 reply; 65+ messages in thread
From: Mina Almasry @ 2025-06-05 19:55 UTC (permalink / raw)
To: Byungchul Park
Cc: willy, linux-kernel, linux-mm, kernel_team, kuba,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola, netdev
On Tue, Jun 3, 2025 at 8:23 PM Byungchul Park <byungchul@sk.com> wrote:
>
> On Wed, Jun 04, 2025 at 11:52:28AM +0900, Byungchul Park wrote:
> > The MM subsystem is trying to reduce struct page to a single pointer.
> > The first step towards that is splitting struct page by its individual
> > users, as has already been done with folio and slab. This patchset does
> > that for netmem which is used for page pools.
> >
> > Matthew Wilcox attempted the same work before stopping; see:
> >
> > https://lore.kernel.org/linux-mm/20230111042214.907030-1-willy@infradead.org/
> >
> > Mina Almasry has fortunately already done a lot of the prerequisite
> > work. I stacked my patches on top of his work, i.e. netmem.
> >
> > I focused on removing the page pool members in struct page this time,
> > not moving the allocation code of page pool from net to mm. It can be
> > done later if needed.
> >
> > The final patch removing the page pool fields will be submitted once
> > all the page-to-netmem conversion work is done:
> >
> > 1. conversion of libeth_fqe by Tony Nguyen.
> > 2. conversion of mlx5 by Tariq Toukan.
> > 3. conversion of prueth_swdata (on me).
> > 4. conversion of the freescale driver (on me).
> >
> > For discussion, I'm sharing below what the final patch looks like.
>
> To Willy and Mina,
>
> I believe this version might be the final version. Please check
> whether the direction matches what you intended, so we can proceed
> with confidence.
>
> As I mentioned above, the final patch should be submitted later once all
> the required driver work is done, but you can see what it looks
> like in the patch embedded in this cover letter.
>
We need this tested with at least one of devmem TCP and io_uring zc to
make sure the net_iov stuff isn't broken (I'll get to that when I have
time).
And we need page_pool benchmark numbers before/after this series;
please run those yourself if at all possible:
https://lore.kernel.org/netdev/20250525034354.258247-1-almasrymina@google.com/
This series adds a bunch of netmem/page casts. I expect them not to
affect fast-path perf, but making sure would be nice.
--
Thanks,
Mina
* Re: [RFC v4 18/18] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp()
2025-06-05 19:47 ` Mina Almasry
@ 2025-06-05 20:16 ` Pavel Begunkov
0 siblings, 0 replies; 65+ messages in thread
From: Pavel Begunkov @ 2025-06-05 20:16 UTC (permalink / raw)
To: Mina Almasry, Harry Yoo
Cc: Byungchul Park, willy, netdev, linux-kernel, linux-mm,
kernel_team, kuba, ilias.apalodimas, hawk, akpm, davem,
john.fastabend, andrew+netdev, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola
On 6/5/25 20:47, Mina Almasry wrote:
> On Thu, Jun 5, 2025 at 4:49 AM Harry Yoo <harry.yoo@oracle.com> wrote:
>>
>> On Thu, Jun 05, 2025 at 11:56:14AM +0100, Pavel Begunkov wrote:
>>> On 6/4/25 03:52, Byungchul Park wrote:
>>>> To simplify struct page, the effort to separate its own descriptor from
>>>> struct page is required, and the work for page pool is ongoing.
>>>>
>>>> To achieve that, all the code should avoid directly accessing page pool
>>>> members of struct page.
>>>
>>> Just to clarify, are we leaving the corresponding struct page fields
>>> for now until the final memdesc conversion is done?
>>
>> Yes, that's correct.
>>
>>> If so, it might be better to leave the access in page_pool_page_is_pp()
>>> to be "page->pp_magic", so that once removed the build fails until
>>> the helper is fixed up to use the page->type or so.
>>
>> When we truly separate netmem from struct page, we won't have 'lru' field
>> in memdesc (because not all types of memory are on LRU list),
>> so NETMEM_DESC_ASSERT_OFFSET(lru, pp_magic) should fail.
>>
>> And then page_pool_page_is_pp() should be changed to check lower bits
>> of memdesc pointer to identify its type.
>>
>
> Oh boy, I'm not sure that works. We already do LSB tricks with
> netmem_ref to tell what kind of ref it is. I think the LSB pointer
> tricks with netmem_ref and netmem_desc may trample each other's toes.
> I guess we'll cross that bridge when we get to it...
I believe Harry wants to tag struct page::memdesc, while
netmem is tagging the struct page pointer / net_iov.
--
Pavel Begunkov
* Re: [RFC v4 03/18] page_pool: use netmem alloc/put APIs in __page_pool_alloc_page_order()
2025-06-05 19:39 ` Mina Almasry
@ 2025-06-05 20:27 ` Pavel Begunkov
2025-06-05 20:34 ` Mina Almasry
0 siblings, 1 reply; 65+ messages in thread
From: Pavel Begunkov @ 2025-06-05 20:27 UTC (permalink / raw)
To: Mina Almasry
Cc: Byungchul Park, willy, netdev, linux-kernel, linux-mm,
kernel_team, kuba, ilias.apalodimas, harry.yoo, hawk, akpm, davem,
john.fastabend, andrew+netdev, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola
On 6/5/25 20:39, Mina Almasry wrote:
> On Thu, Jun 5, 2025 at 3:25 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>>
>> On 6/4/25 03:52, Byungchul Park wrote:
>>> Use netmem alloc/put APIs instead of page alloc/put APIs and make it
>>> return netmem_ref instead of struct page * in
>>> __page_pool_alloc_page_order().
>>>
>>> Signed-off-by: Byungchul Park <byungchul@sk.com>
>>> Reviewed-by: Mina Almasry <almasrymina@google.com>
>>> ---
>>> net/core/page_pool.c | 26 +++++++++++++-------------
>>> 1 file changed, 13 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
>>> index 4011eb305cee..523354f2db1c 100644
>>> --- a/net/core/page_pool.c
>>> +++ b/net/core/page_pool.c
>>> @@ -518,29 +518,29 @@ static bool page_pool_dma_map(struct page_pool *pool, netmem_ref netmem, gfp_t g
>>> return false;
>>> }
>>>
>>> -static struct page *__page_pool_alloc_page_order(struct page_pool *pool,
>>> - gfp_t gfp)
>>> +static netmem_ref __page_pool_alloc_page_order(struct page_pool *pool,
>>> + gfp_t gfp)
>>> {
>>> - struct page *page;
>>> + netmem_ref netmem;
>>>
>>> gfp |= __GFP_COMP;
>>> - page = alloc_pages_node(pool->p.nid, gfp, pool->p.order);
>>> - if (unlikely(!page))
>>> - return NULL;
>>> + netmem = alloc_netmems_node(pool->p.nid, gfp, pool->p.order);
>>> + if (unlikely(!netmem))
>>> + return 0;
>>>
>>> - if (pool->dma_map && unlikely(!page_pool_dma_map(pool, page_to_netmem(page), gfp))) {
>>> - put_page(page);
>>> - return NULL;
>>> + if (pool->dma_map && unlikely(!page_pool_dma_map(pool, netmem, gfp))) {
>>> + put_netmem(netmem);
>>
>> It's a bad idea to have {put,get}_netmem in page pool's code; it has
>> different semantics from what page pool expects for net_iov. I.e.
>> instead of releasing the netmem and allowing it to be reallocated by
>> page pool, put_netmem(niov) will drop a memory provider reference and
>> leak the net_iov. Depending on implementation it might even underflow
>> mp refs if a net_iov is ever passed here.
>>
>
> Hmm, put_netmem (I hope) is designed and implemented to do the right
> thing no matter what netmem you pass it (and it needs to, because we
> can't predict what netmem will be passed to it):
>
> - For non-pp pages, it drops a page ref.
> - For pp pages, it drops a pp ref.
> - For non-pp net_iovs (devmem TX), it drops a net_iov ref (which for
> devmem net_iovs is a binding ref)
> - For pp net_iovs, it drops a niov->pp ref (the same for both iouring
> and devmem).
void put_netmem(netmem_ref netmem)
{
struct net_iov *niov;
if (netmem_is_net_iov(netmem)) {
niov = netmem_to_net_iov(netmem);
if (net_is_devmem_iov(niov))
net_devmem_put_net_iov(netmem_to_net_iov(netmem));
return;
}
put_page(netmem_to_page(netmem));
}
EXPORT_SYMBOL(put_netmem);
void net_devmem_put_net_iov(struct net_iov *niov)
{
net_devmem_dmabuf_binding_put(net_devmem_iov_binding(niov));
}
Am I looking at an outdated version? For devmem net_iovs it always puts
the binding ref and not the niov ref, and it always does put_page for
pages. It'd also silently ignore io_uring. And we're also patching early
alloc/init failures in this series, so gauging whether a struct page is
pp- or non-pp-originated might be dangerous and depend on init order. We
don't even need to think about all that if we continue to use put_page,
which is why I think it's a much better option.
> In my estimation it should be safe to use put_netmem here, but
> I'm not opposed to reverting to put_page, since we're sure it's a
> page in this call path anyway.
>
--
Pavel Begunkov
* Re: [RFC v4 03/18] page_pool: use netmem alloc/put APIs in __page_pool_alloc_page_order()
2025-06-05 20:27 ` Pavel Begunkov
@ 2025-06-05 20:34 ` Mina Almasry
0 siblings, 0 replies; 65+ messages in thread
From: Mina Almasry @ 2025-06-05 20:34 UTC (permalink / raw)
To: Pavel Begunkov
Cc: Byungchul Park, willy, netdev, linux-kernel, linux-mm,
kernel_team, kuba, ilias.apalodimas, harry.yoo, hawk, akpm, davem,
john.fastabend, andrew+netdev, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola
On Thu, Jun 5, 2025 at 1:26 PM Pavel Begunkov <asml.silence@gmail.com> wrote:
>
> On 6/5/25 20:39, Mina Almasry wrote:
> > On Thu, Jun 5, 2025 at 3:25 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
> >>
> >> On 6/4/25 03:52, Byungchul Park wrote:
> >>> Use netmem alloc/put APIs instead of page alloc/put APIs and make it
> >>> return netmem_ref instead of struct page * in
> >>> __page_pool_alloc_page_order().
> >>>
> >>> Signed-off-by: Byungchul Park <byungchul@sk.com>
> >>> Reviewed-by: Mina Almasry <almasrymina@google.com>
> >>> ---
> >>> net/core/page_pool.c | 26 +++++++++++++-------------
> >>> 1 file changed, 13 insertions(+), 13 deletions(-)
> >>>
> >>> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> >>> index 4011eb305cee..523354f2db1c 100644
> >>> --- a/net/core/page_pool.c
> >>> +++ b/net/core/page_pool.c
> >>> @@ -518,29 +518,29 @@ static bool page_pool_dma_map(struct page_pool *pool, netmem_ref netmem, gfp_t g
> >>> return false;
> >>> }
> >>>
> >>> -static struct page *__page_pool_alloc_page_order(struct page_pool *pool,
> >>> - gfp_t gfp)
> >>> +static netmem_ref __page_pool_alloc_page_order(struct page_pool *pool,
> >>> + gfp_t gfp)
> >>> {
> >>> - struct page *page;
> >>> + netmem_ref netmem;
> >>>
> >>> gfp |= __GFP_COMP;
> >>> - page = alloc_pages_node(pool->p.nid, gfp, pool->p.order);
> >>> - if (unlikely(!page))
> >>> - return NULL;
> >>> + netmem = alloc_netmems_node(pool->p.nid, gfp, pool->p.order);
> >>> + if (unlikely(!netmem))
> >>> + return 0;
> >>>
> >>> - if (pool->dma_map && unlikely(!page_pool_dma_map(pool, page_to_netmem(page), gfp))) {
> >>> - put_page(page);
> >>> - return NULL;
> >>> + if (pool->dma_map && unlikely(!page_pool_dma_map(pool, netmem, gfp))) {
> >>> + put_netmem(netmem);
> >>
> >> It's a bad idea to have {put,get}_netmem in page pool's code; it has
> >> different semantics from what page pool expects for net_iov. I.e.
> >> instead of releasing the netmem and allowing it to be reallocated by
> >> page pool, put_netmem(niov) will drop a memory provider reference and
> >> leak the net_iov. Depending on implementation it might even underflow
> >> mp refs if a net_iov is ever passed here.
> >>
> >
> > Hmm, put_netmem (I hope) is designed and implemented to do the right
> > thing no matter what netmem you pass it (and it needs to, because we
> > can't predict what netmem will be passed to it):
> >
> > - For non-pp pages, it drops a page ref.
> > - For pp pages, it drops a pp ref.
> > - For non-pp net_iovs (devmem TX), it drops a net_iov ref (which for
> > devmem net_iovs is a binding ref)
> > - For pp net_iovs, it drops a niov->pp ref (the same for both iouring
> > and devmem).
>
> void put_netmem(netmem_ref netmem)
> {
> struct net_iov *niov;
>
> if (netmem_is_net_iov(netmem)) {
> niov = netmem_to_net_iov(netmem);
> if (net_is_devmem_iov(niov))
> net_devmem_put_net_iov(netmem_to_net_iov(netmem));
> return;
> }
>
> put_page(netmem_to_page(netmem));
> }
> EXPORT_SYMBOL(put_netmem);
>
> void net_devmem_put_net_iov(struct net_iov *niov)
> {
> net_devmem_dmabuf_binding_put(net_devmem_iov_binding(niov));
> }
>
> Am I looking at an outdated version? For devmem net_iovs it always puts
> the binding ref and not the niov ref, and it always does put_page for
> pages. It'd also silently ignore io_uring. And we're also patching early
> alloc/init failures in this series, so gauging whether a struct page is
> pp- or non-pp-originated might be dangerous and depend on init order. We
> don't even need to think about all that if we continue to use put_page,
> which is why I think it's a much better option.
>
Oh, my bad. I was thinking of skb_page_unref, which actually handles
all net_iov/page types correctly. You're right, put_netmem doesn't
actually do that.
In that case reverting to put_page would be better here indeed.
--
Thanks,
Mina
* Re: [RFC v4 00/18] Split netmem from struct page
2025-06-05 19:55 ` Mina Almasry
@ 2025-06-09 4:22 ` Byungchul Park
2025-06-09 7:53 ` Byungchul Park
0 siblings, 1 reply; 65+ messages in thread
From: Byungchul Park @ 2025-06-09 4:22 UTC (permalink / raw)
To: Mina Almasry
Cc: willy, linux-kernel, linux-mm, kernel_team, kuba,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola, netdev
On Thu, Jun 05, 2025 at 12:55:30PM -0700, Mina Almasry wrote:
> On Tue, Jun 3, 2025 at 8:23 PM Byungchul Park <byungchul@sk.com> wrote:
> >
> > On Wed, Jun 04, 2025 at 11:52:28AM +0900, Byungchul Park wrote:
> > > The MM subsystem is trying to reduce struct page to a single pointer.
> > > The first step towards that is splitting struct page by its individual
> > > users, as has already been done with folio and slab. This patchset does
> > > that for netmem which is used for page pools.
> > >
> > > Matthew Wilcox attempted the same work before stopping; see:
> > >
> > > https://lore.kernel.org/linux-mm/20230111042214.907030-1-willy@infradead.org/
> > >
> > > Mina Almasry has fortunately already done a lot of the prerequisite
> > > work. I stacked my patches on top of his work, i.e. netmem.
> > >
> > > I focused on removing the page pool members in struct page this time,
> > > not moving the allocation code of page pool from net to mm. It can be
> > > done later if needed.
> > >
> > > The final patch removing the page pool fields will be submitted once
> > > all the page-to-netmem conversion work is done:
> > >
> > > 1. conversion of libeth_fqe by Tony Nguyen.
> > > 2. conversion of mlx5 by Tariq Toukan.
> > > 3. conversion of prueth_swdata (on me).
> > > 4. conversion of the freescale driver (on me).
> > >
> > > For discussion, I'm sharing below what the final patch looks like.
> >
> > To Willy and Mina,
> >
> > I believe this version might be the final version. Please check
> > whether the direction matches what you intended, so we can proceed
> > with confidence.
> >
> > As I mentioned above, the final patch should be submitted later once all
> > the required driver work is done, but you can see what it looks
> > like in the patch embedded in this cover letter.
> >
>
> We need this tested with at least one of devmem TCP and io_uring zc to
> make sure the net_iov stuff isn't broken (I'll get to that when I have
> time).
>
> And we need page_pool benchmark numbers before/after this series;
> please run those yourself if at all possible:
I'm trying, but it keeps hitting conflicts at several steps. Please
share more detailed instructions.
Byungchul
> https://lore.kernel.org/netdev/20250525034354.258247-1-almasrymina@google.com/
>
> This series adds a bunch of netmem/page casts. I expect them not to
> affect fast-path perf, but making sure would be nice.
>
> --
> Thanks,
> Mina
* Re: [RFC v4 00/18] Split netmem from struct page
2025-06-09 4:22 ` Byungchul Park
@ 2025-06-09 7:53 ` Byungchul Park
0 siblings, 0 replies; 65+ messages in thread
From: Byungchul Park @ 2025-06-09 7:53 UTC (permalink / raw)
To: Mina Almasry
Cc: willy, linux-kernel, linux-mm, kernel_team, kuba,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola, netdev
On Mon, Jun 09, 2025 at 01:22:55PM +0900, Byungchul Park wrote:
> On Thu, Jun 05, 2025 at 12:55:30PM -0700, Mina Almasry wrote:
> > On Tue, Jun 3, 2025 at 8:23 PM Byungchul Park <byungchul@sk.com> wrote:
> > >
> > > On Wed, Jun 04, 2025 at 11:52:28AM +0900, Byungchul Park wrote:
> > > > The MM subsystem is trying to reduce struct page to a single pointer.
> > > > The first step towards that is splitting struct page by its individual
> > > > users, as has already been done with folio and slab. This patchset does
> > > > that for netmem which is used for page pools.
> > > >
> > > > Matthew Wilcox attempted the same work before stopping; see:
> > > >
> > > > https://lore.kernel.org/linux-mm/20230111042214.907030-1-willy@infradead.org/
> > > >
> > > > Mina Almasry has fortunately already done a lot of the prerequisite
> > > > work. I stacked my patches on top of his work, i.e. netmem.
> > > >
> > > > I focused on removing the page pool members in struct page this time,
> > > > not moving the allocation code of page pool from net to mm. It can be
> > > > done later if needed.
> > > >
> > > > The final patch removing the page pool fields will be submitted once
> > > > all the page-to-netmem conversion work is done:
> > > >
> > > > 1. conversion of libeth_fqe by Tony Nguyen.
> > > > 2. conversion of mlx5 by Tariq Toukan.
> > > > 3. conversion of prueth_swdata (on me).
> > > > 4. conversion of the freescale driver (on me).
> > > >
> > > > For discussion, I'm sharing below what the final patch looks like.
> > >
> > > To Willy and Mina,
> > >
> > > I believe this version might be the final version. Please check
> > > whether the direction matches what you intended, so we can proceed
> > > with confidence.
> > >
> > > As I mentioned above, the final patch should be submitted later once all
> > > the required driver work is done, but you can see what it looks
> > > like in the patch embedded in this cover letter.
> > >
> >
> > We need this tested with at least one of devmem TCP and io_uring zc to
> > make sure the net_iov stuff isn't broken (I'll get to that when I have
> > time).
> >
> > And we need page_pool benchmark numbers before/after this series;
> > please run those yourself if at all possible:
>
> I'm trying, but it keeps hitting conflicts at several steps. Please
> share more detailed instructions.
>
> Byungchul
>
> > https://lore.kernel.org/netdev/20250525034354.258247-1-almasrymina@google.com/
I will try this guide again with some adjustments. Thanks anyway.
Byungchul
> >
> > This series adds a bunch of netmem/page casts. I expect them not to
> > affect fast-path perf, but making sure would be nice.
> >
> > --
> > Thanks,
> > Mina
end of thread, other threads:[~2025-06-09 7:53 UTC | newest]
Thread overview: 65+ messages -- links below jump to the message on this page --
2025-06-04 2:52 [RFC v4 00/18] Split netmem from struct page Byungchul Park
2025-06-04 2:52 ` [RFC v4 01/18] netmem: introduce struct netmem_desc mirroring " Byungchul Park
2025-06-04 16:53 ` Toke Høiland-Jørgensen
2025-06-05 10:03 ` Pavel Begunkov
2025-06-05 10:04 ` Pavel Begunkov
2025-06-05 19:34 ` Mina Almasry
2025-06-04 2:52 ` [RFC v4 02/18] netmem: introduce netmem alloc APIs to wrap page alloc APIs Byungchul Park
2025-06-04 15:14 ` Suren Baghdasaryan
2025-06-05 0:53 ` Byungchul Park
2025-06-05 10:05 ` Pavel Begunkov
2025-06-04 2:52 ` [RFC v4 03/18] page_pool: use netmem alloc/put APIs in __page_pool_alloc_page_order() Byungchul Park
2025-06-04 16:54 ` Toke Høiland-Jørgensen
2025-06-05 10:26 ` Pavel Begunkov
2025-06-05 19:39 ` Mina Almasry
2025-06-05 20:27 ` Pavel Begunkov
2025-06-05 20:34 ` Mina Almasry
2025-06-04 2:52 ` [RFC v4 04/18] page_pool: rename __page_pool_alloc_page_order() to __page_pool_alloc_netmem_order() Byungchul Park
2025-06-04 16:54 ` Toke Høiland-Jørgensen
2025-06-05 10:28 ` Pavel Begunkov
2025-06-04 2:52 ` [RFC v4 05/18] page_pool: use netmem alloc/put APIs in __page_pool_alloc_pages_slow() Byungchul Park
2025-06-04 17:02 ` Toke Høiland-Jørgensen
2025-06-05 10:30 ` Pavel Begunkov
2025-06-04 2:52 ` [RFC v4 06/18] page_pool: rename page_pool_return_page() to page_pool_return_netmem() Byungchul Park
2025-06-04 17:03 ` Toke Høiland-Jørgensen
2025-06-05 10:31 ` Pavel Begunkov
2025-06-04 2:52 ` [RFC v4 07/18] page_pool: use netmem put API in page_pool_return_netmem() Byungchul Park
2025-06-04 16:54 ` Toke Høiland-Jørgensen
2025-06-05 10:33 ` Pavel Begunkov
2025-06-04 2:52 ` [RFC v4 08/18] page_pool: rename __page_pool_release_page_dma() to __page_pool_release_netmem_dma() Byungchul Park
2025-06-04 16:55 ` Toke Høiland-Jørgensen
2025-06-05 10:34 ` Pavel Begunkov
2025-06-04 2:52 ` [RFC v4 09/18] page_pool: rename __page_pool_put_page() to __page_pool_put_netmem() Byungchul Park
2025-06-04 16:55 ` Toke Høiland-Jørgensen
2025-06-05 10:35 ` Pavel Begunkov
2025-06-05 10:39 ` Pavel Begunkov
2025-06-04 2:52 ` [RFC v4 10/18] page_pool: rename __page_pool_alloc_pages_slow() to __page_pool_alloc_netmems_slow() Byungchul Park
2025-06-04 17:03 ` Toke Høiland-Jørgensen
2025-06-04 2:52 ` [RFC v4 11/18] mlx4: use netmem descriptor and APIs for page pool Byungchul Park
2025-06-04 2:52 ` [RFC v4 12/18] netmem: use _Generic to cover const casting for page_to_netmem() Byungchul Park
2025-06-04 16:55 ` Toke Høiland-Jørgensen
2025-06-05 10:40 ` Pavel Begunkov
2025-06-04 2:52 ` [RFC v4 13/18] netmem: remove __netmem_get_pp() Byungchul Park
2025-06-04 16:56 ` Toke Høiland-Jørgensen
2025-06-05 10:41 ` Pavel Begunkov
2025-06-04 2:52 ` [RFC v4 14/18] page_pool: make page_pool_get_dma_addr() just wrap page_pool_get_dma_addr_netmem() Byungchul Park
2025-06-04 16:57 ` Toke Høiland-Jørgensen
2025-06-05 10:45 ` Pavel Begunkov
2025-06-04 2:52 ` [RFC v4 15/18] netdevsim: use netmem descriptor and APIs for page pool Byungchul Park
2025-06-04 2:52 ` [RFC v4 16/18] netmem: introduce a netmem API, virt_to_head_netmem() Byungchul Park
2025-06-04 16:59 ` Toke Høiland-Jørgensen
2025-06-05 10:45 ` Pavel Begunkov
2025-06-05 19:43 ` Mina Almasry
2025-06-04 2:52 ` [RFC v4 17/18] mt76: use netmem descriptor and APIs for page pool Byungchul Park
2025-06-04 2:52 ` [RFC v4 18/18] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp() Byungchul Park
2025-06-04 16:59 ` Toke Høiland-Jørgensen
2025-06-05 10:56 ` Pavel Begunkov
2025-06-05 11:49 ` Harry Yoo
2025-06-05 12:17 ` Harry Yoo
2025-06-05 13:28 ` Pavel Begunkov
2025-06-05 19:47 ` Mina Almasry
2025-06-05 20:16 ` Pavel Begunkov
2025-06-04 3:23 ` [RFC v4 00/18] Split netmem from struct page Byungchul Park
2025-06-05 19:55 ` Mina Almasry
2025-06-09 4:22 ` Byungchul Park
2025-06-09 7:53 ` Byungchul Park