* [PATCH net-next v6 1/9] netmem: introduce struct netmem_desc mirroring struct page
2025-06-20 4:12 [PATCH net-next v6 0/9] Split netmem from struct page Byungchul Park
@ 2025-06-20 4:12 ` Byungchul Park
2025-06-23 9:32 ` David Hildenbrand
2025-06-20 4:12 ` [PATCH net-next v6 2/9] page_pool: rename page_pool_return_page() to page_pool_return_netmem() Byungchul Park
` (7 subsequent siblings)
8 siblings, 1 reply; 35+ messages in thread
From: Byungchul Park @ 2025-06-20 4:12 UTC (permalink / raw)
To: willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola, hannes, ziy, jackmanb
To simplify struct page, the page pool members of struct page should be
moved to other, allowing these members to be removed from struct page.
Introduce a network memory descriptor to store the members, struct
netmem_desc, and make it union'ed with the existing fields in struct
net_iov, allowing to organize the fields of struct net_iov.
Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Harry Yoo <harry.yoo@oracle.com>
---
include/net/netmem.h | 94 ++++++++++++++++++++++++++++++++++----------
1 file changed, 73 insertions(+), 21 deletions(-)
diff --git a/include/net/netmem.h b/include/net/netmem.h
index 7a1dafa3f080..e0aa67aca9bb 100644
--- a/include/net/netmem.h
+++ b/include/net/netmem.h
@@ -12,6 +12,50 @@
#include <linux/mm.h>
#include <net/net_debug.h>
+/* These fields in struct page are used by the page_pool and net stack:
+ *
+ * struct {
+ * unsigned long pp_magic;
+ * struct page_pool *pp;
+ * unsigned long _pp_mapping_pad;
+ * unsigned long dma_addr;
+ * atomic_long_t pp_ref_count;
+ * };
+ *
+ * We mirror the page_pool fields here so the page_pool can access these
+ * fields without worrying whether the underlying fields belong to a
+ * page or netmem_desc.
+ *
+ * CAUTION: Do not update the fields in netmem_desc without also
+ * updating the anonymous aliasing union in struct net_iov.
+ */
+struct netmem_desc {
+ unsigned long _flags;
+ unsigned long pp_magic;
+ struct page_pool *pp;
+ unsigned long _pp_mapping_pad;
+ unsigned long dma_addr;
+ atomic_long_t pp_ref_count;
+};
+
+#define NETMEM_DESC_ASSERT_OFFSET(pg, desc) \
+ static_assert(offsetof(struct page, pg) == \
+ offsetof(struct netmem_desc, desc))
+NETMEM_DESC_ASSERT_OFFSET(flags, _flags);
+NETMEM_DESC_ASSERT_OFFSET(pp_magic, pp_magic);
+NETMEM_DESC_ASSERT_OFFSET(pp, pp);
+NETMEM_DESC_ASSERT_OFFSET(_pp_mapping_pad, _pp_mapping_pad);
+NETMEM_DESC_ASSERT_OFFSET(dma_addr, dma_addr);
+NETMEM_DESC_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
+#undef NETMEM_DESC_ASSERT_OFFSET
+
+/*
+ * Since struct netmem_desc uses the space in struct page, the size
+ * should be checked, until struct netmem_desc has its own instance from
+ * slab, to avoid conflicting with other members within struct page.
+ */
+static_assert(sizeof(struct netmem_desc) <= offsetof(struct page, _refcount));
+
/* net_iov */
DECLARE_STATIC_KEY_FALSE(page_pool_mem_providers);
@@ -31,12 +75,25 @@ enum net_iov_type {
};
struct net_iov {
- enum net_iov_type type;
- unsigned long pp_magic;
- struct page_pool *pp;
+ union {
+ struct netmem_desc desc;
+
+ /* XXX: The following part should be removed once all
+ * the references to them are converted so as to be
+ * accessed via netmem_desc e.g. niov->desc.pp instead
+ * of niov->pp.
+ */
+ struct {
+ unsigned long _flags;
+ unsigned long pp_magic;
+ struct page_pool *pp;
+ unsigned long _pp_mapping_pad;
+ unsigned long dma_addr;
+ atomic_long_t pp_ref_count;
+ };
+ };
struct net_iov_area *owner;
- unsigned long dma_addr;
- atomic_long_t pp_ref_count;
+ enum net_iov_type type;
};
struct net_iov_area {
@@ -48,27 +105,22 @@ struct net_iov_area {
unsigned long base_virtual;
};
-/* These fields in struct page are used by the page_pool and net stack:
+/* net_iov is union'ed with struct netmem_desc mirroring struct page, so
+ * the page_pool can access these fields without worrying whether the
+ * underlying fields are accessed via netmem_desc or directly via
+ * net_iov, until all the references to them are converted so as to be
+ * accessed via netmem_desc e.g. niov->desc.pp instead of niov->pp.
*
- * struct {
- * unsigned long pp_magic;
- * struct page_pool *pp;
- * unsigned long _pp_mapping_pad;
- * unsigned long dma_addr;
- * atomic_long_t pp_ref_count;
- * };
- *
- * We mirror the page_pool fields here so the page_pool can access these fields
- * without worrying whether the underlying fields belong to a page or net_iov.
- *
- * The non-net stack fields of struct page are private to the mm stack and must
- * never be mirrored to net_iov.
+ * The non-net stack fields of struct page are private to the mm stack
+ * and must never be mirrored to net_iov.
*/
-#define NET_IOV_ASSERT_OFFSET(pg, iov) \
- static_assert(offsetof(struct page, pg) == \
+#define NET_IOV_ASSERT_OFFSET(desc, iov) \
+ static_assert(offsetof(struct netmem_desc, desc) == \
offsetof(struct net_iov, iov))
+NET_IOV_ASSERT_OFFSET(_flags, _flags);
NET_IOV_ASSERT_OFFSET(pp_magic, pp_magic);
NET_IOV_ASSERT_OFFSET(pp, pp);
+NET_IOV_ASSERT_OFFSET(_pp_mapping_pad, _pp_mapping_pad);
NET_IOV_ASSERT_OFFSET(dma_addr, dma_addr);
NET_IOV_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
#undef NET_IOV_ASSERT_OFFSET
--
2.17.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH net-next v6 1/9] netmem: introduce struct netmem_desc mirroring struct page
2025-06-20 4:12 ` [PATCH net-next v6 1/9] netmem: introduce struct netmem_desc mirroring " Byungchul Park
@ 2025-06-23 9:32 ` David Hildenbrand
2025-06-23 10:28 ` Byungchul Park
0 siblings, 1 reply; 35+ messages in thread
From: David Hildenbrand @ 2025-06-23 9:32 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, lorenzo.stoakes, Liam.Howlett, vbabka,
rppt, surenb, mhocko, horms, linux-rdma, bpf, vishal.moola,
hannes, ziy, jackmanb
On 20.06.25 06:12, Byungchul Park wrote:
> To simplify struct page, the page pool members of struct page should be
> moved to other, allowing these members to be removed from struct page.
>
> Introduce a network memory descriptor to store the members, struct
> netmem_desc, and make it union'ed with the existing fields in struct
> net_iov, allowing to organize the fields of struct net_iov.
It would be great adding some result from the previous discussions in
here, such as that the layout of "struct net_iov" can be changed because
it is not a "struct page" overlay, what the next steps based on this
patch are etc.
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH net-next v6 1/9] netmem: introduce struct netmem_desc mirroring struct page
2025-06-23 9:32 ` David Hildenbrand
@ 2025-06-23 10:28 ` Byungchul Park
2025-06-23 10:38 ` David Hildenbrand
2025-06-23 12:18 ` Harry Yoo
0 siblings, 2 replies; 35+ messages in thread
From: Byungchul Park @ 2025-06-23 10:28 UTC (permalink / raw)
To: David Hildenbrand
Cc: willy, netdev, linux-kernel, linux-mm, kernel_team, kuba,
almasrymina, ilias.apalodimas, harry.yoo, hawk, akpm, davem,
john.fastabend, andrew+netdev, asml.silence, toke, tariqt,
edumazet, pabeni, saeedm, leon, ast, daniel, lorenzo.stoakes,
Liam.Howlett, vbabka, rppt, surenb, mhocko, horms, linux-rdma,
bpf, vishal.moola, hannes, ziy, jackmanb
On Mon, Jun 23, 2025 at 11:32:16AM +0200, David Hildenbrand wrote:
> On 20.06.25 06:12, Byungchul Park wrote:
> > To simplify struct page, the page pool members of struct page should be
> > moved to other, allowing these members to be removed from struct page.
> >
> > Introduce a network memory descriptor to store the members, struct
> > netmem_desc, and make it union'ed with the existing fields in struct
> > net_iov, allowing to organize the fields of struct net_iov.
>
> It would be great adding some result from the previous discussions in
> here, such as that the layout of "struct net_iov" can be changed because
> it is not a "struct page" overlay, what the next steps based on this
I think the network folks already know how to use and interpret their
data struct, struct net_iov for sure.. but I will add the comment if it
you think is needed. Thanks for the comment.
Byungchul
> patch are etc.
>
> --
> Cheers,
>
> David / dhildenb
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH net-next v6 1/9] netmem: introduce struct netmem_desc mirroring struct page
2025-06-23 10:28 ` Byungchul Park
@ 2025-06-23 10:38 ` David Hildenbrand
2025-06-23 12:18 ` Harry Yoo
1 sibling, 0 replies; 35+ messages in thread
From: David Hildenbrand @ 2025-06-23 10:38 UTC (permalink / raw)
To: Byungchul Park
Cc: willy, netdev, linux-kernel, linux-mm, kernel_team, kuba,
almasrymina, ilias.apalodimas, harry.yoo, hawk, akpm, davem,
john.fastabend, andrew+netdev, asml.silence, toke, tariqt,
edumazet, pabeni, saeedm, leon, ast, daniel, lorenzo.stoakes,
Liam.Howlett, vbabka, rppt, surenb, mhocko, horms, linux-rdma,
bpf, vishal.moola, hannes, ziy, jackmanb
On 23.06.25 12:28, Byungchul Park wrote:
> On Mon, Jun 23, 2025 at 11:32:16AM +0200, David Hildenbrand wrote:
>> On 20.06.25 06:12, Byungchul Park wrote:
>>> To simplify struct page, the page pool members of struct page should be
>>> moved to other, allowing these members to be removed from struct page.
>>>
>>> Introduce a network memory descriptor to store the members, struct
>>> netmem_desc, and make it union'ed with the existing fields in struct
>>> net_iov, allowing to organize the fields of struct net_iov.
>>
>> It would be great adding some result from the previous discussions in
>> here, such as that the layout of "struct net_iov" can be changed because
>> it is not a "struct page" overlay, what the next steps based on this
>
> I think the network folks already know how to use and interpret their
> data struct, struct net_iov for sure.. but I will add the comment if it
> you think is needed. Thanks for the comment.
Well, you CC MM folks like me ... :)
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH net-next v6 1/9] netmem: introduce struct netmem_desc mirroring struct page
2025-06-23 10:28 ` Byungchul Park
2025-06-23 10:38 ` David Hildenbrand
@ 2025-06-23 12:18 ` Harry Yoo
2025-06-23 19:09 ` Mina Almasry
1 sibling, 1 reply; 35+ messages in thread
From: Harry Yoo @ 2025-06-23 12:18 UTC (permalink / raw)
To: Byungchul Park
Cc: David Hildenbrand, willy, netdev, linux-kernel, linux-mm,
kernel_team, kuba, almasrymina, ilias.apalodimas, hawk, akpm,
davem, john.fastabend, andrew+netdev, asml.silence, toke, tariqt,
edumazet, pabeni, saeedm, leon, ast, daniel, lorenzo.stoakes,
Liam.Howlett, vbabka, rppt, surenb, mhocko, horms, linux-rdma,
bpf, vishal.moola, hannes, ziy, jackmanb
On Mon, Jun 23, 2025 at 07:28:21PM +0900, Byungchul Park wrote:
> On Mon, Jun 23, 2025 at 11:32:16AM +0200, David Hildenbrand wrote:
> > On 20.06.25 06:12, Byungchul Park wrote:
> > > To simplify struct page, the page pool members of struct page should be
> > > moved to other, allowing these members to be removed from struct page.
> > >
> > > Introduce a network memory descriptor to store the members, struct
> > > netmem_desc, and make it union'ed with the existing fields in struct
> > > net_iov, allowing to organize the fields of struct net_iov.
> >
> > It would be great adding some result from the previous discussions in
> > here, such as that the layout of "struct net_iov" can be changed because
> > it is not a "struct page" overlay, what the next steps based on this
>
> I think the network folks already know how to use and interpret their
> data struct, struct net_iov for sure.. but I will add the comment if it
> you think is needed. Thanks for the comment.
I agree with David - it's not immediately obvious at first glance.
That was my feedback on the previous version as well :)
I think it'd be great to add that explanation, since this is where MM and
networking intersect.
> Byungchul
>
> > patch are etc.
> >
> > --
> > Cheers,
> >
> > David / dhildenb
--
Cheers,
Harry / Hyeonggon
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH net-next v6 1/9] netmem: introduce struct netmem_desc mirroring struct page
2025-06-23 12:18 ` Harry Yoo
@ 2025-06-23 19:09 ` Mina Almasry
2025-06-23 19:28 ` Pavel Begunkov
2025-06-24 1:17 ` Byungchul Park
0 siblings, 2 replies; 35+ messages in thread
From: Mina Almasry @ 2025-06-23 19:09 UTC (permalink / raw)
To: Harry Yoo
Cc: Byungchul Park, David Hildenbrand, willy, netdev, linux-kernel,
linux-mm, kernel_team, kuba, ilias.apalodimas, hawk, akpm, davem,
john.fastabend, andrew+netdev, asml.silence, toke, tariqt,
edumazet, pabeni, saeedm, leon, ast, daniel, lorenzo.stoakes,
Liam.Howlett, vbabka, rppt, surenb, mhocko, horms, linux-rdma,
bpf, vishal.moola, hannes, ziy, jackmanb
On Mon, Jun 23, 2025 at 5:18 AM Harry Yoo <harry.yoo@oracle.com> wrote:
>
> On Mon, Jun 23, 2025 at 07:28:21PM +0900, Byungchul Park wrote:
> > On Mon, Jun 23, 2025 at 11:32:16AM +0200, David Hildenbrand wrote:
> > > On 20.06.25 06:12, Byungchul Park wrote:
> > > > To simplify struct page, the page pool members of struct page should be
> > > > moved to other, allowing these members to be removed from struct page.
> > > >
> > > > Introduce a network memory descriptor to store the members, struct
> > > > netmem_desc, and make it union'ed with the existing fields in struct
> > > > net_iov, allowing to organize the fields of struct net_iov.
> > >
> > > It would be great adding some result from the previous discussions in
> > > here, such as that the layout of "struct net_iov" can be changed because
> > > it is not a "struct page" overlay, what the next steps based on this
> >
> > I think the network folks already know how to use and interpret their
> > data struct, struct net_iov for sure.. but I will add the comment if it
> > you think is needed. Thanks for the comment.
>
> I agree with David - it's not immediately obvious at first glance.
> That was my feedback on the previous version as well :)
>
> I think it'd be great to add that explanation, since this is where MM and
> networking intersect.
>
I think a lot of people are now saying the same thing: (1) lets keep
net_iov and page/netmem_desc separate, and (2) lets add comments
explaining their relation so this intersection between MM and
networking is not confused in the long term .
For #1, concretely I would recommend removing the union inside struct
net_iov? And also revert all the changes to net_iov for that matter.
They are all to bring netmem_desc and net_iov closer together, but the
feedback is that we should keep them separate, and I kinda agree with
that. The fact that net_iov includes a netmem_desc in your patch makes
readers think they're very closely related.
For #2, add this comment (roughly) on top of struct net_iov? Untested
with kdoc and spell checker:
diff --git a/include/net/netmem.h b/include/net/netmem.h
index 7a1dafa3f080..8fb2b294e5f2 100644
--- a/include/net/netmem.h
+++ b/include/net/netmem.h
@@ -30,6 +30,25 @@ enum net_iov_type {
NET_IOV_MAX = ULONG_MAX
};
+/* A memory descriptor representing abstract networking I/O vectors.
+ *
+ * net_iovs are allocated by networking code, and generally represent some
+ * abstract form of non-paged memory used by the networking stack. The size
+ * of the chunk is PAGE_SIZE.
+ *
+ * This memory can be any form of non-struct paged memory. Examples include
+ * imported dmabuf memory and imported io_uring memory. See
net_iov_type for all
+ * the supported types.
+ *
+ * @type: the type of the memory. Different types of net_iovs are supported.
+ * @pp_magic: pp field, similar to the one in struct page/struct netmem_desc.
+ * @pp: the pp this net_iov belongs to, if any.
+ * @owner: the net_iov_area this net_iov belongs to, if any.
+ * @dma_addr: the dma addrs of the net_iov. Needed for the network card to
+ * send/receive this net_iov.
+ * @pp_ref_count: the pp ref count of this net_iov, exactly the same usage as
+ * struct page/struct netmem_desc.
+ */
--
Thanks,
Mina
^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH net-next v6 1/9] netmem: introduce struct netmem_desc mirroring struct page
2025-06-23 19:09 ` Mina Almasry
@ 2025-06-23 19:28 ` Pavel Begunkov
2025-06-24 1:17 ` Byungchul Park
1 sibling, 0 replies; 35+ messages in thread
From: Pavel Begunkov @ 2025-06-23 19:28 UTC (permalink / raw)
To: Mina Almasry, Harry Yoo
Cc: Byungchul Park, David Hildenbrand, willy, netdev, linux-kernel,
linux-mm, kernel_team, kuba, ilias.apalodimas, hawk, akpm, davem,
john.fastabend, andrew+netdev, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, lorenzo.stoakes, Liam.Howlett, vbabka,
rppt, surenb, mhocko, horms, linux-rdma, bpf, vishal.moola,
hannes, ziy, jackmanb
On 6/23/25 20:09, Mina Almasry wrote:
> On Mon, Jun 23, 2025 at 5:18 AM Harry Yoo <harry.yoo@oracle.com> wrote:
>>
>> On Mon, Jun 23, 2025 at 07:28:21PM +0900, Byungchul Park wrote:
>>> On Mon, Jun 23, 2025 at 11:32:16AM +0200, David Hildenbrand wrote:
>>>> On 20.06.25 06:12, Byungchul Park wrote:
>>>>> To simplify struct page, the page pool members of struct page should be
>>>>> moved to other, allowing these members to be removed from struct page.
>>>>>
>>>>> Introduce a network memory descriptor to store the members, struct
>>>>> netmem_desc, and make it union'ed with the existing fields in struct
>>>>> net_iov, allowing to organize the fields of struct net_iov.
>>>>
>>>> It would be great adding some result from the previous discussions in
>>>> here, such as that the layout of "struct net_iov" can be changed because
>>>> it is not a "struct page" overlay, what the next steps based on this
>>>
>>> I think the network folks already know how to use and interpret their
>>> data struct, struct net_iov for sure.. but I will add the comment if it
>>> you think is needed. Thanks for the comment.
>>
>> I agree with David - it's not immediately obvious at first glance.
>> That was my feedback on the previous version as well :)
>>
>> I think it'd be great to add that explanation, since this is where MM and
>> networking intersect.
>>
>
> I think a lot of people are now saying the same thing: (1) lets keep
> net_iov and page/netmem_desc separate, and (2) lets add comments
> explaining their relation so this intersection between MM and
> networking is not confused in the long term .
>
> For #1, concretely I would recommend removing the union inside struct
> net_iov? And also revert all the changes to net_iov for that matter.
Without it the merge logistics will get more complicated than it
should be because of conflicts with changes in the io_uring tree.
It's a temporary solution, I'll convert once the series is merged
and reaches io_uring.
--
Pavel Begunkov
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH net-next v6 1/9] netmem: introduce struct netmem_desc mirroring struct page
2025-06-23 19:09 ` Mina Almasry
2025-06-23 19:28 ` Pavel Begunkov
@ 2025-06-24 1:17 ` Byungchul Park
1 sibling, 0 replies; 35+ messages in thread
From: Byungchul Park @ 2025-06-24 1:17 UTC (permalink / raw)
To: Mina Almasry
Cc: Harry Yoo, David Hildenbrand, willy, netdev, linux-kernel,
linux-mm, kernel_team, kuba, ilias.apalodimas, hawk, akpm, davem,
john.fastabend, andrew+netdev, asml.silence, toke, tariqt,
edumazet, pabeni, saeedm, leon, ast, daniel, lorenzo.stoakes,
Liam.Howlett, vbabka, rppt, surenb, mhocko, horms, linux-rdma,
bpf, vishal.moola, hannes, ziy, jackmanb
On Mon, Jun 23, 2025 at 12:09:09PM -0700, Mina Almasry wrote:
> On Mon, Jun 23, 2025 at 5:18 AM Harry Yoo <harry.yoo@oracle.com> wrote:
> > On Mon, Jun 23, 2025 at 07:28:21PM +0900, Byungchul Park wrote:
> > > On Mon, Jun 23, 2025 at 11:32:16AM +0200, David Hildenbrand wrote:
> > > > On 20.06.25 06:12, Byungchul Park wrote:
> > > > > To simplify struct page, the page pool members of struct page should be
> > > > > moved to other, allowing these members to be removed from struct page.
> > > > >
> > > > > Introduce a network memory descriptor to store the members, struct
> > > > > netmem_desc, and make it union'ed with the existing fields in struct
> > > > > net_iov, allowing to organize the fields of struct net_iov.
> > > >
> > > > It would be great adding some result from the previous discussions in
> > > > here, such as that the layout of "struct net_iov" can be changed because
> > > > it is not a "struct page" overlay, what the next steps based on this
> > >
> > > I think the network folks already know how to use and interpret their
> > > data struct, struct net_iov for sure.. but I will add the comment if it
> > > you think is needed. Thanks for the comment.
> >
> > I agree with David - it's not immediately obvious at first glance.
> > That was my feedback on the previous version as well :)
> >
> > I think it'd be great to add that explanation, since this is where MM and
> > networking intersect.
> >
>
> I think a lot of people are now saying the same thing: (1) lets keep
> net_iov and page/netmem_desc separate, and (2) lets add comments
> explaining their relation so this intersection between MM and
> networking is not confused in the long term .
>
> For #1, concretely I would recommend removing the union inside struct
> net_iov? And also revert all the changes to net_iov for that matter.
It seems like many got it wrong. I didn't change the layout of net_iov
much. I did nothing but replaced the existing pp fields in net_iov with
a wrapper, netmem_desc that still has the same fields. Even with
net_iov reverted, net_iov still has the fields of netmemdesc.
Just to clarify, netmem_desc is the intersection but net_iov is not.
net_iov is a network's thing.
> They are all to bring netmem_desc and net_iov closer together, but the
It was already closer together even before this series, since netmem_ref
is used to unify the related usages.
> feedback is that we should keep them separate, and I kinda agree with
> that. The fact that net_iov includes a netmem_desc in your patch makes
> readers think they're very closely related.
Again, they were already very closely related before this series. Of
course I agree with that it should be kept separated but it's another
issue. It can be done on top of this series by e.g. Pavel as he said.
> For #2, add this comment (roughly) on top of struct net_iov? Untested
> with kdoc and spell checker:
>
> diff --git a/include/net/netmem.h b/include/net/netmem.h
> index 7a1dafa3f080..8fb2b294e5f2 100644
> --- a/include/net/netmem.h
> +++ b/include/net/netmem.h
> @@ -30,6 +30,25 @@ enum net_iov_type {
> NET_IOV_MAX = ULONG_MAX
> };
>
> +/* A memory descriptor representing abstract networking I/O vectors.
> + *
> + * net_iovs are allocated by networking code, and generally represent some
> + * abstract form of non-paged memory used by the networking stack. The size
> + * of the chunk is PAGE_SIZE.
> + *
> + * This memory can be any form of non-struct paged memory. Examples include
> + * imported dmabuf memory and imported io_uring memory. See
> net_iov_type for all
> + * the supported types.
The description should be changed depending on the result of discussion.
However, I will basically add this doc with some adjustment.
Thanks Mina.
Byungchul
> + *
> + * @type: the type of the memory. Different types of net_iovs are supported.
> + * @pp_magic: pp field, similar to the one in struct page/struct netmem_desc.
> + * @pp: the pp this net_iov belongs to, if any.
> + * @owner: the net_iov_area this net_iov belongs to, if any.
> + * @dma_addr: the dma addrs of the net_iov. Needed for the network card to
> + * send/receive this net_iov.
> + * @pp_ref_count: the pp ref count of this net_iov, exactly the same usage as
> + * struct page/struct netmem_desc.
> + */
>
>
>
>
>
> --
> Thanks,
> Mina
^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH net-next v6 2/9] page_pool: rename page_pool_return_page() to page_pool_return_netmem()
2025-06-20 4:12 [PATCH net-next v6 0/9] Split netmem from struct page Byungchul Park
2025-06-20 4:12 ` [PATCH net-next v6 1/9] netmem: introduce struct netmem_desc mirroring " Byungchul Park
@ 2025-06-20 4:12 ` Byungchul Park
2025-06-20 4:12 ` [PATCH net-next v6 3/9] page_pool: rename __page_pool_release_page_dma() to __page_pool_release_netmem_dma() Byungchul Park
` (6 subsequent siblings)
8 siblings, 0 replies; 35+ messages in thread
From: Byungchul Park @ 2025-06-20 4:12 UTC (permalink / raw)
To: willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola, hannes, ziy, jackmanb
Now that page_pool_return_page() is for returning netmem, not struct
page, rename it to page_pool_return_netmem() to reflect what it does.
Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
---
net/core/page_pool.c | 22 +++++++++++-----------
1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index ba7cf3e3c32f..3bf25e554f96 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -371,7 +371,7 @@ struct page_pool *page_pool_create(const struct page_pool_params *params)
}
EXPORT_SYMBOL(page_pool_create);
-static void page_pool_return_page(struct page_pool *pool, netmem_ref netmem);
+static void page_pool_return_netmem(struct page_pool *pool, netmem_ref netmem);
static noinline netmem_ref page_pool_refill_alloc_cache(struct page_pool *pool)
{
@@ -409,7 +409,7 @@ static noinline netmem_ref page_pool_refill_alloc_cache(struct page_pool *pool)
* (2) break out to fallthrough to alloc_pages_node.
* This limit stress on page buddy alloactor.
*/
- page_pool_return_page(pool, netmem);
+ page_pool_return_netmem(pool, netmem);
alloc_stat_inc(pool, waive);
netmem = 0;
break;
@@ -712,7 +712,7 @@ static __always_inline void __page_pool_release_page_dma(struct page_pool *pool,
* a regular page (that will eventually be returned to the normal
* page-allocator via put_page).
*/
-void page_pool_return_page(struct page_pool *pool, netmem_ref netmem)
+static void page_pool_return_netmem(struct page_pool *pool, netmem_ref netmem)
{
int count;
bool put;
@@ -826,7 +826,7 @@ __page_pool_put_page(struct page_pool *pool, netmem_ref netmem,
* will be invoking put_page.
*/
recycle_stat_inc(pool, released_refcnt);
- page_pool_return_page(pool, netmem);
+ page_pool_return_netmem(pool, netmem);
return 0;
}
@@ -869,7 +869,7 @@ void page_pool_put_unrefed_netmem(struct page_pool *pool, netmem_ref netmem,
if (netmem && !page_pool_recycle_in_ring(pool, netmem)) {
/* Cache full, fallback to free pages */
recycle_stat_inc(pool, ring_full);
- page_pool_return_page(pool, netmem);
+ page_pool_return_netmem(pool, netmem);
}
}
EXPORT_SYMBOL(page_pool_put_unrefed_netmem);
@@ -912,7 +912,7 @@ static void page_pool_recycle_ring_bulk(struct page_pool *pool,
* since put_page() with refcnt == 1 can be an expensive operation.
*/
for (; i < bulk_len; i++)
- page_pool_return_page(pool, bulk[i]);
+ page_pool_return_netmem(pool, bulk[i]);
}
/**
@@ -995,7 +995,7 @@ static netmem_ref page_pool_drain_frag(struct page_pool *pool,
return netmem;
}
- page_pool_return_page(pool, netmem);
+ page_pool_return_netmem(pool, netmem);
return 0;
}
@@ -1009,7 +1009,7 @@ static void page_pool_free_frag(struct page_pool *pool)
if (!netmem || page_pool_unref_netmem(netmem, drain_count))
return;
- page_pool_return_page(pool, netmem);
+ page_pool_return_netmem(pool, netmem);
}
netmem_ref page_pool_alloc_frag_netmem(struct page_pool *pool,
@@ -1076,7 +1076,7 @@ static void page_pool_empty_ring(struct page_pool *pool)
pr_crit("%s() page_pool refcnt %d violation\n",
__func__, netmem_ref_count(netmem));
- page_pool_return_page(pool, netmem);
+ page_pool_return_netmem(pool, netmem);
}
}
@@ -1109,7 +1109,7 @@ static void page_pool_empty_alloc_cache_once(struct page_pool *pool)
*/
while (pool->alloc.count) {
netmem = pool->alloc.cache[--pool->alloc.count];
- page_pool_return_page(pool, netmem);
+ page_pool_return_netmem(pool, netmem);
}
}
@@ -1253,7 +1253,7 @@ void page_pool_update_nid(struct page_pool *pool, int new_nid)
/* Flush pool alloc cache, as refill will check NUMA node */
while (pool->alloc.count) {
netmem = pool->alloc.cache[--pool->alloc.count];
- page_pool_return_page(pool, netmem);
+ page_pool_return_netmem(pool, netmem);
}
}
EXPORT_SYMBOL(page_pool_update_nid);
--
2.17.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH net-next v6 3/9] page_pool: rename __page_pool_release_page_dma() to __page_pool_release_netmem_dma()
2025-06-20 4:12 [PATCH net-next v6 0/9] Split netmem from struct page Byungchul Park
2025-06-20 4:12 ` [PATCH net-next v6 1/9] netmem: introduce struct netmem_desc mirroring " Byungchul Park
2025-06-20 4:12 ` [PATCH net-next v6 2/9] page_pool: rename page_pool_return_page() to page_pool_return_netmem() Byungchul Park
@ 2025-06-20 4:12 ` Byungchul Park
2025-06-20 4:12 ` [PATCH net-next v6 4/9] page_pool: rename __page_pool_alloc_pages_slow() to __page_pool_alloc_netmems_slow() Byungchul Park
` (5 subsequent siblings)
8 siblings, 0 replies; 35+ messages in thread
From: Byungchul Park @ 2025-06-20 4:12 UTC (permalink / raw)
To: willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola, hannes, ziy, jackmanb
Now that __page_pool_release_page_dma() is for releasing netmem, not
struct page, rename it to __page_pool_release_netmem_dma() to reflect
what it does.
Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
---
net/core/page_pool.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 3bf25e554f96..95ffa48c7c67 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -673,8 +673,8 @@ void page_pool_clear_pp_info(netmem_ref netmem)
netmem_set_pp(netmem, NULL);
}
-static __always_inline void __page_pool_release_page_dma(struct page_pool *pool,
- netmem_ref netmem)
+static __always_inline void __page_pool_release_netmem_dma(struct page_pool *pool,
+ netmem_ref netmem)
{
struct page *old, *page = netmem_to_page(netmem);
unsigned long id;
@@ -721,7 +721,7 @@ static void page_pool_return_netmem(struct page_pool *pool, netmem_ref netmem)
if (static_branch_unlikely(&page_pool_mem_providers) && pool->mp_ops)
put = pool->mp_ops->release_netmem(pool, netmem);
else
- __page_pool_release_page_dma(pool, netmem);
+ __page_pool_release_netmem_dma(pool, netmem);
/* This may be the last page returned, releasing the pool, so
* it is not safe to reference pool afterwards.
@@ -1136,7 +1136,7 @@ static void page_pool_scrub(struct page_pool *pool)
}
xa_for_each(&pool->dma_mapped, id, ptr)
- __page_pool_release_page_dma(pool, page_to_netmem(ptr));
+ __page_pool_release_netmem_dma(pool, page_to_netmem((struct page *)ptr));
}
/* No more consumers should exist, but producers could still
--
2.17.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH net-next v6 4/9] page_pool: rename __page_pool_alloc_pages_slow() to __page_pool_alloc_netmems_slow()
2025-06-20 4:12 [PATCH net-next v6 0/9] Split netmem from struct page Byungchul Park
` (2 preceding siblings ...)
2025-06-20 4:12 ` [PATCH net-next v6 3/9] page_pool: rename __page_pool_release_page_dma() to __page_pool_release_netmem_dma() Byungchul Park
@ 2025-06-20 4:12 ` Byungchul Park
2025-06-20 4:12 ` [PATCH net-next v6 5/9] netmem: use _Generic to cover const casting for page_to_netmem() Byungchul Park
` (4 subsequent siblings)
8 siblings, 0 replies; 35+ messages in thread
From: Byungchul Park @ 2025-06-20 4:12 UTC (permalink / raw)
To: willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola, hannes, ziy, jackmanb
Now that __page_pool_alloc_pages_slow() is for allocating netmem, not
struct page, rename it to __page_pool_alloc_netmems_slow() to reflect
what it does.
Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
net/core/page_pool.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 95ffa48c7c67..05e2e22a8f7c 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -544,8 +544,8 @@ static struct page *__page_pool_alloc_page_order(struct page_pool *pool,
}
/* slow path */
-static noinline netmem_ref __page_pool_alloc_pages_slow(struct page_pool *pool,
- gfp_t gfp)
+static noinline netmem_ref __page_pool_alloc_netmems_slow(struct page_pool *pool,
+ gfp_t gfp)
{
const int bulk = PP_ALLOC_CACHE_REFILL;
unsigned int pp_order = pool->p.order;
@@ -615,7 +615,7 @@ netmem_ref page_pool_alloc_netmems(struct page_pool *pool, gfp_t gfp)
if (static_branch_unlikely(&page_pool_mem_providers) && pool->mp_ops)
netmem = pool->mp_ops->alloc_netmems(pool, gfp);
else
- netmem = __page_pool_alloc_pages_slow(pool, gfp);
+ netmem = __page_pool_alloc_netmems_slow(pool, gfp);
return netmem;
}
EXPORT_SYMBOL(page_pool_alloc_netmems);
--
2.17.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH net-next v6 5/9] netmem: use _Generic to cover const casting for page_to_netmem()
2025-06-20 4:12 [PATCH net-next v6 0/9] Split netmem from struct page Byungchul Park
` (3 preceding siblings ...)
2025-06-20 4:12 ` [PATCH net-next v6 4/9] page_pool: rename __page_pool_alloc_pages_slow() to __page_pool_alloc_netmems_slow() Byungchul Park
@ 2025-06-20 4:12 ` Byungchul Park
2025-06-20 4:12 ` [PATCH net-next v6 6/9] netmem: remove __netmem_get_pp() Byungchul Park
` (3 subsequent siblings)
8 siblings, 0 replies; 35+ messages in thread
From: Byungchul Park @ 2025-06-20 4:12 UTC (permalink / raw)
To: willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola, hannes, ziy, jackmanb
The current page_to_netmem() doesn't cover const casting resulting in
trying to cast const struct page * to const netmem_ref fails.
To cover the case, change page_to_netmem() to use macro and _Generic.
Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
---
include/net/netmem.h | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/include/net/netmem.h b/include/net/netmem.h
index e0aa67aca9bb..e27ed0b9c82e 100644
--- a/include/net/netmem.h
+++ b/include/net/netmem.h
@@ -191,10 +191,9 @@ static inline netmem_ref net_iov_to_netmem(struct net_iov *niov)
return (__force netmem_ref)((unsigned long)niov | NET_IOV);
}
-static inline netmem_ref page_to_netmem(const struct page *page)
-{
- return (__force netmem_ref)page;
-}
+#define page_to_netmem(p) (_Generic((p), \
+ const struct page * : (__force const netmem_ref)(p), \
+ struct page * : (__force netmem_ref)(p)))
/**
* virt_to_netmem - convert virtual memory pointer to a netmem reference
--
2.17.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH net-next v6 6/9] netmem: remove __netmem_get_pp()
2025-06-20 4:12 [PATCH net-next v6 0/9] Split netmem from struct page Byungchul Park
` (4 preceding siblings ...)
2025-06-20 4:12 ` [PATCH net-next v6 5/9] netmem: use _Generic to cover const casting for page_to_netmem() Byungchul Park
@ 2025-06-20 4:12 ` Byungchul Park
2025-06-23 4:32 ` Byungchul Park
2025-06-20 4:12 ` [PATCH net-next v6 7/9] page_pool: make page_pool_get_dma_addr() just wrap page_pool_get_dma_addr_netmem() Byungchul Park
` (2 subsequent siblings)
8 siblings, 1 reply; 35+ messages in thread
From: Byungchul Park @ 2025-06-20 4:12 UTC (permalink / raw)
To: willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola, hannes, ziy, jackmanb
There are no users of __netmem_get_pp(). Remove it.
Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
---
include/net/netmem.h | 16 ----------------
1 file changed, 16 deletions(-)
diff --git a/include/net/netmem.h b/include/net/netmem.h
index e27ed0b9c82e..d0a84557983d 100644
--- a/include/net/netmem.h
+++ b/include/net/netmem.h
@@ -245,22 +245,6 @@ static inline struct net_iov *__netmem_clear_lsb(netmem_ref netmem)
return (struct net_iov *)((__force unsigned long)netmem & ~NET_IOV);
}
-/**
- * __netmem_get_pp - unsafely get pointer to the &page_pool backing @netmem
- * @netmem: netmem reference to get the pointer from
- *
- * Unsafe version of netmem_get_pp(). When @netmem is always page-backed,
- * e.g. when it's a header buffer, performs faster and generates smaller
- * object code (avoids clearing the LSB). When @netmem points to IOV,
- * provokes invalid memory access.
- *
- * Return: pointer to the &page_pool (garbage if @netmem is not page-backed).
- */
-static inline struct page_pool *__netmem_get_pp(netmem_ref netmem)
-{
- return __netmem_to_page(netmem)->pp;
-}
-
static inline struct page_pool *netmem_get_pp(netmem_ref netmem)
{
return __netmem_clear_lsb(netmem)->pp;
--
2.17.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH net-next v6 6/9] netmem: remove __netmem_get_pp()
2025-06-20 4:12 ` [PATCH net-next v6 6/9] netmem: remove __netmem_get_pp() Byungchul Park
@ 2025-06-23 4:32 ` Byungchul Park
2025-06-24 0:17 ` Jakub Kicinski
0 siblings, 1 reply; 35+ messages in thread
From: Byungchul Park @ 2025-06-23 4:32 UTC (permalink / raw)
To: willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola, hannes, ziy, jackmanb
On Fri, Jun 20, 2025 at 01:12:21PM +0900, Byungchul Park wrote:
> There are no users of __netmem_get_pp(). Remove it.
>
> Signed-off-by: Byungchul Park <byungchul@sk.com>
> Reviewed-by: Mina Almasry <almasrymina@google.com>
> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
> ---
> include/net/netmem.h | 16 ----------------
> 1 file changed, 16 deletions(-)
>
> diff --git a/include/net/netmem.h b/include/net/netmem.h
> index e27ed0b9c82e..d0a84557983d 100644
> --- a/include/net/netmem.h
> +++ b/include/net/netmem.h
> @@ -245,22 +245,6 @@ static inline struct net_iov *__netmem_clear_lsb(netmem_ref netmem)
> return (struct net_iov *)((__force unsigned long)netmem & ~NET_IOV);
> }
>
> -/**
> - * __netmem_get_pp - unsafely get pointer to the &page_pool backing @netmem
> - * @netmem: netmem reference to get the pointer from
> - *
> - * Unsafe version of netmem_get_pp(). When @netmem is always page-backed,
> - * e.g. when it's a header buffer, performs faster and generates smaller
> - * object code (avoids clearing the LSB). When @netmem points to IOV,
> - * provokes invalid memory access.
> - *
> - * Return: pointer to the &page_pool (garbage if @netmem is not page-backed).
> - */
> -static inline struct page_pool *__netmem_get_pp(netmem_ref netmem)
> -{
> - return __netmem_to_page(netmem)->pp;
> -}
> -
In the meantime, libeth started to use __netmem_get_pp() again :(
Discard this patch please. Do I have to resend this series with this
excluded?
Byungchul
> static inline struct page_pool *netmem_get_pp(netmem_ref netmem)
> {
> return __netmem_clear_lsb(netmem)->pp;
> --
> 2.17.1
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH net-next v6 6/9] netmem: remove __netmem_get_pp()
2025-06-23 4:32 ` Byungchul Park
@ 2025-06-24 0:17 ` Jakub Kicinski
2025-06-24 1:27 ` Byungchul Park
0 siblings, 1 reply; 35+ messages in thread
From: Jakub Kicinski @ 2025-06-24 0:17 UTC (permalink / raw)
To: Byungchul Park
Cc: willy, netdev, linux-kernel, linux-mm, kernel_team, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola, hannes, ziy, jackmanb
On Mon, 23 Jun 2025 13:32:07 +0900 Byungchul Park wrote:
> In the meantime, libeth started to use __netmem_get_pp() again :(
>
> Discard this patch please. Do I have to resend this series with this
> excluded?
You'll need to repost, unfortunately. If there's a build failure
the CI doesn't execute any functional testing nodes.
--
pw-bot: cr
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH net-next v6 6/9] netmem: remove __netmem_get_pp()
2025-06-24 0:17 ` Jakub Kicinski
@ 2025-06-24 1:27 ` Byungchul Park
0 siblings, 0 replies; 35+ messages in thread
From: Byungchul Park @ 2025-06-24 1:27 UTC (permalink / raw)
To: Jakub Kicinski
Cc: willy, netdev, linux-kernel, linux-mm, kernel_team, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola, hannes, ziy, jackmanb
On Mon, Jun 23, 2025 at 05:17:06PM -0700, Jakub Kicinski wrote:
> On Mon, 23 Jun 2025 13:32:07 +0900 Byungchul Park wrote:
> > In the meantime, libeth started to use __netmem_get_pp() again :(
> >
> > Discard this patch please. Do I have to resend this series with this
> > excluded?
>
> You'll need to repost, unfortunately. If there's a build failure
> the CI doesn't execute any functional testing nodes.
I will. Thanks.
Byungchul
> --
> pw-bot: cr
^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH net-next v6 7/9] page_pool: make page_pool_get_dma_addr() just wrap page_pool_get_dma_addr_netmem()
2025-06-20 4:12 [PATCH net-next v6 0/9] Split netmem from struct page Byungchul Park
` (5 preceding siblings ...)
2025-06-20 4:12 ` [PATCH net-next v6 6/9] netmem: remove __netmem_get_pp() Byungchul Park
@ 2025-06-20 4:12 ` Byungchul Park
2025-06-20 4:12 ` [PATCH net-next v6 8/9] netmem: introduce a netmem API, virt_to_head_netmem() Byungchul Park
2025-06-20 4:12 ` [PATCH net-next v6 9/9] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp() Byungchul Park
8 siblings, 0 replies; 35+ messages in thread
From: Byungchul Park @ 2025-06-20 4:12 UTC (permalink / raw)
To: willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola, hannes, ziy, jackmanb
The page pool members in struct page cannot be removed unless it's not
allowed to access any of them via struct page.
Do not access 'page->dma_addr' directly in page_pool_get_dma_addr() but
just wrap page_pool_get_dma_addr_netmem() safely.
Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
---
include/net/page_pool/helpers.h | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
index 773fc65780b5..db180626be06 100644
--- a/include/net/page_pool/helpers.h
+++ b/include/net/page_pool/helpers.h
@@ -444,12 +444,7 @@ static inline dma_addr_t page_pool_get_dma_addr_netmem(netmem_ref netmem)
*/
static inline dma_addr_t page_pool_get_dma_addr(const struct page *page)
{
- dma_addr_t ret = page->dma_addr;
-
- if (PAGE_POOL_32BIT_ARCH_WITH_64BIT_DMA)
- ret <<= PAGE_SHIFT;
-
- return ret;
+ return page_pool_get_dma_addr_netmem(page_to_netmem(page));
}
static inline void __page_pool_dma_sync_for_cpu(const struct page_pool *pool,
--
2.17.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH net-next v6 8/9] netmem: introduce a netmem API, virt_to_head_netmem()
2025-06-20 4:12 [PATCH net-next v6 0/9] Split netmem from struct page Byungchul Park
` (6 preceding siblings ...)
2025-06-20 4:12 ` [PATCH net-next v6 7/9] page_pool: make page_pool_get_dma_addr() just wrap page_pool_get_dma_addr_netmem() Byungchul Park
@ 2025-06-20 4:12 ` Byungchul Park
2025-06-20 4:12 ` [PATCH net-next v6 9/9] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp() Byungchul Park
8 siblings, 0 replies; 35+ messages in thread
From: Byungchul Park @ 2025-06-20 4:12 UTC (permalink / raw)
To: willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola, hannes, ziy, jackmanb
To eliminate the use of struct page in page pool, the page pool code
should use netmem descriptor and APIs instead.
As part of the work, introduce a netmem API to convert a virtual address
to a head netmem allowing the code to use it rather than the existing
API, virt_to_head_page() for struct page.
Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
---
include/net/netmem.h | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/include/net/netmem.h b/include/net/netmem.h
index d0a84557983d..d49ed49d250b 100644
--- a/include/net/netmem.h
+++ b/include/net/netmem.h
@@ -276,6 +276,13 @@ static inline netmem_ref netmem_compound_head(netmem_ref netmem)
return page_to_netmem(compound_head(netmem_to_page(netmem)));
}
+static inline netmem_ref virt_to_head_netmem(const void *x)
+{
+ netmem_ref netmem = virt_to_netmem(x);
+
+ return netmem_compound_head(netmem);
+}
+
/**
* __netmem_address - unsafely get pointer to the memory backing @netmem
* @netmem: netmem reference to get the pointer for
--
2.17.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH net-next v6 9/9] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp()
2025-06-20 4:12 [PATCH net-next v6 0/9] Split netmem from struct page Byungchul Park
` (7 preceding siblings ...)
2025-06-20 4:12 ` [PATCH net-next v6 8/9] netmem: introduce a netmem API, virt_to_head_netmem() Byungchul Park
@ 2025-06-20 4:12 ` Byungchul Park
2025-06-23 9:16 ` David Hildenbrand
8 siblings, 1 reply; 35+ messages in thread
From: Byungchul Park @ 2025-06-20 4:12 UTC (permalink / raw)
To: willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, david, lorenzo.stoakes, Liam.Howlett,
vbabka, rppt, surenb, mhocko, horms, linux-rdma, bpf,
vishal.moola, hannes, ziy, jackmanb
To simplify struct page, the effort to separate its own descriptor from
struct page is required and the work for page pool is on going.
To achieve that, all the code should avoid directly accessing page pool
members of struct page.
Access ->pp_magic through struct netmem_desc instead of directly
accessing it through struct page in page_pool_page_is_pp(). Plus, move
page_pool_page_is_pp() from mm.h to netmem.h to use struct netmem_desc
without header dependency issue.
Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Harry Yoo <harry.yoo@oracle.com>
---
include/linux/mm.h | 12 ------------
include/net/netmem.h | 14 ++++++++++++++
mm/page_alloc.c | 1 +
3 files changed, 15 insertions(+), 12 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0ef2ba0c667a..0b7f7f998085 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -4172,16 +4172,4 @@ int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status);
*/
#define PP_MAGIC_MASK ~(PP_DMA_INDEX_MASK | 0x3UL)
-#ifdef CONFIG_PAGE_POOL
-static inline bool page_pool_page_is_pp(struct page *page)
-{
- return (page->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
-}
-#else
-static inline bool page_pool_page_is_pp(struct page *page)
-{
- return false;
-}
-#endif
-
#endif /* _LINUX_MM_H */
diff --git a/include/net/netmem.h b/include/net/netmem.h
index d49ed49d250b..3d1b1dfc9ba5 100644
--- a/include/net/netmem.h
+++ b/include/net/netmem.h
@@ -56,6 +56,20 @@ NETMEM_DESC_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
*/
static_assert(sizeof(struct netmem_desc) <= offsetof(struct page, _refcount));
+#ifdef CONFIG_PAGE_POOL
+static inline bool page_pool_page_is_pp(struct page *page)
+{
+ struct netmem_desc *desc = (struct netmem_desc *)page;
+
+ return (desc->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
+}
+#else
+static inline bool page_pool_page_is_pp(struct page *page)
+{
+ return false;
+}
+#endif
+
/* net_iov */
DECLARE_STATIC_KEY_FALSE(page_pool_mem_providers);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2ef3c07266b3..cc1d169853e8 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -55,6 +55,7 @@
#include <linux/delayacct.h>
#include <linux/cacheinfo.h>
#include <linux/pgalloc_tag.h>
+#include <net/netmem.h>
#include <asm/div64.h>
#include "internal.h"
#include "shuffle.h"
--
2.17.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH net-next v6 9/9] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp()
2025-06-20 4:12 ` [PATCH net-next v6 9/9] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp() Byungchul Park
@ 2025-06-23 9:16 ` David Hildenbrand
2025-06-23 10:16 ` Byungchul Park
0 siblings, 1 reply; 35+ messages in thread
From: David Hildenbrand @ 2025-06-23 9:16 UTC (permalink / raw)
To: Byungchul Park, willy, netdev
Cc: linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, lorenzo.stoakes, Liam.Howlett, vbabka,
rppt, surenb, mhocko, horms, linux-rdma, bpf, vishal.moola,
hannes, ziy, jackmanb
On 20.06.25 06:12, Byungchul Park wrote:
> To simplify struct page, the effort to separate its own descriptor from
> struct page is required and the work for page pool is on going.
>
> To achieve that, all the code should avoid directly accessing page pool
> members of struct page.
>
> Access ->pp_magic through struct netmem_desc instead of directly
> accessing it through struct page in page_pool_page_is_pp(). Plus, move
> page_pool_page_is_pp() from mm.h to netmem.h to use struct netmem_desc
> without header dependency issue.
>
> Signed-off-by: Byungchul Park <byungchul@sk.com>
> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
> Reviewed-by: Mina Almasry <almasrymina@google.com>
> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
> Acked-by: Harry Yoo <harry.yoo@oracle.com>
> ---
> include/linux/mm.h | 12 ------------
> include/net/netmem.h | 14 ++++++++++++++
> mm/page_alloc.c | 1 +
> 3 files changed, 15 insertions(+), 12 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 0ef2ba0c667a..0b7f7f998085 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -4172,16 +4172,4 @@ int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status);
> */
> #define PP_MAGIC_MASK ~(PP_DMA_INDEX_MASK | 0x3UL)
>
> -#ifdef CONFIG_PAGE_POOL
> -static inline bool page_pool_page_is_pp(struct page *page)
> -{
> - return (page->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
> -}
> -#else
> -static inline bool page_pool_page_is_pp(struct page *page)
> -{
> - return false;
> -}
> -#endif
> -
> #endif /* _LINUX_MM_H */
> diff --git a/include/net/netmem.h b/include/net/netmem.h
> index d49ed49d250b..3d1b1dfc9ba5 100644
> --- a/include/net/netmem.h
> +++ b/include/net/netmem.h
> @@ -56,6 +56,20 @@ NETMEM_DESC_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
> */
> static_assert(sizeof(struct netmem_desc) <= offsetof(struct page, _refcount));
>
> +#ifdef CONFIG_PAGE_POOL
> +static inline bool page_pool_page_is_pp(struct page *page)
> +{
> + struct netmem_desc *desc = (struct netmem_desc *)page;
> +
> + return (desc->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
> +}
> +#else
> +static inline bool page_pool_page_is_pp(struct page *page)
> +{
> + return false;
> +}
> +#endif
I wonder how helpful this cleanup is long-term.
page_pool_page_is_pp() is only called from mm/page_alloc.c, right?
There, we want to make sure that no pagepool page is ever returned to
the buddy.
How reasonable is this sanity check to have long-term? Wouldn't we be
able to check that on some higher-level freeing path?
The reason I am commenting is that once we decouple "struct page" from
"struct netmem_desc", we'd have to lookup here the corresponding "struct
netmem_desc".
... but at that point here (when we free the actual pages), the "struct
netmem_desc" would likely already have been freed separately (remember:
it will be dynamically allocated).
With that in mind:
1) Is there a higher level "struct netmem_desc" freeing path where we
could check that instead, so we don't have to cast from pages to
netmem_desc at all.
2) How valuable are these sanity checks deep in the buddy?
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH net-next v6 9/9] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp()
2025-06-23 9:16 ` David Hildenbrand
@ 2025-06-23 10:16 ` Byungchul Park
2025-06-23 11:13 ` Zi Yan
0 siblings, 1 reply; 35+ messages in thread
From: Byungchul Park @ 2025-06-23 10:16 UTC (permalink / raw)
To: David Hildenbrand
Cc: willy, netdev, linux-kernel, linux-mm, kernel_team, kuba,
almasrymina, ilias.apalodimas, harry.yoo, hawk, akpm, davem,
john.fastabend, andrew+netdev, asml.silence, toke, tariqt,
edumazet, pabeni, saeedm, leon, ast, daniel, lorenzo.stoakes,
Liam.Howlett, vbabka, rppt, surenb, mhocko, horms, linux-rdma,
bpf, vishal.moola, hannes, ziy, jackmanb
On Mon, Jun 23, 2025 at 11:16:43AM +0200, David Hildenbrand wrote:
> On 20.06.25 06:12, Byungchul Park wrote:
> > To simplify struct page, the effort to separate its own descriptor from
> > struct page is required and the work for page pool is on going.
> >
> > To achieve that, all the code should avoid directly accessing page pool
> > members of struct page.
> >
> > Access ->pp_magic through struct netmem_desc instead of directly
> > accessing it through struct page in page_pool_page_is_pp(). Plus, move
> > page_pool_page_is_pp() from mm.h to netmem.h to use struct netmem_desc
> > without header dependency issue.
> >
> > Signed-off-by: Byungchul Park <byungchul@sk.com>
> > Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
> > Reviewed-by: Mina Almasry <almasrymina@google.com>
> > Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
> > Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
> > Acked-by: Harry Yoo <harry.yoo@oracle.com>
> > ---
> > include/linux/mm.h | 12 ------------
> > include/net/netmem.h | 14 ++++++++++++++
> > mm/page_alloc.c | 1 +
> > 3 files changed, 15 insertions(+), 12 deletions(-)
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 0ef2ba0c667a..0b7f7f998085 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -4172,16 +4172,4 @@ int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status);
> > */
> > #define PP_MAGIC_MASK ~(PP_DMA_INDEX_MASK | 0x3UL)
> >
> > -#ifdef CONFIG_PAGE_POOL
> > -static inline bool page_pool_page_is_pp(struct page *page)
> > -{
> > - return (page->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
> > -}
> > -#else
> > -static inline bool page_pool_page_is_pp(struct page *page)
> > -{
> > - return false;
> > -}
> > -#endif
> > -
> > #endif /* _LINUX_MM_H */
> > diff --git a/include/net/netmem.h b/include/net/netmem.h
> > index d49ed49d250b..3d1b1dfc9ba5 100644
> > --- a/include/net/netmem.h
> > +++ b/include/net/netmem.h
> > @@ -56,6 +56,20 @@ NETMEM_DESC_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
> > */
> > static_assert(sizeof(struct netmem_desc) <= offsetof(struct page, _refcount));
> >
> > +#ifdef CONFIG_PAGE_POOL
> > +static inline bool page_pool_page_is_pp(struct page *page)
> > +{
> > + struct netmem_desc *desc = (struct netmem_desc *)page;
> > +
> > + return (desc->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
> > +}
> > +#else
> > +static inline bool page_pool_page_is_pp(struct page *page)
> > +{
> > + return false;
> > +}
> > +#endif
>
> I wonder how helpful this cleanup is long-term.
>
> page_pool_page_is_pp() is only called from mm/page_alloc.c, right?
Yes.
> There, we want to make sure that no pagepool page is ever returned to
> the buddy.
>
> How reasonable is this sanity check to have long-term? Wouldn't we be
> able to check that on some higher-level freeing path?
>
> The reason I am commenting is that once we decouple "struct page" from
> "struct netmem_desc", we'd have to lookup here the corresponding "struct
> netmem_desc".
>
> ... but at that point here (when we free the actual pages), the "struct
> netmem_desc" would likely already have been freed separately (remember:
> it will be dynamically allocated).
>
> With that in mind:
>
> 1) Is there a higher level "struct netmem_desc" freeing path where we
> could check that instead, so we don't have to cast from pages to
> netmem_desc at all.
I also thought it's too paranoiac. However, I thought it's other issue
than this work. That's why I left the API as is for now, it can be gone
once we get convinced the check is unnecessary in deep buddy. Wrong?
> 2) How valuable are these sanity checks deep in the buddy?
That was also what I felt weird on.
Byungchul
> --
> Cheers,
>
> David / dhildenb
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH net-next v6 9/9] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp()
2025-06-23 10:16 ` Byungchul Park
@ 2025-06-23 11:13 ` Zi Yan
2025-06-23 11:25 ` Byungchul Park
2025-06-23 14:58 ` David Hildenbrand
0 siblings, 2 replies; 35+ messages in thread
From: Zi Yan @ 2025-06-23 11:13 UTC (permalink / raw)
To: Byungchul Park, David Hildenbrand
Cc: willy, netdev, linux-kernel, linux-mm, kernel_team, kuba,
almasrymina, ilias.apalodimas, harry.yoo, hawk, akpm, davem,
john.fastabend, andrew+netdev, asml.silence, toke, tariqt,
edumazet, pabeni, saeedm, leon, ast, daniel, lorenzo.stoakes,
Liam.Howlett, vbabka, rppt, surenb, mhocko, horms, linux-rdma,
bpf, vishal.moola, hannes, jackmanb
On 23 Jun 2025, at 6:16, Byungchul Park wrote:
> On Mon, Jun 23, 2025 at 11:16:43AM +0200, David Hildenbrand wrote:
>> On 20.06.25 06:12, Byungchul Park wrote:
>>> To simplify struct page, the effort to separate its own descriptor from
>>> struct page is required and the work for page pool is on going.
>>>
>>> To achieve that, all the code should avoid directly accessing page pool
>>> members of struct page.
>>>
>>> Access ->pp_magic through struct netmem_desc instead of directly
>>> accessing it through struct page in page_pool_page_is_pp(). Plus, move
>>> page_pool_page_is_pp() from mm.h to netmem.h to use struct netmem_desc
>>> without header dependency issue.
>>>
>>> Signed-off-by: Byungchul Park <byungchul@sk.com>
>>> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
>>> Reviewed-by: Mina Almasry <almasrymina@google.com>
>>> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
>>> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
>>> Acked-by: Harry Yoo <harry.yoo@oracle.com>
>>> ---
>>> include/linux/mm.h | 12 ------------
>>> include/net/netmem.h | 14 ++++++++++++++
>>> mm/page_alloc.c | 1 +
>>> 3 files changed, 15 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>> index 0ef2ba0c667a..0b7f7f998085 100644
>>> --- a/include/linux/mm.h
>>> +++ b/include/linux/mm.h
>>> @@ -4172,16 +4172,4 @@ int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status);
>>> */
>>> #define PP_MAGIC_MASK ~(PP_DMA_INDEX_MASK | 0x3UL)
>>>
>>> -#ifdef CONFIG_PAGE_POOL
>>> -static inline bool page_pool_page_is_pp(struct page *page)
>>> -{
>>> - return (page->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
>>> -}
>>> -#else
>>> -static inline bool page_pool_page_is_pp(struct page *page)
>>> -{
>>> - return false;
>>> -}
>>> -#endif
>>> -
>>> #endif /* _LINUX_MM_H */
>>> diff --git a/include/net/netmem.h b/include/net/netmem.h
>>> index d49ed49d250b..3d1b1dfc9ba5 100644
>>> --- a/include/net/netmem.h
>>> +++ b/include/net/netmem.h
>>> @@ -56,6 +56,20 @@ NETMEM_DESC_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
>>> */
>>> static_assert(sizeof(struct netmem_desc) <= offsetof(struct page, _refcount));
>>>
>>> +#ifdef CONFIG_PAGE_POOL
>>> +static inline bool page_pool_page_is_pp(struct page *page)
>>> +{
>>> + struct netmem_desc *desc = (struct netmem_desc *)page;
>>> +
>>> + return (desc->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
>>> +}
>>> +#else
>>> +static inline bool page_pool_page_is_pp(struct page *page)
>>> +{
>>> + return false;
>>> +}
>>> +#endif
>>
>> I wonder how helpful this cleanup is long-term.
>>
>> page_pool_page_is_pp() is only called from mm/page_alloc.c, right?
>
> Yes.
>
>> There, we want to make sure that no pagepool page is ever returned to
>> the buddy.
>>
>> How reasonable is this sanity check to have long-term? Wouldn't we be
>> able to check that on some higher-level freeing path?
>>
>> The reason I am commenting is that once we decouple "struct page" from
>> "struct netmem_desc", we'd have to lookup here the corresponding "struct
>> netmem_desc".
>>
>> ... but at that point here (when we free the actual pages), the "struct
>> netmem_desc" would likely already have been freed separately (remember:
>> it will be dynamically allocated).
>>
>> With that in mind:
>>
>> 1) Is there a higher level "struct netmem_desc" freeing path where we
>> could check that instead, so we don't have to cast from pages to
>> netmem_desc at all.
>
> I also thought it's too paranoiac. However, I thought it's other issue
> than this work. That's why I left the API as is for now, it can be gone
> once we get convinced the check is unnecessary in deep buddy. Wrong?
>
>> 2) How valuable are these sanity checks deep in the buddy?
>
> That was also what I felt weird on.
It seems very useful when I asked last time[1]:
|> We have actually used this at Cloudflare to catch some page_pool bugs.
[1] https://lore.kernel.org/linux-mm/4d35bda2-d032-49db-bb6e-b1d70f10d436@kernel.org/
--
Best Regards,
Yan, Zi
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH net-next v6 9/9] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp()
2025-06-23 11:13 ` Zi Yan
@ 2025-06-23 11:25 ` Byungchul Park
2025-06-23 14:58 ` David Hildenbrand
1 sibling, 0 replies; 35+ messages in thread
From: Byungchul Park @ 2025-06-23 11:25 UTC (permalink / raw)
To: Zi Yan
Cc: David Hildenbrand, willy, netdev, linux-kernel, linux-mm,
kernel_team, kuba, almasrymina, ilias.apalodimas, harry.yoo, hawk,
akpm, davem, john.fastabend, andrew+netdev, asml.silence, toke,
tariqt, edumazet, pabeni, saeedm, leon, ast, daniel,
lorenzo.stoakes, Liam.Howlett, vbabka, rppt, surenb, mhocko,
horms, linux-rdma, bpf, vishal.moola, hannes, jackmanb
On Mon, Jun 23, 2025 at 07:13:21AM -0400, Zi Yan wrote:
> On 23 Jun 2025, at 6:16, Byungchul Park wrote:
> > On Mon, Jun 23, 2025 at 11:16:43AM +0200, David Hildenbrand wrote:
> >> On 20.06.25 06:12, Byungchul Park wrote:
> >>> To simplify struct page, the effort to separate its own descriptor from
> >>> struct page is required and the work for page pool is on going.
> >>>
> >>> To achieve that, all the code should avoid directly accessing page pool
> >>> members of struct page.
> >>>
> >>> Access ->pp_magic through struct netmem_desc instead of directly
> >>> accessing it through struct page in page_pool_page_is_pp(). Plus, move
> >>> page_pool_page_is_pp() from mm.h to netmem.h to use struct netmem_desc
> >>> without header dependency issue.
> >>>
> >>> Signed-off-by: Byungchul Park <byungchul@sk.com>
> >>> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
> >>> Reviewed-by: Mina Almasry <almasrymina@google.com>
> >>> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
> >>> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
> >>> Acked-by: Harry Yoo <harry.yoo@oracle.com>
> >>> ---
> >>> include/linux/mm.h | 12 ------------
> >>> include/net/netmem.h | 14 ++++++++++++++
> >>> mm/page_alloc.c | 1 +
> >>> 3 files changed, 15 insertions(+), 12 deletions(-)
> >>>
> >>> diff --git a/include/linux/mm.h b/include/linux/mm.h
> >>> index 0ef2ba0c667a..0b7f7f998085 100644
> >>> --- a/include/linux/mm.h
> >>> +++ b/include/linux/mm.h
> >>> @@ -4172,16 +4172,4 @@ int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status);
> >>> */
> >>> #define PP_MAGIC_MASK ~(PP_DMA_INDEX_MASK | 0x3UL)
> >>>
> >>> -#ifdef CONFIG_PAGE_POOL
> >>> -static inline bool page_pool_page_is_pp(struct page *page)
> >>> -{
> >>> - return (page->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
> >>> -}
> >>> -#else
> >>> -static inline bool page_pool_page_is_pp(struct page *page)
> >>> -{
> >>> - return false;
> >>> -}
> >>> -#endif
> >>> -
> >>> #endif /* _LINUX_MM_H */
> >>> diff --git a/include/net/netmem.h b/include/net/netmem.h
> >>> index d49ed49d250b..3d1b1dfc9ba5 100644
> >>> --- a/include/net/netmem.h
> >>> +++ b/include/net/netmem.h
> >>> @@ -56,6 +56,20 @@ NETMEM_DESC_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
> >>> */
> >>> static_assert(sizeof(struct netmem_desc) <= offsetof(struct page, _refcount));
> >>>
> >>> +#ifdef CONFIG_PAGE_POOL
> >>> +static inline bool page_pool_page_is_pp(struct page *page)
> >>> +{
> >>> + struct netmem_desc *desc = (struct netmem_desc *)page;
> >>> +
> >>> + return (desc->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
> >>> +}
> >>> +#else
> >>> +static inline bool page_pool_page_is_pp(struct page *page)
> >>> +{
> >>> + return false;
> >>> +}
> >>> +#endif
> >>
> >> I wonder how helpful this cleanup is long-term.
> >>
> >> page_pool_page_is_pp() is only called from mm/page_alloc.c, right?
> >
> > Yes.
> >
> >> There, we want to make sure that no pagepool page is ever returned to
> >> the buddy.
> >>
> >> How reasonable is this sanity check to have long-term? Wouldn't we be
> >> able to check that on some higher-level freeing path?
> >>
> >> The reason I am commenting is that once we decouple "struct page" from
> >> "struct netmem_desc", we'd have to lookup here the corresponding "struct
> >> netmem_desc".
> >>
> >> ... but at that point here (when we free the actual pages), the "struct
> >> netmem_desc" would likely already have been freed separately (remember:
> >> it will be dynamically allocated).
> >>
> >> With that in mind:
> >>
> >> 1) Is there a higher level "struct netmem_desc" freeing path where we
> >> could check that instead, so we don't have to cast from pages to
> >> netmem_desc at all.
> >
> > I also thought it's too paranoiac. However, I thought it's other issue
> > than this work. That's why I left the API as is for now, it can be gone
> > once we get convinced the check is unnecessary in deep buddy. Wrong?
> >
> >> 2) How valuable are these sanity checks deep in the buddy?
> >
> > That was also what I felt weird on.
>
> It seems very useful when I asked last time[1]:
>
> |> We have actually used this at Cloudflare to catch some page_pool bugs.
Indeed.. So I think it'd be better to leave the check as is until we
will be fully convinced on that issue, I ideally agree with David's
opinion tho.
Byungchul
> [1] https://lore.kernel.org/linux-mm/4d35bda2-d032-49db-bb6e-b1d70f10d436@kernel.org/
>
> --
> Best Regards,
> Yan, Zi
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH net-next v6 9/9] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp()
2025-06-23 11:13 ` Zi Yan
2025-06-23 11:25 ` Byungchul Park
@ 2025-06-23 14:58 ` David Hildenbrand
2025-06-23 15:25 ` Zi Yan
2025-06-23 17:06 ` Pavel Begunkov
1 sibling, 2 replies; 35+ messages in thread
From: David Hildenbrand @ 2025-06-23 14:58 UTC (permalink / raw)
To: Zi Yan, Byungchul Park
Cc: willy, netdev, linux-kernel, linux-mm, kernel_team, kuba,
almasrymina, ilias.apalodimas, harry.yoo, hawk, akpm, davem,
john.fastabend, andrew+netdev, asml.silence, toke, tariqt,
edumazet, pabeni, saeedm, leon, ast, daniel, lorenzo.stoakes,
Liam.Howlett, vbabka, rppt, surenb, mhocko, horms, linux-rdma,
bpf, vishal.moola, hannes, jackmanb
On 23.06.25 13:13, Zi Yan wrote:
> On 23 Jun 2025, at 6:16, Byungchul Park wrote:
>
>> On Mon, Jun 23, 2025 at 11:16:43AM +0200, David Hildenbrand wrote:
>>> On 20.06.25 06:12, Byungchul Park wrote:
>>>> To simplify struct page, the effort to separate its own descriptor from
>>>> struct page is required and the work for page pool is on going.
>>>>
>>>> To achieve that, all the code should avoid directly accessing page pool
>>>> members of struct page.
>>>>
>>>> Access ->pp_magic through struct netmem_desc instead of directly
>>>> accessing it through struct page in page_pool_page_is_pp(). Plus, move
>>>> page_pool_page_is_pp() from mm.h to netmem.h to use struct netmem_desc
>>>> without header dependency issue.
>>>>
>>>> Signed-off-by: Byungchul Park <byungchul@sk.com>
>>>> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
>>>> Reviewed-by: Mina Almasry <almasrymina@google.com>
>>>> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
>>>> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
>>>> Acked-by: Harry Yoo <harry.yoo@oracle.com>
>>>> ---
>>>> include/linux/mm.h | 12 ------------
>>>> include/net/netmem.h | 14 ++++++++++++++
>>>> mm/page_alloc.c | 1 +
>>>> 3 files changed, 15 insertions(+), 12 deletions(-)
>>>>
>>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>>> index 0ef2ba0c667a..0b7f7f998085 100644
>>>> --- a/include/linux/mm.h
>>>> +++ b/include/linux/mm.h
>>>> @@ -4172,16 +4172,4 @@ int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status);
>>>> */
>>>> #define PP_MAGIC_MASK ~(PP_DMA_INDEX_MASK | 0x3UL)
>>>>
>>>> -#ifdef CONFIG_PAGE_POOL
>>>> -static inline bool page_pool_page_is_pp(struct page *page)
>>>> -{
>>>> - return (page->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
>>>> -}
>>>> -#else
>>>> -static inline bool page_pool_page_is_pp(struct page *page)
>>>> -{
>>>> - return false;
>>>> -}
>>>> -#endif
>>>> -
>>>> #endif /* _LINUX_MM_H */
>>>> diff --git a/include/net/netmem.h b/include/net/netmem.h
>>>> index d49ed49d250b..3d1b1dfc9ba5 100644
>>>> --- a/include/net/netmem.h
>>>> +++ b/include/net/netmem.h
>>>> @@ -56,6 +56,20 @@ NETMEM_DESC_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
>>>> */
>>>> static_assert(sizeof(struct netmem_desc) <= offsetof(struct page, _refcount));
>>>>
>>>> +#ifdef CONFIG_PAGE_POOL
>>>> +static inline bool page_pool_page_is_pp(struct page *page)
>>>> +{
>>>> + struct netmem_desc *desc = (struct netmem_desc *)page;
>>>> +
>>>> + return (desc->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
>>>> +}
>>>> +#else
>>>> +static inline bool page_pool_page_is_pp(struct page *page)
>>>> +{
>>>> + return false;
>>>> +}
>>>> +#endif
>>>
>>> I wonder how helpful this cleanup is long-term.
>>>
>>> page_pool_page_is_pp() is only called from mm/page_alloc.c, right?
>>
>> Yes.
>>
>>> There, we want to make sure that no pagepool page is ever returned to
>>> the buddy.
>>>
>>> How reasonable is this sanity check to have long-term? Wouldn't we be
>>> able to check that on some higher-level freeing path?
>>>
>>> The reason I am commenting is that once we decouple "struct page" from
>>> "struct netmem_desc", we'd have to lookup here the corresponding "struct
>>> netmem_desc".
>>>
>>> ... but at that point here (when we free the actual pages), the "struct
>>> netmem_desc" would likely already have been freed separately (remember:
>>> it will be dynamically allocated).
>>>
>>> With that in mind:
>>>
>>> 1) Is there a higher level "struct netmem_desc" freeing path where we
>>> could check that instead, so we don't have to cast from pages to
>>> netmem_desc at all.
>>
>> I also thought it's too paranoiac. However, I thought it's other issue
>> than this work. That's why I left the API as is for now, it can be gone
>> once we get convinced the check is unnecessary in deep buddy. Wrong?
>>
>>> 2) How valuable are these sanity checks deep in the buddy?
>>
>> That was also what I felt weird on.
>
> It seems very useful when I asked last time[1]:
>
> |> We have actually used this at Cloudflare to catch some page_pool bugs.
My question is rather, whether there is some higher-level freeing path
for netmem_desc where we could check that instead (IOW, earlier).
Or is it really arbitrary put_page() (IOW, we assume that many possible
references can be held)?
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH net-next v6 9/9] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp()
2025-06-23 14:58 ` David Hildenbrand
@ 2025-06-23 15:25 ` Zi Yan
2025-06-24 14:43 ` Toke Høiland-Jørgensen
2025-06-23 17:06 ` Pavel Begunkov
1 sibling, 1 reply; 35+ messages in thread
From: Zi Yan @ 2025-06-23 15:25 UTC (permalink / raw)
To: David Hildenbrand, Toke Høiland-Jørgensen
Cc: Byungchul Park, willy, netdev, linux-kernel, linux-mm,
kernel_team, kuba, almasrymina, ilias.apalodimas, harry.yoo, hawk,
akpm, davem, john.fastabend, andrew+netdev, asml.silence, toke,
tariqt, edumazet, pabeni, saeedm, leon, ast, daniel,
lorenzo.stoakes, Liam.Howlett, vbabka, rppt, surenb, mhocko,
horms, linux-rdma, bpf, vishal.moola, hannes, jackmanb
On 23 Jun 2025, at 10:58, David Hildenbrand wrote:
> On 23.06.25 13:13, Zi Yan wrote:
>> On 23 Jun 2025, at 6:16, Byungchul Park wrote:
>>
>>> On Mon, Jun 23, 2025 at 11:16:43AM +0200, David Hildenbrand wrote:
>>>> On 20.06.25 06:12, Byungchul Park wrote:
>>>>> To simplify struct page, the effort to separate its own descriptor from
>>>>> struct page is required and the work for page pool is on going.
>>>>>
>>>>> To achieve that, all the code should avoid directly accessing page pool
>>>>> members of struct page.
>>>>>
>>>>> Access ->pp_magic through struct netmem_desc instead of directly
>>>>> accessing it through struct page in page_pool_page_is_pp(). Plus, move
>>>>> page_pool_page_is_pp() from mm.h to netmem.h to use struct netmem_desc
>>>>> without header dependency issue.
>>>>>
>>>>> Signed-off-by: Byungchul Park <byungchul@sk.com>
>>>>> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
>>>>> Reviewed-by: Mina Almasry <almasrymina@google.com>
>>>>> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
>>>>> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
>>>>> Acked-by: Harry Yoo <harry.yoo@oracle.com>
>>>>> ---
>>>>> include/linux/mm.h | 12 ------------
>>>>> include/net/netmem.h | 14 ++++++++++++++
>>>>> mm/page_alloc.c | 1 +
>>>>> 3 files changed, 15 insertions(+), 12 deletions(-)
>>>>>
>>>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>>>> index 0ef2ba0c667a..0b7f7f998085 100644
>>>>> --- a/include/linux/mm.h
>>>>> +++ b/include/linux/mm.h
>>>>> @@ -4172,16 +4172,4 @@ int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status);
>>>>> */
>>>>> #define PP_MAGIC_MASK ~(PP_DMA_INDEX_MASK | 0x3UL)
>>>>>
>>>>> -#ifdef CONFIG_PAGE_POOL
>>>>> -static inline bool page_pool_page_is_pp(struct page *page)
>>>>> -{
>>>>> - return (page->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
>>>>> -}
>>>>> -#else
>>>>> -static inline bool page_pool_page_is_pp(struct page *page)
>>>>> -{
>>>>> - return false;
>>>>> -}
>>>>> -#endif
>>>>> -
>>>>> #endif /* _LINUX_MM_H */
>>>>> diff --git a/include/net/netmem.h b/include/net/netmem.h
>>>>> index d49ed49d250b..3d1b1dfc9ba5 100644
>>>>> --- a/include/net/netmem.h
>>>>> +++ b/include/net/netmem.h
>>>>> @@ -56,6 +56,20 @@ NETMEM_DESC_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
>>>>> */
>>>>> static_assert(sizeof(struct netmem_desc) <= offsetof(struct page, _refcount));
>>>>>
>>>>> +#ifdef CONFIG_PAGE_POOL
>>>>> +static inline bool page_pool_page_is_pp(struct page *page)
>>>>> +{
>>>>> + struct netmem_desc *desc = (struct netmem_desc *)page;
>>>>> +
>>>>> + return (desc->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
>>>>> +}
>>>>> +#else
>>>>> +static inline bool page_pool_page_is_pp(struct page *page)
>>>>> +{
>>>>> + return false;
>>>>> +}
>>>>> +#endif
>>>>
>>>> I wonder how helpful this cleanup is long-term.
>>>>
>>>> page_pool_page_is_pp() is only called from mm/page_alloc.c, right?
>>>
>>> Yes.
>>>
>>>> There, we want to make sure that no pagepool page is ever returned to
>>>> the buddy.
>>>>
>>>> How reasonable is this sanity check to have long-term? Wouldn't we be
>>>> able to check that on some higher-level freeing path?
>>>>
>>>> The reason I am commenting is that once we decouple "struct page" from
>>>> "struct netmem_desc", we'd have to lookup here the corresponding "struct
>>>> netmem_desc".
>>>>
>>>> ... but at that point here (when we free the actual pages), the "struct
>>>> netmem_desc" would likely already have been freed separately (remember:
>>>> it will be dynamically allocated).
>>>>
>>>> With that in mind:
>>>>
>>>> 1) Is there a higher level "struct netmem_desc" freeing path where we
>>>> could check that instead, so we don't have to cast from pages to
>>>> netmem_desc at all.
>>>
>>> I also thought it's too paranoiac. However, I thought it's other issue
>>> than this work. That's why I left the API as is for now, it can be gone
>>> once we get convinced the check is unnecessary in deep buddy. Wrong?
>>>
>>>> 2) How valuable are these sanity checks deep in the buddy?
>>>
>>> That was also what I felt weird on.
>>
>> It seems very useful when I asked last time[1]:
>>
>> |> We have actually used this at Cloudflare to catch some page_pool bugs.
>
> My question is rather, whether there is some higher-level freeing path for netmem_desc where we could check that instead (IOW, earlier).
>
> Or is it really arbitrary put_page() (IOW, we assume that many possible references can be held)?
+Toke, who I talked about this last time.
Maybe he can shed some light on it.
--
Best Regards,
Yan, Zi
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH net-next v6 9/9] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp()
2025-06-23 15:25 ` Zi Yan
@ 2025-06-24 14:43 ` Toke Høiland-Jørgensen
2025-06-24 14:56 ` David Hildenbrand
0 siblings, 1 reply; 35+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-06-24 14:43 UTC (permalink / raw)
To: Zi Yan, David Hildenbrand
Cc: Byungchul Park, willy, netdev, linux-kernel, linux-mm,
kernel_team, kuba, almasrymina, ilias.apalodimas, harry.yoo, hawk,
akpm, davem, john.fastabend, andrew+netdev, asml.silence, tariqt,
edumazet, pabeni, saeedm, leon, ast, daniel, lorenzo.stoakes,
Liam.Howlett, vbabka, rppt, surenb, mhocko, horms, linux-rdma,
bpf, vishal.moola, hannes, jackmanb, jesper@cloudflare.com
Zi Yan <ziy@nvidia.com> writes:
> On 23 Jun 2025, at 10:58, David Hildenbrand wrote:
>
>> On 23.06.25 13:13, Zi Yan wrote:
>>> On 23 Jun 2025, at 6:16, Byungchul Park wrote:
>>>
>>>> On Mon, Jun 23, 2025 at 11:16:43AM +0200, David Hildenbrand wrote:
>>>>> On 20.06.25 06:12, Byungchul Park wrote:
>>>>>> To simplify struct page, the effort to separate its own descriptor from
>>>>>> struct page is required and the work for page pool is on going.
>>>>>>
>>>>>> To achieve that, all the code should avoid directly accessing page pool
>>>>>> members of struct page.
>>>>>>
>>>>>> Access ->pp_magic through struct netmem_desc instead of directly
>>>>>> accessing it through struct page in page_pool_page_is_pp(). Plus, move
>>>>>> page_pool_page_is_pp() from mm.h to netmem.h to use struct netmem_desc
>>>>>> without header dependency issue.
>>>>>>
>>>>>> Signed-off-by: Byungchul Park <byungchul@sk.com>
>>>>>> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
>>>>>> Reviewed-by: Mina Almasry <almasrymina@google.com>
>>>>>> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
>>>>>> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
>>>>>> Acked-by: Harry Yoo <harry.yoo@oracle.com>
>>>>>> ---
>>>>>> include/linux/mm.h | 12 ------------
>>>>>> include/net/netmem.h | 14 ++++++++++++++
>>>>>> mm/page_alloc.c | 1 +
>>>>>> 3 files changed, 15 insertions(+), 12 deletions(-)
>>>>>>
>>>>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>>>>> index 0ef2ba0c667a..0b7f7f998085 100644
>>>>>> --- a/include/linux/mm.h
>>>>>> +++ b/include/linux/mm.h
>>>>>> @@ -4172,16 +4172,4 @@ int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status);
>>>>>> */
>>>>>> #define PP_MAGIC_MASK ~(PP_DMA_INDEX_MASK | 0x3UL)
>>>>>>
>>>>>> -#ifdef CONFIG_PAGE_POOL
>>>>>> -static inline bool page_pool_page_is_pp(struct page *page)
>>>>>> -{
>>>>>> - return (page->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
>>>>>> -}
>>>>>> -#else
>>>>>> -static inline bool page_pool_page_is_pp(struct page *page)
>>>>>> -{
>>>>>> - return false;
>>>>>> -}
>>>>>> -#endif
>>>>>> -
>>>>>> #endif /* _LINUX_MM_H */
>>>>>> diff --git a/include/net/netmem.h b/include/net/netmem.h
>>>>>> index d49ed49d250b..3d1b1dfc9ba5 100644
>>>>>> --- a/include/net/netmem.h
>>>>>> +++ b/include/net/netmem.h
>>>>>> @@ -56,6 +56,20 @@ NETMEM_DESC_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
>>>>>> */
>>>>>> static_assert(sizeof(struct netmem_desc) <= offsetof(struct page, _refcount));
>>>>>>
>>>>>> +#ifdef CONFIG_PAGE_POOL
>>>>>> +static inline bool page_pool_page_is_pp(struct page *page)
>>>>>> +{
>>>>>> + struct netmem_desc *desc = (struct netmem_desc *)page;
>>>>>> +
>>>>>> + return (desc->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
>>>>>> +}
>>>>>> +#else
>>>>>> +static inline bool page_pool_page_is_pp(struct page *page)
>>>>>> +{
>>>>>> + return false;
>>>>>> +}
>>>>>> +#endif
>>>>>
>>>>> I wonder how helpful this cleanup is long-term.
>>>>>
>>>>> page_pool_page_is_pp() is only called from mm/page_alloc.c, right?
>>>>
>>>> Yes.
>>>>
>>>>> There, we want to make sure that no pagepool page is ever returned to
>>>>> the buddy.
>>>>>
>>>>> How reasonable is this sanity check to have long-term? Wouldn't we be
>>>>> able to check that on some higher-level freeing path?
>>>>>
>>>>> The reason I am commenting is that once we decouple "struct page" from
>>>>> "struct netmem_desc", we'd have to lookup here the corresponding "struct
>>>>> netmem_desc".
>>>>>
>>>>> ... but at that point here (when we free the actual pages), the "struct
>>>>> netmem_desc" would likely already have been freed separately (remember:
>>>>> it will be dynamically allocated).
>>>>>
>>>>> With that in mind:
>>>>>
>>>>> 1) Is there a higher level "struct netmem_desc" freeing path where we
>>>>> could check that instead, so we don't have to cast from pages to
>>>>> netmem_desc at all.
>>>>
>>>> I also thought it's too paranoiac. However, I thought it's other issue
>>>> than this work. That's why I left the API as is for now, it can be gone
>>>> once we get convinced the check is unnecessary in deep buddy. Wrong?
>>>>
>>>>> 2) How valuable are these sanity checks deep in the buddy?
>>>>
>>>> That was also what I felt weird on.
>>>
>>> It seems very useful when I asked last time[1]:
>>>
>>> |> We have actually used this at Cloudflare to catch some page_pool bugs.
>>
>> My question is rather, whether there is some higher-level freeing path for netmem_desc where we could check that instead (IOW, earlier).
>>
>> Or is it really arbitrary put_page() (IOW, we assume that many possible references can be held)?
>
> +Toke, who I talked about this last time.
>
> Maybe he can shed some light on it.
As others have pointed out, basically, AFAIU: Yes, pages are *supposed*
to go through a common freeing path where this check could reside, but
we've had bugs where they ended up leaking anyway, which is why this
check in MM was added in the first place.
I don't recall the specifics of *what* the bug was; +Jesper who maybe does?
-Toke
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH net-next v6 9/9] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp()
2025-06-24 14:43 ` Toke Høiland-Jørgensen
@ 2025-06-24 14:56 ` David Hildenbrand
2025-06-25 1:24 ` Byungchul Park
0 siblings, 1 reply; 35+ messages in thread
From: David Hildenbrand @ 2025-06-24 14:56 UTC (permalink / raw)
To: Toke Høiland-Jørgensen, Zi Yan
Cc: Byungchul Park, willy, netdev, linux-kernel, linux-mm,
kernel_team, kuba, almasrymina, ilias.apalodimas, harry.yoo, hawk,
akpm, davem, john.fastabend, andrew+netdev, asml.silence, tariqt,
edumazet, pabeni, saeedm, leon, ast, daniel, lorenzo.stoakes,
Liam.Howlett, vbabka, rppt, surenb, mhocko, horms, linux-rdma,
bpf, vishal.moola, hannes, jackmanb, jesper@cloudflare.com
On 24.06.25 16:43, Toke Høiland-Jørgensen wrote:
> Zi Yan <ziy@nvidia.com> writes:
>
>> On 23 Jun 2025, at 10:58, David Hildenbrand wrote:
>>
>>> On 23.06.25 13:13, Zi Yan wrote:
>>>> On 23 Jun 2025, at 6:16, Byungchul Park wrote:
>>>>
>>>>> On Mon, Jun 23, 2025 at 11:16:43AM +0200, David Hildenbrand wrote:
>>>>>> On 20.06.25 06:12, Byungchul Park wrote:
>>>>>>> To simplify struct page, the effort to separate its own descriptor from
>>>>>>> struct page is required and the work for page pool is on going.
>>>>>>>
>>>>>>> To achieve that, all the code should avoid directly accessing page pool
>>>>>>> members of struct page.
>>>>>>>
>>>>>>> Access ->pp_magic through struct netmem_desc instead of directly
>>>>>>> accessing it through struct page in page_pool_page_is_pp(). Plus, move
>>>>>>> page_pool_page_is_pp() from mm.h to netmem.h to use struct netmem_desc
>>>>>>> without header dependency issue.
>>>>>>>
>>>>>>> Signed-off-by: Byungchul Park <byungchul@sk.com>
>>>>>>> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
>>>>>>> Reviewed-by: Mina Almasry <almasrymina@google.com>
>>>>>>> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
>>>>>>> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
>>>>>>> Acked-by: Harry Yoo <harry.yoo@oracle.com>
>>>>>>> ---
>>>>>>> include/linux/mm.h | 12 ------------
>>>>>>> include/net/netmem.h | 14 ++++++++++++++
>>>>>>> mm/page_alloc.c | 1 +
>>>>>>> 3 files changed, 15 insertions(+), 12 deletions(-)
>>>>>>>
>>>>>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>>>>>> index 0ef2ba0c667a..0b7f7f998085 100644
>>>>>>> --- a/include/linux/mm.h
>>>>>>> +++ b/include/linux/mm.h
>>>>>>> @@ -4172,16 +4172,4 @@ int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status);
>>>>>>> */
>>>>>>> #define PP_MAGIC_MASK ~(PP_DMA_INDEX_MASK | 0x3UL)
>>>>>>>
>>>>>>> -#ifdef CONFIG_PAGE_POOL
>>>>>>> -static inline bool page_pool_page_is_pp(struct page *page)
>>>>>>> -{
>>>>>>> - return (page->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
>>>>>>> -}
>>>>>>> -#else
>>>>>>> -static inline bool page_pool_page_is_pp(struct page *page)
>>>>>>> -{
>>>>>>> - return false;
>>>>>>> -}
>>>>>>> -#endif
>>>>>>> -
>>>>>>> #endif /* _LINUX_MM_H */
>>>>>>> diff --git a/include/net/netmem.h b/include/net/netmem.h
>>>>>>> index d49ed49d250b..3d1b1dfc9ba5 100644
>>>>>>> --- a/include/net/netmem.h
>>>>>>> +++ b/include/net/netmem.h
>>>>>>> @@ -56,6 +56,20 @@ NETMEM_DESC_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
>>>>>>> */
>>>>>>> static_assert(sizeof(struct netmem_desc) <= offsetof(struct page, _refcount));
>>>>>>>
>>>>>>> +#ifdef CONFIG_PAGE_POOL
>>>>>>> +static inline bool page_pool_page_is_pp(struct page *page)
>>>>>>> +{
>>>>>>> + struct netmem_desc *desc = (struct netmem_desc *)page;
>>>>>>> +
>>>>>>> + return (desc->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
>>>>>>> +}
>>>>>>> +#else
>>>>>>> +static inline bool page_pool_page_is_pp(struct page *page)
>>>>>>> +{
>>>>>>> + return false;
>>>>>>> +}
>>>>>>> +#endif
>>>>>>
>>>>>> I wonder how helpful this cleanup is long-term.
>>>>>>
>>>>>> page_pool_page_is_pp() is only called from mm/page_alloc.c, right?
>>>>>
>>>>> Yes.
>>>>>
>>>>>> There, we want to make sure that no pagepool page is ever returned to
>>>>>> the buddy.
>>>>>>
>>>>>> How reasonable is this sanity check to have long-term? Wouldn't we be
>>>>>> able to check that on some higher-level freeing path?
>>>>>>
>>>>>> The reason I am commenting is that once we decouple "struct page" from
>>>>>> "struct netmem_desc", we'd have to lookup here the corresponding "struct
>>>>>> netmem_desc".
>>>>>>
>>>>>> ... but at that point here (when we free the actual pages), the "struct
>>>>>> netmem_desc" would likely already have been freed separately (remember:
>>>>>> it will be dynamically allocated).
>>>>>>
>>>>>> With that in mind:
>>>>>>
>>>>>> 1) Is there a higher level "struct netmem_desc" freeing path where we
>>>>>> could check that instead, so we don't have to cast from pages to
>>>>>> netmem_desc at all.
>>>>>
>>>>> I also thought it's too paranoiac. However, I thought it's other issue
>>>>> than this work. That's why I left the API as is for now, it can be gone
>>>>> once we get convinced the check is unnecessary in deep buddy. Wrong?
>>>>>
>>>>>> 2) How valuable are these sanity checks deep in the buddy?
>>>>>
>>>>> That was also what I felt weird on.
>>>>
>>>> It seems very useful when I asked last time[1]:
>>>>
>>>> |> We have actually used this at Cloudflare to catch some page_pool bugs.
>>>
>>> My question is rather, whether there is some higher-level freeing path for netmem_desc where we could check that instead (IOW, earlier).
>>>
>>> Or is it really arbitrary put_page() (IOW, we assume that many possible references can be held)?
>>
>> +Toke, who I talked about this last time.
>>
>> Maybe he can shed some light on it.
>
> As others have pointed out, basically, AFAIU: Yes, pages are *supposed*
> to go through a common freeing path where this check could reside, but
> we've had bugs where they ended up leaking anyway, which is why this
> check in MM was added in the first place.
Okay, thanks. If we could be using a page type instead to catch such
leaks to the page allocator, we could implement it without any such
pp-specific checks.
page types are stored in page->page_type and overlay page->_mapcount
right now.
Looking at "struct netmem_desc", page->_mapcount should not be overlayed
(good!).
So, you could be setting the type when creating a "struct netmem_desc"
page, and clearing the type when about to free the page. In the buddy,
you can then check without any casts from page to whatever else if the
type is still unexpectedly set. If still set, you know that there is
unexpected freeing.
I'll note that page types will be a building blocks of memdescs, to
descibe "what we are pointing at". See
https://kernelnewbies.org/MatthewWilcox/Memdescs
Willy already planned for a "Bump" type; I assume this would now be
"NMDesc" or sth like that IIUC.
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH net-next v6 9/9] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp()
2025-06-24 14:56 ` David Hildenbrand
@ 2025-06-25 1:24 ` Byungchul Park
2025-06-26 6:35 ` Byungchul Park
0 siblings, 1 reply; 35+ messages in thread
From: Byungchul Park @ 2025-06-25 1:24 UTC (permalink / raw)
To: David Hildenbrand
Cc: Toke Høiland-Jørgensen, Zi Yan, willy, netdev,
linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, tariqt, edumazet, pabeni, saeedm,
leon, ast, daniel, lorenzo.stoakes, Liam.Howlett, vbabka, rppt,
surenb, mhocko, horms, linux-rdma, bpf, vishal.moola, hannes,
jackmanb, jesper@cloudflare.com
On Tue, Jun 24, 2025 at 04:56:32PM +0200, David Hildenbrand wrote:
>
> On 24.06.25 16:43, Toke Høiland-Jørgensen wrote:
> > Zi Yan <ziy@nvidia.com> writes:
> >
> > > On 23 Jun 2025, at 10:58, David Hildenbrand wrote:
> > >
> > > > On 23.06.25 13:13, Zi Yan wrote:
> > > > > On 23 Jun 2025, at 6:16, Byungchul Park wrote:
> > > > >
> > > > > > On Mon, Jun 23, 2025 at 11:16:43AM +0200, David Hildenbrand wrote:
> > > > > > > On 20.06.25 06:12, Byungchul Park wrote:
> > > > > > > > To simplify struct page, the effort to separate its own descriptor from
> > > > > > > > struct page is required and the work for page pool is on going.
> > > > > > > >
> > > > > > > > To achieve that, all the code should avoid directly accessing page pool
> > > > > > > > members of struct page.
> > > > > > > >
> > > > > > > > Access ->pp_magic through struct netmem_desc instead of directly
> > > > > > > > accessing it through struct page in page_pool_page_is_pp(). Plus, move
> > > > > > > > page_pool_page_is_pp() from mm.h to netmem.h to use struct netmem_desc
> > > > > > > > without header dependency issue.
> > > > > > > >
> > > > > > > > Signed-off-by: Byungchul Park <byungchul@sk.com>
> > > > > > > > Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
> > > > > > > > Reviewed-by: Mina Almasry <almasrymina@google.com>
> > > > > > > > Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
> > > > > > > > Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
> > > > > > > > Acked-by: Harry Yoo <harry.yoo@oracle.com>
> > > > > > > > ---
> > > > > > > > include/linux/mm.h | 12 ------------
> > > > > > > > include/net/netmem.h | 14 ++++++++++++++
> > > > > > > > mm/page_alloc.c | 1 +
> > > > > > > > 3 files changed, 15 insertions(+), 12 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > > > > > > > index 0ef2ba0c667a..0b7f7f998085 100644
> > > > > > > > --- a/include/linux/mm.h
> > > > > > > > +++ b/include/linux/mm.h
> > > > > > > > @@ -4172,16 +4172,4 @@ int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status);
> > > > > > > > */
> > > > > > > > #define PP_MAGIC_MASK ~(PP_DMA_INDEX_MASK | 0x3UL)
> > > > > > > >
> > > > > > > > -#ifdef CONFIG_PAGE_POOL
> > > > > > > > -static inline bool page_pool_page_is_pp(struct page *page)
> > > > > > > > -{
> > > > > > > > - return (page->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
> > > > > > > > -}
> > > > > > > > -#else
> > > > > > > > -static inline bool page_pool_page_is_pp(struct page *page)
> > > > > > > > -{
> > > > > > > > - return false;
> > > > > > > > -}
> > > > > > > > -#endif
> > > > > > > > -
> > > > > > > > #endif /* _LINUX_MM_H */
> > > > > > > > diff --git a/include/net/netmem.h b/include/net/netmem.h
> > > > > > > > index d49ed49d250b..3d1b1dfc9ba5 100644
> > > > > > > > --- a/include/net/netmem.h
> > > > > > > > +++ b/include/net/netmem.h
> > > > > > > > @@ -56,6 +56,20 @@ NETMEM_DESC_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
> > > > > > > > */
> > > > > > > > static_assert(sizeof(struct netmem_desc) <= offsetof(struct page, _refcount));
> > > > > > > >
> > > > > > > > +#ifdef CONFIG_PAGE_POOL
> > > > > > > > +static inline bool page_pool_page_is_pp(struct page *page)
> > > > > > > > +{
> > > > > > > > + struct netmem_desc *desc = (struct netmem_desc *)page;
> > > > > > > > +
> > > > > > > > + return (desc->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
> > > > > > > > +}
> > > > > > > > +#else
> > > > > > > > +static inline bool page_pool_page_is_pp(struct page *page)
> > > > > > > > +{
> > > > > > > > + return false;
> > > > > > > > +}
> > > > > > > > +#endif
> > > > > > >
> > > > > > > I wonder how helpful this cleanup is long-term.
> > > > > > >
> > > > > > > page_pool_page_is_pp() is only called from mm/page_alloc.c, right?
> > > > > >
> > > > > > Yes.
> > > > > >
> > > > > > > There, we want to make sure that no pagepool page is ever returned to
> > > > > > > the buddy.
> > > > > > >
> > > > > > > How reasonable is this sanity check to have long-term? Wouldn't we be
> > > > > > > able to check that on some higher-level freeing path?
> > > > > > >
> > > > > > > The reason I am commenting is that once we decouple "struct page" from
> > > > > > > "struct netmem_desc", we'd have to lookup here the corresponding "struct
> > > > > > > netmem_desc".
> > > > > > >
> > > > > > > ... but at that point here (when we free the actual pages), the "struct
> > > > > > > netmem_desc" would likely already have been freed separately (remember:
> > > > > > > it will be dynamically allocated).
> > > > > > >
> > > > > > > With that in mind:
> > > > > > >
> > > > > > > 1) Is there a higher level "struct netmem_desc" freeing path where we
> > > > > > > could check that instead, so we don't have to cast from pages to
> > > > > > > netmem_desc at all.
> > > > > >
> > > > > > I also thought it's too paranoiac. However, I thought it's other issue
> > > > > > than this work. That's why I left the API as is for now, it can be gone
> > > > > > once we get convinced the check is unnecessary in deep buddy. Wrong?
> > > > > >
> > > > > > > 2) How valuable are these sanity checks deep in the buddy?
> > > > > >
> > > > > > That was also what I felt weird on.
> > > > >
> > > > > It seems very useful when I asked last time[1]:
> > > > >
> > > > > |> We have actually used this at Cloudflare to catch some page_pool bugs.
> > > >
> > > > My question is rather, whether there is some higher-level freeing path for netmem_desc where we could check that instead (IOW, earlier).
> > > >
> > > > Or is it really arbitrary put_page() (IOW, we assume that many possible references can be held)?
> > >
> > > +Toke, who I talked about this last time.
> > >
> > > Maybe he can shed some light on it.
> >
> > As others have pointed out, basically, AFAIU: Yes, pages are *supposed*
> > to go through a common freeing path where this check could reside, but
> > we've had bugs where they ended up leaking anyway, which is why this
> > check in MM was added in the first place.
>
> Okay, thanks. If we could be using a page type instead to catch such
> leaks to the page allocator, we could implement it without any such
> pp-specific checks.
>
> page types are stored in page->page_type and overlay page->_mapcount
> right now.
>
> Looking at "struct netmem_desc", page->_mapcount should not be overlayed
> (good!).
>
>
> So, you could be setting the type when creating a "struct netmem_desc"
> page, and clearing the type when about to free the page. In the buddy,
> you can then check without any casts from page to whatever else if the
> type is still unexpectedly set. If still set, you know that there is
> unexpected freeing.
Yeah, this is what we all were looking forward to. However, I decided
to use the pp field for the checking since it's not ready for now.
So.. Is the current approach okay *for now*, even though the approach
should be updated once the type can be checked inside mm later?
Or do you want me to wait for it to be ready before this netmem work to
remove the pp fields from struct page?
Byungchul
>
> I'll note that page types will be a building blocks of memdescs, to
> descibe "what we are pointing at". See
>
> https://kernelnewbies.org/MatthewWilcox/Memdescs
>
> Willy already planned for a "Bump" type; I assume this would now be
> "NMDesc" or sth like that IIUC.
>
> --
> Cheers,
>
> David / dhildenb
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH net-next v6 9/9] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp()
2025-06-25 1:24 ` Byungchul Park
@ 2025-06-26 6:35 ` Byungchul Park
0 siblings, 0 replies; 35+ messages in thread
From: Byungchul Park @ 2025-06-26 6:35 UTC (permalink / raw)
To: David Hildenbrand
Cc: Toke Høiland-Jørgensen, Zi Yan, willy, netdev,
linux-kernel, linux-mm, kernel_team, kuba, almasrymina,
ilias.apalodimas, harry.yoo, hawk, akpm, davem, john.fastabend,
andrew+netdev, asml.silence, tariqt, edumazet, pabeni, saeedm,
leon, ast, daniel, lorenzo.stoakes, Liam.Howlett, vbabka, rppt,
surenb, mhocko, horms, linux-rdma, bpf, vishal.moola, hannes,
jackmanb, jesper@cloudflare.com
On Wed, Jun 25, 2025 at 10:24:32AM +0900, Byungchul Park wrote:
> On Tue, Jun 24, 2025 at 04:56:32PM +0200, David Hildenbrand wrote:
> >
> > On 24.06.25 16:43, Toke Høiland-Jørgensen wrote:
> > > Zi Yan <ziy@nvidia.com> writes:
> > >
> > > > On 23 Jun 2025, at 10:58, David Hildenbrand wrote:
> > > >
> > > > > On 23.06.25 13:13, Zi Yan wrote:
> > > > > > On 23 Jun 2025, at 6:16, Byungchul Park wrote:
> > > > > >
> > > > > > > On Mon, Jun 23, 2025 at 11:16:43AM +0200, David Hildenbrand wrote:
> > > > > > > > On 20.06.25 06:12, Byungchul Park wrote:
> > > > > > > > > To simplify struct page, the effort to separate its own descriptor from
> > > > > > > > > struct page is required and the work for page pool is on going.
> > > > > > > > >
> > > > > > > > > To achieve that, all the code should avoid directly accessing page pool
> > > > > > > > > members of struct page.
> > > > > > > > >
> > > > > > > > > Access ->pp_magic through struct netmem_desc instead of directly
> > > > > > > > > accessing it through struct page in page_pool_page_is_pp(). Plus, move
> > > > > > > > > page_pool_page_is_pp() from mm.h to netmem.h to use struct netmem_desc
> > > > > > > > > without header dependency issue.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Byungchul Park <byungchul@sk.com>
> > > > > > > > > Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
> > > > > > > > > Reviewed-by: Mina Almasry <almasrymina@google.com>
> > > > > > > > > Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
> > > > > > > > > Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
> > > > > > > > > Acked-by: Harry Yoo <harry.yoo@oracle.com>
> > > > > > > > > ---
> > > > > > > > > include/linux/mm.h | 12 ------------
> > > > > > > > > include/net/netmem.h | 14 ++++++++++++++
> > > > > > > > > mm/page_alloc.c | 1 +
> > > > > > > > > 3 files changed, 15 insertions(+), 12 deletions(-)
> > > > > > > > >
> > > > > > > > > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > > > > > > > > index 0ef2ba0c667a..0b7f7f998085 100644
> > > > > > > > > --- a/include/linux/mm.h
> > > > > > > > > +++ b/include/linux/mm.h
> > > > > > > > > @@ -4172,16 +4172,4 @@ int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status);
> > > > > > > > > */
> > > > > > > > > #define PP_MAGIC_MASK ~(PP_DMA_INDEX_MASK | 0x3UL)
> > > > > > > > >
> > > > > > > > > -#ifdef CONFIG_PAGE_POOL
> > > > > > > > > -static inline bool page_pool_page_is_pp(struct page *page)
> > > > > > > > > -{
> > > > > > > > > - return (page->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
> > > > > > > > > -}
> > > > > > > > > -#else
> > > > > > > > > -static inline bool page_pool_page_is_pp(struct page *page)
> > > > > > > > > -{
> > > > > > > > > - return false;
> > > > > > > > > -}
> > > > > > > > > -#endif
> > > > > > > > > -
> > > > > > > > > #endif /* _LINUX_MM_H */
> > > > > > > > > diff --git a/include/net/netmem.h b/include/net/netmem.h
> > > > > > > > > index d49ed49d250b..3d1b1dfc9ba5 100644
> > > > > > > > > --- a/include/net/netmem.h
> > > > > > > > > +++ b/include/net/netmem.h
> > > > > > > > > @@ -56,6 +56,20 @@ NETMEM_DESC_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
> > > > > > > > > */
> > > > > > > > > static_assert(sizeof(struct netmem_desc) <= offsetof(struct page, _refcount));
> > > > > > > > >
> > > > > > > > > +#ifdef CONFIG_PAGE_POOL
> > > > > > > > > +static inline bool page_pool_page_is_pp(struct page *page)
> > > > > > > > > +{
> > > > > > > > > + struct netmem_desc *desc = (struct netmem_desc *)page;
> > > > > > > > > +
> > > > > > > > > + return (desc->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
> > > > > > > > > +}
> > > > > > > > > +#else
> > > > > > > > > +static inline bool page_pool_page_is_pp(struct page *page)
> > > > > > > > > +{
> > > > > > > > > + return false;
> > > > > > > > > +}
> > > > > > > > > +#endif
> > > > > > > >
> > > > > > > > I wonder how helpful this cleanup is long-term.
> > > > > > > >
> > > > > > > > page_pool_page_is_pp() is only called from mm/page_alloc.c, right?
> > > > > > >
> > > > > > > Yes.
> > > > > > >
> > > > > > > > There, we want to make sure that no pagepool page is ever returned to
> > > > > > > > the buddy.
> > > > > > > >
> > > > > > > > How reasonable is this sanity check to have long-term? Wouldn't we be
> > > > > > > > able to check that on some higher-level freeing path?
> > > > > > > >
> > > > > > > > The reason I am commenting is that once we decouple "struct page" from
> > > > > > > > "struct netmem_desc", we'd have to lookup here the corresponding "struct
> > > > > > > > netmem_desc".
> > > > > > > >
> > > > > > > > ... but at that point here (when we free the actual pages), the "struct
> > > > > > > > netmem_desc" would likely already have been freed separately (remember:
> > > > > > > > it will be dynamically allocated).
> > > > > > > >
> > > > > > > > With that in mind:
> > > > > > > >
> > > > > > > > 1) Is there a higher level "struct netmem_desc" freeing path where we
> > > > > > > > could check that instead, so we don't have to cast from pages to
> > > > > > > > netmem_desc at all.
> > > > > > >
> > > > > > > I also thought it's too paranoiac. However, I thought it's other issue
> > > > > > > than this work. That's why I left the API as is for now, it can be gone
> > > > > > > once we get convinced the check is unnecessary in deep buddy. Wrong?
> > > > > > >
> > > > > > > > 2) How valuable are these sanity checks deep in the buddy?
> > > > > > >
> > > > > > > That was also what I felt weird on.
> > > > > >
> > > > > > It seems very useful when I asked last time[1]:
> > > > > >
> > > > > > |> We have actually used this at Cloudflare to catch some page_pool bugs.
> > > > >
> > > > > My question is rather, whether there is some higher-level freeing path for netmem_desc where we could check that instead (IOW, earlier).
> > > > >
> > > > > Or is it really arbitrary put_page() (IOW, we assume that many possible references can be held)?
> > > >
> > > > +Toke, who I talked about this last time.
> > > >
> > > > Maybe he can shed some light on it.
> > >
> > > As others have pointed out, basically, AFAIU: Yes, pages are *supposed*
> > > to go through a common freeing path where this check could reside, but
> > > we've had bugs where they ended up leaking anyway, which is why this
> > > check in MM was added in the first place.
> >
> > Okay, thanks. If we could be using a page type instead to catch such
> > leaks to the page allocator, we could implement it without any such
> > pp-specific checks.
> >
> > page types are stored in page->page_type and overlay page->_mapcount
> > right now.
> >
> > Looking at "struct netmem_desc", page->_mapcount should not be overlayed
> > (good!).
> >
> >
> > So, you could be setting the type when creating a "struct netmem_desc"
> > page, and clearing the type when about to free the page. In the buddy,
> > you can then check without any casts from page to whatever else if the
> > type is still unexpectedly set. If still set, you know that there is
> > unexpected freeing.
>
> Yeah, this is what we all were looking forward to. However, I decided
> to use the pp field for the checking since it's not ready for now.
>
> So.. Is the current approach okay *for now*, even though the approach
> should be updated once the type can be checked inside mm later?
>
> Or do you want me to wait for it to be ready before this netmem work to
> remove the pp fields from struct page?
Just in case you may get it wrong, I'm still waiting for your answer,
even though I've sent v7 with this patch excluded.
To Willy and David,
Is the work to add pp type(or bump) going to be done shortly? If yes,
I should wait for it. If no, it'd better use the pp magic field *for
now*.
Byungchul
> Byungchul
> >
> > I'll note that page types will be a building blocks of memdescs, to
> > descibe "what we are pointing at". See
> >
> > https://kernelnewbies.org/MatthewWilcox/Memdescs
> >
> > Willy already planned for a "Bump" type; I assume this would now be
> > "NMDesc" or sth like that IIUC.
> >
> > --
> > Cheers,
> >
> > David / dhildenb
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH net-next v6 9/9] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp()
2025-06-23 14:58 ` David Hildenbrand
2025-06-23 15:25 ` Zi Yan
@ 2025-06-23 17:06 ` Pavel Begunkov
2025-06-23 17:28 ` Mina Almasry
1 sibling, 1 reply; 35+ messages in thread
From: Pavel Begunkov @ 2025-06-23 17:06 UTC (permalink / raw)
To: David Hildenbrand, Zi Yan, Byungchul Park
Cc: willy, netdev, linux-kernel, linux-mm, kernel_team, kuba,
almasrymina, ilias.apalodimas, harry.yoo, hawk, akpm, davem,
john.fastabend, andrew+netdev, toke, tariqt, edumazet, pabeni,
saeedm, leon, ast, daniel, lorenzo.stoakes, Liam.Howlett, vbabka,
rppt, surenb, mhocko, horms, linux-rdma, bpf, vishal.moola,
hannes, jackmanb
On 6/23/25 15:58, David Hildenbrand wrote:
> On 23.06.25 13:13, Zi Yan wrote:
>> On 23 Jun 2025, at 6:16, Byungchul Park wrote:
>>
>>> On Mon, Jun 23, 2025 at 11:16:43AM +0200, David Hildenbrand wrote:
>>>> On 20.06.25 06:12, Byungchul Park wrote:
>>>>> To simplify struct page, the effort to separate its own descriptor from
>>>>> struct page is required and the work for page pool is on going.
>>>>>
>>>>> To achieve that, all the code should avoid directly accessing page pool
>>>>> members of struct page.
>>>>>
>>>>> Access ->pp_magic through struct netmem_desc instead of directly
>>>>> accessing it through struct page in page_pool_page_is_pp(). Plus, move
>>>>> page_pool_page_is_pp() from mm.h to netmem.h to use struct netmem_desc
>>>>> without header dependency issue.
>>>>>
>>>>> Signed-off-by: Byungchul Park <byungchul@sk.com>
>>>>> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
>>>>> Reviewed-by: Mina Almasry <almasrymina@google.com>
>>>>> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
>>>>> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
>>>>> Acked-by: Harry Yoo <harry.yoo@oracle.com>
>>>>> ---
>>>>> include/linux/mm.h | 12 ------------
>>>>> include/net/netmem.h | 14 ++++++++++++++
>>>>> mm/page_alloc.c | 1 +
>>>>> 3 files changed, 15 insertions(+), 12 deletions(-)
>>>>>
>>>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>>>> index 0ef2ba0c667a..0b7f7f998085 100644
>>>>> --- a/include/linux/mm.h
>>>>> +++ b/include/linux/mm.h
>>>>> @@ -4172,16 +4172,4 @@ int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status);
>>>>> */
>>>>> #define PP_MAGIC_MASK ~(PP_DMA_INDEX_MASK | 0x3UL)
>>>>>
>>>>> -#ifdef CONFIG_PAGE_POOL
>>>>> -static inline bool page_pool_page_is_pp(struct page *page)
>>>>> -{
>>>>> - return (page->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
>>>>> -}
>>>>> -#else
>>>>> -static inline bool page_pool_page_is_pp(struct page *page)
>>>>> -{
>>>>> - return false;
>>>>> -}
>>>>> -#endif
>>>>> -
>>>>> #endif /* _LINUX_MM_H */
>>>>> diff --git a/include/net/netmem.h b/include/net/netmem.h
>>>>> index d49ed49d250b..3d1b1dfc9ba5 100644
>>>>> --- a/include/net/netmem.h
>>>>> +++ b/include/net/netmem.h
>>>>> @@ -56,6 +56,20 @@ NETMEM_DESC_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
>>>>> */
>>>>> static_assert(sizeof(struct netmem_desc) <= offsetof(struct page, _refcount));
>>>>>
>>>>> +#ifdef CONFIG_PAGE_POOL
>>>>> +static inline bool page_pool_page_is_pp(struct page *page)
>>>>> +{
>>>>> + struct netmem_desc *desc = (struct netmem_desc *)page;
>>>>> +
>>>>> + return (desc->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
>>>>> +}
>>>>> +#else
>>>>> +static inline bool page_pool_page_is_pp(struct page *page)
>>>>> +{
>>>>> + return false;
>>>>> +}
>>>>> +#endif
>>>>
>>>> I wonder how helpful this cleanup is long-term.
>>>>
>>>> page_pool_page_is_pp() is only called from mm/page_alloc.c, right?
>>>
>>> Yes.
>>>
>>>> There, we want to make sure that no pagepool page is ever returned to
>>>> the buddy.
>>>>
>>>> How reasonable is this sanity check to have long-term? Wouldn't we be
>>>> able to check that on some higher-level freeing path?
>>>>
>>>> The reason I am commenting is that once we decouple "struct page" from
>>>> "struct netmem_desc", we'd have to lookup here the corresponding "struct
>>>> netmem_desc".
>>>>
>>>> ... but at that point here (when we free the actual pages), the "struct
>>>> netmem_desc" would likely already have been freed separately (remember:
>>>> it will be dynamically allocated).
>>>>
>>>> With that in mind:
>>>>
>>>> 1) Is there a higher level "struct netmem_desc" freeing path where we
>>>> could check that instead, so we don't have to cast from pages to
>>>> netmem_desc at all.
As you said, it's just a sanity check, all page pool pages should
be freed by the networking code. It checks the ownership with
netmem_is_pp(), which is basically the same as page_pool_page_is_pp()
but done though some aliasing.
static inline bool netmem_is_pp(netmem_ref netmem)
{
return (netmem_get_pp_magic(netmem) & PP_MAGIC_MASK) == PP_SIGNATURE;
}
I assume there is no point in moving the check to skbuff.c as it
already does exactly same test, but we can probably just kill it.
--
Pavel Begunkov
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH net-next v6 9/9] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp()
2025-06-23 17:06 ` Pavel Begunkov
@ 2025-06-23 17:28 ` Mina Almasry
2025-06-23 18:09 ` Vlastimil Babka
2025-06-23 18:14 ` Pavel Begunkov
0 siblings, 2 replies; 35+ messages in thread
From: Mina Almasry @ 2025-06-23 17:28 UTC (permalink / raw)
To: Pavel Begunkov
Cc: David Hildenbrand, Zi Yan, Byungchul Park, willy, netdev,
linux-kernel, linux-mm, kernel_team, kuba, ilias.apalodimas,
harry.yoo, hawk, akpm, davem, john.fastabend, andrew+netdev, toke,
tariqt, edumazet, pabeni, saeedm, leon, ast, daniel,
lorenzo.stoakes, Liam.Howlett, vbabka, rppt, surenb, mhocko,
horms, linux-rdma, bpf, vishal.moola, hannes, jackmanb
On Mon, Jun 23, 2025 at 10:05 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>
> On 6/23/25 15:58, David Hildenbrand wrote:
> > On 23.06.25 13:13, Zi Yan wrote:
> >> On 23 Jun 2025, at 6:16, Byungchul Park wrote:
> >>
> >>> On Mon, Jun 23, 2025 at 11:16:43AM +0200, David Hildenbrand wrote:
> >>>> On 20.06.25 06:12, Byungchul Park wrote:
> >>>>> To simplify struct page, the effort to separate its own descriptor from
> >>>>> struct page is required and the work for page pool is on going.
> >>>>>
> >>>>> To achieve that, all the code should avoid directly accessing page pool
> >>>>> members of struct page.
> >>>>>
> >>>>> Access ->pp_magic through struct netmem_desc instead of directly
> >>>>> accessing it through struct page in page_pool_page_is_pp(). Plus, move
> >>>>> page_pool_page_is_pp() from mm.h to netmem.h to use struct netmem_desc
> >>>>> without header dependency issue.
> >>>>>
> >>>>> Signed-off-by: Byungchul Park <byungchul@sk.com>
> >>>>> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
> >>>>> Reviewed-by: Mina Almasry <almasrymina@google.com>
> >>>>> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
> >>>>> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
> >>>>> Acked-by: Harry Yoo <harry.yoo@oracle.com>
> >>>>> ---
> >>>>> include/linux/mm.h | 12 ------------
> >>>>> include/net/netmem.h | 14 ++++++++++++++
> >>>>> mm/page_alloc.c | 1 +
> >>>>> 3 files changed, 15 insertions(+), 12 deletions(-)
> >>>>>
> >>>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
> >>>>> index 0ef2ba0c667a..0b7f7f998085 100644
> >>>>> --- a/include/linux/mm.h
> >>>>> +++ b/include/linux/mm.h
> >>>>> @@ -4172,16 +4172,4 @@ int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status);
> >>>>> */
> >>>>> #define PP_MAGIC_MASK ~(PP_DMA_INDEX_MASK | 0x3UL)
> >>>>>
> >>>>> -#ifdef CONFIG_PAGE_POOL
> >>>>> -static inline bool page_pool_page_is_pp(struct page *page)
> >>>>> -{
> >>>>> - return (page->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
> >>>>> -}
> >>>>> -#else
> >>>>> -static inline bool page_pool_page_is_pp(struct page *page)
> >>>>> -{
> >>>>> - return false;
> >>>>> -}
> >>>>> -#endif
> >>>>> -
> >>>>> #endif /* _LINUX_MM_H */
> >>>>> diff --git a/include/net/netmem.h b/include/net/netmem.h
> >>>>> index d49ed49d250b..3d1b1dfc9ba5 100644
> >>>>> --- a/include/net/netmem.h
> >>>>> +++ b/include/net/netmem.h
> >>>>> @@ -56,6 +56,20 @@ NETMEM_DESC_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
> >>>>> */
> >>>>> static_assert(sizeof(struct netmem_desc) <= offsetof(struct page, _refcount));
> >>>>>
> >>>>> +#ifdef CONFIG_PAGE_POOL
> >>>>> +static inline bool page_pool_page_is_pp(struct page *page)
> >>>>> +{
> >>>>> + struct netmem_desc *desc = (struct netmem_desc *)page;
> >>>>> +
> >>>>> + return (desc->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
> >>>>> +}
> >>>>> +#else
> >>>>> +static inline bool page_pool_page_is_pp(struct page *page)
> >>>>> +{
> >>>>> + return false;
> >>>>> +}
> >>>>> +#endif
> >>>>
> >>>> I wonder how helpful this cleanup is long-term.
> >>>>
> >>>> page_pool_page_is_pp() is only called from mm/page_alloc.c, right?
> >>>
> >>> Yes.
> >>>
> >>>> There, we want to make sure that no pagepool page is ever returned to
> >>>> the buddy.
> >>>>
> >>>> How reasonable is this sanity check to have long-term? Wouldn't we be
> >>>> able to check that on some higher-level freeing path?
> >>>>
> >>>> The reason I am commenting is that once we decouple "struct page" from
> >>>> "struct netmem_desc", we'd have to lookup here the corresponding "struct
> >>>> netmem_desc".
> >>>>
> >>>> ... but at that point here (when we free the actual pages), the "struct
> >>>> netmem_desc" would likely already have been freed separately (remember:
> >>>> it will be dynamically allocated).
> >>>>
> >>>> With that in mind:
> >>>>
> >>>> 1) Is there a higher level "struct netmem_desc" freeing path where we
> >>>> could check that instead, so we don't have to cast from pages to
> >>>> netmem_desc at all.
>
> As you said, it's just a sanity check, all page pool pages should
> be freed by the networking code. It checks the ownership with
> netmem_is_pp(), which is basically the same as page_pool_page_is_pp()
> but done though some aliasing.
>
> static inline bool netmem_is_pp(netmem_ref netmem)
> {
> return (netmem_get_pp_magic(netmem) & PP_MAGIC_MASK) == PP_SIGNATURE;
> }
>
> I assume there is no point in moving the check to skbuff.c as it
> already does exactly same test, but we can probably just kill it.
>
Even if we do kill it, maybe lets do that in a separate patch, and
maybe a separate series. I would recommend not complicating this one?
Also, AFAIU, this is about removing/moving the checks in
bad_page_reason() and page_expected_state()? I think this check does
fire sometimes. I saw at least 1 report in the last year of a
bad_page_reason() check firing because the page_pool got its
accounting wrong and released a page to the buddy allocator early, so
maybe that new patch that removes that check should explain why this
check is no longer necessary.
--
Thanks,
Mina
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH net-next v6 9/9] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp()
2025-06-23 17:28 ` Mina Almasry
@ 2025-06-23 18:09 ` Vlastimil Babka
2025-06-23 18:14 ` Pavel Begunkov
1 sibling, 0 replies; 35+ messages in thread
From: Vlastimil Babka @ 2025-06-23 18:09 UTC (permalink / raw)
To: Mina Almasry, Pavel Begunkov
Cc: David Hildenbrand, Zi Yan, Byungchul Park, willy, netdev,
linux-kernel, linux-mm, kernel_team, kuba, ilias.apalodimas,
harry.yoo, hawk, akpm, davem, john.fastabend, andrew+netdev, toke,
tariqt, edumazet, pabeni, saeedm, leon, ast, daniel,
lorenzo.stoakes, Liam.Howlett, rppt, surenb, mhocko, horms,
linux-rdma, bpf, vishal.moola, hannes, jackmanb
On 6/23/25 19:28, Mina Almasry wrote:
> On Mon, Jun 23, 2025 at 10:05 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>>
>>
>> As you said, it's just a sanity check, all page pool pages should
>> be freed by the networking code. It checks the ownership with
>> netmem_is_pp(), which is basically the same as page_pool_page_is_pp()
>> but done though some aliasing.
>>
>> static inline bool netmem_is_pp(netmem_ref netmem)
>> {
>> return (netmem_get_pp_magic(netmem) & PP_MAGIC_MASK) == PP_SIGNATURE;
>> }
>>
>> I assume there is no point in moving the check to skbuff.c as it
>> already does exactly same test, but we can probably just kill it.
>>
>
> Even if we do kill it, maybe lets do that in a separate patch, and
> maybe a separate series. I would recommend not complicating this one?
>
> Also, AFAIU, this is about removing/moving the checks in
> bad_page_reason() and page_expected_state()? I think this check does
> fire sometimes. I saw at least 1 report in the last year of a
> bad_page_reason() check firing because the page_pool got its
> accounting wrong and released a page to the buddy allocator early, so
> maybe that new patch that removes that check should explain why this
> check is no longer necessary.
Note these checks are these days only done with CONFIG_DEBUG_VM, or
debugging/hardening options like debug_pagealloc/init_on_alloc/init_on_free,
see mem_debugging_and_hardening_init() and when check_pages_enabled static
key is enabled.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH net-next v6 9/9] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp()
2025-06-23 17:28 ` Mina Almasry
2025-06-23 18:09 ` Vlastimil Babka
@ 2025-06-23 18:14 ` Pavel Begunkov
2025-06-24 1:54 ` Byungchul Park
1 sibling, 1 reply; 35+ messages in thread
From: Pavel Begunkov @ 2025-06-23 18:14 UTC (permalink / raw)
To: Mina Almasry
Cc: David Hildenbrand, Zi Yan, Byungchul Park, willy, netdev,
linux-kernel, linux-mm, kernel_team, kuba, ilias.apalodimas,
harry.yoo, hawk, akpm, davem, john.fastabend, andrew+netdev, toke,
tariqt, edumazet, pabeni, saeedm, leon, ast, daniel,
lorenzo.stoakes, Liam.Howlett, vbabka, rppt, surenb, mhocko,
horms, linux-rdma, bpf, vishal.moola, hannes, jackmanb
On 6/23/25 18:28, Mina Almasry wrote:
> On Mon, Jun 23, 2025 at 10:05 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
...>> As you said, it's just a sanity check, all page pool pages should
>> be freed by the networking code. It checks the ownership with
>> netmem_is_pp(), which is basically the same as page_pool_page_is_pp()
>> but done though some aliasing.
>>
>> static inline bool netmem_is_pp(netmem_ref netmem)
>> {
>> return (netmem_get_pp_magic(netmem) & PP_MAGIC_MASK) == PP_SIGNATURE;
>> }
>>
>> I assume there is no point in moving the check to skbuff.c as it
>> already does exactly same test, but we can probably just kill it.
>>
>
> Even if we do kill it, maybe lets do that in a separate patch, and
> maybe a separate series. I would recommend not complicating this one?
FWIW, the discussion somewhat mentioned "long term", but I'm not
suggesting actually removing it, it serves the purpose. And in
long term the helper will be converted to use page->type / etc.
without touching pp fields, that should reduce the degree of
ugliness and make it more acceptable for keeping in mm.
> Also, AFAIU, this is about removing/moving the checks in
> bad_page_reason() and page_expected_state()? I think this check does
> fire sometimes. I saw at least 1 report in the last year of a
> bad_page_reason() check firing because the page_pool got its
> accounting wrong and released a page to the buddy allocator early, so
> maybe that new patch that removes that check should explain why this
> check is no longer necessary.
--
Pavel Begunkov
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH net-next v6 9/9] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp()
2025-06-23 18:14 ` Pavel Begunkov
@ 2025-06-24 1:54 ` Byungchul Park
0 siblings, 0 replies; 35+ messages in thread
From: Byungchul Park @ 2025-06-24 1:54 UTC (permalink / raw)
To: Pavel Begunkov
Cc: Mina Almasry, David Hildenbrand, Zi Yan, willy, netdev,
linux-kernel, linux-mm, kernel_team, kuba, ilias.apalodimas,
harry.yoo, hawk, akpm, davem, john.fastabend, andrew+netdev, toke,
tariqt, edumazet, pabeni, saeedm, leon, ast, daniel,
lorenzo.stoakes, Liam.Howlett, vbabka, rppt, surenb, mhocko,
horms, linux-rdma, bpf, vishal.moola, hannes, jackmanb
On Mon, Jun 23, 2025 at 07:14:41PM +0100, Pavel Begunkov wrote:
> On 6/23/25 18:28, Mina Almasry wrote:
> > On Mon, Jun 23, 2025 at 10:05 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
> ...>> As you said, it's just a sanity check, all page pool pages should
> > > be freed by the networking code. It checks the ownership with
> > > netmem_is_pp(), which is basically the same as page_pool_page_is_pp()
> > > but done though some aliasing.
> > >
> > > static inline bool netmem_is_pp(netmem_ref netmem)
> > > {
> > > return (netmem_get_pp_magic(netmem) & PP_MAGIC_MASK) == PP_SIGNATURE;
> > > }
> > >
> > > I assume there is no point in moving the check to skbuff.c as it
> > > already does exactly same test, but we can probably just kill it.
> > >
> >
> > Even if we do kill it, maybe lets do that in a separate patch, and
> > maybe a separate series. I would recommend not complicating this one?
> FWIW, the discussion somewhat mentioned "long term", but I'm not
> suggesting actually removing it, it serves the purpose. And in
> long term the helper will be converted to use page->type / etc.
> without touching pp fields, that should reduce the degree of
> ugliness and make it more acceptable for keeping in mm.
Agree.
Byungchul
>
> > Also, AFAIU, this is about removing/moving the checks in
> > bad_page_reason() and page_expected_state()? I think this check does
> > fire sometimes. I saw at least 1 report in the last year of a
> > bad_page_reason() check firing because the page_pool got its
> > accounting wrong and released a page to the buddy allocator early, so
> > maybe that new patch that removes that check should explain why this
> > check is no longer necessary.
>
> --
> Pavel Begunkov
^ permalink raw reply [flat|nested] 35+ messages in thread