From: Byungchul Park <byungchul@sk.com>
To: Mina Almasry <almasrymina@google.com>
Cc: linux-mm@kvack.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, kernel_team@skhynix.com,
harry.yoo@oracle.com, ast@kernel.org, daniel@iogearbox.net,
davem@davemloft.net, kuba@kernel.org, hawk@kernel.org,
john.fastabend@gmail.com, sdf@fomichev.me, saeedm@nvidia.com,
leon@kernel.org, tariqt@nvidia.com, mbloch@nvidia.com,
andrew+netdev@lunn.ch, edumazet@google.com, pabeni@redhat.com,
akpm@linux-foundation.org, david@redhat.com,
lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com,
vbabka@suse.cz, rppt@kernel.org, surenb@google.com,
mhocko@suse.com, horms@kernel.org, jackmanb@google.com,
hannes@cmpxchg.org, ziy@nvidia.com, ilias.apalodimas@linaro.org,
willy@infradead.org, brauner@kernel.org, kas@kernel.org,
yuzhao@google.com, usamaarif642@gmail.com,
baolin.wang@linux.alibaba.com, toke@redhat.com,
asml.silence@gmail.com, bpf@vger.kernel.org,
linux-rdma@vger.kernel.org
Subject: Re: [PATCH] mm, page_pool: introduce a new page type for page pool in page type
Date: Fri, 25 Jul 2025 15:50:38 +0900
Message-ID: <20250725065038.GA58004@system.software.com>
In-Reply-To: <CAHS8izM11fxu6jHZw5VJsHXeZ+Tk+6ZBGDk0vHiOoHyXZoOvOg@mail.gmail.com>
On Tue, Jul 22, 2025 at 03:17:15PM -0700, Mina Almasry wrote:
> On Sun, Jul 20, 2025 at 10:49 PM Byungchul Park <byungchul@sk.com> wrote:
> > diff --git a/net/core/netmem_priv.h b/net/core/netmem_priv.h
> > index cd95394399b4..39a97703d9ed 100644
> > --- a/net/core/netmem_priv.h
> > +++ b/net/core/netmem_priv.h
> > @@ -8,21 +8,11 @@ static inline unsigned long netmem_get_pp_magic(netmem_ref netmem)
> > return __netmem_clear_lsb(netmem)->pp_magic & ~PP_DMA_INDEX_MASK;
> > }
> >
> > -static inline void netmem_or_pp_magic(netmem_ref netmem, unsigned long pp_magic)
> > -{
> > - __netmem_clear_lsb(netmem)->pp_magic |= pp_magic;
> > -}
> > -
> > -static inline void netmem_clear_pp_magic(netmem_ref netmem)
> > -{
> > - WARN_ON_ONCE(__netmem_clear_lsb(netmem)->pp_magic & PP_DMA_INDEX_MASK);
> > -
> > - __netmem_clear_lsb(netmem)->pp_magic = 0;
> > -}
> > -
> > static inline bool netmem_is_pp(netmem_ref netmem)
> > {
> > - return (netmem_get_pp_magic(netmem) & PP_MAGIC_MASK) == PP_SIGNATURE;
> > + if (netmem_is_net_iov(netmem))
> > + return true;
>
> As Pavel alludes, this is dubious, and at least it's difficult to
> reason about it.
>
> There could be net_iovs that are not attached to pp, and should not be
> treated as pp memory. These are in the devmem (and future net_iov) tx
> paths.
>
> We need a way to tell if a net_iov is pp or not. A couple of options:
>
> 1. We could have it such that if net_iov->pp is set, then the
> netmem_is_pp == true, otherwise false.
> 2. We could implement a page-flags equivalent for net_iov.
>
> Option #1 is simpler and is my preferred. To do that properly, you need to:
>
> 1. Make sure everywhere net_iovs are allocated that pp=NULL in the
> non-pp case and pp=non NULL in the pp case. those callsites are
> net_devmem_bind_dmabuf (devmem rx & tx path), io_zcrx_create_area
From a quick review of the code, fortunately, netmem_set_pp(pool) and
netmem_or_pp_magic(PP_SIGNATURE) are always called as a pair, and
netmem_set_pp(NULL) and netmem_clear_pp_magic() are always called as a
pair too.
Also, no code assigns a value to ->pp or ->pp_magic directly, except
net_devmem_alloc_dmabuf(), and that is also safe because it is always
followed by page_pool_set_pp_info().
So I think checking '->pp != NULL' is already equivalent to checking
'->pp_magic == PP_SIGNATURE' with the current code. Still, making the
code more robust is always welcome.
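
To make the pairing argument concrete, here is a minimal userspace sketch,
not kernel code. Everything prefixed with mock_ or MOCK_ is hypothetical:
the struct holds only the two fields under discussion, the signature value
is a placeholder, and the magic check is simplified (no masking of DMA
index bits). It only shows that if ->pp and ->pp_magic are always written
together, the two checks cannot disagree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder value; the real PP_SIGNATURE is defined by the kernel. */
#define MOCK_PP_SIGNATURE 0x40UL

struct mock_pool { int id; };

/* Only the two fields under discussion; not the kernel's layout. */
struct mock_netmem {
	unsigned long pp_magic;
	struct mock_pool *pp;
};

/* Models the "set" pair: ->pp and the signature are written together. */
void mock_set_pp_info(struct mock_netmem *nm, struct mock_pool *pool)
{
	nm->pp = pool;
	nm->pp_magic |= MOCK_PP_SIGNATURE;
}

/* Models the "clear" pair: both fields are reset together. */
void mock_clear_pp_info(struct mock_netmem *nm)
{
	nm->pp = NULL;
	nm->pp_magic = 0;
}

bool mock_is_pp_by_magic(const struct mock_netmem *nm)
{
	return nm->pp_magic == MOCK_PP_SIGNATURE;
}

bool mock_is_pp_by_ptr(const struct mock_netmem *nm)
{
	return nm->pp != NULL;
}

/* True when both checks give the same answer for this netmem. */
bool mock_checks_agree(const struct mock_netmem *nm)
{
	return mock_is_pp_by_magic(nm) == mock_is_pp_by_ptr(nm);
}

/* Walks init -> set -> clear; returns 0 if the checks agree throughout. */
int mock_pairing_demo(void)
{
	struct mock_pool pool = { .id = 1 };
	struct mock_netmem nm = { 0 };

	if (!mock_checks_agree(&nm) || mock_is_pp_by_ptr(&nm))
		return -1;

	mock_set_pp_info(&nm, &pool);
	if (!mock_checks_agree(&nm) || !mock_is_pp_by_ptr(&nm))
		return -1;

	mock_clear_pp_info(&nm);
	if (!mock_checks_agree(&nm) || mock_is_pp_by_ptr(&nm))
		return -1;

	return 0;
}
```

The equivalence holds only as long as every writer goes through the paired
helpers, which is why a direct assignment to either field would need a
closer look.
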
As you mentioned, in net_devmem_bind_dmabuf() and io_zcrx_create_area(),
it would be better to initialize ->pp and ->pp_magic explicitly, like:
--
diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
index 00d0064b22a5..8f2051b2c505 100644
--- a/io_uring/zcrx.c
+++ b/io_uring/zcrx.c
@@ -430,6 +430,7 @@ static int io_zcrx_create_area(struct io_zcrx_ifq *ifq,
area->freelist[i] = i;
atomic_set(&area->user_refs[i], 0);
niov->type = NET_IOV_IOURING;
+ page_pool_clear_pp_info(net_iov_to_netmem(niov));
}
area->free_count = nr_iovs;
diff --git a/net/core/devmem.c b/net/core/devmem.c
index b3a62ca0df65..5d017c9f4986 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -285,6 +285,7 @@ net_devmem_bind_dmabuf(struct net_device *dev,
niov = &owner->area.niovs[i];
niov->type = NET_IOV_DMABUF;
niov->owner = &owner->area;
+ page_pool_clear_pp_info(net_iov_to_netmem(niov));
page_pool_set_dma_addr_netmem(net_iov_to_netmem(niov),
net_devmem_get_dma_addr(niov));
if (direction == DMA_TO_DEVICE)
--
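
For illustration, the check that option #1 implies could look like the
following userspace sketch. This is not kernel code: the mock_ names are
hypothetical, the struct carries only the ->pp field, and the memset()
simply assumes the backing memory may hold stale bytes before it is
initialized.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct mock_pool { int id; };

/* Only the field relevant to the check; not the kernel's struct net_iov. */
struct mock_niov {
	struct mock_pool *pp;
};

/* Stands in for the bind/create-area loop: every niov starts as "not pp". */
void mock_niov_init(struct mock_niov *niov)
{
	niov->pp = NULL;
}

/* Stands in for a page pool taking ownership of a niov on the rx path. */
void mock_niov_attach(struct mock_niov *niov, struct mock_pool *pool)
{
	niov->pp = pool;
}

/* Option #1: a niov counts as pp memory only while ->pp is set. */
bool mock_niov_is_pp(const struct mock_niov *niov)
{
	return niov->pp != NULL;
}

/* Returns 0 if a tx-style niov reads as non-pp and an rx-style one as pp. */
int mock_niov_demo(void)
{
	struct mock_pool pool = { .id = 1 };
	struct mock_niov tx, rx;

	/* Simulate stale bytes in freshly allocated, non-zeroed memory. */
	memset(&tx, 0xff, sizeof(tx));
	memset(&rx, 0xff, sizeof(rx));

	/* Explicit initialization makes the later NULL check meaningful. */
	mock_niov_init(&tx);
	mock_niov_init(&rx);

	/* Only the rx niov is ever handed to a pool. */
	mock_niov_attach(&rx, &pool);

	if (mock_niov_is_pp(&tx))
		return -1;
	if (!mock_niov_is_pp(&rx))
		return -1;
	return 0;
}
```

Without the explicit initialization step, the tx niov in this sketch would
carry a non-NULL ->pp and be misread as pp memory.
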
With that, do you think it is enough to use ->pp to check whether a niov is pp?
Byungchul