All of lore.kernel.org
 help / color / mirror / Atom feed
From: Harry Yoo <harry.yoo@oracle.com>
To: David Hildenbrand <david@redhat.com>
Cc: Mina Almasry <almasrymina@google.com>,
	willy@infradead.org, Byungchul Park <byungchul@sk.com>,
	Jakub Kicinski <kuba@kernel.org>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, kernel_team@skhynix.com,
	ilias.apalodimas@linaro.org, hawk@kernel.org,
	akpm@linux-foundation.org, davem@davemloft.net,
	john.fastabend@gmail.com, andrew+netdev@lunn.ch,
	asml.silence@gmail.com, toke@redhat.com, tariqt@nvidia.com,
	edumazet@google.com, pabeni@redhat.com, saeedm@nvidia.com,
	leon@kernel.org, ast@kernel.org, daniel@iogearbox.net,
	lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com,
	vbabka@suse.cz, rppt@kernel.org, surenb@google.com,
	mhocko@suse.com, horms@kernel.org, linux-rdma@vger.kernel.org,
	bpf@vger.kernel.org, vishal.moola@gmail.com
Subject: Re: netmem series needs some love and Acks from MM folks
Date: Wed, 18 Jun 2025 01:32:47 +0900	[thread overview]
Message-ID: <aFGYr6qlzVlqhK55@harry> (raw)
In-Reply-To: <129fe808-4285-48fe-95b6-00ea19bd87af@redhat.com>

On Tue, Jun 17, 2025 at 06:09:36PM +0200, David Hildenbrand wrote:
> On 17.06.25 04:31, Harry Yoo wrote:
> > On Fri, Jun 13, 2025 at 07:19:07PM -0700, Mina Almasry wrote:
> > > On Thu, Jun 12, 2025 at 6:13 PM Byungchul Park <byungchul@sk.com> wrote:
> > > > 
> > > > On Wed, Jun 11, 2025 at 06:55:42PM -0700, Jakub Kicinski wrote:
> > > > > On Tue, 10 Jun 2025 10:30:01 +0900 Byungchul Park wrote:
> > > > > > > What's the intended relation between the types?
> > > > > > 
> > > > > > One thing I'm trying to achieve is to remove pp fields from struct page,
> > > > > > and make network code use struct netmem_desc { pp fields; } instead of
> > > > > > sturc page for that purpose.
> > > > > > 
> > > > > > The reason why I union'ed it with the existing pp fields in struct
> > > > > > net_iov *temporarily* for now is, to fade out the existing pp fields
> > > > > > from struct net_iov so as to make the final form like:
> > > > > 
> > > > > I see, I may have mixed up the complaints there. I thought the effort
> > > > > was also about removing the need for the ref count. And Rx is
> > > > > relatively light on use of ref counting.
> > > > > 
> > > > > > > netmem_ref exists to clearly indicate that memory may not be readable.
> > > > > > > Majority of memory we expect to allocate from page pool must be
> > > > > > > kernel-readable. What's the plan for reading the "single pointer"
> > > > > > > memory within the kernel?
> > > > > > > 
> > > > > > > I think you're approaching this problem from the easiest and least
> > > > > > 
> > > > > > No, I've never looked for the easiest way.  My bad if there are a better
> > > > > > way to achieve it.  What would you recommend?
> > > > > 
> > > > > Sorry, I don't mean that the approach you took is the easiest way out.
> > > > > I meant that between Rx and Tx handling Rx is the easier part because
> > > > > we already have the suitable abstraction. It's true that we use more
> > > > > fields in page struct on Rx, but I thought Tx is also more urgent
> > > > > as there are open reports for networking taking references on slab
> > > > > pages.
> > > > > 
> > > > > In any case, please make sure you maintain clear separation between
> > > > > readable and unreadable memory in the code you produce.
> > > > 
> > > > Do you mean the current patches do not?  If yes, please point out one
> > > > as example, which would be helpful to extract action items.
> > > > 
> > > 
> > > I think one thing we could do to improve separation between readable
> > > (pages/netmem_desc) and unreadable (net_iov) is to remove the struct
> > > netmem_desc field inside the net_iov, and instead just duplicate the
> > > pp/pp_ref_count/etc fields. The current code gives off the impression
> > > that net_iov may be a container of netmem_desc which is not really
> > > accurate.
> > > 
> > > But I don't think that's a major blocker. I think maybe the real issue
> > > is that there are no reviews from any mm maintainers?
> > 
> > Let's try changing the subject to draw some attention from MM people :)
> 
> Hi, it worked! :P

Glad it worked :)

> > 
> > > So I'm not 100%
> > > sure this is in line with their memdesc plans. I think probably
> > > patches 2->8 are generic netmem-ifications that are good to merge
> > > anyway, but I would say patch 1 and 9 need a reviewed by from someone
> > > on the mm side. Just my 2 cents.
> > 
> > As someone who worked on the zpdesc series, I think it is pretty much
> > in line with the memdesc plans.
> > 
> > I mean, it does differ a bit from the initial idea of generalizing it as
> > "bump" allocator, but overall, it's still aligned with the memdesc
> > plans, and looks like a starting point, IMHO.
> 
> Just to summarize (not that there is any misunderstanding), the first step
> of the memdesc plan is simple:
> 
> 1) have a dedicated data-structure we will allocate alter dynamically.
> 
> 2) Make it overlay "struct page" for now in a way that doesn't break things
> 
> 3) Convert all users of "struct page" to the new data-structure
> 
> Later, the memdesc data-structure will then actually come be allocated
> dynamically, so "struct page" content will not apply anymore, and we can
> shrink "struct page".
> 
> 
> What I see in this patch is exactly 1) and 2).

I believe 3) will be accompolished with a follow-up patch series from
Byungchul. The previous series was 18 patches, but divided to two series
and this is the first half.

> I am not 100% sure about existing "struct net_iov" and how that interacts
> with "struct page" overlay. I suspects it's just a dynamically allocated
> structure?

Yeah, that's indeed confusing. As mentioned here [1], it does not
overlay struct page at all. It is dynamically allocated from slab.

IIUC net_iov mirrors netmem_desc fields because netmem_ref might be
netmem_desc (overlays struct page) or net_iov (allocated from slab), and
by mirroring netmem_desc fields in net_iov, page pool doesn't need to care
if it's net_iov or netmem_desc when acccessing the common fields shared
between them.

[1] https://lore.kernel.org/linux-mm/aFDQ9X2WsXszNJ5M@hyeyoo

> Because this patch changes the layout of "struct net_iov", which is a bit
> confusing at first sight?

Yeah, better documented.

-- 
Cheers,
Harry / Hyeonggon

  reply	other threads:[~2025-06-17 16:33 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-09  4:32 [PATCH net-next 0/9] Split netmem from struct page Byungchul Park
2025-06-09  4:32 ` [PATCH net-next 1/9] netmem: introduce struct netmem_desc mirroring " Byungchul Park
2025-06-09 19:32   ` Jakub Kicinski
2025-06-10  1:30     ` Byungchul Park
2025-06-12  1:55       ` Jakub Kicinski
2025-06-13  1:13         ` Byungchul Park
2025-06-14  2:19           ` Mina Almasry
2025-06-17  2:31             ` netmem series needs some love and Acks from MM folks Harry Yoo
2025-06-17 16:09               ` David Hildenbrand
2025-06-17 16:32                 ` Harry Yoo [this message]
2025-06-18  0:08                 ` Byungchul Park
2025-06-18  7:51                   ` Pavel Begunkov
2025-06-20  4:12             ` [PATCH net-next 1/9] netmem: introduce struct netmem_desc mirroring struct page Byungchul Park
2025-06-11 20:53     ` Mina Almasry
2025-06-12  1:58       ` Jakub Kicinski
2025-06-17  2:20   ` Harry Yoo
2025-06-19  9:32   ` Vlastimil Babka
2025-06-09  4:32 ` [PATCH net-next 2/9] page_pool: rename page_pool_return_page() to page_pool_return_netmem() Byungchul Park
2025-06-10 11:15   ` Ilias Apalodimas
2025-06-09  4:32 ` [PATCH net-next 3/9] page_pool: rename __page_pool_release_page_dma() to __page_pool_release_netmem_dma() Byungchul Park
2025-06-10 11:11   ` Ilias Apalodimas
2025-06-09  4:32 ` [PATCH net-next 4/9] page_pool: rename __page_pool_alloc_pages_slow() to __page_pool_alloc_netmems_slow() Byungchul Park
2025-06-09  4:32 ` [PATCH net-next 5/9] netmem: use _Generic to cover const casting for page_to_netmem() Byungchul Park
2025-06-09  4:32 ` [PATCH net-next 6/9] netmem: remove __netmem_get_pp() Byungchul Park
2025-06-10 11:11   ` Ilias Apalodimas
2025-06-09  4:32 ` [PATCH net-next 7/9] page_pool: make page_pool_get_dma_addr() just wrap page_pool_get_dma_addr_netmem() Byungchul Park
2025-06-09  4:32 ` [PATCH net-next 8/9] netmem: introduce a netmem API, virt_to_head_netmem() Byungchul Park
2025-06-09  4:32 ` [PATCH net-next 9/9] page_pool: access ->pp_magic through struct netmem_desc in page_pool_page_is_pp() Byungchul Park
2025-06-09 17:39   ` Mina Almasry
2025-06-10  1:45     ` Byungchul Park
2025-06-11 14:30       ` Pavel Begunkov
2025-06-12  0:39         ` Byungchul Park
2025-06-17  2:37         ` Harry Yoo
2025-06-19  9:55   ` Vlastimil Babka
2025-06-19 10:42     ` Byungchul Park
2025-06-09  6:49 ` [PATCH net-next 0/9] Split netmem from struct page Byungchul Park
2025-06-11 14:25 ` Pavel Begunkov
2025-06-11 20:48   ` Mina Almasry
2025-06-12  0:40     ` Byungchul Park
2025-06-16  7:45     ` Jesper Dangaard Brouer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aFGYr6qlzVlqhK55@harry \
    --to=harry.yoo@oracle.com \
    --cc=Liam.Howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=almasrymina@google.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=asml.silence@gmail.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=byungchul@sk.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=david@redhat.com \
    --cc=edumazet@google.com \
    --cc=hawk@kernel.org \
    --cc=horms@kernel.org \
    --cc=ilias.apalodimas@linaro.org \
    --cc=john.fastabend@gmail.com \
    --cc=kernel_team@skhynix.com \
    --cc=kuba@kernel.org \
    --cc=leon@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=lorenzo.stoakes@oracle.com \
    --cc=mhocko@suse.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=rppt@kernel.org \
    --cc=saeedm@nvidia.com \
    --cc=surenb@google.com \
    --cc=tariqt@nvidia.com \
    --cc=toke@redhat.com \
    --cc=vbabka@suse.cz \
    --cc=vishal.moola@gmail.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.