The Linux Kernel Mailing List
From: Pedro Falcato <pfalcato@suse.de>
To: "Vlastimil Babka (SUSE)" <vbabka@kernel.org>
Cc: Dragos Tatulea <dtatulea@nvidia.com>,
	 Byungchul Park <byungchul@sk.com>,
	linux-mm@kvack.org, akpm@linux-foundation.org,
	 netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	kernel_team@skhynix.com,  harry.yoo@oracle.com, ast@kernel.org,
	daniel@iogearbox.net, davem@davemloft.net,  kuba@kernel.org,
	hawk@kernel.org, john.fastabend@gmail.com, sdf@fomichev.me,
	 saeedm@nvidia.com, leon@kernel.org, tariqt@nvidia.com,
	mbloch@nvidia.com,  andrew+netdev@lunn.ch, edumazet@google.com,
	pabeni@redhat.com, david@redhat.com,  lorenzo.stoakes@oracle.com,
	Liam.Howlett@oracle.com, vbabka@suse.cz, rppt@kernel.org,
	 surenb@google.com, mhocko@suse.com, horms@kernel.org,
	jackmanb@google.com,  hannes@cmpxchg.org, ziy@nvidia.com,
	ilias.apalodimas@linaro.org, willy@infradead.org,
	 brauner@kernel.org, kas@kernel.org, yuzhao@google.com,
	usamaarif642@gmail.com,  baolin.wang@linux.alibaba.com,
	almasrymina@google.com, toke@redhat.com, asml.silence@gmail.com,
	 bpf@vger.kernel.org, linux-rdma@vger.kernel.org,
	sfr@canb.auug.org.au, dw@davidwei.uk,  ap420073@gmail.com
Subject: Re: [PATCH v4] mm: introduce a new page type for page pool in page type
Date: Wed, 13 May 2026 10:26:13 +0100
Message-ID: <agRB2QTbzceRgpzX@pedro-suse>
In-Reply-To: <4af19eda-c29c-4302-92d5-c0915267fc0c@kernel.org>

On Wed, May 13, 2026 at 11:12:43AM +0200, Vlastimil Babka (SUSE) wrote:
> On 5/13/26 11:00, Dragos Tatulea wrote:
> > 
> > 
> > On 24.02.26 06:13, Byungchul Park wrote:
> >> Currently, the condition 'page->pp_magic == PP_SIGNATURE' is used to
> >> determine if a page belongs to a page pool.  However, with the planned
> >> removal of @pp_magic, we should instead leverage the page_type in struct
> >> page, such as PGTY_netpp, for this purpose.
> >> 
> >> Introduce and use the page type APIs e.g. PageNetpp(), __SetPageNetpp(),
> >> and __ClearPageNetpp() instead, and remove the existing APIs accessing
> >> @pp_magic e.g. page_pool_page_is_pp(), netmem_or_pp_magic(), and
> >> netmem_clear_pp_magic().
> >> 
> >> Plus, add @page_type to struct net_iov at the same offset as struct page
> >> so as to use the page_type APIs for struct net_iov as well.  While at it,
> >> reorder @type and @owner in struct net_iov to avoid a hole and
> >> increasing the struct size.
> >> 
> >> This work was inspired by the following link:
> >> 
> >>   https://lore.kernel.org/all/582f41c0-2742-4400-9c81-0d46bf4e8314@gmail.com/
> >> 
> >> While at it, move the sanity check for page pool to the free path.
> >> 
> >> Suggested-by: David Hildenbrand <david@redhat.com>
> >> Co-developed-by: Pavel Begunkov <asml.silence@gmail.com>
> >> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> >> Signed-off-by: Byungchul Park <byungchul@sk.com>
> >> Acked-by: David Hildenbrand <david@redhat.com>
> >> Acked-by: Zi Yan <ziy@nvidia.com>
> >> Acked-by: Vlastimil Babka <vbabka@suse.cz>
> >> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
> >> ---
> > 
> > Seems like this patch broke tcp_mmap because
> > validate_page_before_insert() returns -EINVAL due
> > to a page having a type. Here's the full flow:
> > 
> > getsockopt(TCP_ZEROCOPY_RECEIVE) returns -EINVAL because of the
> > below flow in the kernel:
> > 
> > tcp_zerocopy_receive()
> > -> tcp_zerocopy_vm_insert_batch()
> >   -> vm_insert_pages()
> >     -> insert_pages()
> >       -> insert_page_in_batch_locked()
> >         -> validate_page_before_insert() returns -EINVAL
> >            because page_has_type(page) is now true.
> > 
> > The patch below fixes the issue. But is this a valid fix?
> 
> Hmm the check traces back to commit 0ee930e6cafa0 "mm/memory.c: prevent
> mapping typed pages to userspace"
> 
> > Pages which use page_type must never be mapped to userspace as it would
> > destroy their page type.  Add an explicit check for this instead of
> > assuming that kernel drivers always get this right.
> 
> So uh, this doesn't look good I think.

Yep, you fundamentally can't map a page that has a type, because page_type
aliases with mapcount. Even with the given diff, just mapping the page will
increment the mapcount and wreak havoc. I think we need to revert this patch
for now.
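
To make the aliasing concrete, here is a toy userspace model. It is only a
sketch: struct toy_page, the TOY_* constants and the toy_* helpers are made up
for illustration, and the "tag in the top byte" encoding and the 0xf5 value are
assumptions, not the kernel's real struct page layout or page type encoding.
The only things taken from the discussion are that the type and the mapcount
share storage and that mapping a page increments the mapcount.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for the overlapping fields; NOT the kernel's struct page. */
struct toy_page {
	union {
		uint32_t page_type;	/* type tag in the top byte (assumed) */
		int32_t  mapcount;	/* -1 means "not mapped" */
	};
};

#define TOY_TYPE_NETPP	0xf5u		/* hypothetical tag value */
#define TOY_NO_TYPE	UINT32_MAX	/* same bits as mapcount == -1 */

static void toy_set_netpp(struct toy_page *p)
{
	p->page_type = TOY_TYPE_NETPP << 24;
}

static bool toy_is_netpp(const struct toy_page *p)
{
	return (p->page_type >> 24) == TOY_TYPE_NETPP;
}

static void toy_clear_type(struct toy_page *p)
{
	p->page_type = TOY_NO_TYPE;
}

/* What inserting the page into a page table does to the shared word. */
static void toy_map(struct toy_page *p)
{
	p->mapcount++;
}

static void toy_demo(void)
{
	struct toy_page p = { .page_type = TOY_NO_TYPE };

	assert(p.mapcount == -1);	/* untyped and unmapped */

	toy_set_netpp(&p);
	assert(toy_is_netpp(&p));

	toy_map(&p);			/* userspace mapping of a typed page */
	assert(p.mapcount != 0);	/* "one mapping" should read as 0 */
	assert(p.page_type != (TOY_TYPE_NETPP << 24));	/* type word was modified */

	toy_clear_type(&p);		/* owner releases the page */
	assert(p.mapcount == -1);	/* the mapping is forgotten, the PTE is not */
}
```

In this toy encoding the tag byte happens to survive a single increment, but
the word no longer holds what the owner stored, the mapcount reads as garbage
while the page is mapped, and clearing the type silently discards the mapping.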

I'm not sure what the long-term plan for this would be. If page types are moved
to memdesc types, then the two stop colliding and that could work. I don't know
if that's Willy's plan, however.

(then there's the other question: are page pool pages really folios? not really.
they are mappable, but they aren't part of the page cache, or anon, nor are
they in the LRU or have rmap capabilities. perhaps we need a different memdesc
for those. we're one step away from reinventing class polymorphism from first
principles ;)

> 
> > diff --git a/mm/memory.c b/mm/memory.c
> > index ea6568571131..4cb12673f450 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -2326,7 +2326,7 @@ static int validate_page_before_insert(struct vm_area_struct *vma,
> >                         return -EINVAL;
> >                 return 0;
> >         }
> > -       if (folio_test_anon(folio) || page_has_type(page))
> > +       if (folio_test_anon(folio) || (page_has_type(page) && !PageNetpp(page)))
> >                 return -EINVAL;
> >         flush_dcache_folio(folio);
> >         return 0;
> > 
> > Thanks,
> > Dragos
> > 
> > 
> 
> 

-- 
Pedro


Thread overview: 13+ messages
     [not found] <20260224051347.19621-1-byungchul@sk.com>
2026-05-13  9:00 ` [PATCH v4] mm: introduce a new page type for page pool in page type Dragos Tatulea
2026-05-13  9:12   ` Vlastimil Babka (SUSE)
2026-05-13  9:26     ` Pedro Falcato [this message]
2026-05-13  9:36       ` David Hildenbrand (Arm)
2026-05-13 12:06         ` Dragos Tatulea
2026-05-13 12:11           ` David Hildenbrand (Arm)
2026-05-13  9:34   ` David Hildenbrand (Arm)
2026-05-13 12:18   ` Byungchul Park
2026-05-13 12:29     ` David Hildenbrand (Arm)
2026-05-13 12:39       ` Byungchul Park
2026-05-13 13:02         ` David Hildenbrand (Arm)
2026-05-13 13:26           ` Byungchul Park
2026-05-13  9:42 ` Lorenzo Stoakes
