From: Dragos Tatulea <dtatulea@nvidia.com>
To: Byungchul Park <byungchul@sk.com>
Cc: "David Hildenbrand (Arm)" <david@kernel.org>,
Pedro Falcato <pfalcato@suse.de>,
"Vlastimil Babka (SUSE)" <vbabka@kernel.org>,
linux-mm@kvack.org, akpm@linux-foundation.org,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
kernel_team@skhynix.com, harry.yoo@oracle.com, ast@kernel.org,
daniel@iogearbox.net, davem@davemloft.net, kuba@kernel.org,
hawk@kernel.org, john.fastabend@gmail.com, sdf@fomichev.me,
saeedm@nvidia.com, leon@kernel.org, tariqt@nvidia.com,
mbloch@nvidia.com, andrew+netdev@lunn.ch, edumazet@google.com,
pabeni@redhat.com, lorenzo.stoakes@oracle.com,
Liam.Howlett@oracle.com, vbabka@suse.cz, rppt@kernel.org,
surenb@google.com, mhocko@suse.com, horms@kernel.org,
jackmanb@google.com, hannes@cmpxchg.org, ziy@nvidia.com,
ilias.apalodimas@linaro.org, willy@infradead.org,
brauner@kernel.org, kas@kernel.org, yuzhao@google.com,
usamaarif642@gmail.com, baolin.wang@linux.alibaba.com,
almasrymina@google.com, toke@redhat.com, asml.silence@gmail.com,
bpf@vger.kernel.org, linux-rdma@vger.kernel.org,
sfr@canb.auug.org.au, dw@davidwei.uk, ap420073@gmail.com
Subject: Re: [PATCH v4] mm: introduce a new page type for page pool in page type
Date: Thu, 14 May 2026 11:24:36 +0200
Message-ID: <f4f5e3b2-64c4-4ad2-8678-d29ae08150e6@nvidia.com>
In-Reply-To: <20260514085402.GA63255@system.software.com>

On 14.05.26 10:54, Byungchul Park wrote:
> On Wed, May 13, 2026 at 02:06:06PM +0200, Dragos Tatulea wrote:
>> On 13.05.26 11:36, David Hildenbrand (Arm) wrote:
>>> On 5/13/26 11:26, Pedro Falcato wrote:
>>>> On Wed, May 13, 2026 at 11:12:43AM +0200, Vlastimil Babka (SUSE) wrote:
>>>>> On 5/13/26 11:00, Dragos Tatulea wrote:
>>>>>>
>>>>>> Seems like this patch broke tcp_mmap because
>>>>>> validate_page_before_insert() returns -EINVAL due
>>>>>> to a page having a type. Here's the full flow:
>>>>>>
>>>>>> getsockopt(TCP_ZEROCOPY_RECEIVE) returns -EINVAL because of the
>>>>>> below flow in the kernel:
>>>>>>
>>>>>> tcp_zerocopy_receive()
>>>>>> -> tcp_zerocopy_vm_insert_batch()
>>>>>> -> vm_insert_pages()
>>>>>> -> insert_pages()
>>>>>> -> insert_page_in_batch_locked()
>>>>>> -> validate_page_before_insert() returns -EINVAL
>>>>>> because page_has_type(page) is now true.
>>>>>>
>>>>>> The patch below fixes the issue. But is this a valid fix?
>>>>>
>>>>> Hmm the check traces back to commit 0ee930e6cafa0 "mm/memory.c: prevent
>>>>> mapping typed pages to userspace"
>>>>>
>>>>>> Pages which use page_type must never be mapped to userspace as it would
>>>>>> destroy their page type. Add an explicit check for this instead of
>>>>>> assuming that kernel drivers always get this right.
>>>>>
>>>>> So uh, this doesn't look good I think.
>>>>
>>>> Yep, you fundamentally can't map a page with a type as page type aliases with
>>>> mapcount. Even with the given diff, just mapping it will increment the mapcount
>>>> and wreak havoc. I think we need to revert this patch for now.
>>>>
>>>> I'm not sure what the long term plan for this would be. If page types are moved
>>>> to memdesc types, then the two stop colliding and that could work. I don't know
>>>> if that's Willy's plan, however.
>>>>
>>>> (then there's the other question: are page pool pages really folios? not really.
>>>> they are mappable, but they aren't part of the page cache, or anon, nor are
>>>> they in the LRU or have rmap capabilities. perhaps we need a different memdesc
>>>> for those. we're one step away from reinventing class polymorphism from first
>>>> principles ;)
>>>
>>> Zi Yan is working on this: non-folio pages would no longer mess with
>>> rmap/mapcounts, and page table walking code will identify them to be non-folio
>>> things to skip them.
>>>
>>> It will take a while, though ...
>>>
>> So do I get it right that the path forward here is to revert this commit [1]
>> and wait until the work from Zi Yan is ready?
>
> I think it's the best way for now.
>
Ack. Can you do it, please? This is a more complicated revert, and the risk
that I mess it up is higher.

Thanks,
Dragos
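
For context, the aliasing Pedro describes above (page type sharing storage
with mapcount) can be illustrated with a small userspace model. This is only
a sketch under stated assumptions: struct model_page, the model_* helpers and
the tag values are hypothetical names made up for illustration, and the bit
layout is simplified. It is not the kernel's struct page or its actual
validate_page_before_insert().

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical userspace model; NOT the kernel's struct page. */
struct model_page {
	union {
		uint32_t page_type; /* model: type tag kept in the top byte */
		int32_t mapcount;   /* model: -1 means "not mapped" */
	};
};

#define MODEL_TYPE_SHIFT 24
#define MODEL_TYPE_MIN   0xf0u /* made-up lowest tag value */
#define MODEL_TYPE_NONE  0xffu /* top byte when mapcount == -1 */

/* A page "has a type" in this model if its top byte is a tag. */
static int model_page_has_type(const struct model_page *p)
{
	uint32_t tag = p->page_type >> MODEL_TYPE_SHIFT;

	return tag >= MODEL_TYPE_MIN && tag != MODEL_TYPE_NONE;
}

/* Mirrors the idea of the check in the thread: refuse typed pages. */
static int model_validate_before_insert(const struct model_page *p)
{
	return model_page_has_type(p) ? -EINVAL : 0;
}

/* Mapping bumps the mapcount, which is the same word as page_type. */
static void model_map(struct model_page *p)
{
	p->mapcount++;
}
```

In this model an untyped page starts with mapcount == -1, passes the check,
and reads mapcount == 0 after one model_map(). A typed page fails the check
with -EINVAL, and if it were mapped anyway, the increment would land in the
word that holds the type, so that word can no longer be read as either a
clean type or a meaningful mapping count.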