From: "David Hildenbrand (Arm)" <david@kernel.org>
To: Klara Modin <klarasmodin@gmail.com>,
Wei Yang <richard.weiyang@gmail.com>
Cc: akpm@linux-foundation.org, ljs@kernel.org, riel@surriel.com,
liam@infradead.org, vbabka@kernel.org, harry@kernel.org,
jannh@google.com, balbirs@nvidia.com, sj@kernel.org,
ziy@nvidia.com, lance.yang@linux.dev, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [Patch mm-hotfixes v5] mm/page_vma_mapped: fix device-private PMD handling
Date: Wed, 1 Jul 2026 17:36:33 +0200
Message-ID: <d4e4180e-dcdf-40e6-b5a2-2ac55f4aecc4@kernel.org>
In-Reply-To: <akUebBUyNFCbWt_k@soda.int.kasm.eu>

On 7/1/26 16:33, Klara Modin wrote:
> Hi,
>
> On 2026-06-30 02:15:40 +0000, Wei Yang wrote:
>> Commit 65edfda6f3f2 ("mm/rmap: extend rmap and migration support
>> device-private entries") introduced the concept of device-private
>> PMD entries, but did not correctly update the rmap walk code to
>> account for them.
>>
>> As a result, when page_vma_mapped_walk() encounters device-private
>> PMD entries, it takes no action other than to acquire the PMD lock
>> and exit.
>>
>> However this is highly problematic for two reasons - firstly,
>> device private entries possess a PFN so check_pmd() needs to be
>> called to ensure an overlapping PFN range.
>>
>> Secondly, and more importantly, if PVMW_MIGRATION is set the
>> caller assumes the returned entry is a migration entry, resulting
>> in memory corruption when the caller tries to interpret the device
>> private entry as such.
>>
>> In addition, commit 146287290023 ("mm/huge_memory: implement
>> device-private THP splitting") allowed device private PMDs to be
>> split like THP mappings, but again did not update this code path.
>>
>> As a result, we might race a PMD split prior to acquiring the PMD
>> lock.
>>
>> This patch addresses all of these issues by invoking check_pmd(),
>> ensuring PVMW_MIGRATION is not set, and checking whether a split raced
>> us, as we do for PMD THP and migration entries.
>>
>> Instead of checking for a subset of the cases after taking the
>> pmd_lock(), put device-private along with pmd_trans_huge() and
>> pmd_is_migration_entry(). Also remove thp_migration_supported() as
>> it is already guarded by pmd_is_migration_entry().
>
> This results in a build bug for my Raspberry Pi 1:
>
> In file included from <command-line>:
> In function ‘check_pmd’,
> inlined from ‘page_vma_mapped_walk’ at /home/klara/git/linux/trees/bisect/mm/page_vma_mapped.c:256:10:
> /home/klara/git/linux/trees/bisect/include/linux/compiler_types.h:702:45: error: call to ‘__compiletime_assert_433’ declared with attribute error: BUILD_BUG failed
> 702 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
> | ^
> /home/klara/git/linux/trees/bisect/include/linux/compiler_types.h:683:25: note: in definition of macro ‘__compiletime_assert’
> 683 | prefix ## suffix(); \
> | ^~~~~~
> /home/klara/git/linux/trees/bisect/include/linux/compiler_types.h:702:9: note: in expansion of macro ‘_compiletime_assert’
> 702 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
> | ^~~~~~~~~~~~~~~~~~~
> /home/klara/git/linux/trees/bisect/include/linux/build_bug.h:40:37: note: in expansion of macro ‘compiletime_assert’
> 40 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
> | ^~~~~~~~~~~~~~~~~~
> /home/klara/git/linux/trees/bisect/include/linux/build_bug.h:60:21: note: in expansion of macro ‘BUILD_BUG_ON_MSG’
> 60 | #define BUILD_BUG() BUILD_BUG_ON_MSG(1, "BUILD_BUG failed")
> | ^~~~~~~~~~~~~~~~
> /home/klara/git/linux/trees/bisect/include/linux/huge_mm.h:113:28: note: in expansion of macro ‘BUILD_BUG’
> 113 | #define HPAGE_PMD_SHIFT ({ BUILD_BUG(); 0; })
> | ^~~~~~~~~
> /home/klara/git/linux/trees/bisect/include/linux/huge_mm.h:117:26: note: in expansion of macro ‘HPAGE_PMD_SHIFT’
> 117 | #define HPAGE_PMD_ORDER (HPAGE_PMD_SHIFT-PAGE_SHIFT)
> | ^~~~~~~~~~~~~~~
> /home/klara/git/linux/trees/bisect/include/linux/huge_mm.h:118:26: note: in expansion of macro ‘HPAGE_PMD_ORDER’
> 118 | #define HPAGE_PMD_NR (1<<HPAGE_PMD_ORDER)
> | ^~~~~~~~~~~~~~~
> /home/klara/git/linux/trees/bisect/mm/page_vma_mapped.c:142:20: note: in expansion of macro ‘HPAGE_PMD_NR’
> 142 | if ((pfn + HPAGE_PMD_NR - 1) < pvmw->pfn)
> | ^~~~~~~~~~~~
>
> bisect log:
>
> # bad: [be5c93fa674f0fc3c8f359c2143abce6bbb422e6] Add linux-next specific files for 20260630
> git bisect start 'HEAD'
> # status: waiting for 'good' commit(s), 'bad' commit known
> # good: [dc59e4fea9d83f03bad6bddf3fa2e52491777482] Linux 7.2-rc1
> git bisect good dc59e4fea9d83f03bad6bddf3fa2e52491777482
> # bad: [6148219e90732fd06f5d7a498bda974e6a43ab4b] Merge branch 'nand/next' of https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git
> git bisect bad 6148219e90732fd06f5d7a498bda974e6a43ab4b
> # bad: [e0326ebe10191447ab8fa2e904080df7b743765e] Merge branch 'for-next' of https://git.kernel.org/pub/scm/linux/kernel/git/soc/soc.git
> git bisect bad e0326ebe10191447ab8fa2e904080df7b743765e
> # bad: [fbc9c5ac47cef5a2b04aef30c8e990b32dcf2548] Merge branch 'hwmon' of https://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging.git
> git bisect bad fbc9c5ac47cef5a2b04aef30c8e990b32dcf2548
> # bad: [e488171f6f6df6fc899a355079665fdb3c50b0e3] Merge branch 'for-linus' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git
> git bisect bad e488171f6f6df6fc899a355079665fdb3c50b0e3
> # bad: [60db0fcb8fc9d80ac0b63041c632b41a311a45f1] Merge branch 'fs-current' of linux-next
> git bisect bad 60db0fcb8fc9d80ac0b63041c632b41a311a45f1
> # good: [51021d260d682aa17b3533848a99160ab83e0c93] Merge branch 'vfs.fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
> git bisect good 51021d260d682aa17b3533848a99160ab83e0c93
> # good: [ded56474db6552260786a65898322464b72c7540] mm: a second pagecache maintainer
> git bisect good ded56474db6552260786a65898322464b72c7540
> # good: [6c893b948351d42cfc3761cc746ab5b3d03ee7f3] Merge branch 'misc-7.2' into next-fixes
> git bisect good 6c893b948351d42cfc3761cc746ab5b3d03ee7f3
> # good: [bfcc55a14179495b0c41408908fd7b9d7785c694] lib: test_hmm: use device devt for coherent device range selection
> git bisect good bfcc55a14179495b0c41408908fd7b9d7785c694
> # good: [a27318567c92ba5482906d047e71a7aa4fd01889] Merge branch 'fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git
> git bisect good a27318567c92ba5482906d047e71a7aa4fd01889
> # bad: [6887a39652cdfd4cfd3b0962662c9cbc26ce5252] mm/page_vma_mapped: fix device-private PMD handling
> git bisect bad 6887a39652cdfd4cfd3b0962662c9cbc26ce5252
> # good: [2cc6bd0efc264b9ac760c2bc74dff4f521a680a1] MAINTAINERS: s/SeongJae/SJ/
> git bisect good 2cc6bd0efc264b9ac760c2bc74dff4f521a680a1
> # first 'bad' commit: [6887a39652cdfd4cfd3b0962662c9cbc26ce5252] mm/page_vma_mapped: fix device-private PMD handling
>
>>
>> Fixes: 65edfda6f3f2 ("mm/rmap: extend rmap and migration support device-private entries")
>> Cc: <stable@vger.kernel.org>
>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>> Suggested-by: David Hildenbrand <david@kernel.org>
>> Cc: David Hildenbrand <david@kernel.org>
>> Cc: Balbir Singh <balbirs@nvidia.com>
>> Cc: SeongJae Park <sj@kernel.org>
>> Cc: Zi Yan <ziy@nvidia.com>
>> Cc: Lorenzo Stoakes <ljs@kernel.org>
>> Cc: Lance Yang <lance.yang@linux.dev>
>>
>> ---
>> v5:
>> * put device-private pmd handling along with the other two cases
>> * remove thp_migration_supported()
>> v4: https://lore.kernel.org/all/20260624065353.1622-1-richard.weiyang@gmail.com/T/#u
>> * refine subject and commit log based on Lorenzo's suggestion
>> * put pmd device-private entry handling in its own if branch,
>> suggested by Lorenzo
>>
>> v3:
>> * remove cleanup part, only fix the issue for device-private entry
>> * refine user effect description based on Lorenzo's suggestion
>>
>> v2: https://lore.kernel.org/all/20260616063436.20455-1-richard.weiyang@gmail.com/T/#u
>> * specify the possible error case of current code and user visible effect
>> * besides fix, cleanup the pmd entry handling based on David's suggestion
>>
>> v1: https://lore.kernel.org/linux-mm/20260508013728.21285-1-richard.weiyang@gmail.com/
>> ---
>> mm/page_vma_mapped.c | 30 ++++++++++++++++--------------
>> 1 file changed, 16 insertions(+), 14 deletions(-)
>>
>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>> index 2ccbabfb2cc1..2d6c58488e3a 100644
>> --- a/mm/page_vma_mapped.c
>> +++ b/mm/page_vma_mapped.c
>> @@ -243,21 +243,30 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>> */
>> pmde = pmdp_get_lockless(pvmw->pmd);
>>
>> - if (pmd_trans_huge(pmde) || pmd_is_migration_entry(pmde)) {
>> + if (pmd_trans_huge(pmde) || pmd_is_migration_entry(pmde) ||
>> + pmd_is_device_private_entry(pmde)) {
>> pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>> pmde = *pvmw->pmd;
>> - if (!pmd_present(pmde)) {
>> + if (pmd_is_migration_entry(pmde)) {
>> softleaf_t entry;
>>
>> - if (!thp_migration_supported() ||
>> - !(pvmw->flags & PVMW_MIGRATION))
>> + if (!(pvmw->flags & PVMW_MIGRATION))
>> return not_found(pvmw);
>> entry = softleaf_from_pmd(pmde);
>> + if (!check_pmd(softleaf_to_pfn(entry), pvmw))
>> + return not_found(pvmw);
>> + return true;
>> + } else if (pmd_is_device_private_entry(pmde)) {
>> + softleaf_t entry;
>>
>
>> - if (!softleaf_is_migration(entry) ||
>> - !check_pmd(softleaf_to_pfn(entry), pvmw))
>
> My only guess here would be that the compiler evaluates
> !softleaf_is_migration(entry) to always be true and optimises away the
> !check_pmd(softleaf_to_pfn(entry), pvmw) which is why this worked
> before?

Weird, we enter this path only with

	pmd_trans_huge(pmde) || pmd_is_migration_entry(pmde) ||
	pmd_is_device_private_entry(pmde)

If any one of these were not compiled out for !CONFIG_TRANSPARENT_HUGEPAGE,
that would be odd.

pmd_is_device_private_entry() is hard-coded to false unless
CONFIG_ARCH_ENABLE_THP_MIGRATION is set, and that option is only selected
together with TRANSPARENT_HUGEPAGE.

pmd_trans_huge() is hard-coded to false without THP as well.

Maybe it's struggling with pmd_is_migration_entry() on some (older) compilers
(not inlining stuff and not properly optimizing it out)?

The whole conditional must be optimized out.

We could check for IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) right at the start
to make it easier for the compiler:

	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
	    (pmd_trans_huge(pmde) || pmd_is_migration_entry(pmde) ||
	     pmd_is_device_private_entry(pmde))) {
--
Cheers,
David