From: Will Deacon <will@kernel.org>
To: Brian Ruley <brian.ruley@gehealthcare.com>
Cc: "Russell King (Oracle)" <linux@armlinux.org.uk>,
Steve Capper <steve.capper@arm.com>,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, catalin.marinas@arm.com
Subject: Re: [PATCH] mm/arm: pgtable: remove young bit check for pte_valid_user
Date: Mon, 13 Apr 2026 11:58:01 +0100
Message-ID: <adzMOdySgMIePcue@willie-the-truck>
In-Reply-To: <adjYlUk8_JjPivNi@zoo11.fihel.lab.ge-healthcare.net>

On Fri, Apr 10, 2026 at 02:01:41PM +0300, Brian Ruley wrote:
> On Apr 09, Russell King (Oracle) wrote:
> >
> > On Thu, Apr 09, 2026 at 06:17:36PM +0300, Brian Ruley wrote:
> > > However, in the case I describe, if VA_B is mapped immediately to pfn_q
> > > after it has been unmapped and freed for VA_A, then it's quite possible
> > > that the page is still indexed in the cache.
> >
> > True.
> >
> > > The hypothesis is that if
> > > VA_A and VA_B land in the same I-cache set and VA_A's old cache entry
> > > still exists (tagged with pfn_q), then the CPU can fetch stale
> > > instructions because the tag will match. That's one reason why we need
> > > to invalidate the cache, but that will be skipped in the path:
> > >
> > >   migrate_pages
> > >    migrate_pages_batch
> > >     migrate_folio_move
> > >      remove_migration_ptes
> > >       remove_migration_pte
> > >        set_pte_at
> > >         set_ptes
> > >          __sync_icache_dcache (skipped if !young)
> > >          set_pte_ext
> >
> > In this case, if the old PTE was marked !young, then the new PTE will
> > have:
> >         pte = pte_mkold(pte);
> >
> > on it, which marks it !young. As you say, __sync_icache_dcache() will
> > be skipped. While a PTE entry will be set for the kernel, the code in
> > set_pte_ext() will *not* establish a hardware PTE entry. For the
> > 2-level pte code:
> >
> >         tst     r1, #L_PTE_YOUNG        @ <- results in Z being set
> >         tstne   r1, #L_PTE_VALID        @ <- not executed
> >         eorne   r1, r1, #L_PTE_NONE     @ <- not executed
> >         tstne   r1, #L_PTE_NONE         @ <- not executed
> >         moveq   r3, #0                  @ <- hardware PTE value
> >  ARM(   str     r3, [r0, #2048]! )      @ <- writes hardware PTE
> >
> > So, for a !young PTE, the hardware PTE entry is written as zero,
> > which means accesses should fault, which will then cause the PTE to
> > be marked young.
> >
> > For the 3-level case, the L_PTE_YOUNG bit corresponds with the AF bit
> > in the PTE, and there aren't split Linux / hardware PTE entries. AF
> > being clear should result in a page fault being generated for the
> > kernel to handle making the PTE young.
> >
> > In both of these cases, set_ptes() will need to be called with the
> > updated PTE which will now be marked young, and that will result in
> > the I-cache being flushed.
>
> Hi Russell,
>
> Thank you for the clarification, this is very educational for me.
> I understand your scepticism, and I can't explain what's going on based
> on what you replied. However, I do honestly believe there is a problem
> here. I'll share the exact testing details and the instrumentation
> we added that convinced us to reach out at the end. One idea we also
> had was that cache aliasing could be happening here.

I thought a bit more about this over the weekend and started to wonder
if there's a potential race where multiple CPUs try to write the same
PTE and don't synchronise properly on the cache-maintenance.

In particular, PG_dcache_clean is manipulated with a test_and_set_bit()
operation _before_ the cache maintenance is performed, so there's a
small window where the flag is set but the page is _not_ clean. I don't
think that matters with regard to invalid migration entries, but
perhaps the migration just means that we end up putting down a bunch of
'old' entries and are then more likely to see concurrent faults trying
to make the thing young again, potentially hitting this race.
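
The window described above can be illustrated with a minimal userspace
sketch. This is not kernel code: struct page_model, model_test_and_set()
and the other helpers are hypothetical stand-ins, and the two "CPUs" are
modelled as a fixed, single-threaded interleaving rather than real
concurrency.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the page state; not a kernel structure. */
struct page_model {
	bool dcache_clean_flag;	/* models PG_dcache_clean */
	bool caches_synced;	/* true once maintenance has really run */
};

/* Models test_and_set_bit(): set the flag, return its previous value. */
static bool model_test_and_set(bool *flag)
{
	bool old = *flag;

	*flag = true;
	return old;
}

/*
 * Racy ordering, step 1: claim the flag before doing any work.
 * Returns true if the caller is the one who must do the maintenance.
 */
static bool racy_claim(struct page_model *pg)
{
	return !model_test_and_set(&pg->dcache_clean_flag);
}

/* Racy ordering, step 2: the cache maintenance itself. */
static void racy_do_maintenance(struct page_model *pg)
{
	pg->caches_synced = true;
}

/*
 * Replays one interleaving: CPU0 claims the flag and is delayed, then
 * CPU1 handles a fault on the same page. Returns true if CPU1 skipped
 * maintenance while the page had not actually been synced.
 */
static bool racy_interleaving_exposes_stale_page(void)
{
	struct page_model pg = { false, false };
	bool cpu0_must_sync, cpu1_must_sync, cpu1_saw_stale;

	cpu0_must_sync = racy_claim(&pg);	/* CPU0 sets the flag */
	assert(cpu0_must_sync);

	/* CPU0 is delayed here, before the maintenance has run. */
	cpu1_must_sync = racy_claim(&pg);	/* CPU1 sees the flag set */
	cpu1_saw_stale = !cpu1_must_sync && !pg.caches_synced;

	racy_do_maintenance(&pg);		/* CPU0 finally catches up */
	return cpu1_saw_stale;
}
```

Calling racy_interleaving_exposes_stale_page() reports that, in this
model, the second CPU trusts a flag that was set before the work it
advertises had been done.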

Looking at arm64 this morning, I noticed that we split the flag
manipulation so that it's set with a set_bit() after the maintenance has
been performed. Git then points to 588a513d3425 ("arm64: Fix race
condition on PG_dcache_clean in __sync_icache_dcache()") which seems to
talk about the same race. In fact, the mailing list posting:

  https://lore.kernel.org/all/20210514095001.13236-1-catalin.marinas@arm.com/

points out that arch/arm/ is also affected but we forgot to CC Russell
because I think this all came out of the MTE-enablement work [1] and it
sounds like Catalin was trying to fix it in the core mprotect() code.
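
The split ordering (test the flag, do the maintenance, then set the flag)
can be sketched in the same userspace style. Again, this is only a model
under stated assumptions: struct page_model and the fixed_*() helpers are
hypothetical names, and the interleaving is replayed deterministically
rather than run on real CPUs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the page state; not a kernel structure. */
struct page_model {
	bool dcache_clean_flag;	/* models PG_dcache_clean */
	bool caches_synced;	/* true once maintenance has really run */
};

/* Split ordering, phase 1: only test the flag, do not modify it. */
static bool fixed_needs_sync(const struct page_model *pg)
{
	return !pg->dcache_clean_flag;
}

/* Split ordering, phase 2: do the maintenance, then publish the flag. */
static void fixed_sync_then_publish(struct page_model *pg)
{
	pg->caches_synced = true;	/* models the cache maintenance */
	pg->dcache_clean_flag = true;	/* models set_bit() afterwards */
}

/*
 * Replays the same interleaving as the racy case: CPU0 tests the flag
 * and is delayed, then CPU1 handles a fault on the same page. Returns
 * true if CPU1 skipped maintenance while the page was not yet synced.
 */
static bool fixed_interleaving_exposes_stale_page(void)
{
	struct page_model pg = { false, false };
	bool cpu0_needs, cpu1_needs, cpu1_skipped_while_dirty;

	cpu0_needs = fixed_needs_sync(&pg);	/* flag clear: must sync */

	/* CPU0 is delayed here, before the maintenance has run. */
	cpu1_needs = fixed_needs_sync(&pg);	/* flag still clear */
	cpu1_skipped_while_dirty = !cpu1_needs && !pg.caches_synced;

	if (cpu1_needs)
		fixed_sync_then_publish(&pg);
	if (cpu0_needs)
		fixed_sync_then_publish(&pg);	/* redundant but harmless */

	/* The flag is only ever observed set once the sync has happened. */
	assert(pg.dcache_clean_flag && pg.caches_synced);
	return cpu1_skipped_while_dirty;
}
```

The cost in this model is that both CPUs may perform the maintenance, but
neither can observe the flag set before the page has been synced.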

Brian, can you try something like 588a513d3425?

Will

[1] https://lore.kernel.org/all/YJGHApOCXl811VK3@arm.com/