From: Pedro Falcato <pfalcato@suse.de>
To: Muhammad Usama Anjum <usama.anjum@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Lorenzo Stoakes <ljs@kernel.org>,
David Hildenbrand <david@kernel.org>,
"Liam R. Howlett" <liam@infradead.org>,
Mike Rapoport <rppt@kernel.org>,
Ryan Roberts <ryan.roberts@arm.com>,
Anshuman Khandual <anshuman.khandual@arm.com>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>,
Samuel Holland <samuel.holland@sifive.com>,
linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org
Subject: Re: mm: opaque hardware page-table entry handles
Date: Thu, 25 Jun 2026 12:08:19 +0100
Message-ID: <aj0JU-AzsEQBOiQ1@pedro-suse>
In-Reply-To: <66310292-f618-4497-bcaa-2a4b1240566c@arm.com>
On Thu, Jun 25, 2026 at 11:50:28AM +0100, Muhammad Usama Anjum wrote:
> On 24/06/2026 8:25 pm, Pedro Falcato wrote:
> > On Wed, Jun 24, 2026 at 03:09:08PM +0100, Usama Anjum wrote:
> >> Hi all,
> >>
> >> This is a direction-check with the wider community before spending time on the
> >> development. This picks up the idea that was raised and broadly agreed in the
> >> earlier thread (Ryan Roberts, Lorenzo Stoakes, David Hildenbrand) [1].
> >>
> >> The problem
> >> -----------
> >> Core MM code reaches page-table entries by raw pointer dereference (pte_t *,
> >> pmd_t *, *pud, ...) in places, implicitly assuming a single, uniform
> >> representation. Sprinkling getters wouldn't solve the problem entirely. The
> >> problem is one level up: the *pointer type* itself is overloaded. At each level
> >> there are really three distinct things:
> >>
> >> 1. a page-table entry value (pte_t, pmd_t, ...)
> >> 2. a pointer to an entry value, e.g. a pXX_t on the stack
> >> 3. a pointer to a live entry in the hardware page table
> >>
> >> Today (2) and (3) share the same type - pte_t *, pmd_t *, and so on. Nothing
> >> distinguishes a pointer into a live table from a pointer to a stack copy.
> >>
> >> A pointer to an on-stack entry value and a pointer to a live hardware entry have
> >> the same type, so the compiler cannot distinguish them. Passing the stack
> >> pointer to an arch helper that expects a hardware-entry pointer compiles fine,
> >> but is wrong - a bug class the type system makes invisible. It also blocks
> >> evolution: an arch helper may need to read beyond the addressed entry (e.g.
> >> adjacent or contiguous entries), which only makes sense for a real page-table
> >> pointer, not a stack copy.
> >>
> >> The idea
> >> --------
> >> Give (3) its own opaque type that cannot be dereferenced:
> >>
> >> /* opaque handle to a HW page-table entry; not dereferenceable */
> >> typedef struct {
> >> 	pte_t *ptr;
> >> } hw_ptep;
> >
> > I don't love typedefs that hide pointers.
> Nobody likes them. This is the only way to keep stack pointers from being
> reintroduced by mistake. It's also hard to catch such cases during review.
That's not true. You could have:

	typedef struct { pteval_t pte; } sw_pte_t;

and

	/*
	 * Only usable by arch code and whoever wants to interpret these
	 * types.
	 */
	static inline pte_t *sw_to_ptep(sw_pte_t *swptep)
	{
		return (pte_t *)swptep;
	}

and so on... Also, see Documentation/process/coding-style.rst, section 5
("Typedefs"): it explicitly warns against pointer typedefs.
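
To make the idea a bit more concrete, here is a rough user-space sketch,
not a proposed patch. pteval_t and pte_t are stand-ins so it builds outside
the kernel, and sw_pte_snapshot() and hw_entry_read() are hypothetical names
made up for illustration. The point is only that a distinct sw_pte_t makes
the compiler complain when a stack copy is passed where a live entry is
expected:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the kernel types so this builds in user space. */
typedef uint64_t pteval_t;
typedef struct { pteval_t pte; } pte_t;		/* would mean "live HW entry" only */
typedef struct { pteval_t pte; } sw_pte_t;	/* a local (e.g. on-stack) copy */

/* Same layout, so arch code can reinterpret one as the other. */
static_assert(sizeof(sw_pte_t) == sizeof(pte_t), "layouts must match");

/*
 * Escape hatch for arch code only. The kernel builds with
 * -fno-strict-aliasing; strictly conforming C should not dereference
 * the result.
 */
static inline pte_t *sw_to_ptep(sw_pte_t *swptep)
{
	return (pte_t *)swptep;
}

/* Hypothetical helper that only makes sense on a live table entry. */
static inline pteval_t hw_entry_read(const pte_t *ptep)
{
	return ptep->pte;
}

/* Hypothetical helper: take a local copy of a live entry. */
static inline sw_pte_t sw_pte_snapshot(const pte_t *ptep)
{
	sw_pte_t copy = { .pte = ptep->pte };

	return copy;
}

/*
 * With these types, hw_entry_read(&copy) for a 'sw_pte_t copy' is an
 * incompatible-pointer-type diagnostic instead of a silent bug.
 */
```

Core code would never call sw_to_ptep(); it would pass sw_pte_t values or
pointers around and leave the reinterpretation to the arch.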
>
> >
> >>
> >> With this:
> >>
> >> - a stack value can no longer masquerade as a hardware table entry,
> >> - a hardware handle can no longer be raw-dereferenced,
> >> - cases that genuinely operate on a value can be refactored to pass the value
> >> and let the caller, which knows whether it holds a handle or a stack copy,
> >> read it once.
> >
> > Just a small passing comment: how about doing it differently? like
> >
> > typedef struct {
> > 	pte_t *ptep;
> > } sw_ptep_t;
> >
> > or something like that. Were I to guess, referring to a pte_t on the stack
> > is much rarer than all the pte_t references to actual page tables. But maybe
> > reality doesn't match up with my guess :)
> We want to fix the current usages and future usages as well. sw_ptep_t can work
> for current usages, but it'll not force the new code to be written using correct
> notations.
I don't understand what you mean. pte_t is a perfectly correct notation,
it's just currently maybe too ambiguously overloaded.
> Apart from the distinct types, another benefit of hw_pXXp would be that
> it'll become an opaque object which only the architecture can manipulate. Hence
> the architecture can decide however it wants to manage them in certain cases.
That's already the case. pte_t is fully opaque apart from the little fact
that you can declare one on your stack. Introducing a different sw_pte_t
would further reinforce that. And if you want ways to find raw derefs on
pointers, we can simply slap on __attribute__((noderef)) (available in
sparse and clang) on those types after sw_pte_t is introduced and pte_t
is unambiguously a "hardware" PTE.
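
Something along these lines, again as a user-space sketch under
assumptions: __hw_pte and demo_ptep_get() are hypothetical names, the types
are stand-ins, and the annotation is only enabled under sparse
(__CHECKER__), mirroring how the kernel gates annotations such as __iomem
and __force. Enabling it for clang as well is left out here.

```c
#include <assert.h>
#include <stdint.h>

#ifdef __CHECKER__			/* defined when sparse runs */
# define __hw_pte	__attribute__((noderef))
# define __force	__attribute__((force))
#else
# define __hw_pte
# define __force
#endif

/* Stand-ins for the kernel types so this builds in user space. */
typedef uint64_t pteval_t;
typedef struct { pteval_t pte; } pte_t;

static_assert(sizeof(pte_t) == sizeof(pteval_t), "wrapper adds no padding");

/*
 * Hypothetical accessor: the one place allowed to strip the annotation
 * and actually read the entry.
 */
static inline pte_t demo_ptep_get(const pte_t __hw_pte *ptep)
{
	return *(const pte_t __force *)ptep;
}

static inline pteval_t peek(const pte_t __hw_pte *ptep)
{
	/* 'return ptep->pte;' here is what the checker is meant to flag. */
	return demo_ptep_get(ptep).pte;
}
```

In a regular build both macros expand to nothing, so this costs nothing at
runtime; the checking only happens when the tree is run through the checker.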
I dunno, I'm not convinced that changing ~450 files is worth it, and
_if_ we want to do something like this I would strongly prefer the way that
is less churny.
--
Pedro