From: "Alexandre Courbot" <acourbot@nvidia.com>
To: "Joel Fernandes" <joelagnelf@nvidia.com>
Cc: "Eliot Courtney" <ecourtney@nvidia.com>,
"Danilo Krummrich" <dakr@kernel.org>,
<linux-kernel@vger.kernel.org>, "Miguel Ojeda" <ojeda@kernel.org>,
"Boqun Feng" <boqun@kernel.org>, "Gary Guo" <gary@garyguo.net>,
"Bjorn Roy Baron" <bjorn3_gh@protonmail.com>,
"Benno Lossin" <lossin@kernel.org>,
"Andreas Hindborg" <a.hindborg@kernel.org>,
"Alice Ryhl" <aliceryhl@google.com>,
"Trevor Gross" <tmgross@umich.edu>,
"Dave Airlie" <airlied@redhat.com>,
"Daniel Almeida" <daniel.almeida@collabora.com>,
"Koen Koning" <koen.koning@linux.intel.com>,
<dri-devel@lists.freedesktop.org>,
<rust-for-linux@vger.kernel.org>,
"Nikola Djukic" <ndjukic@nvidia.com>,
"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
"Maxime Ripard" <mripard@kernel.org>,
"Thomas Zimmermann" <tzimmermann@suse.de>,
"David Airlie" <airlied@gmail.com>,
"Simona Vetter" <simona@ffwll.ch>,
"Jonathan Corbet" <corbet@lwn.net>,
"Alex Deucher" <alexander.deucher@amd.com>,
"Christian Koenig" <christian.koenig@amd.com>,
"Jani Nikula" <jani.nikula@linux.intel.com>,
"Joonas Lahtinen" <joonas.lahtinen@linux.intel.com>,
"Rodrigo Vivi" <rodrigo.vivi@intel.com>,
"Tvrtko Ursulin" <tursulin@ursulin.net>,
"Huang Rui" <ray.huang@amd.com>,
"Matthew Auld" <matthew.auld@intel.com>,
"Matthew Brost" <matthew.brost@intel.com>,
"Lucas De Marchi" <lucas.demarchi@intel.com>,
"Thomas Hellstrom" <thomas.hellstrom@linux.intel.com>,
"Helge Deller" <deller@gmx.de>,
"Alex Gaynor" <alex.gaynor@gmail.com>,
"Boqun Feng" <boqun.feng@gmail.com>,
"John Hubbard" <jhubbard@nvidia.com>,
"Alistair Popple" <apopple@nvidia.com>,
"Timur Tabi" <ttabi@nvidia.com>, "Edwin Peer" <epeer@nvidia.com>,
"Andrea Righi" <arighi@nvidia.com>,
"Andy Ritger" <aritger@nvidia.com>, "Zhi Wang" <zhiw@nvidia.com>,
"Balbir Singh" <balbirs@nvidia.com>,
"Philipp Stanner" <phasta@kernel.org>,
"Elle Rhumsaa" <elle@weathered-steel.dev>, <alexeyi@nvidia.com>,
<joel@joelfernandes.org>, <linux-doc@vger.kernel.org>,
<amd-gfx@lists.freedesktop.org>,
<intel-gfx@lists.freedesktop.org>,
<intel-xe@lists.freedesktop.org>, <linux-fbdev@vger.kernel.org>
Subject: Re: [PATCH v10 12/21] gpu: nova-core: mm: Add unified page table entry wrapper enums
Date: Thu, 09 Apr 2026 19:56:41 +0900 [thread overview]
Message-ID: <DHOKJ3MJNO5P.SXKOAYKX13JL@nvidia.com> (raw)
In-Reply-To: <2f004511-61d1-4197-84b6-cddcdd275e55@nvidia.com>
On Thu Apr 9, 2026 at 5:19 AM JST, Joel Fernandes wrote:
> Hi Alex, Eliot, Danilo,
>
> Thanks for taking a look. Let me respond to the specific points below.
>
> On Wed, 08 Apr 2026, Alexandre Courbot wrote:
>> After a quick look I'd say that having a trait here would actually be
>> *good* for correctness and maintainability.
>>
>> The current design implies that every operation on a page table (most
>> likely using the walker) goes through a branching point. Just looking at
>> `PtWalk::read_pte_at_level`, there are already at least 6
>> `if version == 2 { } else { }` branches that all resolve to the same
>> result. Include walking down the PDEs and you have at least a dozen of
>> these just to resolve a virtual address. I know CPUs are fast, but this
>> is still wasted cycles for no good reason.
>
> I did some measurements and there is no notieceable difference in both
> approaches. I ran perf and loaded nova with self-tests running. The extra
> potential branching is lost in the noise. In both cases, loading nova and
> running the self-tests has ~119.7M branch instructions on my Ampere. The total
> instruction count is also identical (~615M).
That's expected - as I said, CPUs are fast - and that's also not my
point. My issue is that we are doing countless tests that all resolve to
the code path, a code path that is already known during probe time.
That's a huge code smell.
When we create the GPU, we know whether we will be using v2 or v3 page
tables. That we need to test that again 12 times per address resolution
is a design issue, irrespective of performance. There are 24 version
match sites in patch 12 alone.
And that's precisely a good justification for using monomorphization. v2
and v3 are technically two different page table implementations (they
even have their own distinct module in your series), we just use
generics to factorize the (source) code a bit.
>
> I measured like this:
> perf stat -e
> branches,branch-misses,cache-references,cache-misses,instructions,cycles --
> modprobe nova_core
>
> So I think the branching argument is not a strong one. I also did more
> measurements and the dominant time taken is MMIO. During the map prep and
> execute, page table walks are done. A TLB flush alone costs ~1.4 microseconds.
> And PRAMIN BAR0 writes to write the PTE is also about 1 microsecond. Considering
> this, I don't think the extra branching argument holds (even without branch
> prediction and speculation).
>
> Also some branches cannot be eliminated even with parameterization:
>
> if level == self.mmu_version.dual_pde_level() {
> // 128-bit dual PDE read
> } else {
> // Regular 64-bit PDE read
> }
>
> This isn't really a version branch -- it's a structural branch that
> distinguishes between 64-bit PDE and 128-bit dual PDE entries. Any MMU
> version with a dual PDE level would need this same distinction.
The dual PDE level should be an associated constant - you still need to
do the test, but note that you would also do it if there was only a
single page table version. It's orthogonal to whether we use a trait or
not here.
>
> I also did code-generation size analysis (see diff of code used below):
>
> Code generation analysis:
>
> Module .ko size: Before: 511,792 bytes After: 524,464 bytes (+2.5%)
> .text section: Before: 112,620 bytes After: 116,628 bytes (+4,008 bytes)
>
> The +4K .text growth is the monomorphization cost: every generic function
> is compiled twice (once for MmuV2, once for MmuV3).
I would say this is working as intended then.
>
>> If you use a trait here, and make `PtWalk` generic against it, you can
>> optimize this away. We had a similar situation when we introduced Turing
>> support and the v2 ucode header, and tried both approaches: the
>> trait-based one was slightly shorter, and arguably more readable.
>
> Actually I was the one who suggested traits for Falcon ucode descriptor if you
> see this thread [1]. So basically you and Eliot are telling me to do what I
> suggested in [1]. :-) However, I disagree that it is the right choice for this code.
>
> [1] https://lore.kernel.org/all/20251117231028.GA1095236@joelbox2/
>
> I think the two cases are quite different in complexity:
Exactly. The complexity is different (this one involves multiple traits
and associated types) but the pattern is the same - and that's a pattern
traits are designed to address. If we were supposed to stop applying it
when things go beyond a certain level of complexity, the conceptors of
Rust would not have bothered addings things like associated types.
These traits are nothing new, they simply formalize a reality that
already exists in your code, which is that each version of the page
table needs to implement a given set of methods. It's already there with
the version doing dispatches, only it is not articulated clearly to the
reader. So in that respect, having traits make the code *more* readable
imho.
>
> The falcon ucode descriptor is essentially a set of flat field accessors
> and a few params (imem_sec_load_params, dmem_load_params).
> The trait has ~10 simple getter methods. There's no multi-level hierarchy,
> no walker, and no generic propagation.
>
> The MMU page table case is structurally different. Making PtWalk generic
> over an Mmu trait would require:
>
> - PtWalk<M: Mmu> (the walker)
> - Plus all the associated types: M::Pte, M::Pde, M::DualPde each
> needing their own trait bounds
>
> And we would also need:
> - Vmm<M: Mmu> (which creates PtWalk)
> - BarUser<M: Mmu> (which creates Vmm)
>
> I am also against making Vmm an enum as Eliot suggested:
> enum Vmm {
> V2(VmmInner<MmuV2>),
> V3(VmmInner<MmuV3>),
> }
>
> That moves the version complexity up to the reader. Code complexity IMO should
> decrease as we go up abstractions, making it easier for users (Vmm/Bar).
>
> If you look at the the changes in vmm.rs to handle version dispatch there [2]:
> Added: +109
> Removed: -28
>
> [2]
> https://github.com/Edgeworth/linux/commit/3627af550b61256184d589e7ec666c1108971f0e
>
> The main benefit of my approach is version-specific dispatch complexity is
> completely isolated inside MmuVersion thus making the code outside of
> pagetable.rs much more readable, without having to parametrize anything, and
> without code size increase. I think that is worth considering.
>
>> But the main argument to use a trait here IMO is that it enables
>> associated types and constants. That's particularly critical since some
>> equivalent fields have different lengths between v2 and v3. An
>> associated `Bounded` type for these would force the caller to validate
>> the length of these fields before calling a non-fallible operation,
>> which is exactly the level of caution that we want when dealing with
>> page tables.
>
> I think Bounded validation is orthogonal to the dispatch model.
> We can add Bounded to the current design without restructuring
> into traits. For example:
>
> // In ver2::Pte
> pub fn new_vram(pfn: Bounded<Pfn, 25>, writable: bool) -> Self { ... }
>
> // In ver3::Pte
> pub fn new_vram(pfn: Bounded<Pfn, 40>, writable: bool) -> Self { ... }
>
> The unified Pte enum wrapper already dispatches to the correct
> version-specific constructor, which would enforce the correct Bounded
> constraint for that version.
But then what type does the `new_vram` dispatch method take? Generic
code lets us expose the expected `Bounded` type to the caller, which can
do the proper validation. This is a small example, but I expect this
pattern to come up in other parts of the code as well.
>
>> In order to fully benefit from it, we will need the bitfield macro from
>> the `kernel` crate so the PDE/PTE fields can be `Bounded`, I will try to
>> make it available quickly in a patch that you can depend on.
>
> That would be great, and I'd be happy to integrate Bounded validation once
> the macro is available. I just don't think we need to restructure the
> dispatch model in order to benefit from it.
I'll finish the series and hopefully send it a bit later today. That's
another significant rework for the series (sorry about that) but it
should be worth the effort for the added correctness.
next prev parent reply other threads:[~2026-04-09 10:56 UTC|newest]
Thread overview: 95+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-11 0:39 [PATCH v9 00/23] gpu: nova-core: Add memory management support Joel Fernandes
2026-03-11 0:39 ` [PATCH v9 01/23] gpu: nova-core: Select GPU_BUDDY for VRAM allocation Joel Fernandes
2026-03-12 6:34 ` Eliot Courtney
2026-03-16 13:17 ` Alexandre Courbot
2026-03-16 16:28 ` Joel Fernandes
2026-03-11 0:39 ` [PATCH v9 02/23] gpu: nova-core: Kconfig: Sort select statements alphabetically Joel Fernandes
2026-03-12 6:35 ` Eliot Courtney
2026-03-16 13:17 ` Alexandre Courbot
2026-03-16 16:28 ` Joel Fernandes
2026-03-11 0:39 ` [PATCH v9 03/23] gpu: nova-core: gsp: Return GspStaticInfo from boot() Joel Fernandes
2026-03-12 6:37 ` Eliot Courtney
2026-03-11 0:39 ` [PATCH v9 04/23] gpu: nova-core: gsp: Extract usable FB region from GSP Joel Fernandes
2026-03-13 6:58 ` Eliot Courtney
2026-04-01 23:23 ` Joel Fernandes
2026-03-16 13:18 ` Alexandre Courbot
2026-03-16 16:57 ` Joel Fernandes
2026-03-11 0:39 ` [PATCH v9 05/23] gpu: nova-core: gsp: Expose total physical VRAM end from FB region info Joel Fernandes
2026-03-16 13:19 ` Alexandre Courbot
2026-03-16 17:00 ` Joel Fernandes
2026-03-11 0:39 ` [PATCH v9 06/23] gpu: nova-core: mm: Add support to use PRAMIN windows to write to VRAM Joel Fernandes
2026-03-11 0:39 ` [PATCH v9 07/23] docs: gpu: nova-core: Document the PRAMIN aperture mechanism Joel Fernandes
2026-03-11 0:39 ` [PATCH v9 08/23] gpu: nova-core: mm: Add common memory management types Joel Fernandes
2026-03-11 0:39 ` [PATCH v9 09/23] gpu: nova-core: mm: Add TLB flush support Joel Fernandes
2026-03-11 0:39 ` [PATCH v9 10/23] gpu: nova-core: mm: Add GpuMm centralized memory manager Joel Fernandes
2026-03-11 0:39 ` [PATCH v9 11/23] gpu: nova-core: mm: Add common types for all page table formats Joel Fernandes
2026-03-11 0:39 ` [PATCH v9 12/23] gpu: nova-core: mm: Add MMU v2 page table types Joel Fernandes
2026-03-11 0:39 ` [PATCH v9 13/23] gpu: nova-core: mm: Add MMU v3 " Joel Fernandes
2026-03-11 0:39 ` [PATCH v9 14/23] gpu: nova-core: mm: Add unified page table entry wrapper enums Joel Fernandes
2026-03-11 0:40 ` [PATCH v9 15/23] gpu: nova-core: mm: Add page table walker for MMU v2/v3 Joel Fernandes
2026-03-11 0:40 ` [PATCH v9 16/23] gpu: nova-core: mm: Add Virtual Memory Manager Joel Fernandes
2026-03-11 0:40 ` [PATCH v9 17/23] gpu: nova-core: mm: Add virtual address range tracking to VMM Joel Fernandes
2026-03-11 0:40 ` [PATCH v9 18/23] gpu: nova-core: mm: Add multi-page mapping API " Joel Fernandes
2026-03-11 0:40 ` [PATCH v9 19/23] gpu: nova-core: Add BAR1 aperture type and size constant Joel Fernandes
2026-03-11 0:40 ` [PATCH v9 20/23] gpu: nova-core: mm: Add BAR1 user interface Joel Fernandes
2026-03-11 0:40 ` [PATCH v9 21/23] gpu: nova-core: mm: Add BAR1 memory management self-tests Joel Fernandes
2026-03-11 0:40 ` [PATCH v9 22/23] gpu: nova-core: mm: Add PRAMIN aperture self-tests Joel Fernandes
2026-03-11 0:40 ` [PATCH v9 23/23] gpu: nova-core: Use runtime BAR1 size instead of hardcoded 256MB Joel Fernandes
2026-03-31 21:20 ` [PATCH v10 00/21] gpu: nova-core: Add memory management support Joel Fernandes
2026-03-31 21:20 ` [PATCH v10 01/21] gpu: nova-core: gsp: Return GspStaticInfo from boot() Joel Fernandes
2026-04-01 8:25 ` Eliot Courtney
2026-04-08 7:34 ` Alexandre Courbot
2026-03-31 21:20 ` [PATCH v10 02/21] gpu: nova-core: gsp: Extract usable FB region from GSP Joel Fernandes
2026-04-01 8:27 ` Eliot Courtney
2026-04-01 23:24 ` Joel Fernandes
2026-04-02 5:49 ` Eliot Courtney
2026-04-06 18:56 ` Joel Fernandes
2026-04-08 7:33 ` Alexandre Courbot
2026-03-31 21:20 ` [PATCH v10 03/21] gpu: nova-core: gsp: Expose total physical VRAM end from FB region info Joel Fernandes
2026-04-02 5:37 ` Eliot Courtney
2026-04-06 19:42 ` Joel Fernandes
2026-04-06 21:08 ` Joel Fernandes
2026-03-31 21:20 ` [PATCH v10 04/21] gpu: nova-core: mm: Add support to use PRAMIN windows to write to VRAM Joel Fernandes
2026-03-31 21:20 ` [PATCH v10 05/21] docs: gpu: nova-core: Document the PRAMIN aperture mechanism Joel Fernandes
2026-03-31 21:20 ` [PATCH v10 06/21] gpu: nova-core: mm: Add common memory management types Joel Fernandes
2026-03-31 21:20 ` [PATCH v10 07/21] gpu: nova-core: mm: Add TLB flush support Joel Fernandes
2026-04-02 5:49 ` Eliot Courtney
2026-04-06 20:50 ` Joel Fernandes
2026-04-02 5:59 ` Matthew Brost
2026-04-06 21:24 ` Joel Fernandes
2026-04-06 22:10 ` Joel Fernandes
2026-04-07 5:14 ` Matthew Brost
2026-04-08 7:40 ` Alexandre Courbot
2026-03-31 21:20 ` [PATCH v10 08/21] gpu: nova-core: mm: Add GpuMm centralized memory manager Joel Fernandes
2026-03-31 21:20 ` [PATCH v10 09/21] gpu: nova-core: mm: Add common types for all page table formats Joel Fernandes
2026-03-31 21:20 ` [PATCH v10 10/21] gpu: nova-core: mm: Add MMU v2 page table types Joel Fernandes
2026-04-02 5:41 ` Eliot Courtney
2026-04-06 21:14 ` Joel Fernandes
2026-03-31 21:20 ` [PATCH v10 11/21] gpu: nova-core: mm: Add MMU v3 " Joel Fernandes
2026-03-31 21:20 ` [PATCH v10 12/21] gpu: nova-core: mm: Add unified page table entry wrapper enums Joel Fernandes
2026-04-02 5:40 ` Eliot Courtney
2026-04-06 21:55 ` Joel Fernandes
2026-04-07 13:42 ` Eliot Courtney
2026-04-07 13:59 ` Joel Fernandes
2026-04-08 7:03 ` Alexandre Courbot
2026-04-08 20:19 ` Joel Fernandes
2026-04-09 10:56 ` Alexandre Courbot [this message]
2026-04-09 11:02 ` Gary Guo
2026-04-08 13:26 ` Eliot Courtney
2026-04-08 16:58 ` Joel Fernandes
2026-04-08 18:01 ` Danilo Krummrich
2026-04-08 19:04 ` Joel Fernandes
2026-04-08 23:13 ` John Hubbard
2026-04-09 10:33 ` Joel Fernandes
2026-04-09 11:00 ` Danilo Krummrich
2026-04-09 18:02 ` John Hubbard
2026-03-31 21:20 ` [PATCH v10 13/21] gpu: nova-core: mm: Add page table walker for MMU v2/v3 Joel Fernandes
2026-03-31 21:20 ` [PATCH v10 14/21] gpu: nova-core: mm: Add Virtual Memory Manager Joel Fernandes
2026-03-31 21:20 ` [PATCH v10 15/21] gpu: nova-core: mm: Add virtual address range tracking to VMM Joel Fernandes
2026-03-31 21:20 ` [PATCH v10 16/21] gpu: nova-core: mm: Add multi-page mapping API " Joel Fernandes
2026-03-31 21:20 ` [PATCH v10 17/21] gpu: nova-core: Add BAR1 aperture type and size constant Joel Fernandes
2026-03-31 21:20 ` [PATCH v10 18/21] gpu: nova-core: mm: Add BAR1 user interface Joel Fernandes
2026-03-31 21:20 ` [PATCH v10 19/21] gpu: nova-core: mm: Add BAR1 memory management self-tests Joel Fernandes
2026-03-31 21:20 ` [PATCH v10 20/21] gpu: nova-core: mm: Add PRAMIN aperture self-tests Joel Fernandes
2026-03-31 21:20 ` [PATCH v10 21/21] gpu: nova-core: Use runtime BAR1 size instead of hardcoded 256MB Joel Fernandes
2026-04-02 5:54 ` Eliot Courtney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=DHOKJ3MJNO5P.SXKOAYKX13JL@nvidia.com \
--to=acourbot@nvidia.com \
--cc=a.hindborg@kernel.org \
--cc=airlied@gmail.com \
--cc=airlied@redhat.com \
--cc=alex.gaynor@gmail.com \
--cc=alexander.deucher@amd.com \
--cc=alexeyi@nvidia.com \
--cc=aliceryhl@google.com \
--cc=amd-gfx@lists.freedesktop.org \
--cc=apopple@nvidia.com \
--cc=arighi@nvidia.com \
--cc=aritger@nvidia.com \
--cc=balbirs@nvidia.com \
--cc=bjorn3_gh@protonmail.com \
--cc=boqun.feng@gmail.com \
--cc=boqun@kernel.org \
--cc=christian.koenig@amd.com \
--cc=corbet@lwn.net \
--cc=dakr@kernel.org \
--cc=daniel.almeida@collabora.com \
--cc=deller@gmx.de \
--cc=dri-devel@lists.freedesktop.org \
--cc=ecourtney@nvidia.com \
--cc=elle@weathered-steel.dev \
--cc=epeer@nvidia.com \
--cc=gary@garyguo.net \
--cc=intel-gfx@lists.freedesktop.org \
--cc=intel-xe@lists.freedesktop.org \
--cc=jani.nikula@linux.intel.com \
--cc=jhubbard@nvidia.com \
--cc=joel@joelfernandes.org \
--cc=joelagnelf@nvidia.com \
--cc=joonas.lahtinen@linux.intel.com \
--cc=koen.koning@linux.intel.com \
--cc=linux-doc@vger.kernel.org \
--cc=linux-fbdev@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=lossin@kernel.org \
--cc=lucas.demarchi@intel.com \
--cc=maarten.lankhorst@linux.intel.com \
--cc=matthew.auld@intel.com \
--cc=matthew.brost@intel.com \
--cc=mripard@kernel.org \
--cc=ndjukic@nvidia.com \
--cc=ojeda@kernel.org \
--cc=phasta@kernel.org \
--cc=ray.huang@amd.com \
--cc=rodrigo.vivi@intel.com \
--cc=rust-for-linux@vger.kernel.org \
--cc=simona@ffwll.ch \
--cc=thomas.hellstrom@linux.intel.com \
--cc=tmgross@umich.edu \
--cc=ttabi@nvidia.com \
--cc=tursulin@ursulin.net \
--cc=tzimmermann@suse.de \
--cc=zhiw@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox