From: Joel Fernandes <joelagnelf@nvidia.com>
To: Gary Guo <gary@garyguo.net>,
Alexandre Courbot <acourbot@nvidia.com>,
Eliot Courtney <ecourtney@nvidia.com>,
Danilo Krummrich <dakr@kernel.org>
Cc: linux-kernel@vger.kernel.org, Miguel Ojeda <ojeda@kernel.org>,
Boqun Feng <boqun@kernel.org>,
Bjorn Roy Baron <bjorn3_gh@protonmail.com>,
Benno Lossin <lossin@kernel.org>,
Andreas Hindborg <a.hindborg@kernel.org>,
Alice Ryhl <aliceryhl@google.com>,
Trevor Gross <tmgross@umich.edu>,
Dave Airlie <airlied@redhat.com>,
Daniel Almeida <daniel.almeida@collabora.com>,
Koen Koning <koen.koning@linux.intel.com>,
dri-devel@lists.freedesktop.org, rust-for-linux@vger.kernel.org,
Nikola Djukic <ndjukic@nvidia.com>,
Maarten Lankhorst <maarten.lankhorst@linux.intel.com>,
Maxime Ripard <mripard@kernel.org>,
Thomas Zimmermann <tzimmermann@suse.de>,
David Airlie <airlied@gmail.com>, Simona Vetter <simona@ffwll.ch>,
Jonathan Corbet <corbet@lwn.net>,
Alex Deucher <alexander.deucher@amd.com>,
Christian Koenig <christian.koenig@amd.com>,
Jani Nikula <jani.nikula@linux.intel.com>,
Joonas Lahtinen <joonas.lahtinen@linux.intel.com>,
Rodrigo Vivi <rodrigo.vivi@intel.com>,
Tvrtko Ursulin <tursulin@ursulin.net>,
Huang Rui <ray.huang@amd.com>,
Matthew Auld <matthew.auld@intel.com>,
Matthew Brost <matthew.brost@intel.com>,
Lucas De Marchi <lucas.demarchi@intel.com>,
Thomas Hellstrom <thomas.hellstrom@linux.intel.com>,
Helge Deller <deller@gmx.de>, Alex Gaynor <alex.gaynor@gmail.com>,
Boqun Feng <boqun.feng@gmail.com>,
John Hubbard <jhubbard@nvidia.com>,
Alistair Popple <apopple@nvidia.com>,
Timur Tabi <ttabi@nvidia.com>, Edwin Peer <epeer@nvidia.com>,
Andrea Righi <arighi@nvidia.com>,
Andy Ritger <aritger@nvidia.com>, Zhi Wang <zhiw@nvidia.com>,
Balbir Singh <balbirs@nvidia.com>,
Philipp Stanner <phasta@kernel.org>,
Elle Rhumsaa <elle@weathered-steel.dev>,
alexeyi@nvidia.com, joel@joelfernandes.org,
linux-doc@vger.kernel.org, amd-gfx@lists.freedesktop.org,
intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org,
linux-fbdev@vger.kernel.org
Subject: Re: [PATCH v10 12/21] gpu: nova-core: mm: Add unified page table entry wrapper enums
Date: Mon, 13 Apr 2026 16:25:28 -0400
Message-ID: <a675d5fa-29d2-4420-9b82-18a7027fc769@nvidia.com>
In-Reply-To: <DHOKN9XVOTIB.1A4JY7CDJFWPS@garyguo.net>
On 4/9/2026 7:02 AM, Gary Guo wrote:
> On Wed Apr 8, 2026 at 9:19 PM BST, Joel Fernandes wrote:
>> Hi Alex, Eliot, Danilo,
>>
>> Thanks for taking a look. Let me respond to the specific points below.
>>
>> On Wed, 08 Apr 2026, Alexandre Courbot wrote:
>>> After a quick look I'd say that having a trait here would actually be
>>> *good* for correctness and maintainability.
>>>
>>> The current design implies that every operation on a page table (most
>>> likely using the walker) goes through a branching point. Just looking at
>>> `PtWalk::read_pte_at_level`, there are already at least 6
>>> `if version == 2 { } else { }` branches that all resolve to the same
>>> result. Include walking down the PDEs and you have at least a dozen of
>>> these just to resolve a virtual address. I know CPUs are fast, but this
>>> is still wasted cycles for no good reason.
>>
>> I did some measurements and there is no noticeable difference between the
>> two approaches. I ran perf and loaded nova with the self-tests running. The
>> extra potential branching is lost in the noise. In both cases, loading nova
>> and running the self-tests executes ~119.7M branch instructions on my
>> Ampere. The total instruction count is also identical (~615M).
>>
>> I measured like this:
>>
>>   perf stat -e branches,branch-misses,cache-references,cache-misses,\
>>       instructions,cycles -- modprobe nova_core
>>
>> So I think the branching argument is not a strong one. I also did more
>> measurements, and the dominant cost is MMIO. Page table walks happen
>> during the map prepare and execute phases: a TLB flush alone costs ~1.4
>> microseconds, and a PRAMIN BAR0 write of a PTE also costs about 1
>> microsecond. Considering this, I don't think the extra branching argument
>> holds (even without branch prediction and speculation).
>>
>> Also some branches cannot be eliminated even with parameterization:
>>
>>     if level == self.mmu_version.dual_pde_level() {
>>         // 128-bit dual PDE read
>>     } else {
>>         // Regular 64-bit PDE read
>>     }
>>
>> This isn't really a version branch -- it's a structural branch that
>> distinguishes between 64-bit PDE and 128-bit dual PDE entries. Any MMU
>> version with a dual PDE level would need this same distinction.
>>
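
To make that concrete, here is a sketch of the structural branch (the names
and the dual-PDE levels below are placeholders for illustration, not the
actual nova-core code):

```rust
// Sketch of the structural (not version) branch discussed above.
// Names and the concrete dual-PDE levels are placeholders.

#[derive(Clone, Copy)]
enum MmuVersion {
    V2,
    V3,
}

impl MmuVersion {
    /// Page-directory level that holds 128-bit dual PDEs.
    /// The concrete levels here are made up for the example.
    fn dual_pde_level(self) -> u32 {
        match self {
            MmuVersion::V2 => 3,
            MmuVersion::V3 => 4,
        }
    }
}

/// Width in bits of the PDE read at `level`. The branch distinguishes
/// entry *structure*, so any version with a dual-PDE level needs it.
fn pde_read_width(version: MmuVersion, level: u32) -> u32 {
    if level == version.dual_pde_level() {
        128 // dual PDE read
    } else {
        64 // regular PDE read
    }
}

fn main() {
    assert_eq!(pde_read_width(MmuVersion::V3, 4), 128);
    assert_eq!(pde_read_width(MmuVersion::V3, 2), 64);
    println!("structural branch demo OK");
}
```

The point is that this branch keys off the entry structure, not the version,
so parameterizing over the version cannot eliminate it.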
>> I also did code-generation size analysis (see the diff of the code used,
>> linked below [3]):
>>
>>   Module .ko size:  Before: 511,792 bytes   After: 524,464 bytes (+2.5%)
>>   .text section:    Before: 112,620 bytes   After: 116,628 bytes (+4,008 bytes)
>>
>> The +4K .text growth is the monomorphization cost: every generic function
>> is compiled twice (once for MmuV2, once for MmuV3).
>>
>>> If you use a trait here, and make `PtWalk` generic against it, you can
>>> optimize this away. We had a similar situation when we introduced Turing
>>> support and the v2 ucode header, and tried both approaches: the
>>> trait-based one was slightly shorter, and arguably more readable.
>>
>> Actually, I was the one who suggested traits for the Falcon ucode
>> descriptor; see this thread [1]. So basically you and Eliot are telling me
>> to do what I suggested in [1]. :-) However, I disagree that it is the
>> right choice for this code.
>>
>> [1] https://lore.kernel.org/all/20251117231028.GA1095236@joelbox2/
>>
>> I think the two cases are quite different in complexity:
>>
>> The Falcon ucode descriptor is essentially a set of flat field accessors
>> plus a few params (imem_sec_load_params, dmem_load_params).
>> The trait has ~10 simple getter methods. There's no multi-level hierarchy,
>> no walker, and no generic propagation.
>>
>> The MMU page table case is structurally different. Making PtWalk generic
>> over an Mmu trait would require:
>>
>> - PtWalk<M: Mmu> (the walker)
>> - Plus all the associated types: M::Pte, M::Pde, M::DualPde, each
>>   needing its own trait bounds
>>
>> And we would also need:
>> - Vmm<M: Mmu> (which creates PtWalk)
>> - BarUser<M: Mmu> (which creates Vmm)
>>
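
For a concrete picture of that propagation, here is a sketch (everything
below is an illustrative stand-in, not the actual nova-core definitions):

```rust
#![allow(dead_code)]
// Sketch of how the generic parameter propagates upward. `Mmu`, `PtWalk`,
// `Vmm`, `BarUser` and the associated types mirror the discussion but are
// illustrative, not the real nova-core code.

use std::marker::PhantomData;

trait Mmu {
    type Pte;
    type Pde;
    const PFN_BITS: u32;
}

struct MmuV2;
struct MmuV3;

impl Mmu for MmuV2 {
    type Pte = u64;
    type Pde = u64;
    const PFN_BITS: u32 = 25;
}

impl Mmu for MmuV3 {
    type Pte = u64;
    type Pde = u128; // stands in for the 128-bit dual PDE case
    const PFN_BITS: u32 = 40;
}

// Every layer that (transitively) owns a walker grows the parameter:
struct PtWalk<M: Mmu>(PhantomData<M>);
struct Vmm<M: Mmu> {
    walk: PtWalk<M>,
}
struct BarUser<M: Mmu> {
    vmm: Vmm<M>,
}

fn main() {
    // Monomorphization: each instantiation below gets its own compiled copy
    // of every generic method, which is where the .text growth comes from.
    let _v2: BarUser<MmuV2> = BarUser { vmm: Vmm { walk: PtWalk(PhantomData) } };
    let _v3: BarUser<MmuV3> = BarUser { vmm: Vmm { walk: PtWalk(PhantomData) } };
    println!("v2 pfn bits = {}, v3 pfn bits = {}", MmuV2::PFN_BITS, MmuV3::PFN_BITS);
}
```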
>> I am also against making Vmm an enum as Eliot suggested:
>>     enum Vmm {
>>         V2(VmmInner<MmuV2>),
>>         V3(VmmInner<MmuV3>),
>>     }
>>
>> That moves the version complexity up to the reader. IMO, code complexity
>> should decrease as we move up the abstraction stack, making things easier
>> for the users (Vmm/Bar).
>>
>> If you look at the changes in vmm.rs needed to handle version dispatch
>> there [2]:
>> Added: +109
>> Removed: -28
>>
>> [2]
>> https://github.com/Edgeworth/linux/commit/3627af550b61256184d589e7ec666c1108971f0e
>>
>> The main benefit of my approach is that the version-specific dispatch
>> complexity is completely isolated inside MmuVersion, making the code
>> outside pagetable.rs much more readable, without having to parameterize
>> anything and without a code size increase. I think that is worth
>> considering.
>>
>>> But the main argument to use a trait here IMO is that it enables
>>> associated types and constants. That's particularly critical since some
>>> equivalent fields have different lengths between v2 and v3. An
>>> associated `Bounded` type for these would force the caller to validate
>>> the length of these fields before calling a non-fallible operation,
>>> which is exactly the level of caution that we want when dealing with
>>> page tables.
>>
>> I think Bounded validation is orthogonal to the dispatch model.
>> We can add Bounded to the current design without restructuring
>> into traits. For example:
>>
>> // In ver2::Pte
>> pub fn new_vram(pfn: Bounded<Pfn, 25>, writable: bool) -> Self { ... }
>>
>> // In ver3::Pte
>> pub fn new_vram(pfn: Bounded<Pfn, 40>, writable: bool) -> Self { ... }
>>
>> The unified Pte enum wrapper already dispatches to the correct
>> version-specific constructor, which would enforce the correct Bounded
>> constraint for that version.
>>
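
Sketched out, that looks roughly like this (a runtime range check stands in
for the kernel Bounded type, and the names and bit layouts are illustrative):

```rust
#![allow(dead_code)]
// Sketch of the enum-wrapper dispatch enforcing per-version PFN widths.
// Real code would use the kernel `Bounded` type; here a runtime range
// check stands in, and all names/bit layouts are illustrative.

#[derive(Debug)]
struct OutOfRange;

mod ver2 {
    pub struct Pte(pub u64);
    impl Pte {
        pub fn new_vram(pfn: u64, writable: bool) -> Result<Self, super::OutOfRange> {
            if pfn >= (1u64 << 25) {
                return Err(super::OutOfRange); // v2 PFNs are 25 bits
            }
            Ok(Pte((pfn << 8) | writable as u64))
        }
    }
}

mod ver3 {
    pub struct Pte(pub u64);
    impl Pte {
        pub fn new_vram(pfn: u64, writable: bool) -> Result<Self, super::OutOfRange> {
            if pfn >= (1u64 << 40) {
                return Err(super::OutOfRange); // v3 PFNs are 40 bits
            }
            Ok(Pte((pfn << 8) | writable as u64))
        }
    }
}

/// Unified wrapper: picking the variant picks the constructor, and each
/// constructor enforces its own width, so callers cannot mix them up.
enum Pte {
    V2(ver2::Pte),
    V3(ver3::Pte),
}

impl Pte {
    fn new_vram(version: u32, pfn: u64, writable: bool) -> Result<Self, OutOfRange> {
        match version {
            2 => Ok(Pte::V2(ver2::Pte::new_vram(pfn, writable)?)),
            _ => Ok(Pte::V3(ver3::Pte::new_vram(pfn, writable)?)),
        }
    }
}

fn main() {
    // A PFN needing 26 bits fits a v3 PTE but must be rejected for v2.
    let pfn = 1u64 << 25;
    assert!(Pte::new_vram(2, pfn, true).is_err());
    assert!(Pte::new_vram(3, pfn, true).is_ok());
    println!("per-version width checks enforced");
}
```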
>>> In order to fully benefit from it, we will need the bitfield macro from
>>> the `kernel` crate so the PDE/PTE fields can be `Bounded`, I will try to
>>> make it available quickly in a patch that you can depend on.
>>
>> That would be great, and I'd be happy to integrate Bounded validation once
>> the macro is available. I just don't think we need to restructure the
>> dispatch model in order to benefit from it.
>>
>>> But long story short, and although I need to dive deeper into the code,
>>> this looks like a good candidate for using a trait and associated types.
>>
>> The walker code (walk.rs) is already version-agnostic and reads cleanly.
>> The version dispatch is encapsulated behind method calls, not exposed as
>> inline if/else blocks.
>>
>> Generic propagation (or version-specific dispatch at higher levels) adds
>> more complexity at the higher layers.
>>
>> Linked below [3] is the diff I used for my testing, along with the data.
>> I don't really see a net readability win there (IMO, it is a net loss in
>> readability).
>>
>> [3]
>> https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git/commit/?h=trait-pt-dispatch&id=5eb0e98af11ba608ff4d0f7a06065ee863f5066a
>
> IMO, this diff has got me quite in favour of the trait approach.
>
> I was about to propose a similar trait approach some versions ago (or
> maybe I already had?) but didn't, due to the eventual need of `match`-like
> dispatch (like you had with `vmm_dispatch`). Your code makes that look
> not as bad as I thought it would be.
>
That's the drawback, right: now vmm_dispatch has to deal with the version
differences, whereas before the lower layers would. Maybe we can keep the
vmm layer the way it is now, but do the dispatch itself at the lower
layers, while still using traits as in the diff. I'll try that as well. :)
thanks,
--
Joel Fernandes