NVIDIA GPU driver infrastructure
 help / color / mirror / Atom feed
From: Joel Fernandes <joelagnelf@nvidia.com>
To: Alexandre Courbot <acourbot@nvidia.com>
Cc: linux-kernel@vger.kernel.org, Miguel Ojeda <ojeda@kernel.org>,
	Boqun Feng <boqun@kernel.org>, Gary Guo <gary@garyguo.net>,
	Bjorn Roy Baron <bjorn3_gh@protonmail.com>,
	Benno Lossin <lossin@kernel.org>,
	Andreas Hindborg <a.hindborg@kernel.org>,
	Alice Ryhl <aliceryhl@google.com>,
	Trevor Gross <tmgross@umich.edu>,
	Danilo Krummrich <dakr@kernel.org>,
	Dave Airlie <airlied@redhat.com>,
	Daniel Almeida <daniel.almeida@collabora.com>,
	dri-devel@lists.freedesktop.org, rust-for-linux@vger.kernel.org,
	nova-gpu@lists.linux.dev, Nikola Djukic <ndjukic@nvidia.com>,
	David Airlie <airlied@gmail.com>,
	Boqun Feng <boqun.feng@gmail.com>,
	John Hubbard <jhubbard@nvidia.com>,
	Alistair Popple <apopple@nvidia.com>,
	Timur Tabi <ttabi@nvidia.com>, Edwin Peer <epeer@nvidia.com>,
	Andrea Righi <arighi@nvidia.com>,
	Andy Ritger <aritger@nvidia.com>, Zhi Wang <zhiw@nvidia.com>,
	Balbir Singh <balbirs@nvidia.com>,
	Philipp Stanner <phasta@kernel.org>,
	alexeyi@nvidia.com, Eliot Courtney <ecourtney@nvidia.com>,
	joel@joelfernandes.org, linux-doc@vger.kernel.org
Subject: Re: [PATCH v12 04/22] gpu: nova-core: mm: Add support to use PRAMIN windows to write to VRAM
Date: Thu, 7 May 2026 15:44:11 -0400	[thread overview]
Message-ID: <296ae420-5400-4903-8d36-62b471862065@nvidia.com> (raw)
In-Reply-To: <DI8B0103X0HQ.7C99YLBMS2X5@nvidia.com>

Hi Alex,

On Sun, 03 May 2026, Alexandre Courbot wrote:
> `Pramin::new` takes different arguments (also check the other examples).

Fixed.

> The HRT series [1] will allow you to greatly simplify all this by
> storing a `&'a Bar0` directly in this structure. It will most likely
> land this cycle, so I think it's a good idea to proactively depend on
> them. It should apply cleanly (modulo the Tyr patches IIRC, but you can
> skip them if you don't build the driver) on top of `drm-rust-next` - on
> top of which this should also be rebased anyway.
>
> [1] https://lore.kernel.org/all/20260427221155.2144848-1-dakr@kernel.org/

Actually, I was going in the direction of making Bar0, GpuMm and Bar1
use Arc instead of lifetimes initially. The reason is, some of these are
long lived references. Directly using `&'a Bar0` means the lifetimes
also gets threaded to Pramin and then threads to GpuMm which is long
lived in some of my later patches. Some of the cases where we will have
long lived mappings are also vGPU and channels. I spoke to Danilo about
this, and we discussed that an initial approach could just use Arc since
refcounts are cheap and it should be just as performant, then once we
have the design ironed out, we can migrate to using lifetimes in this
code where appropriate. Even before HRT, I already used lifetimes and
analyzed the tradeoffs.

> I don't think that checking the range in this type is particularly
> useful, because this abstraction cannot validate VRAM ownership
> completely: VRAM can be sparse, reserved, or protected, and with the
> current bound access to these invalid areas is not prevented.

I think there's value in keeping this. The range check is similar in
spirit to what `Bar0` accessors do -- bounds are checked, but a passing
check doesn't guarantee the IO will succeed at the hardware level (an
address in a valid range could still cause issues). That's not a reason
to remove the guard; it still catches obvious bugs early. Having a sanity
check that says "this address is at least within the declared VRAM
region" is more useful than having no check at all, even if ownership
semantics aren't enforced at this layer.

>
> PRAMIN should only model the hardware windowing mechanism; address
> validity belongs to the higher layers that allocate or obtain those
> addresses. IIUC nothing will explode if we try to read or write into
> areas that are not VRAM.
> The problem with this lock is that it is buried, alongside its usage
> guide, deep into this type, ensuring that users won't ever see it. It
> will create deadlocks rather than preventing them.

Having an internal lock isn't inherently problematic -- the GPU buddy
allocator bindings follow the same pattern, for example. Let us go by usecase,
lets not complicate design based on a hypothetical. I wouldn't move the mutex
out of Pramin just based on guessing and complicate/thread the mutable reference
to layers that don't need to handle that complexity.

Further, what concrete design or use case requires mutex to be external? I
wouldn't add that complexity based on hypotheticals -- we go by actual use
cases, and there are actual users of this code in the patches that follow. For
what it's worth, I did try the outer-mutex approach myself before you mentioned
it and decided against it for this and the buddy code based on the reasons
mentioned.

> And when you look closer at it, you realize it is actually acquired for
> any use of `Pramin`, since the only method it exposes is `get_window`,
> which acquires the lock, and returns a `PraminWindow` which has exactly
> the same layout as `Pramin`, except that the lock is acquired.
>
> So what's the point? Let's just make `get_window` require a `&mut self`
> and let the owner of `Pramin` decide how to manage concurrent accesses.
> At least the doccomment for the lock will be visible from a higher
> layer.

The point is to keep it simple. What you're suggesting requires the
caller to do: lock + get_window + access window. My design is get_window
+ access window -- you need the latter two anyway, so the locking is
absorbed without any extra burden. Propagating lock outward adds
complexity at every layer that calls into `Pramin`; handling the lock
internally, as done elsewhere in the codebase, avoids that.

> Currently `Pramin` mixes elements of hardware access, memory management,
> and synchronization into the same type, squashing what should be
> different layers into a single one. Now that the `Io` trait is merged
> and available in `drm-rust-next`, it is a good time to implement the
> feedback I gave on v8. Roughly:
>
> - `Pramin` should just be the owner and arbiter of the
>   `NV_PBUS_BAR0_WINDOW` register. Its `get_window(&mut self,
>   window_base: Bounded<u64, 24>)` method simply returns a window to the
>   1MB area starting from `base << 16`. It's the simplest possible
>   abstraction.
> - The returned window implements `Io` and `IoKnownSize`. This will allow
>   users to do all the fancy stuff that comes with `Io`, including
>   projections.
> - Random accesses to VRAM, which are needed for walking the page tables
>   and updating them, can be built on top of this simple API. Either by
>   adding read/write ops directly to `Pramin`, or by defining another
>   type that owns a `&mut Pramin` and moves the window automatically if
>   the next access is out of bounds. This should result in window
>   management code that is simpler than the current `compute_window`.
>
> That's really all we need. Then `GpuMm` can manage concurrency similarly
> to the current model by wrapping `Pramin` inside a `Mutex`, and is also
> free to experiment with different locking strategies, something the
> current design doesn't allow.

I am looking into implementing `Io`/`IoKnownSize` on `PraminWindow` --
that part makes sense independently. The `Io` impl would operate on the
current fixed 1MB window view; the auto-repositioning methods stay as a
there is a usecase (more later). None of this requires the locking changes.

I'm familiar with the v8 feedback, but I disagree with parts of it. The
current design isn't complex -- it's the external locking + split into
multiple types approach that I think would make it more complex. I'm
adopting the parts of your suggestions that I agreed with.

The auto-repositioning is actively used and not hypothetical.
`install_mappings` (mm/pagetable/map.rs) holds a single `PraminWindow`
while writing PDEs at scattered buddy-allocated VRAM addresses and then
walking up to 4 PDE levels per VFN -- each level a separately-allocated
4KB page that can be many MB apart. `invalidate_ptes` similarly holds one
window across all VFNs: for a range mapping more than ~256 PT pages the
PTE region alone spans > 1MB. Removing auto-repositioning would push that
window arithmetic onto every caller.

> > +        // TODO: Convert to Bounded<u64, 40> when available.
>
> It's available. :)

Changed to use Bounded, thanks for that primitive.

> I think this method (and `write_window_base`) should be methods of the
> `NV_PBUS_BAR0_WINDOW` register (`window_base` and `set_window_base`)

Agreed, I will move it to the registers.

> > +        vram_offset: usize,
>
> This should be a `u64`, since it is a VRAM address.
>
> Actually, I noticed that you introduced a `VramAddress` type in a latter
> patch. It might be worth introducing it earlier and using it everywhere
> a VRAM address is involved, that way we cannot make any mistake - not
> critical for now, but feel free to do it if you think it helps.

Cool, I will look into it. Sounds reasonable to do.

thanks,
-- 
Joel Fernandes

  parent reply	other threads:[~2026-05-07 19:44 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-25 21:14 [PATCH v12 00/22] gpu: nova-core: Add memory management support Joel Fernandes
2026-04-25 21:14 ` [PATCH v12 01/22] gpu: nova-core: gsp: Return GspStaticInfo from boot() Joel Fernandes
2026-05-02 15:41   ` Alexandre Courbot
2026-05-05 18:25     ` Joel Fernandes
2026-05-06  0:46       ` Alexandre Courbot
2026-04-25 21:14 ` [PATCH v12 02/22] gpu: nova-core: gsp: Extract usable FB region from GSP Joel Fernandes
2026-05-02 15:41   ` Alexandre Courbot
2026-05-05 18:54     ` Joel Fernandes
2026-04-25 21:14 ` [PATCH v12 03/22] gpu: nova-core: gsp: Expose total physical VRAM end from FB region info Joel Fernandes
2026-05-02 15:41   ` Alexandre Courbot
2026-05-05 20:17     ` Joel Fernandes
2026-04-25 21:14 ` [PATCH v12 04/22] gpu: nova-core: mm: Add support to use PRAMIN windows to write to VRAM Joel Fernandes
2026-05-02 15:41   ` Alexandre Courbot
2026-05-05 22:59     ` Joel Fernandes
2026-05-06  0:47       ` Alexandre Courbot
2026-05-06 14:38         ` Joel Fernandes
2026-05-07 19:44     ` Joel Fernandes [this message]
2026-04-25 21:14 ` [PATCH v12 05/22] docs: gpu: nova-core: Document the PRAMIN aperture mechanism Joel Fernandes
2026-04-25 21:14 ` [PATCH v12 06/22] gpu: nova-core: mm: Add common memory management types Joel Fernandes
2026-04-25 21:14 ` [PATCH v12 07/22] gpu: nova-core: mm: Add TLB flush support Joel Fernandes
2026-05-02 15:42   ` Alexandre Courbot
2026-04-25 21:14 ` [PATCH v12 08/22] gpu: nova-core: mm: Add GpuMm centralized memory manager Joel Fernandes
2026-05-02 15:42   ` Alexandre Courbot
2026-04-25 21:14 ` [PATCH v12 09/22] gpu: nova-core: mm: Add common types for all page table formats Joel Fernandes
2026-05-02 15:42   ` Alexandre Courbot
2026-05-02 17:55     ` Joel Fernandes
2026-05-04 14:27       ` Alexandre Courbot
2026-04-25 21:14 ` [PATCH v12 10/22] gpu: nova-core: mm: Add MMU v2 page table types Joel Fernandes
2026-04-25 21:14 ` [PATCH v12 11/22] gpu: nova-core: mm: Add MMU v3 " Joel Fernandes
2026-04-25 21:14 ` [PATCH v12 12/22] gpu: nova-core: mm: Add page table entry operation traits Joel Fernandes
2026-05-02 15:42   ` Alexandre Courbot
2026-05-02 19:19     ` Joel Fernandes
2026-05-04 14:31       ` Alexandre Courbot
2026-05-04 19:28         ` Joel Fernandes
2026-05-04 19:42           ` Danilo Krummrich
2026-05-04 19:50             ` Joel Fernandes
2026-05-04 23:50               ` Joel Fernandes
2026-05-05  5:37           ` Alexandre Courbot
2026-04-25 21:14 ` [PATCH v12 13/22] gpu: nova-core: mm: Add page table walker for MMU v2/v3 Joel Fernandes
2026-04-25 21:14 ` [PATCH v12 14/22] gpu: nova-core: mm: Add Virtual Memory Manager Joel Fernandes
2026-04-25 21:14 ` [PATCH v12 15/22] gpu: nova-core: mm: Add virtual address range tracking to VMM Joel Fernandes
2026-04-25 21:14 ` [PATCH v12 16/22] gpu: nova-core: mm: Add multi-page mapping API " Joel Fernandes
2026-04-25 21:14 ` [PATCH v12 17/22] gpu: nova-core: Add BAR1 aperture type and size constant Joel Fernandes
2026-04-25 21:14 ` [PATCH v12 18/22] gpu: nova-core: mm: Add BAR1 user interface Joel Fernandes
2026-04-25 21:14 ` [PATCH v12 19/22] gpu: nova-core: mm: Add BAR1 memory management self-tests Joel Fernandes
2026-04-25 21:14 ` [PATCH v12 20/22] gpu: nova-core: mm: Add PRAMIN aperture self-tests Joel Fernandes
2026-05-02 15:42   ` Alexandre Courbot
2026-05-05 21:18     ` Joel Fernandes
2026-04-25 21:14 ` [PATCH v12 21/22] gpu: nova-core: mm: pramin: drop useless as_ref() in run_self_test Joel Fernandes
2026-05-02 15:42   ` Alexandre Courbot
2026-04-25 21:14 ` [PATCH v12 22/22] rust: maple_tree: implement Send and Sync for MapleTree Joel Fernandes
2026-05-02 15:42   ` Alexandre Courbot
2026-05-02 17:36     ` Joel Fernandes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=296ae420-5400-4903-8d36-62b471862065@nvidia.com \
    --to=joelagnelf@nvidia.com \
    --cc=a.hindborg@kernel.org \
    --cc=acourbot@nvidia.com \
    --cc=airlied@gmail.com \
    --cc=airlied@redhat.com \
    --cc=alexeyi@nvidia.com \
    --cc=aliceryhl@google.com \
    --cc=apopple@nvidia.com \
    --cc=arighi@nvidia.com \
    --cc=aritger@nvidia.com \
    --cc=balbirs@nvidia.com \
    --cc=bjorn3_gh@protonmail.com \
    --cc=boqun.feng@gmail.com \
    --cc=boqun@kernel.org \
    --cc=dakr@kernel.org \
    --cc=daniel.almeida@collabora.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=ecourtney@nvidia.com \
    --cc=epeer@nvidia.com \
    --cc=gary@garyguo.net \
    --cc=jhubbard@nvidia.com \
    --cc=joel@joelfernandes.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lossin@kernel.org \
    --cc=ndjukic@nvidia.com \
    --cc=nova-gpu@lists.linux.dev \
    --cc=ojeda@kernel.org \
    --cc=phasta@kernel.org \
    --cc=rust-for-linux@vger.kernel.org \
    --cc=tmgross@umich.edu \
    --cc=ttabi@nvidia.com \
    --cc=zhiw@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox