From: "Alexandre Courbot" <acourbot@nvidia.com>
To: "Joel Fernandes" <joelagnelf@nvidia.com>,
<linux-kernel@vger.kernel.org>, <rust-for-linux@vger.kernel.org>,
<dri-devel@lists.freedesktop.org>, <dakr@kernel.org>,
<acourbot@nvidia.com>
Cc: "Alistair Popple" <apopple@nvidia.com>,
"Miguel Ojeda" <ojeda@kernel.org>,
"Alex Gaynor" <alex.gaynor@gmail.com>,
"Boqun Feng" <boqun.feng@gmail.com>,
"Gary Guo" <gary@garyguo.net>, <bjorn3_gh@protonmail.com>,
"Benno Lossin" <lossin@kernel.org>,
"Andreas Hindborg" <a.hindborg@kernel.org>,
"Alice Ryhl" <aliceryhl@google.com>,
"Trevor Gross" <tmgross@umich.edu>,
"David Airlie" <airlied@gmail.com>,
"Simona Vetter" <simona@ffwll.ch>,
"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
"Maxime Ripard" <mripard@kernel.org>,
"Thomas Zimmermann" <tzimmermann@suse.de>,
"John Hubbard" <jhubbard@nvidia.com>,
"Timur Tabi" <ttabi@nvidia.com>, <joel@joelfernandes.org>,
"Elle Rhumsaa" <elle@weathered-steel.dev>,
"Daniel Almeida" <daniel.almeida@collabora.com>,
<nouveau@lists.freedesktop.org>
Subject: Re: [PATCH 6/7] nova-core: mm: Add support to use PRAMIN windows to write to VRAM
Date: Wed, 22 Oct 2025 19:41:13 +0900
Message-ID: <DDOSD746PCSR.CNAYZSTFR9XR@nvidia.com>
In-Reply-To: <20251020185539.49986-7-joelagnelf@nvidia.com>
On Tue Oct 21, 2025 at 3:55 AM JST, Joel Fernandes wrote:
> Required for writing page tables directly to VRAM physical memory,
> before page tables and MMU are setup.
>
> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
> ---
> drivers/gpu/nova-core/mm/mod.rs | 3 +
> drivers/gpu/nova-core/mm/pramin.rs | 241 +++++++++++++++++++++++++++++
> drivers/gpu/nova-core/nova_core.rs | 1 +
> drivers/gpu/nova-core/regs.rs | 29 +++-
> 4 files changed, 273 insertions(+), 1 deletion(-)
> create mode 100644 drivers/gpu/nova-core/mm/mod.rs
> create mode 100644 drivers/gpu/nova-core/mm/pramin.rs
>
> diff --git a/drivers/gpu/nova-core/mm/mod.rs b/drivers/gpu/nova-core/mm/mod.rs
> new file mode 100644
> index 000000000000..54c7cd9416a9
> --- /dev/null
> +++ b/drivers/gpu/nova-core/mm/mod.rs
> @@ -0,0 +1,3 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +pub(crate) mod pramin;
> diff --git a/drivers/gpu/nova-core/mm/pramin.rs b/drivers/gpu/nova-core/mm/pramin.rs
> new file mode 100644
> index 000000000000..4f4e1b8c0b9b
> --- /dev/null
> +++ b/drivers/gpu/nova-core/mm/pramin.rs
> @@ -0,0 +1,241 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! Direct VRAM access through PRAMIN window before page tables are set up.
> +//! PRAMIN can also write to system memory, however for simplicty we only
s/simplicty/simplicity
> +//! support VRAM access.
> +//!
> +//! # Examples
> +//!
> +//! ## Writing u32 data to VRAM
> +//!
> +//! ```no_run
> +//! use crate::driver::Bar0;
> +//! use crate::mm::pramin::PraminVram;
> +//!
> +//! fn write_data_to_vram(bar: &Bar0) -> Result {
> +//! let pramin = PraminVram::new(bar);
> +//! // Write 4 32-bit words to VRAM at offset 0x10000
> +//! let data: [u32; 4] = [0xDEADBEEF, 0xCAFEBABE, 0x12345678, 0x87654321];
> +//! pramin.write::<u32>(0x10000, &data)?;
> +//! Ok(())
> +//! }
> +//! ```
> +//!
> +//! ## Reading bytes from VRAM
> +//!
> +//! ```no_run
> +//! use crate::driver::Bar0;
> +//! use crate::mm::pramin::PraminVram;
> +//!
> +//! fn read_data_from_vram(bar: &Bar0, buffer: &mut KVec<u8>) -> Result {
> +//! let pramin = PraminVram::new(bar);
> +//! // Read a u8 from VRAM starting at offset 0x20000
> +//! pramin.read::<u8>(0x20000, buffer)?;
> +//! Ok(())
> +//! }
> +//! ```
> +
> +#![expect(dead_code)]
> +
> +use crate::driver::Bar0;
> +use crate::regs;
> +use core::mem;
> +use kernel::prelude::*;
> +
> +/// PRAMIN is a window into the VRAM (not a hardware block) that is used to access
> +/// the VRAM directly. These addresses are consistent across all GPUs.
> +const PRAMIN_BASE: usize = 0x700000; // PRAMIN is always at BAR0 + 0x700000
This definition looks like it could be an array of registers - that way
we could use its `BASE` associated constant and keep the hardware
offsets in the `regs` module.
Even if we don't end up using the array of registers for convenience, it
is good to have it defined in `regs` for consistency.
> +const PRAMIN_SIZE: usize = 0x100000; // 1MB aperture - max access per window position
You can use `kernel::sizes::SZ_1M` here.
> +
> +/// Trait for types that can be read/written through PRAMIN.
> +pub(crate) trait PraminNum: Copy + Default + Sized {
> + fn read_from_bar(bar: &Bar0, offset: usize) -> Result<Self>;
> +
> + fn write_to_bar(self, bar: &Bar0, offset: usize) -> Result;
> +
> + fn size_bytes() -> usize {
> + mem::size_of::<Self>()
> + }
> +
> + fn alignment() -> usize {
> + Self::size_bytes()
> + }
> +}
Since this trait requires `Sized`, you can use `size_of` and `align_of`
directly, making the `size_bytes` and `alignment` methods redundant.
Only `read_from_bar` and `write_to_bar` should remain.
I also wonder whether we couldn't get rid of this trait entirely by
leveraging `FromBytes` and `AsBytes`. Since the size of the type is
known, we could have read/write methods in Pramin that transfer its
content by using Io accessors of decreasing size (first 64-bit, then 32,
etc.) until all the data is transferred.
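To illustrate the idea, here is a minimal userspace sketch of the write side. Everything in it is hypothetical: `MockIo` stands in for the BAR0 accessors, `write_bytes` is not an existing method, and the byte slice is what an `AsBytes` type would hand us. It only shows how the widest naturally-aligned access can be picked at each step.

```rust
/// Hypothetical stand-in for the BAR0 I/O accessors; a real driver would
/// call the `Io` write methods instead of copying into a byte buffer.
struct MockIo {
    mem: Vec<u8>,
    /// Width in bytes of every access performed, recorded for inspection.
    log: Vec<usize>,
}

impl MockIo {
    fn new(len: usize) -> Self {
        Self { mem: vec![0; len], log: Vec::new() }
    }

    /// Writes `bytes` (1, 2, 4 or 8 of them) at `offset` as a single access.
    /// Fails if the access would fall outside the mock region.
    fn write_chunk(&mut self, offset: usize, bytes: &[u8]) -> Result<(), ()> {
        let end = offset.checked_add(bytes.len()).ok_or(())?;
        self.mem.get_mut(offset..end).ok_or(())?.copy_from_slice(bytes);
        self.log.push(bytes.len());
        Ok(())
    }
}

/// Writes `data` at `offset`, always picking the widest access (8, 4, 2,
/// then 1 bytes) that fits in the remaining data and is naturally aligned.
fn write_bytes(io: &mut MockIo, mut offset: usize, mut data: &[u8]) -> Result<(), ()> {
    while !data.is_empty() {
        // A width of 1 always qualifies, so `find` cannot fail here.
        let width = [8usize, 4, 2, 1]
            .iter()
            .copied()
            .find(|&w| w <= data.len() && offset % w == 0)
            .ok_or(())?;
        let (chunk, rest) = data.split_at(width);
        io.write_chunk(offset, chunk)?;
        offset += width;
        data = rest;
    }
    Ok(())
}
```

With something like this, the callers no longer need to be generic over an integer type, and the alignment check on `fb_offset` goes away since unaligned heads and tails are handled with narrower accesses.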
> +
> +/// Macro to implement PraminNum trait for unsigned integer types.
> +macro_rules! impl_pramin_unsigned_num {
> + ($bits:literal) => {
> + ::kernel::macros::paste! {
> + impl PraminNum for [<u $bits>] {
> + fn read_from_bar(bar: &Bar0, offset: usize) -> Result<Self> {
> + bar.[<try_read $bits>](offset)
> + }
> +
> + fn write_to_bar(self, bar: &Bar0, offset: usize) -> Result {
> + bar.[<try_write $bits>](self, offset)
> + }
> + }
> + }
> + };
> +}
> +
> +impl_pramin_unsigned_num!(8);
> +impl_pramin_unsigned_num!(16);
> +impl_pramin_unsigned_num!(32);
> +impl_pramin_unsigned_num!(64);
> +
> +/// Direct VRAM access through PRAMIN window before page tables are set up.
> +pub(crate) struct PraminVram<'a> {
Let's use the shorter name `Pramin` - the limitation to VRAM is a
reasonable one (since the CPU can access its own system memory), it is
not necessary to encode it into the name.
> + bar: &'a Bar0,
> + saved_window_addr: usize,
> +}
> +
> +impl<'a> PraminVram<'a> {
> + /// Create a new PRAMIN VRAM accessor, saving current window state,
> + /// the state is restored when the accessor is dropped.
> + ///
> + /// The BAR0 window base must be 64KB aligned but provides 1MB of VRAM access.
> + /// Window is repositioned automatically when accessing data beyond 1MB boundaries.
> + pub(crate) fn new(bar: &'a Bar0) -> Self {
> + let saved_window_addr = Self::get_window_addr(bar);
> + Self {
> + bar,
> + saved_window_addr,
> + }
> + }
> +
> + /// Set BAR0 window to point to specific FB region.
> + ///
> + /// # Arguments
> + ///
> + /// * `fb_offset` - VRAM byte offset where the window should be positioned.
> + /// Must be 64KB aligned (lower 16 bits zero).
Let's follow the Rust doc-comment guidelines for the arguments.
> + fn set_window_addr(&self, fb_offset: usize) -> Result {
> + // FB offset must be 64KB aligned (hardware requirement for window_base field)
> + // Once positioned, the window provides access to 1MB of VRAM through PRAMIN aperture
> + if fb_offset & 0xFFFF != 0 {
> + return Err(EINVAL);
> + }
Since this method is private and called from controlled contexts for
which `fb_offset` should always be valid, we can request callers to
give us a "window index" (e.g. the `window_base` of the
`NV_PBUS_BAR0_WINDOW` register) directly and remove this check. That
will also let us remove the impl block on `NV_PBUS_BAR0_WINDOW`.
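As a rough sketch of what the caller side could look like, assuming the 16-bit shift and the 40-bit FB address limit described in the register comments of this patch (the helper name `split_fb_offset` and its signature are made up for illustration):

```rust
/// Shift between a VRAM byte offset and the `window_base` field, which
/// holds bits 39:16 of the FB address according to the register comments.
const WINDOW_SHIFT: u32 = 16;
/// Number of valid FB address bits (assumption taken from the same comments).
const FB_ADDR_BITS: u32 = 40;

/// Hypothetical helper: splits a VRAM byte offset into the window index to
/// program into `window_base` and the byte offset inside that window.
/// Returns `None` if the offset does not fit in the FB address space.
fn split_fb_offset(fb_offset: u64) -> Option<(u32, usize)> {
    if fb_offset >> FB_ADDR_BITS != 0 {
        return None;
    }
    // Fits in 24 bits thanks to the range check above.
    let window_base = (fb_offset >> WINDOW_SHIFT) as u32;
    let offset_in_window = (fb_offset & ((1u64 << WINDOW_SHIFT) - 1)) as usize;
    Some((window_base, offset_in_window))
}
```

`set_window_addr` would then take the `window_base` value as-is, so there is no alignment left to validate and no shifting to do in `regs`.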
> +
> + let window_reg = regs::NV_PBUS_BAR0_WINDOW::default().set_window_addr(fb_offset);
> + window_reg.write(self.bar);
> +
> + // Read back to ensure it took effect
> + let readback = regs::NV_PBUS_BAR0_WINDOW::read(self.bar);
> + if readback.window_base() != window_reg.window_base() {
> + return Err(EIO);
> + }
> +
> + Ok(())
> + }
> +
> + /// Get current BAR0 window offset.
> + ///
> + /// # Returns
> + ///
> + /// The byte offset in VRAM where the PRAMIN window is currently positioned.
> + /// This offset is always 64KB aligned.
> + fn get_window_addr(bar: &Bar0) -> usize {
> + let window_reg = regs::NV_PBUS_BAR0_WINDOW::read(bar);
> + window_reg.get_window_addr()
> + }
> +
> + /// Common logic for accessing VRAM data through PRAMIN with windowing.
> + ///
> + /// # Arguments
> + ///
> + /// * `fb_offset` - Starting byte offset in VRAM (framebuffer) where access begins.
> + /// Must be aligned to `T::alignment()`.
> + /// * `num_items` - Number of items of type `T` to process.
> + /// * `operation` - Closure called for each item to perform the actual read/write.
> + /// Takes two parameters:
> + /// - `data_idx`: Index of the item in the data array (0..num_items)
> + /// - `pramin_offset`: BAR0 offset in the PRAMIN aperture to access
Formatting of arguments is strange here as well.
> + ///
> + /// The function automatically handles PRAMIN window repositioning when accessing
> + /// data that spans multiple 1MB windows.
Conversely, this large method is under-documented. Explaining what
`operation` is supposed to do would be helpful.
> + fn access_vram<T: PraminNum, F>(
> + &self,
> + fb_offset: usize,
> + num_items: usize,
> + mut operation: F,
> + ) -> Result
> + where
> + F: FnMut(usize, usize) -> Result,
> + {
> + // FB offset must be aligned to the size of T
> + if fb_offset & (T::alignment() - 1) != 0 {
> + return Err(EINVAL);
> + }
> +
> + let mut offset_bytes = fb_offset;
> + let mut remaining_items = num_items;
> + let mut data_index = 0;
> +
> + while remaining_items > 0 {
> + // Align the window to 64KB boundary
> + let target_window = offset_bytes & !0xFFFF;
> + let window_offset = offset_bytes - target_window;
> +
> + // Set window if needed
> + if target_window != Self::get_window_addr(self.bar) {
> + self.set_window_addr(target_window)?;
> + }
> +
> + // Calculate how many items we can access from this window position
> + // We can access up to 1MB total, minus the offset within the window
> + let remaining_in_window = PRAMIN_SIZE - window_offset;
> + let max_items_in_window = remaining_in_window / T::size_bytes();
> + let items_to_write = core::cmp::min(remaining_items, max_items_in_window);
> +
> + // Process data through PRAMIN
> + for i in 0..items_to_write {
> + // Calculate the byte offset in the PRAMIN window to write to.
> + let pramin_offset_bytes = PRAMIN_BASE + window_offset + (i * T::size_bytes());
> + operation(data_index + i, pramin_offset_bytes)?;
> + }
> +
> + // Move to next chunk.
> + data_index += items_to_write;
> + offset_bytes += items_to_write * T::size_bytes();
> + remaining_items -= items_to_write;
> + }
> +
> + Ok(())
> + }
> +
> + /// Generic write for data to VRAM through PRAMIN.
> + ///
> + /// # Arguments
> + ///
> + /// * `fb_offset` - Starting byte offset in VRAM where data will be written.
> + /// Must be aligned to `T::alignment()`.
> + /// * `data` - Slice of items to write to VRAM. All items will be written sequentially
> + /// starting at `fb_offset`.
> + pub(crate) fn write<T: PraminNum>(&self, fb_offset: usize, data: &[T]) -> Result {
> + self.access_vram::<T, _>(fb_offset, data.len(), |data_idx, pramin_offset| {
> + data[data_idx].write_to_bar(self.bar, pramin_offset)
> + })
> + }
> +
> + /// Generic read data from VRAM through PRAMIN.
> + ///
> + /// # Arguments
> + ///
> + /// * `fb_offset` - Starting byte offset in VRAM where data will be read from.
> + /// Must be aligned to `T::alignment()`.
> + /// * `data` - Mutable slice that will be filled with data read from VRAM.
> + /// The number of items read equals `data.len()`.
> + pub(crate) fn read<T: PraminNum>(&self, fb_offset: usize, data: &mut [T]) -> Result {
> + self.access_vram::<T, _>(fb_offset, data.len(), |data_idx, pramin_offset| {
> + data[data_idx] = T::read_from_bar(self.bar, pramin_offset)?;
> + Ok(())
> + })
> + }
> +}
> +
> +impl<'a> Drop for PraminVram<'a> {
> + fn drop(&mut self) {
> + let _ = self.set_window_addr(self.saved_window_addr); // Restore original window.
> + }
> +}
> diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs
> index 112277c7921e..6bd9fc3372d6 100644
> --- a/drivers/gpu/nova-core/nova_core.rs
> +++ b/drivers/gpu/nova-core/nova_core.rs
> @@ -13,6 +13,7 @@
> mod gfw;
> mod gpu;
> mod gsp;
> +mod mm;
> mod regs;
> mod util;
> mod vbios;
> diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
> index a3836a01996b..ba09da7e1541 100644
> --- a/drivers/gpu/nova-core/regs.rs
> +++ b/drivers/gpu/nova-core/regs.rs
> @@ -12,6 +12,7 @@
> FalconModSelAlgo, FalconSecurityModel, PFalcon2Base, PFalconBase, PeregrineCoreSelect,
> };
> use crate::gpu::{Architecture, Chipset};
> +use kernel::bits::genmask_u32;
> use kernel::prelude::*;
>
> // PMC
> @@ -43,7 +44,8 @@ pub(crate) fn chipset(self) -> Result<Chipset> {
> }
> }
>
> -// PBUS
> +// PBUS - PBUS is a bus control unit, that helps the GPU communicate with the PCI bus.
> +// Handles the BAR windows, decoding of MMIO read/writes on the BARs, etc.
>
> register!(NV_PBUS_SW_SCRATCH @ 0x00001400[64] {});
>
> @@ -52,6 +54,31 @@ pub(crate) fn chipset(self) -> Result<Chipset> {
> 31:16 frts_err_code as u16;
> });
>
> +// BAR0 window control register to configure the BAR0 window for PRAMIN access
> +// (direct physical VRAM access).
> +register!(NV_PBUS_BAR0_WINDOW @ 0x00001700, "BAR0 window control register" {
> + 25:24 target as u8, "Target (0=VID_MEM, 1=SYS_MEM_COHERENT, 2=SYS_MEM_NONCOHERENT)";
> + 23:0 window_base as u32, "Window base address (bits 39:16 of FB addr)";
> +});
> +
> +impl NV_PBUS_BAR0_WINDOW {
> + /// Returns the 64-bit aligned VRAM address of the window.
> + pub(crate) fn get_window_addr(self) -> usize {
> + (self.window_base() as usize) << 16
> + }
> +
> + /// Sets the window address from a framebuffer offset.
> + /// The fb_offset must be 64KB aligned (lower bits discared).
> + pub(crate) fn set_window_addr(self, fb_offset: usize) -> Self {
> + // Calculate window base (bits 39:16 of FB address)
> + // The total FB address is 40 bits, mask anything above. Since we are
> + // right shifting the offset by 16 bits, the mask is only 24 bits.
> + let mask = genmask_u32(0..=23) as usize;
> + let window_base = ((fb_offset >> 16) & mask) as u32;
> + self.set_window_base(window_base)
> + }
> +}
If you work directly with `window_base` as suggested above, this impl
block can be dropped altogether.