[PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support

public inbox for linux-kernel@vger.kernel.org

* [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support
@ 2026-02-21  2:09 John Hubbard
  2026-02-21  2:09 ` [PATCH v5 01/38] gpu: nova-core: fix aux device registration for multi-GPU systems John Hubbard
                   ` (37 more replies)
  0 siblings, 38 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Hi,

This is based on today's linux.git. A git branch with this (plus a fix
for a Clippy warning in core Rust for Linux code, which I suspect
others have already found and fixed) is here:

    https://github.com/johnhubbard/linux/tree/nova-core-blackwell-v5

This is quite a large overhaul: it took multiple passes to fix the many
issues found during review, and I found more while making those fixes.

Patch 1 is going to be merged separately, but is included here in order
to allow people to apply the series.

Patch 2 is going to come from Gary Guo, not here, but is included for
the same reason.

The last two patches, 37 and 38, do not need to be part of this series,
but are best applied *after* the series, in order to catch all the
cases.

There are also a few rust/ patches that might need/want to get merged
separately.

It's been tested on Ampere and Blackwell, one each:

    NovaCore 0000:e1:00.0: GPU name: NVIDIA RTX A4000
    NovaCore 0000:01:00.0: GPU name: NVIDIA RTX PRO 6000 Blackwell Max-Q Workstation Edition

Changes in v5 (in highly condensed and summarized form):

* Rebased onto linux.git master.

* Split MCTP protocol into its own module and file.

* Many Rust-specific improvements: especially more use of types, and
  more use of Result and Option.

* Lots of cleanup of comments and print output and error handling.

* Added const_align_up() to rust/ and used it in nova-core. This
  required enabling a Rust feature: inline_const, as recommended by
  Miguel Ojeda.

* Refactored various things, such as making Gpu::new() own Spec
  creation.

* Fixed three Delta::ZERO busy-polls (patches 21, 24, 31) to use
  non-zero sleep intervals (after just realizing that it was a bad
  choice to have zero in there).

* Reduced GH100/GB100 HAL duplication. Made FSP_PKEY_SIZE/FSP_SIG_SIZE
  consistent across patches. Replaced fragile architecture checks with
  chipset.arch(). Renamed LIBOS_BLACKWELL.

* Narrowed the scope of some of the #![expect(dead_code)] cases,
  although that really only matters within the series, not once it is
  fully applied.
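
For readers unfamiliar with the const_align_up() item in the list above,
here is a minimal user-space sketch of a compile-time align-up helper.
The signature, the Option return, and the HEAP_SIZE constant are
assumptions made for illustration; the real helper in rust/kernel/ptr.rs
may look different.

```rust
/// Hypothetical stand-in for the kernel helper (the name is reused for
/// illustration only). Rounds `value` up to the next multiple of `align`.
/// Returns `None` if `align` is not a power of two or the sum overflows.
const fn const_align_up(value: usize, align: usize) -> Option<usize> {
    if !align.is_power_of_two() {
        return None;
    }
    let mask = align - 1;
    match value.checked_add(mask) {
        Some(v) => Some(v & !mask),
        None => None,
    }
}

// Evaluated entirely at compile time; a bad input fails the build.
const HEAP_SIZE: usize = match const_align_up(0x12345, 0x1000) {
    Some(v) => v,
    None => panic!("alignment failed"),
};

fn main() {
    assert_eq!(HEAP_SIZE, 0x13000);
    // Already-aligned values are returned unchanged.
    assert_eq!(const_align_up(0x2000, 0x1000), Some(0x2000));
    println!("HEAP_SIZE = {:#x}", HEAP_SIZE);
}
```

Doing the rounding in a const context means that a size which cannot be
aligned is caught at build time rather than at probe time.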

John Hubbard (38):
  gpu: nova-core: fix aux device registration for multi-GPU systems
  gpu: nova-core: pass pdev directly to dev_* logging macros
  gpu: nova-core: print FB sizes, along with ranges
  gpu: nova-core: add FbRange.len() and use it in boot.rs
  gpu: nova-core: Hopper/Blackwell: basic GPU identification
  gpu: nova-core: factor .fwsignature* selection into a new
    find_gsp_sigs_section()
  gpu: nova-core: use GPU Architecture to simplify HAL selections
  gpu: nova-core: apply the one "use" item per line policy to
    commands.rs
  gpu: nova-core: move GPU init and DMA mask setup into Gpu::new()
  gpu: nova-core: set DMA mask width based on GPU architecture
  gpu: nova-core: Hopper/Blackwell: skip GFW boot waiting
  gpu: nova-core: move firmware image parsing code to firmware.rs
  gpu: nova-core: factor out an elf_str() function
  gpu: nova-core: don't assume 64-bit firmware images
  gpu: nova-core: add support for 32-bit firmware images
  gpu: nova-core: add auto-detection of 32-bit, 64-bit firmware images
  gpu: nova-core: Hopper/Blackwell: add FMC firmware image, in support
    of FSP
  gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub
  gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations
  gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure
  rust: ptr: add const_align_up() and enable inline_const feature
  gpu: nova-core: Hopper/Blackwell: calculate reserved FB heap size
  gpu: nova-core: add MCTP/NVDM protocol types for firmware
    communication
  gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion
    waiting
  gpu: nova-core: Hopper/Blackwell: add FSP message structures
  gpu: nova-core: Hopper/Blackwell: add FMC signature extraction
  gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging
  gpu: nova-core: Hopper/Blackwell: add FspCotVersion type
  gpu: nova-core: Hopper/Blackwell: larger non-WPR heap
  gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot
  gpu: nova-core: Blackwell: use correct sysmem flush registers
  gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap
  gpu: nova-core: refactor SEC2 booter loading into
    BooterFirmware::run()
  gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling
  gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror
  gpu: nova-core: Hopper/Blackwell: integrate FSP boot path into boot()
  rust: sizes: add u64 variants of SZ_* constants
  gpu: nova-core: use SZ_*_U64 constants from kernel::sizes

 drivers/gpu/nova-core/driver.rs          |  32 +-
 drivers/gpu/nova-core/falcon.rs          |   1 +
 drivers/gpu/nova-core/falcon/fsp.rs      | 222 ++++++++++
 drivers/gpu/nova-core/falcon/hal.rs      |  20 +-
 drivers/gpu/nova-core/fb.rs              | 123 ++++--
 drivers/gpu/nova-core/fb/hal.rs          |  38 +-
 drivers/gpu/nova-core/fb/hal/ga102.rs    |   2 +-
 drivers/gpu/nova-core/fb/hal/gb100.rs    |  75 ++++
 drivers/gpu/nova-core/fb/hal/gb202.rs    |  62 +++
 drivers/gpu/nova-core/fb/hal/gh100.rs    |  38 ++
 drivers/gpu/nova-core/firmware.rs        | 186 ++++++++
 drivers/gpu/nova-core/firmware/booter.rs |  35 +-
 drivers/gpu/nova-core/firmware/fsp.rs    |  46 ++
 drivers/gpu/nova-core/firmware/gsp.rs    | 140 ++----
 drivers/gpu/nova-core/fsp.rs             | 525 +++++++++++++++++++++++
 drivers/gpu/nova-core/gpu.rs             | 119 ++++-
 drivers/gpu/nova-core/gsp/boot.rs        | 318 ++++++++++----
 drivers/gpu/nova-core/gsp/commands.rs    |   8 +-
 drivers/gpu/nova-core/gsp/fw.rs          |  95 ++--
 drivers/gpu/nova-core/gsp/fw/commands.rs |  32 +-
 drivers/gpu/nova-core/mctp.rs            | 105 +++++
 drivers/gpu/nova-core/nova_core.rs       |   2 +
 drivers/gpu/nova-core/regs.rs            | 103 ++++-
 rust/kernel/ptr.rs                       |  27 ++
 rust/kernel/sizes.rs                     |  51 +++
 scripts/Makefile.build                   |   2 +-
 26 files changed, 2098 insertions(+), 309 deletions(-)
 create mode 100644 drivers/gpu/nova-core/falcon/fsp.rs
 create mode 100644 drivers/gpu/nova-core/fb/hal/gb100.rs
 create mode 100644 drivers/gpu/nova-core/fb/hal/gb202.rs
 create mode 100644 drivers/gpu/nova-core/fb/hal/gh100.rs
 create mode 100644 drivers/gpu/nova-core/firmware/fsp.rs
 create mode 100644 drivers/gpu/nova-core/fsp.rs
 create mode 100644 drivers/gpu/nova-core/mctp.rs


base-commit: a95f71ad3e2e224277508e006580c333d0a5fe36
prerequisite-patch-id: 1ec0faa352dab8fa7c0f209474b75cd21931340d
-- 
2.53.0



* [PATCH v5 01/38] gpu: nova-core: fix aux device registration for multi-GPU systems
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-24 14:47   ` Danilo Krummrich
  2026-02-21  2:09 ` [PATCH v5 02/38] gpu: nova-core: pass pdev directly to dev_* logging macros John Hubbard
                   ` (36 subsequent siblings)
  37 siblings, 1 reply; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

The auxiliary device registration was using a hardcoded ID of 0, which
caused probe() to fail on multi-GPU systems with:

   sysfs: cannot create duplicate filename '/bus/auxiliary/devices/NovaCore.nova-drm.0'

Fix this by using an atomic counter to generate unique IDs for each
GPU's aux device registration. The TODO item to eventually use XArray
for recycling aux device IDs is retained, but for now, this works very
nicely.
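
As a rough user-space illustration of the counter approach (a sketch
only: next_aux_id() is a hypothetical helper, and std is used in place of
the kernel's core-only environment), each caller gets a distinct ID even
when several probes race:

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

/// Monotonic counter; IDs are never recycled in this sketch.
static AUX_ID_COUNTER: AtomicU32 = AtomicU32::new(0);

/// Hypothetical helper: returns the next unused ID.
/// Relaxed ordering is enough here because only uniqueness of the
/// returned value matters, not ordering against other memory accesses.
fn next_aux_id() -> u32 {
    AUX_ID_COUNTER.fetch_add(1, Ordering::Relaxed)
}

fn main() {
    // Simulate four GPUs probing concurrently.
    let handles: Vec<_> = (0..4).map(|_| thread::spawn(next_aux_id)).collect();
    let mut ids: Vec<u32> = handles.into_iter().map(|h| h.join().unwrap()).collect();
    ids.sort_unstable();
    ids.dedup();
    // No two "probes" received the same ID.
    assert_eq!(ids.len(), 4);
    println!("{:?}", ids);
}
```

Note that fetch_add() wraps around on overflow, so a counter like this
only guarantees uniqueness for the first 2^32 registrations, which is one
reason the XArray TODO remains.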

This has the side effect of making debugfs[1] work on multi-GPU systems.

[1] https://lore.kernel.org/20260203224757.871729-1-ttabi@nvidia.com

Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/driver.rs | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs
index 5a4cc047bcfc..fb54f28f6da1 100644
--- a/drivers/gpu/nova-core/driver.rs
+++ b/drivers/gpu/nova-core/driver.rs
@@ -1,5 +1,10 @@
 // SPDX-License-Identifier: GPL-2.0
 
+use core::sync::atomic::{
+    AtomicU32,
+    Ordering, //
+};
+
 use kernel::{
     auxiliary,
     device::Core,
@@ -19,6 +24,9 @@
 
 use crate::gpu::Gpu;
 
+/// Counter for generating unique auxiliary device IDs.
+static AUXILIARY_ID_COUNTER: AtomicU32 = AtomicU32::new(0);
+
 #[pin_data]
 pub(crate) struct NovaCore {
     #[pin]
@@ -85,12 +93,17 @@ fn probe(pdev: &pci::Device<Core>, _info: &Self::IdInfo) -> impl PinInit<Self, E
                 GFP_KERNEL,
             )?;
 
+            // TODO[XARR]: Use XArray for proper ID allocation/recycling. Until then, use a simple
+            // atomic counter which never recycles IDs. A unique ID is required for multi-GPU
+            // systems, because without it, probe() would fail for all but the first GPU.
+            let aux_id = AUXILIARY_ID_COUNTER.fetch_add(1, Ordering::Relaxed);
+
             Ok(try_pin_init!(Self {
                 gpu <- Gpu::new(pdev, bar.clone(), bar.access(pdev.as_ref())?),
                 _reg <- auxiliary::Registration::new(
                     pdev.as_ref(),
                     c"nova-drm",
-                    0, // TODO[XARR]: Once it lands, use XArray; for now we don't use the ID.
+                    aux_id,
                     crate::MODULE_NAME
                 ),
             }))
-- 
2.53.0



* [PATCH v5 02/38] gpu: nova-core: pass pdev directly to dev_* logging macros
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
  2026-02-21  2:09 ` [PATCH v5 01/38] gpu: nova-core: fix aux device registration for multi-GPU systems John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 03/38] gpu: nova-core: print FB sizes, along with ranges John Hubbard
                   ` (35 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

The dev_dbg!, dev_info!, dev_err!, and dev_warn! macros now accept
pci::Device directly without requiring an explicit .as_ref()
conversion to device::Device, thanks to commit a38cd1fea989
("rust: device: support `dev_printk` on all devices").
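
The general idea can be sketched in user space as follows. This is not
the kernel's implementation: Device, PciDevice and format_log() are
hypothetical stand-ins, and the assumption is simply that the logging
entry point performs the AsRef conversion itself, so callers no longer
have to.

```rust
// Hypothetical stand-ins for device::Device and pci::Device.
struct Device {
    name: &'static str,
}

struct PciDevice {
    dev: Device,
}

impl AsRef<Device> for Device {
    fn as_ref(&self) -> &Device {
        self
    }
}

impl AsRef<Device> for PciDevice {
    fn as_ref(&self) -> &Device {
        &self.dev
    }
}

/// Accepts any device-like type and does the conversion internally.
fn format_log(dev: &impl AsRef<Device>, msg: &str) -> String {
    format!("{}: {}", dev.as_ref().name, msg)
}

fn main() {
    let pdev = PciDevice {
        dev: Device { name: "0000:01:00.0" },
    };

    // Old call-site style: convert explicitly, then log.
    let base: &Device = pdev.as_ref();
    let old = format_log(base, "Probe Nova Core GPU driver.");

    // New call-site style: pass the bus device directly.
    let new = format_log(&pdev, "Probe Nova Core GPU driver.");

    assert_eq!(old, new);
    println!("{}", new);
}
```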

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/driver.rs   |  2 +-
 drivers/gpu/nova-core/gpu.rs      |  4 ++--
 drivers/gpu/nova-core/gsp/boot.rs | 14 +++++++-------
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs
index fb54f28f6da1..e887bcc3187f 100644
--- a/drivers/gpu/nova-core/driver.rs
+++ b/drivers/gpu/nova-core/driver.rs
@@ -78,7 +78,7 @@ impl pci::Driver for NovaCore {
 
     fn probe(pdev: &pci::Device<Core>, _info: &Self::IdInfo) -> impl PinInit<Self, Error> {
         pin_init::pin_init_scope(move || {
-            dev_dbg!(pdev.as_ref(), "Probe Nova Core GPU driver.\n");
+            dev_dbg!(pdev, "Probe Nova Core GPU driver.\n");
 
             pdev.enable_device_mem()?;
             pdev.set_master();
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 9b042ef1a308..f5907c31a66d 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -262,13 +262,13 @@ pub(crate) fn new<'a>(
     ) -> impl PinInit<Self, Error> + 'a {
         try_pin_init!(Self {
             spec: Spec::new(pdev.as_ref(), bar).inspect(|spec| {
-                dev_info!(pdev.as_ref(),"NVIDIA ({})\n", spec);
+                dev_info!(pdev, "NVIDIA ({})\n", spec);
             })?,
 
             // We must wait for GFW_BOOT completion before doing any significant setup on the GPU.
             _: {
                 gfw::wait_gfw_boot_completion(bar)
-                    .inspect_err(|_| dev_err!(pdev.as_ref(), "GFW boot did not complete\n"))?;
+                    .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
             },
 
             sysmem_flush: SysmemFlush::register(pdev.as_ref(), bar, spec.chipset)?,
diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
index be427fe26a58..bd6e6dc57e85 100644
--- a/drivers/gpu/nova-core/gsp/boot.rs
+++ b/drivers/gpu/nova-core/gsp/boot.rs
@@ -171,14 +171,14 @@ pub(crate) fn boot(
             Some((libos_handle >> 32) as u32),
         )?;
         dev_dbg!(
-            pdev.as_ref(),
+            pdev,
             "GSP MBOX0: {:#x}, MBOX1: {:#x}\n",
             mbox0,
             mbox1
         );
 
         dev_dbg!(
-            pdev.as_ref(),
+            pdev,
             "Using SEC2 to load and run the booter_load firmware...\n"
         );
 
@@ -191,7 +191,7 @@ pub(crate) fn boot(
             Some((wpr_handle >> 32) as u32),
         )?;
         dev_dbg!(
-            pdev.as_ref(),
+            pdev,
             "SEC2 MBOX0: {:#x}, MBOX1{:#x}\n",
             mbox0,
             mbox1
@@ -199,7 +199,7 @@ pub(crate) fn boot(
 
         if mbox0 != 0 {
             dev_err!(
-                pdev.as_ref(),
+                pdev,
                 "Booter-load failed with error {:#x}\n",
                 mbox0
             );
@@ -217,7 +217,7 @@ pub(crate) fn boot(
         )?;
 
         dev_dbg!(
-            pdev.as_ref(),
+            pdev,
             "RISC-V active? {}\n",
             gsp_falcon.is_riscv_active(bar),
         );
@@ -239,8 +239,8 @@ pub(crate) fn boot(
         // Obtain and display basic GPU information.
         let info = commands::get_gsp_info(&mut self.cmdq, bar)?;
         match info.gpu_name() {
-            Ok(name) => dev_info!(pdev.as_ref(), "GPU name: {}\n", name),
-            Err(e) => dev_warn!(pdev.as_ref(), "GPU name unavailable: {:?}\n", e),
+            Ok(name) => dev_info!(pdev, "GPU name: {}\n", name),
+            Err(e) => dev_warn!(pdev, "GPU name unavailable: {:?}\n", e),
         }
 
         Ok(())
-- 
2.53.0



* [PATCH v5 03/38] gpu: nova-core: print FB sizes, along with ranges
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
  2026-02-21  2:09 ` [PATCH v5 01/38] gpu: nova-core: fix aux device registration for multi-GPU systems John Hubbard
  2026-02-21  2:09 ` [PATCH v5 02/38] gpu: nova-core: pass pdev directly to dev_* logging macros John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 04/38] gpu: nova-core: add FbRange.len() and use it in boot.rs John Hubbard
                   ` (34 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

For the reader's convenience, print the size of each range alongside
the range itself. It is surprising just how much this helps.

Sample output (using an Ampere GA104):

NovaCore 0000:e1:00.0: FbLayout {
    fb: 0x0..0x3ff800000 (16376 MiB),
    vga_workspace: 0x3ff700000..0x3ff800000 (1 MiB),
    frts: 0x3ff600000..0x3ff700000 (1 MiB),
    boot: 0x3ff5fa000..0x3ff600000 (24 KiB),
    elf: 0x3fb960000..0x3ff5f9000 (60 MiB),
    wpr2_heap: 0x3f3900000..0x3fb900000 (128 MiB),
    wpr2: 0x3f3800000..0x3ff700000 (191 MiB),
    heap: 0x3f3700000..0x3f3800000 (1 MiB),
    vf_partition_count: 0x0,
}
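
A user-space sketch of the formatting approach, using std::fmt instead of
kernel::fmt (the Layout struct and the local SZ_* constants are
assumptions for the example). The alternate flag ({:#?}) selects the
variant with sizes, and a derived Debug on the containing struct passes
that flag down to each field:

```rust
use std::fmt;
use std::ops::Range;

const SZ_1K: u64 = 1 << 10;
const SZ_1M: u64 = 1 << 20;

struct FbRange(Range<u64>);

impl fmt::Debug for FbRange {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let (start, end) = (self.0.start, self.0.end);
        // Compact format ({:?}): just the range.
        if !f.alternate() {
            return write!(f, "{:#x}..{:#x}", start, end);
        }
        // Alternate format ({:#?}): append the size in KiB or MiB.
        let size = end - start;
        if size < SZ_1M {
            write!(f, "{:#x}..{:#x} ({} KiB)", start, end, size / SZ_1K)
        } else {
            write!(f, "{:#x}..{:#x} ({} MiB)", start, end, size / SZ_1M)
        }
    }
}

/// Hypothetical container, standing in for FbLayout.
#[derive(Debug)]
struct Layout {
    frts: FbRange,
    boot: FbRange,
}

fn main() {
    let layout = Layout {
        frts: FbRange(0x3ff600000..0x3ff700000),
        boot: FbRange(0x3ff5fa000..0x3ff600000),
    };
    assert_eq!(format!("{:?}", layout.boot), "0x3ff5fa000..0x3ff600000");
    assert_eq!(format!("{:#?}", layout.frts), "0x3ff600000..0x3ff700000 (1 MiB)");
    println!("{:#?}", layout);
}
```

The integer division truncates, so a range that is not a whole number of
KiB or MiB is rounded down in the printed size.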

Cc: Timur Tabi <ttabi@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/fb.rs | 83 +++++++++++++++++++++++++++++--------
 1 file changed, 66 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/nova-core/fb.rs b/drivers/gpu/nova-core/fb.rs
index c62abcaed547..6fb804c118c6 100644
--- a/drivers/gpu/nova-core/fb.rs
+++ b/drivers/gpu/nova-core/fb.rs
@@ -1,9 +1,13 @@
 // SPDX-License-Identifier: GPL-2.0
 
-use core::ops::Range;
+use core::ops::{
+    Deref,
+    Range, //
+};
 
 use kernel::{
     device,
+    fmt,
     prelude::*,
     ptr::{
         Alignable,
@@ -94,26 +98,71 @@ pub(crate) fn unregister(&self, bar: &Bar0) {
     }
 }
 
+pub(crate) struct FbRange(Range<u64>);
+
+impl From<Range<u64>> for FbRange {
+    fn from(range: Range<u64>) -> Self {
+        Self(range)
+    }
+}
+
+impl Deref for FbRange {
+    type Target = Range<u64>;
+
+    fn deref(&self) -> &Self::Target {
+        &self.0
+    }
+}
+
+impl fmt::Debug for FbRange {
+    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
+        // Use alternate format ({:#?}) to include size, compact format ({:?}) for just the range.
+        if f.alternate() {
+            let size = self.0.end - self.0.start;
+
+            if size < usize_as_u64(SZ_1M) {
+                let size_kib = size / usize_as_u64(SZ_1K);
+                f.write_fmt(fmt!(
+                    "{:#x}..{:#x} ({} KiB)",
+                    self.0.start,
+                    self.0.end,
+                    size_kib
+                ))
+            } else {
+                let size_mib = size / usize_as_u64(SZ_1M);
+                f.write_fmt(fmt!(
+                    "{:#x}..{:#x} ({} MiB)",
+                    self.0.start,
+                    self.0.end,
+                    size_mib
+                ))
+            }
+        } else {
+            f.write_fmt(fmt!("{:#x}..{:#x}", self.0.start, self.0.end))
+        }
+    }
+}
+
 /// Layout of the GPU framebuffer memory.
 ///
 /// Contains ranges of GPU memory reserved for a given purpose during the GSP boot process.
 #[derive(Debug)]
 pub(crate) struct FbLayout {
     /// Range of the framebuffer. Starts at `0`.
-    pub(crate) fb: Range<u64>,
+    pub(crate) fb: FbRange,
     /// VGA workspace, small area of reserved memory at the end of the framebuffer.
-    pub(crate) vga_workspace: Range<u64>,
+    pub(crate) vga_workspace: FbRange,
     /// FRTS range.
-    pub(crate) frts: Range<u64>,
+    pub(crate) frts: FbRange,
     /// Memory area containing the GSP bootloader image.
-    pub(crate) boot: Range<u64>,
+    pub(crate) boot: FbRange,
     /// Memory area containing the GSP firmware image.
-    pub(crate) elf: Range<u64>,
+    pub(crate) elf: FbRange,
     /// WPR2 heap.
-    pub(crate) wpr2_heap: Range<u64>,
+    pub(crate) wpr2_heap: FbRange,
     /// WPR2 region range, starting with an instance of `GspFwWprMeta`.
-    pub(crate) wpr2: Range<u64>,
-    pub(crate) heap: Range<u64>,
+    pub(crate) wpr2: FbRange,
+    pub(crate) heap: FbRange,
     pub(crate) vf_partition_count: u8,
 }
 
@@ -125,7 +174,7 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<
         let fb = {
             let fb_size = hal.vidmem_size(bar);
 
-            0..fb_size
+            FbRange(0..fb_size)
         };
 
         let vga_workspace = {
@@ -152,7 +201,7 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<
                 }
             };
 
-            vga_base..fb.end
+            FbRange(vga_base..fb.end)
         };
 
         let frts = {
@@ -160,7 +209,7 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<
             const FRTS_SIZE: u64 = usize_as_u64(SZ_1M);
             let frts_base = vga_workspace.start.align_down(FRTS_DOWN_ALIGN) - FRTS_SIZE;
 
-            frts_base..frts_base + FRTS_SIZE
+            FbRange(frts_base..frts_base + FRTS_SIZE)
         };
 
         let boot = {
@@ -168,7 +217,7 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<
             let bootloader_size = u64::from_safe_cast(gsp_fw.bootloader.ucode.size());
             let bootloader_base = (frts.start - bootloader_size).align_down(BOOTLOADER_DOWN_ALIGN);
 
-            bootloader_base..bootloader_base + bootloader_size
+            FbRange(bootloader_base..bootloader_base + bootloader_size)
         };
 
         let elf = {
@@ -176,7 +225,7 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<
             let elf_size = u64::from_safe_cast(gsp_fw.size);
             let elf_addr = (boot.start - elf_size).align_down(ELF_DOWN_ALIGN);
 
-            elf_addr..elf_addr + elf_size
+            FbRange(elf_addr..elf_addr + elf_size)
         };
 
         let wpr2_heap = {
@@ -185,7 +234,7 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<
                 gsp::LibosParams::from_chipset(chipset).wpr_heap_size(chipset, fb.end);
             let wpr2_heap_addr = (elf.start - wpr2_heap_size).align_down(WPR2_HEAP_DOWN_ALIGN);
 
-            wpr2_heap_addr..(elf.start).align_down(WPR2_HEAP_DOWN_ALIGN)
+            FbRange(wpr2_heap_addr..(elf.start).align_down(WPR2_HEAP_DOWN_ALIGN))
         };
 
         let wpr2 = {
@@ -193,13 +242,13 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<
             let wpr2_addr = (wpr2_heap.start - u64::from_safe_cast(size_of::<gsp::GspFwWprMeta>()))
                 .align_down(WPR2_DOWN_ALIGN);
 
-            wpr2_addr..frts.end
+            FbRange(wpr2_addr..frts.end)
         };
 
         let heap = {
             const HEAP_SIZE: u64 = usize_as_u64(SZ_1M);
 
-            wpr2.start - HEAP_SIZE..wpr2.start
+            FbRange(wpr2.start - HEAP_SIZE..wpr2.start)
         };
 
         Ok(Self {
-- 
2.53.0



* [PATCH v5 04/38] gpu: nova-core: add FbRange.len() and use it in boot.rs
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (2 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 03/38] gpu: nova-core: print FB sizes, along with ranges John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 05/38] gpu: nova-core: Hopper/Blackwell: basic GPU identification John Hubbard
                   ` (33 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

A tiny simplification: now that FbLayout uses its own specific FbRange
type, add an FbRange.len() method, and use that to (very slightly)
simplify the initialization of Frts::frts_size.
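
A small user-space sketch of the pattern: an inherent len() on the
newtype, with Deref still exposing start and end. As far as I know,
Range<u64> does not provide len() itself (ExactSizeIterator is not
implemented for it), which is why the method lives on the wrapper. Like
the subtraction it replaces, this assumes start <= end.

```rust
use std::ops::{Deref, Range};

struct FbRange(Range<u64>);

impl FbRange {
    /// Size of the range in bytes; assumes `start <= end`.
    fn len(&self) -> u64 {
        self.0.end - self.0.start
    }
}

impl Deref for FbRange {
    type Target = Range<u64>;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

fn main() {
    let frts = FbRange(0x3ff600000..0x3ff700000);
    // Field access goes through Deref; len() is the wrapper's own method.
    assert_eq!(frts.start, 0x3ff600000);
    assert_eq!(frts.len(), 0x100000);
    println!("frts_addr = {:#x}, frts_size = {:#x}", frts.start, frts.len());
}
```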

Suggested-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/fb.rs       | 8 +++++++-
 drivers/gpu/nova-core/gsp/boot.rs | 2 +-
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/nova-core/fb.rs b/drivers/gpu/nova-core/fb.rs
index 6fb804c118c6..6536d0035cb1 100644
--- a/drivers/gpu/nova-core/fb.rs
+++ b/drivers/gpu/nova-core/fb.rs
@@ -100,6 +100,12 @@ pub(crate) fn unregister(&self, bar: &Bar0) {
 
 pub(crate) struct FbRange(Range<u64>);
 
+impl FbRange {
+    pub(crate) fn len(&self) -> u64 {
+        self.0.end - self.0.start
+    }
+}
+
 impl From<Range<u64>> for FbRange {
     fn from(range: Range<u64>) -> Self {
         Self(range)
@@ -118,7 +124,7 @@ impl fmt::Debug for FbRange {
     fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
         // Use alternate format ({:#?}) to include size, compact format ({:?}) for just the range.
         if f.alternate() {
-            let size = self.0.end - self.0.start;
+            let size = self.len();
 
             if size < usize_as_u64(SZ_1M) {
                 let size_kib = size / usize_as_u64(SZ_1K);
diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
index bd6e6dc57e85..465c18e4c888 100644
--- a/drivers/gpu/nova-core/gsp/boot.rs
+++ b/drivers/gpu/nova-core/gsp/boot.rs
@@ -70,7 +70,7 @@ fn run_fwsec_frts(
             bios,
             FwsecCommand::Frts {
                 frts_addr: fb_layout.frts.start,
-                frts_size: fb_layout.frts.end - fb_layout.frts.start,
+                frts_size: fb_layout.frts.len(),
             },
         )?;
 
-- 
2.53.0



* [PATCH v5 05/38] gpu: nova-core: Hopper/Blackwell: basic GPU identification
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (3 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 04/38] gpu: nova-core: add FbRange.len() and use it in boot.rs John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 06/38] gpu: nova-core: factor .fwsignature* selection into a new find_gsp_sigs_section() John Hubbard
                   ` (32 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Add basic identification of Hopper (GH100) and Blackwell GPUs,
including selection of the ELF .fwsignature_* sections.
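
The mapping can be sketched in user space roughly as below, using a
subset of the chipset IDs from this patch. sigs_section() is a
hypothetical helper, and matching on the chipset directly (rather than
nesting a chipset match inside an architecture match, as the patch does)
is a variation chosen only to keep the example short.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Architecture {
    Ampere,
    Hopper,
    Ada,
    Blackwell,
}

/// Subset of chipsets, with the IDs used in this patch.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Chipset {
    GA104 = 0x174,
    GH100 = 0x180,
    AD102 = 0x192,
    GB100 = 0x1a0,
    GB102 = 0x1a2,
    GB202 = 0x1b2,
    GB207 = 0x1b7,
}

impl Chipset {
    fn arch(self) -> Architecture {
        match self {
            Self::GA104 => Architecture::Ampere,
            Self::GH100 => Architecture::Hopper,
            Self::AD102 => Architecture::Ada,
            Self::GB100 | Self::GB102 | Self::GB202 | Self::GB207 => Architecture::Blackwell,
        }
    }

    /// Hypothetical helper: name of the ELF section holding GSP signatures.
    fn sigs_section(self) -> &'static str {
        match self {
            Self::GA104 => ".fwsignature_ga10x",
            Self::GH100 => ".fwsignature_gh10x",
            Self::AD102 => ".fwsignature_ad10x",
            // Blackwell splits into two firmware signature families.
            Self::GB100 | Self::GB102 => ".fwsignature_gb10x",
            Self::GB202 | Self::GB207 => ".fwsignature_gb20x",
        }
    }
}

fn main() {
    for chipset in [Chipset::GH100, Chipset::GB100, Chipset::GB202] {
        println!(
            "{:?} ({:#x}): {:?}, {}",
            chipset,
            chipset as u32,
            chipset.arch(),
            chipset.sigs_section()
        );
    }
    assert_eq!(Chipset::GB102.arch(), Chipset::GB207.arch());
    assert_ne!(Chipset::GB102.sigs_section(), Chipset::GB207.sigs_section());
}
```

Matching exhaustively on the chipset lets the compiler flag any newly
added chipset that lacks a signature section.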

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/falcon/hal.rs   |  3 ++-
 drivers/gpu/nova-core/fb/hal.rs       |  5 ++---
 drivers/gpu/nova-core/firmware/gsp.rs | 17 +++++++++++++++++
 drivers/gpu/nova-core/gpu.rs          | 22 ++++++++++++++++++++++
 4 files changed, 43 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/nova-core/falcon/hal.rs b/drivers/gpu/nova-core/falcon/hal.rs
index 89babd5f9325..444c95fd4ece 100644
--- a/drivers/gpu/nova-core/falcon/hal.rs
+++ b/drivers/gpu/nova-core/falcon/hal.rs
@@ -76,7 +76,8 @@ pub(super) fn falcon_hal<E: FalconEngine + 'static>(
         TU102 | TU104 | TU106 | TU116 | TU117 => {
             KBox::new(tu102::Tu102::<E>::new(), GFP_KERNEL)? as KBox<dyn FalconHal<E>>
         }
-        GA102 | GA103 | GA104 | GA106 | GA107 | AD102 | AD103 | AD104 | AD106 | AD107 => {
+        GA102 | GA103 | GA104 | GA106 | GA107 | AD102 | AD103 | AD104 | AD106 | AD107 | GH100
+        | GB100 | GB102 | GB202 | GB203 | GB205 | GB206 | GB207 => {
             KBox::new(ga102::Ga102::<E>::new(), GFP_KERNEL)? as KBox<dyn FalconHal<E>>
         }
         _ => return Err(ENOTSUPP),
diff --git a/drivers/gpu/nova-core/fb/hal.rs b/drivers/gpu/nova-core/fb/hal.rs
index aba0abd8ee00..e709affaa7e8 100644
--- a/drivers/gpu/nova-core/fb/hal.rs
+++ b/drivers/gpu/nova-core/fb/hal.rs
@@ -34,8 +34,7 @@ pub(super) fn fb_hal(chipset: Chipset) -> &'static dyn FbHal {
     match chipset {
         TU102 | TU104 | TU106 | TU117 | TU116 => tu102::TU102_HAL,
         GA100 => ga100::GA100_HAL,
-        GA102 | GA103 | GA104 | GA106 | GA107 | AD102 | AD103 | AD104 | AD106 | AD107 => {
-            ga102::GA102_HAL
-        }
+        GA102 | GA103 | GA104 | GA106 | GA107 | AD102 | AD103 | AD104 | AD106 | AD107 | GH100
+        | GB100 | GB102 | GB202 | GB203 | GB205 | GB206 | GB207 => ga102::GA102_HAL,
     }
 }
diff --git a/drivers/gpu/nova-core/firmware/gsp.rs b/drivers/gpu/nova-core/firmware/gsp.rs
index 9488a626352f..bc2243450989 100644
--- a/drivers/gpu/nova-core/firmware/gsp.rs
+++ b/drivers/gpu/nova-core/firmware/gsp.rs
@@ -222,6 +222,23 @@ pub(crate) fn new<'a>(
                         Architecture::Ampere if chipset == Chipset::GA100 => ".fwsignature_tu10x",
                         Architecture::Ampere => ".fwsignature_ga10x",
                         Architecture::Ada => ".fwsignature_ad10x",
+                        Architecture::Hopper => ".fwsignature_gh10x",
+                        Architecture::Blackwell => {
+                            // Distinguish between GB10x and GB20x series
+                            match chipset {
+                                // GB10x series: GB100, GB102
+                                Chipset::GB100 | Chipset::GB102 => ".fwsignature_gb10x",
+                                // GB20x series: GB202, GB203, GB205, GB206, GB207
+                                Chipset::GB202
+                                | Chipset::GB203
+                                | Chipset::GB205
+                                | Chipset::GB206
+                                | Chipset::GB207 => ".fwsignature_gb20x",
+                                // It's not possible to get here with a non-Blackwell chipset, but
+                                // Rust doesn't know that.
+                                _ => return Err(ENOTSUPP),
+                            }
+                        }
                     };
 
                     elf::elf64_section(firmware.data(), sigs_section)
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index f5907c31a66d..b6a898008a59 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -83,12 +83,22 @@ fn try_from(value: u32) -> Result<Self, Self::Error> {
     GA104 = 0x174,
     GA106 = 0x176,
     GA107 = 0x177,
+    // Hopper
+    GH100 = 0x180,
     // Ada
     AD102 = 0x192,
     AD103 = 0x193,
     AD104 = 0x194,
     AD106 = 0x196,
     AD107 = 0x197,
+    // Blackwell
+    GB100 = 0x1a0,
+    GB102 = 0x1a2,
+    GB202 = 0x1b2,
+    GB203 = 0x1b3,
+    GB205 = 0x1b5,
+    GB206 = 0x1b6,
+    GB207 = 0x1b7,
 });
 
 impl Chipset {
@@ -100,9 +110,17 @@ pub(crate) fn arch(&self) -> Architecture {
             Self::GA100 | Self::GA102 | Self::GA103 | Self::GA104 | Self::GA106 | Self::GA107 => {
                 Architecture::Ampere
             }
+            Self::GH100 => Architecture::Hopper,
             Self::AD102 | Self::AD103 | Self::AD104 | Self::AD106 | Self::AD107 => {
                 Architecture::Ada
             }
+            Self::GB100
+            | Self::GB102
+            | Self::GB202
+            | Self::GB203
+            | Self::GB205
+            | Self::GB206
+            | Self::GB207 => Architecture::Blackwell,
         }
     }
 }
@@ -132,7 +150,9 @@ pub(crate) enum Architecture {
     #[default]
     Turing = 0x16,
     Ampere = 0x17,
+    Hopper = 0x18,
     Ada = 0x19,
+    Blackwell = 0x1b,
 }
 
 impl TryFrom<u8> for Architecture {
@@ -142,7 +162,9 @@ fn try_from(value: u8) -> Result<Self> {
         match value {
             0x16 => Ok(Self::Turing),
             0x17 => Ok(Self::Ampere),
+            0x18 => Ok(Self::Hopper),
             0x19 => Ok(Self::Ada),
+            0x1b => Ok(Self::Blackwell),
             _ => Err(ENODEV),
         }
     }
-- 
2.53.0



* [PATCH v5 06/38] gpu: nova-core: factor .fwsignature* selection into a new find_gsp_sigs_section()
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (4 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 05/38] gpu: nova-core: Hopper/Blackwell: basic GPU identification John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 07/38] gpu: nova-core: use GPU Architecture to simplify HAL selections John Hubbard
                   ` (31 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Keep GspFirmware::new() from getting too cluttered by factoring out the
selection of the .fwsignature* section name into a new
find_gsp_sigs_section() helper. That selection will continue to grow as
we add GPUs.

Cc: Danilo Krummrich <dakr@kernel.org>
Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
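For readers following along outside the kernel tree, here is a minimal
standalone sketch of the pattern this patch introduces: the helper
returns an Option, and the caller converts None into an error with
ok_or(). The enums below are cut-down stand-ins with only a few
variants, and NotSupported and sigs_section_or_err() are hypothetical
names standing in for ENOTSUPP and the call site in the firmware
constructor.

```rust
// Cut-down stand-ins for nova-core's types; only a few variants are modeled.
#[allow(dead_code)]
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Architecture {
    Turing,
    Ampere,
    Ada,
    Hopper,
    Blackwell,
}

#[allow(dead_code)]
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Chipset {
    TU102,
    TU116,
    GA100,
    GA102,
    AD102,
    GH100,
    GB100,
    GB202,
}

impl Chipset {
    const fn arch(self) -> Architecture {
        match self {
            Self::TU102 | Self::TU116 => Architecture::Turing,
            Self::GA100 | Self::GA102 => Architecture::Ampere,
            Self::AD102 => Architecture::Ada,
            Self::GH100 => Architecture::Hopper,
            Self::GB100 | Self::GB202 => Architecture::Blackwell,
        }
    }
}

/// Maps a chipset to the name of its signature section, if one is known.
fn find_gsp_sigs_section(chipset: Chipset) -> Option<&'static str> {
    match chipset.arch() {
        // Guarded arms must come before the catch-all arm for the same architecture.
        Architecture::Turing if matches!(chipset, Chipset::TU116) => Some(".fwsignature_tu11x"),
        Architecture::Turing => Some(".fwsignature_tu10x"),
        Architecture::Ampere if chipset == Chipset::GA100 => Some(".fwsignature_tu10x"),
        Architecture::Ampere => Some(".fwsignature_ga10x"),
        Architecture::Ada => Some(".fwsignature_ad10x"),
        Architecture::Hopper => Some(".fwsignature_gh10x"),
        Architecture::Blackwell => match chipset {
            Chipset::GB100 => Some(".fwsignature_gb10x"),
            Chipset::GB202 => Some(".fwsignature_gb20x"),
            // Unreachable for Blackwell chipsets, but the compiler cannot prove it.
            _ => None,
        },
    }
}

/// Hypothetical stand-in for the kernel's ENOTSUPP error code.
#[derive(Debug, PartialEq, Eq)]
struct NotSupported;

/// Hypothetical caller: turns a missing section name into an error.
fn sigs_section_or_err(chipset: Chipset) -> Result<&'static str, NotSupported> {
    find_gsp_sigs_section(chipset).ok_or(NotSupported)
}
```

Returning Option keeps the helper free of error codes, so the choice of
ENOTSUPP stays at the single call site.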
 drivers/gpu/nova-core/firmware/gsp.rs | 60 ++++++++++++++-------------
 1 file changed, 31 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware/gsp.rs b/drivers/gpu/nova-core/firmware/gsp.rs
index bc2243450989..468f4b43574a 100644
--- a/drivers/gpu/nova-core/firmware/gsp.rs
+++ b/drivers/gpu/nova-core/firmware/gsp.rs
@@ -146,6 +146,36 @@ pub(crate) struct GspFirmware {
 }
 
 impl GspFirmware {
+    fn find_gsp_sigs_section(chipset: Chipset) -> Option<&'static str> {
+        match chipset.arch() {
+            Architecture::Turing if matches!(chipset, Chipset::TU116 | Chipset::TU117) => {
+                Some(".fwsignature_tu11x")
+            }
+            Architecture::Turing => Some(".fwsignature_tu10x"),
+            // GA100 uses the same firmware as Turing
+            Architecture::Ampere if chipset == Chipset::GA100 => Some(".fwsignature_tu10x"),
+            Architecture::Ampere => Some(".fwsignature_ga10x"),
+            Architecture::Ada => Some(".fwsignature_ad10x"),
+            Architecture::Hopper => Some(".fwsignature_gh10x"),
+            Architecture::Blackwell => {
+                // Distinguish between GB10x and GB20x series
+                match chipset {
+                    // GB10x series: GB100, GB102
+                    Chipset::GB100 | Chipset::GB102 => Some(".fwsignature_gb10x"),
+                    // GB20x series: GB202, GB203, GB205, GB206, GB207
+                    Chipset::GB202
+                    | Chipset::GB203
+                    | Chipset::GB205
+                    | Chipset::GB206
+                    | Chipset::GB207 => Some(".fwsignature_gb20x"),
+                    // It's not possible to get here with a non-Blackwell chipset, but Rust doesn't
+                    // know that.
+                    _ => None,
+                }
+            }
+        }
+    }
+
     /// Loads the GSP firmware binaries, map them into `dev`'s address-space, and creates the page
     /// tables expected by the GSP bootloader to load it.
     pub(crate) fn new<'a>(
@@ -211,35 +241,7 @@ pub(crate) fn new<'a>(
                 },
                 size,
                 signatures: {
-                    let sigs_section = match chipset.arch() {
-                        Architecture::Turing
-                            if matches!(chipset, Chipset::TU116 | Chipset::TU117) =>
-                        {
-                            ".fwsignature_tu11x"
-                        }
-                        Architecture::Turing => ".fwsignature_tu10x",
-                        // GA100 uses the same firmware as Turing
-                        Architecture::Ampere if chipset == Chipset::GA100 => ".fwsignature_tu10x",
-                        Architecture::Ampere => ".fwsignature_ga10x",
-                        Architecture::Ada => ".fwsignature_ad10x",
-                        Architecture::Hopper => ".fwsignature_gh10x",
-                        Architecture::Blackwell => {
-                            // Distinguish between GB10x and GB20x series
-                            match chipset {
-                                // GB10x series: GB100, GB102
-                                Chipset::GB100 | Chipset::GB102 => ".fwsignature_gb10x",
-                                // GB20x series: GB202, GB203, GB205, GB206, GB207
-                                Chipset::GB202
-                                | Chipset::GB203
-                                | Chipset::GB205
-                                | Chipset::GB206
-                                | Chipset::GB207 => ".fwsignature_gb20x",
-                                // It's not possible to get here with a non-Blackwell chipset, but
-                                // Rust doesn't know that.
-                                _ => return Err(ENOTSUPP),
-                            }
-                        }
-                    };
+                    let sigs_section = Self::find_gsp_sigs_section(chipset).ok_or(ENOTSUPP)?;
 
                     elf::elf64_section(firmware.data(), sigs_section)
                         .ok_or(EINVAL)
-- 
2.53.0



* [PATCH v5 07/38] gpu: nova-core: use GPU Architecture to simplify HAL selections
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (5 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 06/38] gpu: nova-core: factor .fwsignature* selection into a new find_gsp_sigs_section() John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 08/38] gpu: nova-core: apply the one "use" item per line policy to commands.rs John Hubbard
                   ` (30 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

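Replace per-chipset match arms with Architecture-based matching in the
falcon and FB HAL selection functions. This reduces the number of match
arms that need updating when new chipsets are added within an existing
architecture.

GA100 keeps its own guarded arm in both functions: falcon_hal() still
returns ENOTSUPP for it, and fb_hal() still selects GA100_HAL.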

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
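As a rough illustration of the matching style used here, the following
standalone sketch selects an FB HAL by architecture, with one guarded
arm for the GA100 exception. The types are simplified stand-ins, and
FbHalKind and fb_hal_kind() are hypothetical names; the real fb_hal()
returns a &'static dyn FbHal.

```rust
#[allow(dead_code)]
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Architecture {
    Turing,
    Ampere,
    Ada,
    Hopper,
    Blackwell,
}

#[allow(dead_code)]
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Chipset {
    TU102,
    GA100,
    GA102,
    GA107,
    AD102,
    GH100,
    GB202,
}

impl Chipset {
    const fn arch(self) -> Architecture {
        match self {
            Self::TU102 => Architecture::Turing,
            Self::GA100 | Self::GA102 | Self::GA107 => Architecture::Ampere,
            Self::AD102 => Architecture::Ada,
            Self::GH100 => Architecture::Hopper,
            Self::GB202 => Architecture::Blackwell,
        }
    }
}

/// Hypothetical tag standing in for the `&'static dyn FbHal` objects.
#[derive(Debug, PartialEq, Eq)]
enum FbHalKind {
    Tu102,
    Ga100,
    Ga102,
}

fn fb_hal_kind(chipset: Chipset) -> FbHalKind {
    match chipset.arch() {
        Architecture::Turing => FbHalKind::Tu102,
        // The per-chipset exception is a guard on the architecture arm.
        Architecture::Ampere if chipset == Chipset::GA100 => FbHalKind::Ga100,
        // No wildcard arm: a new Architecture variant fails to compile
        // until it is handled here.
        Architecture::Ampere
        | Architecture::Ada
        | Architecture::Hopper
        | Architecture::Blackwell => FbHalKind::Ga102,
    }
}
```

With this shape, adding a chipset to an existing architecture only
touches Chipset::arch(); the HAL selection picks it up unchanged.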
 drivers/gpu/nova-core/falcon/hal.rs | 21 +++++++++++++--------
 drivers/gpu/nova-core/fb/hal.rs     | 17 +++++++++--------
 2 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/nova-core/falcon/hal.rs b/drivers/gpu/nova-core/falcon/hal.rs
index 444c95fd4ece..edf4d27d54f7 100644
--- a/drivers/gpu/nova-core/falcon/hal.rs
+++ b/drivers/gpu/nova-core/falcon/hal.rs
@@ -9,7 +9,10 @@
         FalconBromParams,
         FalconEngine, //
     },
-    gpu::Chipset,
+    gpu::{
+        Architecture,
+        Chipset, //
+    },
 };
 
 mod ga102;
@@ -70,17 +73,19 @@ fn signature_reg_fuse_version(
 pub(super) fn falcon_hal<E: FalconEngine + 'static>(
     chipset: Chipset,
 ) -> Result<KBox<dyn FalconHal<E>>> {
-    use Chipset::*;
-
-    let hal = match chipset {
-        TU102 | TU104 | TU106 | TU116 | TU117 => {
+    let hal = match chipset.arch() {
+        Architecture::Turing => {
             KBox::new(tu102::Tu102::<E>::new(), GFP_KERNEL)? as KBox<dyn FalconHal<E>>
         }
-        GA102 | GA103 | GA104 | GA106 | GA107 | AD102 | AD103 | AD104 | AD106 | AD107 | GH100
-        | GB100 | GB102 | GB202 | GB203 | GB205 | GB206 | GB207 => {
+        // TODO: support GA100. Its boot sequence is a lot like Turing, except that it handles the
+        // FRTS steps differently (specifically, it skips FWSEC-FRTS).
+        Architecture::Ampere if chipset == Chipset::GA100 => return Err(ENOTSUPP),
+        Architecture::Ampere
+        | Architecture::Hopper
+        | Architecture::Ada
+        | Architecture::Blackwell => {
             KBox::new(ga102::Ga102::<E>::new(), GFP_KERNEL)? as KBox<dyn FalconHal<E>>
         }
-        _ => return Err(ENOTSUPP),
     };
 
     Ok(hal)
diff --git a/drivers/gpu/nova-core/fb/hal.rs b/drivers/gpu/nova-core/fb/hal.rs
index e709affaa7e8..d33ca0f96417 100644
--- a/drivers/gpu/nova-core/fb/hal.rs
+++ b/drivers/gpu/nova-core/fb/hal.rs
@@ -4,7 +4,10 @@
 
 use crate::{
     driver::Bar0,
-    gpu::Chipset, //
+    gpu::{
+        Architecture,
+        Chipset, //
+    },
 };
 
 mod ga100;
@@ -29,12 +32,10 @@ pub(crate) trait FbHal {
 
 /// Returns the HAL corresponding to `chipset`.
 pub(super) fn fb_hal(chipset: Chipset) -> &'static dyn FbHal {
-    use Chipset::*;
-
-    match chipset {
-        TU102 | TU104 | TU106 | TU117 | TU116 => tu102::TU102_HAL,
-        GA100 => ga100::GA100_HAL,
-        GA102 | GA103 | GA104 | GA106 | GA107 | AD102 | AD103 | AD104 | AD106 | AD107 | GH100
-        | GB100 | GB102 | GB202 | GB203 | GB205 | GB206 | GB207 => ga102::GA102_HAL,
+    match chipset.arch() {
+        Architecture::Turing => tu102::TU102_HAL,
+        Architecture::Ampere if chipset == Chipset::GA100 => ga100::GA100_HAL,
+        Architecture::Ampere => ga102::GA102_HAL,
+        Architecture::Ada | Architecture::Hopper | Architecture::Blackwell => ga102::GA102_HAL,
     }
 }
-- 
2.53.0



* [PATCH v5 08/38] gpu: nova-core: apply the one "use" item per line policy to commands.rs
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (6 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 07/38] gpu: nova-core: use GPU Architecture to simplify HAL selections John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 09/38] gpu: nova-core: move GPU init and DMA mask setup into Gpu::new() John Hubbard
                   ` (29 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

As per [1], we need one "use" item per line, in order to reduce merge
conflicts. Furthermore, we need a trailing ", //" on the last item of
each list, in order to keep rustfmt(1) from collapsing the list back
onto a single line.

This does that for commands.rs, which is the only file in nova-core that
has any remaining instances of the old style.

[1] https://docs.kernel.org/rust/coding-guidelines.html#imports

Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/gsp/fw/commands.rs | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/nova-core/gsp/fw/commands.rs b/drivers/gpu/nova-core/gsp/fw/commands.rs
index 21be44199693..470d8edb62ff 100644
--- a/drivers/gpu/nova-core/gsp/fw/commands.rs
+++ b/drivers/gpu/nova-core/gsp/fw/commands.rs
@@ -1,8 +1,14 @@
 // SPDX-License-Identifier: GPL-2.0
 
-use kernel::prelude::*;
-use kernel::transmute::{AsBytes, FromBytes};
-use kernel::{device, pci};
+use kernel::{
+    device,
+    pci,
+    prelude::*,
+    transmute::{
+        AsBytes,
+        FromBytes, //
+    }, //
+};
 
 use crate::gsp::GSP_PAGE_SIZE;
 
-- 
2.53.0



* [PATCH v5 09/38] gpu: nova-core: move GPU init and DMA mask setup into Gpu::new()
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (7 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 08/38] gpu: nova-core: apply the one "use" item per line policy to commands.rs John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 10/38] gpu: nova-core: set DMA mask width based on GPU architecture John Hubbard
                   ` (28 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Move Spec creation, the dev_info log, and DMA mask setup from the
driver's probe() into Gpu::new(), so that all GPU-specific
initialization lives in the Gpu constructor.

This restructures Gpu::new() so that pin_init_scope() wraps
try_pin_init!, which allows running fallible setup code (Spec::new(),
dma_set_mask_and_coherent()) before the pin-initializer runs. The
parameter type changes from pci::Device<device::Bound> to
pci::Device<device::Core> because the DMA call requires the Core device
state.

Also make Chipset::arch() const, add a Spec::chipset() accessor, and
make Spec::new() pub(crate) for use by later patches.

No functional change: the same 47-bit DMA mask is applied.

Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/driver.rs | 15 --------
 drivers/gpu/nova-core/gpu.rs    | 66 ++++++++++++++++++++++-----------
 2 files changed, 44 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs
index e887bcc3187f..a26777552710 100644
--- a/drivers/gpu/nova-core/driver.rs
+++ b/drivers/gpu/nova-core/driver.rs
@@ -9,8 +9,6 @@
     auxiliary,
     device::Core,
     devres::Devres,
-    dma::Device,
-    dma::DmaMask,
     pci,
     pci::{
         Class,
@@ -37,14 +35,6 @@ pub(crate) struct NovaCore {
 
 const BAR0_SIZE: usize = SZ_16M;
 
-// For now we only support Ampere which can use up to 47-bit DMA addresses.
-//
-// TODO: Add an abstraction for this to support newer GPUs which may support
-// larger DMA addresses. Limiting these GPUs to smaller address widths won't
-// have any adverse affects, unless installed on systems which require larger
-// DMA addresses. These systems should be quite rare.
-const GPU_DMA_BITS: u32 = 47;
-
 pub(crate) type Bar0 = pci::Bar<BAR0_SIZE>;
 
 kernel::pci_device_table!(
@@ -83,11 +73,6 @@ fn probe(pdev: &pci::Device<Core>, _info: &Self::IdInfo) -> impl PinInit<Self, E
             pdev.enable_device_mem()?;
             pdev.set_master();
 
-            // SAFETY: No concurrent DMA allocations or mappings can be made because
-            // the device is still being probed and therefore isn't being used by
-            // other threads of execution.
-            unsafe { pdev.dma_set_mask_and_coherent(DmaMask::new::<GPU_DMA_BITS>())? };
-
             let bar = Arc::pin_init(
                 pdev.iomap_region_sized::<BAR0_SIZE>(0, c"nova-core/bar0"),
                 GFP_KERNEL,
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index b6a898008a59..93bf1c7b3ea1 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -3,6 +3,10 @@
 use kernel::{
     device,
     devres::Devres,
+    dma::{
+        Device,
+        DmaMask, //
+    },
     fmt,
     pci,
     prelude::*,
@@ -102,7 +106,7 @@ fn try_from(value: u32) -> Result<Self, Self::Error> {
 });
 
 impl Chipset {
-    pub(crate) fn arch(&self) -> Architecture {
+    pub(crate) const fn arch(&self) -> Architecture {
         match self {
             Self::TU102 | Self::TU104 | Self::TU106 | Self::TU117 | Self::TU116 => {
                 Architecture::Turing
@@ -155,6 +159,10 @@ pub(crate) enum Architecture {
     Blackwell = 0x1b,
 }
 
+// TODO: Set the DMA mask per-architecture. Hopper and Blackwell support 52-bit
+// DMA addresses. For now, use 47-bit which is correct for Turing, Ampere, and Ada.
+const GPU_DMA_BITS: u32 = 47;
+
 impl TryFrom<u8> for Architecture {
     type Error = Error;
 
@@ -204,7 +212,7 @@ pub(crate) struct Spec {
 }
 
 impl Spec {
-    fn new(dev: &device::Device, bar: &Bar0) -> Result<Spec> {
+    pub(crate) fn new(dev: &device::Device, bar: &Bar0) -> Result<Spec> {
         // Some brief notes about boot0 and boot42, in chronological order:
         //
         // NV04 through NV50:
@@ -234,6 +242,10 @@ fn new(dev: &device::Device, bar: &Bar0) -> Result<Spec> {
             dev_err!(dev, "Unsupported chipset: {}\n", boot42);
         })
     }
+
+    pub(crate) fn chipset(&self) -> Chipset {
+        self.chipset
+    }
 }
 
 impl TryFrom<regs::NV_PMC_BOOT_42> for Spec {
@@ -278,36 +290,46 @@ pub(crate) struct Gpu {
 
 impl Gpu {
     pub(crate) fn new<'a>(
-        pdev: &'a pci::Device<device::Bound>,
+        pdev: &'a pci::Device<device::Core>,
         devres_bar: Arc<Devres<Bar0>>,
         bar: &'a Bar0,
     ) -> impl PinInit<Self, Error> + 'a {
-        try_pin_init!(Self {
-            spec: Spec::new(pdev.as_ref(), bar).inspect(|spec| {
-                dev_info!(pdev, "NVIDIA ({})\n", spec);
-            })?,
+        pin_init::pin_init_scope(move || {
+            let spec = Spec::new(pdev.as_ref(), bar)?;
+            dev_info!(pdev, "NVIDIA ({})\n", spec);
+
+            // SAFETY: No concurrent DMA allocations or mappings can be made because
+            // the device is still being probed and therefore isn't being used by
+            // other threads of execution.
+            unsafe { pdev.dma_set_mask_and_coherent(DmaMask::new::<GPU_DMA_BITS>())? };
+
+            let chipset = spec.chipset();
 
-            // We must wait for GFW_BOOT completion before doing any significant setup on the GPU.
-            _: {
-                gfw::wait_gfw_boot_completion(bar)
-                    .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
-            },
+            Ok(try_pin_init!(Self {
+                // We must wait for GFW_BOOT completion before doing any significant setup
+                // on the GPU.
+                _: {
+                    gfw::wait_gfw_boot_completion(bar)
+                        .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
+                },
 
-            sysmem_flush: SysmemFlush::register(pdev.as_ref(), bar, spec.chipset)?,
+                sysmem_flush: SysmemFlush::register(pdev.as_ref(), bar, chipset)?,
 
-            gsp_falcon: Falcon::new(
-                pdev.as_ref(),
-                spec.chipset,
-            )
-            .inspect(|falcon| falcon.clear_swgen0_intr(bar))?,
+                gsp_falcon: Falcon::new(
+                    pdev.as_ref(),
+                    chipset,
+                )
+                .inspect(|falcon| falcon.clear_swgen0_intr(bar))?,
 
-            sec2_falcon: Falcon::new(pdev.as_ref(), spec.chipset)?,
+                sec2_falcon: Falcon::new(pdev.as_ref(), chipset)?,
 
-            gsp <- Gsp::new(pdev),
+                gsp <- Gsp::new(pdev),
 
-            _: { gsp.boot(pdev, bar, spec.chipset, gsp_falcon, sec2_falcon)? },
+                _: { gsp.boot(pdev, bar, chipset, gsp_falcon, sec2_falcon)? },
 
-            bar: devres_bar,
+                bar: devres_bar,
+                spec,
+            }))
         })
     }
 
-- 
2.53.0



* [PATCH v5 10/38] gpu: nova-core: set DMA mask width based on GPU architecture
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (8 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 09/38] gpu: nova-core: move GPU init and DMA mask setup into Gpu::new() John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 11/38] gpu: nova-core: Hopper/Blackwell: skip GFW boot waiting John Hubbard
                   ` (27 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Replace the hardcoded 47-bit DMA mask with per-architecture values.
Hopper and Blackwell support 52-bit DMA addresses, while Turing,
Ampere, and Ada use 47-bit.

Add Architecture::dma_mask() as a const method with an exhaustive
match, so that new architectures will get a compile-time reminder
to specify their DMA mask width.

Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
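To make the two widths concrete, here is a small standalone sketch of
the per-architecture selection. The widths (47 and 52 bits) come from
the commit message; dma_bits() and dma_mask_value() are hypothetical
helpers that expose the raw numbers, whereas the patch itself returns
the kernel's DmaMask type.

```rust
#[allow(dead_code)]
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Architecture {
    Turing,
    Ampere,
    Ada,
    Hopper,
    Blackwell,
}

impl Architecture {
    /// DMA address width in bits (hypothetical helper, for illustration).
    const fn dma_bits(self) -> u32 {
        // Exhaustive on purpose: adding a variant without a width is a
        // compile error rather than a silent default.
        match self {
            Self::Turing | Self::Ampere | Self::Ada => 47,
            Self::Hopper | Self::Blackwell => 52,
        }
    }

    /// The address mask with the low `dma_bits()` bits set.
    const fn dma_mask_value(self) -> u64 {
        (1u64 << self.dma_bits()) - 1
    }
}
```

For example, Architecture::Ada.dma_mask_value() is 0x7fff_ffff_ffff and
Architecture::Hopper.dma_mask_value() is 0xf_ffff_ffff_ffff.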
 drivers/gpu/nova-core/gpu.rs | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 93bf1c7b3ea1..f6af75656861 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -159,9 +159,18 @@ pub(crate) enum Architecture {
     Blackwell = 0x1b,
 }
 
-// TODO: Set the DMA mask per-architecture. Hopper and Blackwell support 52-bit
-// DMA addresses. For now, use 47-bit which is correct for Turing, Ampere, and Ada.
-const GPU_DMA_BITS: u32 = 47;
+impl Architecture {
+    /// Returns the DMA mask supported by this architecture.
+    ///
+    /// Hopper and Blackwell support 52-bit DMA addresses, while earlier architectures
+    /// (Turing, Ampere, Ada) support 47-bit DMA addresses.
+    pub(crate) const fn dma_mask(&self) -> DmaMask {
+        match self {
+            Self::Turing | Self::Ampere | Self::Ada => DmaMask::new::<47>(),
+            Self::Hopper | Self::Blackwell => DmaMask::new::<52>(),
+        }
+    }
+}
 
 impl TryFrom<u8> for Architecture {
     type Error = Error;
@@ -301,7 +310,7 @@ pub(crate) fn new<'a>(
             // SAFETY: No concurrent DMA allocations or mappings can be made because
             // the device is still being probed and therefore isn't being used by
             // other threads of execution.
-            unsafe { pdev.dma_set_mask_and_coherent(DmaMask::new::<GPU_DMA_BITS>())? };
+            unsafe { pdev.dma_set_mask_and_coherent(spec.chipset().arch().dma_mask())? };
 
             let chipset = spec.chipset();
 
-- 
2.53.0



* [PATCH v5 11/38] gpu: nova-core: Hopper/Blackwell: skip GFW boot waiting
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (9 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 10/38] gpu: nova-core: set DMA mask width based on GPU architecture John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 12/38] gpu: nova-core: move firmware image parsing code to firmware.rs John Hubbard
                   ` (26 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Hopper and Blackwell GPUs use FSP-based secure boot and do not require
waiting for GFW_BOOT completion. Skip this step for these architectures.
The check is written so that future architectures skip it as well,
because no future GPUs will use the older GFW_BOOT system.

Cc: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
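A minimal standalone sketch of the gating follows. The predicate mirrors
the one added by this patch; early_boot_steps() and the step names it
returns are hypothetical, and only show which steps would run for a
given architecture.

```rust
#[allow(dead_code)]
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Architecture {
    Turing,
    Ampere,
    Ada,
    Hopper,
    Blackwell,
}

impl Architecture {
    /// True only for the architectures that boot via GFW_BOOT.
    const fn needs_gfw_boot(self) -> bool {
        // `matches!` lists the legacy set, so any variant added later
        // defaults to `false` without touching this function.
        matches!(self, Self::Turing | Self::Ampere | Self::Ada)
    }
}

/// Hypothetical stand-in for the start of GPU init: returns the names of
/// the steps that would run, in order.
fn early_boot_steps(arch: Architecture) -> Vec<&'static str> {
    let mut steps = Vec::new();
    if arch.needs_gfw_boot() {
        steps.push("wait_gfw_boot_completion");
    }
    steps.push("sysmem_flush_register");
    steps
}
```

Unlike dma_mask(), which uses an exhaustive match, this predicate
deliberately has a default: the legacy set is closed, so new
architectures should not need to opt out one by one.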
 drivers/gpu/nova-core/gpu.rs | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index f6af75656861..50bf351b64cc 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -170,6 +170,15 @@ pub(crate) const fn dma_mask(&self) -> DmaMask {
             Self::Hopper | Self::Blackwell => DmaMask::new::<52>(),
         }
     }
+
+    /// Returns whether the GPU uses GFW_BOOT for firmware loading.
+    ///
+    /// Pre-Hopper architectures (Turing, Ampere, Ada) require waiting for GFW_BOOT completion
+    /// before any significant GPU setup. Hopper and later use the FSP Chain of Trust boot path
+    /// instead.
+    pub(crate) const fn needs_gfw_boot(&self) -> bool {
+        matches!(self, Self::Turing | Self::Ampere | Self::Ada)
+    }
 }
 
 impl TryFrom<u8> for Architecture {
@@ -315,11 +324,11 @@ pub(crate) fn new<'a>(
             let chipset = spec.chipset();
 
             Ok(try_pin_init!(Self {
-                // We must wait for GFW_BOOT completion before doing any significant setup
-                // on the GPU.
                 _: {
-                    gfw::wait_gfw_boot_completion(bar)
-                        .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
+                    if chipset.arch().needs_gfw_boot() {
+                        gfw::wait_gfw_boot_completion(bar)
+                            .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
+                    }
                 },
 
                 sysmem_flush: SysmemFlush::register(pdev.as_ref(), bar, chipset)?,
-- 
2.53.0



* [PATCH v5 12/38] gpu: nova-core: move firmware image parsing code to firmware.rs
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (10 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 11/38] gpu: nova-core: Hopper/Blackwell: skip GFW boot waiting John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 13/38] gpu: nova-core: factor out an elf_str() function John Hubbard
                   ` (25 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Up until now, only the GSP required parsing of its firmware headers.
However, upcoming support for Hopper/Blackwell+ adds another firmware
image (FMC), along with another format (ELF32).

Therefore, the current ELF64 section parsing support needs to be moved
up a level, so that both of the above can use it.

There are no functional changes. This is pure code movement.

Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
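For anyone who wants to experiment with the section lookup in userspace,
here is a standalone sketch of the same walk: read the section header
table, locate the section-name string table via e_shstrndx, then find
the header whose name matches. It is not the kernel code. It assumes a
little-endian ELF64 image and reads fields at their fixed ELF64 offsets
instead of using bindings::elf64_hdr and FromBytes, and read_le() and
section_name() are hypothetical helpers. Like the moved code, it does
not validate the ELF magic.

```rust
use std::convert::TryFrom;

const EHDR_SIZE: usize = 64; // size of an ELF64 file header
const SHDR_SIZE: usize = 64; // size of an ELF64 section header

/// Copies `N` bytes starting at `off`, or returns `None` if out of bounds.
fn read_le<const N: usize>(buf: &[u8], off: usize) -> Option<[u8; N]> {
    let end = off.checked_add(N)?;
    <[u8; N]>::try_from(buf.get(off..end)?).ok()
}

/// Returns the NUL-terminated name of section header `sh`, without the NUL.
fn section_name<'a>(elf: &'a [u8], strtab_off: u64, sh: &[u8]) -> Option<&'a [u8]> {
    let sh_name = u32::from_le_bytes(read_le::<4>(sh, 0)?);
    let off = strtab_off.checked_add(u64::from(sh_name))?;
    let bytes = elf.get(usize::try_from(off).ok()?..)?;
    let nul = bytes.iter().position(|&b| b == 0)?;
    Some(&bytes[..nul])
}

/// Tries to find the section called `name` in the ELF64 image `elf`.
fn elf64_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
    // Require a complete file header before reading any of its fields.
    elf.get(..EHDR_SIZE)?;
    let shoff = usize::try_from(u64::from_le_bytes(read_le::<8>(elf, 0x28)?)).ok()?;
    let shnum = usize::from(u16::from_le_bytes(read_le::<2>(elf, 0x3c)?));
    let shstrndx = usize::from(u16::from_le_bytes(read_le::<2>(elf, 0x3e)?));

    // Slice out the whole section header table, with overflow checks.
    let shdrs_end = shnum.checked_mul(SHDR_SIZE)?.checked_add(shoff)?;
    let shdrs = elf.get(shoff..shdrs_end)?;

    // The string table's file offset is the sh_offset field (byte 24).
    let strhdr = shdrs.chunks_exact(SHDR_SIZE).nth(shstrndx)?;
    let strtab_off = u64::from_le_bytes(read_le::<8>(strhdr, 24)?);

    shdrs
        .chunks_exact(SHDR_SIZE)
        // Headers whose name cannot be read simply do not match.
        .find(|sh| section_name(elf, strtab_off, sh) == Some(name.as_bytes()))
        .and_then(|sh| {
            let start = usize::try_from(u64::from_le_bytes(read_le::<8>(sh, 24)?)).ok()?;
            let size = usize::try_from(u64::from_le_bytes(read_le::<8>(sh, 32)?)).ok()?;
            elf.get(start..start.checked_add(size)?)
        })
}
```

Every offset taken from the image is bounds-checked through get() and
checked arithmetic, so a truncated or malformed image yields None
instead of a panic.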
 drivers/gpu/nova-core/firmware.rs     | 88 +++++++++++++++++++++++++
 drivers/gpu/nova-core/firmware/gsp.rs | 93 ++-------------------------
 2 files changed, 94 insertions(+), 87 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 68779540aa28..a0201ac8ccb4 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -437,3 +437,91 @@ pub(crate) const fn create(
         this.0
     }
 }
+
+/// Ad-hoc and temporary module to extract sections from ELF images.
+///
+/// Some firmware images are currently packaged as ELF files, where sections names are used as keys
+/// to specific and related bits of data. Future firmware versions are scheduled to move away from
+/// that scheme before nova-core becomes stable, which means this module will eventually be
+/// removed.
+mod elf {
+    use core::mem::size_of;
+
+    use kernel::{
+        bindings,
+        str::CStr,
+        transmute::FromBytes, //
+    };
+
+    /// Newtype to provide a [`FromBytes`] implementation.
+    #[repr(transparent)]
+    struct Elf64Hdr(bindings::elf64_hdr);
+    // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
+    unsafe impl FromBytes for Elf64Hdr {}
+
+    #[repr(transparent)]
+    struct Elf64SHdr(bindings::elf64_shdr);
+    // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
+    unsafe impl FromBytes for Elf64SHdr {}
+
+    /// Tries to extract section with name `name` from the ELF64 image `elf`, and returns it.
+    pub(super) fn elf64_section<'a, 'b>(elf: &'a [u8], name: &'b str) -> Option<&'a [u8]> {
+        let hdr = &elf
+            .get(0..size_of::<bindings::elf64_hdr>())
+            .and_then(Elf64Hdr::from_bytes)?
+            .0;
+
+        // Get all the section headers.
+        let mut shdr = {
+            let shdr_num = usize::from(hdr.e_shnum);
+            let shdr_start = usize::try_from(hdr.e_shoff).ok()?;
+            let shdr_end = shdr_num
+                .checked_mul(size_of::<Elf64SHdr>())
+                .and_then(|v| v.checked_add(shdr_start))?;
+
+            elf.get(shdr_start..shdr_end)
+                .map(|slice| slice.chunks_exact(size_of::<Elf64SHdr>()))?
+        };
+
+        // Get the strings table.
+        let strhdr = shdr
+            .clone()
+            .nth(usize::from(hdr.e_shstrndx))
+            .and_then(Elf64SHdr::from_bytes)?;
+
+        // Find the section which name matches `name` and return it.
+        shdr.find(|&sh| {
+            let Some(hdr) = Elf64SHdr::from_bytes(sh) else {
+                return false;
+            };
+
+            let Some(name_idx) = strhdr
+                .0
+                .sh_offset
+                .checked_add(u64::from(hdr.0.sh_name))
+                .and_then(|idx| usize::try_from(idx).ok())
+            else {
+                return false;
+            };
+
+            // Get the start of the name.
+            elf.get(name_idx..)
+                .and_then(|nstr| CStr::from_bytes_until_nul(nstr).ok())
+                // Convert into str.
+                .and_then(|c_str| c_str.to_str().ok())
+                // Check that the name matches.
+                .map(|str| str == name)
+                .unwrap_or(false)
+        })
+        // Return the slice containing the section.
+        .and_then(|sh| {
+            let hdr = Elf64SHdr::from_bytes(sh)?;
+            let start = usize::try_from(hdr.0.sh_offset).ok()?;
+            let end = usize::try_from(hdr.0.sh_size)
+                .ok()
+                .and_then(|sh_size| start.checked_add(sh_size))?;
+
+            elf.get(start..end)
+        })
+    }
+}
diff --git a/drivers/gpu/nova-core/firmware/gsp.rs b/drivers/gpu/nova-core/firmware/gsp.rs
index 468f4b43574a..f247deb06633 100644
--- a/drivers/gpu/nova-core/firmware/gsp.rs
+++ b/drivers/gpu/nova-core/firmware/gsp.rs
@@ -1,5 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 
+use core::mem::size_of_val;
+
 use kernel::{
     device,
     dma::{
@@ -16,7 +18,10 @@
 
 use crate::{
     dma::DmaObject,
-    firmware::riscv::RiscvFirmware,
+    firmware::{
+        elf,
+        riscv::RiscvFirmware, //
+    },
     gpu::{
         Architecture,
         Chipset, //
@@ -25,92 +30,6 @@
     num::FromSafeCast,
 };
 
-/// Ad-hoc and temporary module to extract sections from ELF images.
-///
-/// Some firmware images are currently packaged as ELF files, where sections names are used as keys
-/// to specific and related bits of data. Future firmware versions are scheduled to move away from
-/// that scheme before nova-core becomes stable, which means this module will eventually be
-/// removed.
-mod elf {
-    use kernel::{
-        bindings,
-        prelude::*,
-        transmute::FromBytes, //
-    };
-
-    /// Newtype to provide a [`FromBytes`] implementation.
-    #[repr(transparent)]
-    struct Elf64Hdr(bindings::elf64_hdr);
-    // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
-    unsafe impl FromBytes for Elf64Hdr {}
-
-    #[repr(transparent)]
-    struct Elf64SHdr(bindings::elf64_shdr);
-    // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
-    unsafe impl FromBytes for Elf64SHdr {}
-
-    /// Tries to extract section with name `name` from the ELF64 image `elf`, and returns it.
-    pub(super) fn elf64_section<'a, 'b>(elf: &'a [u8], name: &'b str) -> Option<&'a [u8]> {
-        let hdr = &elf
-            .get(0..size_of::<bindings::elf64_hdr>())
-            .and_then(Elf64Hdr::from_bytes)?
-            .0;
-
-        // Get all the section headers.
-        let mut shdr = {
-            let shdr_num = usize::from(hdr.e_shnum);
-            let shdr_start = usize::try_from(hdr.e_shoff).ok()?;
-            let shdr_end = shdr_num
-                .checked_mul(size_of::<Elf64SHdr>())
-                .and_then(|v| v.checked_add(shdr_start))?;
-
-            elf.get(shdr_start..shdr_end)
-                .map(|slice| slice.chunks_exact(size_of::<Elf64SHdr>()))?
-        };
-
-        // Get the strings table.
-        let strhdr = shdr
-            .clone()
-            .nth(usize::from(hdr.e_shstrndx))
-            .and_then(Elf64SHdr::from_bytes)?;
-
-        // Find the section which name matches `name` and return it.
-        shdr.find(|&sh| {
-            let Some(hdr) = Elf64SHdr::from_bytes(sh) else {
-                return false;
-            };
-
-            let Some(name_idx) = strhdr
-                .0
-                .sh_offset
-                .checked_add(u64::from(hdr.0.sh_name))
-                .and_then(|idx| usize::try_from(idx).ok())
-            else {
-                return false;
-            };
-
-            // Get the start of the name.
-            elf.get(name_idx..)
-                .and_then(|nstr| CStr::from_bytes_until_nul(nstr).ok())
-                // Convert into str.
-                .and_then(|c_str| c_str.to_str().ok())
-                // Check that the name matches.
-                .map(|str| str == name)
-                .unwrap_or(false)
-        })
-        // Return the slice containing the section.
-        .and_then(|sh| {
-            let hdr = Elf64SHdr::from_bytes(sh)?;
-            let start = usize::try_from(hdr.0.sh_offset).ok()?;
-            let end = usize::try_from(hdr.0.sh_size)
-                .ok()
-                .and_then(|sh_size| start.checked_add(sh_size))?;
-
-            elf.get(start..end)
-        })
-    }
-}
-
 /// GSP firmware with 3-level radix page tables for the GSP bootloader.
 ///
 /// The bootloader expects firmware to be mapped starting at address 0 in GSP's virtual address
-- 
2.53.0



* [PATCH v5 13/38] gpu: nova-core: factor out an elf_str() function
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (11 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 12/38] gpu: nova-core: move firmware image parsing code to firmware.rs John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 14/38] gpu: nova-core: don't assume 64-bit firmware images John Hubbard
                   ` (24 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

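Factor the section-name lookup out of elf64_section() into a new
elf_str() helper, and collapse the find()/and_then() chain into a single
find_map(). This is an incremental step toward adding ELF32 support
alongside the existing ELF64 section support, for handling GPU firmware.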

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
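For reviewers who want to poke at the string lookup outside the kernel,
here is a minimal user-space sketch of the same idea. It is not part of
this patch: it uses std's CStr rather than the kernel's, and the names
str_at() and name_matches() are hypothetical stand-ins for elf_str() and
the comparison done inside find_map().

```rust
// In the prelude since the 2021 edition; imported so older editions build too.
use std::convert::TryFrom;
use std::ffi::CStr;

/// Returns the NUL-terminated UTF-8 string that starts at `offset` in `image`.
///
/// Yields `None` if the offset is out of range, if no NUL byte follows, or if
/// the bytes before the NUL are not valid UTF-8.
fn str_at(image: &[u8], offset: u64) -> Option<&str> {
    let idx = usize::try_from(offset).ok()?;
    let bytes = image.get(idx..)?;
    CStr::from_bytes_until_nul(bytes).ok()?.to_str().ok()
}

/// Hypothetical helper: checks whether the name stored at `name_off` inside
/// the string table that begins at `strtab_off` equals `wanted`.
fn name_matches(image: &[u8], strtab_off: u64, name_off: u32, wanted: &str) -> bool {
    strtab_off
        // The sum of two untrusted offsets may overflow, so add with a check.
        .checked_add(u64::from(name_off))
        .and_then(|off| str_at(image, off))
        .map_or(false, |s| s == wanted)
}
```

Every failure mode (bad offset, missing terminator, invalid UTF-8)
collapses to None, which is what lets the caller use `?` inside
find_map() and simply move on to the next section header.
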
 drivers/gpu/nova-core/firmware.rs | 40 ++++++++++++-------------------
 1 file changed, 15 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index a0201ac8ccb4..72cefc3142ea 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -464,6 +464,13 @@ unsafe impl FromBytes for Elf64Hdr {}
     // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
     unsafe impl FromBytes for Elf64SHdr {}
 
+    /// Returns a NULL-terminated string from the ELF image at `offset`.
+    fn elf_str(elf: &[u8], offset: u64) -> Option<&str> {
+        let idx = usize::try_from(offset).ok()?;
+        let bytes = elf.get(idx..)?;
+        CStr::from_bytes_until_nul(bytes).ok()?.to_str().ok()
+    }
+
     /// Tries to extract section with name `name` from the ELF64 image `elf`, and returns it.
     pub(super) fn elf64_section<'a, 'b>(elf: &'a [u8], name: &'b str) -> Option<&'a [u8]> {
         let hdr = &elf
@@ -490,32 +497,15 @@ pub(super) fn elf64_section<'a, 'b>(elf: &'a [u8], name: &'b str) -> Option<&'a
             .and_then(Elf64SHdr::from_bytes)?;
 
         // Find the section which name matches `name` and return it.
-        shdr.find(|&sh| {
-            let Some(hdr) = Elf64SHdr::from_bytes(sh) else {
-                return false;
-            };
-
-            let Some(name_idx) = strhdr
-                .0
-                .sh_offset
-                .checked_add(u64::from(hdr.0.sh_name))
-                .and_then(|idx| usize::try_from(idx).ok())
-            else {
-                return false;
-            };
-
-            // Get the start of the name.
-            elf.get(name_idx..)
-                .and_then(|nstr| CStr::from_bytes_until_nul(nstr).ok())
-                // Convert into str.
-                .and_then(|c_str| c_str.to_str().ok())
-                // Check that the name matches.
-                .map(|str| str == name)
-                .unwrap_or(false)
-        })
-        // Return the slice containing the section.
-        .and_then(|sh| {
+        shdr.find_map(|sh| {
             let hdr = Elf64SHdr::from_bytes(sh)?;
+            let name_offset = strhdr.0.sh_offset.checked_add(u64::from(hdr.0.sh_name))?;
+            let section_name = elf_str(elf, name_offset)?;
+
+            if section_name != name {
+                return None;
+            }
+
             let start = usize::try_from(hdr.0.sh_offset).ok()?;
             let end = usize::try_from(hdr.0.sh_size)
                 .ok()
-- 
2.53.0



* [PATCH v5 14/38] gpu: nova-core: don't assume 64-bit firmware images
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (12 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 13/38] gpu: nova-core: factor out an elf_str() function John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 15/38] gpu: nova-core: add support for 32-bit " John Hubbard
                   ` (23 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Add ElfHeader and ElfSectionHeader traits to abstract out differences
between ELF32 and ELF64. Implement these for ELF64.

This is in preparation for upcoming ELF32 section support, and for
auto-selecting ELF32 or ELF64.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
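To illustrate why the accessors return u64 regardless of ELF class, here
is a rough user-space sketch, not part of this patch. It covers only the
section-header side and makes several assumptions: the names
SectionHeader, Shdr32, Shdr64 and section_data() are hypothetical, and
fields are decoded by hand as little-endian at the standard ELF32/ELF64
offsets instead of going through FromBytes and the bindings types.

```rust
// In the prelude since the 2021 edition; imported so older editions build too.
use std::convert::TryFrom;

/// Accessors shared by ELF32 and ELF64 section headers, widened to `u64`.
trait SectionHeader: Sized {
    /// On-disk size of one section header entry, in bytes.
    const SIZE: usize;
    fn parse(bytes: &[u8]) -> Option<Self>;
    fn offset(&self) -> u64;
    fn size(&self) -> u64;
}

/// Reads a little-endian `u32` at byte position `at`.
fn le32(b: &[u8], at: usize) -> Option<u32> {
    let f = b.get(at..at.checked_add(4)?)?;
    Some(u32::from_le_bytes([f[0], f[1], f[2], f[3]]))
}

/// Reads a little-endian `u64` at byte position `at`.
fn le64(b: &[u8], at: usize) -> Option<u64> {
    let lo = u64::from(le32(b, at)?);
    let hi = u64::from(le32(b, at.checked_add(4)?)?);
    Some(lo | (hi << 32))
}

/// `sh_offset` and `sh_size` of an ELF32 section header (32-bit fields).
struct Shdr32 { offset: u32, size: u32 }

/// `sh_offset` and `sh_size` of an ELF64 section header (64-bit fields).
struct Shdr64 { offset: u64, size: u64 }

impl SectionHeader for Shdr32 {
    const SIZE: usize = 40;
    fn parse(b: &[u8]) -> Option<Self> {
        Some(Shdr32 { offset: le32(b, 16)?, size: le32(b, 20)? })
    }
    // Widening u32 to u64 is lossless, so `u64::from` suffices.
    fn offset(&self) -> u64 { u64::from(self.offset) }
    fn size(&self) -> u64 { u64::from(self.size) }
}

impl SectionHeader for Shdr64 {
    const SIZE: usize = 64;
    fn parse(b: &[u8]) -> Option<Self> {
        Some(Shdr64 { offset: le64(b, 24)?, size: le64(b, 32)? })
    }
    fn offset(&self) -> u64 { self.offset }
    fn size(&self) -> u64 { self.size }
}

/// Returns the contents of section `index`, given the byte offset `shoff` of
/// the section header table. One generic body serves both ELF classes.
fn section_data<S: SectionHeader>(image: &[u8], shoff: usize, index: usize) -> Option<&[u8]> {
    let entry_start = index.checked_mul(S::SIZE)?.checked_add(shoff)?;
    let entry_end = entry_start.checked_add(S::SIZE)?;
    let sh = S::parse(image.get(entry_start..entry_end)?)?;

    // All values come from the file, so every conversion and sum is checked.
    let start = usize::try_from(sh.offset()).ok()?;
    let end = start.checked_add(usize::try_from(sh.size()).ok()?)?;
    image.get(start..end)
}
```

Because both implementations widen to u64, the bounds arithmetic in the
generic function is written once and narrowed to usize in a single
place, which is the property the traits in this patch are after.
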
 drivers/gpu/nova-core/firmware.rs | 99 ++++++++++++++++++++++---------
 1 file changed, 72 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 72cefc3142ea..6ed76a7e15f1 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -453,17 +453,60 @@ mod elf {
         transmute::FromBytes, //
     };
 
-    /// Newtype to provide a [`FromBytes`] implementation.
+    /// Trait to abstract over ELF header differences (32-bit vs 64-bit).
+    trait ElfHeader: FromBytes {
+        fn shnum(&self) -> u16;
+        fn shoff(&self) -> u64;
+        fn shstrndx(&self) -> u16;
+    }
+
+    /// Trait to abstract over ELF section header differences (32-bit vs 64-bit).
+    trait ElfSectionHeader: FromBytes {
+        fn name(&self) -> u32;
+        fn offset(&self) -> u64;
+        fn size(&self) -> u64;
+    }
+
+    /// Newtype to provide [`FromBytes`] and [`ElfHeader`] implementations.
     #[repr(transparent)]
     struct Elf64Hdr(bindings::elf64_hdr);
     // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
     unsafe impl FromBytes for Elf64Hdr {}
 
+    impl ElfHeader for Elf64Hdr {
+        fn shnum(&self) -> u16 {
+            self.0.e_shnum
+        }
+
+        fn shoff(&self) -> u64 {
+            self.0.e_shoff
+        }
+
+        fn shstrndx(&self) -> u16 {
+            self.0.e_shstrndx
+        }
+    }
+
+    /// Newtype to provide [`FromBytes`] and [`ElfSectionHeader`] implementations.
     #[repr(transparent)]
     struct Elf64SHdr(bindings::elf64_shdr);
     // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
     unsafe impl FromBytes for Elf64SHdr {}
 
+    impl ElfSectionHeader for Elf64SHdr {
+        fn name(&self) -> u32 {
+            self.0.sh_name
+        }
+
+        fn offset(&self) -> u64 {
+            self.0.sh_offset
+        }
+
+        fn size(&self) -> u64 {
+            self.0.sh_size
+        }
+    }
+
     /// Returns a NULL-terminated string from the ELF image at `offset`.
     fn elf_str(elf: &[u8], offset: u64) -> Option<&str> {
         let idx = usize::try_from(offset).ok()?;
@@ -471,47 +514,49 @@ fn elf_str(elf: &[u8], offset: u64) -> Option<&str> {
         CStr::from_bytes_until_nul(bytes).ok()?.to_str().ok()
     }
 
-    /// Tries to extract section with name `name` from the ELF64 image `elf`, and returns it.
-    pub(super) fn elf64_section<'a, 'b>(elf: &'a [u8], name: &'b str) -> Option<&'a [u8]> {
-        let hdr = &elf
-            .get(0..size_of::<bindings::elf64_hdr>())
-            .and_then(Elf64Hdr::from_bytes)?
-            .0;
+    fn elf_section_generic<'a, H, S>(elf: &'a [u8], name: &str) -> Option<&'a [u8]>
+    where
+        H: ElfHeader,
+        S: ElfSectionHeader,
+    {
+        let hdr = H::from_bytes(elf.get(0..size_of::<H>())?)?;
 
-        // Get all the section headers.
-        let mut shdr = {
-            let shdr_num = usize::from(hdr.e_shnum);
-            let shdr_start = usize::try_from(hdr.e_shoff).ok()?;
-            let shdr_end = shdr_num
-                .checked_mul(size_of::<Elf64SHdr>())
-                .and_then(|v| v.checked_add(shdr_start))?;
+        let shdr_num = usize::from(hdr.shnum());
+        let shdr_start = usize::try_from(hdr.shoff()).ok()?;
+        let shdr_end = shdr_num
+            .checked_mul(size_of::<S>())
+            .and_then(|v| v.checked_add(shdr_start))?;
 
-            elf.get(shdr_start..shdr_end)
-                .map(|slice| slice.chunks_exact(size_of::<Elf64SHdr>()))?
-        };
+        // Get all the section headers as an iterator over byte chunks.
+        let shdr_bytes = elf.get(shdr_start..shdr_end)?;
+        let mut shdr_iter = shdr_bytes.chunks_exact(size_of::<S>());
 
         // Get the strings table.
-        let strhdr = shdr
+        let strhdr = shdr_iter
             .clone()
-            .nth(usize::from(hdr.e_shstrndx))
-            .and_then(Elf64SHdr::from_bytes)?;
+            .nth(usize::from(hdr.shstrndx()))
+            .and_then(S::from_bytes)?;
 
         // Find the section which name matches `name` and return it.
-        shdr.find_map(|sh| {
-            let hdr = Elf64SHdr::from_bytes(sh)?;
-            let name_offset = strhdr.0.sh_offset.checked_add(u64::from(hdr.0.sh_name))?;
+        shdr_iter.find_map(|sh_bytes| {
+            let sh = S::from_bytes(sh_bytes)?;
+            let name_offset = strhdr.offset().checked_add(u64::from(sh.name()))?;
             let section_name = elf_str(elf, name_offset)?;
 
             if section_name != name {
                 return None;
             }
 
-            let start = usize::try_from(hdr.0.sh_offset).ok()?;
-            let end = usize::try_from(hdr.0.sh_size)
+            let start = usize::try_from(sh.offset()).ok()?;
+            let end = usize::try_from(sh.size())
                 .ok()
-                .and_then(|sh_size| start.checked_add(sh_size))?;
-
+                .and_then(|sz| start.checked_add(sz))?;
             elf.get(start..end)
         })
     }
+
+    /// Extract the section with name `name` from the ELF64 image `elf`.
+    pub(super) fn elf64_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
+        elf_section_generic::<Elf64Hdr, Elf64SHdr>(elf, name)
+    }
 }
-- 
2.53.0



* [PATCH v5 15/38] gpu: nova-core: add support for 32-bit firmware images
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (13 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 14/38] gpu: nova-core: don't assume 64-bit firmware images John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 16/38] gpu: nova-core: add auto-detection of 32-bit, 64-bit " John Hubbard
                   ` (22 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Add ELF32 header and section header newtypes with ElfHeader and
ElfSectionHeader trait implementations, mirroring the existing ELF64
support. Add elf32_section() for extracting sections from ELF32 images.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs | 46 +++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 6ed76a7e15f1..d94dd3468f3c 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -507,6 +507,46 @@ fn size(&self) -> u64 {
         }
     }
 
+    /// Newtype to provide [`FromBytes`] and [`ElfHeader`] implementations for ELF32.
+    #[repr(transparent)]
+    struct Elf32Hdr(bindings::elf32_hdr);
+    // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
+    unsafe impl FromBytes for Elf32Hdr {}
+
+    impl ElfHeader for Elf32Hdr {
+        fn shnum(&self) -> u16 {
+            self.0.e_shnum
+        }
+
+        fn shoff(&self) -> u64 {
+            u64::from(self.0.e_shoff)
+        }
+
+        fn shstrndx(&self) -> u16 {
+            self.0.e_shstrndx
+        }
+    }
+
+    /// Newtype to provide [`FromBytes`] and [`ElfSectionHeader`] implementations for ELF32.
+    #[repr(transparent)]
+    struct Elf32SHdr(bindings::elf32_shdr);
+    // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
+    unsafe impl FromBytes for Elf32SHdr {}
+
+    impl ElfSectionHeader for Elf32SHdr {
+        fn name(&self) -> u32 {
+            self.0.sh_name
+        }
+
+        fn offset(&self) -> u64 {
+            u64::from(self.0.sh_offset)
+        }
+
+        fn size(&self) -> u64 {
+            u64::from(self.0.sh_size)
+        }
+    }
+
     /// Returns a NULL-terminated string from the ELF image at `offset`.
     fn elf_str(elf: &[u8], offset: u64) -> Option<&str> {
         let idx = usize::try_from(offset).ok()?;
@@ -559,4 +599,10 @@ fn elf_section_generic<'a, H, S>(elf: &'a [u8], name: &str) -> Option<&'a [u8]>
     pub(super) fn elf64_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
         elf_section_generic::<Elf64Hdr, Elf64SHdr>(elf, name)
     }
+
+    /// Extract the section with name `name` from the ELF32 image `elf`.
+    #[expect(dead_code)]
+    pub(super) fn elf32_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
+        elf_section_generic::<Elf32Hdr, Elf32SHdr>(elf, name)
+    }
 }
-- 
2.53.0



* [PATCH v5 16/38] gpu: nova-core: add auto-detection of 32-bit, 64-bit firmware images
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (14 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 15/38] gpu: nova-core: add support for 32-bit " John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 17/38] gpu: nova-core: Hopper/Blackwell: add FMC firmware image, in support of FSP John Hubbard
                   ` (21 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Add elf_section() which automatically detects ELF32 vs ELF64 based on
the ELF header's class byte, and dispatches to the appropriate parser.
Switch gsp.rs callers from elf64_section() to elf_section(), making
both elf32_section() and elf64_section() private.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
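As a standalone illustration of the detection step (not part of this
patch), the class check can be sketched in user-space Rust as below. The
ElfClass enum and elf_class() function are hypothetical names; the magic
bytes, the EI_CLASS index of 4, and the class values 1 and 2 come from
the ELF specification.

```rust
/// ELF file class, as encoded in the `EI_CLASS` byte of `e_ident`.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum ElfClass {
    Elf32,
    Elf64,
}

/// The four magic bytes that open every ELF file.
const ELF_MAGIC: &[u8] = b"\x7fELF";
/// Index of the class byte within `e_ident`.
const EI_CLASS: usize = 4;

/// Returns the class of `image`, or `None` if it does not start with the ELF
/// magic, is too short to hold a class byte, or has an unknown class value.
fn elf_class(image: &[u8]) -> Option<ElfClass> {
    if !image.starts_with(ELF_MAGIC) {
        return None;
    }

    match *image.get(EI_CLASS)? {
        1 => Some(ElfClass::Elf32), // ELFCLASS32
        2 => Some(ElfClass::Elf64), // ELFCLASS64
        _ => None,
    }
}
```

A caller would then match on the returned class to pick the 32-bit or
64-bit parser, with None covering both "not an ELF file" and "unknown
class".
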
 drivers/gpu/nova-core/firmware.rs     | 20 +++++++++++++++++---
 drivers/gpu/nova-core/firmware/gsp.rs |  4 ++--
 2 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index d94dd3468f3c..57a919b7e0e8 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -596,13 +596,27 @@ fn elf_section_generic<'a, H, S>(elf: &'a [u8], name: &str) -> Option<&'a [u8]>
     }
 
     /// Extract the section with name `name` from the ELF64 image `elf`.
-    pub(super) fn elf64_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
+    fn elf64_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
         elf_section_generic::<Elf64Hdr, Elf64SHdr>(elf, name)
     }
 
     /// Extract the section with name `name` from the ELF32 image `elf`.
-    #[expect(dead_code)]
-    pub(super) fn elf32_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
+    fn elf32_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
         elf_section_generic::<Elf32Hdr, Elf32SHdr>(elf, name)
     }
+
+    /// Automatically detects ELF32 vs ELF64 based on the ELF header.
+    pub(super) fn elf_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
+        // Check ELF magic.
+        if elf.len() < 5 || elf.get(0..4)? != b"\x7fELF" {
+            return None;
+        }
+
+        // Check ELF class: 1 = 32-bit, 2 = 64-bit.
+        match elf.get(4)? {
+            1 => elf32_section(elf, name),
+            2 => elf64_section(elf, name),
+            _ => None,
+        }
+    }
 }
diff --git a/drivers/gpu/nova-core/firmware/gsp.rs b/drivers/gpu/nova-core/firmware/gsp.rs
index f247deb06633..52e7337c041c 100644
--- a/drivers/gpu/nova-core/firmware/gsp.rs
+++ b/drivers/gpu/nova-core/firmware/gsp.rs
@@ -105,7 +105,7 @@ pub(crate) fn new<'a>(
         pin_init::pin_init_scope(move || {
             let firmware = super::request_firmware(dev, chipset, "gsp", ver)?;
 
-            let fw_section = elf::elf64_section(firmware.data(), ".fwimage").ok_or(EINVAL)?;
+            let fw_section = elf::elf_section(firmware.data(), ".fwimage").ok_or(EINVAL)?;
 
             let size = fw_section.len();
 
@@ -162,7 +162,7 @@ pub(crate) fn new<'a>(
                 signatures: {
                     let sigs_section = Self::find_gsp_sigs_section(chipset).ok_or(ENOTSUPP)?;
 
-                    elf::elf64_section(firmware.data(), sigs_section)
+                    elf::elf_section(firmware.data(), sigs_section)
                         .ok_or(EINVAL)
                         .and_then(|data| DmaObject::from_data(dev, data))?
                 },
-- 
2.53.0



* [PATCH v5 17/38] gpu: nova-core: Hopper/Blackwell: add FMC firmware image, in support of FSP
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (15 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 16/38] gpu: nova-core: add auto-detection of 32-bit, 64-bit " John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 18/38] gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub John Hubbard
                   ` (20 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

FSP is a hardware unit that runs FMC firmware. The FMC ELF file is
loaded and stored in two forms: the "image" ELF section alone (which
FSP uses for boot) and the full ELF (needed later for signature
extraction during Chain of Trust verification).

Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs     |  1 +
 drivers/gpu/nova-core/firmware/fsp.rs | 44 +++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)
 create mode 100644 drivers/gpu/nova-core/firmware/fsp.rs

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 57a919b7e0e8..396f96716d6b 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -28,6 +28,7 @@
 };
 
 pub(crate) mod booter;
+pub(crate) mod fsp;
 pub(crate) mod fwsec;
 pub(crate) mod gsp;
 pub(crate) mod riscv;
diff --git a/drivers/gpu/nova-core/firmware/fsp.rs b/drivers/gpu/nova-core/firmware/fsp.rs
new file mode 100644
index 000000000000..cea9532ba5ff
--- /dev/null
+++ b/drivers/gpu/nova-core/firmware/fsp.rs
@@ -0,0 +1,44 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! FSP is a hardware unit that runs FMC firmware.
+
+use kernel::{
+    device,
+    prelude::*, //
+};
+
+use crate::{
+    dma::DmaObject,
+    firmware::elf,
+    gpu::Chipset, //
+};
+
+#[expect(unused)]
+pub(crate) struct FspFirmware {
+    /// FMC firmware image data (only the "image" ELF section).
+    fmc_image: DmaObject,
+    /// Full FMC ELF data (for signature extraction).
+    fmc_full: DmaObject,
+}
+
+impl FspFirmware {
+    #[expect(unused)]
+    pub(crate) fn new(
+        dev: &device::Device<device::Bound>,
+        chipset: Chipset,
+        ver: &str,
+    ) -> Result<Self> {
+        let fw = super::request_firmware(dev, chipset, "fmc", ver)?;
+
+        // FSP expects only the "image" section, not the entire ELF file.
+        let fmc_image_data = elf::elf_section(fw.data(), "image").ok_or_else(|| {
+            dev_err!(dev, "FMC ELF file missing 'image' section\n");
+            EINVAL
+        })?;
+
+        Ok(Self {
+            fmc_image: DmaObject::from_data(dev, fmc_image_data)?,
+            fmc_full: DmaObject::from_data(dev, fw.data())?,
+        })
+    }
+}
-- 
2.53.0



* [PATCH v5 18/38] gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (16 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 17/38] gpu: nova-core: Hopper/Blackwell: add FMC firmware image, in support of FSP John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 19/38] gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations John Hubbard
                   ` (19 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Add the FSP (Firmware System Processor) falcon engine type that will
handle secure boot and Chain of Trust operations on Hopper and Blackwell
architectures.

The FSP falcon replaces SEC2's role in the boot sequence for these newer
architectures. This initial stub just defines the falcon type and its
base address.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/falcon.rs     |  1 +
 drivers/gpu/nova-core/falcon/fsp.rs | 30 +++++++++++++++++++++++++++++
 2 files changed, 31 insertions(+)
 create mode 100644 drivers/gpu/nova-core/falcon/fsp.rs

diff --git a/drivers/gpu/nova-core/falcon.rs b/drivers/gpu/nova-core/falcon.rs
index 37bfee1d0949..a0cfb4442df1 100644
--- a/drivers/gpu/nova-core/falcon.rs
+++ b/drivers/gpu/nova-core/falcon.rs
@@ -33,6 +33,7 @@
     regs::macros::RegisterBase, //
 };
 
+pub(crate) mod fsp;
 pub(crate) mod gsp;
 mod hal;
 pub(crate) mod sec2;
diff --git a/drivers/gpu/nova-core/falcon/fsp.rs b/drivers/gpu/nova-core/falcon/fsp.rs
new file mode 100644
index 000000000000..c5ba1c2412cd
--- /dev/null
+++ b/drivers/gpu/nova-core/falcon/fsp.rs
@@ -0,0 +1,30 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! FSP (Firmware System Processor) falcon engine for Hopper/Blackwell GPUs.
+//!
+//! The FSP falcon handles secure boot and Chain of Trust operations
+//! on Hopper and Blackwell architectures, replacing SEC2's role.
+
+use crate::{
+    falcon::{
+        FalconEngine,
+        PFalcon2Base,
+        PFalconBase, //
+    },
+    regs::macros::RegisterBase,
+};
+
+/// Type specifying the `Fsp` falcon engine. Cannot be instantiated.
+pub(crate) struct Fsp(());
+
+impl RegisterBase<PFalconBase> for Fsp {
+    const BASE: usize = 0x8f2000;
+}
+
+impl RegisterBase<PFalcon2Base> for Fsp {
+    const BASE: usize = 0x8f3000;
+}
+
+impl FalconEngine for Fsp {
+    const ID: Self = Fsp(());
+}
-- 
2.53.0



* [PATCH v5 19/38] gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (17 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 18/38] gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 20/38] gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure John Hubbard
                   ` (18 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Add external memory (EMEM) read/write operations to the GPU's FSP falcon
engine. These operations use Falcon PIO (Programmed I/O) to communicate
with the FSP through indirect memory access.

Cc: Gary Guo <gary@garyguo.net>
Cc: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
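For readers unfamiliar with this kind of indirect access, here is a
small user-space model of the pattern, not part of this patch. All of it
is hypothetical: MockEmem stands in for the EMEM_CTL/EMEM_DATA register
pair with a plain Vec, EmemError replaces the kernel error codes, and
the real registers may behave differently (auto-increment, for
instance, is not modeled).

```rust
#[derive(Debug, PartialEq)]
enum EmemError {
    Unaligned,
    OutOfRange,
}

/// Hypothetical stand-in for the control/data register pair.
struct MockEmem {
    /// Backing store, one entry per 32-bit word.
    words: Vec<u32>,
    /// Byte offset most recently latched through the "control register".
    ctl_offset: usize,
}

impl MockEmem {
    fn new(size_bytes: usize) -> Self {
        MockEmem { words: vec![0; size_bytes / 4], ctl_offset: 0 }
    }

    /// Models the control-register write: validates and latches a byte offset.
    fn set_ctl(&mut self, offset: usize) -> Result<(), EmemError> {
        if offset % 4 != 0 {
            return Err(EmemError::Unaligned);
        }
        if offset / 4 >= self.words.len() {
            return Err(EmemError::OutOfRange);
        }
        self.ctl_offset = offset;
        Ok(())
    }

    /// Programs the offset, then writes one word through the "data register".
    fn write32(&mut self, value: u32, offset: usize) -> Result<(), EmemError> {
        self.set_ctl(offset)?;
        let idx = self.ctl_offset / 4;
        self.words[idx] = value;
        Ok(())
    }

    /// Programs the offset, then reads one word through the "data register".
    fn read32(&mut self, offset: usize) -> Result<u32, EmemError> {
        self.set_ctl(offset)?;
        Ok(self.words[self.ctl_offset / 4])
    }
}

/// Writes `data` at byte `offset` as little-endian 32-bit words.
fn write_emem(emem: &mut MockEmem, offset: usize, data: &[u8]) -> Result<(), EmemError> {
    if offset % 4 != 0 || data.len() % 4 != 0 {
        return Err(EmemError::Unaligned);
    }
    let mut off = offset;
    for chunk in data.chunks_exact(4) {
        let word = u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
        emem.write32(word, off)?;
        off += 4;
    }
    Ok(())
}

/// Reads from byte `offset` into `data` as little-endian 32-bit words.
fn read_emem(emem: &mut MockEmem, offset: usize, data: &mut [u8]) -> Result<(), EmemError> {
    if offset % 4 != 0 || data.len() % 4 != 0 {
        return Err(EmemError::Unaligned);
    }
    let mut off = offset;
    for chunk in data.chunks_exact_mut(4) {
        chunk.copy_from_slice(&emem.read32(off)?.to_le_bytes());
        off += 4;
    }
    Ok(())
}
```

The model keeps the two-step shape of each access (latch an offset, then
move one word), so the byte-slice helpers only ever deal in whole,
aligned 32-bit words.
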
 drivers/gpu/nova-core/falcon/fsp.rs | 122 +++++++++++++++++++++++++++-
 drivers/gpu/nova-core/regs.rs       |  12 +++
 2 files changed, 133 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/nova-core/falcon/fsp.rs b/drivers/gpu/nova-core/falcon/fsp.rs
index c5ba1c2412cd..4baeee68197b 100644
--- a/drivers/gpu/nova-core/falcon/fsp.rs
+++ b/drivers/gpu/nova-core/falcon/fsp.rs
@@ -5,13 +5,26 @@
 //! The FSP falcon handles secure boot and Chain of Trust operations
 //! on Hopper and Blackwell architectures, replacing SEC2's role.
 
+use kernel::{
+    io::{
+        Io,
+        IoCapable, //
+    },
+    prelude::*, //
+};
+
 use crate::{
+    driver::Bar0,
     falcon::{
+        Falcon,
         FalconEngine,
         PFalcon2Base,
         PFalconBase, //
     },
-    regs::macros::RegisterBase,
+    regs::{
+        self,
+        macros::RegisterBase, //
+    },
 };
 
 /// Type specifying the `Fsp` falcon engine. Cannot be instantiated.
@@ -28,3 +41,110 @@ impl RegisterBase<PFalcon2Base> for Fsp {
 impl FalconEngine for Fsp {
     const ID: Self = Fsp(());
 }
+
+/// Maximum addressable EMEM size, derived from the 24-bit offset field
+/// in NV_PFALCON_FALCON_EMEM_CTL.
+const EMEM_MAX_SIZE: usize = 1 << 24;
+
+/// I/O backend for the FSP falcon's external memory (EMEM).
+///
+/// Each 32-bit access programs a byte offset via the EMEM_CTL register,
+/// then reads or writes through the EMEM_DATA register.
+pub(crate) struct Emem<'a> {
+    bar: &'a Bar0,
+}
+
+impl<'a> Emem<'a> {
+    fn new(bar: &'a Bar0) -> Self {
+        Self { bar }
+    }
+}
+
+impl IoCapable<u32> for Emem<'_> {}
+
+impl Io for Emem<'_> {
+    fn addr(&self) -> usize {
+        0
+    }
+
+    fn maxsize(&self) -> usize {
+        EMEM_MAX_SIZE
+    }
+
+    fn try_read32(&self, offset: usize) -> Result<u32> {
+        // io_addr validates offset < EMEM_MAX_SIZE (2^24), so the u32 cast is safe.
+        let offset = self.io_addr::<u32>(offset)? as u32;
+
+        regs::NV_PFALCON_FALCON_EMEM_CTL::default()
+            .set_rd_mode(true)
+            .set_offset(offset)
+            .write(self.bar, &Fsp::ID);
+
+        Ok(regs::NV_PFALCON_FALCON_EMEM_DATA::read(self.bar, &Fsp::ID).data())
+    }
+
+    fn try_write32(&self, value: u32, offset: usize) -> Result {
+        // io_addr validates offset < EMEM_MAX_SIZE (2^24), so the u32 cast is safe.
+        let offset = self.io_addr::<u32>(offset)? as u32;
+
+        regs::NV_PFALCON_FALCON_EMEM_CTL::default()
+            .set_wr_mode(true)
+            .set_offset(offset)
+            .write(self.bar, &Fsp::ID);
+
+        regs::NV_PFALCON_FALCON_EMEM_DATA::default()
+            .set_data(value)
+            .write(self.bar, &Fsp::ID);
+
+        Ok(())
+    }
+}
+
+impl Falcon<Fsp> {
+    /// Returns an EMEM I/O accessor for this FSP falcon.
+    pub(crate) fn emem<'a>(&self, bar: &'a Bar0) -> Emem<'a> {
+        Emem::new(bar)
+    }
+
+    /// Writes `data` to FSP external memory at byte `offset`.
+    ///
+    /// Data is interpreted as little-endian 32-bit words.
+    /// Returns `EINVAL` if offset or data length is not 4-byte aligned.
+    #[expect(unused)]
+    pub(crate) fn write_emem(&self, bar: &Bar0, offset: u32, data: &[u8]) -> Result {
+        if offset % 4 != 0 || data.len() % 4 != 0 {
+            return Err(EINVAL);
+        }
+
+        let emem = self.emem(bar);
+        let mut off = offset as usize;
+        for chunk in data.chunks_exact(4) {
+            let word = u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
+            emem.try_write32(word, off)?;
+            off += 4;
+        }
+
+        Ok(())
+    }
+
+    /// Reads FSP external memory at byte `offset` into `data`.
+    ///
+    /// Data is stored as little-endian 32-bit words.
+    /// Returns `EINVAL` if offset or data length is not 4-byte aligned.
+    #[expect(unused)]
+    pub(crate) fn read_emem(&self, bar: &Bar0, offset: u32, data: &mut [u8]) -> Result {
+        if offset % 4 != 0 || data.len() % 4 != 0 {
+            return Err(EINVAL);
+        }
+
+        let emem = self.emem(bar);
+        let mut off = offset as usize;
+        for chunk in data.chunks_exact_mut(4) {
+            let word = emem.try_read32(off)?;
+            chunk.copy_from_slice(&word.to_le_bytes());
+            off += 4;
+        }
+
+        Ok(())
+    }
+}
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index ea0d32f5396c..b939ec2d5bec 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -431,6 +431,18 @@ pub(crate) fn reset_engine<E: FalconEngine>(bar: &Bar0) {
     8:8     br_fetch as bool;
 });
 
+// Falcon EMEM PIO registers (used by FSP on Hopper/Blackwell).
+// These provide the falcon external memory communication interface.
+register!(NV_PFALCON_FALCON_EMEM_CTL @ PFalconBase[0x00000ac0] {
+    23:0    offset as u32;      // EMEM byte offset (must be 4-byte aligned)
+    24:24   wr_mode as bool;    // Write mode
+    25:25   rd_mode as bool;    // Read mode
+});
+
+register!(NV_PFALCON_FALCON_EMEM_DATA @ PFalconBase[0x00000ac4] {
+    31:0    data as u32;        // EMEM data register
+});
+
 // The modules below provide registers that are not identical on all supported chips. They should
 // only be used in HAL modules.
 
-- 
2.53.0
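
For readers following the EMEM path without hardware at hand, here is a
minimal userspace sketch of the CTL/DATA indirection and the little-endian
word packing that write_emem()/read_emem() perform. MockEmem and its
Vec<u32> backing store are hypothetical stand-ins for the
NV_PFALCON_FALCON_EMEM_CTL/EMEM_DATA register pair, and errors are reduced
to a unit type. It is not driver code.

```rust
/// Hypothetical stand-in for the EMEM_CTL/EMEM_DATA register pair.
struct MockEmem {
    words: Vec<u32>, // backing store, one entry per 32-bit EMEM word
    ctl_offset: u32, // last byte offset programmed into "EMEM_CTL"
}

impl MockEmem {
    fn new(size_bytes: usize) -> Self {
        Self { words: vec![0; size_bytes / 4], ctl_offset: 0 }
    }

    /// Program the byte offset, then write through the "data register".
    fn write32(&mut self, value: u32, offset: u32) -> Result<(), ()> {
        self.ctl_offset = offset;
        let slot = self.words.get_mut((self.ctl_offset / 4) as usize).ok_or(())?;
        *slot = value;
        Ok(())
    }

    /// Program the byte offset, then read through the "data register".
    fn read32(&mut self, offset: u32) -> Result<u32, ()> {
        self.ctl_offset = offset;
        self.words.get((self.ctl_offset / 4) as usize).copied().ok_or(())
    }
}

/// Same shape as write_emem(): reject unaligned input, pack LE words.
fn write_emem(emem: &mut MockEmem, offset: u32, data: &[u8]) -> Result<(), ()> {
    if offset % 4 != 0 || data.len() % 4 != 0 {
        return Err(());
    }
    let mut off = offset;
    for chunk in data.chunks_exact(4) {
        let word = u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
        emem.write32(word, off)?;
        off += 4;
    }
    Ok(())
}

/// Same shape as read_emem(): reject unaligned input, unpack LE words.
fn read_emem(emem: &mut MockEmem, offset: u32, data: &mut [u8]) -> Result<(), ()> {
    if offset % 4 != 0 || data.len() % 4 != 0 {
        return Err(());
    }
    let mut off = offset;
    for chunk in data.chunks_exact_mut(4) {
        chunk.copy_from_slice(&emem.read32(off)?.to_le_bytes());
        off += 4;
    }
    Ok(())
}
```

The point of the model is that every 32-bit access is two steps (program
the offset, then touch the data register), which is why the patch hides it
behind an Io implementation instead of open-coding it at each call site.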


* [PATCH v5 20/38] gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (18 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 19/38] gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature John Hubbard
                   ` (17 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

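Add the FSP messaging infrastructure needed for Chain of Trust
communication on Hopper/Blackwell GPUs: poll_msgq(), send_msg() and
recv_msg() on Falcon<Fsp>, the NV_PFSP_QUEUE_{HEAD,TAIL} and
NV_PFSP_MSGQ_{HEAD,TAIL} registers, and a helper that reads the FSP
boot completion status from the per-architecture thermal scratch
register.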

Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
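The queue pointer arithmetic in poll_msgq() and send_msg() can be checked
in isolation. The following is a small userspace sketch; msgq_available()
and send_tail_offset() are hypothetical names that only mirror the
arithmetic in this patch, with the register accesses left out.

```rust
/// Bytes available for the given queue pointers, following poll_msgq():
/// head == tail means "empty"; otherwise TAIL points at the last DWORD
/// written, so the size is the pointer distance plus one DWORD.
fn msgq_available(head: u32, tail: u32) -> u32 {
    if head == tail {
        return 0;
    }
    tail.saturating_sub(head) + 4
}

/// TAIL value to program after placing `len` bytes at EMEM offset 0,
/// following send_msg(). Returns None for an empty length or one that
/// is not a multiple of 4.
fn send_tail_offset(len: usize) -> Option<u32> {
    if len == 0 || len % 4 != 0 {
        return None;
    }
    u32::try_from(len - 4).ok()
}
```

Note that under this convention a one-DWORD message at offset 0 gives
HEAD == TAIL == 0, which poll_msgq() reports as empty. That looks harmless
as long as every message carries at least the MCTP and NVDM header words,
which is an assumption here rather than something this patch states.
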
 drivers/gpu/nova-core/falcon/fsp.rs | 79 ++++++++++++++++++++++++++++-
 drivers/gpu/nova-core/regs.rs       | 48 ++++++++++++++++++
 2 files changed, 125 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/nova-core/falcon/fsp.rs b/drivers/gpu/nova-core/falcon/fsp.rs
index 4baeee68197b..d68a75a121f0 100644
--- a/drivers/gpu/nova-core/falcon/fsp.rs
+++ b/drivers/gpu/nova-core/falcon/fsp.rs
@@ -110,7 +110,6 @@ pub(crate) fn emem<'a>(&self, bar: &'a Bar0) -> Emem<'a> {
     ///
     /// Data is interpreted as little-endian 32-bit words.
     /// Returns `EINVAL` if offset or data length is not 4-byte aligned.
-    #[expect(unused)]
     pub(crate) fn write_emem(&self, bar: &Bar0, offset: u32, data: &[u8]) -> Result {
         if offset % 4 != 0 || data.len() % 4 != 0 {
             return Err(EINVAL);
@@ -131,7 +130,6 @@ pub(crate) fn write_emem(&self, bar: &Bar0, offset: u32, data: &[u8]) -> Result
     ///
     /// Data is stored as little-endian 32-bit words.
     /// Returns `EINVAL` if offset or data length is not 4-byte aligned.
-    #[expect(unused)]
     pub(crate) fn read_emem(&self, bar: &Bar0, offset: u32, data: &mut [u8]) -> Result {
         if offset % 4 != 0 || data.len() % 4 != 0 {
             return Err(EINVAL);
@@ -147,4 +145,81 @@ pub(crate) fn read_emem(&self, bar: &Bar0, offset: u32, data: &mut [u8]) -> Resu
 
         Ok(())
     }
+
+    /// Poll FSP for incoming data.
+    ///
+    /// Returns the size of available data in bytes, or 0 if no data is available.
+    ///
+    /// The FSP message queue is not circular - pointers are reset to 0 after each
+    /// message exchange, so `tail >= head` is always true when data is present.
+    #[expect(unused)]
+    pub(crate) fn poll_msgq(&self, bar: &Bar0) -> u32 {
+        let head = regs::NV_PFSP_MSGQ_HEAD::read(bar).address();
+        let tail = regs::NV_PFSP_MSGQ_TAIL::read(bar).address();
+
+        if head == tail {
+            return 0;
+        }
+
+        // TAIL points at last DWORD written, so add 4 to get total size
+        tail.saturating_sub(head) + 4
+    }
+
+    /// Send message to FSP.
+    ///
+    /// Writes a message to FSP EMEM and updates queue pointers to notify FSP.
+    ///
+    /// # Arguments
+    /// * `bar` - BAR0 memory mapping
+    /// * `packet` - Message data (must be 4-byte aligned in length)
+    ///
+    /// # Returns
+    /// `Ok(())` on success, `Err(EINVAL)` if packet is empty or not 4-byte aligned
+    #[expect(unused)]
+    pub(crate) fn send_msg(&self, bar: &Bar0, packet: &[u8]) -> Result {
+        if packet.is_empty() {
+            return Err(EINVAL);
+        }
+
+        // Write message to EMEM at offset 0 (validates 4-byte alignment)
+        self.write_emem(bar, 0, packet)?;
+
+        // Update queue pointers - TAIL points at last DWORD written
+        let tail_offset = u32::try_from(packet.len() - 4).map_err(|_| EINVAL)?;
+        regs::NV_PFSP_QUEUE_TAIL::default()
+            .set_address(tail_offset)
+            .write(bar);
+        regs::NV_PFSP_QUEUE_HEAD::default()
+            .set_address(0)
+            .write(bar);
+
+        Ok(())
+    }
+
+    /// Receive message from FSP.
+    ///
+    /// Reads a message from FSP EMEM and resets queue pointers.
+    ///
+    /// # Arguments
+    /// * `bar` - BAR0 memory mapping
+    /// * `buffer` - Buffer to receive message data
+    /// * `size` - Size of message to read in bytes (from `poll_msgq`)
+    ///
+    /// # Returns
+    /// `Ok(bytes_read)` on success, `Err(EINVAL)` if size is 0, exceeds buffer, or not aligned
+    #[expect(unused)]
+    pub(crate) fn recv_msg(&self, bar: &Bar0, buffer: &mut [u8], size: usize) -> Result<usize> {
+        if size == 0 || size > buffer.len() {
+            return Err(EINVAL);
+        }
+
+        // Read response from EMEM at offset 0 (validates 4-byte alignment)
+        self.read_emem(bar, 0, &mut buffer[..size])?;
+
+        // Reset message queue pointers after reading
+        regs::NV_PFSP_MSGQ_TAIL::default().set_address(0).write(bar);
+        regs::NV_PFSP_MSGQ_HEAD::default().set_address(0).write(bar);
+
+        Ok(size)
+    }
 }
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index b939ec2d5bec..35639ea32e55 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -8,6 +8,7 @@
 pub(crate) mod macros;
 
 use kernel::{
+    io::Io,
     prelude::*,
     time, //
 };
@@ -443,6 +444,53 @@ pub(crate) fn reset_engine<E: FalconEngine>(bar: &Bar0) {
     31:0    data as u32;        // EMEM data register
 });
 
+// FSP (Firmware System Processor) queue registers for Hopper/Blackwell Chain of Trust
+// These registers manage falcon EMEM communication queues
+register!(NV_PFSP_QUEUE_HEAD @ 0x008f2c00 {
+    31:0    address as u32;
+});
+
+register!(NV_PFSP_QUEUE_TAIL @ 0x008f2c04 {
+    31:0    address as u32;
+});
+
+register!(NV_PFSP_MSGQ_HEAD @ 0x008f2c80 {
+    31:0    address as u32;
+});
+
+register!(NV_PFSP_MSGQ_TAIL @ 0x008f2c84 {
+    31:0    address as u32;
+});
+
+// PTHERM registers
+
+/// Returns the address of the register that FSP uses to signal secure boot completion.
+///
+/// This is the NV_THERM_I2CS_SCRATCH register, whose address depends on the architecture:
+/// - Hopper (GH100): 0x000200bc
+/// - Blackwell (GB202): 0x00ad00bc
+pub(crate) fn fsp_thermal_scratch_reg_addr(arch: Architecture) -> Result<usize> {
+    match arch {
+        Architecture::Hopper => Ok(0x000200bc),
+        Architecture::Blackwell => Ok(0x00ad00bc),
+        _ => Err(kernel::error::code::ENOTSUPP),
+    }
+}
+
+/// FSP writes this value to indicate successful boot completion.
+#[expect(unused)]
+pub(crate) const FSP_BOOT_COMPLETE_SUCCESS: u32 = 0xff;
+
+// Helper function to read FSP boot completion status from the correct register
+#[expect(unused)]
+pub(crate) fn read_fsp_boot_complete_status(
+    bar: &crate::driver::Bar0,
+    arch: Architecture,
+) -> Result<u32> {
+    let addr = fsp_thermal_scratch_reg_addr(arch)?;
+    Ok(bar.read32(addr))
+}
+
 // The modules below provide registers that are not identical on all supported chips. They should
 // only be used in HAL modules.
 
-- 
2.53.0


* [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (19 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 20/38] gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21 20:50   ` Miguel Ojeda
                     ` (2 more replies)
  2026-02-21  2:09 ` [PATCH v5 22/38] gpu: nova-core: Hopper/Blackwell: calculate reserved FB heap size John Hubbard
                   ` (16 subsequent siblings)
  37 siblings, 3 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Add const_align_up<ALIGN>() to kernel::ptr as the const-compatible
equivalent of Alignable::align_up(). This uses inline_const to validate
the alignment at compile time with a clear error message.

Add inline_const to rust_allowed_features in scripts/Makefile.build,
following the approach in [1].

[1] https://lore.kernel.org/rust-for-linux/20260206171253.2704684-2-gary@kernel.org/

Suggested-by: Danilo Krummrich <dakr@kernel.org>
Suggested-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
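For anyone who wants to try the helper outside the kernel tree, here is a
standalone sketch of the same function. It assumes a toolchain where
inline const blocks are stable (Rust 1.79 or later); the ALIGNED constant
is a hypothetical name added only to show compile-time evaluation.

```rust
/// Userspace copy of the helper added by this patch.
pub const fn const_align_up<const ALIGN: usize>(value: usize) -> usize {
    // Checked once per instantiation, at compile time.
    const { assert!(ALIGN.is_power_of_two(), "ALIGN must be a power of two") };
    match value.checked_add(ALIGN - 1) {
        Some(v) => v & !(ALIGN - 1),
        None => panic!("const_align_up: overflow"),
    }
}

/// Evaluated entirely at compile time because it is a const item.
const ALIGNED: usize = const_align_up::<16>(0x4f);
```

With this shape, an instantiation such as const_align_up::<3> should be
rejected when the crate is built instead of failing at run time, which is
the reason for using an inline const block over a plain assert!().
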
 rust/kernel/ptr.rs     | 27 +++++++++++++++++++++++++++
 scripts/Makefile.build |  2 +-
 2 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/rust/kernel/ptr.rs b/rust/kernel/ptr.rs
index 5b6a382637fe..b3509caa5ad7 100644
--- a/rust/kernel/ptr.rs
+++ b/rust/kernel/ptr.rs
@@ -225,3 +225,30 @@ fn align_up(self, alignment: Alignment) -> Option<Self> {
 }
 
 impl_alignable_uint!(u8, u16, u32, u64, usize);
+
+/// Aligns `value` up to `ALIGN` at compile time.
+///
+/// This is the const-compatible equivalent of [`Alignable::align_up`].
+/// `ALIGN` must be a power of two (enforced at compile time).
+///
+/// Panics on overflow, which becomes a compile-time error when called in a
+/// const context.
+///
+/// # Examples
+///
+/// ```
+/// use kernel::ptr::const_align_up;
+/// use kernel::sizes::SZ_4K;
+///
+/// assert_eq!(const_align_up::<16>(0x4f), 0x50);
+/// assert_eq!(const_align_up::<16>(0x40), 0x40);
+/// assert_eq!(const_align_up::<SZ_4K>(1), SZ_4K);
+/// ```
+#[inline(always)]
+pub const fn const_align_up<const ALIGN: usize>(value: usize) -> usize {
+    const { assert!(ALIGN.is_power_of_two(), "ALIGN must be a power of two") };
+    match value.checked_add(ALIGN - 1) {
+        Some(v) => v & !(ALIGN - 1),
+        None => panic!("const_align_up: overflow"),
+    }
+}
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 32e209bc7985..a58a7d079710 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -319,7 +319,7 @@ $(obj)/%.lst: $(obj)/%.c FORCE
 #
 # Please see https://github.com/Rust-for-Linux/linux/issues/2 for details on
 # the unstable features in use.
-rust_allowed_features := asm_const,asm_goto,arbitrary_self_types,lint_reasons,offset_of_nested,raw_ref_op,used_with_arg
+rust_allowed_features := asm_const,asm_goto,arbitrary_self_types,inline_const,lint_reasons,offset_of_nested,raw_ref_op,used_with_arg
 
 # `--out-dir` is required to avoid temporaries being created by `rustc` in the
 # current working directory, which may be not accessible in the out-of-tree
-- 
2.53.0


* [PATCH v5 22/38] gpu: nova-core: Hopper/Blackwell: calculate reserved FB heap size
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (20 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 23/38] gpu: nova-core: add MCTP/NVDM protocol types for firmware communication John Hubbard
                   ` (15 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Various "reserved" areas of FB (frame buffer: vidmem) have to be
calculated, because the GSP booting process needs this information.

PMU_RESERVED_SIZE is computed at compile time using const_align_up().
The total reserved size is computed at runtime using Alignable::align_up
because it depends on the heap layout.

Cc: Timur Tabi <ttabi@nvidia.com>
Cc: Gary Guo <gary@garyguo.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
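The resulting PMU_RESERVED_SIZE value can be worked out by hand. Below is
a userspace sketch of the arithmetic; the SZ_* constants are redefined
locally and align_up() is a hypothetical plain const fn standing in for
const_align_up(), so nothing here depends on the kernel crate.

```rust
const SZ_4K: usize = 4 << 10;
const SZ_128K: usize = 128 << 10;
const SZ_8M: usize = 8 << 20;
const SZ_16M: usize = 16 << 20;

/// Plain const align-up for a power-of-two `align`.
const fn align_up(value: usize, align: usize) -> usize {
    assert!(align.is_power_of_two());
    (value + align - 1) & !(align - 1)
}

/// 8M + 16M + 4K = 0x0180_1000, rounded up to the next 128K boundary.
const PMU_RESERVED_SIZE: u32 = align_up(SZ_8M + SZ_16M + SZ_4K, SZ_128K) as u32;
```

The unaligned sum is 0x0180_1000 and the next 128K boundary is
0x0182_0000, so the extra 4K costs a full 128K step once aligned.
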
 drivers/gpu/nova-core/fb.rs     | 7 ++++++-
 drivers/gpu/nova-core/gsp/fw.rs | 6 +++++-
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/nova-core/fb.rs b/drivers/gpu/nova-core/fb.rs
index 6536d0035cb1..0e3519e5ccc0 100644
--- a/drivers/gpu/nova-core/fb.rs
+++ b/drivers/gpu/nova-core/fb.rs
@@ -11,7 +11,8 @@
     prelude::*,
     ptr::{
         Alignable,
-        Alignment, //
+        Alignment,
+        const_align_up, //
     },
     sizes::*,
     sync::aref::ARef, //
@@ -270,3 +271,7 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<
         })
     }
 }
+
+/// PMU reserved size, aligned to 128KB.
+pub(crate) const PMU_RESERVED_SIZE: u32 =
+    const_align_up::<SZ_128K>(SZ_8M + SZ_16M + SZ_4K) as u32;
diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index 83ff91614e36..086153edfa86 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -27,7 +27,10 @@
 };
 
 use crate::{
-    fb::FbLayout,
+    fb::{
+        FbLayout,
+        PMU_RESERVED_SIZE, //
+    },
     firmware::gsp::GspFirmware,
     gpu::Chipset,
     gsp::{
@@ -183,6 +186,7 @@ pub(crate) fn new(gsp_firmware: &GspFirmware, fb_layout: &FbLayout) -> Self {
             fbSize: fb_layout.fb.end - fb_layout.fb.start,
             vgaWorkspaceOffset: fb_layout.vga_workspace.start,
             vgaWorkspaceSize: fb_layout.vga_workspace.end - fb_layout.vga_workspace.start,
+            pmuReservedSize: PMU_RESERVED_SIZE,
             ..Default::default()
         })
     }
-- 
2.53.0


* [PATCH v5 23/38] gpu: nova-core: add MCTP/NVDM protocol types for firmware communication
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (21 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 22/38] gpu: nova-core: Hopper/Blackwell: calculate reserved FB heap size John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 24/38] gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting John Hubbard
                   ` (14 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Add the MCTP (Management Component Transport Protocol) and NVDM (NVIDIA
Device Management) wire-format types used for communication between the
kernel driver and GPU firmware processors.

This includes typed MCTP transport headers, NVDM message headers, and
NVDM message type identifiers. Both the FSP boot path and the upcoming
GSP RPC message queue share this protocol layer.

Cc: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
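To make the two header layouts concrete, here is a userspace sketch that
packs and unpacks the raw words using the bit positions documented in
mctp.rs. The free functions and the NVDM_TYPE_COT constant are
hypothetical simplifications of MctpHeader, NvdmHeader and NvdmType::Cot.

```rust
const MSG_TYPE_VENDOR_PCI: u32 = 0x7e;
const VENDOR_ID_NV: u32 = 0x10de;
const NVDM_TYPE_COT: u32 = 0x14;

/// MCTP word for a single-packet message: SOM (bit 31) and EOM (bit 30) set.
const fn mctp_single_packet() -> u32 {
    (1 << 31) | (1 << 30)
}

/// NVDM word: [6:0] msg_type | [23:8] vendor_id | [31:24] nvdm_type.
const fn nvdm_header(nvdm_type: u32) -> u32 {
    MSG_TYPE_VENDOR_PCI | (VENDOR_ID_NV << 8) | (nvdm_type << 24)
}

/// Split an NVDM word back into (msg_type, vendor_id, nvdm_type).
const fn nvdm_fields(raw: u32) -> (u32, u32, u32) {
    (raw & 0x7f, (raw >> 8) & 0xffff, (raw >> 24) & 0xff)
}
```

With these definitions a Chain of Trust message starts with the two words
0xc000_0000 (MCTP) and 0x1410_de7e (NVDM), which is handy to recognize
when dumping EMEM during bring-up.
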
 drivers/gpu/nova-core/mctp.rs      | 107 +++++++++++++++++++++++++++++
 drivers/gpu/nova-core/nova_core.rs |   1 +
 2 files changed, 108 insertions(+)
 create mode 100644 drivers/gpu/nova-core/mctp.rs

diff --git a/drivers/gpu/nova-core/mctp.rs b/drivers/gpu/nova-core/mctp.rs
new file mode 100644
index 000000000000..0dafc31b230c
--- /dev/null
+++ b/drivers/gpu/nova-core/mctp.rs
@@ -0,0 +1,107 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! MCTP/NVDM protocol types for NVIDIA GPU firmware communication.
+//!
+//! MCTP (Management Component Transport Protocol) carries NVDM (NVIDIA
+//! Device Management) messages between the kernel driver and GPU firmware
+//! processors such as FSP and GSP.
+
+#![expect(dead_code)]
+
+/// NVDM message type identifiers carried over MCTP.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+#[repr(u32)]
+pub(crate) enum NvdmType {
+    /// Chain of Trust boot message.
+    Cot = 0x14,
+    /// FSP command response.
+    FspResponse = 0x15,
+}
+
+/// MCTP transport header for NVIDIA firmware messages.
+///
+/// Bit layout: `[31] SOM | [30] EOM | [29:28] SEQ | [23:16] SEID`.
+#[derive(Debug, Clone, Copy)]
+pub(crate) struct MctpHeader(u32);
+
+impl MctpHeader {
+    const SOM_SHIFT: u32 = 31;
+    const EOM_SHIFT: u32 = 30;
+
+    /// Build a single-packet MCTP header (SOM=1, EOM=1, SEQ=0, SEID=0).
+    pub(crate) const fn single_packet() -> Self {
+        Self((1 << Self::SOM_SHIFT) | (1 << Self::EOM_SHIFT))
+    }
+
+    /// Return the raw packed u32.
+    pub(crate) const fn raw(self) -> u32 {
+        self.0
+    }
+
+    /// Check if this is a complete single-packet message (SOM=1 and EOM=1).
+    pub(crate) const fn is_single_packet(self) -> bool {
+        let som = (self.0 >> Self::SOM_SHIFT) & 1;
+        let eom = (self.0 >> Self::EOM_SHIFT) & 1;
+        som == 1 && eom == 1
+    }
+}
+
+impl From<u32> for MctpHeader {
+    fn from(raw: u32) -> Self {
+        Self(raw)
+    }
+}
+
+/// MCTP message type for PCI vendor-defined messages.
+const MSG_TYPE_VENDOR_PCI: u32 = 0x7e;
+
+/// NVIDIA PCI vendor ID.
+const VENDOR_ID_NV: u32 = 0x10de;
+
+/// NVIDIA Vendor-Defined Message (NVDM) header over MCTP.
+///
+/// Bit layout: `[6:0] msg_type | [23:8] vendor_id | [31:24] nvdm_type`.
+#[derive(Debug, Clone, Copy)]
+pub(crate) struct NvdmHeader(u32);
+
+impl NvdmHeader {
+    const MSG_TYPE_MASK: u32 = 0x7f;
+    const VENDOR_ID_SHIFT: u32 = 8;
+    const VENDOR_ID_MASK: u32 = 0xffff;
+    const TYPE_SHIFT: u32 = 24;
+    const TYPE_MASK: u32 = 0xff;
+
+    /// Build an NVDM header for the given message type.
+    pub(crate) const fn new(nvdm_type: NvdmType) -> Self {
+        Self(
+            MSG_TYPE_VENDOR_PCI
+                | (VENDOR_ID_NV << Self::VENDOR_ID_SHIFT)
+                | ((nvdm_type as u32) << Self::TYPE_SHIFT),
+        )
+    }
+
+    /// Return the raw packed u32.
+    pub(crate) const fn raw(self) -> u32 {
+        self.0
+    }
+
+    /// Extract the NVDM type field as a raw value.
+    pub(crate) const fn nvdm_type_raw(self) -> u32 {
+        (self.0 >> Self::TYPE_SHIFT) & Self::TYPE_MASK
+    }
+
+    /// Validate this header against the expected NVIDIA NVDM format and type.
+    pub(crate) const fn validate(self, expected_type: NvdmType) -> bool {
+        let msg_type = self.0 & Self::MSG_TYPE_MASK;
+        let vendor_id = (self.0 >> Self::VENDOR_ID_SHIFT) & Self::VENDOR_ID_MASK;
+        msg_type == MSG_TYPE_VENDOR_PCI
+            && vendor_id == VENDOR_ID_NV
+            && self.nvdm_type_raw() == expected_type as u32
+    }
+}
+
+impl From<u32> for NvdmHeader {
+    fn from(raw: u32) -> Self {
+        Self(raw)
+    }
+}
diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs
index c1121e7c64c5..7350c2069bcc 100644
--- a/drivers/gpu/nova-core/nova_core.rs
+++ b/drivers/gpu/nova-core/nova_core.rs
@@ -13,6 +13,7 @@
 mod gfw;
 mod gpu;
 mod gsp;
+mod mctp;
 mod num;
 mod regs;
 mod sbuffer;
-- 
2.53.0


* [PATCH v5 24/38] gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (22 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 23/38] gpu: nova-core: add MCTP/NVDM protocol types for firmware communication John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 25/38] gpu: nova-core: Hopper/Blackwell: add FSP message structures John Hubbard
                   ` (13 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Add the FSP (Firmware System Processor) module for Hopper/Blackwell GPUs.
These architectures use a simplified firmware boot sequence:

    FMC --> FSP --> GSP, with no SEC2 involvement.

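Add Fsp::wait_secure_boot(), which polls the I2CS thermal scratch
register every 10 ms until FSP signals success, giving up with
ETIMEDOUT after 4 seconds. Also add the GSP FMC boot parameter
structures, and make read_fsp_boot_complete_status() return
Option<u32> so that callers can tell when an architecture has no FSP.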

Cc: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
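The polling pattern used by wait_secure_boot() can be sketched in
userspace as follows. This version uses std::time and a closure in place
of the register read; it only approximates the kernel's
read_poll_timeout(), whose exact sleep and final-read behavior is not
reproduced here.

```rust
use std::time::{Duration, Instant};

const FSP_BOOT_COMPLETE_SUCCESS: u32 = 0xff;

/// Poll `read_status` until it returns the success value or `timeout`
/// elapses, sleeping `interval` between reads.
fn wait_secure_boot<F: FnMut() -> u32>(
    mut read_status: F,
    interval: Duration,
    timeout: Duration,
) -> Result<(), &'static str> {
    let deadline = Instant::now() + timeout;
    loop {
        if read_status() == FSP_BOOT_COMPLETE_SUCCESS {
            return Ok(());
        }
        if Instant::now() >= deadline {
            return Err("FSP secure boot completion timeout");
        }
        std::thread::sleep(interval);
    }
}
```

In the patch the interval is 10 ms and the timeout is
FSP_SECURE_BOOT_TIMEOUT_MS (4000 ms), so a boot that never completes
costs roughly 400 register reads before the error is reported.
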
 drivers/gpu/nova-core/fsp.rs       | 141 +++++++++++++++++++++++++++++
 drivers/gpu/nova-core/nova_core.rs |   1 +
 drivers/gpu/nova-core/regs.rs      |  12 +--
 3 files changed, 148 insertions(+), 6 deletions(-)
 create mode 100644 drivers/gpu/nova-core/fsp.rs

diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
new file mode 100644
index 000000000000..d464ad325881
--- /dev/null
+++ b/drivers/gpu/nova-core/fsp.rs
@@ -0,0 +1,141 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! FSP (Firmware System Processor) interface for Hopper/Blackwell GPUs.
+//!
+//! Hopper/Blackwell use a simplified firmware boot sequence: FMC --> FSP --> GSP.
+//! Unlike Turing/Ampere/Ada, there is NO SEC2 (Security Engine 2) usage.
+//! FSP handles secure boot directly using FMC firmware + Chain of Trust.
+
+use kernel::{
+    device,
+    io::poll::read_poll_timeout,
+    prelude::*,
+    time::Delta,
+    transmute::{
+        AsBytes,
+        FromBytes, //
+    },
+};
+
+use crate::regs;
+
+/// FSP secure boot completion timeout in milliseconds.
+const FSP_SECURE_BOOT_TIMEOUT_MS: i64 = 4000;
+
+/// GSP FMC initialization parameters.
+#[repr(C)]
+#[derive(Debug, Clone, Copy, Default)]
+struct GspFmcInitParams {
+    /// CC initialization "registry keys".
+    regkeys: u32,
+}
+
+// SAFETY: GspFmcInitParams is a simple C struct with only primitive types.
+unsafe impl AsBytes for GspFmcInitParams {}
+// SAFETY: All bit patterns are valid for the primitive fields.
+unsafe impl FromBytes for GspFmcInitParams {}
+
+/// GSP ACR (Authenticated Code RAM) boot parameters.
+#[repr(C)]
+#[derive(Debug, Clone, Copy, Default)]
+struct GspAcrBootGspRmParams {
+    /// Physical memory aperture through which gspRmDescPa is accessed.
+    target: u32,
+    /// Size in bytes of the GSP-RM descriptor structure.
+    gsp_rm_desc_size: u32,
+    /// Physical offset in the target aperture of the GSP-RM descriptor structure.
+    gsp_rm_desc_offset: u64,
+    /// Physical offset in FB to set the start of the WPR containing GSP-RM.
+    wpr_carveout_offset: u64,
+    /// Size in bytes of the WPR containing GSP-RM.
+    wpr_carveout_size: u32,
+    /// Whether to boot GSP-RM or GSP-Proxy through ACR.
+    b_is_gsp_rm_boot: u32,
+}
+
+// SAFETY: GspAcrBootGspRmParams is a simple C struct with only primitive types.
+unsafe impl AsBytes for GspAcrBootGspRmParams {}
+// SAFETY: All bit patterns are valid for the primitive fields.
+unsafe impl FromBytes for GspAcrBootGspRmParams {}
+
+/// GSP RM boot parameters.
+#[repr(C)]
+#[derive(Debug, Clone, Copy, Default)]
+struct GspRmParams {
+    /// Physical memory aperture through which bootArgsOffset is accessed.
+    target: u32,
+    /// Physical offset in the memory aperture that will be passed to GSP-RM.
+    boot_args_offset: u64,
+}
+
+// SAFETY: GspRmParams is a simple C struct with only primitive types.
+unsafe impl AsBytes for GspRmParams {}
+// SAFETY: All bit patterns are valid for the primitive fields.
+unsafe impl FromBytes for GspRmParams {}
+
+/// GSP SPDM (Security Protocol and Data Model) parameters.
+#[repr(C)]
+#[derive(Debug, Clone, Copy, Default)]
+struct GspSpdmParams {
+    /// Physical memory aperture through which all addresses are accessed.
+    target: u32,
+    /// Physical offset in the memory aperture where SPDM payload buffer is stored.
+    payload_buffer_offset: u64,
+    /// Size of the above payload buffer.
+    payload_buffer_size: u32,
+}
+
+// SAFETY: GspSpdmParams is a simple C struct with only primitive types.
+unsafe impl AsBytes for GspSpdmParams {}
+// SAFETY: All bit patterns are valid for the primitive fields.
+unsafe impl FromBytes for GspSpdmParams {}
+
+/// Complete GSP FMC boot parameters passed to FSP.
+#[repr(C)]
+#[derive(Debug, Clone, Copy, Default)]
+pub(crate) struct GspFmcBootParams {
+    init_params: GspFmcInitParams,
+    boot_gsp_rm_params: GspAcrBootGspRmParams,
+    gsp_rm_params: GspRmParams,
+    gsp_spdm_params: GspSpdmParams,
+}
+
+// SAFETY: GspFmcBootParams is composed of C structs with only primitive types.
+unsafe impl AsBytes for GspFmcBootParams {}
+// SAFETY: All bit patterns are valid for the primitive fields.
+unsafe impl FromBytes for GspFmcBootParams {}
+
+/// FSP interface for Hopper/Blackwell GPUs.
+pub(crate) struct Fsp;
+
+impl Fsp {
+    /// Wait for FSP secure boot completion.
+    ///
+    /// Polls the thermal scratch register until FSP signals boot completion
+    /// or timeout occurs.
+    #[expect(dead_code)]
+    pub(crate) fn wait_secure_boot(
+        dev: &device::Device<device::Bound>,
+        bar: &crate::driver::Bar0,
+        arch: crate::gpu::Architecture,
+    ) -> Result {
+        debug_assert!(
+            regs::read_fsp_boot_complete_status(bar, arch).is_some(),
+            "wait_secure_boot called on non-FSP architecture"
+        );
+
+        let timeout = Delta::from_millis(FSP_SECURE_BOOT_TIMEOUT_MS);
+
+        read_poll_timeout(
+            || regs::read_fsp_boot_complete_status(bar, arch).ok_or(ENOTSUPP),
+            |&status| status == regs::FSP_BOOT_COMPLETE_SUCCESS,
+            Delta::from_millis(10),
+            timeout,
+        )
+        .map_err(|_| {
+            dev_err!(dev, "FSP secure boot completion timeout\n");
+            ETIMEDOUT
+        })
+        .map(|_| ())
+    }
+}
diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs
index 7350c2069bcc..3b2109ebe9b6 100644
--- a/drivers/gpu/nova-core/nova_core.rs
+++ b/drivers/gpu/nova-core/nova_core.rs
@@ -10,6 +10,7 @@
 mod falcon;
 mod fb;
 mod firmware;
+mod fsp;
 mod gfw;
 mod gpu;
 mod gsp;
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index 35639ea32e55..77d590887ee7 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -478,17 +478,17 @@ pub(crate) fn fsp_thermal_scratch_reg_addr(arch: Architecture) -> Result<usize>
 }
 
 /// FSP writes this value to indicate successful boot completion.
-#[expect(unused)]
 pub(crate) const FSP_BOOT_COMPLETE_SUCCESS: u32 = 0xff;
 
-// Helper function to read FSP boot completion status from the correct register
-#[expect(unused)]
+/// Read FSP boot completion status from the architecture-specific thermal scratch register.
+///
+/// Returns `None` if the architecture does not have an FSP.
 pub(crate) fn read_fsp_boot_complete_status(
     bar: &crate::driver::Bar0,
     arch: Architecture,
-) -> Result<u32> {
-    let addr = fsp_thermal_scratch_reg_addr(arch)?;
-    Ok(bar.read32(addr))
+) -> Option<u32> {
+    let addr = fsp_thermal_scratch_reg_addr(arch).ok()?;
+    Some(bar.read32(addr))
 }
 
 // The modules below provide registers that are not identical on all supported chips. They should
-- 
2.53.0


* [PATCH v5 25/38] gpu: nova-core: Hopper/Blackwell: add FSP message structures
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (23 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 24/38] gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 26/38] gpu: nova-core: Hopper/Blackwell: add FMC signature extraction John Hubbard
                   ` (12 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Add the NVDM COT payload, FSP message, and FSP response structures
needed for FSP Chain of Trust communication. Also add FmcSignatures
to hold the hash, public key, and signature extracted from FMC firmware.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs |  5 +-
 drivers/gpu/nova-core/fsp.rs      | 78 +++++++++++++++++++++++++++++++
 2 files changed, 82 insertions(+), 1 deletion(-)
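
Not part of the patch: a standalone userspace sketch of the wire layout
that the new structures imply. The field lists and size constants are
copied from the diff below; the wire_sizes() helper is hypothetical and
exists only to make the byte counts checkable outside the kernel.

```rust
use core::mem::size_of;

// Sizes copied from the patch.
const FSP_HASH_SIZE: usize = 48; // SHA-384 hash
const FSP_PKEY_SIZE: usize = 384; // RSA-3072 public key
const FSP_SIG_SIZE: usize = 384; // RSA-3072 signature

#[repr(C, packed)]
#[derive(Clone, Copy)]
pub struct NvdmPayloadCommandResponse {
    pub task_id: u32,
    pub command_nvdm_type: u32,
    pub error_code: u32,
}

#[repr(C, packed)]
#[derive(Clone, Copy)]
pub struct NvdmPayloadCot {
    pub version: u16,
    pub size: u16,
    pub gsp_fmc_sysmem_offset: u64,
    pub frts_sysmem_offset: u64,
    pub frts_sysmem_size: u32,
    pub frts_vidmem_offset: u64,
    pub frts_vidmem_size: u32,
    pub hash384: [u8; FSP_HASH_SIZE],
    pub public_key: [u8; FSP_PKEY_SIZE],
    pub signature: [u8; FSP_SIG_SIZE],
    pub gsp_boot_args_sysmem_offset: u64,
}

#[repr(C, packed)]
#[derive(Clone, Copy)]
pub struct FspMessage {
    pub mctp_header: u32,
    pub nvdm_header: u32,
    pub cot: NvdmPayloadCot,
}

#[repr(C, packed)]
#[derive(Clone, Copy)]
pub struct FspResponse {
    pub mctp_header: u32,
    pub nvdm_header: u32,
    pub response: NvdmPayloadCommandResponse,
}

/// Hypothetical helper: byte sizes of (COT payload, full message, full response).
pub fn wire_sizes() -> (usize, usize, usize) {
    (
        size_of::<NvdmPayloadCot>(),
        size_of::<FspMessage>(),
        size_of::<FspResponse>(),
    )
}
```

With packed layout the COT payload is 860 bytes, the full message 868
and the response 20. Without `packed`, `repr(C)` would insert 4 bytes of
padding before gsp_fmc_sysmem_offset to align the u64, so the attribute
matters for the on-wire format.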

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 396f96716d6b..823d2232081e 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -27,6 +27,9 @@
     },
 };
 
+#[expect(unused)]
+pub(crate) use elf::elf_section;
+
 pub(crate) mod booter;
 pub(crate) mod fsp;
 pub(crate) mod fwsec;
@@ -607,7 +610,7 @@ fn elf32_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
     }
 
     /// Automatically detects ELF32 vs ELF64 based on the ELF header.
-    pub(super) fn elf_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
+    pub(crate) fn elf_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
         // Check ELF magic.
         if elf.len() < 5 || elf.get(0..4)? != b"\x7fELF" {
             return None;
diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
index d464ad325881..15731d24d0c5 100644
--- a/drivers/gpu/nova-core/fsp.rs
+++ b/drivers/gpu/nova-core/fsp.rs
@@ -105,6 +105,84 @@ unsafe impl AsBytes for GspFmcBootParams {}
 // SAFETY: All bit patterns are valid for the primitive fields.
 unsafe impl FromBytes for GspFmcBootParams {}
 
+/// Size constraints for FSP security signatures (Hopper/Blackwell).
+const FSP_HASH_SIZE: usize = 48; // SHA-384 hash
+const FSP_PKEY_SIZE: usize = 384; // RSA-3072 public key
+const FSP_SIG_SIZE: usize = 384; // RSA-3072 signature
+
+/// Structure to hold FMC signatures.
+#[derive(Debug, Clone, Copy)]
+#[expect(dead_code)]
+pub(crate) struct FmcSignatures {
+    hash384: [u8; FSP_HASH_SIZE],
+    public_key: [u8; FSP_PKEY_SIZE],
+    signature: [u8; FSP_SIG_SIZE],
+}
+
+impl Default for FmcSignatures {
+    fn default() -> Self {
+        Self {
+            hash384: [0u8; FSP_HASH_SIZE],
+            public_key: [0u8; FSP_PKEY_SIZE],
+            signature: [0u8; FSP_SIG_SIZE],
+        }
+    }
+}
+
+/// FSP Command Response payload structure.
+/// NVDM_PAYLOAD_COMMAND_RESPONSE structure.
+#[repr(C, packed)]
+#[derive(Clone, Copy)]
+struct NvdmPayloadCommandResponse {
+    task_id: u32,
+    command_nvdm_type: u32,
+    error_code: u32,
+}
+
+/// NVDM (NVIDIA Device Management) COT (Chain of Trust) payload structure.
+/// This is the main message payload sent to FSP for Chain of Trust.
+#[repr(C, packed)]
+#[derive(Clone, Copy)]
+struct NvdmPayloadCot {
+    version: u16,
+    size: u16,
+    gsp_fmc_sysmem_offset: u64,
+    frts_sysmem_offset: u64,
+    frts_sysmem_size: u32,
+    frts_vidmem_offset: u64,
+    frts_vidmem_size: u32,
+    hash384: [u8; FSP_HASH_SIZE],
+    public_key: [u8; FSP_PKEY_SIZE],
+    signature: [u8; FSP_SIG_SIZE],
+    gsp_boot_args_sysmem_offset: u64,
+}
+
+/// Complete FSP message structure with MCTP and NVDM headers.
+#[repr(C, packed)]
+#[derive(Clone, Copy)]
+#[expect(dead_code)]
+struct FspMessage {
+    mctp_header: u32,
+    nvdm_header: u32,
+    cot: NvdmPayloadCot,
+}
+
+// SAFETY: FspMessage is a packed C struct with only integral fields.
+unsafe impl AsBytes for FspMessage {}
+
+/// Complete FSP response structure with MCTP and NVDM headers.
+#[repr(C, packed)]
+#[derive(Clone, Copy)]
+#[expect(dead_code)]
+struct FspResponse {
+    mctp_header: u32,
+    nvdm_header: u32,
+    response: NvdmPayloadCommandResponse,
+}
+
+// SAFETY: FspResponse is a packed C struct with only integral fields.
+unsafe impl FromBytes for FspResponse {}
+
 /// FSP interface for Hopper/Blackwell GPUs.
 pub(crate) struct Fsp;
 
-- 
2.53.0



* [PATCH v5 26/38] gpu: nova-core: Hopper/Blackwell: add FMC signature extraction
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (24 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 25/38] gpu: nova-core: Hopper/Blackwell: add FSP message structures John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 27/38] gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging John Hubbard
                   ` (11 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Add extract_fmc_signatures() which extracts SHA-384 hash, RSA public
key, and RSA signature from FMC ELF32 firmware sections. These are
needed for FSP Chain of Trust verification.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs |  1 -
 drivers/gpu/nova-core/fsp.rs      | 61 ++++++++++++++++++++++++++++++-
 2 files changed, 60 insertions(+), 2 deletions(-)
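
Not part of the patch: a userspace sketch of the size policy applied to
the three sections, assuming the section bytes have already been located
in the ELF. The hash must be exactly 48 bytes, while the public key and
signature may be shorter than 384 bytes and are zero-padded at the end.
pack_signatures() and SigError are hypothetical names for illustration;
the driver returns EINVAL and uses KBox rather than Box.

```rust
const FSP_HASH_SIZE: usize = 48;
const FSP_PKEY_SIZE: usize = 384;
const FSP_SIG_SIZE: usize = 384;

#[derive(Clone, Copy)]
pub struct FmcSignatures {
    pub hash384: [u8; FSP_HASH_SIZE],
    pub public_key: [u8; FSP_PKEY_SIZE],
    pub signature: [u8; FSP_SIG_SIZE],
}

/// Hypothetical error type standing in for the driver's EINVAL.
#[derive(Debug, PartialEq, Eq)]
pub enum SigError {
    BadHashSize(usize),
    KeyTooLarge(usize),
    SigTooLarge(usize),
}

/// Validate section sizes, then copy them into a heap-allocated structure.
pub fn pack_signatures(
    hash: &[u8],
    pkey: &[u8],
    sig: &[u8],
) -> Result<Box<FmcSignatures>, SigError> {
    // The hash must match exactly; copy_from_slice would panic otherwise.
    if hash.len() != FSP_HASH_SIZE {
        return Err(SigError::BadHashSize(hash.len()));
    }
    // Key and signature only have an upper bound.
    if pkey.len() > FSP_PKEY_SIZE {
        return Err(SigError::KeyTooLarge(pkey.len()));
    }
    if sig.len() > FSP_SIG_SIZE {
        return Err(SigError::SigTooLarge(sig.len()));
    }

    let mut out = Box::new(FmcSignatures {
        hash384: [0; FSP_HASH_SIZE],
        public_key: [0; FSP_PKEY_SIZE],
        signature: [0; FSP_SIG_SIZE],
    });

    out.hash384.copy_from_slice(hash);
    // Shorter inputs fill the front of the array; the tail stays zero.
    out.public_key[..pkey.len()].copy_from_slice(pkey);
    out.signature[..sig.len()].copy_from_slice(sig);

    Ok(out)
}
```

Doing all three checks before the allocation means a malformed firmware
image is rejected without touching the allocator.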

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 823d2232081e..eaced3d42728 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -27,7 +27,6 @@
     },
 };
 
-#[expect(unused)]
 pub(crate) use elf::elf_section;
 
 pub(crate) mod booter;
diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
index 15731d24d0c5..29707578d4d4 100644
--- a/drivers/gpu/nova-core/fsp.rs
+++ b/drivers/gpu/nova-core/fsp.rs
@@ -112,7 +112,6 @@ unsafe impl FromBytes for GspFmcBootParams {}
 
 /// Structure to hold FMC signatures.
 #[derive(Debug, Clone, Copy)]
-#[expect(dead_code)]
 pub(crate) struct FmcSignatures {
     hash384: [u8; FSP_HASH_SIZE],
     public_key: [u8; FSP_PKEY_SIZE],
@@ -216,4 +215,64 @@ pub(crate) fn wait_secure_boot(
         })
         .map(|_| ())
     }
+
+    /// Extract FMC firmware signatures for Chain of Trust verification.
+    ///
+    /// Extracts real cryptographic signatures from FMC ELF32 firmware sections.
+    /// Returns signatures in a heap-allocated structure to prevent stack overflow.
+    #[expect(dead_code)]
+    pub(crate) fn extract_fmc_signatures(
+        dev: &device::Device<device::Bound>,
+        fmc_fw_data: &[u8],
+    ) -> Result<KBox<FmcSignatures>> {
+        let hash_section = crate::firmware::elf_section(fmc_fw_data, "hash")
+            .ok_or(EINVAL)
+            .inspect_err(|_| dev_err!(dev, "FMC firmware missing 'hash' section\n"))?;
+
+        let pkey_section = crate::firmware::elf_section(fmc_fw_data, "publickey")
+            .ok_or(EINVAL)
+            .inspect_err(|_| dev_err!(dev, "FMC firmware missing 'publickey' section\n"))?;
+
+        let sig_section = crate::firmware::elf_section(fmc_fw_data, "signature")
+            .ok_or(EINVAL)
+            .inspect_err(|_| dev_err!(dev, "FMC firmware missing 'signature' section\n"))?;
+
+        if hash_section.len() != FSP_HASH_SIZE {
+            dev_err!(
+                dev,
+                "FMC hash section size {} != expected {}\n",
+                hash_section.len(),
+                FSP_HASH_SIZE
+            );
+            return Err(EINVAL);
+        }
+
+        if pkey_section.len() > FSP_PKEY_SIZE {
+            dev_err!(
+                dev,
+                "FMC publickey section size {} > maximum {}\n",
+                pkey_section.len(),
+                FSP_PKEY_SIZE
+            );
+            return Err(EINVAL);
+        }
+
+        if sig_section.len() > FSP_SIG_SIZE {
+            dev_err!(
+                dev,
+                "FMC signature section size {} > maximum {}\n",
+                sig_section.len(),
+                FSP_SIG_SIZE
+            );
+            return Err(EINVAL);
+        }
+
+        let mut signatures = KBox::new(FmcSignatures::default(), GFP_KERNEL)?;
+
+        signatures.hash384.copy_from_slice(hash_section);
+        signatures.public_key[..pkey_section.len()].copy_from_slice(pkey_section);
+        signatures.signature[..sig_section.len()].copy_from_slice(sig_section);
+
+        Ok(signatures)
+    }
 }
-- 
2.53.0



* [PATCH v5 27/38] gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (25 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 26/38] gpu: nova-core: Hopper/Blackwell: add FMC signature extraction John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 28/38] gpu: nova-core: Hopper/Blackwell: add FspCotVersion type John Hubbard
                   ` (10 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Add send_sync_fsp() which sends an MCTP/NVDM message to FSP and waits
for the response. Response validation uses the typed MctpHeader and
NvdmHeader wrappers from the mctp module added earlier in this series.

A MessageToFsp trait provides the NVDM type constant for each message
struct, so send_sync_fsp() can verify that the response matches the
request.

Cc: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/falcon/fsp.rs |   3 -
 drivers/gpu/nova-core/fsp.rs        | 107 +++++++++++++++++++++++++++-
 2 files changed, 105 insertions(+), 5 deletions(-)
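
Not part of the patch: a userspace sketch of the payload checks that
send_sync_fsp() performs on the reply, in the same order as the diff
below (size, then command type, then error code). It assumes a
little-endian host, treats the MCTP and NVDM header words as opaque
because their bit layout lives in the mctp module and is not shown
here, and uses hypothetical names (check_reply, ReplyError).

```rust
/// MCTP header + NVDM header + three u32 payload words.
const FSP_RESPONSE_SIZE: usize = 20;

/// Hypothetical error type standing in for the driver's EIO.
#[derive(Debug, PartialEq, Eq)]
pub enum ReplyError {
    TooSmall(usize),
    WrongCommand { expected: u32, got: u32 },
    Failed(u32),
}

/// Read the `index`-th little-endian u32 from `buf`.
fn word(buf: &[u8], index: usize) -> u32 {
    let start = index * 4;
    u32::from_le_bytes([buf[start], buf[start + 1], buf[start + 2], buf[start + 3]])
}

/// Check that `buf` is a successful reply to a message of `expected_nvdm_type`.
pub fn check_reply(buf: &[u8], expected_nvdm_type: u32) -> Result<(), ReplyError> {
    if buf.len() < FSP_RESPONSE_SIZE {
        return Err(ReplyError::TooSmall(buf.len()));
    }

    // Words 0 and 1 are the MCTP and NVDM headers (not validated in this
    // sketch); word 2 is task_id.
    let command_nvdm_type = word(buf, 3);
    let error_code = word(buf, 4);

    // The reply must echo the NVDM type of the request it answers.
    if command_nvdm_type != expected_nvdm_type {
        return Err(ReplyError::WrongCommand {
            expected: expected_nvdm_type,
            got: command_nvdm_type,
        });
    }

    if error_code != 0 {
        return Err(ReplyError::Failed(error_code));
    }

    Ok(())
}
```

In the driver, the expected type comes from the MessageToFsp::NVDM_TYPE
associated constant, so each message struct carries its own value and
the caller cannot pass a mismatched one.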

diff --git a/drivers/gpu/nova-core/falcon/fsp.rs b/drivers/gpu/nova-core/falcon/fsp.rs
index d68a75a121f0..b5a0a2631ec7 100644
--- a/drivers/gpu/nova-core/falcon/fsp.rs
+++ b/drivers/gpu/nova-core/falcon/fsp.rs
@@ -152,7 +152,6 @@ pub(crate) fn read_emem(&self, bar: &Bar0, offset: u32, data: &mut [u8]) -> Resu
     ///
     /// The FSP message queue is not circular - pointers are reset to 0 after each
     /// message exchange, so `tail >= head` is always true when data is present.
-    #[expect(unused)]
     pub(crate) fn poll_msgq(&self, bar: &Bar0) -> u32 {
         let head = regs::NV_PFSP_MSGQ_HEAD::read(bar).address();
         let tail = regs::NV_PFSP_MSGQ_TAIL::read(bar).address();
@@ -175,7 +174,6 @@ pub(crate) fn poll_msgq(&self, bar: &Bar0) -> u32 {
     ///
     /// # Returns
     /// `Ok(())` on success, `Err(EINVAL)` if packet is empty or not 4-byte aligned
-    #[expect(unused)]
     pub(crate) fn send_msg(&self, bar: &Bar0, packet: &[u8]) -> Result {
         if packet.is_empty() {
             return Err(EINVAL);
@@ -207,7 +205,6 @@ pub(crate) fn send_msg(&self, bar: &Bar0, packet: &[u8]) -> Result {
     ///
     /// # Returns
     /// `Ok(bytes_read)` on success, `Err(EINVAL)` if size is 0, exceeds buffer, or not aligned
-    #[expect(unused)]
     pub(crate) fn recv_msg(&self, bar: &Bar0, buffer: &mut [u8], size: usize) -> Result<usize> {
         if size == 0 || size > buffer.len() {
             return Err(EINVAL);
diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
index 29707578d4d4..20c439fc7f7b 100644
--- a/drivers/gpu/nova-core/fsp.rs
+++ b/drivers/gpu/nova-core/fsp.rs
@@ -19,6 +19,15 @@
 
 use crate::regs;
 
+use crate::mctp::{
+    MctpHeader,
+    NvdmHeader,
+    NvdmType, //
+};
+
+/// FSP message timeout in milliseconds.
+const FSP_MSG_TIMEOUT_MS: i64 = 2000;
+
 /// FSP secure boot completion timeout in milliseconds.
 const FSP_SECURE_BOOT_TIMEOUT_MS: i64 = 4000;
 
@@ -159,7 +168,6 @@ struct NvdmPayloadCot {
 /// Complete FSP message structure with MCTP and NVDM headers.
 #[repr(C, packed)]
 #[derive(Clone, Copy)]
-#[expect(dead_code)]
 struct FspMessage {
     mctp_header: u32,
     nvdm_header: u32,
@@ -172,7 +180,6 @@ unsafe impl AsBytes for FspMessage {}
 /// Complete FSP response structure with MCTP and NVDM headers.
 #[repr(C, packed)]
 #[derive(Clone, Copy)]
-#[expect(dead_code)]
 struct FspResponse {
     mctp_header: u32,
     nvdm_header: u32,
@@ -182,6 +189,19 @@ struct FspResponse {
 // SAFETY: FspResponse is a packed C struct with only integral fields.
 unsafe impl FromBytes for FspResponse {}
 
+/// Trait implemented by types representing a message to send to FSP.
+///
+/// This provides [`Fsp::send_sync_fsp`] with the information it needs to send
+/// a given message, following the same pattern as GSP's `CommandToGsp`.
+pub(crate) trait MessageToFsp: AsBytes {
+    /// NVDM type identifying this message to FSP.
+    const NVDM_TYPE: u32;
+}
+
+impl MessageToFsp for FspMessage {
+    const NVDM_TYPE: u32 = NvdmType::Cot as u32;
+}
+
 /// FSP interface for Hopper/Blackwell GPUs.
 pub(crate) struct Fsp;
 
@@ -275,4 +295,87 @@ pub(crate) fn extract_fmc_signatures(
 
         Ok(signatures)
     }
+
+    /// Send message to FSP and wait for response.
+    #[expect(dead_code)]
+    fn send_sync_fsp<M>(
+        dev: &device::Device<device::Bound>,
+        bar: &crate::driver::Bar0,
+        fsp_falcon: &crate::falcon::Falcon<crate::falcon::fsp::Fsp>,
+        msg: &M,
+    ) -> Result
+    where
+        M: MessageToFsp,
+    {
+        fsp_falcon.send_msg(bar, msg.as_bytes())?;
+
+        let timeout = Delta::from_millis(FSP_MSG_TIMEOUT_MS);
+        let packet_size = read_poll_timeout(
+            || Ok(fsp_falcon.poll_msgq(bar)),
+            |&size| size > 0,
+            Delta::from_millis(10),
+            timeout,
+        )
+        .map_err(|_| {
+            dev_err!(dev, "FSP response timeout\n");
+            ETIMEDOUT
+        })?;
+
+        let packet_size = packet_size as usize;
+        let mut response_buf = KVec::<u8>::new();
+        response_buf.resize(packet_size, 0, GFP_KERNEL)?;
+        fsp_falcon.recv_msg(bar, &mut response_buf, packet_size)?;
+
+        if response_buf.len() < core::mem::size_of::<FspResponse>() {
+            dev_err!(dev, "FSP response too small: {}\n", response_buf.len());
+            return Err(EIO);
+        }
+
+        let response = FspResponse::from_bytes(&response_buf[..]).ok_or(EIO)?;
+
+        let mctp_header: MctpHeader = response.mctp_header.into();
+        let nvdm_header: NvdmHeader = response.nvdm_header.into();
+        let command_nvdm_type = response.response.command_nvdm_type;
+        let error_code = response.response.error_code;
+
+        if !mctp_header.is_single_packet() {
+            dev_err!(
+                dev,
+                "Unexpected MCTP header in FSP reply: {:#x}\n",
+                mctp_header.raw()
+            );
+            return Err(EIO);
+        }
+
+        if !nvdm_header.validate(NvdmType::FspResponse) {
+            dev_err!(
+                dev,
+                "Unexpected NVDM header in FSP reply: {:#x}\n",
+                nvdm_header.raw()
+            );
+            return Err(EIO);
+        }
+
+        if command_nvdm_type != M::NVDM_TYPE {
+            dev_err!(
+                dev,
+                "Expected NVDM type {:#x} in reply, got {:#x}\n",
+                M::NVDM_TYPE,
+                command_nvdm_type
+            );
+            return Err(EIO);
+        }
+
+        if error_code != 0 {
+            dev_err!(
+                dev,
+                "NVDM command {:#x} failed with error {:#x}\n",
+                M::NVDM_TYPE,
+                error_code
+            );
+            return Err(EIO);
+        }
+
+        Ok(())
+    }
 }
-- 
2.53.0



* [PATCH v5 28/38] gpu: nova-core: Hopper/Blackwell: add FspCotVersion type
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (26 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 27/38] gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 29/38] gpu: nova-core: Hopper/Blackwell: larger non-WPR heap John Hubbard
                   ` (9 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Add FspCotVersion to represent the FSP Chain of Trust protocol version,
and Chipset::fsp_cot_version() which returns the version for each
architecture. Hopper uses version 1, Blackwell uses version 2.
Non-FSP architectures return None.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/fsp.rs | 19 +++++++++++++++++++
 drivers/gpu/nova-core/gpu.rs | 14 ++++++++++++++
 2 files changed, 33 insertions(+)
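
Not part of the patch: a standalone sketch of the version mapping. The
Architecture enum here is a stand-in that lists only the variants
visible elsewhere in this series, and the free function replaces the
Chipset method so that the example compiles on its own.

```rust
/// Newtype wrapper so a COT version cannot be confused with other u16 values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FspCotVersion(u16);

impl FspCotVersion {
    pub const fn new(version: u16) -> Self {
        Self(version)
    }

    /// Raw protocol version number for the wire format.
    pub const fn raw(self) -> u16 {
        self.0
    }
}

/// Stand-in for the driver's architecture enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Architecture {
    Turing,
    Ampere,
    Ada,
    Hopper,
    Blackwell,
}

/// Hopper uses COT version 1, Blackwell version 2; others have no FSP.
pub const fn fsp_cot_version(arch: Architecture) -> Option<FspCotVersion> {
    match arch {
        Architecture::Hopper => Some(FspCotVersion::new(1)),
        Architecture::Blackwell => Some(FspCotVersion::new(2)),
        _ => None,
    }
}
```

Returning Option rather than a default version forces callers on
pre-Hopper chips to handle the "no FSP" case explicitly.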

diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
index 20c439fc7f7b..8926dd814a83 100644
--- a/drivers/gpu/nova-core/fsp.rs
+++ b/drivers/gpu/nova-core/fsp.rs
@@ -25,6 +25,25 @@
     NvdmType, //
 };
 
+/// FSP Chain of Trust protocol version.
+///
+/// Hopper (GH100) uses version 1, Blackwell uses version 2.
+#[derive(Debug, Clone, Copy)]
+pub(crate) struct FspCotVersion(u16);
+
+impl FspCotVersion {
+    /// Create a new FSP COT version.
+    pub(crate) const fn new(version: u16) -> Self {
+        Self(version)
+    }
+
+    /// Return the raw protocol version number for the wire format.
+    #[expect(dead_code)]
+    pub(crate) const fn raw(self) -> u16 {
+        self.0
+    }
+}
+
 /// FSP message timeout in milliseconds.
 const FSP_MSG_TIMEOUT_MS: i64 = 2000;
 
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 50bf351b64cc..fc34c97a61fc 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -21,6 +21,7 @@
         Falcon, //
     },
     fb::SysmemFlush,
+    fsp::FspCotVersion,
     gfw,
     gsp::Gsp,
     regs,
@@ -127,6 +128,19 @@ pub(crate) const fn arch(&self) -> Architecture {
             | Self::GB207 => Architecture::Blackwell,
         }
     }
+
+    /// Returns the FSP Chain of Trust (COT) protocol version for this chipset.
+    ///
+    /// Hopper (GH100) uses version 1, Blackwell uses version 2.
+    /// Returns `None` for architectures that do not use FSP.
+    #[expect(dead_code)]
+    pub(crate) const fn fsp_cot_version(&self) -> Option<FspCotVersion> {
+        match self.arch() {
+            Architecture::Hopper => Some(FspCotVersion::new(1)),
+            Architecture::Blackwell => Some(FspCotVersion::new(2)),
+            _ => None,
+        }
+    }
 }
 
 // TODO
-- 
2.53.0



* [PATCH v5 29/38] gpu: nova-core: Hopper/Blackwell: larger non-WPR heap
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (27 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 28/38] gpu: nova-core: Hopper/Blackwell: add FspCotVersion type John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 30/38] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot John Hubbard
                   ` (8 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Add dedicated FB HALs for Hopper (GH100) and Blackwell (GB100) with
architecture-specific non-WPR heap sizes. Hopper uses 2 MiB, Blackwell
uses 2 MiB + 128 KiB. These are needed for the larger reserved memory
regions that Hopper/Blackwell GPUs require.

Also add the non_wpr_heap_size() method to the FbHal trait, and a
calc_non_wpr_heap_size() helper that falls back to 1 MiB for HALs that
do not override it.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/fb.rs           | 16 ++++++++---
 drivers/gpu/nova-core/fb/hal.rs       | 16 ++++++++---
 drivers/gpu/nova-core/fb/hal/ga102.rs |  2 +-
 drivers/gpu/nova-core/fb/hal/gb100.rs | 38 +++++++++++++++++++++++++++
 drivers/gpu/nova-core/fb/hal/gh100.rs | 38 +++++++++++++++++++++++++++
 5 files changed, 102 insertions(+), 8 deletions(-)
 create mode 100644 drivers/gpu/nova-core/fb/hal/gb100.rs
 create mode 100644 drivers/gpu/nova-core/fb/hal/gh100.rs
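
Not part of the patch: a standalone sketch of the default-method
pattern and the resulting heap arithmetic. The trait is trimmed to the
one method that matters here, the helper takes a HAL reference instead
of a Chipset, and heap_range() is a hypothetical function mirroring the
FbRange computation in FbLayout::new().

```rust
use core::ops::Range;

const SZ_1M: u64 = 1 << 20;

pub trait FbHal {
    /// `None` means "use the default 1 MiB heap".
    fn non_wpr_heap_size(&self) -> Option<u32> {
        None
    }
}

pub struct Ga102;
pub struct Gh100;
pub struct Gb100;

// Pre-Hopper HALs keep the default.
impl FbHal for Ga102 {}

impl FbHal for Gh100 {
    fn non_wpr_heap_size(&self) -> Option<u32> {
        Some(0x200000) // 2 MiB
    }
}

impl FbHal for Gb100 {
    fn non_wpr_heap_size(&self) -> Option<u32> {
        Some(0x220000) // 2 MiB + 128 KiB
    }
}

/// Heap size in bytes, falling back to 1 MiB when the HAL has no override.
pub fn calc_non_wpr_heap_size(hal: &dyn FbHal) -> u64 {
    hal.non_wpr_heap_size().map(u64::from).unwrap_or(SZ_1M)
}

/// The heap sits directly below the start of WPR2.
pub fn heap_range(hal: &dyn FbHal, wpr2_start: u64) -> Range<u64> {
    let heap_size = calc_non_wpr_heap_size(hal);
    wpr2_start - heap_size..wpr2_start
}
```

For example, with WPR2 starting at 0x1000_0000 the Blackwell heap
covers 0x0fde_0000..0x1000_0000. Existing HALs need no changes because
the trait method has a default body.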

diff --git a/drivers/gpu/nova-core/fb.rs b/drivers/gpu/nova-core/fb.rs
index 0e3519e5ccc0..8b3ba9c9f464 100644
--- a/drivers/gpu/nova-core/fb.rs
+++ b/drivers/gpu/nova-core/fb.rs
@@ -31,7 +31,7 @@
     regs,
 };
 
-mod hal;
+pub(crate) mod hal;
 
 /// Type holding the sysmem flush memory page, a page of memory to be written into the
 /// `NV_PFB_NISO_FLUSH_SYSMEM_ADDR*` registers and used to maintain memory coherency.
@@ -99,6 +99,15 @@ pub(crate) fn unregister(&self, bar: &Bar0) {
     }
 }
 
+/// Calculate non-WPR heap size based on chipset architecture.
+/// This matches the logic used in FSP for consistency.
+pub(crate) fn calc_non_wpr_heap_size(chipset: Chipset) -> u64 {
+    hal::fb_hal(chipset)
+        .non_wpr_heap_size()
+        .map(u64::from)
+        .unwrap_or(usize_as_u64(SZ_1M))
+}
+
 pub(crate) struct FbRange(Range<u64>);
 
 impl FbRange {
@@ -253,9 +262,8 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<
         };
 
         let heap = {
-            const HEAP_SIZE: u64 = usize_as_u64(SZ_1M);
-
-            FbRange(wpr2.start - HEAP_SIZE..wpr2.start)
+            let heap_size = calc_non_wpr_heap_size(chipset);
+            FbRange(wpr2.start - heap_size..wpr2.start)
         };
 
         Ok(Self {
diff --git a/drivers/gpu/nova-core/fb/hal.rs b/drivers/gpu/nova-core/fb/hal.rs
index d33ca0f96417..ebd12247f771 100644
--- a/drivers/gpu/nova-core/fb/hal.rs
+++ b/drivers/gpu/nova-core/fb/hal.rs
@@ -12,6 +12,8 @@
 
 mod ga100;
 mod ga102;
+mod gb100;
+mod gh100;
 mod tu102;
 
 pub(crate) trait FbHal {
@@ -28,14 +30,22 @@ pub(crate) trait FbHal {
 
     /// Returns the VRAM size, in bytes.
     fn vidmem_size(&self, bar: &Bar0) -> u64;
+
+    /// Returns the non-WPR heap size for GPUs that need large reserved memory.
+    ///
+    /// Returns `None` for GPUs that don't need extra reserved memory.
+    fn non_wpr_heap_size(&self) -> Option<u32> {
+        None
+    }
 }
 
 /// Returns the HAL corresponding to `chipset`.
-pub(super) fn fb_hal(chipset: Chipset) -> &'static dyn FbHal {
+pub(crate) fn fb_hal(chipset: Chipset) -> &'static dyn FbHal {
     match chipset.arch() {
         Architecture::Turing => tu102::TU102_HAL,
         Architecture::Ampere if chipset == Chipset::GA100 => ga100::GA100_HAL,
-        Architecture::Ampere => ga102::GA102_HAL,
-        Architecture::Ada | Architecture::Hopper | Architecture::Blackwell => ga102::GA102_HAL,
+        Architecture::Ampere | Architecture::Ada => ga102::GA102_HAL,
+        Architecture::Hopper => gh100::GH100_HAL,
+        Architecture::Blackwell => gb100::GB100_HAL,
     }
 }
diff --git a/drivers/gpu/nova-core/fb/hal/ga102.rs b/drivers/gpu/nova-core/fb/hal/ga102.rs
index 734605905031..f8d8f01e3c5d 100644
--- a/drivers/gpu/nova-core/fb/hal/ga102.rs
+++ b/drivers/gpu/nova-core/fb/hal/ga102.rs
@@ -8,7 +8,7 @@
     regs, //
 };
 
-fn vidmem_size_ga102(bar: &Bar0) -> u64 {
+pub(super) fn vidmem_size_ga102(bar: &Bar0) -> u64 {
     regs::NV_USABLE_FB_SIZE_IN_MB::read(bar).usable_fb_size()
 }
 
diff --git a/drivers/gpu/nova-core/fb/hal/gb100.rs b/drivers/gpu/nova-core/fb/hal/gb100.rs
new file mode 100644
index 000000000000..bead99a6ca76
--- /dev/null
+++ b/drivers/gpu/nova-core/fb/hal/gb100.rs
@@ -0,0 +1,38 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use kernel::prelude::*;
+
+use crate::{
+    driver::Bar0,
+    fb::hal::FbHal, //
+};
+
+struct Gb100;
+
+impl FbHal for Gb100 {
+    fn read_sysmem_flush_page(&self, bar: &Bar0) -> u64 {
+        super::ga100::read_sysmem_flush_page_ga100(bar)
+    }
+
+    fn write_sysmem_flush_page(&self, bar: &Bar0, addr: u64) -> Result {
+        super::ga100::write_sysmem_flush_page_ga100(bar, addr);
+
+        Ok(())
+    }
+
+    fn supports_display(&self, bar: &Bar0) -> bool {
+        super::ga100::display_enabled_ga100(bar)
+    }
+
+    fn vidmem_size(&self, bar: &Bar0) -> u64 {
+        super::ga102::vidmem_size_ga102(bar)
+    }
+
+    fn non_wpr_heap_size(&self) -> Option<u32> {
+        // 2 MiB + 128 KiB non-WPR heap for Blackwell (see Open RM: kgspCalculateFbLayout_GB100).
+        Some(0x220000)
+    }
+}
+
+const GB100: Gb100 = Gb100;
+pub(super) const GB100_HAL: &dyn FbHal = &GB100;
diff --git a/drivers/gpu/nova-core/fb/hal/gh100.rs b/drivers/gpu/nova-core/fb/hal/gh100.rs
new file mode 100644
index 000000000000..32d7414e6243
--- /dev/null
+++ b/drivers/gpu/nova-core/fb/hal/gh100.rs
@@ -0,0 +1,38 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use kernel::prelude::*;
+
+use crate::{
+    driver::Bar0,
+    fb::hal::FbHal, //
+};
+
+struct Gh100;
+
+impl FbHal for Gh100 {
+    fn read_sysmem_flush_page(&self, bar: &Bar0) -> u64 {
+        super::ga100::read_sysmem_flush_page_ga100(bar)
+    }
+
+    fn write_sysmem_flush_page(&self, bar: &Bar0, addr: u64) -> Result {
+        super::ga100::write_sysmem_flush_page_ga100(bar, addr);
+
+        Ok(())
+    }
+
+    fn supports_display(&self, bar: &Bar0) -> bool {
+        super::ga100::display_enabled_ga100(bar)
+    }
+
+    fn vidmem_size(&self, bar: &Bar0) -> u64 {
+        super::ga102::vidmem_size_ga102(bar)
+    }
+
+    fn non_wpr_heap_size(&self) -> Option<u32> {
+        // 2 MiB non-WPR heap for Hopper (see Open RM: kgspCalculateFbLayout_GH100).
+        Some(0x200000)
+    }
+}
+
+const GH100: Gh100 = Gh100;
+pub(super) const GH100_HAL: &dyn FbHal = &GH100;
-- 
2.53.0



* [PATCH v5 30/38] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (28 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 29/38] gpu: nova-core: Hopper/Blackwell: larger non-WPR heap John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 31/38] gpu: nova-core: Blackwell: use correct sysmem flush registers John Hubbard
                   ` (7 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Add boot_fmc() which builds and sends the Chain of Trust message to FSP,
and FmcBootArgs which bundles the DMA-coherent boot parameters that FSP
reads at boot time. The FspFirmware struct fields become pub(crate) and
fmc_full changes from DmaObject to KVec<u8> for CPU-side signature
extraction.

Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/firmware/fsp.rs |  14 ++-
 drivers/gpu/nova-core/fsp.rs          | 134 +++++++++++++++++++++++++-
 drivers/gpu/nova-core/gpu.rs          |   1 -
 drivers/gpu/nova-core/mctp.rs         |   2 -
 4 files changed, 141 insertions(+), 10 deletions(-)
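
Not part of the patch: a standalone sketch of the reserved-size
arithmetic that boot_fmc() starts with on a non-resume boot, namely
non-WPR heap size plus PMU reserved size, rounded up to 2 MiB. The
value of PMU_RESERVED_SIZE is not visible in this patch, so it is a
parameter here, and both functions are hypothetical stand-ins that
return None on overflow or a bad alignment. Whether the kernel's
align_up helper reports errors the same way is an assumption.

```rust
const SZ_2M: u64 = 2 << 20;

/// Round `value` up to a multiple of `align`, which must be a power of two.
pub fn align_up(value: u64, align: u64) -> Option<u64> {
    if !align.is_power_of_two() {
        return None;
    }
    // Adding (align - 1) and masking the low bits rounds up.
    let bumped = value.checked_add(align - 1)?;
    Some(bumped & !(align - 1))
}

/// Heap plus PMU reservation, rounded up to the next 2 MiB boundary.
pub fn frts_reserved_size(non_wpr_heap_size: u64, pmu_reserved_size: u64) -> Option<u64> {
    let total = non_wpr_heap_size.checked_add(pmu_reserved_size)?;
    align_up(total, SZ_2M)
}
```

Note the effect on Blackwell: its 2 MiB + 128 KiB heap alone already
rounds up to 4 MiB, whereas the 2 MiB Hopper heap is already aligned.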

diff --git a/drivers/gpu/nova-core/firmware/fsp.rs b/drivers/gpu/nova-core/firmware/fsp.rs
index cea9532ba5ff..bb35f363b998 100644
--- a/drivers/gpu/nova-core/firmware/fsp.rs
+++ b/drivers/gpu/nova-core/firmware/fsp.rs
@@ -13,16 +13,16 @@
     gpu::Chipset, //
 };
 
-#[expect(unused)]
+#[expect(dead_code)]
 pub(crate) struct FspFirmware {
     /// FMC firmware image data (only the "image" ELF section).
-    fmc_image: DmaObject,
+    pub(crate) fmc_image: DmaObject,
     /// Full FMC ELF data (for signature extraction).
-    fmc_full: DmaObject,
+    pub(crate) fmc_full: KVec<u8>,
 }
 
 impl FspFirmware {
-    #[expect(unused)]
+    #[expect(dead_code)]
     pub(crate) fn new(
         dev: &device::Device<device::Bound>,
         chipset: Chipset,
@@ -36,9 +36,13 @@ pub(crate) fn new(
             EINVAL
         })?;
 
+        // Copy the full ELF into a kernel vector for CPU-side signature extraction.
+        let mut fmc_full = KVec::with_capacity(fw.data().len(), GFP_KERNEL)?;
+        fmc_full.extend_from_slice(fw.data(), GFP_KERNEL)?;
+
         Ok(Self {
             fmc_image: DmaObject::from_data(dev, fmc_image_data)?,
-            fmc_full: DmaObject::from_data(dev, fw.data())?,
+            fmc_full,
         })
     }
 }
diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
index 8926dd814a83..c66ad0a102a6 100644
--- a/drivers/gpu/nova-core/fsp.rs
+++ b/drivers/gpu/nova-core/fsp.rs
@@ -8,8 +8,14 @@
 
 use kernel::{
     device,
+    dma::CoherentAllocation,
     io::poll::read_poll_timeout,
     prelude::*,
+    ptr::{
+        Alignable,
+        Alignment, //
+    },
+    sizes::{SZ_1M, SZ_2M},
     time::Delta,
     transmute::{
         AsBytes,
@@ -38,7 +44,6 @@ pub(crate) const fn new(version: u16) -> Self {
     }
 
     /// Return the raw protocol version number for the wire format.
-    #[expect(dead_code)]
     pub(crate) const fn raw(self) -> u16 {
         self.0
     }
@@ -221,6 +226,73 @@ impl MessageToFsp for FspMessage {
     const NVDM_TYPE: u32 = NvdmType::Cot as u32;
 }
 
+/// Bundled arguments for FMC boot via FSP Chain of Trust.
+pub(crate) struct FmcBootArgs<'a> {
+    chipset: crate::gpu::Chipset,
+    fmc_image_fw: &'a crate::dma::DmaObject,
+    fmc_boot_params: kernel::dma::CoherentAllocation<GspFmcBootParams>,
+    resume: bool,
+    signatures: &'a FmcSignatures,
+}
+
+impl<'a> FmcBootArgs<'a> {
+    /// Build FMC boot arguments, allocating the DMA-coherent boot parameter
+    /// structure that FSP will read.
+    #[expect(dead_code)]
+    #[allow(clippy::too_many_arguments)]
+    pub(crate) fn new(
+        dev: &device::Device<device::Bound>,
+        chipset: crate::gpu::Chipset,
+        fmc_image_fw: &'a crate::dma::DmaObject,
+        wpr_meta_addr: u64,
+        wpr_meta_size: u32,
+        libos_addr: u64,
+        resume: bool,
+        signatures: &'a FmcSignatures,
+    ) -> Result<Self> {
+        const GSP_DMA_TARGET_COHERENT_SYSTEM: u32 = 1;
+        const GSP_DMA_TARGET_NONCOHERENT_SYSTEM: u32 = 2;
+
+        let fmc_boot_params = CoherentAllocation::<GspFmcBootParams>::alloc_coherent(
+            dev,
+            1,
+            GFP_KERNEL | __GFP_ZERO,
+        )?;
+
+        kernel::dma_write!(
+            fmc_boot_params[0].boot_gsp_rm_params.target = GSP_DMA_TARGET_COHERENT_SYSTEM
+        )?;
+        kernel::dma_write!(
+            fmc_boot_params[0].boot_gsp_rm_params.gsp_rm_desc_offset = wpr_meta_addr
+        )?;
+        kernel::dma_write!(fmc_boot_params[0].boot_gsp_rm_params.gsp_rm_desc_size = wpr_meta_size)?;
+
+        // wpr_carveout_offset and wpr_carveout_size are left at zero (the allocation is zeroed):
+        // Blackwell FSP expects that and obtains WPR info from other sources.
+        kernel::dma_write!(fmc_boot_params[0].boot_gsp_rm_params.b_is_gsp_rm_boot = 1)?;
+
+        kernel::dma_write!(
+            fmc_boot_params[0].gsp_rm_params.target = GSP_DMA_TARGET_NONCOHERENT_SYSTEM
+        )?;
+        kernel::dma_write!(fmc_boot_params[0].gsp_rm_params.boot_args_offset = libos_addr)?;
+
+        Ok(Self {
+            chipset,
+            fmc_image_fw,
+            fmc_boot_params,
+            resume,
+            signatures,
+        })
+    }
+
+    /// DMA address of the FMC boot parameters, needed after boot for lockdown
+    /// release polling.
+    #[expect(dead_code)]
+    pub(crate) fn boot_params_dma_handle(&self) -> u64 {
+        self.fmc_boot_params.dma_handle()
+    }
+}
+
 /// FSP interface for Hopper/Blackwell GPUs.
 pub(crate) struct Fsp;
 
@@ -315,8 +387,66 @@ pub(crate) fn extract_fmc_signatures(
         Ok(signatures)
     }
 
-    /// Send message to FSP and wait for response.
+    /// Boot GSP FMC via FSP Chain of Trust.
+    ///
+    /// Builds the COT message from the pre-configured [`FmcBootArgs`], sends it
+    /// to FSP, and waits for the response.
     #[expect(dead_code)]
+    pub(crate) fn boot_fmc(
+        dev: &device::Device<device::Bound>,
+        bar: &crate::driver::Bar0,
+        fsp_falcon: &crate::falcon::Falcon<crate::falcon::fsp::Fsp>,
+        args: &FmcBootArgs<'_>,
+    ) -> Result {
+        dev_dbg!(dev, "Starting FSP boot sequence for {}\n", args.chipset);
+
+        let fmc_addr = args.fmc_image_fw.dma_handle();
+        let fmc_boot_params_addr = args.fmc_boot_params.dma_handle();
+
+        // frts_offset is relative to FB end: FRTS_location = FB_END - frts_offset
+        let frts_offset = if !args.resume {
+            let mut frts_reserved_size = crate::fb::calc_non_wpr_heap_size(args.chipset);
+
+            frts_reserved_size += u64::from(crate::fb::PMU_RESERVED_SIZE);
+
+            frts_reserved_size
+                .align_up(Alignment::new::<SZ_2M>())
+                .ok_or(EINVAL)?
+        } else {
+            0
+        };
+        let frts_size: u32 = if !args.resume { SZ_1M as u32 } else { 0 };
+
+        let msg = KBox::new(
+            FspMessage {
+                mctp_header: MctpHeader::single_packet().raw(),
+                nvdm_header: NvdmHeader::new(NvdmType::Cot).raw(),
+
+                cot: NvdmPayloadCot {
+                    version: args.chipset.fsp_cot_version().ok_or(ENOTSUPP)?.raw(),
+                    size: u16::try_from(core::mem::size_of::<NvdmPayloadCot>())
+                        .map_err(|_| EINVAL)?,
+                    gsp_fmc_sysmem_offset: fmc_addr,
+                    frts_sysmem_offset: 0,
+                    frts_sysmem_size: 0,
+                    frts_vidmem_offset: frts_offset,
+                    frts_vidmem_size: frts_size,
+                    hash384: args.signatures.hash384,
+                    public_key: args.signatures.public_key,
+                    signature: args.signatures.signature,
+                    gsp_boot_args_sysmem_offset: fmc_boot_params_addr,
+                },
+            },
+            GFP_KERNEL,
+        )?;
+
+        Self::send_sync_fsp(dev, bar, fsp_falcon, &*msg)?;
+
+        dev_dbg!(dev, "FSP Chain of Trust completed successfully\n");
+        Ok(())
+    }
+
+    /// Send message to FSP and wait for response.
     fn send_sync_fsp<M>(
         dev: &device::Device<device::Bound>,
         bar: &crate::driver::Bar0,
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index fc34c97a61fc..51a91dc98415 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -133,7 +133,6 @@ pub(crate) const fn arch(&self) -> Architecture {
     ///
     /// Hopper (GH100) uses version 1, Blackwell uses version 2.
     /// Returns `None` for architectures that do not use FSP.
-    #[expect(dead_code)]
     pub(crate) const fn fsp_cot_version(&self) -> Option<FspCotVersion> {
         match self.arch() {
             Architecture::Hopper => Some(FspCotVersion::new(1)),
diff --git a/drivers/gpu/nova-core/mctp.rs b/drivers/gpu/nova-core/mctp.rs
index 0dafc31b230c..c4e36a46fd69 100644
--- a/drivers/gpu/nova-core/mctp.rs
+++ b/drivers/gpu/nova-core/mctp.rs
@@ -6,8 +6,6 @@
 //! Device Management) messages between the kernel driver and GPU firmware
 //! processors such as FSP and GSP.
 
-#![expect(dead_code)]
-
 /// NVDM message type identifiers carried over MCTP.
 #[derive(Debug, Clone, Copy, PartialEq, Eq)]
 #[repr(u32)]
-- 
2.53.0



* [PATCH v5 31/38] gpu: nova-core: Blackwell: use correct sysmem flush registers
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (29 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 30/38] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 32/38] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap John Hubbard
                   ` (6 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Blackwell GPUs moved the sysmem flush page registers away from the
legacy NV_PFB_NISO_FLUSH_SYSMEM_ADDR used by Ampere/Ada.

GB10x uses HSHUB0 registers, with both a primary and EG (egress) pair
that must be programmed to the same address. GB20x uses FBHUB0
registers.

Add separate GB100 and GB202 fb HALs, and split the Blackwell HAL
dispatch so that each uses its respective registers.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
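Not part of the commit: a minimal standalone sketch of the LO/HI address
split used by the new register pairs, assuming the field widths in this
patch (32-bit LO, 20-bit HI). The names split_flush_addr, join_flush_addr
and HI_FIELD_BITS are illustrative only and do not exist in nova-core; in
the patch itself the HI value is masked by the register field rather than
by an explicit AND.

```rust
/// Width of the `adr` field in the HI register (bits 19:0 in this patch).
/// Illustrative constant, not a nova-core symbol.
const HI_FIELD_BITS: u32 = 20;

/// Split a 64-bit sysmem flush page address into (lo, hi) register values.
/// The low 8 bits of `lo` are ignored by hardware according to the patch,
/// so callers are expected to pass a suitably aligned address.
fn split_flush_addr(addr: u64) -> (u32, u32) {
    // Lower 32 bits go into the LO register unchanged.
    let lo = addr as u32;
    // Upper 32 bits, truncated to the 20-bit HI field.
    let hi = ((addr >> 32) as u32) & ((1u32 << HI_FIELD_BITS) - 1);
    (lo, hi)
}

/// Rebuild the address from the two register values, as the read path does.
fn join_flush_addr(lo: u32, hi: u32) -> u64 {
    u64::from(lo) | (u64::from(hi) << 32)
}
```

For example, split_flush_addr(0x0001_2345_6789_ab00) yields
(0x6789_ab00, 0x0001_2345), and join_flush_addr() of that pair returns the
original address. Address bits above bit 51 do not fit in the HI field and
are dropped.
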
 drivers/gpu/nova-core/fb/hal.rs       | 10 ++++-
 drivers/gpu/nova-core/fb/hal/gb100.rs | 47 +++++++++++++++++---
 drivers/gpu/nova-core/fb/hal/gb202.rs | 62 +++++++++++++++++++++++++++
 drivers/gpu/nova-core/regs.rs         | 36 ++++++++++++++++
 4 files changed, 149 insertions(+), 6 deletions(-)
 create mode 100644 drivers/gpu/nova-core/fb/hal/gb202.rs

diff --git a/drivers/gpu/nova-core/fb/hal.rs b/drivers/gpu/nova-core/fb/hal.rs
index ebd12247f771..844b00868832 100644
--- a/drivers/gpu/nova-core/fb/hal.rs
+++ b/drivers/gpu/nova-core/fb/hal.rs
@@ -13,9 +13,14 @@
 mod ga100;
 mod ga102;
 mod gb100;
+mod gb202;
 mod gh100;
 mod tu102;
 
+/// Non-WPR heap size for Blackwell (2 MiB + 128 KiB).
+/// See Open RM: kgspCalculateFbLayout_GB100.
+const BLACKWELL_NON_WPR_HEAP_SIZE: u32 = 0x220000;
+
 pub(crate) trait FbHal {
     /// Returns the address of the currently-registered sysmem flush page.
     fn read_sysmem_flush_page(&self, bar: &Bar0) -> u64;
@@ -46,6 +51,9 @@ pub(crate) fn fb_hal(chipset: Chipset) -> &'static dyn FbHal {
         Architecture::Ampere if chipset == Chipset::GA100 => ga100::GA100_HAL,
         Architecture::Ampere | Architecture::Ada => ga102::GA102_HAL,
         Architecture::Hopper => gh100::GH100_HAL,
-        Architecture::Blackwell => gb100::GB100_HAL,
+        Architecture::Blackwell => match chipset {
+            Chipset::GB100 | Chipset::GB102 => gb100::GB100_HAL,
+            _ => gb202::GB202_HAL,
+        },
     }
 }
diff --git a/drivers/gpu/nova-core/fb/hal/gb100.rs b/drivers/gpu/nova-core/fb/hal/gb100.rs
index bead99a6ca76..831a058a388b 100644
--- a/drivers/gpu/nova-core/fb/hal/gb100.rs
+++ b/drivers/gpu/nova-core/fb/hal/gb100.rs
@@ -1,21 +1,59 @@
 // SPDX-License-Identifier: GPL-2.0
 
+//! Blackwell GB10x framebuffer HAL.
+//!
+//! GB10x GPUs use HSHUB0 registers for the sysmem flush page. Both the primary and EG (egress)
+//! register pairs must be programmed to the same address, as required by hardware.
+
 use kernel::prelude::*;
 
 use crate::{
     driver::Bar0,
-    fb::hal::FbHal, //
+    fb::hal::FbHal,
+    regs, //
 };
 
 struct Gb100;
 
+fn read_sysmem_flush_page_gb100(bar: &Bar0) -> u64 {
+    let lo = u64::from(regs::NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO::read(bar).adr());
+    let hi = u64::from(regs::NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI::read(bar).adr());
+
+    lo | (hi << 32)
+}
+
+fn write_sysmem_flush_page_gb100(bar: &Bar0, addr: u64) {
+    // CAST: lower 32 bits. Hardware ignores bits 7:0.
+    let addr_lo = addr as u32;
+    // CAST: upper 32 bits, then masked to 20 bits by the register field.
+    let addr_hi = (addr >> 32) as u32;
+
+    // Write HI first. The hardware will trigger the flush on the LO write.
+
+    // Primary HSHUB pair.
+    regs::NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI::default()
+        .set_adr(addr_hi)
+        .write(bar);
+    regs::NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO::default()
+        .set_adr(addr_lo)
+        .write(bar);
+
+    // EG (egress) pair -- must match the primary pair.
+    regs::NV_PFB_HSHUB0_EG_PCIE_FLUSH_SYSMEM_ADDR_HI::default()
+        .set_adr(addr_hi)
+        .write(bar);
+    regs::NV_PFB_HSHUB0_EG_PCIE_FLUSH_SYSMEM_ADDR_LO::default()
+        .set_adr(addr_lo)
+        .write(bar);
+}
+
 impl FbHal for Gb100 {
     fn read_sysmem_flush_page(&self, bar: &Bar0) -> u64 {
-        super::ga100::read_sysmem_flush_page_ga100(bar)
+        read_sysmem_flush_page_gb100(bar)
     }
 
     fn write_sysmem_flush_page(&self, bar: &Bar0, addr: u64) -> Result {
-        super::ga100::write_sysmem_flush_page_ga100(bar, addr);
+        write_sysmem_flush_page_gb100(bar, addr);
 
         Ok(())
     }
@@ -29,8 +67,7 @@ fn vidmem_size(&self, bar: &Bar0) -> u64 {
     }
 
     fn non_wpr_heap_size(&self) -> Option<u32> {
-        // 2 MiB + 128 KiB non-WPR heap for Blackwell (see Open RM: kgspCalculateFbLayout_GB100).
-        Some(0x220000)
+        Some(super::BLACKWELL_NON_WPR_HEAP_SIZE)
     }
 }
 
diff --git a/drivers/gpu/nova-core/fb/hal/gb202.rs b/drivers/gpu/nova-core/fb/hal/gb202.rs
new file mode 100644
index 000000000000..2a4c3e7961b2
--- /dev/null
+++ b/drivers/gpu/nova-core/fb/hal/gb202.rs
@@ -0,0 +1,62 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Blackwell GB20x framebuffer HAL.
+//!
+//! GB20x GPUs moved the sysmem flush registers from `NV_PFB_NISO_FLUSH_SYSMEM_ADDR` to
+//! `NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_{LO,HI}`.
+
+use kernel::prelude::*;
+
+use crate::{
+    driver::Bar0,
+    fb::hal::FbHal,
+    regs, //
+};
+
+struct Gb202;
+
+fn read_sysmem_flush_page_gb202(bar: &Bar0) -> u64 {
+    let lo = u64::from(regs::NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO::read(bar).adr());
+    let hi = u64::from(regs::NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI::read(bar).adr());
+
+    lo | (hi << 32)
+}
+
+fn write_sysmem_flush_page_gb202(bar: &Bar0, addr: u64) {
+    // Write HI first. The hardware will trigger the flush on the LO write.
+    regs::NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI::default()
+        // CAST: upper 32 bits, then masked to 20 bits by the register field.
+        .set_adr((addr >> 32) as u32)
+        .write(bar);
+    regs::NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO::default()
+        // CAST: lower 32 bits. Hardware ignores bits 7:0.
+        .set_adr(addr as u32)
+        .write(bar);
+}
+
+impl FbHal for Gb202 {
+    fn read_sysmem_flush_page(&self, bar: &Bar0) -> u64 {
+        read_sysmem_flush_page_gb202(bar)
+    }
+
+    fn write_sysmem_flush_page(&self, bar: &Bar0, addr: u64) -> Result {
+        write_sysmem_flush_page_gb202(bar, addr);
+
+        Ok(())
+    }
+
+    fn supports_display(&self, bar: &Bar0) -> bool {
+        super::ga100::display_enabled_ga100(bar)
+    }
+
+    fn vidmem_size(&self, bar: &Bar0) -> u64 {
+        super::ga102::vidmem_size_ga102(bar)
+    }
+
+    fn non_wpr_heap_size(&self) -> Option<u32> {
+        Some(super::BLACKWELL_NON_WPR_HEAP_SIZE)
+    }
+}
+
+const GB202: Gb202 = Gb202;
+pub(super) const GB202_HAL: &dyn FbHal = &GB202;
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index 77d590887ee7..91911f9b32ca 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -116,6 +116,42 @@ fn fmt(&self, f: &mut kernel::fmt::Formatter<'_>) -> kernel::fmt::Result {
     23:0    adr_63_40 as u32;
 });
 
+// Blackwell GB10x sysmem flush registers (HSHUB0).
+//
+// GB10x GPUs use two pairs of HSHUB registers for sysmembar: a primary pair and an EG
+// (egress) pair. Both must be programmed to the same address. Hardware ignores bits 7:0
+// of each LO register. HSHUB0 base is 0x00891000.
+
+register!(NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO @ 0x00891e50 {
+    31:0    adr as u32;
+});
+
+register!(NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI @ 0x00891e54 {
+    19:0    adr as u32;
+});
+
+register!(NV_PFB_HSHUB0_EG_PCIE_FLUSH_SYSMEM_ADDR_LO @ 0x008916c0 {
+    31:0    adr as u32;
+});
+
+register!(NV_PFB_HSHUB0_EG_PCIE_FLUSH_SYSMEM_ADDR_HI @ 0x008916c4 {
+    19:0    adr as u32;
+});
+
+// Blackwell GB20x sysmem flush registers (FBHUB0).
+//
+// Unlike the older NV_PFB_NISO_FLUSH_SYSMEM_ADDR registers which encode the address with an
+// 8-bit right-shift, these registers take the raw address split into lower/upper 32-bit halves.
+// The hardware ignores bits 7:0 of the LO register.
+
+register!(NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO @ 0x008a1d58 {
+    31:0    adr as u32;
+});
+
+register!(NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI @ 0x008a1d5c {
+    19:0    adr as u32;
+});
+
 register!(NV_PFB_PRI_MMU_LOCAL_MEMORY_RANGE @ 0x00100ce0 {
     3:0     lower_scale as u8;
     9:4     lower_mag as u8;
-- 
2.53.0



* [PATCH v5 32/38] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (30 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 31/38] gpu: nova-core: Blackwell: use correct sysmem flush registers John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 33/38] gpu: nova-core: refactor SEC2 booter loading into BooterFirmware::run() John Hubbard
                   ` (5 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

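Hopper, Blackwell and later GPUs require a larger heap for WPR2. For
these architectures, use a 14 MB base RM size and a 142 MB client
allocation size, and raise the minimum allowed heap size to 170 MB.

The heap sizing helpers now return EINVAL on alignment overflow instead
of saturating to u64::MAX, so wpr_heap_size() now returns a Result.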

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
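Not part of the commit: a rough standalone sketch of how the WPR2 heap size
is assembled for Hopper/Blackwell, assuming the constants introduced by this
patch (14 MB base RM size, 142 MB client allocation, 170 MB minimum). The
carveout, size_per_gb and max_mb parameters stand in for values that come
from the generated bindings and are not shown here, so any numbers passed
for them are placeholders. The function and helper names are illustrative,
and Option is used in place of the kernel Result type.

```rust
const SZ_1M: u64 = 1 << 20;
const SZ_1G: u64 = 1 << 30;

// Values taken from this patch.
const BASE_RM_SIZE_GH100: u64 = 14 * SZ_1M;
const CLIENT_ALLOC_SIZE_GH100: u64 = 142 * SZ_1M;
const MIN_HEAP_MB_HOPPER: u64 = 170;

/// Round `v` up to a 1 MiB boundary, or None on overflow.
fn align_up_1m(v: u64) -> Option<u64> {
    v.checked_add(SZ_1M - 1).map(|x| x & !(SZ_1M - 1))
}

/// Sketch of the heap size computation. `carveout`, `size_per_gb` and
/// `max_mb` are hypothetical inputs standing in for binding constants;
/// `max_mb` is assumed to be larger than MIN_HEAP_MB_HOPPER.
fn wpr_heap_size(carveout: u64, size_per_gb: u64, max_mb: u64, fb_size: u64) -> Option<u64> {
    let fb_size_gb = fb_size.div_ceil(SZ_1G);
    // Management overhead scales with framebuffer size.
    let mgmt = align_up_1m(size_per_gb.saturating_mul(fb_size_gb))?;
    let client = align_up_1m(CLIENT_ALLOC_SIZE_GH100)?;

    let min = MIN_HEAP_MB_HOPPER * SZ_1M;
    // The allowed range is half-open, so the largest valid size is end - 1.
    let max = max_mb * SZ_1M - 1;

    Some(
        carveout
            .saturating_add(BASE_RM_SIZE_GH100)
            .saturating_add(client)
            .saturating_add(mgmt)
            .clamp(min, max),
    )
}
```

With no carveout and no management overhead the sum is 14 + 142 = 156 MB,
which is below the new minimum and therefore clamps to 170 MB. An alignment
overflow surfaces as None here, mirroring the EINVAL return in the patch.
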
 drivers/gpu/nova-core/fb.rs     |  2 +-
 drivers/gpu/nova-core/gsp/fw.rs | 74 ++++++++++++++++++++++++---------
 2 files changed, 55 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/nova-core/fb.rs b/drivers/gpu/nova-core/fb.rs
index 8b3ba9c9f464..08e6dd815352 100644
--- a/drivers/gpu/nova-core/fb.rs
+++ b/drivers/gpu/nova-core/fb.rs
@@ -247,7 +247,7 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<
         let wpr2_heap = {
             const WPR2_HEAP_DOWN_ALIGN: Alignment = Alignment::new::<SZ_1M>();
             let wpr2_heap_size =
-                gsp::LibosParams::from_chipset(chipset).wpr_heap_size(chipset, fb.end);
+                gsp::LibosParams::from_chipset(chipset).wpr_heap_size(chipset, fb.end)?;
             let wpr2_heap_addr = (elf.start - wpr2_heap_size).align_down(WPR2_HEAP_DOWN_ALIGN);
 
             FbRange(wpr2_heap_addr..(elf.start).align_down(WPR2_HEAP_DOWN_ALIGN))
diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index 086153edfa86..7fa9d3b1a592 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -49,32 +49,52 @@ enum GspFwHeapParams {}
 /// Minimum required alignment for the GSP heap.
 const GSP_HEAP_ALIGNMENT: Alignment = Alignment::new::<{ 1 << 20 }>();
 
+// These constants override the generated bindings for architecture-specific heap sizing.
+// See Open RM: kgspCalculateGspFwHeapSize and related functions.
+//
+// 14MB for Hopper/Blackwell+.
+const GSP_FW_HEAP_PARAM_BASE_RM_SIZE_GH100: u64 = 14 * num::usize_as_u64(SZ_1M);
+// 142MB client alloc for ~188MB total.
+const GSP_FW_HEAP_PARAM_CLIENT_ALLOC_SIZE_GH100: u64 = 142 * num::usize_as_u64(SZ_1M);
+// Hopper/Blackwell+ minimum heap size: 170MB (88 + 12 + 70).
+// See Open RM: GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS3_BAREMETAL_MIN_MB for the base 88MB,
+// plus Hopper+ additions in kgspCalculateGspFwHeapSize_GH100.
+const GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS3_BAREMETAL_MIN_MB_HOPPER: u64 = 170;
+
 impl GspFwHeapParams {
     /// Returns the amount of GSP-RM heap memory used during GSP-RM boot and initialization (up to
     /// and including the first client subdevice allocation).
-    fn base_rm_size(_chipset: Chipset) -> u64 {
-        // TODO: this needs to be updated to return the correct value for Hopper+ once support for
-        // them is added:
-        // u64::from(bindings::GSP_FW_HEAP_PARAM_BASE_RM_SIZE_GH100)
-        u64::from(bindings::GSP_FW_HEAP_PARAM_BASE_RM_SIZE_TU10X)
+    fn base_rm_size(chipset: Chipset) -> u64 {
+        use crate::gpu::Architecture;
+        match chipset.arch() {
+            Architecture::Hopper | Architecture::Blackwell => {
+                GSP_FW_HEAP_PARAM_BASE_RM_SIZE_GH100
+            }
+            _ => u64::from(bindings::GSP_FW_HEAP_PARAM_BASE_RM_SIZE_TU10X),
+        }
     }
 
     /// Returns the amount of heap memory required to support a single channel allocation.
-    fn client_alloc_size() -> u64 {
-        u64::from(bindings::GSP_FW_HEAP_PARAM_CLIENT_ALLOC_SIZE)
-            .align_up(GSP_HEAP_ALIGNMENT)
-            .unwrap_or(u64::MAX)
+    fn client_alloc_size(chipset: Chipset) -> Result<u64> {
+        use crate::gpu::Architecture;
+        let size = match chipset.arch() {
+            Architecture::Hopper | Architecture::Blackwell => {
+                GSP_FW_HEAP_PARAM_CLIENT_ALLOC_SIZE_GH100
+            }
+            _ => u64::from(bindings::GSP_FW_HEAP_PARAM_CLIENT_ALLOC_SIZE),
+        };
+        size.align_up(GSP_HEAP_ALIGNMENT).ok_or(EINVAL)
     }
 
     /// Returns the amount of memory to reserve for management purposes for a framebuffer of size
     /// `fb_size`.
-    fn management_overhead(fb_size: u64) -> u64 {
+    fn management_overhead(fb_size: u64) -> Result<u64> {
         let fb_size_gb = fb_size.div_ceil(u64::from_safe_cast(kernel::sizes::SZ_1G));
 
         u64::from(bindings::GSP_FW_HEAP_PARAM_SIZE_PER_GB_FB)
             .saturating_mul(fb_size_gb)
             .align_up(GSP_HEAP_ALIGNMENT)
-            .unwrap_or(u64::MAX)
+            .ok_or(EINVAL)
     }
 }
 
@@ -106,29 +126,43 @@ impl LibosParams {
                 * num::usize_as_u64(SZ_1M),
     };
 
+    /// Hopper/Blackwell+ GPUs need a larger minimum heap size than the bindings specify.
+    /// The r570 bindings set LIBOS3_BAREMETAL_MIN_MB to 88MB, but Hopper/Blackwell+ GPUs actually
+    /// require 170MB (88 + 12 + 70).
+    const LIBOS_HOPPER: LibosParams = LibosParams {
+        carveout_size: num::u32_as_u64(bindings::GSP_FW_HEAP_PARAM_OS_SIZE_LIBOS3_BAREMETAL),
+        allowed_heap_size: GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS3_BAREMETAL_MIN_MB_HOPPER
+            * num::usize_as_u64(SZ_1M)
+            ..num::u32_as_u64(bindings::GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS3_BAREMETAL_MAX_MB)
+                * num::usize_as_u64(SZ_1M),
+    };
+
     /// Returns the libos parameters corresponding to `chipset`.
     pub(crate) fn from_chipset(chipset: Chipset) -> &'static LibosParams {
-        if chipset < Chipset::GA102 {
-            &Self::LIBOS2
-        } else {
-            &Self::LIBOS3
+        use crate::gpu::Architecture;
+        match chipset.arch() {
+            Architecture::Turing => &Self::LIBOS2,
+            Architecture::Ampere if chipset == Chipset::GA100 => &Self::LIBOS2,
+            Architecture::Ampere | Architecture::Ada => &Self::LIBOS3,
+            Architecture::Hopper | Architecture::Blackwell => &Self::LIBOS_HOPPER,
         }
     }
 
     /// Returns the amount of memory (in bytes) to allocate for the WPR heap for a framebuffer size
     /// of `fb_size` (in bytes) for `chipset`.
-    pub(crate) fn wpr_heap_size(&self, chipset: Chipset, fb_size: u64) -> u64 {
+    pub(crate) fn wpr_heap_size(&self, chipset: Chipset, fb_size: u64) -> Result<u64> {
         // The WPR heap will contain the following:
         // LIBOS carveout,
-        self.carveout_size
+        Ok(self
+            .carveout_size
             // RM boot working memory,
             .saturating_add(GspFwHeapParams::base_rm_size(chipset))
             // One RM client,
-            .saturating_add(GspFwHeapParams::client_alloc_size())
+            .saturating_add(GspFwHeapParams::client_alloc_size(chipset)?)
             // Overhead for memory management.
-            .saturating_add(GspFwHeapParams::management_overhead(fb_size))
+            .saturating_add(GspFwHeapParams::management_overhead(fb_size)?)
             // Clamp to the supported heap sizes.
-            .clamp(self.allowed_heap_size.start, self.allowed_heap_size.end - 1)
+            .clamp(self.allowed_heap_size.start, self.allowed_heap_size.end - 1))
     }
 }
 
-- 
2.53.0



* [PATCH v5 33/38] gpu: nova-core: refactor SEC2 booter loading into BooterFirmware::run()
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (31 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 32/38] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 34/38] gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling John Hubbard
                   ` (4 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Move the SEC2 reset/load/boot sequence into a BooterFirmware::run()
method, and call it from a thin run_booter() helper on Gsp. This is
almost a pure refactoring with no behavior change, done in preparation
for adding an alternative FSP boot path. The one slight difference is
that an MBOX1 printing typo is fixed:

Previous output:

NovaCore 0000:e1:00.0: SEC2 MBOX0: 0x0, MBOX10x1

Fixed output:

NovaCore 0000:e1:00.0: SEC2 MBOX0: 0x0, MBOX1: 0x1

Cc: Timur Tabi <ttabi@nvidia.com>
Suggested-by: Danilo Krummrich <dakr@kernel.org>
Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
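Not part of the commit: a small standalone sketch of the two details this
refactoring touches, namely how the 64-bit WPR metadata DMA handle is split
across the two SEC2 mailboxes, and how the debug format string differs
before and after the typo fix. The helper names (mailbox_args,
check_booter_status, old_mbox_line, new_mbox_line) are illustrative and do
not exist in nova-core, and a plain u32 error stands in for the kernel
error type.

```rust
/// Split a 64-bit DMA handle into the (mbox0, mbox1) boot arguments:
/// low 32 bits first, high 32 bits second.
fn mailbox_args(wpr_handle: u64) -> (u32, u32) {
    (wpr_handle as u32, (wpr_handle >> 32) as u32)
}

/// A non-zero mbox0 after boot is treated as a booter failure.
/// The raw mailbox value is returned as the error for illustration.
fn check_booter_status(mbox0: u32) -> Result<(), u32> {
    if mbox0 != 0 {
        Err(mbox0)
    } else {
        Ok(())
    }
}

/// Format string as it was before this patch (missing ": " after MBOX1).
fn old_mbox_line(mbox0: u32, mbox1: u32) -> String {
    format!("SEC2 MBOX0: {:#x}, MBOX1{:#x}", mbox0, mbox1)
}

/// Format string as fixed by this patch.
fn new_mbox_line(mbox0: u32, mbox1: u32) -> String {
    format!("SEC2 MBOX0: {:#x}, MBOX1: {:#x}", mbox0, mbox1)
}
```

Calling old_mbox_line(0, 1) reproduces the "MBOX10x1" output quoted in the
commit message, while new_mbox_line(0, 1) gives "MBOX1: 0x1".
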
 drivers/gpu/nova-core/firmware/booter.rs | 35 +++++++++++++++-
 drivers/gpu/nova-core/gsp/boot.rs        | 52 +++++++++---------------
 2 files changed, 54 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware/booter.rs b/drivers/gpu/nova-core/firmware/booter.rs
index 86556cee8e67..b3ac3b826e9f 100644
--- a/drivers/gpu/nova-core/firmware/booter.rs
+++ b/drivers/gpu/nova-core/firmware/booter.rs
@@ -11,8 +11,12 @@
 
 use kernel::{
     device,
+    dma::CoherentAllocation,
     prelude::*,
-    transmute::FromBytes, //
+    transmute::{
+        AsBytes,
+        FromBytes, //
+    },
 };
 
 use crate::{
@@ -389,6 +393,35 @@ pub(crate) fn new(
             ucode: ucode_signed,
         })
     }
+
+    /// Load and run the booter firmware on SEC2.
+    ///
+    /// Resets SEC2, loads this firmware image, then boots with the WPR metadata
+    /// address passed via the SEC2 mailboxes.
+    pub(crate) fn run<T: AsBytes + FromBytes>(
+        &self,
+        dev: &device::Device<device::Bound>,
+        bar: &Bar0,
+        sec2_falcon: &Falcon<Sec2>,
+        wpr_meta: &CoherentAllocation<T>,
+    ) -> Result {
+        sec2_falcon.reset(bar)?;
+        sec2_falcon.load(bar, self)?;
+        let wpr_handle = wpr_meta.dma_handle();
+        let (mbox0, mbox1) = sec2_falcon.boot(
+            bar,
+            Some(wpr_handle as u32),
+            Some((wpr_handle >> 32) as u32),
+        )?;
+        dev_dbg!(dev, "SEC2 MBOX0: {:#x}, MBOX1: {:#x}\n", mbox0, mbox1);
+
+        if mbox0 != 0 {
+            dev_err!(dev, "Booter-load failed with error {:#x}\n", mbox0);
+            return Err(ENODEV);
+        }
+
+        Ok(())
+    }
 }
 
 impl FalconLoadParams for BooterFirmware {
diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
index 465c18e4c888..7b177756d16d 100644
--- a/drivers/gpu/nova-core/gsp/boot.rs
+++ b/drivers/gpu/nova-core/gsp/boot.rs
@@ -120,6 +120,25 @@ fn run_fwsec_frts(
         }
     }
 
+    fn run_booter(
+        dev: &device::Device<device::Bound>,
+        bar: &Bar0,
+        chipset: Chipset,
+        sec2_falcon: &Falcon<Sec2>,
+        wpr_meta: &CoherentAllocation<GspFwWprMeta>,
+    ) -> Result {
+        let booter = BooterFirmware::new(
+            dev,
+            BooterKind::Loader,
+            chipset,
+            FIRMWARE_VERSION,
+            sec2_falcon,
+            bar,
+        )?;
+
+        booter.run(dev, bar, sec2_falcon, wpr_meta)
+    }
+
     /// Attempt to boot the GSP.
     ///
     /// This is a GPU-dependent and complex procedure that involves loading firmware files from
@@ -146,15 +165,6 @@ pub(crate) fn boot(
 
         Self::run_fwsec_frts(dev, gsp_falcon, bar, &bios, &fb_layout)?;
 
-        let booter_loader = BooterFirmware::new(
-            dev,
-            BooterKind::Loader,
-            chipset,
-            FIRMWARE_VERSION,
-            sec2_falcon,
-            bar,
-        )?;
-
         let wpr_meta =
             CoherentAllocation::<GspFwWprMeta>::alloc_coherent(dev, 1, GFP_KERNEL | __GFP_ZERO)?;
         dma_write!(wpr_meta[0] = GspFwWprMeta::new(&gsp_fw, &fb_layout))?;
@@ -182,29 +192,7 @@ pub(crate) fn boot(
             "Using SEC2 to load and run the booter_load firmware...\n"
         );
 
-        sec2_falcon.reset(bar)?;
-        sec2_falcon.load(bar, &booter_loader)?;
-        let wpr_handle = wpr_meta.dma_handle();
-        let (mbox0, mbox1) = sec2_falcon.boot(
-            bar,
-            Some(wpr_handle as u32),
-            Some((wpr_handle >> 32) as u32),
-        )?;
-        dev_dbg!(
-            pdev,
-            "SEC2 MBOX0: {:#x}, MBOX1{:#x}\n",
-            mbox0,
-            mbox1
-        );
-
-        if mbox0 != 0 {
-            dev_err!(
-                pdev,
-                "Booter-load failed with error {:#x}\n",
-                mbox0
-            );
-            return Err(ENODEV);
-        }
+        Self::run_booter(dev, bar, chipset, sec2_falcon, &wpr_meta)?;
 
         gsp_falcon.write_os_version(bar, gsp_fw.bootloader.app_version);
 
-- 
2.53.0



* [PATCH v5 34/38] gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (32 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 33/38] gpu: nova-core: refactor SEC2 booter loading into BooterFirmware::run() John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 35/38] gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror John Hubbard
                   ` (3 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

On Hopper and Blackwell, FSP boots GSP with hardware lockdown enabled.
After FSP Chain of Trust completes, the driver must poll for lockdown
release before proceeding with GSP initialization. Add the register
bit and helper functions needed for this polling.

Cc: Gary Guo <gary@garyguo.net>
Cc: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
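Not part of the commit: a standalone sketch of the mailbox checks used for
lockdown polling, assuming the pattern and mask values from this patch. The
poll_until helper is a hypothetical, simplified stand-in for
read_poll_timeout: it retries a fixed number of times and does not sleep or
track wall-clock time. The HWCFG2 register check is omitted because it needs
real hardware access.

```rust
// Values taken from this patch.
const GSP_LOCKDOWN_PATTERN: u32 = 0xbadf4100;
const GSP_LOCKDOWN_MASK: u32 = 0xffffff00;

/// True if mbox0 carries the lockdown pattern in its upper 24 bits.
fn is_locked_down(mbox0: u32) -> bool {
    mbox0 != 0 && (mbox0 & GSP_LOCKDOWN_MASK) == GSP_LOCKDOWN_PATTERN
}

/// Combine the two 32-bit mailboxes into one 64-bit address.
fn combined_addr(mbox0: u32, mbox1: u32) -> u64 {
    (u64::from(mbox1) << 32) | u64::from(mbox0)
}

/// Hypothetical polling helper: call `read` up to `max_tries` times and
/// return the first value for which `done` holds, or None if none does.
fn poll_until<T>(
    mut read: impl FnMut() -> T,
    done: impl Fn(&T) -> bool,
    max_tries: u32,
) -> Option<T> {
    for _ in 0..max_tries {
        let value = read();
        if done(&value) {
            return Some(value);
        }
    }
    None
}
```

A caller would pass a closure that reads mbox0 and a predicate such as
`|m| !is_locked_down(*m)`. As in the patch, a non-zero mbox0 that remains
after the pattern clears would then be reported as a boot failure.
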
 drivers/gpu/nova-core/gsp/boot.rs | 80 ++++++++++++++++++++++++++++++-
 drivers/gpu/nova-core/regs.rs     |  1 +
 2 files changed, 80 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
index 7b177756d16d..5f3207bf7797 100644
--- a/drivers/gpu/nova-core/gsp/boot.rs
+++ b/drivers/gpu/nova-core/gsp/boot.rs
@@ -15,7 +15,8 @@
     falcon::{
         gsp::Gsp,
         sec2::Sec2,
-        Falcon, //
+        Falcon,
+        FalconEngine, //
     },
     fb::FbLayout,
     firmware::{
@@ -43,6 +44,54 @@
     vbios::Vbios,
 };
 
+/// GSP lockdown pattern written by firmware to mbox0 while RISC-V branch privilege
+/// lockdown is active. The low byte varies; the upper 24 bits are fixed.
+const GSP_LOCKDOWN_PATTERN: u32 = 0xbadf4100;
+const GSP_LOCKDOWN_MASK: u32 = 0xffffff00;
+
+/// GSP falcon mailbox state, used to track lockdown release status.
+struct GspMbox {
+    mbox0: u32,
+    mbox1: u32,
+}
+
+impl GspMbox {
+    /// Read both mailboxes from the GSP falcon.
+    fn read(gsp_falcon: &Falcon<Gsp>, bar: &Bar0) -> Self {
+        Self {
+            mbox0: gsp_falcon.read_mailbox0(bar),
+            mbox1: gsp_falcon.read_mailbox1(bar),
+        }
+    }
+
+    /// Returns true if the lockdown pattern is present in mbox0.
+    fn is_locked_down(&self) -> bool {
+        self.mbox0 != 0 && (self.mbox0 & GSP_LOCKDOWN_MASK) == GSP_LOCKDOWN_PATTERN
+    }
+
+    /// Combines mailbox0 and mailbox1 into a 64-bit address.
+    fn combined_addr(&self) -> u64 {
+        (u64::from(self.mbox1) << 32) | u64::from(self.mbox0)
+    }
+
+    /// Returns true if GSP lockdown has been released.
+    ///
+    /// Checks the lockdown pattern, validates the boot params address,
+    /// and verifies the HWCFG2 lockdown bit is clear.
+    fn lockdown_released(&self, bar: &Bar0, fmc_boot_params_addr: u64) -> bool {
+        if self.is_locked_down() {
+            return false;
+        }
+
+        if self.mbox0 != 0 && self.combined_addr() != fmc_boot_params_addr {
+            return true;
+        }
+
+        let hwcfg2 = regs::NV_PFALCON_FALCON_HWCFG2::read(bar, &crate::falcon::gsp::Gsp::ID);
+        !hwcfg2.riscv_br_priv_lockdown()
+    }
+}
+
 impl super::Gsp {
     /// Helper function to load and run the FWSEC-FRTS firmware and confirm that it has properly
     /// created the WPR2 region.
@@ -139,6 +188,35 @@ fn run_booter(
         booter.run(dev, bar, sec2_falcon, wpr_meta)
     }
 
+    /// Wait for GSP lockdown to be released after FSP Chain of Trust.
+    #[expect(dead_code)]
+    fn wait_for_gsp_lockdown_release(
+        dev: &device::Device<device::Bound>,
+        bar: &Bar0,
+        gsp_falcon: &Falcon<Gsp>,
+        fmc_boot_params_addr: u64,
+    ) -> Result {
+        dev_dbg!(dev, "Waiting for GSP lockdown release\n");
+
+        let mbox = read_poll_timeout(
+            || Ok(GspMbox::read(gsp_falcon, bar)),
+            |mbox| mbox.lockdown_released(bar, fmc_boot_params_addr),
+            Delta::from_millis(10),
+            Delta::from_millis(4000),
+        )
+        .inspect_err(|_| {
+            dev_err!(dev, "GSP lockdown release timeout\n");
+        })?;
+
+        if mbox.mbox0 != 0 {
+            dev_err!(dev, "GSP-FMC boot failed (mbox: {:#x})\n", mbox.mbox0);
+            return Err(EIO);
+        }
+
+        dev_dbg!(dev, "GSP lockdown released\n");
+        Ok(())
+    }
+
     /// Attempt to boot the GSP.
     ///
     /// This is a GPU-dependent and complex procedure that involves loading firmware files from
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index 91911f9b32ca..8e4922399569 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -321,6 +321,7 @@ pub(crate) fn vga_workspace_addr(self) -> Option<u64> {
 register!(NV_PFALCON_FALCON_HWCFG2 @ PFalconBase[0x000000f4] {
     10:10   riscv as bool;
     12:12   mem_scrubbing as bool, "Set to 0 after memory scrubbing is completed";
+    13:13   riscv_br_priv_lockdown as bool, "RISC-V branch privilege lockdown bit";
     31:31   reset_ready as bool, "Signal indicating that reset is completed (GA102+)";
 });
 
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v5 35/38] gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (33 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 34/38] gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 36/38] gpu: nova-core: Hopper/Blackwell: integrate FSP boot path into boot() John Hubbard
                   ` (2 subsequent siblings)
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Hopper and Blackwell GPUs use a different PCI config space mirror
address (0x092000) compared to older architectures (0x088000). Update
SetSystemInfo to accept a chipset parameter and select the correct
address based on architecture.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
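For illustration only, the per-architecture selection can be sketched as
standalone userspace Rust. This is not the driver code: the `Architecture`
enum is a local stand-in for `crate::gpu::Architecture`, and
`pci_config_mirror_base` is a hypothetical helper name. The two addresses are
the ones used in the diff below.

```rust
/// Local stand-in for `crate::gpu::Architecture` (assumption: these variants only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Architecture {
    Turing,
    Ampere,
    Ada,
    Hopper,
    Blackwell,
}

/// Hypothetical helper: returns the PCI config space mirror base for `arch`.
/// The values mirror the `match` added to `GspSetSystemInfo::init()`.
const fn pci_config_mirror_base(arch: Architecture) -> u32 {
    match arch {
        // Pre-Hopper architectures keep the existing mirror location.
        Architecture::Turing | Architecture::Ampere | Architecture::Ada => 0x088000,
        // Hopper and Blackwell use the new location.
        Architecture::Hopper | Architecture::Blackwell => 0x092000,
    }
}
```

Because the match has no wildcard arm, adding a new `Architecture` variant
fails to compile until its mirror address has been decided.
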
 drivers/gpu/nova-core/gsp/boot.rs        |  2 +-
 drivers/gpu/nova-core/gsp/commands.rs    |  8 +++++---
 drivers/gpu/nova-core/gsp/fw/commands.rs | 18 +++++++++++++++---
 3 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
index 5f3207bf7797..0db2c58e0765 100644
--- a/drivers/gpu/nova-core/gsp/boot.rs
+++ b/drivers/gpu/nova-core/gsp/boot.rs
@@ -248,7 +248,7 @@ pub(crate) fn boot(
         dma_write!(wpr_meta[0] = GspFwWprMeta::new(&gsp_fw, &fb_layout))?;
 
         self.cmdq
-            .send_command(bar, commands::SetSystemInfo::new(pdev))?;
+            .send_command(bar, commands::SetSystemInfo::new(pdev, chipset))?;
         self.cmdq.send_command(bar, commands::SetRegistry::new())?;
 
         gsp_falcon.reset(bar)?;
diff --git a/drivers/gpu/nova-core/gsp/commands.rs b/drivers/gpu/nova-core/gsp/commands.rs
index 8f270eca33be..e6a9a1fc6296 100644
--- a/drivers/gpu/nova-core/gsp/commands.rs
+++ b/drivers/gpu/nova-core/gsp/commands.rs
@@ -20,6 +20,7 @@
 
 use crate::{
     driver::Bar0,
+    gpu::Chipset,
     gsp::{
         cmdq::{
             Cmdq,
@@ -37,12 +38,13 @@
 /// The `GspSetSystemInfo` command.
 pub(crate) struct SetSystemInfo<'a> {
     pdev: &'a pci::Device<device::Bound>,
+    chipset: Chipset,
 }
 
 impl<'a> SetSystemInfo<'a> {
     /// Creates a new `GspSetSystemInfo` command using the parameters of `pdev`.
-    pub(crate) fn new(pdev: &'a pci::Device<device::Bound>) -> Self {
-        Self { pdev }
+    pub(crate) fn new(pdev: &'a pci::Device<device::Bound>, chipset: Chipset) -> Self {
+        Self { pdev, chipset }
     }
 }
 
@@ -52,7 +54,7 @@ impl<'a> CommandToGsp for SetSystemInfo<'a> {
     type InitError = Error;
 
     fn init(&self) -> impl Init<Self::Command, Self::InitError> {
-        GspSetSystemInfo::init(self.pdev)
+        GspSetSystemInfo::init(self.pdev, self.chipset)
     }
 }
 
diff --git a/drivers/gpu/nova-core/gsp/fw/commands.rs b/drivers/gpu/nova-core/gsp/fw/commands.rs
index 470d8edb62ff..fe8f56ba3e80 100644
--- a/drivers/gpu/nova-core/gsp/fw/commands.rs
+++ b/drivers/gpu/nova-core/gsp/fw/commands.rs
@@ -10,7 +10,13 @@
     }, //
 };
 
-use crate::gsp::GSP_PAGE_SIZE;
+use crate::{
+    gpu::{
+        Architecture,
+        Chipset, //
+    },
+    gsp::GSP_PAGE_SIZE, //
+};
 
 use super::bindings;
 
@@ -24,7 +30,10 @@ pub(crate) struct GspSetSystemInfo {
 impl GspSetSystemInfo {
     /// Returns an in-place initializer for the `GspSetSystemInfo` command.
     #[allow(non_snake_case)]
-    pub(crate) fn init<'a>(dev: &'a pci::Device<device::Bound>) -> impl Init<Self, Error> + 'a {
+    pub(crate) fn init<'a>(
+        dev: &'a pci::Device<device::Bound>,
+        chipset: Chipset,
+    ) -> impl Init<Self, Error> + 'a {
         type InnerGspSystemInfo = bindings::GspSystemInfo;
         let init_inner = try_init!(InnerGspSystemInfo {
             gpuPhysAddr: dev.resource_start(0)?,
@@ -35,7 +44,10 @@ pub(crate) fn init<'a>(dev: &'a pci::Device<device::Bound>) -> impl Init<Self, E
             // Using TASK_SIZE in r535_gsp_rpc_set_system_info() seems wrong because
             // TASK_SIZE is per-task. That's probably a design issue in GSP-RM though.
             maxUserVa: (1 << 47) - 4096,
-            pciConfigMirrorBase: 0x088000,
+            pciConfigMirrorBase: match chipset.arch() {
+                Architecture::Turing | Architecture::Ampere | Architecture::Ada => 0x088000,
+                Architecture::Hopper | Architecture::Blackwell => 0x092000,
+            },
             pciConfigMirrorSize: 0x001000,
 
             PCIDeviceID: (u32::from(dev.device_id()) << 16) | u32::from(dev.vendor_id().as_raw()),
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v5 36/38] gpu: nova-core: Hopper/Blackwell: integrate FSP boot path into boot()
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (34 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 35/38] gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 37/38] rust: sizes: add u64 variants of SZ_* constants John Hubbard
  2026-02-21  2:09 ` [PATCH v5 38/38] gpu: nova-core: use SZ_*_U64 constants from kernel::sizes John Hubbard
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Add the FSP boot path for Hopper and Blackwell GPUs. These architectures
use FSP with FMC firmware for Chain of Trust boot, rather than SEC2.

boot() now dispatches to boot_via_sec2() or boot_via_fsp() based on
architecture. The SEC2 path keeps its original command ordering. The
FSP path sends SetSystemInfo/SetRegistry after GSP becomes active.
The GSP sequencer only runs for SEC2-based architectures.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
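For illustration only, the resulting ordering in boot() can be modelled as a
list of steps. This is a simplified userspace sketch under stated
assumptions: `Architecture` and `Step` are local stand-ins, `boot_steps` is a
hypothetical name, and firmware loading, the WPR metadata setup and
write_os_version() are left out.

```rust
/// Local stand-in for `crate::gpu::Architecture`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Architecture {
    Turing,
    Ampere,
    Ada,
    Hopper,
    Blackwell,
}

/// Hypothetical labels for the phases of `boot()` that differ between paths.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Step {
    SendSystemInfoAndRegistry,
    BootViaSec2,
    BootViaFsp,
    WaitRiscvActive,
    RunSequencer,
    WaitGspInitDone,
}

/// Returns the order of the modelled steps for `arch`.
fn boot_steps(arch: Architecture) -> Vec<Step> {
    let uses_sec2 = matches!(
        arch,
        Architecture::Turing | Architecture::Ampere | Architecture::Ada
    );
    let mut steps = Vec::new();

    if uses_sec2 {
        // SEC2 path: commands are queued before the GSP reset/boot.
        steps.push(Step::SendSystemInfoAndRegistry);
        steps.push(Step::BootViaSec2);
    } else {
        steps.push(Step::BootViaFsp);
    }

    steps.push(Step::WaitRiscvActive);

    if uses_sec2 {
        // Only SEC2-based architectures run the GSP sequencer.
        steps.push(Step::RunSequencer);
    } else {
        // FSP path: commands are sent once the GSP is active.
        steps.push(Step::SendSystemInfoAndRegistry);
    }

    steps.push(Step::WaitGspInitDone);
    steps
}
```

The only differences between the two paths in this model are where the
SetSystemInfo/SetRegistry step lands and whether the sequencer step appears.
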
 drivers/gpu/nova-core/firmware/fsp.rs |   2 -
 drivers/gpu/nova-core/fsp.rs          |   5 -
 drivers/gpu/nova-core/gsp/boot.rs     | 190 +++++++++++++++++++-------
 3 files changed, 144 insertions(+), 53 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware/fsp.rs b/drivers/gpu/nova-core/firmware/fsp.rs
index bb35f363b998..0e72f1378ef0 100644
--- a/drivers/gpu/nova-core/firmware/fsp.rs
+++ b/drivers/gpu/nova-core/firmware/fsp.rs
@@ -13,7 +13,6 @@
     gpu::Chipset, //
 };
 
-#[expect(dead_code)]
 pub(crate) struct FspFirmware {
     /// FMC firmware image data (only the "image" ELF section).
     pub(crate) fmc_image: DmaObject,
@@ -22,7 +21,6 @@ pub(crate) struct FspFirmware {
 }
 
 impl FspFirmware {
-    #[expect(dead_code)]
     pub(crate) fn new(
         dev: &device::Device<device::Bound>,
         chipset: Chipset,
diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
index c66ad0a102a6..3749b5e3a677 100644
--- a/drivers/gpu/nova-core/fsp.rs
+++ b/drivers/gpu/nova-core/fsp.rs
@@ -238,7 +238,6 @@ pub(crate) struct FmcBootArgs<'a> {
 impl<'a> FmcBootArgs<'a> {
     /// Build FMC boot arguments, allocating the DMA-coherent boot parameter
     /// structure that FSP will read.
-    #[expect(dead_code)]
     #[allow(clippy::too_many_arguments)]
     pub(crate) fn new(
         dev: &device::Device<device::Bound>,
@@ -287,7 +286,6 @@ pub(crate) fn new(
 
     /// DMA address of the FMC boot parameters, needed after boot for lockdown
     /// release polling.
-    #[expect(dead_code)]
     pub(crate) fn boot_params_dma_handle(&self) -> u64 {
         self.fmc_boot_params.dma_handle()
     }
@@ -301,7 +299,6 @@ impl Fsp {
     ///
     /// Polls the thermal scratch register until FSP signals boot completion
     /// or timeout occurs.
-    #[expect(dead_code)]
     pub(crate) fn wait_secure_boot(
         dev: &device::Device<device::Bound>,
         bar: &crate::driver::Bar0,
@@ -331,7 +328,6 @@ pub(crate) fn wait_secure_boot(
     ///
     /// Extracts real cryptographic signatures from FMC ELF32 firmware sections.
     /// Returns signatures in a heap-allocated structure to prevent stack overflow.
-    #[expect(dead_code)]
     pub(crate) fn extract_fmc_signatures(
         dev: &device::Device<device::Bound>,
         fmc_fw_data: &[u8],
@@ -391,7 +387,6 @@ pub(crate) fn extract_fmc_signatures(
     ///
     /// Builds the COT message from the pre-configured [`FmcBootArgs`], sends it
     /// to FSP, and waits for the response.
-    #[expect(dead_code)]
     pub(crate) fn boot_fmc(
         dev: &device::Device<device::Bound>,
         bar: &crate::driver::Bar0,
diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
index 0db2c58e0765..1fdcb72ce163 100644
--- a/drivers/gpu/nova-core/gsp/boot.rs
+++ b/drivers/gpu/nova-core/gsp/boot.rs
@@ -13,6 +13,7 @@
 use crate::{
     driver::Bar0,
     falcon::{
+        fsp::Fsp as FspEngine,
         gsp::Gsp,
         sec2::Sec2,
         Falcon,
@@ -24,6 +25,7 @@
             BooterFirmware,
             BooterKind, //
         },
+        fsp::FspFirmware,
         fwsec::{
             FwsecCommand,
             FwsecFirmware, //
@@ -31,9 +33,17 @@
         gsp::GspFirmware,
         FIRMWARE_VERSION, //
     },
-    gpu::Chipset,
+    fsp::{
+        FmcBootArgs,
+        Fsp, //
+    },
+    gpu::{
+        Architecture,
+        Chipset, //
+    },
     gsp::{
         commands,
+        fw::LibosMemoryRegionInitArgument,
         sequencer::{
             GspSequencer,
             GspSequencerParams, //
@@ -188,8 +198,83 @@ fn run_booter(
         booter.run(dev, bar, sec2_falcon, wpr_meta)
     }
 
+    /// Boot GSP via SEC2 booter firmware (Turing/Ampere/Ada path).
+    ///
+    /// This path uses FWSEC-FRTS to set up WPR2, then boots GSP directly,
+    /// then uses SEC2 to run the booter firmware.
+    #[allow(clippy::too_many_arguments)]
+    fn boot_via_sec2(
+        dev: &device::Device<device::Bound>,
+        bar: &Bar0,
+        chipset: Chipset,
+        gsp_falcon: &Falcon<Gsp>,
+        sec2_falcon: &Falcon<Sec2>,
+        fb_layout: &FbLayout,
+        libos: &CoherentAllocation<LibosMemoryRegionInitArgument>,
+        wpr_meta: &CoherentAllocation<GspFwWprMeta>,
+    ) -> Result {
+        // Run FWSEC-FRTS to set up the WPR2 region
+        let bios = Vbios::new(dev, bar)?;
+        Self::run_fwsec_frts(dev, gsp_falcon, bar, &bios, fb_layout)?;
+
+        // Reset and boot GSP before SEC2
+        gsp_falcon.reset(bar)?;
+        let libos_handle = libos.dma_handle();
+        let (mbox0, mbox1) = gsp_falcon.boot(
+            bar,
+            Some(libos_handle as u32),
+            Some((libos_handle >> 32) as u32),
+        )?;
+        dev_dbg!(dev, "GSP MBOX0: {:#x}, MBOX1: {:#x}\n", mbox0, mbox1);
+        dev_dbg!(
+            dev,
+            "Using SEC2 to load and run the booter_load firmware...\n"
+        );
+
+        // Run booter via SEC2
+        Self::run_booter(dev, bar, chipset, sec2_falcon, wpr_meta)
+    }
+
+    /// Boot GSP via FSP Chain of Trust (Hopper/Blackwell+ path).
+    ///
+    /// This path uses FSP to establish a chain of trust and boot GSP-FMC. FSP handles
+    /// the GSP boot internally - no manual GSP reset/boot is needed.
+    fn boot_via_fsp(
+        dev: &device::Device<device::Bound>,
+        bar: &Bar0,
+        chipset: Chipset,
+        gsp_falcon: &Falcon<Gsp>,
+        wpr_meta: &CoherentAllocation<GspFwWprMeta>,
+        libos: &CoherentAllocation<LibosMemoryRegionInitArgument>,
+    ) -> Result {
+        let fsp_falcon = Falcon::<FspEngine>::new(dev, chipset)?;
+
+        Fsp::wait_secure_boot(dev, bar, chipset.arch())?;
+
+        let fsp_fw = FspFirmware::new(dev, chipset, FIRMWARE_VERSION)?;
+
+        let signatures = Fsp::extract_fmc_signatures(dev, &fsp_fw.fmc_full)?;
+
+        let args = FmcBootArgs::new(
+            dev,
+            chipset,
+            &fsp_fw.fmc_image,
+            wpr_meta.dma_handle(),
+            core::mem::size_of::<GspFwWprMeta>() as u32,
+            libos.dma_handle(),
+            false,
+            &signatures,
+        )?;
+
+        Fsp::boot_fmc(dev, bar, &fsp_falcon, &args)?;
+
+        let fmc_boot_params_addr = args.boot_params_dma_handle();
+        Self::wait_for_gsp_lockdown_release(dev, bar, gsp_falcon, fmc_boot_params_addr)?;
+
+        Ok(())
+    }
+
     /// Wait for GSP lockdown to be released after FSP Chain of Trust.
-    #[expect(dead_code)]
     fn wait_for_gsp_lockdown_release(
         dev: &device::Device<device::Bound>,
         bar: &Bar0,
@@ -233,45 +318,49 @@ pub(crate) fn boot(
         sec2_falcon: &Falcon<Sec2>,
     ) -> Result {
         let dev = pdev.as_ref();
-
-        let bios = Vbios::new(dev, bar)?;
+        let uses_sec2 = matches!(
+            chipset.arch(),
+            Architecture::Turing | Architecture::Ampere | Architecture::Ada
+        );
 
         let gsp_fw = KBox::pin_init(GspFirmware::new(dev, chipset, FIRMWARE_VERSION), GFP_KERNEL)?;
 
         let fb_layout = FbLayout::new(chipset, bar, &gsp_fw)?;
         dev_dbg!(dev, "{:#x?}\n", fb_layout);
 
-        Self::run_fwsec_frts(dev, gsp_falcon, bar, &bios, &fb_layout)?;
-
         let wpr_meta =
             CoherentAllocation::<GspFwWprMeta>::alloc_coherent(dev, 1, GFP_KERNEL | __GFP_ZERO)?;
         dma_write!(wpr_meta[0] = GspFwWprMeta::new(&gsp_fw, &fb_layout))?;
 
-        self.cmdq
-            .send_command(bar, commands::SetSystemInfo::new(pdev, chipset))?;
-        self.cmdq.send_command(bar, commands::SetRegistry::new())?;
-
-        gsp_falcon.reset(bar)?;
-        let libos_handle = self.libos.dma_handle();
-        let (mbox0, mbox1) = gsp_falcon.boot(
-            bar,
-            Some(libos_handle as u32),
-            Some((libos_handle >> 32) as u32),
-        )?;
-        dev_dbg!(
-            pdev,
-            "GSP MBOX0: {:#x}, MBOX1: {:#x}\n",
-            mbox0,
-            mbox1
-        );
-
-        dev_dbg!(
-            pdev,
-            "Using SEC2 to load and run the booter_load firmware...\n"
-        );
+        // Architecture-specific boot path
+        if uses_sec2 {
+            // SEC2 path: send commands before GSP reset/boot (original order).
+            self.cmdq
+                .send_command(bar, commands::SetSystemInfo::new(pdev, chipset))?;
+            self.cmdq.send_command(bar, commands::SetRegistry::new())?;
 
-        Self::run_booter(dev, bar, chipset, sec2_falcon, &wpr_meta)?;
+            Self::boot_via_sec2(
+                dev,
+                bar,
+                chipset,
+                gsp_falcon,
+                sec2_falcon,
+                &fb_layout,
+                &self.libos,
+                &wpr_meta,
+            )?;
+        } else {
+            Self::boot_via_fsp(
+                dev,
+                bar,
+                chipset,
+                gsp_falcon,
+                &wpr_meta,
+                &self.libos,
+            )?;
+        }
 
+        // Common post-boot initialization
         gsp_falcon.write_os_version(bar, gsp_fw.bootloader.app_version);
 
         // Poll for RISC-V to become active before running sequencer
@@ -282,22 +371,31 @@ pub(crate) fn boot(
             Delta::from_secs(5),
         )?;
 
-        dev_dbg!(
-            pdev,
-            "RISC-V active? {}\n",
-            gsp_falcon.is_riscv_active(bar),
-        );
+        dev_dbg!(dev, "RISC-V active? {}\n", gsp_falcon.is_riscv_active(bar));
 
-        // Create and run the GSP sequencer.
-        let seq_params = GspSequencerParams {
-            bootloader_app_version: gsp_fw.bootloader.app_version,
-            libos_dma_handle: libos_handle,
-            gsp_falcon,
-            sec2_falcon,
-            dev: pdev.as_ref().into(),
-            bar,
-        };
-        GspSequencer::run(&mut self.cmdq, seq_params)?;
+        // For FSP path, send commands after GSP becomes active.
+        if matches!(
+            chipset.arch(),
+            Architecture::Hopper | Architecture::Blackwell
+        ) {
+            self.cmdq
+                .send_command(bar, commands::SetSystemInfo::new(pdev, chipset))?;
+            self.cmdq.send_command(bar, commands::SetRegistry::new())?;
+        }
+
+        // SEC2-based architectures need to run the GSP sequencer
+        if uses_sec2 {
+            let libos_handle = self.libos.dma_handle();
+            let seq_params = GspSequencerParams {
+                bootloader_app_version: gsp_fw.bootloader.app_version,
+                libos_dma_handle: libos_handle,
+                gsp_falcon,
+                sec2_falcon,
+                dev: dev.into(),
+                bar,
+            };
+            GspSequencer::run(&mut self.cmdq, seq_params)?;
+        }
 
         // Wait until GSP is fully initialized.
         commands::wait_gsp_init_done(&mut self.cmdq)?;
@@ -305,8 +403,8 @@ pub(crate) fn boot(
         // Obtain and display basic GPU information.
         let info = commands::get_gsp_info(&mut self.cmdq, bar)?;
         match info.gpu_name() {
-            Ok(name) => dev_info!(pdev, "GPU name: {}\n", name),
-            Err(e) => dev_warn!(pdev, "GPU name unavailable: {:?}\n", e),
+            Ok(name) => dev_info!(dev, "GPU name: {}\n", name),
+            Err(e) => dev_warn!(dev, "GPU name unavailable: {:?}\n", e),
         }
 
         Ok(())
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v5 37/38] rust: sizes: add u64 variants of SZ_* constants
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (35 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 36/38] gpu: nova-core: Hopper/Blackwell: integrate FSP boot path into boot() John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  2026-02-21  2:09 ` [PATCH v5 38/38] gpu: nova-core: use SZ_*_U64 constants from kernel::sizes John Hubbard
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Drivers that operate on 64-bit address spaces (GPU framebuffer layouts,
DMA regions, etc.) frequently need these size constants as a u64 type.
Today this requires repeated usize-to-u64 conversion calls like
usize_as_u64(SZ_1M) or u64::from_safe_cast(SZ_1M), which adds
boilerplate without any safety benefit.

Add u64-typed constants (SZ_1K_U64 through SZ_2G_U64) alongside the
existing usize constants. Every value fits in u64 (actually, within a
u32 for that matter), so the as-cast is always lossless.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
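For illustration only, here is a standalone sketch of the pattern. It is
userspace Rust, not the kernel crate: the `usize` constants are local
stand-ins assumed to have the same values as `kernel::sizes::SZ_1M` and
`SZ_2G`, and `size_in_mib` is a hypothetical helper.

```rust
// Local stand-ins for `kernel::sizes::SZ_1M` and `SZ_2G` (assumed values).
const SZ_1M: usize = 0x0010_0000;
const SZ_2G: usize = 0x8000_0000;

// `u64` variants, defined the same way as in the diff below.
const SZ_1M_U64: u64 = SZ_1M as u64;
const SZ_2G_U64: u64 = SZ_2G as u64;

// Compile-time check that the largest constant fits in a `u32`, so the
// widening cast cannot lose bits.
const _: () = assert!(SZ_2G_U64 <= u32::MAX as u64);

/// Hypothetical helper: number of whole MiB in `bytes`, with no conversion
/// call at the use site.
fn size_in_mib(bytes: u64) -> u64 {
    bytes / SZ_1M_U64
}
```

Call sites that do 64-bit arithmetic can then use the constant directly
instead of wrapping the `usize` constant in a conversion helper.
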
 rust/kernel/sizes.rs | 51 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/rust/kernel/sizes.rs b/rust/kernel/sizes.rs
index 661e680d9330..a11c134be64e 100644
--- a/rust/kernel/sizes.rs
+++ b/rust/kernel/sizes.rs
@@ -48,3 +48,54 @@
 pub const SZ_1G: usize = bindings::SZ_1G as usize;
 /// 0x80000000
 pub const SZ_2G: usize = bindings::SZ_2G as usize;
+
+// `u64` variants of the size constants. These are the same values as the
+// `usize` constants above, but typed as `u64` to avoid repeated conversion
+// boilerplate in code that operates on 64-bit address spaces.
+//
+// CAST: every SZ_* value below fits in u64, so `as u64` is always lossless.
+
+/// [`SZ_1K`] as a [`u64`].
+pub const SZ_1K_U64: u64 = SZ_1K as u64;
+/// [`SZ_2K`] as a [`u64`].
+pub const SZ_2K_U64: u64 = SZ_2K as u64;
+/// [`SZ_4K`] as a [`u64`].
+pub const SZ_4K_U64: u64 = SZ_4K as u64;
+/// [`SZ_8K`] as a [`u64`].
+pub const SZ_8K_U64: u64 = SZ_8K as u64;
+/// [`SZ_16K`] as a [`u64`].
+pub const SZ_16K_U64: u64 = SZ_16K as u64;
+/// [`SZ_32K`] as a [`u64`].
+pub const SZ_32K_U64: u64 = SZ_32K as u64;
+/// [`SZ_64K`] as a [`u64`].
+pub const SZ_64K_U64: u64 = SZ_64K as u64;
+/// [`SZ_128K`] as a [`u64`].
+pub const SZ_128K_U64: u64 = SZ_128K as u64;
+/// [`SZ_256K`] as a [`u64`].
+pub const SZ_256K_U64: u64 = SZ_256K as u64;
+/// [`SZ_512K`] as a [`u64`].
+pub const SZ_512K_U64: u64 = SZ_512K as u64;
+/// [`SZ_1M`] as a [`u64`].
+pub const SZ_1M_U64: u64 = SZ_1M as u64;
+/// [`SZ_2M`] as a [`u64`].
+pub const SZ_2M_U64: u64 = SZ_2M as u64;
+/// [`SZ_4M`] as a [`u64`].
+pub const SZ_4M_U64: u64 = SZ_4M as u64;
+/// [`SZ_8M`] as a [`u64`].
+pub const SZ_8M_U64: u64 = SZ_8M as u64;
+/// [`SZ_16M`] as a [`u64`].
+pub const SZ_16M_U64: u64 = SZ_16M as u64;
+/// [`SZ_32M`] as a [`u64`].
+pub const SZ_32M_U64: u64 = SZ_32M as u64;
+/// [`SZ_64M`] as a [`u64`].
+pub const SZ_64M_U64: u64 = SZ_64M as u64;
+/// [`SZ_128M`] as a [`u64`].
+pub const SZ_128M_U64: u64 = SZ_128M as u64;
+/// [`SZ_256M`] as a [`u64`].
+pub const SZ_256M_U64: u64 = SZ_256M as u64;
+/// [`SZ_512M`] as a [`u64`].
+pub const SZ_512M_U64: u64 = SZ_512M as u64;
+/// [`SZ_1G`] as a [`u64`].
+pub const SZ_1G_U64: u64 = SZ_1G as u64;
+/// [`SZ_2G`] as a [`u64`].
+pub const SZ_2G_U64: u64 = SZ_2G as u64;
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v5 38/38] gpu: nova-core: use SZ_*_U64 constants from kernel::sizes
  2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (36 preceding siblings ...)
  2026-02-21  2:09 ` [PATCH v5 37/38] rust: sizes: add u64 variants of SZ_* constants John Hubbard
@ 2026-02-21  2:09 ` John Hubbard
  37 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-21  2:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Replace manual usize_as_u64(SZ_*) and u64::from_safe_cast(SZ_*)
conversions with the new SZ_*_U64 constants throughout fb.rs, gsp/fw.rs,
and regs.rs. This removes the conversion boilerplate and the now-unused
usize_as_u64 import in fb.rs.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
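For illustration only, this is roughly what a converted call site looks like.
It is a standalone userspace sketch loosely modelled on the KiB/MiB branch in
the FbRange formatter: the constants are local stand-ins for the
`kernel::sizes` ones, and `format_size` is a hypothetical helper that returns
a `String` instead of writing to a formatter.

```rust
// Local stand-ins for `kernel::sizes::SZ_1K_U64` and `SZ_1M_U64`.
const SZ_1K_U64: u64 = 1 << 10;
const SZ_1M_U64: u64 = 1 << 20;

/// Hypothetical helper: renders `size` in KiB below 1 MiB, otherwise in MiB.
/// Both branches use integer division, so the result is rounded down.
fn format_size(size: u64) -> String {
    if size < SZ_1M_U64 {
        format!("{} KiB", size / SZ_1K_U64)
    } else {
        format!("{} MiB", size / SZ_1M_U64)
    }
}
```

The comparison and both divisions stay in `u64` throughout, which is what the
conversion in this patch achieves in the driver.
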
 drivers/gpu/nova-core/fb.rs     | 19 ++++++++-----------
 drivers/gpu/nova-core/gsp/fw.rs | 23 ++++++++++-------------
 drivers/gpu/nova-core/regs.rs   |  6 +++---
 3 files changed, 21 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/nova-core/fb.rs b/drivers/gpu/nova-core/fb.rs
index 08e6dd815352..ab52a82e21a4 100644
--- a/drivers/gpu/nova-core/fb.rs
+++ b/drivers/gpu/nova-core/fb.rs
@@ -24,10 +24,7 @@
     firmware::gsp::GspFirmware,
     gpu::Chipset,
     gsp,
-    num::{
-        usize_as_u64,
-        FromSafeCast, //
-    },
+    num::FromSafeCast,
     regs,
 };
 
@@ -105,7 +102,7 @@ pub(crate) fn calc_non_wpr_heap_size(chipset: Chipset) -> u64 {
     hal::fb_hal(chipset)
         .non_wpr_heap_size()
         .map(u64::from)
-        .unwrap_or(usize_as_u64(SZ_1M))
+        .unwrap_or(SZ_1M_U64)
 }
 
 pub(crate) struct FbRange(Range<u64>);
@@ -136,8 +133,8 @@ fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
         if f.alternate() {
             let size = self.len();
 
-            if size < usize_as_u64(SZ_1M) {
-                let size_kib = size / usize_as_u64(SZ_1K);
+            if size < SZ_1M_U64 {
+                let size_kib = size / SZ_1K_U64;
                 f.write_fmt(fmt!(
                     "{:#x}..{:#x} ({} KiB)",
                     self.0.start,
@@ -145,7 +142,7 @@ fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
                     size_kib
                 ))
             } else {
-                let size_mib = size / usize_as_u64(SZ_1M);
+                let size_mib = size / SZ_1M_U64;
                 f.write_fmt(fmt!(
                     "{:#x}..{:#x} ({} MiB)",
                     self.0.start,
@@ -195,14 +192,14 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<
 
         let vga_workspace = {
             let vga_base = {
-                const NV_PRAMIN_SIZE: u64 = usize_as_u64(SZ_1M);
+                const NV_PRAMIN_SIZE: u64 = SZ_1M_U64;
                 let base = fb.end - NV_PRAMIN_SIZE;
 
                 if hal.supports_display(bar) {
                     match regs::NV_PDISP_VGA_WORKSPACE_BASE::read(bar).vga_workspace_addr() {
                         Some(addr) => {
                             if addr < base {
-                                const VBIOS_WORKSPACE_SIZE: u64 = usize_as_u64(SZ_128K);
+                                const VBIOS_WORKSPACE_SIZE: u64 = SZ_128K_U64;
 
                                 // Point workspace address to end of framebuffer.
                                 fb.end - VBIOS_WORKSPACE_SIZE
@@ -222,7 +219,7 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<
 
         let frts = {
             const FRTS_DOWN_ALIGN: Alignment = Alignment::new::<SZ_128K>();
-            const FRTS_SIZE: u64 = usize_as_u64(SZ_1M);
+            const FRTS_SIZE: u64 = SZ_1M_U64;
             let frts_base = vga_workspace.start.align_down(FRTS_DOWN_ALIGN) - FRTS_SIZE;
 
             FbRange(frts_base..frts_base + FRTS_SIZE)
diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index 7fa9d3b1a592..6ab0586d5e85 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -16,10 +16,7 @@
         Alignable,
         Alignment, //
     },
-    sizes::{
-        SZ_128K,
-        SZ_1M, //
-    },
+    sizes::*,
     transmute::{
         AsBytes,
         FromBytes, //
@@ -53,9 +50,9 @@ enum GspFwHeapParams {}
 // See Open RM: kgspCalculateGspFwHeapSize and related functions.
 //
 // 14MB for Hopper/Blackwell+.
-const GSP_FW_HEAP_PARAM_BASE_RM_SIZE_GH100: u64 = 14 * num::usize_as_u64(SZ_1M);
+const GSP_FW_HEAP_PARAM_BASE_RM_SIZE_GH100: u64 = 14 * SZ_1M_U64;
 // 142MB client alloc for ~188MB total.
-const GSP_FW_HEAP_PARAM_CLIENT_ALLOC_SIZE_GH100: u64 = 142 * num::usize_as_u64(SZ_1M);
+const GSP_FW_HEAP_PARAM_CLIENT_ALLOC_SIZE_GH100: u64 = 142 * SZ_1M_U64;
 // Hopper/Blackwell+ minimum heap size: 170MB (88 + 12 + 70).
 // See Open RM: GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS3_BAREMETAL_MIN_MB for the base 88MB,
 // plus Hopper+ additions in kgspCalculateGspFwHeapSize_GH100.
@@ -89,7 +86,7 @@ fn client_alloc_size(chipset: Chipset) -> Result<u64> {
     /// Returns the amount of memory to reserve for management purposes for a framebuffer of size
     /// `fb_size`.
     fn management_overhead(fb_size: u64) -> Result<u64> {
-        let fb_size_gb = fb_size.div_ceil(u64::from_safe_cast(kernel::sizes::SZ_1G));
+        let fb_size_gb = fb_size.div_ceil(SZ_1G_U64);
 
         u64::from(bindings::GSP_FW_HEAP_PARAM_SIZE_PER_GB_FB)
             .saturating_mul(fb_size_gb)
@@ -111,9 +108,9 @@ impl LibosParams {
     const LIBOS2: LibosParams = LibosParams {
         carveout_size: num::u32_as_u64(bindings::GSP_FW_HEAP_PARAM_OS_SIZE_LIBOS2),
         allowed_heap_size: num::u32_as_u64(bindings::GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS2_MIN_MB)
-            * num::usize_as_u64(SZ_1M)
+            * SZ_1M_U64
             ..num::u32_as_u64(bindings::GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS2_MAX_MB)
-                * num::usize_as_u64(SZ_1M),
+                * SZ_1M_U64,
     };
 
     /// Version 3 of the GSP LIBOS (GA102+)
@@ -121,9 +118,9 @@ impl LibosParams {
         carveout_size: num::u32_as_u64(bindings::GSP_FW_HEAP_PARAM_OS_SIZE_LIBOS3_BAREMETAL),
         allowed_heap_size: num::u32_as_u64(
             bindings::GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS3_BAREMETAL_MIN_MB,
-        ) * num::usize_as_u64(SZ_1M)
+        ) * SZ_1M_U64
             ..num::u32_as_u64(bindings::GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS3_BAREMETAL_MAX_MB)
-                * num::usize_as_u64(SZ_1M),
+                * SZ_1M_U64,
     };
 
     /// Hopper/Blackwell+ GPUs need a larger minimum heap size than the bindings specify.
@@ -132,9 +129,9 @@ impl LibosParams {
     const LIBOS_HOPPER: LibosParams = LibosParams {
         carveout_size: num::u32_as_u64(bindings::GSP_FW_HEAP_PARAM_OS_SIZE_LIBOS3_BAREMETAL),
         allowed_heap_size: GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS3_BAREMETAL_MIN_MB_HOPPER
-            * num::usize_as_u64(SZ_1M)
+            * SZ_1M_U64
             ..num::u32_as_u64(bindings::GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS3_BAREMETAL_MAX_MB)
-                * num::usize_as_u64(SZ_1M),
+                * SZ_1M_U64,
     };
 
     /// Returns the libos parameters corresponding to `chipset`.
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index 8e4922399569..7b075ddd3ccf 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -10,6 +10,7 @@
 use kernel::{
     io::Io,
     prelude::*,
+    sizes::*,
     time, //
 };
 
@@ -33,7 +34,6 @@
         Architecture,
         Chipset, //
     },
-    num::FromSafeCast,
 };
 
 // PMC
@@ -166,7 +166,7 @@ impl NV_PFB_PRI_MMU_LOCAL_MEMORY_RANGE {
     /// Returns the usable framebuffer size, in bytes.
     pub(crate) fn usable_fb_size(self) -> u64 {
         let size = (u64::from(self.lower_mag()) << u64::from(self.lower_scale()))
-            * u64::from_safe_cast(kernel::sizes::SZ_1M);
+            * SZ_1M_U64;
 
         if self.ecc_mode_enabled() {
             // Remove the amount of memory reserved for ECC (one per 16 units).
@@ -255,7 +255,7 @@ pub(crate) fn completed(self) -> bool {
 impl NV_USABLE_FB_SIZE_IN_MB {
     /// Returns the usable framebuffer size, in bytes.
     pub(crate) fn usable_fb_size(self) -> u64 {
-        u64::from(self.value()) * u64::from_safe_cast(kernel::sizes::SZ_1M)
+        u64::from(self.value()) * SZ_1M_U64
     }
 }
 
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature
  2026-02-21  2:09 ` [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature John Hubbard
@ 2026-02-21 20:50   ` Miguel Ojeda
  2026-02-22 19:03     ` John Hubbard
  2026-02-22  7:46   ` Gary Guo
  2026-02-23 11:23   ` Alice Ryhl
  2 siblings, 1 reply; 68+ messages in thread
From: Miguel Ojeda @ 2026-02-21 20:50 UTC (permalink / raw)
  To: John Hubbard
  Cc: Danilo Krummrich, Alexandre Courbot, Joel Fernandes, Timur Tabi,
	Alistair Popple, Eliot Courtney, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, nouveau,
	rust-for-linux, LKML

On Sat, Feb 21, 2026 at 3:11 AM John Hubbard <jhubbard@nvidia.com> wrote:
>
> [1] https://lore.kernel.org/rust-for-linux/20260206171253.2704684-2-gary@kernel.org/

Link: https://lore.kernel.org/rust-for-linux/20260206171253.2704684-2-gary@kernel.org/
[1]

> +pub const fn const_align_up<const ALIGN: usize>(value: usize) -> usize {

Ah, I thought you wanted to put this in `drivers/gpu/nova-core/num.rs`
like in the previous version.

If it is here instead, then you shouldn't need the
`rust_allowed_features` change anymore, because we already enable
`inline_const` in the `kernel` crate.

> diff --git a/scripts/Makefile.build b/scripts/Makefile.build

Having said that, if you do end up needing it elsewhere, then please
add the other line added by Gary's patch, i.e.:

  +#   - Stable since Rust 1.79.0: `feature(inline_const)`.

Thanks!

Cheers,
Miguel


* Re: [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature
  2026-02-21  2:09 ` [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature John Hubbard
  2026-02-21 20:50   ` Miguel Ojeda
@ 2026-02-22  7:46   ` Gary Guo
  2026-02-22 19:04     ` John Hubbard
  2026-02-23 11:23   ` Alice Ryhl
  2 siblings, 1 reply; 68+ messages in thread
From: Gary Guo @ 2026-02-22  7:46 UTC (permalink / raw)
  To: John Hubbard
  Cc: Danilo Krummrich, Alexandre Courbot, Joel Fernandes, Timur Tabi,
	Alistair Popple, Eliot Courtney, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Alice Ryhl, Trevor Gross, nouveau, rust-for-linux, LKML

On 2026-02-21 02:09, John Hubbard wrote:
> Add const_align_up<ALIGN>() to kernel::ptr as the const-compatible
> equivalent of Alignable::align_up(). This uses inline_const to validate
> the alignment at compile time with a clear error message.
> 
> Add inline_const to rust_allowed_features in scripts/Makefile.build,
> following the approach in [1].
> 
> [1] https://lore.kernel.org/rust-for-linux/20260206171253.2704684-2-gary@kernel.org/
> 
> Suggested-by: Danilo Krummrich <dakr@kernel.org>
> Suggested-by: Miguel Ojeda <ojeda@kernel.org>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  rust/kernel/ptr.rs     | 27 +++++++++++++++++++++++++++
>  scripts/Makefile.build |  2 +-
>  2 files changed, 28 insertions(+), 1 deletion(-)
> 
> diff --git a/rust/kernel/ptr.rs b/rust/kernel/ptr.rs
> index 5b6a382637fe..b3509caa5ad7 100644
> --- a/rust/kernel/ptr.rs
> +++ b/rust/kernel/ptr.rs
> @@ -225,3 +225,30 @@ fn align_up(self, alignment: Alignment) -> Option<Self> {
>  }
>  
>  impl_alignable_uint!(u8, u16, u32, u64, usize);
> +
> +/// Aligns `value` up to `ALIGN` at compile time.
> +///
> +/// This is the const-compatible equivalent of [`Alignable::align_up`].
> +/// `ALIGN` must be a power of two (enforced at compile time).
> +///
> +/// Panics on overflow, which becomes a compile-time error when called in a
> +/// const context.
> +///
> +/// # Examples
> +///
> +/// ```
> +/// use kernel::ptr::const_align_up;
> +/// use kernel::sizes::SZ_4K;
> +///
> +/// assert_eq!(const_align_up::<16>(0x4f), 0x50);
> +/// assert_eq!(const_align_up::<16>(0x40), 0x40);
> +/// assert_eq!(const_align_up::<SZ_4K>(1), SZ_4K);
> +/// ```
> +#[inline(always)]
> +pub const fn const_align_up<const ALIGN: usize>(value: usize) -> usize {
> +    const { assert!(ALIGN.is_power_of_two(), "ALIGN must be a power of two") };
> +    match value.checked_add(ALIGN - 1) {
> +        Some(v) => v & !(ALIGN - 1),
> +        None => panic!("const_align_up: overflow"),

This is wrong. Either this function is always used in const context, in which case
you take `ALIGN` as normal function parameter and use `build_assert` and `build_error`,
or this function can be called from runtime and you shouldn't have a panic call here.
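
For illustration, the two alternatives above can be sketched in standalone
Rust. This is only a sketch under stated assumptions: `align_up_const` and
`align_up_checked` are hypothetical names, and plain `assert!`/`panic!` stand
in for the kernel's `build_assert!`/`build_error!`, which are not available
outside the kernel crate.

```rust
/// Const-context variant: `align` is an ordinary parameter. When the call is
/// evaluated in a `const` item, a failed assertion or an overflow stops the
/// build instead of panicking at runtime.
const fn align_up_const(value: usize, align: usize) -> usize {
    assert!(align.is_power_of_two(), "align must be a power of two");
    match value.checked_add(align - 1) {
        Some(v) => v & !(align - 1),
        None => panic!("align_up_const: overflow"),
    }
}

/// Runtime variant: no panic path; failures are reported to the caller.
fn align_up_checked(value: usize, align: usize) -> Option<usize> {
    if !align.is_power_of_two() {
        return None;
    }
    value.checked_add(align - 1).map(|v| v & !(align - 1))
}

// Evaluated entirely at compile time.
const ALIGNED: usize = align_up_const(0x4f, 16);

fn main() {
    assert_eq!(ALIGNED, 0x50);
    assert_eq!(align_up_checked(0x40, 16), Some(0x40));
    assert_eq!(align_up_checked(usize::MAX, 16), None);
}
```

Note that nothing in standalone Rust stops `align_up_const` from being called
at runtime, where it could still panic; in the kernel, `build_assert!` is
meant to turn that case into a build error.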

Best,
Gary

> +    }
> +}
> diff --git a/scripts/Makefile.build b/scripts/Makefile.build
> index 32e209bc7985..a58a7d079710 100644
> --- a/scripts/Makefile.build
> +++ b/scripts/Makefile.build
> @@ -319,7 +319,7 @@ $(obj)/%.lst: $(obj)/%.c FORCE
>  #
>  # Please see https://github.com/Rust-for-Linux/linux/issues/2 for details on
>  # the unstable features in use.
> -rust_allowed_features := asm_const,asm_goto,arbitrary_self_types,lint_reasons,offset_of_nested,raw_ref_op,used_with_arg
> +rust_allowed_features := asm_const,asm_goto,arbitrary_self_types,inline_const,lint_reasons,offset_of_nested,raw_ref_op,used_with_arg
>  
>  # `--out-dir` is required to avoid temporaries being created by `rustc` in the
>  # current working directory, which may be not accessible in the out-of-tree


* Re: [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature
  2026-02-21 20:50   ` Miguel Ojeda
@ 2026-02-22 19:03     ` John Hubbard
  2026-02-22 19:08       ` Miguel Ojeda
  0 siblings, 1 reply; 68+ messages in thread
From: John Hubbard @ 2026-02-22 19:03 UTC (permalink / raw)
  To: Miguel Ojeda
  Cc: Danilo Krummrich, Alexandre Courbot, Joel Fernandes, Timur Tabi,
	Alistair Popple, Eliot Courtney, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, nouveau,
	rust-for-linux, LKML

On 2/21/26 12:50 PM, Miguel Ojeda wrote:
> On Sat, Feb 21, 2026 at 3:11 AM John Hubbard <jhubbard@nvidia.com> wrote:
>>
>> [1] https://lore.kernel.org/rust-for-linux/20260206171253.2704684-2-gary@kernel.org/
> 
> Link: https://lore.kernel.org/rust-for-linux/20260206171253.2704684-2-gary@kernel.org/
> [1]
> 
>> +pub const fn const_align_up<const ALIGN: usize>(value: usize) -> usize {
> 
> Ah, I thought you wanted to put this in `drivers/gpu/nova-core/num.rs`
> like in the previous version.

Works for me. I was anticipating that people wanted it in rust/ but
I'm perfectly happy to keep it local to nova-core.

> 
> If it is here instead, then you shouldn't need the
> `rust_allowed_features` change anymore, because we already enable
> `inline_const` in the `kernel` crate.

I see.

> 
>> diff --git a/scripts/Makefile.build b/scripts/Makefile.build
> 
> Having said that, if you do end up needing it elsewhere, then please
> add the other line added by Gary's patch, i.e.:
> 
>    +#   - Stable since Rust 1.79.0: `feature(inline_const)`.
> 
> Thanks!
> 
> Cheers,
> Miguel

thanks,
-- 
John Hubbard



* Re: [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature
  2026-02-22  7:46   ` Gary Guo
@ 2026-02-22 19:04     ` John Hubbard
  2026-02-23 11:07       ` Danilo Krummrich
  0 siblings, 1 reply; 68+ messages in thread
From: John Hubbard @ 2026-02-22 19:04 UTC (permalink / raw)
  To: Gary Guo
  Cc: Danilo Krummrich, Alexandre Courbot, Joel Fernandes, Timur Tabi,
	Alistair Popple, Eliot Courtney, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Alice Ryhl, Trevor Gross, nouveau, rust-for-linux, LKML

On 2/21/26 11:46 PM, Gary Guo wrote:
> On 2026-02-21 02:09, John Hubbard wrote:
>> Add const_align_up<ALIGN>() to kernel::ptr as the const-compatible
>> equivalent of Alignable::align_up(). This uses inline_const to validate
>> the alignment at compile time with a clear error message.
>>
...

>> +#[inline(always)]
>> +pub const fn const_align_up<const ALIGN: usize>(value: usize) -> usize {
>> +    const { assert!(ALIGN.is_power_of_two(), "ALIGN must be a power of two") };
>> +    match value.checked_add(ALIGN - 1) {
>> +        Some(v) => v & !(ALIGN - 1),
>> +        None => panic!("const_align_up: overflow"),
> 
> This is wrong. Either this function is always used in const context, in which case
> you take `ALIGN` as normal function parameter and use `build_assert` and `build_error`,
> or this function can be called from runtime and you shouldn't have a panic call here.
>

I will have another go at this, and put it in nova-core as per Miguel's
comment as well. Thanks for catching this, Gary!


thanks,
-- 
John Hubbard


* Re: [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature
  2026-02-22 19:03     ` John Hubbard
@ 2026-02-22 19:08       ` Miguel Ojeda
  2026-02-23  3:36         ` Alexandre Courbot
  0 siblings, 1 reply; 68+ messages in thread
From: Miguel Ojeda @ 2026-02-22 19:08 UTC (permalink / raw)
  To: John Hubbard
  Cc: Danilo Krummrich, Alexandre Courbot, Joel Fernandes, Timur Tabi,
	Alistair Popple, Eliot Courtney, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, nouveau,
	rust-for-linux, LKML

On Sun, Feb 22, 2026 at 8:03 PM John Hubbard <jhubbard@nvidia.com> wrote:
>
> Works for me. I was anticipating that people wanted it in rust/ but
> I'm perfectly happy to keep it local to nova-core.

Sorry, I didn't mean you necessarily need to move it -- I only meant
to point out that if you do, then you don't need the other changes.

Cheers,
Miguel


* Re: [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature
  2026-02-22 19:08       ` Miguel Ojeda
@ 2026-02-23  3:36         ` Alexandre Courbot
  0 siblings, 0 replies; 68+ messages in thread
From: Alexandre Courbot @ 2026-02-23  3:36 UTC (permalink / raw)
  To: Miguel Ojeda
  Cc: John Hubbard, Danilo Krummrich, Joel Fernandes, Alistair Popple,
	Eliot Courtney, Zhi Wang, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On Mon Feb 23, 2026 at 4:08 AM JST, Miguel Ojeda wrote:
> On Sun, Feb 22, 2026 at 8:03 PM John Hubbard <jhubbard@nvidia.com> wrote:
>>
>> Works for me. I was anticipating that people wanted it in rust/ but
>> I'm perfectly happy to keep it local to nova-core.
>
> Sorry, I didn't mean you necessarily need to move it -- I only meant
> to point out that if you do, then you don't need the other changes.

FWIW I think it makes more sense to keep it in `kernel` - even though
Nova is the only user for now, this is a useful addition in general.


* Re: [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature
  2026-02-22 19:04     ` John Hubbard
@ 2026-02-23 11:07       ` Danilo Krummrich
  2026-02-23 14:16         ` Gary Guo
  0 siblings, 1 reply; 68+ messages in thread
From: Danilo Krummrich @ 2026-02-23 11:07 UTC (permalink / raw)
  To: John Hubbard
  Cc: Gary Guo, Alexandre Courbot, Joel Fernandes, Timur Tabi,
	Alistair Popple, Eliot Courtney, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Alice Ryhl, Trevor Gross, nouveau, rust-for-linux, LKML

On Sun Feb 22, 2026 at 8:04 PM CET, John Hubbard wrote:
> On 2/21/26 11:46 PM, Gary Guo wrote:
>> On 2026-02-21 02:09, John Hubbard wrote:
>>> Add const_align_up<ALIGN>() to kernel::ptr as the const-compatible
>>> equivalent of Alignable::align_up(). This uses inline_const to validate
>>> the alignment at compile time with a clear error message.
>>>
> ...
>
>>> +#[inline(always)]
>>> +pub const fn const_align_up<const ALIGN: usize>(value: usize) -> usize {
>>> +    const { assert!(ALIGN.is_power_of_two(), "ALIGN must be a power of two") };
>>> +    match value.checked_add(ALIGN - 1) {
>>> +        Some(v) => v & !(ALIGN - 1),
>>> +        None => panic!("const_align_up: overflow"),
>> 
>> This is wrong. Either this function is always used in const context, in which case
>> you take `ALIGN` as normal function parameter and use `build_assert` and `build_error`,
>> or this function can be called from runtime and you shouldn't have a panic call here.

I think the most common case is that ALIGN is const, but value is not.

What about keeping the function as is (with the panic() replaced with a Result)
and also add

	#[inline(always)]
	pub const fn const_expect<T: Copy>(res: Result<T>, msg: &'static str) -> T {
	    match res {
	        Ok(v) => v,
	        Err(_) => panic!("{}", msg),
	    }
	}

for when it is entirely called from const context, e.g.

	pub(crate) const PMU_RESERVED_SIZE: u32 =
	    const_expect(const_align_up::<SZ_128K>(SZ_8M + SZ_16M + SZ_4K), "...");
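
A standalone sketch of this proposal, for illustration only. It assumes a
local `Overflow` type as a hypothetical stand-in for the kernel's error type,
defines the size constants locally, and uses usize rather than u32 for the
resulting constant so that no conversion is needed.

```rust
// Hypothetical stand-in for the kernel error type.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Overflow;

// Local copies of the size constants (the kernel has these in `kernel::sizes`).
const SZ_4K: usize = 0x1000;
const SZ_128K: usize = 0x2_0000;
const SZ_8M: usize = 0x80_0000;
const SZ_16M: usize = 0x100_0000;

/// Fallible variant: `ALIGN` is validated at compile time by the inline
/// const block, while overflow is returned to the caller.
const fn const_align_up<const ALIGN: usize>(value: usize) -> Result<usize, Overflow> {
    const { assert!(ALIGN.is_power_of_two(), "ALIGN must be a power of two") };
    match value.checked_add(ALIGN - 1) {
        Some(v) => Ok(v & !(ALIGN - 1)),
        None => Err(Overflow),
    }
}

/// Unwraps a result in const context; an `Err` evaluated in a `const` item
/// becomes a compile-time error carrying `msg`.
const fn const_expect<T: Copy>(res: Result<T, Overflow>, msg: &'static str) -> T {
    match res {
        Ok(v) => v,
        Err(_) => panic!("{}", msg),
    }
}

const PMU_RESERVED_SIZE: usize =
    const_expect(const_align_up::<SZ_128K>(SZ_8M + SZ_16M + SZ_4K), "overflow");

fn main() {
    // 24 MiB + 4 KiB rounds up to the next 128 KiB boundary.
    assert_eq!(PMU_RESERVED_SIZE, SZ_8M + SZ_16M + SZ_128K);
    // At runtime the caller handles the error instead of panicking.
    assert_eq!(const_align_up::<16>(usize::MAX), Err(Overflow));
}
```

The `T: Copy` bound matters here: it lets the compiler see that matching on
the result in a const fn never needs to run a destructor.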

> I will have another go at this, and put it in nova-core as per Miguel's
> comment as well.

I think Miguel didn't mean to say it should not be in this file. I think the
current place makes sense, let's keep it there.


* Re: [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature
  2026-02-21  2:09 ` [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature John Hubbard
  2026-02-21 20:50   ` Miguel Ojeda
  2026-02-22  7:46   ` Gary Guo
@ 2026-02-23 11:23   ` Alice Ryhl
  2 siblings, 0 replies; 68+ messages in thread
From: Alice Ryhl @ 2026-02-23 11:23 UTC (permalink / raw)
  To: John Hubbard
  Cc: Danilo Krummrich, Alexandre Courbot, Joel Fernandes, Timur Tabi,
	Alistair Popple, Eliot Courtney, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Trevor Gross, nouveau, rust-for-linux, LKML

On Fri, Feb 20, 2026 at 06:09:35PM -0800, John Hubbard wrote:
> Add const_align_up<ALIGN>() to kernel::ptr as the const-compatible
> equivalent of Alignable::align_up(). This uses inline_const to validate
> the alignment at compile time with a clear error message.
> 
> Add inline_const to rust_allowed_features in scripts/Makefile.build,
> following the approach in [1].
> 
> [1] https://lore.kernel.org/rust-for-linux/20260206171253.2704684-2-gary@kernel.org/
> 
> Suggested-by: Danilo Krummrich <dakr@kernel.org>
> Suggested-by: Miguel Ojeda <ojeda@kernel.org>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>

Note that Rust Binder's ptr_align could use this if you want another
user.

Alice



* Re: [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature
  2026-02-23 11:07       ` Danilo Krummrich
@ 2026-02-23 14:16         ` Gary Guo
  2026-02-23 14:20           ` Danilo Krummrich
  0 siblings, 1 reply; 68+ messages in thread
From: Gary Guo @ 2026-02-23 14:16 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: John Hubbard, Alexandre Courbot, Joel Fernandes, Timur Tabi,
	Alistair Popple, Eliot Courtney, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Alice Ryhl, Trevor Gross, nouveau, rust-for-linux, LKML

On 2026-02-23 11:07, Danilo Krummrich wrote:
> On Sun Feb 22, 2026 at 8:04 PM CET, John Hubbard wrote:
>> On 2/21/26 11:46 PM, Gary Guo wrote:
>>> On 2026-02-21 02:09, John Hubbard wrote:
>>>> Add const_align_up<ALIGN>() to kernel::ptr as the const-compatible
>>>> equivalent of Alignable::align_up(). This uses inline_const to validate
>>>> the alignment at compile time with a clear error message.
>>>>
>> ...
>>
>>>> +#[inline(always)]
>>>> +pub const fn const_align_up<const ALIGN: usize>(value: usize) -> usize {
>>>> +    const { assert!(ALIGN.is_power_of_two(), "ALIGN must be a power of two") };
>>>> +    match value.checked_add(ALIGN - 1) {
>>>> +        Some(v) => v & !(ALIGN - 1),
>>>> +        None => panic!("const_align_up: overflow"),
>>> 
>>> This is wrong. Either this function is always used in const context, in which case
>>> you take `ALIGN` as normal function parameter and use `build_assert` and `build_error`,
>>> or this function can be called from runtime and you shouldn't have a panic call here.
> 
> I think the most common case is that ALIGN is const, but value is not.
> 
> What about keeping the function as is (with the panic() replaced with a Result)
> and also add
> 
> 	#[inline(always)]
> 	pub const fn const_expect<T: Copy>(opt: Result<T>, &'static str) -> T {
> 	    match opt {
> 	        Ok(v) => v,
> 	        Err(_) => panic!(""),
> 	    }
> 	}
> 

We already have `Alignable::align_up` for non-const cases, so this would only be used
in const context and I don't see the need of having explicit const_expect?

Best,
Gary

> for when it is entirely called from const context, e.g.
> 
> 	pub(crate) const PMU_RESERVED_SIZE: u32 =
> 	    const_expect(const_align_up::<SZ_128K>(SZ_8M + SZ_16M + SZ_4K), "...");
> 
>> I will have another go at this, and put it in nova-core as per Miguel's
>> comment as well.
> 
> I think Miguel didn't mean to say it should not be in this file. I think the
> current place makes sense, let's keep it there.


* Re: [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature
  2026-02-23 14:16         ` Gary Guo
@ 2026-02-23 14:20           ` Danilo Krummrich
  2026-03-04  3:47             ` John Hubbard
  0 siblings, 1 reply; 68+ messages in thread
From: Danilo Krummrich @ 2026-02-23 14:20 UTC (permalink / raw)
  To: Gary Guo
  Cc: John Hubbard, Alexandre Courbot, Joel Fernandes, Timur Tabi,
	Alistair Popple, Eliot Courtney, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Alice Ryhl, Trevor Gross, nouveau, rust-for-linux, LKML

On Mon Feb 23, 2026 at 3:16 PM CET, Gary Guo wrote:
> On 2026-02-23 11:07, Danilo Krummrich wrote:
>> On Sun Feb 22, 2026 at 8:04 PM CET, John Hubbard wrote:
>>> On 2/21/26 11:46 PM, Gary Guo wrote:
>>>> On 2026-02-21 02:09, John Hubbard wrote:
>>>>> Add const_align_up<ALIGN>() to kernel::ptr as the const-compatible
>>>>> equivalent of Alignable::align_up(). This uses inline_const to validate
>>>>> the alignment at compile time with a clear error message.
>>>>>
>>> ...
>>>
>>>>> +#[inline(always)]
>>>>> +pub const fn const_align_up<const ALIGN: usize>(value: usize) -> usize {
>>>>> +    const { assert!(ALIGN.is_power_of_two(), "ALIGN must be a power of two") };
>>>>> +    match value.checked_add(ALIGN - 1) {
>>>>> +        Some(v) => v & !(ALIGN - 1),
>>>>> +        None => panic!("const_align_up: overflow"),
>>>> 
>>>> This is wrong. Either this function is always used in const context, in which case
>>>> you take `ALIGN` as normal function parameter and use `build_assert` and `build_error`,
>>>> or this function can be called from runtime and you shouldn't have a panic call here.
>> 
>> I think the most common case is that ALIGN is const, but value is not.
>> 
>> What about keeping the function as is (with the panic() replaced with a Result)
>> and also add
>> 
>> 	#[inline(always)]
>> 	pub const fn const_expect<T: Copy>(opt: Result<T>, &'static str) -> T {
>> 	    match opt {
>> 	        Ok(v) => v,
>> 	        Err(_) => panic!(""),
>> 	    }
>> 	}
>> 
>
> We already have `Alignable::align_up` for non-const cases, so this would only be used
> in const context and I don't see the need of having explicit const_expect?

Fair enough -- unfortunate we can't call this from const context.


* Re: [PATCH v5 01/38] gpu: nova-core: fix aux device registration for multi-GPU systems
  2026-02-21  2:09 ` [PATCH v5 01/38] gpu: nova-core: fix aux device registration for multi-GPU systems John Hubbard
@ 2026-02-24 14:47   ` Danilo Krummrich
  2026-02-27 15:37     ` Gary Guo
  0 siblings, 1 reply; 68+ messages in thread
From: Danilo Krummrich @ 2026-02-24 14:47 UTC (permalink / raw)
  To: John Hubbard
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On 2/21/26 3:09 AM, John Hubbard wrote:
> The auxiliary device registration was using a hardcoded ID of 0, which
> caused probe() to fail on multi-GPU systems with:
> 
>    sysfs: cannot create duplicate filename '/bus/auxiliary/devices/NovaCore.nova-drm.0'
> 
> Fix this by using an atomic counter to generate unique IDs for each
> GPU's aux device registration. The TODO item to eventually use XArray
> for recycling aux device IDs is retained, but for now, this works very
> nicely.
> 
> This has the side effect of making debugfs[1] work on multi-GPU systems.
> 
> [1] https://lore.kernel.org/20260203224757.871729-1-ttabi@nvidia.com
> 
> Reviewed-by: Gary Guo <gary@garyguo.net>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>

Applied to drm-rust-next, thanks!

    [ Use LKMM atomics; inline and slightly reword TODO comment. - Danilo ]
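
As a rough illustration of the approach described in the commit message, the
sketch below hands out a distinct ID per registration from a global atomic
counter. It is not the applied code: it uses `std::sync::atomic` so that it
runs outside the kernel, whereas the applied patch uses the kernel's LKMM
atomics, and `NEXT_AUX_ID` and `next_aux_id()` are hypothetical names.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// Hypothetical global counter; starts at 0 like the old hardcoded ID.
static NEXT_AUX_ID: AtomicU32 = AtomicU32::new(0);

/// Returns a fresh ID on every call. IDs are never recycled, which is the
/// limitation the XArray TODO in the commit message refers to.
fn next_aux_id() -> u32 {
    // Only uniqueness is needed, so relaxed ordering is sufficient here.
    NEXT_AUX_ID.fetch_add(1, Ordering::Relaxed)
}

fn main() {
    // Two GPUs probing in turn get different aux device IDs.
    let first = next_aux_id();
    let second = next_aux_id();
    assert_ne!(first, second);
    println!("NovaCore.nova-drm.{first}, NovaCore.nova-drm.{second}");
}
```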


* Re: [PATCH v5 01/38] gpu: nova-core: fix aux device registration for multi-GPU systems
  2026-02-24 14:47   ` Danilo Krummrich
@ 2026-02-27 15:37     ` Gary Guo
  2026-02-27 15:41       ` Gary Guo
  0 siblings, 1 reply; 68+ messages in thread
From: Gary Guo @ 2026-02-27 15:37 UTC (permalink / raw)
  To: Danilo Krummrich, John Hubbard
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On Tue Feb 24, 2026 at 2:47 PM GMT, Danilo Krummrich wrote:
> On 2/21/26 3:09 AM, John Hubbard wrote:
>> The auxiliary device registration was using a hardcoded ID of 0, which
>> caused probe() to fail on multi-GPU systems with:
>> 
>>    sysfs: cannot create duplicate filename '/bus/auxiliary/devices/NovaCore.nova-drm.0'
>> 
>> Fix this by using an atomic counter to generate unique IDs for each
>> GPU's aux device registration. The TODO item to eventually use XArray
>> for recycling aux device IDs is retained, but for now, this works very
>> nicely.
>> 
>> This has the side effect of making debugfs[1] work on multi-GPU systems.
>> 
>> [1] https://lore.kernel.org/20260203224757.871729-1-ttabi@nvidia.com
>> 
>> Reviewed-by: Gary Guo <gary@garyguo.net>
>> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
>
> Applied to drm-rust-next, thanks!
>
>     [ Use LKMM atomics; inline and slightly reword TODO comment. - Danilo ]

Danilo, can you drop this patch from drm-rust-next?

The patch that is supposed to be queued is
https://lore.kernel.org/rust-for-linux/20260205221758.219192-1-jhubbard@nvidia.com/#t,
which does correctly use LKMM atomics and add comments about possible use of
XArray.

In fact, I am not sure why this patch carries my R-b.

Best,
Gary


* Re: [PATCH v5 01/38] gpu: nova-core: fix aux device registration for multi-GPU systems
  2026-02-27 15:37     ` Gary Guo
@ 2026-02-27 15:41       ` Gary Guo
  2026-02-27 16:05         ` Danilo Krummrich
  0 siblings, 1 reply; 68+ messages in thread
From: Gary Guo @ 2026-02-27 15:41 UTC (permalink / raw)
  To: Gary Guo, Danilo Krummrich, John Hubbard
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On Fri Feb 27, 2026 at 3:37 PM GMT, Gary Guo wrote:
> On Tue Feb 24, 2026 at 2:47 PM GMT, Danilo Krummrich wrote:
>> On 2/21/26 3:09 AM, John Hubbard wrote:
>>> The auxiliary device registration was using a hardcoded ID of 0, which
>>> caused probe() to fail on multi-GPU systems with:
>>> 
>>>    sysfs: cannot create duplicate filename '/bus/auxiliary/devices/NovaCore.nova-drm.0'
>>> 
>>> Fix this by using an atomic counter to generate unique IDs for each
>>> GPU's aux device registration. The TODO item to eventually use XArray
>>> for recycling aux device IDs is retained, but for now, this works very
>>> nicely.
>>> 
>>> This has the side effect of making debugfs[1] work on multi-GPU systems.
>>> 
>>> [1] https://lore.kernel.org/20260203224757.871729-1-ttabi@nvidia.com
>>> 
>>> Reviewed-by: Gary Guo <gary@garyguo.net>
>>> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
>>
>> Applied to drm-rust-next, thanks!
>>
>>     [ Use LKMM atomics; inline and slightly reword TODO comment. - Danilo ]
>
> Danilo, can you drop this patch from drm-rust-next?
>
> The patch that is supposed to be queued is
> https://lore.kernel.org/rust-for-linux/20260205221758.219192-1-jhubbard@nvidia.com/#t,
> which does correctly use LKMM atomics and add comments about possible use of
> XArray.
>
> In fact, I am not sure why this patch carries my R-b.

Hmm, actually this patch contains the updated comment but somehow has the LKMM
atomics changed back to Rust atomics. Not sure what happened. Anyhow, that patch
should be picked instead.

Best,
Gary

>
> Best,
> Gary



* Re: [PATCH v5 01/38] gpu: nova-core: fix aux device registration for multi-GPU systems
  2026-02-27 15:41       ` Gary Guo
@ 2026-02-27 16:05         ` Danilo Krummrich
  2026-02-27 16:29           ` John Hubbard
  0 siblings, 1 reply; 68+ messages in thread
From: Danilo Krummrich @ 2026-02-27 16:05 UTC (permalink / raw)
  To: Gary Guo
  Cc: John Hubbard, Alexandre Courbot, Joel Fernandes, Timur Tabi,
	Alistair Popple, Eliot Courtney, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Alice Ryhl, Trevor Gross, nouveau, rust-for-linux, LKML

On Fri Feb 27, 2026 at 4:41 PM CET, Gary Guo wrote:
> On Fri Feb 27, 2026 at 3:37 PM GMT, Gary Guo wrote:
>> On Tue Feb 24, 2026 at 2:47 PM GMT, Danilo Krummrich wrote:
>>> On 2/21/26 3:09 AM, John Hubbard wrote:
>>>> The auxiliary device registration was using a hardcoded ID of 0, which
>>>> caused probe() to fail on multi-GPU systems with:
>>>> 
>>>>    sysfs: cannot create duplicate filename '/bus/auxiliary/devices/NovaCore.nova-drm.0'
>>>> 
>>>> Fix this by using an atomic counter to generate unique IDs for each
>>>> GPU's aux device registration. The TODO item to eventually use XArray
>>>> for recycling aux device IDs is retained, but for now, this works very
>>>> nicely.
>>>> 
>>>> This has the side effect of making debugfs[1] work on multi-GPU systems.
>>>> 
>>>> [1] https://lore.kernel.org/20260203224757.871729-1-ttabi@nvidia.com
>>>> 
>>>> Reviewed-by: Gary Guo <gary@garyguo.net>
>>>> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
>>>
>>> Applied to drm-rust-next, thanks!
>>>
>>>     [ Use LKMM atomics; inline and slightly reword TODO comment. - Danilo ]
>>
>> Danilo, can you drop this patch from drm-rust-next?
>>
>> The patch that is supposed to be queued is
>> https://lore.kernel.org/rust-for-linux/20260205221758.219192-1-jhubbard@nvidia.com/#t,
>> which does correctly use LKMM atomics and add comments about possible use of
>> XArray.
>>
>> In fact, I am not sure why this patch carries my R-b.
>
> Hmm, actually this patch contains the updated comment but somehow has the LKMM
> atomics changed back to Rust atomics. Not sure what happened. Anyhow, that patch
> should be picked instead.

I picked up the latest version of this patch and fixed up the LKMM atomics, i.e.
the result should be correct:

https://gitlab.freedesktop.org/drm/rust/kernel/-/commit/d3f36fa57aa289c43e01da16c928a2cd971ad5dc

Looks like I could have picked v2 instead, as it seems to be identical except
that it already uses LKMM atomics.

@John: For the future, please don't send patches in multiple series / ways. I
think there was no reason to include the patch in this series in the first
place.

Thanks,
Danilo


* Re: [PATCH v5 01/38] gpu: nova-core: fix aux device registration for multi-GPU systems
  2026-02-27 16:05         ` Danilo Krummrich
@ 2026-02-27 16:29           ` John Hubbard
  0 siblings, 0 replies; 68+ messages in thread
From: John Hubbard @ 2026-02-27 16:29 UTC (permalink / raw)
  To: Danilo Krummrich, Gary Guo
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On 2/27/26 9:05 AM, Danilo Krummrich wrote:
> On Fri Feb 27, 2026 at 4:41 PM CET, Gary Guo wrote:
>> On Fri Feb 27, 2026 at 3:37 PM GMT, Gary Guo wrote:
>>> On Tue Feb 24, 2026 at 2:47 PM GMT, Danilo Krummrich wrote:
>>>> On 2/21/26 3:09 AM, John Hubbard wrote:
...
> I picked up the latest version of this patch and fixed up the LKMM atomics, i.e.
> the result should be correct:
> 
> https://gitlab.freedesktop.org/drm/rust/kernel/-/commit/d3f36fa57aa289c43e01da16c928a2cd971ad5dc
> 
> Looks like I could have picked v2 instead, as it seems to be identical except
> that it already uses LKMM atomics.
> 
> @John: For the future, please don't send patches in multiple series / ways. I
> think there was no reason to include the patch in this series in the first
> place.


OK, and sorry about the mix-up! I thought it would be "helpful",
but this turned out to be the opposite of that. :)

thanks,
John Hubbard



* Re: [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature
  2026-02-23 14:20           ` Danilo Krummrich
@ 2026-03-04  3:47             ` John Hubbard
  2026-03-04 11:18               ` Gary Guo
  0 siblings, 1 reply; 68+ messages in thread
From: John Hubbard @ 2026-03-04  3:47 UTC (permalink / raw)
  To: Danilo Krummrich, Gary Guo
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On 2/23/26 6:20 AM, Danilo Krummrich wrote:
> On Mon Feb 23, 2026 at 3:16 PM CET, Gary Guo wrote:
>> On 2026-02-23 11:07, Danilo Krummrich wrote:
>>> On Sun Feb 22, 2026 at 8:04 PM CET, John Hubbard wrote:
>>>> On 2/21/26 11:46 PM, Gary Guo wrote:
>>>>> On 2026-02-21 02:09, John Hubbard wrote:
>>>>>> Add const_align_up<ALIGN>() to kernel::ptr as the const-compatible
>>>>>> equivalent of Alignable::align_up(). This uses inline_const to validate
>>>>>> the alignment at compile time with a clear error message.
>>>>>>
>>>> ...
>>>>
>>>>>> +#[inline(always)]
>>>>>> +pub const fn const_align_up<const ALIGN: usize>(value: usize) -> usize {
>>>>>> +    const { assert!(ALIGN.is_power_of_two(), "ALIGN must be a power of two") };
>>>>>> +    match value.checked_add(ALIGN - 1) {
>>>>>> +        Some(v) => v & !(ALIGN - 1),
>>>>>> +        None => panic!("const_align_up: overflow"),
>>>>>
>>>>> This is wrong. Either this function is always used in const context, in which case
>>>>> you take `ALIGN` as normal function parameter and use `build_assert` and `build_error`,
>>>>> or this function can be called from runtime and you shouldn't have a panic call here.
>>>
>>> I think the most common case is that ALIGN is const, but value is not.
>>>
>>> What about keeping the function as is (with the panic() replaced with a Result)
>>> and also add
>>>
>>> 	#[inline(always)]
>>> 	pub const fn const_expect<T: Copy>(opt: Result<T>, &'static str) -> T {
>>> 	    match opt {
>>> 	        Ok(v) => v,
>>> 	        Err(_) => panic!(""),
>>> 	    }
>>> 	}
>>>
>>
>> We already have `Alignable::align_up` for non-const cases, so this would only be used
>> in const context and I don't see the need of having explicit const_expect?
> 
> Fair enough -- unfortunate we can't call this from const context.

OK, so after the dust settled on this discussion, I *think* we ended up
at this, which I have staged for an upcoming v6. Did I understand you both
correctly?

commit 1360a440272976df697140361537c5b697d602e0
Author: John Hubbard <jhubbard@nvidia.com>
Date:   Thu Feb 19 14:44:02 2026 -0800

    rust: ptr: add const_align_up()
    
    Add const_align_up<ALIGN>() to kernel::ptr as the const-compatible
    equivalent of Alignable::align_up(). This uses inline_const to validate
    the alignment at compile time with a clear error message.
    
    Overflow causes a panic, which becomes a compile-time error in const
    context. For runtime alignment with fallible overflow handling, callers
    should use Alignable::align_up() instead.
    
    Suggested-by: Danilo Krummrich <dakr@kernel.org>
    Suggested-by: Miguel Ojeda <ojeda@kernel.org>
    Signed-off-by: John Hubbard <jhubbard@nvidia.com>

diff --git a/rust/kernel/ptr.rs b/rust/kernel/ptr.rs
index 5b6a382637fe..b8b0cb1e0b6b 100644
--- a/rust/kernel/ptr.rs
+++ b/rust/kernel/ptr.rs
@@ -225,3 +225,31 @@ fn align_up(self, alignment: Alignment) -> Option<Self> {
 }
 
 impl_alignable_uint!(u8, u16, u32, u64, usize);
+
+/// Aligns `value` up to `ALIGN` at compile time.
+///
+/// This is the const-compatible equivalent of [`Alignable::align_up`].
+/// `ALIGN` must be a power of two (enforced at compile time).
+///
+/// Panics on overflow, which becomes a compile-time error in const context.
+/// For runtime alignment with fallible overflow handling, use
+/// [`Alignable::align_up`] instead.
+///
+/// # Examples
+///
+/// ```
+/// use kernel::ptr::const_align_up;
+/// use kernel::sizes::SZ_4K;
+///
+/// assert_eq!(const_align_up::<16>(0x4f), 0x50);
+/// assert_eq!(const_align_up::<16>(0x40), 0x40);
+/// assert_eq!(const_align_up::<SZ_4K>(1), SZ_4K);
+/// ```
+#[inline(always)]
+pub const fn const_align_up<const ALIGN: usize>(value: usize) -> usize {
+    const { assert!(ALIGN.is_power_of_two(), "ALIGN must be a power of two") };
+    match value.checked_add(ALIGN - 1) {
+        Some(v) => v & !(ALIGN - 1),
+        None => panic!("const_align_up: overflow"),
+    }
+}


thanks,
-- 
John Hubbard


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature
  2026-03-04  3:47             ` John Hubbard
@ 2026-03-04 11:18               ` Gary Guo
  2026-03-04 18:53                 ` John Hubbard
  0 siblings, 1 reply; 68+ messages in thread
From: Gary Guo @ 2026-03-04 11:18 UTC (permalink / raw)
  To: John Hubbard, Danilo Krummrich, Gary Guo
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On Wed Mar 4, 2026 at 3:47 AM GMT, John Hubbard wrote:
> On 2/23/26 6:20 AM, Danilo Krummrich wrote:
>> On Mon Feb 23, 2026 at 3:16 PM CET, Gary Guo wrote:
>>> On 2026-02-23 11:07, Danilo Krummrich wrote:
>>>> On Sun Feb 22, 2026 at 8:04 PM CET, John Hubbard wrote:
>>>>> On 2/21/26 11:46 PM, Gary Guo wrote:
>>>>>> On 2026-02-21 02:09, John Hubbard wrote:
>>>>>>> Add const_align_up<ALIGN>() to kernel::ptr as the const-compatible
>>>>>>> equivalent of Alignable::align_up(). This uses inline_const to validate
>>>>>>> the alignment at compile time with a clear error message.
>>>>>>>
>>>>> ...
>>>>>
>>>>>>> +#[inline(always)]
>>>>>>> +pub const fn const_align_up<const ALIGN: usize>(value: usize) -> usize {
>>>>>>> +    const { assert!(ALIGN.is_power_of_two(), "ALIGN must be a power of two") };
>>>>>>> +    match value.checked_add(ALIGN - 1) {
>>>>>>> +        Some(v) => v & !(ALIGN - 1),
>>>>>>> +        None => panic!("const_align_up: overflow"),
>>>>>>
>>>>>> This is wrong. Either this function is always used in const context, in which case
>>>>>> you take `ALIGN` as normal function parameter and use `build_assert` and `build_error`,
>>>>>> or this function can be called from runtime and you shouldn't have a panic call here.
>>>>
>>>> I think the most common case is that ALIGN is const, but value is not.
>>>>
>>>> What about keeping the function as is (with the panic() replaced with a Result)
>>>> and also add
>>>>
>>>> 	#[inline(always)]
>>>> 	pub const fn const_expect<T: Copy>(opt: Result<T>, &'static str) -> T {
>>>> 	    match opt {
>>>> 	        Ok(v) => v,
>>>> 	        Err(_) => panic!(""),
>>>> 	    }
>>>> 	}
>>>>
>>>
>>> We already have `Alignable::align_up` for non-const cases, so this would only be used
>>> in const context and I don't see the need of having explicit const_expect?
>> 
>> Fair enough -- unfortunate we can't call this from const context.
>
> OK, so after the dust settled on this discussion, I *think* we ended up
> at this, which I have staged for an upcoming v6. Did I understand you both
> correctly?
>
> commit 1360a440272976df697140361537c5b697d602e0
> Author: John Hubbard <jhubbard@nvidia.com>
> Date:   Thu Feb 19 14:44:02 2026 -0800
>
>     rust: ptr: add const_align_up()
>     
>     Add const_align_up<ALIGN>() to kernel::ptr as the const-compatible
>     equivalent of Alignable::align_up(). This uses inline_const to validate
>     the alignment at compile time with a clear error message.
>     
>     Overflow causes a panic, which becomes a compile-time error in const
>     context. For runtime alignment with fallible overflow handling, callers
>     should use Alignable::align_up() instead.
>     
>     Suggested-by: Danilo Krummrich <dakr@kernel.org>
>     Suggested-by: Miguel Ojeda <ojeda@kernel.org>
>     Signed-off-by: John Hubbard <jhubbard@nvidia.com>
>
> diff --git a/rust/kernel/ptr.rs b/rust/kernel/ptr.rs
> index 5b6a382637fe..b8b0cb1e0b6b 100644
> --- a/rust/kernel/ptr.rs
> +++ b/rust/kernel/ptr.rs
> @@ -225,3 +225,31 @@ fn align_up(self, alignment: Alignment) -> Option<Self> {
>  }
>  
>  impl_alignable_uint!(u8, u16, u32, u64, usize);
> +
> +/// Aligns `value` up to `ALIGN` at compile time.
> +///
> +/// This is the const-compatible equivalent of [`Alignable::align_up`].
> +/// `ALIGN` must be a power of two (enforced at compile time).
> +///
> +/// Panics on overflow, which becomes a compile-time error in const context.
> +/// For runtime alignment with fallible overflow handling, use
> +/// [`Alignable::align_up`] instead.
> +///
> +/// # Examples
> +///
> +/// ```
> +/// use kernel::ptr::const_align_up;
> +/// use kernel::sizes::SZ_4K;
> +///
> +/// assert_eq!(const_align_up::<16>(0x4f), 0x50);
> +/// assert_eq!(const_align_up::<16>(0x40), 0x40);
> +/// assert_eq!(const_align_up::<SZ_4K>(1), SZ_4K);
> +/// ```
> +#[inline(always)]
> +pub const fn const_align_up<const ALIGN: usize>(value: usize) -> usize {
> +    const { assert!(ALIGN.is_power_of_two(), "ALIGN must be a power of two") };
> +    match value.checked_add(ALIGN - 1) {
> +        Some(v) => v & !(ALIGN - 1),
> +        None => panic!("const_align_up: overflow"),
> +    }
> +}

The implementation doesn't address any of my original comment and all my points
still apply.

Best,
Gary

>
>
> thanks,


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature
  2026-03-04 11:18               ` Gary Guo
@ 2026-03-04 18:53                 ` John Hubbard
  2026-03-04 19:04                   ` Gary Guo
  0 siblings, 1 reply; 68+ messages in thread
From: John Hubbard @ 2026-03-04 18:53 UTC (permalink / raw)
  To: Gary Guo, Danilo Krummrich
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On 3/4/26 3:18 AM, Gary Guo wrote:
> On Wed Mar 4, 2026 at 3:47 AM GMT, John Hubbard wrote:
...
> The implementation doesn't address any of my original comment and all my points
> still apply.
> 

OK, so that implies that you want to return an Option, I believe,
like this?

commit b41512390999f85bcb2a3809c68f392e936b09ab
Author: John Hubbard <jhubbard@nvidia.com>
Date:   Thu Feb 19 14:44:02 2026 -0800

    rust: ptr: add const_align_up()
    
    Add const_align_up<ALIGN>() to kernel::ptr as the const-compatible
    equivalent of Alignable::align_up(). This uses inline_const to validate
    the alignment at compile time with a clear error message.
    
    Suggested-by: Danilo Krummrich <dakr@kernel.org>
    Suggested-by: Gary Guo <gary@garyguo.net>
    Suggested-by: Miguel Ojeda <ojeda@kernel.org>
    Signed-off-by: John Hubbard <jhubbard@nvidia.com>

diff --git a/rust/kernel/ptr.rs b/rust/kernel/ptr.rs
index 5b6a382637fe..7e86429a9cb5 100644
--- a/rust/kernel/ptr.rs
+++ b/rust/kernel/ptr.rs
@@ -225,3 +225,29 @@ fn align_up(self, alignment: Alignment) -> Option<Self> {
 }
 
 impl_alignable_uint!(u8, u16, u32, u64, usize);
+
+/// Aligns `value` up to `ALIGN` at compile time.
+///
+/// This is the const-compatible equivalent of [`Alignable::align_up`].
+/// `ALIGN` must be a power of two (enforced at compile time).
+///
+/// Returns [`None`] on overflow.
+///
+/// # Examples
+///
+/// ```
+/// use kernel::ptr::const_align_up;
+/// use kernel::sizes::SZ_4K;
+///
+/// assert_eq!(const_align_up::<16>(0x4f), Some(0x50));
+/// assert_eq!(const_align_up::<16>(0x40), Some(0x40));
+/// assert_eq!(const_align_up::<SZ_4K>(1), Some(SZ_4K));
+/// ```
+#[inline(always)]
+pub const fn const_align_up<const ALIGN: usize>(value: usize) -> Option<usize> {
+    const { assert!(ALIGN.is_power_of_two(), "ALIGN must be a power of two") };
+    match value.checked_add(ALIGN - 1) {
+        Some(v) => Some(v & !(ALIGN - 1)),
+        None => None,
+    }
+}


thanks,
-- 
John Hubbard


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature
  2026-03-04 18:53                 ` John Hubbard
@ 2026-03-04 19:04                   ` Gary Guo
  2026-03-04 19:14                     ` John Hubbard
  0 siblings, 1 reply; 68+ messages in thread
From: Gary Guo @ 2026-03-04 19:04 UTC (permalink / raw)
  To: John Hubbard, Gary Guo, Danilo Krummrich
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On Wed Mar 4, 2026 at 6:53 PM GMT, John Hubbard wrote:
> On 3/4/26 3:18 AM, Gary Guo wrote:
>> On Wed Mar 4, 2026 at 3:47 AM GMT, John Hubbard wrote:
> ...
>> The implementation doesn't address any of my original comment and all my points
>> still apply.
>> 
>
> OK, so that implies that you want to return an Option, I believe,
> like this?
>
> commit b41512390999f85bcb2a3809c68f392e936b09ab
> Author: John Hubbard <jhubbard@nvidia.com>
> Date:   Thu Feb 19 14:44:02 2026 -0800
>
>     rust: ptr: add const_align_up()
>     
>     Add const_align_up<ALIGN>() to kernel::ptr as the const-compatible
>     equivalent of Alignable::align_up(). This uses inline_const to validate
>     the alignment at compile time with a clear error message.
>     
>     Suggested-by: Danilo Krummrich <dakr@kernel.org>
>     Suggested-by: Gary Guo <gary@garyguo.net>
>     Suggested-by: Miguel Ojeda <ojeda@kernel.org>
>     Signed-off-by: John Hubbard <jhubbard@nvidia.com>
>
> diff --git a/rust/kernel/ptr.rs b/rust/kernel/ptr.rs
> index 5b6a382637fe..7e86429a9cb5 100644
> --- a/rust/kernel/ptr.rs
> +++ b/rust/kernel/ptr.rs
> @@ -225,3 +225,29 @@ fn align_up(self, alignment: Alignment) -> Option<Self> {
>  }
>  
>  impl_alignable_uint!(u8, u16, u32, u64, usize);
> +
> +/// Aligns `value` up to `ALIGN` at compile time.
> +///
> +/// This is the const-compatible equivalent of [`Alignable::align_up`].
> +/// `ALIGN` must be a power of two (enforced at compile time).
> +///
> +/// Returns [`None`] on overflow.
> +///
> +/// # Examples
> +///
> +/// ```
> +/// use kernel::ptr::const_align_up;
> +/// use kernel::sizes::SZ_4K;
> +///
> +/// assert_eq!(const_align_up::<16>(0x4f), Some(0x50));
> +/// assert_eq!(const_align_up::<16>(0x40), Some(0x40));
> +/// assert_eq!(const_align_up::<SZ_4K>(1), Some(SZ_4K));
> +/// ```
> +#[inline(always)]
> +pub const fn const_align_up<const ALIGN: usize>(value: usize) -> Option<usize> {
> +    const { assert!(ALIGN.is_power_of_two(), "ALIGN must be a power of two") };
> +    match value.checked_add(ALIGN - 1) {
> +        Some(v) => Some(v & !(ALIGN - 1)),
> +        None => None,
> +    }
> +}

I think your signature should probably just be

pub const fn const_align_up(value: usize, align: Alignment) -> Option<usize> {
    ...
}

Best,
Gary

>
>
> thanks,


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature
  2026-03-04 19:04                   ` Gary Guo
@ 2026-03-04 19:14                     ` John Hubbard
  2026-03-05  1:23                       ` Alexandre Courbot
  0 siblings, 1 reply; 68+ messages in thread
From: John Hubbard @ 2026-03-04 19:14 UTC (permalink / raw)
  To: Gary Guo, Danilo Krummrich
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On 3/4/26 11:04 AM, Gary Guo wrote:
> On Wed Mar 4, 2026 at 6:53 PM GMT, John Hubbard wrote:
>> On 3/4/26 3:18 AM, Gary Guo wrote:
>>> On Wed Mar 4, 2026 at 3:47 AM GMT, John Hubbard wrote:
>> ...
>> +#[inline(always)]
>> +pub const fn const_align_up<const ALIGN: usize>(value: usize) -> Option<usize> {
>> +    const { assert!(ALIGN.is_power_of_two(), "ALIGN must be a power of two") };
>> +    match value.checked_add(ALIGN - 1) {
>> +        Some(v) => Some(v & !(ALIGN - 1)),
>> +        None => None,
>> +    }
>> +}
> 
> I think your signature should probably just be
> 
> pub const fn const_align_up(value: usize, align: Alignment) -> Option<usize> {
>     ...
> }
> 

OK yes that's a bit nicer. I've done that for v6, thanks!

thanks,
-- 
John Hubbard


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature
  2026-03-04 19:14                     ` John Hubbard
@ 2026-03-05  1:23                       ` Alexandre Courbot
  2026-03-05  1:31                         ` John Hubbard
  0 siblings, 1 reply; 68+ messages in thread
From: Alexandre Courbot @ 2026-03-05  1:23 UTC (permalink / raw)
  To: John Hubbard
  Cc: Gary Guo, Danilo Krummrich, Joel Fernandes, Alistair Popple,
	Eliot Courtney, Zhi Wang, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alice Ryhl, Trevor Gross, nouveau,
	rust-for-linux, LKML

On Thu Mar 5, 2026 at 4:14 AM JST, John Hubbard wrote:
> On 3/4/26 11:04 AM, Gary Guo wrote:
>> On Wed Mar 4, 2026 at 6:53 PM GMT, John Hubbard wrote:
>>> On 3/4/26 3:18 AM, Gary Guo wrote:
>>>> On Wed Mar 4, 2026 at 3:47 AM GMT, John Hubbard wrote:
>>> ...
>>> +#[inline(always)]
>>> +pub const fn const_align_up<const ALIGN: usize>(value: usize) -> Option<usize> {
>>> +    const { assert!(ALIGN.is_power_of_two(), "ALIGN must be a power of two") };
>>> +    match value.checked_add(ALIGN - 1) {
>>> +        Some(v) => Some(v & !(ALIGN - 1)),
>>> +        None => None,
>>> +    }
>>> +}
>> 
>> I think your signature should probably just be
>> 
>> pub const fn const_align_up(value: usize, align: Alignment) -> Option<usize> {
>>     ...
>> }
>> 
>
> OK yes that's a bit nicer. I've done that for v6, thanks!

Hold on a bit - if we are purposing this new method for use in const
contexts, what use do we have for a `None` return value? By definition
we would know both `value` and `align` and thus the result is
deterministic.

We do have an alignment method for non-const contexts already. Gary's
initial comment was:

> Either this function is always used in const context, in which case
> you take `ALIGN` as normal function parameter and use `build_assert` and
> `build_error`

So why not make both arguments generic in this new method, and fail at
build in case of overflow? 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature
  2026-03-05  1:23                       ` Alexandre Courbot
@ 2026-03-05  1:31                         ` John Hubbard
  2026-03-05  7:07                           ` Alexandre Courbot
  0 siblings, 1 reply; 68+ messages in thread
From: John Hubbard @ 2026-03-05  1:31 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Gary Guo, Danilo Krummrich, Joel Fernandes, Alistair Popple,
	Eliot Courtney, Zhi Wang, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alice Ryhl, Trevor Gross, nouveau,
	rust-for-linux, LKML

On 3/4/26 5:23 PM, Alexandre Courbot wrote:
> On Thu Mar 5, 2026 at 4:14 AM JST, John Hubbard wrote:
>> On 3/4/26 11:04 AM, Gary Guo wrote:
>>> On Wed Mar 4, 2026 at 6:53 PM GMT, John Hubbard wrote:
>>>> On 3/4/26 3:18 AM, Gary Guo wrote:
>>>>> On Wed Mar 4, 2026 at 3:47 AM GMT, John Hubbard wrote:
>>>> ...
>>>> +#[inline(always)]
>>>> +pub const fn const_align_up<const ALIGN: usize>(value: usize) -> Option<usize> {
>>>> +    const { assert!(ALIGN.is_power_of_two(), "ALIGN must be a power of two") };
>>>> +    match value.checked_add(ALIGN - 1) {
>>>> +        Some(v) => Some(v & !(ALIGN - 1)),
>>>> +        None => None,
>>>> +    }
>>>> +}
>>>
>>> I think your signature should probably just be
>>>
>>> pub const fn const_align_up(value: usize, align: Alignment) -> Option<usize> {
>>>     ...
>>> }
>>>
>>
>> OK yes that's a bit nicer. I've done that for v6, thanks!
> 
> Hold on a bit - if we are purposing this new method for use in const
> contexts, what use do we have for a `None` return value? By definition
> we would know both `value` and `align` and thus the result is
> deterministic.
> 
> We do have an alignment method for non-const contexts already. Gary's
> initial comment was:
> 
>> Either this function is always used in const context, in which case
>> you take `ALIGN` as normal function parameter and use `build_assert` and
>> `build_error`
> 
> So why not make both arguments generic in this new method, and fail at
> build in case of overflow? 

At this point, it is completely impossible to write a patch that complies
with Gary, Danilo, and Alex. It's all over the map.

Here's what is staged in v6 so far. I don't care anymore what we end up
with, but let's pick something:

Author: John Hubbard <jhubbard@nvidia.com>
Date:   Thu Feb 19 14:44:02 2026 -0800

    rust: ptr: add const_align_up()
    
    Add const_align_up() to kernel::ptr as the const-compatible equivalent
    of Alignable::align_up().
    
    Suggested-by: Danilo Krummrich <dakr@kernel.org>
    Suggested-by: Gary Guo <gary@garyguo.net>
    Suggested-by: Miguel Ojeda <ojeda@kernel.org>
    Signed-off-by: John Hubbard <jhubbard@nvidia.com>

diff --git a/rust/kernel/ptr.rs b/rust/kernel/ptr.rs
index 5b6a382637fe..1e15aac4faca 100644
--- a/rust/kernel/ptr.rs
+++ b/rust/kernel/ptr.rs
@@ -225,3 +225,27 @@ fn align_up(self, alignment: Alignment) -> Option<Self> {
 }
 
 impl_alignable_uint!(u8, u16, u32, u64, usize);
+
+/// Aligns `value` up to `align`.
+///
+/// This is the const-compatible equivalent of [`Alignable::align_up`].
+///
+/// Returns [`None`] on overflow.
+///
+/// # Examples
+///
+/// ```
+/// use kernel::ptr::{const_align_up, Alignment};
+/// use kernel::sizes::SZ_4K;
+///
+/// assert_eq!(const_align_up(0x4f, Alignment::new::<16>()), Some(0x50));
+/// assert_eq!(const_align_up(0x40, Alignment::new::<16>()), Some(0x40));
+/// assert_eq!(const_align_up(1, Alignment::new::<SZ_4K>()), Some(SZ_4K));
+/// ```
+#[inline(always)]
+pub const fn const_align_up(value: usize, align: Alignment) -> Option<usize> {
+    match value.checked_add(align.as_usize() - 1) {
+        Some(v) => Some(v & align.mask()),
+        None => None,
+    }
+}



thanks,
-- 
John Hubbard


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature
  2026-03-05  1:31                         ` John Hubbard
@ 2026-03-05  7:07                           ` Alexandre Courbot
  2026-03-05 12:28                             ` Gary Guo
  0 siblings, 1 reply; 68+ messages in thread
From: Alexandre Courbot @ 2026-03-05  7:07 UTC (permalink / raw)
  To: John Hubbard
  Cc: Gary Guo, Danilo Krummrich, Joel Fernandes, Alistair Popple,
	Eliot Courtney, Zhi Wang, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alice Ryhl, Trevor Gross, nouveau,
	rust-for-linux, LKML

On Thu Mar 5, 2026 at 10:31 AM JST, John Hubbard wrote:
> On 3/4/26 5:23 PM, Alexandre Courbot wrote:
>> On Thu Mar 5, 2026 at 4:14 AM JST, John Hubbard wrote:
>>> On 3/4/26 11:04 AM, Gary Guo wrote:
>>>> On Wed Mar 4, 2026 at 6:53 PM GMT, John Hubbard wrote:
>>>>> On 3/4/26 3:18 AM, Gary Guo wrote:
>>>>>> On Wed Mar 4, 2026 at 3:47 AM GMT, John Hubbard wrote:
>>>>> ...
>>>>> +#[inline(always)]
>>>>> +pub const fn const_align_up<const ALIGN: usize>(value: usize) -> Option<usize> {
>>>>> +    const { assert!(ALIGN.is_power_of_two(), "ALIGN must be a power of two") };
>>>>> +    match value.checked_add(ALIGN - 1) {
>>>>> +        Some(v) => Some(v & !(ALIGN - 1)),
>>>>> +        None => None,
>>>>> +    }
>>>>> +}
>>>>
>>>> I think your signature should probably just be
>>>>
>>>> pub const fn const_align_up(value: usize, align: Alignment) -> Option<usize> {
>>>>     ...
>>>> }
>>>>
>>>
>>> OK yes that's a bit nicer. I've done that for v6, thanks!
>> 
>> Hold on a bit - if we are purposing this new method for use in const
>> contexts, what use do we have for a `None` return value? By definition
>> we would know both `value` and `align` and thus the result is
>> deterministic.
>> 
>> We do have an alignment method for non-const contexts already. Gary's
>> initial comment was:
>> 
>>> Either this function is always used in const context, in which case
>>> you take `ALIGN` as normal function parameter and use `build_assert` and
>>> `build_error`
>> 
>> So why not make both arguments generic in this new method, and fail at
>> build in case of overflow? 
>
> At this point, it is completely impossible to write a patch that complies
> with Gary, Danilo, and Alex. It's all over the map.

IIUC it is possible. Let's summarize the constraints:

- Gary wants to avoid a panic in case this gets called at runtime,
- Danilo suggested returning a Result that can be discarded in const
  context (but took that suggestion back as we already have methods for
  non-const contexts and thus wouldn't bring any benefit),
- I also pointed out that there is no reason to have a failure path for
  const context and suggested two generic arguments.

So here is what I had in mind, if using a standalone function:

  pub const fn const_align_up<const ALIGN: usize, const VALUE: usize>() -> usize {
      const { assert!(ALIGN.is_power_of_two(), "ALIGN must be a power of two") };
      const {
          assert!(
              VALUE <= usize::MAX - (ALIGN - 1),
              "requested alignment would overflow"
          )
      };

      (VALUE + (ALIGN - 1)) & !(ALIGN - 1)
  }

  const TEST_ALIGN: usize = const_align_up::<256, 10>();

This uses purely const asserts, but you have to work with two `usize`
arguments. The version below looks a bit nicer as it leverages the
power-of-two invariant of `Alignment`:

  impl Alignment {
      const fn const_align_up(self, value: usize) -> usize {
          build_assert!(value <= usize::MAX - !self.mask());

          (value + !self.mask()) & self.mask()
      }
  }

  const TEST_ALIGN2: usize = Alignment::new::<256>().const_align_up(10);

It has to trade the const asserts for `build_assert`, which could cause
these cryptic error messages if called in a non-const context, so we
should document that this is only to be called in const contexts. But
otherwise it fits the bill and looks reasonable imho.

Unfortunate that we cannot make it generic against all integer types
without `const_trait_impl`, but generating `const_align_usize_up`,
`const_align_u32_up`... etc using a macro should be doable if needed.
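
A rough sketch of what such a macro could look like, written as standalone
Rust rather than against the kernel crate. The macro name
`impl_const_align_up` and the `u64` variant are hypothetical, and the
alignment is taken as a plain integer here (an assumption; an in-kernel
version would presumably take `Alignment` and could fail at build time
instead of returning `Option`):

```rust
// Hypothetical macro: generates one const alignment helper per integer type.
// Names and signatures are illustrative only, not existing kernel API.
macro_rules! impl_const_align_up {
    ($($name:ident => $t:ty),* $(,)?) => {
        $(
            /// Rounds `value` up to a multiple of `align`.
            ///
            /// Returns `None` if `align` is not a power of two or if the
            /// rounding would overflow.
            pub const fn $name(value: $t, align: $t) -> Option<$t> {
                if !align.is_power_of_two() {
                    return None;
                }
                match value.checked_add(align - 1) {
                    Some(v) => Some(v & !(align - 1)),
                    None => None,
                }
            }
        )*
    };
}

impl_const_align_up!(
    const_align_usize_up => usize,
    const_align_u32_up => u32,
    const_align_u64_up => u64,
);

// In const context the `None` case can be turned into a compile-time error.
pub const ALIGNED_U32: u32 = match const_align_u32_up(10, 256) {
    Some(v) => v,
    None => panic!("alignment failed"),
};
```

Each generated function is an ordinary `const fn`, so it needs neither
`const_trait_impl` nor const generic parameters.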

Oh and if this cannot reach consensus I am ok with just dropping this
patch for now and doing something like this in the next one:

  const SZ_128K_ALIGN_MASK: usize = Alignment::new::<SZ_128K>().mask();

  const PMU_RESERVED_SIZE: usize = SZ_8M + SZ_16M + SZ_4K;
  // Align to 128K.
  const PMU_RESERVED_SIZE_ALIGNED: u32 = num::usize_into_u32::<
      { (PMU_RESERVED_SIZE + !SZ_128K_ALIGN_MASK) & SZ_128K_ALIGN_MASK },
  >();
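
For reference, the arithmetic in that fallback can be checked with a
standalone sketch. The `SZ_*` constants are redefined locally, and the mask
is assumed to be `!(align - 1)`, which is what the `(value + !mask) & mask`
usage above implies for `Alignment::mask()`. The final `usize_into_u32`
conversion is left out:

```rust
// Local stand-ins for the kernel's size constants.
pub const SZ_4K: usize = 4 << 10;
pub const SZ_128K: usize = 128 << 10;
pub const SZ_8M: usize = 8 << 20;
pub const SZ_16M: usize = 16 << 20;

// Assumed equivalent of `Alignment::new::<SZ_128K>().mask()`.
pub const SZ_128K_ALIGN_MASK: usize = !(SZ_128K - 1);

pub const PMU_RESERVED_SIZE: usize = SZ_8M + SZ_16M + SZ_4K;

// Round up to the next 128K boundary using only const arithmetic.
pub const PMU_RESERVED_SIZE_ALIGNED: usize =
    (PMU_RESERVED_SIZE + !SZ_128K_ALIGN_MASK) & SZ_128K_ALIGN_MASK;

// 24M is already a multiple of 128K, so the extra 4K rounds up by one
// full 128K step. Both checks are evaluated at compile time.
const _: () = assert!(PMU_RESERVED_SIZE_ALIGNED % SZ_128K == 0);
const _: () = assert!(PMU_RESERVED_SIZE_ALIGNED == SZ_8M + SZ_16M + SZ_128K);
```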


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature
  2026-03-05  7:07                           ` Alexandre Courbot
@ 2026-03-05 12:28                             ` Gary Guo
  2026-03-05 12:36                               ` Danilo Krummrich
  2026-03-05 13:59                               ` Alexandre Courbot
  0 siblings, 2 replies; 68+ messages in thread
From: Gary Guo @ 2026-03-05 12:28 UTC (permalink / raw)
  To: Alexandre Courbot, John Hubbard
  Cc: Gary Guo, Danilo Krummrich, Joel Fernandes, Alistair Popple,
	Eliot Courtney, Zhi Wang, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alice Ryhl, Trevor Gross, nouveau,
	rust-for-linux, LKML

On Thu Mar 5, 2026 at 7:07 AM GMT, Alexandre Courbot wrote:
> On Thu Mar 5, 2026 at 10:31 AM JST, John Hubbard wrote:
>> On 3/4/26 5:23 PM, Alexandre Courbot wrote:
>>> On Thu Mar 5, 2026 at 4:14 AM JST, John Hubbard wrote:
>>>> On 3/4/26 11:04 AM, Gary Guo wrote:
>>>>> On Wed Mar 4, 2026 at 6:53 PM GMT, John Hubbard wrote:
>>>>>> On 3/4/26 3:18 AM, Gary Guo wrote:
>>>>>>> On Wed Mar 4, 2026 at 3:47 AM GMT, John Hubbard wrote:
>>>>>> ...
>>>>>> +#[inline(always)]
>>>>>> +pub const fn const_align_up<const ALIGN: usize>(value: usize) -> Option<usize> {
>>>>>> +    const { assert!(ALIGN.is_power_of_two(), "ALIGN must be a power of two") };
>>>>>> +    match value.checked_add(ALIGN - 1) {
>>>>>> +        Some(v) => Some(v & !(ALIGN - 1)),
>>>>>> +        None => None,
>>>>>> +    }
>>>>>> +}
>>>>>
>>>>> I think your signature should probably just be
>>>>>
>>>>> pub const fn const_align_up(value: usize, align: Alignment) -> Option<usize> {
>>>>>     ...
>>>>> }
>>>>>
>>>>
>>>> OK yes that's a bit nicer. I've done that for v6, thanks!
>>> 
>>> Hold on a bit - if we are purposing this new method for use in const
>>> contexts, what use do we have for a `None` return value? By definition
>>> we would know both `value` and `align` and thus the result is
>>> deterministic.
>>> 
>>> We do have an alignment method for non-const contexts already. Gary's
>>> initial comment was:
>>> 
>>>> Either this function is always used in const context, in which case
>>>> you take `ALIGN` as normal function parameter and use `build_assert` and
>>>> `build_error`
>>> 
>>> So why not make both arguments generic in this new method, and fail at
>>> build in case of overflow? 
>>
>> At this point, it is completely impossible to write a patch that complies
>> with Gary, Danilo, and Alex. It's all over the map.
>
> IIUC it is possible. Let's summarize the constraints:
>
> - Gary wants to avoid a panic in case this gets called at runtime,
> - Danilo suggested returning a Result that can be discarded in const
>   context (but took that suggestion back as we already have methods for
>   non-const contexts and thus wouldn't bring any benefit),
> - I also pointed out that there is not reason to have a failure path for
>   const context and suggested two generic arguments.
>
> So here is what I had in mind, if using a standalone function:
>
>   pub const fn const_align_up<const ALIGN: usize, const VALUE: usize>() -> usize {
>       const { assert!(ALIGN.is_power_of_two(), "ALIGN must be a power of two") };
>       const {
>           assert!(
>               VALUE <= usize::MAX - (ALIGN - 1),
>               "requested alignment would overflow"
>           )
>       };
>
>       (VALUE + (ALIGN - 1)) & !(ALIGN - 1)
>   }

Eh, no, please don't start putting everything into const generic params. These are
severely limited in usability: you can only refer to actual constant values, and
won't be able to use them in, say, const functions.

If we're going down this route I'd just want

    pub const fn const_align_up(align: usize, value: usize) -> usize

and use build asserts inside. If this is only used in const, then using
`build_assert!` is perfectly fine.

I think John just wants to express `usize.align_up(Alignment)` in const context,
which we can't do with stable features only right now, hence I suggested a
specific const function that has the same signature as the extension trait
method.

Returning an `Option` isn't an issue in const contexts, you can just use
`Option::unwrap` which is const (might need to enable a feature in 1.78, but it
has been stable for a while now).

So you just have

    const TEST_ALIGN: usize = const_align_up(10, Alignment::new::<256>()).unwrap();

which would become

    const TEST_ALIGN: usize = 10.align_up(Alignment::new::<256>()).unwrap();

when we have const trait impl.
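
To make that concrete, here is a minimal standalone sketch. The `Alignment`
type below is a simplified stand-in for `kernel::ptr::Alignment` (an
assumption for illustration, not the kernel implementation), and it assumes
a toolchain where `Option::unwrap` is usable in const context:

```rust
/// Simplified stand-in for the kernel's `Alignment` type: a power of two
/// that is checked at compile time.
#[derive(Clone, Copy)]
pub struct Alignment(usize);

impl Alignment {
    pub const fn new<const ALIGN: usize>() -> Self {
        const { assert!(ALIGN.is_power_of_two(), "ALIGN must be a power of two") };
        Self(ALIGN)
    }

    pub const fn as_usize(self) -> usize {
        self.0
    }

    /// Mask that clears the low bits below the alignment.
    pub const fn mask(self) -> usize {
        !(self.0 - 1)
    }
}

/// Same shape as `Alignable::align_up`, but callable from const context.
pub const fn const_align_up(value: usize, align: Alignment) -> Option<usize> {
    match value.checked_add(align.as_usize() - 1) {
        Some(v) => Some(v & align.mask()),
        None => None,
    }
}

// In const context, unwrapping `None` is a compile-time error, not a
// runtime panic.
pub const TEST_ALIGN: usize = const_align_up(10, Alignment::new::<256>()).unwrap();

// This would be rejected at build time because the addition overflows:
// const BAD: usize = const_align_up(usize::MAX, Alignment::new::<256>()).unwrap();
```

A runtime caller still gets an `Option` and has to handle the overflow case
explicitly, so the function itself has no panic path.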

>
>   const TEST_ALIGN: usize = const_align_up::<256, 10>();
>
> This uses purely const asserts, but you have to work with two `usize`
> arguments. The version below looks a bit nicer as it leverages the
> power-of-two invariant of `Alignment`:
>
>   impl Alignment {
>       const fn const_align_up(self, value: usize) -> usize {
>           build_assert!(value <= usize::MAX - !self.mask());
>
>           (value + !self.mask()) & self.mask()
>       }

This is fine, too, although I think just returning an `Option` and asking the user to
unwrap it in const eval is better.

Best,
Gary

>   }
>
>   const TEST_ALIGN2: usize = Alignment::new::<256>().const_align_up(10);
>
> It has to trade the const asserts for `build_assert`, which could cause
> these cryptic error messages if called in a non-const context, so we
> should document that this is only to be called in const contexts. But
> otherwise it fits the bill and looks reasonable imho.
>
> Unfortunate that we cannot make it generic against all integer types
> without `const_trait_impl`, but generating `const_align_usize_up`,
> `const_align_u32_up`... etc using a macro should be doable if needed.
>
> Oh and if this cannot reach consensus I am ok with just dropping this
> patch for now and doing something like this in the next one:
>
>   const SZ_128K_ALIGN_MASK: usize = Alignment::new::<SZ_128K>().mask();
>
>   const PMU_RESERVED_SIZE: usize = SZ_8M + SZ_16M + SZ_4K;
>   // Align to 128K.
>   const PMU_RESERVED_SIZE_ALIGNED: u32 = num::usize_into_u32::<
>       { (PMU_RESERVED_SIZE + !SZ_128K_ALIGN_MASK) & SZ_128K_ALIGN_MASK },
>   >();
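
For reference, the open-coded fallback quoted above can be checked with plain
constants. This is only a sketch: the `SZ_*` values are redefined locally
instead of coming from `kernel::sizes`, the mask is written out by hand
instead of calling `Alignment::mask()`, and the `num::usize_into_u32` helper
is replaced by an explicit compile-time range check.

```rust
// Local stand-ins for the kernel's SZ_* constants (assumed values).
const SZ_4K: usize = 4 << 10;
const SZ_128K: usize = 128 << 10;
const SZ_8M: usize = 8 << 20;
const SZ_16M: usize = 16 << 20;

// Hand-written equivalent of `Alignment::new::<SZ_128K>().mask()`.
const SZ_128K_ALIGN_MASK: usize = !(SZ_128K - 1);

const PMU_RESERVED_SIZE: usize = SZ_8M + SZ_16M + SZ_4K;

// Round up to the next multiple of 128K: add (align - 1), then clear low bits.
const PMU_RESERVED_SIZE_ALIGNED_USIZE: usize =
    (PMU_RESERVED_SIZE + !SZ_128K_ALIGN_MASK) & SZ_128K_ALIGN_MASK;

// Compile-time check that the narrowing cast below cannot truncate.
const _: () = assert!(PMU_RESERVED_SIZE_ALIGNED_USIZE <= u32::MAX as usize);
const PMU_RESERVED_SIZE_ALIGNED: u32 = PMU_RESERVED_SIZE_ALIGNED_USIZE as u32;

fn main() {
    // 24M + 4K rounds up to 193 * 128K = 0x1820000.
    println!("{:#x}", PMU_RESERVED_SIZE_ALIGNED);
}
```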



* Re: [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature
  2026-03-05 12:28                             ` Gary Guo
@ 2026-03-05 12:36                               ` Danilo Krummrich
  2026-03-05 12:59                                 ` Gary Guo
  2026-03-05 13:59                               ` Alexandre Courbot
  1 sibling, 1 reply; 68+ messages in thread
From: Danilo Krummrich @ 2026-03-05 12:36 UTC (permalink / raw)
  To: Gary Guo
  Cc: Alexandre Courbot, John Hubbard, Joel Fernandes, Alistair Popple,
	Eliot Courtney, Zhi Wang, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alice Ryhl, Trevor Gross, nouveau,
	rust-for-linux, LKML

On Thu Mar 5, 2026 at 1:28 PM CET, Gary Guo wrote:
> This is fine, too, although I think just returning an `Option` and asking the
> user to unwrap it in const eval is better.

Well, that was my initial proposal (except that it was Result and
const_expect()).

IIRC, you were against this in favor of build_assert!()?


* Re: [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature
  2026-03-05 12:36                               ` Danilo Krummrich
@ 2026-03-05 12:59                                 ` Gary Guo
  0 siblings, 0 replies; 68+ messages in thread
From: Gary Guo @ 2026-03-05 12:59 UTC (permalink / raw)
  To: Danilo Krummrich, Gary Guo
  Cc: Alexandre Courbot, John Hubbard, Joel Fernandes, Alistair Popple,
	Eliot Courtney, Zhi Wang, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alice Ryhl, Trevor Gross, nouveau,
	rust-for-linux, LKML

On Thu Mar 5, 2026 at 12:36 PM GMT, Danilo Krummrich wrote:
> On Thu Mar 5, 2026 at 1:28 PM CET, Gary Guo wrote:
>> This is fine, too, although I think just returning an `Option` and asking
>> the user to unwrap it in const eval is better.
>
> Well, that was my initial proposal (except that it was Result and
> const_expect()).
>
> IIRC, you were against this in favor of build_assert!()?

I think I was just commenting about the API design, and not the idea in general.

My view is that we should either completely mimic the non-const API (just with
functions so it doesn't need const trait impl), or we should go all the way to
`build_assert!()`. Either is fine.
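
To make the second option concrete, here is a rough sketch of the infallible
method form. It is written for plain Rust, so `assert!` stands in for the
kernel's `build_assert!()` (an assumption: in a const context both reject a
bad value at build time, but a runtime call to this version would panic
instead of failing the build), and `Alignment` is a simplified stand-in type.

```rust
/// Hypothetical stand-in for the kernel's `Alignment`: wraps a power of two.
#[derive(Clone, Copy)]
pub struct Alignment(usize);

impl Alignment {
    /// Panics (a compile error in const contexts) unless `ALIGN` is a power
    /// of two.
    pub const fn new<const ALIGN: usize>() -> Self {
        assert!(ALIGN.is_power_of_two());
        Self(ALIGN)
    }

    /// Mask with the low bits cleared.
    pub const fn mask(self) -> usize {
        !(self.0 - 1)
    }

    /// Rounds `value` up to this alignment and returns it directly, with no
    /// `Option` for the caller to unwrap.
    pub const fn const_align_up(self, value: usize) -> usize {
        // Stand-in for `build_assert!`: rejects values that would overflow.
        assert!(value <= usize::MAX - !self.mask());

        (value + !self.mask()) & self.mask()
    }
}

const TEST_ALIGN2: usize = Alignment::new::<256>().const_align_up(10);

fn main() {
    println!("TEST_ALIGN2 = {TEST_ALIGN2}");
}
```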

Best,
Gary


* Re: [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature
  2026-03-05 12:28                             ` Gary Guo
  2026-03-05 12:36                               ` Danilo Krummrich
@ 2026-03-05 13:59                               ` Alexandre Courbot
  2026-03-05 14:05                                 ` Gary Guo
  1 sibling, 1 reply; 68+ messages in thread
From: Alexandre Courbot @ 2026-03-05 13:59 UTC (permalink / raw)
  To: Gary Guo
  Cc: John Hubbard, Danilo Krummrich, Joel Fernandes, Alistair Popple,
	Eliot Courtney, Zhi Wang, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alice Ryhl, Trevor Gross, nouveau,
	rust-for-linux, LKML

On Thu Mar 5, 2026 at 9:28 PM JST, Gary Guo wrote:
>>
>>   const TEST_ALIGN: usize = const_align_up::<256, 10>();
>>
>> This uses purely const asserts, but you have to work with two `usize`
>> arguments. The version below looks a bit nicer as it leverages the
>> power-of-two invariant of `Alignment`:
>>
>>   impl Alignment {
>>       const fn const_align_up(self, value: usize) -> usize {
>>           build_assert!(value <= usize::MAX - !self.mask());
>>
>>           (value + !self.mask()) & self.mask()
>>       }
>
> This is fine, too, although I think just returning an `Option` and asking the
> user to unwrap it in const eval is better.

Why? Aren't unwraps something we want to avoid?

We already have fallible methods for non-const contexts, so why give
another method that essentially behaves the same when we want to use it
in scenarios where we know the result will be successful anyway?


* Re: [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature
  2026-03-05 13:59                               ` Alexandre Courbot
@ 2026-03-05 14:05                                 ` Gary Guo
  2026-03-05 15:17                                   ` Alexandre Courbot
  0 siblings, 1 reply; 68+ messages in thread
From: Gary Guo @ 2026-03-05 14:05 UTC (permalink / raw)
  To: Alexandre Courbot, Gary Guo
  Cc: John Hubbard, Danilo Krummrich, Joel Fernandes, Alistair Popple,
	Eliot Courtney, Zhi Wang, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alice Ryhl, Trevor Gross, nouveau,
	rust-for-linux, LKML

On Thu Mar 5, 2026 at 1:59 PM GMT, Alexandre Courbot wrote:
> On Thu Mar 5, 2026 at 9:28 PM JST, Gary Guo wrote:
>>>
>>>   const TEST_ALIGN: usize = const_align_up::<256, 10>();
>>>
>>> This uses purely const asserts, but you have to work with two `usize`
>>> arguments. The version below looks a bit nicer as it leverages the
>>> power-of-two invariant of `Alignment`:
>>>
>>>   impl Alignment {
>>>       const fn const_align_up(self, value: usize) -> usize {
>>>           build_assert!(value <= usize::MAX - !self.mask());
>>>
>>>           (value + !self.mask()) & self.mask()
>>>       }
>>
>> This is fine, too, although I think just returning an `Option` and asking
>> the user to unwrap it in const eval is better.
>
> Why? Aren't unwraps something we want to avoid?
>
> We already have fallible methods for non-const contexts, so why give
> another method that essentially behaves the same when we want to use it
> in scenarios where we know the result will be successful anyway?

Unwrap is only bad when it can panic at runtime. It's not an issue for things
that are evidently const-eval only, e.g. inside `const {}` or
`const FOO: Bar = baz`.
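
A small sketch of that distinction, using a hypothetical `checked_align_up`
helper that takes a raw `usize` alignment (not an API from this series): in a
`const` item the `unwrap()` is resolved by the compiler, so a `None` turns
into a build error, while a runtime caller handles the `None` explicitly.

```rust
/// Hypothetical helper: rounds `value` up to `align`, or returns `None` if
/// `align` is not a power of two or the result would overflow `usize`.
const fn checked_align_up(value: usize, align: usize) -> Option<usize> {
    if !align.is_power_of_two() {
        return None;
    }
    let low_bits = align - 1;
    match value.checked_add(low_bits) {
        Some(v) => Some(v & !low_bits),
        None => None,
    }
}

// Const-eval only: the unwrap is evaluated at build time and leaves no panic
// path in the compiled code.
const HEAP_ALIGNED: usize = checked_align_up(10, 256).unwrap();

// Uncommenting the next line makes the build fail, because the unwrap of
// `None` is hit during const evaluation:
// const BAD: usize = checked_align_up(usize::MAX, 256).unwrap();

fn main() {
    println!("HEAP_ALIGNED = {HEAP_ALIGNED}");

    // Runtime use: the value is not known at build time, so no unwrap here.
    let runtime_value = std::env::args().count();
    match checked_align_up(runtime_value, 256) {
        Some(v) => println!("aligned: {v}"),
        None => println!("alignment would overflow"),
    }
}
```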

Best,
Gary


* Re: [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature
  2026-03-05 14:05                                 ` Gary Guo
@ 2026-03-05 15:17                                   ` Alexandre Courbot
  0 siblings, 0 replies; 68+ messages in thread
From: Alexandre Courbot @ 2026-03-05 15:17 UTC (permalink / raw)
  To: Gary Guo
  Cc: John Hubbard, Danilo Krummrich, Joel Fernandes, Alistair Popple,
	Eliot Courtney, Zhi Wang, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alice Ryhl, Trevor Gross, nouveau,
	rust-for-linux, LKML

On Thu Mar 5, 2026 at 11:05 PM JST, Gary Guo wrote:
> On Thu Mar 5, 2026 at 1:59 PM GMT, Alexandre Courbot wrote:
>> On Thu Mar 5, 2026 at 9:28 PM JST, Gary Guo wrote:
>>>>
>>>>   const TEST_ALIGN: usize = const_align_up::<256, 10>();
>>>>
>>>> This uses purely const asserts, but you have to work with two `usize`
>>>> arguments. The version below looks a bit nicer as it leverages the
>>>> power-of-two invariant of `Alignment`:
>>>>
>>>>   impl Alignment {
>>>>       const fn const_align_up(self, value: usize) -> usize {
>>>>           build_assert!(value <= usize::MAX - !self.mask());
>>>>
>>>>           (value + !self.mask()) & self.mask()
>>>>       }
>>>
>>> This is fine, too, although I think just returning an `Option` and asking
>>> the user to unwrap it in const eval is better.
>>
>> Why? Aren't unwraps something we want to avoid?
>>
>> We already have fallible methods for non-const contexts, so why give
>> another method that essentially behaves the same when we want to use it
>> in scenarios where we know the result will be successful anyway?
>
> Unwrap is only bad when it can panic at runtime. It's not an issue for things
> that are evidently const-eval only, e.g. inside `const {}` or
> `const FOO: Bar = baz`.

If we go that route we are just providing const versions of the
`align_up` methods of `Alignable`. Why not, but it looks very redundant
to me.


Thread overview: 68+ messages
2026-02-21  2:09 [PATCH v5 00/38] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
2026-02-21  2:09 ` [PATCH v5 01/38] gpu: nova-core: fix aux device registration for multi-GPU systems John Hubbard
2026-02-24 14:47   ` Danilo Krummrich
2026-02-27 15:37     ` Gary Guo
2026-02-27 15:41       ` Gary Guo
2026-02-27 16:05         ` Danilo Krummrich
2026-02-27 16:29           ` John Hubbard
2026-02-21  2:09 ` [PATCH v5 02/38] gpu: nova-core: pass pdev directly to dev_* logging macros John Hubbard
2026-02-21  2:09 ` [PATCH v5 03/38] gpu: nova-core: print FB sizes, along with ranges John Hubbard
2026-02-21  2:09 ` [PATCH v5 04/38] gpu: nova-core: add FbRange.len() and use it in boot.rs John Hubbard
2026-02-21  2:09 ` [PATCH v5 05/38] gpu: nova-core: Hopper/Blackwell: basic GPU identification John Hubbard
2026-02-21  2:09 ` [PATCH v5 06/38] gpu: nova-core: factor .fwsignature* selection into a new find_gsp_sigs_section() John Hubbard
2026-02-21  2:09 ` [PATCH v5 07/38] gpu: nova-core: use GPU Architecture to simplify HAL selections John Hubbard
2026-02-21  2:09 ` [PATCH v5 08/38] gpu: nova-core: apply the one "use" item per line policy to commands.rs John Hubbard
2026-02-21  2:09 ` [PATCH v5 09/38] gpu: nova-core: move GPU init and DMA mask setup into Gpu::new() John Hubbard
2026-02-21  2:09 ` [PATCH v5 10/38] gpu: nova-core: set DMA mask width based on GPU architecture John Hubbard
2026-02-21  2:09 ` [PATCH v5 11/38] gpu: nova-core: Hopper/Blackwell: skip GFW boot waiting John Hubbard
2026-02-21  2:09 ` [PATCH v5 12/38] gpu: nova-core: move firmware image parsing code to firmware.rs John Hubbard
2026-02-21  2:09 ` [PATCH v5 13/38] gpu: nova-core: factor out an elf_str() function John Hubbard
2026-02-21  2:09 ` [PATCH v5 14/38] gpu: nova-core: don't assume 64-bit firmware images John Hubbard
2026-02-21  2:09 ` [PATCH v5 15/38] gpu: nova-core: add support for 32-bit " John Hubbard
2026-02-21  2:09 ` [PATCH v5 16/38] gpu: nova-core: add auto-detection of 32-bit, 64-bit " John Hubbard
2026-02-21  2:09 ` [PATCH v5 17/38] gpu: nova-core: Hopper/Blackwell: add FMC firmware image, in support of FSP John Hubbard
2026-02-21  2:09 ` [PATCH v5 18/38] gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub John Hubbard
2026-02-21  2:09 ` [PATCH v5 19/38] gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations John Hubbard
2026-02-21  2:09 ` [PATCH v5 20/38] gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure John Hubbard
2026-02-21  2:09 ` [PATCH v5 21/38] rust: ptr: add const_align_up() and enable inline_const feature John Hubbard
2026-02-21 20:50   ` Miguel Ojeda
2026-02-22 19:03     ` John Hubbard
2026-02-22 19:08       ` Miguel Ojeda
2026-02-23  3:36         ` Alexandre Courbot
2026-02-22  7:46   ` Gary Guo
2026-02-22 19:04     ` John Hubbard
2026-02-23 11:07       ` Danilo Krummrich
2026-02-23 14:16         ` Gary Guo
2026-02-23 14:20           ` Danilo Krummrich
2026-03-04  3:47             ` John Hubbard
2026-03-04 11:18               ` Gary Guo
2026-03-04 18:53                 ` John Hubbard
2026-03-04 19:04                   ` Gary Guo
2026-03-04 19:14                     ` John Hubbard
2026-03-05  1:23                       ` Alexandre Courbot
2026-03-05  1:31                         ` John Hubbard
2026-03-05  7:07                           ` Alexandre Courbot
2026-03-05 12:28                             ` Gary Guo
2026-03-05 12:36                               ` Danilo Krummrich
2026-03-05 12:59                                 ` Gary Guo
2026-03-05 13:59                               ` Alexandre Courbot
2026-03-05 14:05                                 ` Gary Guo
2026-03-05 15:17                                   ` Alexandre Courbot
2026-02-23 11:23   ` Alice Ryhl
2026-02-21  2:09 ` [PATCH v5 22/38] gpu: nova-core: Hopper/Blackwell: calculate reserved FB heap size John Hubbard
2026-02-21  2:09 ` [PATCH v5 23/38] gpu: nova-core: add MCTP/NVDM protocol types for firmware communication John Hubbard
2026-02-21  2:09 ` [PATCH v5 24/38] gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting John Hubbard
2026-02-21  2:09 ` [PATCH v5 25/38] gpu: nova-core: Hopper/Blackwell: add FSP message structures John Hubbard
2026-02-21  2:09 ` [PATCH v5 26/38] gpu: nova-core: Hopper/Blackwell: add FMC signature extraction John Hubbard
2026-02-21  2:09 ` [PATCH v5 27/38] gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging John Hubbard
2026-02-21  2:09 ` [PATCH v5 28/38] gpu: nova-core: Hopper/Blackwell: add FspCotVersion type John Hubbard
2026-02-21  2:09 ` [PATCH v5 29/38] gpu: nova-core: Hopper/Blackwell: larger non-WPR heap John Hubbard
2026-02-21  2:09 ` [PATCH v5 30/38] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot John Hubbard
2026-02-21  2:09 ` [PATCH v5 31/38] gpu: nova-core: Blackwell: use correct sysmem flush registers John Hubbard
2026-02-21  2:09 ` [PATCH v5 32/38] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap John Hubbard
2026-02-21  2:09 ` [PATCH v5 33/38] gpu: nova-core: refactor SEC2 booter loading into BooterFirmware::run() John Hubbard
2026-02-21  2:09 ` [PATCH v5 34/38] gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling John Hubbard
2026-02-21  2:09 ` [PATCH v5 35/38] gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror John Hubbard
2026-02-21  2:09 ` [PATCH v5 36/38] gpu: nova-core: Hopper/Blackwell: integrate FSP boot path into boot() John Hubbard
2026-02-21  2:09 ` [PATCH v5 37/38] rust: sizes: add u64 variants of SZ_* constants John Hubbard
2026-02-21  2:09 ` [PATCH v5 38/38] gpu: nova-core: use SZ_*_U64 constants from kernel::sizes John Hubbard
