[PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support

* [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support
@ 2026-03-17 22:53 John Hubbard
  2026-03-17 22:53 ` [PATCH v7 01/31] gpu: nova-core: Hopper/Blackwell: basic GPU identification John Hubbard
                   ` (31 more replies)
  0 siblings, 32 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

This is based on today's drm-rust-next, which has Alex's register!()
macro series. A git branch is here:

    https://github.com/johnhubbard/linux/tree/nova-core-blackwell-v7

It's been re-tested on Turing, Ampere and Blackwell:

    NovaCore 0000:e1:00.0: GPU name: NVIDIA GeForce GTX 1650
    NovaCore 0000:e1:00.0: GPU name: NVIDIA RTX A4000
    NovaCore 0000:01:00.0: GPU name: NVIDIA RTX PRO 6000 Blackwell Max-Q Workstation Edition

Changes in v7:
* Rebased onto drm-rust-next (v7.0-rc4 based), which now includes
  Alexandre Courbot's Rust register!() series and the related generic
  I/O accessor and IoCapable changes.

* Dropped the v6 patches that are already in drm-rust-next: the
  aux-device fix, the pdev helper macro patch, and the one-item-per-line
  use cleanup.

* Reworked the GPU init pieces per review. DMA mask setup now stays in
  driver probe, with the mask width selected by GPU architecture, and
  the GFW boot policy now lives in a dedicated GPU HAL.

* Reworked firmware image parsing per review around a single ElfFormat
  trait with associated header types. Also added support for both ELF32
  and ELF64 images, with automatic format detection.

* Reworked the MCTP/NVDM protocol code to use bitfield! and typed
  accessors, removing the open-coded bit handling.

* Reworked the FSP messaging part of the series so that the message
  structures are introduced in the first patches that use them, instead
  of as a standalone dead-code-only patch. Also changed fmc_full to use
  KVec<u8> from the start.

* Split the WPR heap overflow handling out into a separate prep patch.
  That patch makes management_overhead() and wpr_heap_size() fallible,
  uses checked arithmetic, and leaves the larger WPR2 heap patch with
  only the Hopper and Blackwell sizing changes.

* Added a code comment documenting the Hopper and Blackwell PCI config
  mirror base change.
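
As a rough illustration of the ELF32/ELF64 auto-detection mentioned above:
the standard ELF identification bytes carry the magic "\x7fELF" at offset 0
and the class at offset 4 (EI_CLASS: 1 for 32-bit, 2 for 64-bit). The sketch
below is standalone userspace Rust, not code from the series. The names
ElfClass and detect_elf_class are hypothetical and are not the series'
ElfFormat trait.

```rust
/// ELF class as encoded in `e_ident[EI_CLASS]`.
/// Illustrative type only; the series uses its own `ElfFormat` trait.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ElfClass {
    Elf32,
    Elf64,
}

/// The four magic bytes that open every ELF image.
const ELF_MAGIC: [u8; 4] = [0x7f, b'E', b'L', b'F'];
/// Index of the class byte inside `e_ident`.
const EI_CLASS: usize = 4;

/// Returns the ELF class of `image`, or `None` if the image is too short,
/// lacks the ELF magic, or carries an unknown class byte.
pub fn detect_elf_class(image: &[u8]) -> Option<ElfClass> {
    if image.get(..4)? != &ELF_MAGIC[..] {
        return None;
    }
    match *image.get(EI_CLASS)? {
        1 => Some(ElfClass::Elf32), // ELFCLASS32
        2 => Some(ElfClass::Elf64), // ELFCLASS64
        _ => None,
    }
}
```

With this sketch, a caller would pick the 32-bit or 64-bit header layout
based on the returned class and reject the image on None.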

Changes in v6:

* Rebased onto drm-rust-next (v7.0-rc1 based).

* Dropped the first two patches from v5 (aux device fix and pdev
  macros), which have since been merged independently.

* const_align_up(): reworked per review from Gary Guo, Miguel Ojeda,
  and Danilo Krummrich: now returns Option<usize> instead of panicking,
  takes an Alignment argument instead of a const generic, and no longer
  needs the inline_const feature addition in scripts/Makefile.build.

* The rust/sizes and SZ_*_U64 patches from v5 are no longer included.
  I plan to post those as a separate series that depends on this one.
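
For readers who have not followed the const_align_up() discussion, here is
a minimal standalone sketch of the shape described above: a const fn that
takes an alignment value and returns Option<usize> instead of panicking on
overflow. The Alignment type here is a hypothetical stand-in for the
kernel's type, and the body is an assumption about one reasonable
implementation, not the code in rust/kernel/ptr.rs.

```rust
/// Power-of-two alignment.
/// Hypothetical stand-in for the kernel's `Alignment` type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Alignment(usize);

impl Alignment {
    /// Returns `None` unless `value` is a power of two.
    pub const fn new(value: usize) -> Option<Self> {
        if value.is_power_of_two() {
            Some(Self(value))
        } else {
            None
        }
    }

    pub const fn as_usize(self) -> usize {
        self.0
    }
}

/// Rounds `value` up to the next multiple of `align`.
/// Returns `None` if the rounded value would overflow `usize`.
pub const fn const_align_up(value: usize, align: Alignment) -> Option<usize> {
    // `align` is a non-zero power of two, so this cannot underflow.
    let mask = align.as_usize() - 1;
    match value.checked_add(mask) {
        Some(sum) => Some(sum & !mask),
        None => None,
    }
}
```

Returning Option pushes the overflow decision to the caller, which is the
same pattern as fallible size calculations built on checked arithmetic.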

Changes in v5:

* Rebased onto linux.git master.

* Split MCTP protocol into its own module and file.

* Many Rust idiom improvements, especially stronger typing and wider
  use of Result and Option.

* Cleaned up comments, print output, and error handling.

* Added const_align_up() to rust/ and used it in nova-core. This
  required enabling a Rust feature: inline_const, as recommended by
  Miguel Ojeda.

* Refactored several pieces, for example making Gpu::new() own Spec
  creation.

* Fixed three Delta::ZERO busy-polls (patches 21, 24, 31) to use
  non-zero sleep intervals; polling with a zero interval was a poor
  choice.

* Reduced GH100/GB100 HAL duplication. Made FSP_PKEY_SIZE/FSP_SIG_SIZE
  consistent across patches. Replaced fragile architecture checks with
  chipset.arch(). Renamed LIBOS_BLACKWELL.

* Narrowed the scope of some of the #![expect(dead_code)] cases,
  although that really only matters within the series, not once it is
  fully applied.
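
The busy-poll fix in the v5 notes above comes down to sleeping for a
non-zero interval between condition checks. The sketch below shows that
pattern in standalone userspace Rust with std::thread::sleep. The kernel
code uses its own polling helpers and Delta type, so poll_until and its
signature are hypothetical.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Polls `cond` until it returns true or `timeout` elapses, sleeping
/// `interval` between attempts. Hypothetical helper for illustration.
pub fn poll_until<F>(
    mut cond: F,
    interval: Duration,
    timeout: Duration,
) -> Result<(), &'static str>
where
    F: FnMut() -> bool,
{
    let deadline = Instant::now() + timeout;
    loop {
        if cond() {
            return Ok(());
        }
        if Instant::now() >= deadline {
            return Err("timed out");
        }
        // A non-zero interval yields the CPU between checks; a zero
        // interval would turn this loop into a busy-wait.
        sleep(interval);
    }
}
```

The condition is always checked once before the deadline test, so a
condition that is already true succeeds even with a zero timeout.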

John Hubbard (31):
  gpu: nova-core: Hopper/Blackwell: basic GPU identification
  gpu: nova-core: factor .fwsignature* selection into a new
    find_gsp_sigs_section()
  gpu: nova-core: use GPU Architecture to simplify HAL selections
  gpu: nova-core: move GPU init into Gpu::new()
  gpu: nova-core: set DMA mask width based on GPU architecture
  gpu: nova-core: Hopper/Blackwell: skip GFW boot waiting
  gpu: nova-core: move firmware image parsing code to firmware.rs
  gpu: nova-core: factor out an elf_str() function
  gpu: nova-core: don't assume 64-bit firmware images
  gpu: nova-core: add support for 32-bit firmware images
  gpu: nova-core: add auto-detection of 32-bit, 64-bit firmware images
  gpu: nova-core: Hopper/Blackwell: add FMC firmware image, in support
    of FSP
  gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub
  gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations
  gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure
  rust: ptr: add const_align_up()
  gpu: nova-core: Hopper/Blackwell: calculate reserved FB heap size
  gpu: nova-core: add MCTP/NVDM protocol types for firmware
    communication
  gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion
    waiting
  gpu: nova-core: Hopper/Blackwell: add FMC signature extraction
  gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging
  gpu: nova-core: Hopper/Blackwell: add FspCotVersion type
  gpu: nova-core: Hopper/Blackwell: larger non-WPR heap
  gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot
  gpu: nova-core: Blackwell: use correct sysmem flush registers
  gpu: nova-core: make WPR heap sizing fallible
  gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap
  gpu: nova-core: refactor SEC2 booter loading into
    BooterFirmware::run()
  gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling
  gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror
  gpu: nova-core: Hopper/Blackwell: integrate FSP boot path into boot()

 drivers/gpu/nova-core/driver.rs          |  28 +-
 drivers/gpu/nova-core/falcon.rs          |   1 +
 drivers/gpu/nova-core/falcon/fsp.rs      | 220 ++++++++++
 drivers/gpu/nova-core/falcon/hal.rs      |  20 +-
 drivers/gpu/nova-core/fb.rs              |  26 +-
 drivers/gpu/nova-core/fb/hal.rs          |  38 +-
 drivers/gpu/nova-core/fb/hal/ga102.rs    |   2 +-
 drivers/gpu/nova-core/fb/hal/gb100.rs    |  75 ++++
 drivers/gpu/nova-core/fb/hal/gb202.rs    |  62 +++
 drivers/gpu/nova-core/fb/hal/gh100.rs    |  38 ++
 drivers/gpu/nova-core/firmware.rs        | 204 +++++++++
 drivers/gpu/nova-core/firmware/booter.rs |  35 +-
 drivers/gpu/nova-core/firmware/fsp.rs    |  47 ++
 drivers/gpu/nova-core/firmware/gsp.rs    | 128 ++----
 drivers/gpu/nova-core/fsp.rs             | 527 +++++++++++++++++++++++
 drivers/gpu/nova-core/gpu.rs             |  86 +++-
 drivers/gpu/nova-core/gpu/hal.rs         |  54 +++
 drivers/gpu/nova-core/gsp/boot.rs        | 298 ++++++++++---
 drivers/gpu/nova-core/gsp/commands.rs    |   8 +-
 drivers/gpu/nova-core/gsp/fw.rs          |  83 +++-
 drivers/gpu/nova-core/gsp/fw/commands.rs |  20 +-
 drivers/gpu/nova-core/mctp.rs            | 119 +++++
 drivers/gpu/nova-core/nova_core.rs       |   2 +
 drivers/gpu/nova-core/regs.rs            |  96 +++++
 rust/kernel/ptr.rs                       |  24 ++
 25 files changed, 2001 insertions(+), 240 deletions(-)
 create mode 100644 drivers/gpu/nova-core/falcon/fsp.rs
 create mode 100644 drivers/gpu/nova-core/fb/hal/gb100.rs
 create mode 100644 drivers/gpu/nova-core/fb/hal/gb202.rs
 create mode 100644 drivers/gpu/nova-core/fb/hal/gh100.rs
 create mode 100644 drivers/gpu/nova-core/firmware/fsp.rs
 create mode 100644 drivers/gpu/nova-core/fsp.rs
 create mode 100644 drivers/gpu/nova-core/gpu/hal.rs
 create mode 100644 drivers/gpu/nova-core/mctp.rs

base-commit: d19ab42867ae7c68be84ed957d95712b7934773f
-- 
2.53.0

* [PATCH v7 01/31] gpu: nova-core: Hopper/Blackwell: basic GPU identification
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-17 22:53 ` [PATCH v7 02/31] gpu: nova-core: factor .fwsignature* selection into a new find_gsp_sigs_section() John Hubbard
                   ` (30 subsequent siblings)
  31 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Add Hopper (GH100) and Blackwell chipset identification, including
selection of the matching ELF .fwsignature_* sections.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/falcon/hal.rs   |  3 ++-
 drivers/gpu/nova-core/fb/hal.rs       |  5 ++---
 drivers/gpu/nova-core/firmware/gsp.rs |  9 +++++++--
 drivers/gpu/nova-core/gpu.rs          | 22 ++++++++++++++++++++++
 4 files changed, 33 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/nova-core/falcon/hal.rs b/drivers/gpu/nova-core/falcon/hal.rs
index a7e5ea8d0272..c7f12f2a7a35 100644
--- a/drivers/gpu/nova-core/falcon/hal.rs
+++ b/drivers/gpu/nova-core/falcon/hal.rs
@@ -80,7 +80,8 @@ pub(super) fn falcon_hal<E: FalconEngine + 'static>(
         TU102 | TU104 | TU106 | TU116 | TU117 => {
             KBox::new(tu102::Tu102::<E>::new(), GFP_KERNEL)? as KBox<dyn FalconHal<E>>
         }
-        GA102 | GA103 | GA104 | GA106 | GA107 | AD102 | AD103 | AD104 | AD106 | AD107 => {
+        GA102 | GA103 | GA104 | GA106 | GA107 | AD102 | AD103 | AD104 | AD106 | AD107 | GH100
+        | GB100 | GB102 | GB202 | GB203 | GB205 | GB206 | GB207 => {
             KBox::new(ga102::Ga102::<E>::new(), GFP_KERNEL)? as KBox<dyn FalconHal<E>>
         }
         _ => return Err(ENOTSUPP),
diff --git a/drivers/gpu/nova-core/fb/hal.rs b/drivers/gpu/nova-core/fb/hal.rs
index aba0abd8ee00..e709affaa7e8 100644
--- a/drivers/gpu/nova-core/fb/hal.rs
+++ b/drivers/gpu/nova-core/fb/hal.rs
@@ -34,8 +34,7 @@ pub(super) fn fb_hal(chipset: Chipset) -> &'static dyn FbHal {
     match chipset {
         TU102 | TU104 | TU106 | TU117 | TU116 => tu102::TU102_HAL,
         GA100 => ga100::GA100_HAL,
-        GA102 | GA103 | GA104 | GA106 | GA107 | AD102 | AD103 | AD104 | AD106 | AD107 => {
-            ga102::GA102_HAL
-        }
+        GA102 | GA103 | GA104 | GA106 | GA107 | AD102 | AD103 | AD104 | AD106 | AD107 | GH100
+        | GB100 | GB102 | GB202 | GB203 | GB205 | GB206 | GB207 => ga102::GA102_HAL,
     }
 }
diff --git a/drivers/gpu/nova-core/firmware/gsp.rs b/drivers/gpu/nova-core/firmware/gsp.rs
index 9488a626352f..c1f0a606f5c0 100644
--- a/drivers/gpu/nova-core/firmware/gsp.rs
+++ b/drivers/gpu/nova-core/firmware/gsp.rs
@@ -213,8 +213,7 @@ pub(crate) fn new<'a>(
                 signatures: {
                     let sigs_section = match chipset.arch() {
                         Architecture::Turing
-                            if matches!(chipset, Chipset::TU116 | Chipset::TU117) =>
-                        {
+                            if matches!(chipset, Chipset::TU116 | Chipset::TU117) => {
                             ".fwsignature_tu11x"
                         }
                         Architecture::Turing => ".fwsignature_tu10x",
@@ -222,6 +221,12 @@ pub(crate) fn new<'a>(
                         Architecture::Ampere if chipset == Chipset::GA100 => ".fwsignature_tu10x",
                         Architecture::Ampere => ".fwsignature_ga10x",
                         Architecture::Ada => ".fwsignature_ad10x",
+                        Architecture::Hopper => ".fwsignature_gh10x",
+                        Architecture::Blackwell
+                            if matches!(chipset, Chipset::GB100 | Chipset::GB102) => {
+                            ".fwsignature_gb10x"
+                        }
+                        Architecture::Blackwell => ".fwsignature_gb20x",
                     };
 
                     elf::elf64_section(firmware.data(), sigs_section)
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 8579d632e717..3b4ccc3d18b9 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -83,12 +83,22 @@ fn try_from(value: u32) -> Result<Self, Self::Error> {
     GA104 = 0x174,
     GA106 = 0x176,
     GA107 = 0x177,
+    // Hopper
+    GH100 = 0x180,
     // Ada
     AD102 = 0x192,
     AD103 = 0x193,
     AD104 = 0x194,
     AD106 = 0x196,
     AD107 = 0x197,
+    // Blackwell
+    GB100 = 0x1a0,
+    GB102 = 0x1a2,
+    GB202 = 0x1b2,
+    GB203 = 0x1b3,
+    GB205 = 0x1b5,
+    GB206 = 0x1b6,
+    GB207 = 0x1b7,
 });
 
 impl Chipset {
@@ -100,9 +110,17 @@ pub(crate) const fn arch(self) -> Architecture {
             Self::GA100 | Self::GA102 | Self::GA103 | Self::GA104 | Self::GA106 | Self::GA107 => {
                 Architecture::Ampere
             }
+            Self::GH100 => Architecture::Hopper,
             Self::AD102 | Self::AD103 | Self::AD104 | Self::AD106 | Self::AD107 => {
                 Architecture::Ada
             }
+            Self::GB100
+            | Self::GB102
+            | Self::GB202
+            | Self::GB203
+            | Self::GB205
+            | Self::GB206
+            | Self::GB207 => Architecture::Blackwell,
         }
     }
 
@@ -139,7 +157,9 @@ pub(crate) enum Architecture {
     #[default]
     Turing = 0x16,
     Ampere = 0x17,
+    Hopper = 0x18,
     Ada = 0x19,
+    Blackwell = 0x1b,
 }
 
 impl TryFrom<u8> for Architecture {
@@ -149,7 +169,9 @@ fn try_from(value: u8) -> Result<Self> {
         match value {
             0x16 => Ok(Self::Turing),
             0x17 => Ok(Self::Ampere),
+            0x18 => Ok(Self::Hopper),
             0x19 => Ok(Self::Ada),
+            0x1b => Ok(Self::Blackwell),
             _ => Err(ENODEV),
         }
     }
-- 
2.53.0


* [PATCH v7 02/31] gpu: nova-core: factor .fwsignature* selection into a new find_gsp_sigs_section()
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
  2026-03-17 22:53 ` [PATCH v7 01/31] gpu: nova-core: Hopper/Blackwell: basic GPU identification John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-17 22:53 ` [PATCH v7 03/31] gpu: nova-core: use GPU Architecture to simplify HAL selections John Hubbard
                   ` (29 subsequent siblings)
  31 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Keep GspFirmware::new() from getting too cluttered by factoring out the
selection of the .fwsignature* section name. That selection logic will
continue to grow as we add GPUs.

Cc: Danilo Krummrich <dakr@kernel.org>
Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/firmware/gsp.rs | 36 ++++++++++++++-------------
 1 file changed, 19 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware/gsp.rs b/drivers/gpu/nova-core/firmware/gsp.rs
index c1f0a606f5c0..8bbc3809c640 100644
--- a/drivers/gpu/nova-core/firmware/gsp.rs
+++ b/drivers/gpu/nova-core/firmware/gsp.rs
@@ -146,6 +146,24 @@ pub(crate) struct GspFirmware {
 }
 
 impl GspFirmware {
+    fn find_gsp_sigs_section(chipset: Chipset) -> Option<&'static str> {
+        match chipset.arch() {
+            Architecture::Turing if matches!(chipset, Chipset::TU116 | Chipset::TU117) => {
+                Some(".fwsignature_tu11x")
+            }
+            Architecture::Turing => Some(".fwsignature_tu10x"),
+            // GA100 uses the same firmware as Turing
+            Architecture::Ampere if chipset == Chipset::GA100 => Some(".fwsignature_tu10x"),
+            Architecture::Ampere => Some(".fwsignature_ga10x"),
+            Architecture::Ada => Some(".fwsignature_ad10x"),
+            Architecture::Hopper => Some(".fwsignature_gh10x"),
+            Architecture::Blackwell if matches!(chipset, Chipset::GB100 | Chipset::GB102) => {
+                Some(".fwsignature_gb10x")
+            }
+            Architecture::Blackwell => Some(".fwsignature_gb20x"),
+        }
+    }
+
     /// Loads the GSP firmware binaries, map them into `dev`'s address-space, and creates the page
     /// tables expected by the GSP bootloader to load it.
     pub(crate) fn new<'a>(
@@ -211,23 +229,7 @@ pub(crate) fn new<'a>(
                 },
                 size,
                 signatures: {
-                    let sigs_section = match chipset.arch() {
-                        Architecture::Turing
-                            if matches!(chipset, Chipset::TU116 | Chipset::TU117) => {
-                            ".fwsignature_tu11x"
-                        }
-                        Architecture::Turing => ".fwsignature_tu10x",
-                        // GA100 uses the same firmware as Turing
-                        Architecture::Ampere if chipset == Chipset::GA100 => ".fwsignature_tu10x",
-                        Architecture::Ampere => ".fwsignature_ga10x",
-                        Architecture::Ada => ".fwsignature_ad10x",
-                        Architecture::Hopper => ".fwsignature_gh10x",
-                        Architecture::Blackwell
-                            if matches!(chipset, Chipset::GB100 | Chipset::GB102) => {
-                            ".fwsignature_gb10x"
-                        }
-                        Architecture::Blackwell => ".fwsignature_gb20x",
-                    };
+                    let sigs_section = Self::find_gsp_sigs_section(chipset).ok_or(ENOTSUPP)?;
 
                     elf::elf64_section(firmware.data(), sigs_section)
                         .ok_or(EINVAL)
-- 
2.53.0


* [PATCH v7 03/31] gpu: nova-core: use GPU Architecture to simplify HAL selections
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
  2026-03-17 22:53 ` [PATCH v7 01/31] gpu: nova-core: Hopper/Blackwell: basic GPU identification John Hubbard
  2026-03-17 22:53 ` [PATCH v7 02/31] gpu: nova-core: factor .fwsignature* selection into a new find_gsp_sigs_section() John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-17 22:53 ` [PATCH v7 04/31] gpu: nova-core: move GPU init into Gpu::new() John Hubbard
                   ` (28 subsequent siblings)
  31 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Replace per-chipset match arms with Architecture-based matching in the
falcon and FB HAL selection functions. This reduces the number of match
arms that need updating when new chipsets are added within an existing
architecture.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/falcon/hal.rs | 21 +++++++++++++--------
 drivers/gpu/nova-core/fb/hal.rs     | 17 +++++++++--------
 2 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/nova-core/falcon/hal.rs b/drivers/gpu/nova-core/falcon/hal.rs
index c7f12f2a7a35..721c82f6a831 100644
--- a/drivers/gpu/nova-core/falcon/hal.rs
+++ b/drivers/gpu/nova-core/falcon/hal.rs
@@ -9,7 +9,10 @@
         FalconBromParams,
         FalconEngine, //
     },
-    gpu::Chipset,
+    gpu::{
+        Architecture,
+        Chipset, //
+    },
 };
 
 mod ga102;
@@ -74,17 +77,19 @@ fn signature_reg_fuse_version(
 pub(super) fn falcon_hal<E: FalconEngine + 'static>(
     chipset: Chipset,
 ) -> Result<KBox<dyn FalconHal<E>>> {
-    use Chipset::*;
-
-    let hal = match chipset {
-        TU102 | TU104 | TU106 | TU116 | TU117 => {
+    let hal = match chipset.arch() {
+        Architecture::Turing => {
             KBox::new(tu102::Tu102::<E>::new(), GFP_KERNEL)? as KBox<dyn FalconHal<E>>
         }
-        GA102 | GA103 | GA104 | GA106 | GA107 | AD102 | AD103 | AD104 | AD106 | AD107 | GH100
-        | GB100 | GB102 | GB202 | GB203 | GB205 | GB206 | GB207 => {
+        // TODO: support GA100. Its boot sequence is a lot like Turing, except that it handles the
+        // FRTS steps differently (specifically, it skips FWSEC-FRTS).
+        Architecture::Ampere if chipset == Chipset::GA100 => return Err(ENOTSUPP),
+        Architecture::Ampere
+        | Architecture::Hopper
+        | Architecture::Ada
+        | Architecture::Blackwell => {
             KBox::new(ga102::Ga102::<E>::new(), GFP_KERNEL)? as KBox<dyn FalconHal<E>>
         }
-        _ => return Err(ENOTSUPP),
     };
 
     Ok(hal)
diff --git a/drivers/gpu/nova-core/fb/hal.rs b/drivers/gpu/nova-core/fb/hal.rs
index e709affaa7e8..d33ca0f96417 100644
--- a/drivers/gpu/nova-core/fb/hal.rs
+++ b/drivers/gpu/nova-core/fb/hal.rs
@@ -4,7 +4,10 @@
 
 use crate::{
     driver::Bar0,
-    gpu::Chipset, //
+    gpu::{
+        Architecture,
+        Chipset, //
+    },
 };
 
 mod ga100;
@@ -29,12 +32,10 @@ pub(crate) trait FbHal {
 
 /// Returns the HAL corresponding to `chipset`.
 pub(super) fn fb_hal(chipset: Chipset) -> &'static dyn FbHal {
-    use Chipset::*;
-
-    match chipset {
-        TU102 | TU104 | TU106 | TU117 | TU116 => tu102::TU102_HAL,
-        GA100 => ga100::GA100_HAL,
-        GA102 | GA103 | GA104 | GA106 | GA107 | AD102 | AD103 | AD104 | AD106 | AD107 | GH100
-        | GB100 | GB102 | GB202 | GB203 | GB205 | GB206 | GB207 => ga102::GA102_HAL,
+    match chipset.arch() {
+        Architecture::Turing => tu102::TU102_HAL,
+        Architecture::Ampere if chipset == Chipset::GA100 => ga100::GA100_HAL,
+        Architecture::Ampere => ga102::GA102_HAL,
+        Architecture::Ada | Architecture::Hopper | Architecture::Blackwell => ga102::GA102_HAL,
     }
 }
-- 
2.53.0
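
The Architecture-based selection in the patch above, including the guard
arm that special-cases GA100, can be tried out in isolation with the
following standalone sketch. It uses a reduced chipset list, and FbHalKind
is a hypothetical enum that stands in for the driver's &'static dyn FbHal
values.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Chipset {
    TU102,
    GA100,
    GA102,
    AD102,
    GH100,
    GB202,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Architecture {
    Turing,
    Ampere,
    Ada,
    Hopper,
    Blackwell,
}

impl Chipset {
    pub const fn arch(self) -> Architecture {
        match self {
            Self::TU102 => Architecture::Turing,
            Self::GA100 | Self::GA102 => Architecture::Ampere,
            Self::AD102 => Architecture::Ada,
            Self::GH100 => Architecture::Hopper,
            Self::GB202 => Architecture::Blackwell,
        }
    }
}

/// Hypothetical stand-in for the HAL objects returned by `fb_hal()`.
#[derive(Debug, PartialEq, Eq)]
pub enum FbHalKind {
    Tu102,
    Ga100,
    Ga102,
}

pub fn fb_hal_kind(chipset: Chipset) -> FbHalKind {
    match chipset.arch() {
        Architecture::Turing => FbHalKind::Tu102,
        // The guard must come before the plain Ampere arm to take effect.
        Architecture::Ampere if chipset == Chipset::GA100 => FbHalKind::Ga100,
        Architecture::Ampere
        | Architecture::Ada
        | Architecture::Hopper
        | Architecture::Blackwell => FbHalKind::Ga102,
    }
}
```

Because the match has no wildcard arm, adding a new Architecture variant
fails to compile until each selection function handles it, while adding a
chipset to an existing architecture needs no change here.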


* [PATCH v7 04/31] gpu: nova-core: move GPU init into Gpu::new()
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (2 preceding siblings ...)
  2026-03-17 22:53 ` [PATCH v7 03/31] gpu: nova-core: use GPU Architecture to simplify HAL selections John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-23 12:45   ` Alexandre Courbot
  2026-03-17 22:53 ` [PATCH v7 05/31] gpu: nova-core: set DMA mask width based on GPU architecture John Hubbard
                   ` (27 subsequent siblings)
  31 siblings, 1 reply; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Move Spec creation and the dev_info log from the driver's probe() into
Gpu::new(), so that GPU-specific identification lives in the Gpu
constructor.

Restructure Gpu::new() to use pin_init_scope wrapping try_pin_init!,
which allows running fallible setup code (Spec::new) before the
pin-initializer. Add Spec::chipset() accessor for use by later patches.

The DMA mask setup stays in probe() where the safety argument for
dma_set_mask_and_coherent is straightforward.

Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/gpu.rs | 49 +++++++++++++++++++++---------------
 1 file changed, 29 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 3b4ccc3d18b9..8f317d213908 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -102,7 +102,7 @@ fn try_from(value: u32) -> Result<Self, Self::Error> {
 });
 
 impl Chipset {
-    pub(crate) const fn arch(self) -> Architecture {
+    pub(crate) const fn arch(&self) -> Architecture {
         match self {
             Self::TU102 | Self::TU104 | Self::TU106 | Self::TU117 | Self::TU116 => {
                 Architecture::Turing
@@ -241,6 +241,10 @@ fn new(dev: &device::Device, bar: &Bar0) -> Result<Spec> {
             dev_err!(dev, "Unsupported chipset: {}\n", boot42);
         })
     }
+
+    pub(crate) fn chipset(&self) -> Chipset {
+        self.chipset
+    }
 }
 
 impl TryFrom<regs::NV_PMC_BOOT_42> for Spec {
@@ -289,32 +293,37 @@ pub(crate) fn new<'a>(
         devres_bar: Arc<Devres<Bar0>>,
         bar: &'a Bar0,
     ) -> impl PinInit<Self, Error> + 'a {
-        try_pin_init!(Self {
-            spec: Spec::new(pdev.as_ref(), bar).inspect(|spec| {
-                dev_info!(pdev,"NVIDIA ({})\n", spec);
-            })?,
+        pin_init::pin_init_scope(move || {
+            let spec = Spec::new(pdev.as_ref(), bar)?;
+            dev_info!(pdev, "NVIDIA ({})\n", spec);
+
+            let chipset = spec.chipset();
 
-            // We must wait for GFW_BOOT completion before doing any significant setup on the GPU.
-            _: {
-                gfw::wait_gfw_boot_completion(bar)
-                    .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
-            },
+            Ok(try_pin_init!(Self {
+                // We must wait for GFW_BOOT completion before doing any significant setup
+                // on the GPU.
+                _: {
+                    gfw::wait_gfw_boot_completion(bar)
+                        .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
+                },
 
-            sysmem_flush: SysmemFlush::register(pdev.as_ref(), bar, spec.chipset)?,
+                sysmem_flush: SysmemFlush::register(pdev.as_ref(), bar, chipset)?,
 
-            gsp_falcon: Falcon::new(
-                pdev.as_ref(),
-                spec.chipset,
-            )
-            .inspect(|falcon| falcon.clear_swgen0_intr(bar))?,
+                gsp_falcon: Falcon::new(
+                    pdev.as_ref(),
+                    chipset,
+                )
+                .inspect(|falcon| falcon.clear_swgen0_intr(bar))?,
 
-            sec2_falcon: Falcon::new(pdev.as_ref(), spec.chipset)?,
+                sec2_falcon: Falcon::new(pdev.as_ref(), chipset)?,
 
-            gsp <- Gsp::new(pdev),
+                gsp <- Gsp::new(pdev),
 
-            _: { gsp.boot(pdev, bar, spec.chipset, gsp_falcon, sec2_falcon)? },
+                _: { gsp.boot(pdev, bar, chipset, gsp_falcon, sec2_falcon)? },
 
-            bar: devres_bar,
+                bar: devres_bar,
+                spec,
+            }))
         })
     }
 
-- 
2.53.0


* Re: [PATCH v7 04/31] gpu: nova-core: move GPU init into Gpu::new()
  2026-03-17 22:53 ` [PATCH v7 04/31] gpu: nova-core: move GPU init into Gpu::new() John Hubbard
@ 2026-03-23 12:45   ` Alexandre Courbot
  2026-03-25  3:23     ` John Hubbard
  0 siblings, 1 reply; 66+ messages in thread
From: Alexandre Courbot @ 2026-03-23 12:45 UTC
  To: John Hubbard
  Cc: Danilo Krummrich, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On Wed Mar 18, 2026 at 7:53 AM JST, John Hubbard wrote:
> Move Spec creation and the dev_info log from the driver's probe() into
> Gpu::new(), so that GPU-specific identification lives in the Gpu
> constructor.

That's not what this patch does - Spec is already created in `Gpu::new()`.

>
> Restructure Gpu::new() to use pin_init_scope wrapping try_pin_init!,
> which allows running fallible setup code (Spec::new) before the
> pin-initializer. Add Spec::chipset() accessor for use by later patches.

What is missing is why we are doing this. It looks completely unneeded
even when looking at the following patches.

>
> The DMA mask setup stays in probe() where the safety argument for
> dma_set_mask_and_coherent is straightforward.

Knowing the history of the series I understand why this sentence is
here, but as of this revision it is not relevant.

>
> Cc: Danilo Krummrich <dakr@kernel.org>
> Cc: Gary Guo <gary@garyguo.net>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  drivers/gpu/nova-core/gpu.rs | 49 +++++++++++++++++++++---------------
>  1 file changed, 29 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
> index 3b4ccc3d18b9..8f317d213908 100644
> --- a/drivers/gpu/nova-core/gpu.rs
> +++ b/drivers/gpu/nova-core/gpu.rs
> @@ -102,7 +102,7 @@ fn try_from(value: u32) -> Result<Self, Self::Error> {
>  });
>  
>  impl Chipset {
> -    pub(crate) const fn arch(self) -> Architecture {
> +    pub(crate) const fn arch(&self) -> Architecture {

Why are we taking `self` by reference now? `Chipset` implements `Copy`
so this should not be needed.

>          match self {
>              Self::TU102 | Self::TU104 | Self::TU106 | Self::TU117 | Self::TU116 => {
>                  Architecture::Turing
> @@ -241,6 +241,10 @@ fn new(dev: &device::Device, bar: &Bar0) -> Result<Spec> {
>              dev_err!(dev, "Unsupported chipset: {}\n", boot42);
>          })
>      }
> +
> +    pub(crate) fn chipset(&self) -> Chipset {
> +        self.chipset
> +    }

Short doc comment please, even if the method name makes it obvious what
it does.

>  }
>  
>  impl TryFrom<regs::NV_PMC_BOOT_42> for Spec {
> @@ -289,32 +293,37 @@ pub(crate) fn new<'a>(
>          devres_bar: Arc<Devres<Bar0>>,
>          bar: &'a Bar0,
>      ) -> impl PinInit<Self, Error> + 'a {
> -        try_pin_init!(Self {
> -            spec: Spec::new(pdev.as_ref(), bar).inspect(|spec| {
> -                dev_info!(pdev,"NVIDIA ({})\n", spec);
> -            })?,
> +        pin_init::pin_init_scope(move || {
> +            let spec = Spec::new(pdev.as_ref(), bar)?;
> +            dev_info!(pdev, "NVIDIA ({})\n", spec);
> +
> +            let chipset = spec.chipset();
>  
> -            // We must wait for GFW_BOOT completion before doing any significant setup on the GPU.
> -            _: {
> -                gfw::wait_gfw_boot_completion(bar)
> -                    .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
> -            },
> +            Ok(try_pin_init!(Self {
> +                // We must wait for GFW_BOOT completion before doing any significant setup
> +                // on the GPU.
> +                _: {
> +                    gfw::wait_gfw_boot_completion(bar)
> +                        .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
> +                },
>  
> -            sysmem_flush: SysmemFlush::register(pdev.as_ref(), bar, spec.chipset)?,
> +                sysmem_flush: SysmemFlush::register(pdev.as_ref(), bar, chipset)?,
>  
> -            gsp_falcon: Falcon::new(
> -                pdev.as_ref(),
> -                spec.chipset,
> -            )
> -            .inspect(|falcon| falcon.clear_swgen0_intr(bar))?,
> +                gsp_falcon: Falcon::new(
> +                    pdev.as_ref(),
> +                    chipset,
> +                )
> +                .inspect(|falcon| falcon.clear_swgen0_intr(bar))?,
>  
> -            sec2_falcon: Falcon::new(pdev.as_ref(), spec.chipset)?,
> +                sec2_falcon: Falcon::new(pdev.as_ref(), chipset)?,
>  
> -            gsp <- Gsp::new(pdev),
> +                gsp <- Gsp::new(pdev),
>  
> -            _: { gsp.boot(pdev, bar, spec.chipset, gsp_falcon, sec2_falcon)? },
> +                _: { gsp.boot(pdev, bar, chipset, gsp_falcon, sec2_falcon)? },
>  
> -            bar: devres_bar,
> +                bar: devres_bar,
> +                spec,
> +            }))

This diff is basically a no-op? The commit message should help me make
sense of it but since it is incorrect, I can only guess - and I suspect
you are doing this dance because `Revision` and `Spec` do not implement
`Copy`.

`Spec` is 4 bytes, both it and `Revision` are prime candidates for a
`#[derive(Clone, Copy)]` (and `fmt::Debug` while we are at it).

I expect that just doing that will simplify the following patches as well.
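
To illustrate both points (by-value `self` on a `Copy` type, and deriving
`Copy` on small plain-data structs), here is a minimal standalone sketch. The
type and field definitions are simplified stand-ins assumed for illustration,
not the driver's actual definitions, and `build_gpu` is a hypothetical helper.

```rust
// Simplified, hypothetical stand-ins for the driver's types.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Architecture {
    Turing,
    Blackwell,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Chipset {
    TU102,
    GB202,
}

impl Chipset {
    // `Chipset` is `Copy`, so taking `self` by value costs nothing and
    // still works in a `const fn`.
    const fn arch(self) -> Architecture {
        match self {
            Self::TU102 => Architecture::Turing,
            Self::GB202 => Architecture::Blackwell,
        }
    }
}

// Field layout assumed for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Revision {
    major: u8,
    minor: u8,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Spec {
    chipset: Chipset,
    revision: Revision,
}

struct Gpu {
    spec: Spec,
    arch: Architecture,
}

// Hypothetical constructor: `spec` is stored in the struct and then read
// again. Without `Copy` on `Spec`, the second use would be rejected as a
// use of a moved value.
fn build_gpu(spec: Spec) -> Gpu {
    Gpu {
        spec,
        arch: spec.chipset.arch(),
    }
}
```

With these derives in place, a field initializer can keep using `spec.chipset`
after `spec` itself has been stored, so no local `chipset` binding or
surrounding scope is needed just to work around a move.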


* Re: [PATCH v7 04/31] gpu: nova-core: move GPU init into Gpu::new()
  2026-03-23 12:45   ` Alexandre Courbot
@ 2026-03-25  3:23     ` John Hubbard
  0 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-25  3:23 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Danilo Krummrich, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On 3/23/26 5:45 AM, Alexandre Courbot wrote:
> On Wed Mar 18, 2026 at 7:53 AM JST, John Hubbard wrote:
>> Move Spec creation and the dev_info log from the driver's probe() into
>> Gpu::new(), so that GPU-specific identification lives in the Gpu
>> constructor.
> 
> That's not what this patch does - Spec is already created in `Gpu::new()`.

Fixed in v8, along with all of the other comments below.

thanks,
-- 
John Hubbard
> 
>>
>> Restructure Gpu::new() to use pin_init_scope wrapping try_pin_init!,
>> which allows running fallible setup code (Spec::new) before the
>> pin-initializer. Add Spec::chipset() accessor for use by later patches.
> 
> What is missing is why we are doing this. It looks completely unneeded
> even when looking at the following patches.
> 
>>
>> The DMA mask setup stays in probe() where the safety argument for
>> dma_set_mask_and_coherent is straightforward.
> 
> Knowing the history of the series I understand why this sentence is
> here, but as of this revision it is not relevant.
> 
>>
>> Cc: Danilo Krummrich <dakr@kernel.org>
>> Cc: Gary Guo <gary@garyguo.net>
>> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
>> ---
>>  drivers/gpu/nova-core/gpu.rs | 49 +++++++++++++++++++++---------------
>>  1 file changed, 29 insertions(+), 20 deletions(-)
>>
>> diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
>> index 3b4ccc3d18b9..8f317d213908 100644
>> --- a/drivers/gpu/nova-core/gpu.rs
>> +++ b/drivers/gpu/nova-core/gpu.rs
>> @@ -102,7 +102,7 @@ fn try_from(value: u32) -> Result<Self, Self::Error> {
>>  });
>>  
>>  impl Chipset {
>> -    pub(crate) const fn arch(self) -> Architecture {
>> +    pub(crate) const fn arch(&self) -> Architecture {
> 
> Why are we taking `self` by reference now? `Chipset` implements `Copy`
> so this should not be needed.
> 
>>          match self {
>>              Self::TU102 | Self::TU104 | Self::TU106 | Self::TU117 | Self::TU116 => {
>>                  Architecture::Turing
>> @@ -241,6 +241,10 @@ fn new(dev: &device::Device, bar: &Bar0) -> Result<Spec> {
>>              dev_err!(dev, "Unsupported chipset: {}\n", boot42);
>>          })
>>      }
>> +
>> +    pub(crate) fn chipset(&self) -> Chipset {
>> +        self.chipset
>> +    }
> 
> Short doccomment please, even if it is obvious by the method name what
> it does.
> 
>>  }
>>  
>>  impl TryFrom<regs::NV_PMC_BOOT_42> for Spec {
>> @@ -289,32 +293,37 @@ pub(crate) fn new<'a>(
>>          devres_bar: Arc<Devres<Bar0>>,
>>          bar: &'a Bar0,
>>      ) -> impl PinInit<Self, Error> + 'a {
>> -        try_pin_init!(Self {
>> -            spec: Spec::new(pdev.as_ref(), bar).inspect(|spec| {
>> -                dev_info!(pdev,"NVIDIA ({})\n", spec);
>> -            })?,
>> +        pin_init::pin_init_scope(move || {
>> +            let spec = Spec::new(pdev.as_ref(), bar)?;
>> +            dev_info!(pdev, "NVIDIA ({})\n", spec);
>> +
>> +            let chipset = spec.chipset();
>>  
>> -            // We must wait for GFW_BOOT completion before doing any significant setup on the GPU.
>> -            _: {
>> -                gfw::wait_gfw_boot_completion(bar)
>> -                    .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
>> -            },
>> +            Ok(try_pin_init!(Self {
>> +                // We must wait for GFW_BOOT completion before doing any significant setup
>> +                // on the GPU.
>> +                _: {
>> +                    gfw::wait_gfw_boot_completion(bar)
>> +                        .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
>> +                },
>>  
>> -            sysmem_flush: SysmemFlush::register(pdev.as_ref(), bar, spec.chipset)?,
>> +                sysmem_flush: SysmemFlush::register(pdev.as_ref(), bar, chipset)?,
>>  
>> -            gsp_falcon: Falcon::new(
>> -                pdev.as_ref(),
>> -                spec.chipset,
>> -            )
>> -            .inspect(|falcon| falcon.clear_swgen0_intr(bar))?,
>> +                gsp_falcon: Falcon::new(
>> +                    pdev.as_ref(),
>> +                    chipset,
>> +                )
>> +                .inspect(|falcon| falcon.clear_swgen0_intr(bar))?,
>>  
>> -            sec2_falcon: Falcon::new(pdev.as_ref(), spec.chipset)?,
>> +                sec2_falcon: Falcon::new(pdev.as_ref(), chipset)?,
>>  
>> -            gsp <- Gsp::new(pdev),
>> +                gsp <- Gsp::new(pdev),
>>  
>> -            _: { gsp.boot(pdev, bar, spec.chipset, gsp_falcon, sec2_falcon)? },
>> +                _: { gsp.boot(pdev, bar, chipset, gsp_falcon, sec2_falcon)? },
>>  
>> -            bar: devres_bar,
>> +                bar: devres_bar,
>> +                spec,
>> +            }))
> 
> This diff is basically a no-op? The commit message should help me make
> sense of it but since it is incorrect, I can only guess - and I suspect
> you are doing this dance because `Revision` and `Spec` do not implement
> `Copy`.
> 
> `Spec` is 4 bytes, both it and `Revision` are prime candidates for a
> `#[derive(Clone, Copy)]` (and `fmt::Debug` while we are at it).
> 
> I expect that just doing that will simplify the following patches as well.




* [PATCH v7 05/31] gpu: nova-core: set DMA mask width based on GPU architecture
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (3 preceding siblings ...)
  2026-03-17 22:53 ` [PATCH v7 04/31] gpu: nova-core: move GPU init into Gpu::new() John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-23 13:02   ` Alexandre Courbot
  2026-03-17 22:53 ` [PATCH v7 06/31] gpu: nova-core: Hopper/Blackwell: skip GFW boot waiting John Hubbard
                   ` (26 subsequent siblings)
  31 siblings, 1 reply; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Replace the hardcoded 47-bit DMA mask with per-architecture values.
Hopper and Blackwell support 52-bit DMA addresses, while Turing,
Ampere, and Ada use 47-bit.

Add Architecture::dma_mask() as a const method with an exhaustive
match, so new architectures get a compile-time reminder to specify
their DMA mask width. Move Spec creation into probe() so the
architecture is known before setting the DMA mask, and pass the Spec
into Gpu::new().

Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/driver.rs | 28 +++++++--------
 drivers/gpu/nova-core/gpu.rs    | 60 +++++++++++++++++++--------------
 2 files changed, 47 insertions(+), 41 deletions(-)

diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs
index 84b0e1703150..41227d29934e 100644
--- a/drivers/gpu/nova-core/driver.rs
+++ b/drivers/gpu/nova-core/driver.rs
@@ -5,7 +5,6 @@
     device::Core,
     devres::Devres,
     dma::Device,
-    dma::DmaMask,
     pci,
     pci::{
         Class,
@@ -23,7 +22,10 @@
     },
 };
 
-use crate::gpu::Gpu;
+use crate::gpu::{
+    Gpu,
+    Spec, //
+};
 
 /// Counter for generating unique auxiliary device IDs.
 static AUXILIARY_ID_COUNTER: Atomic<u32> = Atomic::new(0);
@@ -38,14 +40,6 @@ pub(crate) struct NovaCore {
 
 const BAR0_SIZE: usize = SZ_16M;
 
-// For now we only support Ampere which can use up to 47-bit DMA addresses.
-//
-// TODO: Add an abstraction for this to support newer GPUs which may support
-// larger DMA addresses. Limiting these GPUs to smaller address widths won't
-// have any adverse affects, unless installed on systems which require larger
-// DMA addresses. These systems should be quite rare.
-const GPU_DMA_BITS: u32 = 47;
-
 pub(crate) type Bar0 = pci::Bar<BAR0_SIZE>;
 
 kernel::pci_device_table!(
@@ -84,18 +78,20 @@ fn probe(pdev: &pci::Device<Core>, _info: &Self::IdInfo) -> impl PinInit<Self, E
             pdev.enable_device_mem()?;
             pdev.set_master();
 
-            // SAFETY: No concurrent DMA allocations or mappings can be made because
-            // the device is still being probed and therefore isn't being used by
-            // other threads of execution.
-            unsafe { pdev.dma_set_mask_and_coherent(DmaMask::new::<GPU_DMA_BITS>())? };
-
             let bar = Arc::pin_init(
                 pdev.iomap_region_sized::<BAR0_SIZE>(0, c"nova-core/bar0"),
                 GFP_KERNEL,
             )?;
+            let spec = Spec::new(pdev.as_ref(), bar.access(pdev.as_ref())?)?;
+            dev_info!(pdev, "NVIDIA ({})\n", spec);
+
+            // SAFETY: No concurrent DMA allocations or mappings can be made because
+            // the device is still being probed and therefore isn't being used by
+            // other threads of execution.
+            unsafe { pdev.dma_set_mask_and_coherent(spec.chipset().arch().dma_mask())? };
 
             Ok(try_pin_init!(Self {
-                gpu <- Gpu::new(pdev, bar.clone(), bar.access(pdev.as_ref())?),
+                gpu <- Gpu::new(pdev, bar.clone(), bar.access(pdev.as_ref())?, spec),
                 _reg <- auxiliary::Registration::new(
                     pdev.as_ref(),
                     c"nova-drm",
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 8f317d213908..9e140463603b 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -3,6 +3,7 @@
 use kernel::{
     device,
     devres::Devres,
+    dma::DmaMask,
     fmt,
     pci,
     prelude::*,
@@ -162,6 +163,19 @@ pub(crate) enum Architecture {
     Blackwell = 0x1b,
 }
 
+impl Architecture {
+    /// Returns the DMA mask supported by this architecture.
+    ///
+    /// Hopper and Blackwell support 52-bit DMA addresses, while earlier
+    /// architectures (Turing, Ampere, Ada) support 47-bit.
+    pub(crate) const fn dma_mask(&self) -> DmaMask {
+        match self {
+            Self::Turing | Self::Ampere | Self::Ada => DmaMask::new::<47>(),
+            Self::Hopper | Self::Blackwell => DmaMask::new::<52>(),
+        }
+    }
+}
+
 impl TryFrom<u8> for Architecture {
     type Error = Error;
 
@@ -211,7 +225,7 @@ pub(crate) struct Spec {
 }
 
 impl Spec {
-    fn new(dev: &device::Device, bar: &Bar0) -> Result<Spec> {
+    pub(crate) fn new(dev: &device::Device, bar: &Bar0) -> Result<Spec> {
         // Some brief notes about boot0 and boot42, in chronological order:
         //
         // NV04 through NV50:
@@ -292,38 +306,34 @@ pub(crate) fn new<'a>(
         pdev: &'a pci::Device<device::Bound>,
         devres_bar: Arc<Devres<Bar0>>,
         bar: &'a Bar0,
+        spec: Spec,
     ) -> impl PinInit<Self, Error> + 'a {
-        pin_init::pin_init_scope(move || {
-            let spec = Spec::new(pdev.as_ref(), bar)?;
-            dev_info!(pdev, "NVIDIA ({})\n", spec);
-
-            let chipset = spec.chipset();
+        let chipset = spec.chipset();
 
-            Ok(try_pin_init!(Self {
-                // We must wait for GFW_BOOT completion before doing any significant setup
-                // on the GPU.
-                _: {
-                    gfw::wait_gfw_boot_completion(bar)
-                        .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
-                },
+        try_pin_init!(Self {
+            // We must wait for GFW_BOOT completion before doing any significant setup
+            // on the GPU.
+            _: {
+                gfw::wait_gfw_boot_completion(bar)
+                    .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
+            },
 
-                sysmem_flush: SysmemFlush::register(pdev.as_ref(), bar, chipset)?,
+            sysmem_flush: SysmemFlush::register(pdev.as_ref(), bar, chipset)?,
 
-                gsp_falcon: Falcon::new(
-                    pdev.as_ref(),
-                    chipset,
-                )
-                .inspect(|falcon| falcon.clear_swgen0_intr(bar))?,
+            gsp_falcon: Falcon::new(
+                pdev.as_ref(),
+                chipset,
+            )
+            .inspect(|falcon| falcon.clear_swgen0_intr(bar))?,
 
-                sec2_falcon: Falcon::new(pdev.as_ref(), chipset)?,
+            sec2_falcon: Falcon::new(pdev.as_ref(), chipset)?,
 
-                gsp <- Gsp::new(pdev),
+            gsp <- Gsp::new(pdev),
 
-                _: { gsp.boot(pdev, bar, chipset, gsp_falcon, sec2_falcon)? },
+            _: { gsp.boot(pdev, bar, chipset, gsp_falcon, sec2_falcon)? },
 
-                bar: devres_bar,
-                spec,
-            }))
+            bar: devres_bar,
+            spec,
         })
     }
 
-- 
2.53.0

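For reference, the numeric meaning of the two widths can be sketched in plain
Rust. This does not use the kernel's `DmaMask` type; `dma_mask_value` and
`dma_bits` are hypothetical helpers that mirror the arithmetic of the C
`DMA_BIT_MASK(n)` macro and the per-architecture widths stated in the commit
message above.

```rust
// Hypothetical helper: a value with the low `bits` bits set, following the
// same arithmetic as the C `DMA_BIT_MASK(n)` macro.
const fn dma_mask_value(bits: u32) -> u64 {
    if bits >= 64 {
        u64::MAX
    } else {
        (1u64 << bits) - 1
    }
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Architecture {
    Turing,
    Ampere,
    Ada,
    Hopper,
    Blackwell,
}

// Exhaustive match without a wildcard arm: adding a new `Architecture`
// variant fails to compile until its width is listed here.
const fn dma_bits(arch: Architecture) -> u32 {
    match arch {
        Architecture::Turing | Architecture::Ampere | Architecture::Ada => 47,
        Architecture::Hopper | Architecture::Blackwell => 52,
    }
}
```

A 47-bit mask covers addresses up to 0x7FFF_FFFF_FFFF, and a 52-bit mask
covers addresses up to 0xF_FFFF_FFFF_FFFF.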


* Re: [PATCH v7 05/31] gpu: nova-core: set DMA mask width based on GPU architecture
  2026-03-17 22:53 ` [PATCH v7 05/31] gpu: nova-core: set DMA mask width based on GPU architecture John Hubbard
@ 2026-03-23 13:02   ` Alexandre Courbot
  2026-03-25  3:26     ` John Hubbard
  0 siblings, 1 reply; 66+ messages in thread
From: Alexandre Courbot @ 2026-03-23 13:02 UTC (permalink / raw)
  To: John Hubbard
  Cc: Danilo Krummrich, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On Wed Mar 18, 2026 at 7:53 AM JST, John Hubbard wrote:
> Replace the hardcoded 47-bit DMA mask with per-architecture values.
> Hopper and Blackwell support 52-bit DMA addresses, while Turing,
> Ampere, and Ada use 47-bit.
>
> Add Architecture::dma_mask() as a const method with an exhaustive
> match, so new architectures get a compile-time reminder to specify
> their DMA mask width. Move Spec creation into probe() so the
> architecture is known before setting the DMA mask, and pass the Spec
> into Gpu::new().
>
> Cc: Danilo Krummrich <dakr@kernel.org>
> Cc: Gary Guo <gary@garyguo.net>

Why the Ccs? Some patches in the series seem to have random people Cc'd
to them.

> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  drivers/gpu/nova-core/driver.rs | 28 +++++++--------
>  drivers/gpu/nova-core/gpu.rs    | 60 +++++++++++++++++++--------------
>  2 files changed, 47 insertions(+), 41 deletions(-)
>
> diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs
> index 84b0e1703150..41227d29934e 100644
> --- a/drivers/gpu/nova-core/driver.rs
> +++ b/drivers/gpu/nova-core/driver.rs
> @@ -5,7 +5,6 @@
>      device::Core,
>      devres::Devres,
>      dma::Device,
> -    dma::DmaMask,
>      pci,
>      pci::{
>          Class,
> @@ -23,7 +22,10 @@
>      },
>  };
>  
> -use crate::gpu::Gpu;
> +use crate::gpu::{
> +    Gpu,
> +    Spec, //
> +};
>  
>  /// Counter for generating unique auxiliary device IDs.
>  static AUXILIARY_ID_COUNTER: Atomic<u32> = Atomic::new(0);
> @@ -38,14 +40,6 @@ pub(crate) struct NovaCore {
>  
>  const BAR0_SIZE: usize = SZ_16M;
>  
> -// For now we only support Ampere which can use up to 47-bit DMA addresses.
> -//
> -// TODO: Add an abstraction for this to support newer GPUs which may support
> -// larger DMA addresses. Limiting these GPUs to smaller address widths won't
> -// have any adverse affects, unless installed on systems which require larger
> -// DMA addresses. These systems should be quite rare.
> -const GPU_DMA_BITS: u32 = 47;
> -
>  pub(crate) type Bar0 = pci::Bar<BAR0_SIZE>;
>  
>  kernel::pci_device_table!(
> @@ -84,18 +78,20 @@ fn probe(pdev: &pci::Device<Core>, _info: &Self::IdInfo) -> impl PinInit<Self, E
>              pdev.enable_device_mem()?;
>              pdev.set_master();
>  
> -            // SAFETY: No concurrent DMA allocations or mappings can be made because
> -            // the device is still being probed and therefore isn't being used by
> -            // other threads of execution.
> -            unsafe { pdev.dma_set_mask_and_coherent(DmaMask::new::<GPU_DMA_BITS>())? };
> -
>              let bar = Arc::pin_init(
>                  pdev.iomap_region_sized::<BAR0_SIZE>(0, c"nova-core/bar0"),
>                  GFP_KERNEL,
>              )?;
> +            let spec = Spec::new(pdev.as_ref(), bar.access(pdev.as_ref())?)?;
> +            dev_info!(pdev, "NVIDIA ({})\n", spec);
> +
> +            // SAFETY: No concurrent DMA allocations or mappings can be made because
> +            // the device is still being probed and therefore isn't being used by
> +            // other threads of execution.
> +            unsafe { pdev.dma_set_mask_and_coherent(spec.chipset().arch().dma_mask())? };
>  
>              Ok(try_pin_init!(Self {
> -                gpu <- Gpu::new(pdev, bar.clone(), bar.access(pdev.as_ref())?),
> +                gpu <- Gpu::new(pdev, bar.clone(), bar.access(pdev.as_ref())?, spec),
>                  _reg <- auxiliary::Registration::new(
>                      pdev.as_ref(),
>                      c"nova-drm",
> diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
> index 8f317d213908..9e140463603b 100644
> --- a/drivers/gpu/nova-core/gpu.rs
> +++ b/drivers/gpu/nova-core/gpu.rs
> @@ -3,6 +3,7 @@
>  use kernel::{
>      device,
>      devres::Devres,
> +    dma::DmaMask,
>      fmt,
>      pci,
>      prelude::*,
> @@ -162,6 +163,19 @@ pub(crate) enum Architecture {
>      Blackwell = 0x1b,
>  }
>  
> +impl Architecture {
> +    /// Returns the DMA mask supported by this architecture.
> +    ///
> +    /// Hopper and Blackwell support 52-bit DMA addresses, while earlier
> +    /// architectures (Turing, Ampere, Ada) support 47-bit.

This last sentence is unneeded: doccomments describe what a method
provides, not how it does it or what the result will be.

> +    pub(crate) const fn dma_mask(&self) -> DmaMask {
> +        match self {
> +            Self::Turing | Self::Ampere | Self::Ada => DmaMask::new::<47>(),
> +            Self::Hopper | Self::Blackwell => DmaMask::new::<52>(),
> +        }
> +    }
> +}

I see you introduce a `Gpu` HAL in the next patch. I think this should
also be part of the HAL - there is no benefit in having this method
const since the architecture is probed at runtime anyway.

> +
>  impl TryFrom<u8> for Architecture {
>      type Error = Error;
>  
> @@ -211,7 +225,7 @@ pub(crate) struct Spec {
>  }
>  
>  impl Spec {
> -    fn new(dev: &device::Device, bar: &Bar0) -> Result<Spec> {
> +    pub(crate) fn new(dev: &device::Device, bar: &Bar0) -> Result<Spec> {
>          // Some brief notes about boot0 and boot42, in chronological order:
>          //
>          // NV04 through NV50:
> @@ -292,38 +306,34 @@ pub(crate) fn new<'a>(
>          pdev: &'a pci::Device<device::Bound>,
>          devres_bar: Arc<Devres<Bar0>>,
>          bar: &'a Bar0,
> +        spec: Spec,
>      ) -> impl PinInit<Self, Error> + 'a {
> -        pin_init::pin_init_scope(move || {
> -            let spec = Spec::new(pdev.as_ref(), bar)?;
> -            dev_info!(pdev, "NVIDIA ({})\n", spec);
> -
> -            let chipset = spec.chipset();
> +        let chipset = spec.chipset();
>  
> -            Ok(try_pin_init!(Self {
> -                // We must wait for GFW_BOOT completion before doing any significant setup
> -                // on the GPU.
> -                _: {
> -                    gfw::wait_gfw_boot_completion(bar)
> -                        .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
> -                },
> +        try_pin_init!(Self {

What, and now we undo what we just did in patch 4? 0_o; What was that
all for?

I did a `git diff` between this step in the series and the state two
steps above, and it seems to confirm my intuition on patch 4: you just
need a few more `Copy` implementations. They can be added in this patch,
and patch 4 dropped altogether.
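
As a rough sketch of what folding the DMA width into the HAL could look like,
assuming a two-way split between pre-Hopper and FSP-based families: the
`dma_bits` method, the `PreHopper` type, and dispatching on `Architecture`
rather than `Chipset` are all assumptions made for illustration, not the
series' actual code.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Architecture {
    Turing,
    Ampere,
    Ada,
    Hopper,
    Blackwell,
}

// Hypothetical HAL trait: `needs_gfw_boot` follows patch 6, while
// `dma_bits` is an assumed addition.
trait GpuHal {
    fn needs_gfw_boot(&self) -> bool;
    fn dma_bits(&self) -> u32;
}

struct PreHopper;
struct Fsp;

impl GpuHal for PreHopper {
    fn needs_gfw_boot(&self) -> bool {
        true
    }

    fn dma_bits(&self) -> u32 {
        47
    }
}

impl GpuHal for Fsp {
    fn needs_gfw_boot(&self) -> bool {
        false
    }

    fn dma_bits(&self) -> u32 {
        52
    }
}

static PRE_HOPPER: PreHopper = PreHopper;
static FSP: Fsp = Fsp;

// Runtime dispatch: the architecture is only known after probing, so the
// HAL is selected once and then queried through a trait object.
fn gpu_hal(arch: Architecture) -> &'static dyn GpuHal {
    match arch {
        Architecture::Turing | Architecture::Ampere | Architecture::Ada => &PRE_HOPPER,
        Architecture::Hopper | Architecture::Blackwell => &FSP,
    }
}
```

Since the HAL is chosen at runtime anyway, nothing here needs to be `const`,
and the per-family policy sits in one place.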


* Re: [PATCH v7 05/31] gpu: nova-core: set DMA mask width based on GPU architecture
  2026-03-23 13:02   ` Alexandre Courbot
@ 2026-03-25  3:26     ` John Hubbard
  0 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-25  3:26 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Danilo Krummrich, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On 3/23/26 6:02 AM, Alexandre Courbot wrote:
> On Wed Mar 18, 2026 at 7:53 AM JST, John Hubbard wrote:
>> Replace the hardcoded 47-bit DMA mask with per-architecture values.
>> Hopper and Blackwell support 52-bit DMA addresses, while Turing,
>> Ampere, and Ada use 47-bit.
>>
>> Add Architecture::dma_mask() as a const method with an exhaustive
>> match, so new architectures get a compile-time reminder to specify
>> their DMA mask width. Move Spec creation into probe() so the
>> architecture is known before setting the DMA mask, and pass the Spec
>> into Gpu::new().
>>
>> Cc: Danilo Krummrich <dakr@kernel.org>
>> Cc: Gary Guo <gary@garyguo.net>
> 
> Why the Ccs? Some patches in the series seem to have random people Cc'd
> to them.

Many months ago, I wanted to follow a convention of adding
people to Cc if they had responded with comments. Several versions
in, I gave up: it's too much trouble for a large patchset.

So I'll just remove all Cc's for v8.

I've fixed all of the other comments below, in v8.


thanks,
-- 
John Hubbard

> 
>> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
>> ---
>>  drivers/gpu/nova-core/driver.rs | 28 +++++++--------
>>  drivers/gpu/nova-core/gpu.rs    | 60 +++++++++++++++++++--------------
>>  2 files changed, 47 insertions(+), 41 deletions(-)
>>
>> diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs
>> index 84b0e1703150..41227d29934e 100644
>> --- a/drivers/gpu/nova-core/driver.rs
>> +++ b/drivers/gpu/nova-core/driver.rs
>> @@ -5,7 +5,6 @@
>>      device::Core,
>>      devres::Devres,
>>      dma::Device,
>> -    dma::DmaMask,
>>      pci,
>>      pci::{
>>          Class,
>> @@ -23,7 +22,10 @@
>>      },
>>  };
>>  
>> -use crate::gpu::Gpu;
>> +use crate::gpu::{
>> +    Gpu,
>> +    Spec, //
>> +};
>>  
>>  /// Counter for generating unique auxiliary device IDs.
>>  static AUXILIARY_ID_COUNTER: Atomic<u32> = Atomic::new(0);
>> @@ -38,14 +40,6 @@ pub(crate) struct NovaCore {
>>  
>>  const BAR0_SIZE: usize = SZ_16M;
>>  
>> -// For now we only support Ampere which can use up to 47-bit DMA addresses.
>> -//
>> -// TODO: Add an abstraction for this to support newer GPUs which may support
>> -// larger DMA addresses. Limiting these GPUs to smaller address widths won't
>> -// have any adverse affects, unless installed on systems which require larger
>> -// DMA addresses. These systems should be quite rare.
>> -const GPU_DMA_BITS: u32 = 47;
>> -
>>  pub(crate) type Bar0 = pci::Bar<BAR0_SIZE>;
>>  
>>  kernel::pci_device_table!(
>> @@ -84,18 +78,20 @@ fn probe(pdev: &pci::Device<Core>, _info: &Self::IdInfo) -> impl PinInit<Self, E
>>              pdev.enable_device_mem()?;
>>              pdev.set_master();
>>  
>> -            // SAFETY: No concurrent DMA allocations or mappings can be made because
>> -            // the device is still being probed and therefore isn't being used by
>> -            // other threads of execution.
>> -            unsafe { pdev.dma_set_mask_and_coherent(DmaMask::new::<GPU_DMA_BITS>())? };
>> -
>>              let bar = Arc::pin_init(
>>                  pdev.iomap_region_sized::<BAR0_SIZE>(0, c"nova-core/bar0"),
>>                  GFP_KERNEL,
>>              )?;
>> +            let spec = Spec::new(pdev.as_ref(), bar.access(pdev.as_ref())?)?;
>> +            dev_info!(pdev, "NVIDIA ({})\n", spec);
>> +
>> +            // SAFETY: No concurrent DMA allocations or mappings can be made because
>> +            // the device is still being probed and therefore isn't being used by
>> +            // other threads of execution.
>> +            unsafe { pdev.dma_set_mask_and_coherent(spec.chipset().arch().dma_mask())? };
>>  
>>              Ok(try_pin_init!(Self {
>> -                gpu <- Gpu::new(pdev, bar.clone(), bar.access(pdev.as_ref())?),
>> +                gpu <- Gpu::new(pdev, bar.clone(), bar.access(pdev.as_ref())?, spec),
>>                  _reg <- auxiliary::Registration::new(
>>                      pdev.as_ref(),
>>                      c"nova-drm",
>> diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
>> index 8f317d213908..9e140463603b 100644
>> --- a/drivers/gpu/nova-core/gpu.rs
>> +++ b/drivers/gpu/nova-core/gpu.rs
>> @@ -3,6 +3,7 @@
>>  use kernel::{
>>      device,
>>      devres::Devres,
>> +    dma::DmaMask,
>>      fmt,
>>      pci,
>>      prelude::*,
>> @@ -162,6 +163,19 @@ pub(crate) enum Architecture {
>>      Blackwell = 0x1b,
>>  }
>>  
>> +impl Architecture {
>> +    /// Returns the DMA mask supported by this architecture.
>> +    ///
>> +    /// Hopper and Blackwell support 52-bit DMA addresses, while earlier
>> +    /// architectures (Turing, Ampere, Ada) support 47-bit.
> 
> This last sentence is unneeded: doccomments describe what a method
> provides, not how it does it or what the result will be.
> 
>> +    pub(crate) const fn dma_mask(&self) -> DmaMask {
>> +        match self {
>> +            Self::Turing | Self::Ampere | Self::Ada => DmaMask::new::<47>(),
>> +            Self::Hopper | Self::Blackwell => DmaMask::new::<52>(),
>> +        }
>> +    }
>> +}
> 
> I see you introduce a `Gpu` HAL in the next patch. I think this should
> also be part of the HAL - there is no benefit in having this method
> const since the architecture is probed at runtime anyway.
> 
>> +
>>  impl TryFrom<u8> for Architecture {
>>      type Error = Error;
>>  
>> @@ -211,7 +225,7 @@ pub(crate) struct Spec {
>>  }
>>  
>>  impl Spec {
>> -    fn new(dev: &device::Device, bar: &Bar0) -> Result<Spec> {
>> +    pub(crate) fn new(dev: &device::Device, bar: &Bar0) -> Result<Spec> {
>>          // Some brief notes about boot0 and boot42, in chronological order:
>>          //
>>          // NV04 through NV50:
>> @@ -292,38 +306,34 @@ pub(crate) fn new<'a>(
>>          pdev: &'a pci::Device<device::Bound>,
>>          devres_bar: Arc<Devres<Bar0>>,
>>          bar: &'a Bar0,
>> +        spec: Spec,
>>      ) -> impl PinInit<Self, Error> + 'a {
>> -        pin_init::pin_init_scope(move || {
>> -            let spec = Spec::new(pdev.as_ref(), bar)?;
>> -            dev_info!(pdev, "NVIDIA ({})\n", spec);
>> -
>> -            let chipset = spec.chipset();
>> +        let chipset = spec.chipset();
>>  
>> -            Ok(try_pin_init!(Self {
>> -                // We must wait for GFW_BOOT completion before doing any significant setup
>> -                // on the GPU.
>> -                _: {
>> -                    gfw::wait_gfw_boot_completion(bar)
>> -                        .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
>> -                },
>> +        try_pin_init!(Self {
> 
> What, and now we undo what we just did in patch 4? 0_o; What was that
> all for?
> 
> I did a `git diff` between this step in the series and the state two
> steps above, and it seems to confirm my intuition on patch 4: you just
> need a few more `Copy` implementations. They can be added in this patch,
> and patch 4 dropped altogether.




* [PATCH v7 06/31] gpu: nova-core: Hopper/Blackwell: skip GFW boot waiting
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (4 preceding siblings ...)
  2026-03-17 22:53 ` [PATCH v7 05/31] gpu: nova-core: set DMA mask width based on GPU architecture John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-23 13:13   ` Alexandre Courbot
  2026-03-17 22:53 ` [PATCH v7 07/31] gpu: nova-core: move firmware image parsing code to firmware.rs John Hubbard
                   ` (25 subsequent siblings)
  31 siblings, 1 reply; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Hopper and Blackwell GPUs use FSP-based secure boot and do not require
waiting for GFW_BOOT completion. Skip this step for these architectures.

Move the GFW_BOOT policy into a dedicated GPU HAL in gpu/hal.rs. This
keeps the decision out of gpu.rs while avoiding unrelated subsystems
such as fb. Pre-Hopper families still wait for GFW_BOOT completion,
while Hopper and later use the FSP Chain of Trust boot path instead.

Cc: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/gpu.rs     | 14 ++++++---
 drivers/gpu/nova-core/gpu/hal.rs | 54 ++++++++++++++++++++++++++++++++
 2 files changed, 64 insertions(+), 4 deletions(-)
 create mode 100644 drivers/gpu/nova-core/gpu/hal.rs

diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 9e140463603b..93f861ba20f3 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -23,6 +23,8 @@
     regs,
 };
 
+mod hal;
+
 macro_rules! define_chipset {
     ({ $($variant:ident = $value:expr),* $(,)* }) =>
     {
@@ -309,13 +311,17 @@ pub(crate) fn new<'a>(
         spec: Spec,
     ) -> impl PinInit<Self, Error> + 'a {
         let chipset = spec.chipset();
+        let hal = hal::gpu_hal(chipset);
 
         try_pin_init!(Self {
-            // We must wait for GFW_BOOT completion before doing any significant setup
-            // on the GPU.
             _: {
-                gfw::wait_gfw_boot_completion(bar)
-                    .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
+                // GFW_BOOT is the "GPU firmware boot complete" signal for the
+                // legacy devinit/FWSEC path. Pre-Hopper GPUs must wait for it
+                // before most GPU initialization. Hopper and later boot via FSP.
+                if hal.needs_gfw_boot() {
+                    gfw::wait_gfw_boot_completion(bar)
+                        .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
+                }
             },
 
             sysmem_flush: SysmemFlush::register(pdev.as_ref(), bar, chipset)?,
diff --git a/drivers/gpu/nova-core/gpu/hal.rs b/drivers/gpu/nova-core/gpu/hal.rs
new file mode 100644
index 000000000000..859c5e5fa21f
--- /dev/null
+++ b/drivers/gpu/nova-core/gpu/hal.rs
@@ -0,0 +1,54 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use crate::gpu::{
+    Architecture,
+    Chipset, //
+};
+
+pub(crate) trait GpuHal {
+    /// Returns whether this hardware family still requires waiting for GFW_BOOT.
+    fn needs_gfw_boot(&self) -> bool;
+}
+
+struct Tu102;
+struct Ga100;
+struct Ga102;
+struct Fsp;
+
+impl GpuHal for Tu102 {
+    fn needs_gfw_boot(&self) -> bool {
+        true
+    }
+}
+
+impl GpuHal for Ga100 {
+    fn needs_gfw_boot(&self) -> bool {
+        true
+    }
+}
+
+impl GpuHal for Ga102 {
+    fn needs_gfw_boot(&self) -> bool {
+        true
+    }
+}
+
+impl GpuHal for Fsp {
+    fn needs_gfw_boot(&self) -> bool {
+        false
+    }
+}
+
+const TU102: Tu102 = Tu102;
+const GA100: Ga100 = Ga100;
+const GA102: Ga102 = Ga102;
+const FSP: Fsp = Fsp;
+
+pub(super) fn gpu_hal(chipset: Chipset) -> &'static dyn GpuHal {
+    match chipset.arch() {
+        Architecture::Turing => &TU102,
+        Architecture::Ampere if chipset == Chipset::GA100 => &GA100,
+        Architecture::Ampere | Architecture::Ada => &GA102,
+        Architecture::Hopper | Architecture::Blackwell => &FSP,
+    }
+}
-- 
2.53.0



* Re: [PATCH v7 06/31] gpu: nova-core: Hopper/Blackwell: skip GFW boot waiting
  2026-03-17 22:53 ` [PATCH v7 06/31] gpu: nova-core: Hopper/Blackwell: skip GFW boot waiting John Hubbard
@ 2026-03-23 13:13   ` Alexandre Courbot
  2026-03-25  3:26     ` John Hubbard
  0 siblings, 1 reply; 66+ messages in thread
From: Alexandre Courbot @ 2026-03-23 13:13 UTC (permalink / raw)
  To: John Hubbard
  Cc: Danilo Krummrich, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On Wed Mar 18, 2026 at 7:53 AM JST, John Hubbard wrote:
> Hopper and Blackwell GPUs use FSP-based secure boot and do not require
> waiting for GFW_BOOT completion. Skip this step for these architectures.
>
> Move the GFW_BOOT policy into a dedicated GPU HAL in gpu/hal.rs. This
> keeps the decision out of gpu.rs while avoiding unrelated subsystems
> such as fb. Pre-Hopper families still wait for GFW_BOOT completion,
> while Hopper and later use the FSP Chain of Trust boot path instead.
>
> Cc: Danilo Krummrich <dakr@kernel.org>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  drivers/gpu/nova-core/gpu.rs     | 14 ++++++---
>  drivers/gpu/nova-core/gpu/hal.rs | 54 ++++++++++++++++++++++++++++++++
>  2 files changed, 64 insertions(+), 4 deletions(-)
>  create mode 100644 drivers/gpu/nova-core/gpu/hal.rs
>
> diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
> index 9e140463603b..93f861ba20f3 100644
> --- a/drivers/gpu/nova-core/gpu.rs
> +++ b/drivers/gpu/nova-core/gpu.rs
> @@ -23,6 +23,8 @@
>      regs,
>  };
>  
> +mod hal;
> +
>  macro_rules! define_chipset {
>      ({ $($variant:ident = $value:expr),* $(,)* }) =>
>      {
> @@ -309,13 +311,17 @@ pub(crate) fn new<'a>(
>          spec: Spec,
>      ) -> impl PinInit<Self, Error> + 'a {
>          let chipset = spec.chipset();
> +        let hal = hal::gpu_hal(chipset);
>  
>          try_pin_init!(Self {
> -            // We must wait for GFW_BOOT completion before doing any significant setup
> -            // on the GPU.
>              _: {
> -                gfw::wait_gfw_boot_completion(bar)
> -                    .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
> +                // GFW_BOOT is the "GPU firmware boot complete" signal for the
> +                // legacy devinit/FWSEC path. Pre-Hopper GPUs must wait for it
> +                // before most GPU initialization. Hopper and later boot via FSP.
> +                if hal.needs_gfw_boot() {
> +                    gfw::wait_gfw_boot_completion(bar)
> +                        .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
> +                }

Good that this is moved to a HAL method, but we can go one step further
and perform the actual wait in the HAL. I.e., this should become
`hal.wait_gfw_boot_completion` and the wait would happen in the HAL
itself (where such low-level stuff belongs), not here.

We could even move the contents of the `gfw` module into the correct HAL
since it won't be used anywhere else, and simplify our top-level directory.
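
As a rough illustration of that shape (not the actual nova-core code): the
sketch below is standalone Rust in which `Bar0`, its `gfw_boot_completed()`
poll, `Timeout` and `MAX_POLLS` are hypothetical stand-ins for the real BAR
mapping, register read, error type and timeout. The only point is that the
caller invokes `wait_gfw_boot_completion()` unconditionally and the HAL
decides whether there is anything to wait for.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for the BAR0 mapping: it only simulates a
/// "GFW_BOOT completed" flag that becomes set after a number of polls.
pub struct Bar0 {
    polls_until_done: Cell<u32>,
}

impl Bar0 {
    pub fn new(polls_until_done: u32) -> Self {
        Self { polls_until_done: Cell::new(polls_until_done) }
    }

    /// Simulated status read: returns `true` once the countdown reaches 0.
    fn gfw_boot_completed(&self) -> bool {
        match self.polls_until_done.get() {
            0 => true,
            left => {
                self.polls_until_done.set(left - 1);
                false
            }
        }
    }
}

/// Hypothetical error type standing in for a timeout error code.
#[derive(Debug, PartialEq)]
pub struct Timeout;

/// Hypothetical poll budget standing in for a real time-based timeout.
const MAX_POLLS: u32 = 4;

pub trait GpuHal {
    /// Waits for GFW_BOOT completion if this hardware family requires it.
    fn wait_gfw_boot_completion(&self, bar: &Bar0) -> Result<(), Timeout>;
}

pub struct Tu102;
pub struct Gh100;

impl GpuHal for Tu102 {
    // Legacy path: poll the status until it is set or the budget runs out.
    fn wait_gfw_boot_completion(&self, bar: &Bar0) -> Result<(), Timeout> {
        for _ in 0..MAX_POLLS {
            if bar.gfw_boot_completed() {
                return Ok(());
            }
        }
        Err(Timeout)
    }
}

impl GpuHal for Gh100 {
    // FSP-based boot: nothing to wait for, so this returns immediately.
    fn wait_gfw_boot_completion(&self, _bar: &Bar0) -> Result<(), Timeout> {
        Ok(())
    }
}
```

With something like this, the call site in gpu.rs would presumably reduce to
a single `hal.wait_gfw_boot_completion(bar)` plus the existing error message,
with no `if` on the caller's side.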

>              },
>  
>              sysmem_flush: SysmemFlush::register(pdev.as_ref(), bar, chipset)?,
> diff --git a/drivers/gpu/nova-core/gpu/hal.rs b/drivers/gpu/nova-core/gpu/hal.rs
> new file mode 100644
> index 000000000000..859c5e5fa21f
> --- /dev/null
> +++ b/drivers/gpu/nova-core/gpu/hal.rs
> @@ -0,0 +1,54 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +use crate::gpu::{
> +    Architecture,
> +    Chipset, //
> +};
> +
> +pub(crate) trait GpuHal {
> +    /// Returns whether this hardware family still requires waiting for GFW_BOOT.
> +    fn needs_gfw_boot(&self) -> bool;
> +}
> +
> +struct Tu102;
> +struct Ga100;
> +struct Ga102;
> +struct Fsp;
> +
> +impl GpuHal for Tu102 {
> +    fn needs_gfw_boot(&self) -> bool {
> +        true
> +    }
> +}
> +
> +impl GpuHal for Ga100 {
> +    fn needs_gfw_boot(&self) -> bool {
> +        true
> +    }
> +}
> +
> +impl GpuHal for Ga102 {
> +    fn needs_gfw_boot(&self) -> bool {
> +        true
> +    }
> +}
> +
> +impl GpuHal for Fsp {
> +    fn needs_gfw_boot(&self) -> bool {
> +        false
> +    }
> +}

3 of the HALs do exactly the same thing. You only need two: `Tu102` and
`Gh100`. `Fsp` is also not a valid name for a HAL; so far they have been
named after the first chip that makes use of them.

> +
> +const TU102: Tu102 = Tu102;
> +const GA100: Ga100 = Ga100;
> +const GA102: Ga102 = Ga102;
> +const FSP: Fsp = Fsp;
> +
> +pub(super) fn gpu_hal(chipset: Chipset) -> &'static dyn GpuHal {
> +    match chipset.arch() {
> +        Architecture::Turing => &TU102,
> +        Architecture::Ampere if chipset == Chipset::GA100 => &GA100,
> +        Architecture::Ampere | Architecture::Ada => &GA102,

This must be a copy/paste from somewhere because GA100 does not warrant
any exception here.
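
For illustration only, a minimal standalone sketch of what the dispatch
could look like with just two HALs and no GA100 special case. The
`Architecture` enum here is a trimmed stand-in that takes the place of
`chipset.arch()`, and the `Gh100`/`GH100` names are assumptions that follow
the "first chip that uses it" convention mentioned above; none of this is
the actual follow-up patch.

```rust
/// Trimmed stand-in for the driver's architecture enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Architecture {
    Turing,
    Ampere,
    Ada,
    Hopper,
    Blackwell,
}

pub trait GpuHal {
    /// Returns whether this hardware family still requires waiting for GFW_BOOT.
    fn needs_gfw_boot(&self) -> bool;
}

// One HAL per distinct behaviour, named after the first chip that uses it.
struct Tu102;
struct Gh100;

impl GpuHal for Tu102 {
    fn needs_gfw_boot(&self) -> bool {
        true
    }
}

impl GpuHal for Gh100 {
    fn needs_gfw_boot(&self) -> bool {
        false
    }
}

const TU102: Tu102 = Tu102;
const GH100: Gh100 = Gh100;

/// Selects the HAL purely by architecture: every pre-Hopper family shares
/// `TU102`, so no per-chipset guard is needed.
pub fn gpu_hal(arch: Architecture) -> &'static dyn GpuHal {
    match arch {
        Architecture::Turing | Architecture::Ampere | Architecture::Ada => &TU102,
        Architecture::Hopper | Architecture::Blackwell => &GH100,
    }
}
```

The match stays exhaustive without a wildcard arm, so adding a new
architecture variant would still force a decision here at compile time.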


* Re: [PATCH v7 06/31] gpu: nova-core: Hopper/Blackwell: skip GFW boot waiting
  2026-03-23 13:13   ` Alexandre Courbot
@ 2026-03-25  3:26     ` John Hubbard
  0 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-25  3:26 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Danilo Krummrich, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On 3/23/26 6:13 AM, Alexandre Courbot wrote:
> On Wed Mar 18, 2026 at 7:53 AM JST, John Hubbard wrote:
>> Hopper and Blackwell GPUs use FSP-based secure boot and do not require
>> waiting for GFW_BOOT completion. Skip this step for these architectures.
>>
>> Move the GFW_BOOT policy into a dedicated GPU HAL in gpu/hal.rs. This
>> keeps the decision out of gpu.rs while avoiding unrelated subsystems
>> such as fb. Pre-Hopper families still wait for GFW_BOOT completion,
>> while Hopper and later use the FSP Chain of Trust boot path instead.
>>
>> Cc: Danilo Krummrich <dakr@kernel.org>
>> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
>> ---
>>  drivers/gpu/nova-core/gpu.rs     | 14 ++++++---
>>  drivers/gpu/nova-core/gpu/hal.rs | 54 ++++++++++++++++++++++++++++++++
>>  2 files changed, 64 insertions(+), 4 deletions(-)
>>  create mode 100644 drivers/gpu/nova-core/gpu/hal.rs
>>
>> diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
>> index 9e140463603b..93f861ba20f3 100644
>> --- a/drivers/gpu/nova-core/gpu.rs
>> +++ b/drivers/gpu/nova-core/gpu.rs
>> @@ -23,6 +23,8 @@
>>      regs,
>>  };
>>  
>> +mod hal;
>> +
>>  macro_rules! define_chipset {
>>      ({ $($variant:ident = $value:expr),* $(,)* }) =>
>>      {
>> @@ -309,13 +311,17 @@ pub(crate) fn new<'a>(
>>          spec: Spec,
>>      ) -> impl PinInit<Self, Error> + 'a {
>>          let chipset = spec.chipset();
>> +        let hal = hal::gpu_hal(chipset);
>>  
>>          try_pin_init!(Self {
>> -            // We must wait for GFW_BOOT completion before doing any significant setup
>> -            // on the GPU.
>>              _: {
>> -                gfw::wait_gfw_boot_completion(bar)
>> -                    .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
>> +                // GFW_BOOT is the "GPU firmware boot complete" signal for the
>> +                // legacy devinit/FWSEC path. Pre-Hopper GPUs must wait for it
>> +                // before most GPU initialization. Hopper and later boot via FSP.
>> +                if hal.needs_gfw_boot() {
>> +                    gfw::wait_gfw_boot_completion(bar)
>> +                        .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
>> +                }
> 
> Good that this is moved to a HAL method, but we can go one step further
> and perform the actual wait in the HAL. I.e., this should become
> `hal.wait_gfw_boot_completion` and the wait would happen in the HAL
> itself (where such low-level stuff belongs), not here.
> 
> We could even move the contents of the `gfw` module into the correct HAL
> since it won't be used anywhere else, and simplify our top-level directory.

Yes. Done.

> 
>>              },
>>  
>>              sysmem_flush: SysmemFlush::register(pdev.as_ref(), bar, chipset)?,
>> diff --git a/drivers/gpu/nova-core/gpu/hal.rs b/drivers/gpu/nova-core/gpu/hal.rs
>> new file mode 100644
>> index 000000000000..859c5e5fa21f
>> --- /dev/null
>> +++ b/drivers/gpu/nova-core/gpu/hal.rs
>> @@ -0,0 +1,54 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +
>> +use crate::gpu::{
>> +    Architecture,
>> +    Chipset, //
>> +};
>> +
>> +pub(crate) trait GpuHal {
>> +    /// Returns whether this hardware family still requires waiting for GFW_BOOT.
>> +    fn needs_gfw_boot(&self) -> bool;
>> +}
>> +
>> +struct Tu102;
>> +struct Ga100;
>> +struct Ga102;
>> +struct Fsp;
>> +
>> +impl GpuHal for Tu102 {
>> +    fn needs_gfw_boot(&self) -> bool {
>> +        true
>> +    }
>> +}
>> +
>> +impl GpuHal for Ga100 {
>> +    fn needs_gfw_boot(&self) -> bool {
>> +        true
>> +    }
>> +}
>> +
>> +impl GpuHal for Ga102 {
>> +    fn needs_gfw_boot(&self) -> bool {
>> +        true
>> +    }
>> +}
>> +
>> +impl GpuHal for Fsp {
>> +    fn needs_gfw_boot(&self) -> bool {
>> +        false
>> +    }
>> +}
> 
> 3 of the HALs do exactly the same thing. You only need two: `Tu102` and
> `Gh100`. `Fsp` is also not a valid name for a HAL; so far they have been
> named after the first chip that makes use of them.

Right.

> 
>> +
>> +const TU102: Tu102 = Tu102;
>> +const GA100: Ga100 = Ga100;
>> +const GA102: Ga102 = Ga102;
>> +const FSP: Fsp = Fsp;
>> +
>> +pub(super) fn gpu_hal(chipset: Chipset) -> &'static dyn GpuHal {
>> +    match chipset.arch() {
>> +        Architecture::Turing => &TU102,
>> +        Architecture::Ampere if chipset == Chipset::GA100 => &GA100,
>> +        Architecture::Ampere | Architecture::Ada => &GA102,
> 
> This must be a copy/paste from somewhere because GA100 does not warrant
> any exception here.

Fixed.

thanks,
-- 
John Hubbard



* [PATCH v7 07/31] gpu: nova-core: move firmware image parsing code to firmware.rs
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (5 preceding siblings ...)
  2026-03-17 22:53 ` [PATCH v7 06/31] gpu: nova-core: Hopper/Blackwell: skip GFW boot waiting John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-23 13:19   ` Alexandre Courbot
  2026-03-17 22:53 ` [PATCH v7 08/31] gpu: nova-core: factor out an elf_str() function John Hubbard
                   ` (24 subsequent siblings)
  31 siblings, 1 reply; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Up until now, only the GSP required parsing of its firmware headers.
However, upcoming support for Hopper/Blackwell+ adds another firmware
image (FMC), along with another format (ELF32).

Therefore, the current ELF64 section parsing support needs to be moved
up a level, so that both of the above can use it.

There are no functional changes. This is pure code movement.

Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs     | 88 +++++++++++++++++++++++++
 drivers/gpu/nova-core/firmware/gsp.rs | 93 ++-------------------------
 2 files changed, 94 insertions(+), 87 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 2bb20081befd..177b8ede151c 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -457,3 +457,91 @@ pub(crate) const fn create(
         this.0
     }
 }
+
+/// Ad-hoc and temporary module to extract sections from ELF images.
+///
+/// Some firmware images are currently packaged as ELF files, where sections names are used as keys
+/// to specific and related bits of data. Future firmware versions are scheduled to move away from
+/// that scheme before nova-core becomes stable, which means this module will eventually be
+/// removed.
+mod elf {
+    use core::mem::size_of;
+
+    use kernel::{
+        bindings,
+        str::CStr,
+        transmute::FromBytes, //
+    };
+
+    /// Newtype to provide a [`FromBytes`] implementation.
+    #[repr(transparent)]
+    struct Elf64Hdr(bindings::elf64_hdr);
+    // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
+    unsafe impl FromBytes for Elf64Hdr {}
+
+    #[repr(transparent)]
+    struct Elf64SHdr(bindings::elf64_shdr);
+    // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
+    unsafe impl FromBytes for Elf64SHdr {}
+
+    /// Tries to extract section with name `name` from the ELF64 image `elf`, and returns it.
+    pub(super) fn elf64_section<'a, 'b>(elf: &'a [u8], name: &'b str) -> Option<&'a [u8]> {
+        let hdr = &elf
+            .get(0..size_of::<bindings::elf64_hdr>())
+            .and_then(Elf64Hdr::from_bytes)?
+            .0;
+
+        // Get all the section headers.
+        let mut shdr = {
+            let shdr_num = usize::from(hdr.e_shnum);
+            let shdr_start = usize::try_from(hdr.e_shoff).ok()?;
+            let shdr_end = shdr_num
+                .checked_mul(size_of::<Elf64SHdr>())
+                .and_then(|v| v.checked_add(shdr_start))?;
+
+            elf.get(shdr_start..shdr_end)
+                .map(|slice| slice.chunks_exact(size_of::<Elf64SHdr>()))?
+        };
+
+        // Get the strings table.
+        let strhdr = shdr
+            .clone()
+            .nth(usize::from(hdr.e_shstrndx))
+            .and_then(Elf64SHdr::from_bytes)?;
+
+        // Find the section which name matches `name` and return it.
+        shdr.find(|&sh| {
+            let Some(hdr) = Elf64SHdr::from_bytes(sh) else {
+                return false;
+            };
+
+            let Some(name_idx) = strhdr
+                .0
+                .sh_offset
+                .checked_add(u64::from(hdr.0.sh_name))
+                .and_then(|idx| usize::try_from(idx).ok())
+            else {
+                return false;
+            };
+
+            // Get the start of the name.
+            elf.get(name_idx..)
+                .and_then(|nstr| CStr::from_bytes_until_nul(nstr).ok())
+                // Convert into str.
+                .and_then(|c_str| c_str.to_str().ok())
+                // Check that the name matches.
+                .map(|str| str == name)
+                .unwrap_or(false)
+        })
+        // Return the slice containing the section.
+        .and_then(|sh| {
+            let hdr = Elf64SHdr::from_bytes(sh)?;
+            let start = usize::try_from(hdr.0.sh_offset).ok()?;
+            let end = usize::try_from(hdr.0.sh_size)
+                .ok()
+                .and_then(|sh_size| start.checked_add(sh_size))?;
+
+            elf.get(start..end)
+        })
+    }
+}
diff --git a/drivers/gpu/nova-core/firmware/gsp.rs b/drivers/gpu/nova-core/firmware/gsp.rs
index 8bbc3809c640..c6e71339b28e 100644
--- a/drivers/gpu/nova-core/firmware/gsp.rs
+++ b/drivers/gpu/nova-core/firmware/gsp.rs
@@ -1,5 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 
+use core::mem::size_of_val;
+
 use kernel::{
     device,
     dma::{
@@ -16,7 +18,10 @@
 
 use crate::{
     dma::DmaObject,
-    firmware::riscv::RiscvFirmware,
+    firmware::{
+        elf,
+        riscv::RiscvFirmware, //
+    },
     gpu::{
         Architecture,
         Chipset, //
@@ -25,92 +30,6 @@
     num::FromSafeCast,
 };
 
-/// Ad-hoc and temporary module to extract sections from ELF images.
-///
-/// Some firmware images are currently packaged as ELF files, where sections names are used as keys
-/// to specific and related bits of data. Future firmware versions are scheduled to move away from
-/// that scheme before nova-core becomes stable, which means this module will eventually be
-/// removed.
-mod elf {
-    use kernel::{
-        bindings,
-        prelude::*,
-        transmute::FromBytes, //
-    };
-
-    /// Newtype to provide a [`FromBytes`] implementation.
-    #[repr(transparent)]
-    struct Elf64Hdr(bindings::elf64_hdr);
-    // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
-    unsafe impl FromBytes for Elf64Hdr {}
-
-    #[repr(transparent)]
-    struct Elf64SHdr(bindings::elf64_shdr);
-    // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
-    unsafe impl FromBytes for Elf64SHdr {}
-
-    /// Tries to extract section with name `name` from the ELF64 image `elf`, and returns it.
-    pub(super) fn elf64_section<'a, 'b>(elf: &'a [u8], name: &'b str) -> Option<&'a [u8]> {
-        let hdr = &elf
-            .get(0..size_of::<bindings::elf64_hdr>())
-            .and_then(Elf64Hdr::from_bytes)?
-            .0;
-
-        // Get all the section headers.
-        let mut shdr = {
-            let shdr_num = usize::from(hdr.e_shnum);
-            let shdr_start = usize::try_from(hdr.e_shoff).ok()?;
-            let shdr_end = shdr_num
-                .checked_mul(size_of::<Elf64SHdr>())
-                .and_then(|v| v.checked_add(shdr_start))?;
-
-            elf.get(shdr_start..shdr_end)
-                .map(|slice| slice.chunks_exact(size_of::<Elf64SHdr>()))?
-        };
-
-        // Get the strings table.
-        let strhdr = shdr
-            .clone()
-            .nth(usize::from(hdr.e_shstrndx))
-            .and_then(Elf64SHdr::from_bytes)?;
-
-        // Find the section which name matches `name` and return it.
-        shdr.find(|&sh| {
-            let Some(hdr) = Elf64SHdr::from_bytes(sh) else {
-                return false;
-            };
-
-            let Some(name_idx) = strhdr
-                .0
-                .sh_offset
-                .checked_add(u64::from(hdr.0.sh_name))
-                .and_then(|idx| usize::try_from(idx).ok())
-            else {
-                return false;
-            };
-
-            // Get the start of the name.
-            elf.get(name_idx..)
-                .and_then(|nstr| CStr::from_bytes_until_nul(nstr).ok())
-                // Convert into str.
-                .and_then(|c_str| c_str.to_str().ok())
-                // Check that the name matches.
-                .map(|str| str == name)
-                .unwrap_or(false)
-        })
-        // Return the slice containing the section.
-        .and_then(|sh| {
-            let hdr = Elf64SHdr::from_bytes(sh)?;
-            let start = usize::try_from(hdr.0.sh_offset).ok()?;
-            let end = usize::try_from(hdr.0.sh_size)
-                .ok()
-                .and_then(|sh_size| start.checked_add(sh_size))?;
-
-            elf.get(start..end)
-        })
-    }
-}
-
 /// GSP firmware with 3-level radix page tables for the GSP bootloader.
 ///
 /// The bootloader expects firmware to be mapped starting at address 0 in GSP's virtual address
-- 
2.53.0



* Re: [PATCH v7 07/31] gpu: nova-core: move firmware image parsing code to firmware.rs
  2026-03-17 22:53 ` [PATCH v7 07/31] gpu: nova-core: move firmware image parsing code to firmware.rs John Hubbard
@ 2026-03-23 13:19   ` Alexandre Courbot
  2026-03-25  3:30     ` John Hubbard
  0 siblings, 1 reply; 66+ messages in thread
From: Alexandre Courbot @ 2026-03-23 13:19 UTC (permalink / raw)
  To: John Hubbard
  Cc: Danilo Krummrich, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On Wed Mar 18, 2026 at 7:53 AM JST, John Hubbard wrote:
> Up until now, only the GSP required parsing of its firmware headers.
> However, upcoming support for Hopper/Blackwell+ adds another firmware
> image (FMC), along with another format (ELF32).
>
> Therefore, the current ELF64 section parsing support needs to be moved
> up a level, so that both of the above can use it.
>
> There are no functional changes. This is pure code movement.
>
> Reviewed-by: Gary Guo <gary@garyguo.net>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  drivers/gpu/nova-core/firmware.rs     | 88 +++++++++++++++++++++++++
>  drivers/gpu/nova-core/firmware/gsp.rs | 93 ++-------------------------
>  2 files changed, 94 insertions(+), 87 deletions(-)
>
> diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
> index 2bb20081befd..177b8ede151c 100644
> --- a/drivers/gpu/nova-core/firmware.rs
> +++ b/drivers/gpu/nova-core/firmware.rs
> @@ -457,3 +457,91 @@ pub(crate) const fn create(
>          this.0
>      }
>  }
> +
> +/// Ad-hoc and temporary module to extract sections from ELF images.
> +///
> +/// Some firmware images are currently packaged as ELF files, where sections names are used as keys
> +/// to specific and related bits of data. Future firmware versions are scheduled to move away from
> +/// that scheme before nova-core becomes stable, which means this module will eventually be
> +/// removed.
> +mod elf {
> +    use core::mem::size_of;

This import is not needed, `size_of` is already in the prelude.

> +
> +    use kernel::{
> +        bindings,
> +        str::CStr,
> +        transmute::FromBytes, //
> +    };
> +
> +    /// Newtype to provide a [`FromBytes`] implementation.
> +    #[repr(transparent)]
> +    struct Elf64Hdr(bindings::elf64_hdr);
> +    // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
> +    unsafe impl FromBytes for Elf64Hdr {}
> +
> +    #[repr(transparent)]
> +    struct Elf64SHdr(bindings::elf64_shdr);
> +    // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
> +    unsafe impl FromBytes for Elf64SHdr {}
> +
> +    /// Tries to extract section with name `name` from the ELF64 image `elf`, and returns it.
> +    pub(super) fn elf64_section<'a, 'b>(elf: &'a [u8], name: &'b str) -> Option<&'a [u8]> {
> +        let hdr = &elf
> +            .get(0..size_of::<bindings::elf64_hdr>())
> +            .and_then(Elf64Hdr::from_bytes)?
> +            .0;
> +
> +        // Get all the section headers.
> +        let mut shdr = {
> +            let shdr_num = usize::from(hdr.e_shnum);
> +            let shdr_start = usize::try_from(hdr.e_shoff).ok()?;
> +            let shdr_end = shdr_num
> +                .checked_mul(size_of::<Elf64SHdr>())
> +                .and_then(|v| v.checked_add(shdr_start))?;
> +
> +            elf.get(shdr_start..shdr_end)
> +                .map(|slice| slice.chunks_exact(size_of::<Elf64SHdr>()))?
> +        };
> +
> +        // Get the strings table.
> +        let strhdr = shdr
> +            .clone()
> +            .nth(usize::from(hdr.e_shstrndx))
> +            .and_then(Elf64SHdr::from_bytes)?;
> +
> +        // Find the section which name matches `name` and return it.
> +        shdr.find(|&sh| {
> +            let Some(hdr) = Elf64SHdr::from_bytes(sh) else {
> +                return false;
> +            };
> +
> +            let Some(name_idx) = strhdr
> +                .0
> +                .sh_offset
> +                .checked_add(u64::from(hdr.0.sh_name))
> +                .and_then(|idx| usize::try_from(idx).ok())
> +            else {
> +                return false;
> +            };
> +
> +            // Get the start of the name.
> +            elf.get(name_idx..)
> +                .and_then(|nstr| CStr::from_bytes_until_nul(nstr).ok())
> +                // Convert into str.
> +                .and_then(|c_str| c_str.to_str().ok())
> +                // Check that the name matches.
> +                .map(|str| str == name)
> +                .unwrap_or(false)
> +        })
> +        // Return the slice containing the section.
> +        .and_then(|sh| {
> +            let hdr = Elf64SHdr::from_bytes(sh)?;
> +            let start = usize::try_from(hdr.0.sh_offset).ok()?;
> +            let end = usize::try_from(hdr.0.sh_size)
> +                .ok()
> +                .and_then(|sh_size| start.checked_add(sh_size))?;
> +
> +            elf.get(start..end)
> +        })
> +    }
> +}
> diff --git a/drivers/gpu/nova-core/firmware/gsp.rs b/drivers/gpu/nova-core/firmware/gsp.rs
> index 8bbc3809c640..c6e71339b28e 100644
> --- a/drivers/gpu/nova-core/firmware/gsp.rs
> +++ b/drivers/gpu/nova-core/firmware/gsp.rs
> @@ -1,5 +1,7 @@
>  // SPDX-License-Identifier: GPL-2.0
>  
> +use core::mem::size_of_val;

And this one is unneeded as well. Actually I mentioned that in my v6
review [1].

[1] https://lore.kernel.org/all/DGZ150DHI878.2YXL15FY7W0GG@nvidia.com/



* Re: [PATCH v7 07/31] gpu: nova-core: move firmware image parsing code to firmware.rs
  2026-03-23 13:19   ` Alexandre Courbot
@ 2026-03-25  3:30     ` John Hubbard
  2026-03-25 11:06       ` Alexandre Courbot
  2026-03-25 11:16       ` Miguel Ojeda
  0 siblings, 2 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-25  3:30 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Danilo Krummrich, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On 3/23/26 6:19 AM, Alexandre Courbot wrote:
> On Wed Mar 18, 2026 at 7:53 AM JST, John Hubbard wrote:
>> Up until now, only the GSP required parsing of its firmware headers.
>> However, upcoming support for Hopper/Blackwell+ adds another firmware
>> image (FMC), along with another format (ELF32).
>>
>> Therefore, the current ELF64 section parsing support needs to be moved
>> up a level, so that both of the above can use it.
>>
>> There are no functional changes. This is pure code movement.
>>
>> Reviewed-by: Gary Guo <gary@garyguo.net>
>> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
>> ---
>>  drivers/gpu/nova-core/firmware.rs     | 88 +++++++++++++++++++++++++
>>  drivers/gpu/nova-core/firmware/gsp.rs | 93 ++-------------------------
>>  2 files changed, 94 insertions(+), 87 deletions(-)
>>
>> diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
>> index 2bb20081befd..177b8ede151c 100644
>> --- a/drivers/gpu/nova-core/firmware.rs
>> +++ b/drivers/gpu/nova-core/firmware.rs
>> @@ -457,3 +457,91 @@ pub(crate) const fn create(
>>          this.0
>>      }
>>  }
>> +
>> +/// Ad-hoc and temporary module to extract sections from ELF images.
>> +///
>> +/// Some firmware images are currently packaged as ELF files, where sections names are used as keys
>> +/// to specific and related bits of data. Future firmware versions are scheduled to move away from
>> +/// that scheme before nova-core becomes stable, which means this module will eventually be
>> +/// removed.
>> +mod elf {
>> +    use core::mem::size_of;
> 
> This import is not needed, `size_of` is already in the prelude.


The `mod elf` block is a nested module that doesn't have `use
kernel::prelude::*;` in scope, so `size_of` isn't available without the
explicit import.

Or so it seems. I'm not experienced with the module scoping game,
and so I may have it wrong.
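
For reference, the general Rust rule can be seen in a small standalone
sketch (plain `std`, nothing kernel-specific; `BTreeSet` is just an
arbitrary item chosen because it is not in the standard prelude, and the
module and function names are made up for the example). A `use` declaration
only applies to the module it appears in, so a nested `mod` has to bring
names into scope itself, either one by one or through a glob import, which
is what a `prelude::*` import amounts to.

```rust
// This import is visible in the enclosing module only.
use std::collections::BTreeSet;

/// Uses the import from the enclosing module.
pub fn distinct_outer(items: &[u32]) -> usize {
    items.iter().copied().collect::<BTreeSet<u32>>().len()
}

mod explicit {
    // Without this line, `BTreeSet` would fail to resolve in this module:
    // the `use` in the enclosing module is not inherited by nested modules.
    use std::collections::BTreeSet;

    pub fn distinct(items: &[u32]) -> usize {
        items.iter().copied().collect::<BTreeSet<u32>>().len()
    }
}

mod glob {
    // A glob import is the other option; it brings `BTreeSet` into scope
    // along with everything else that `std::collections` exports.
    use std::collections::*;

    pub fn distinct(items: &[u32]) -> usize {
        items.iter().copied().collect::<BTreeSet<u32>>().len()
    }
}
```

So whether the explicit `size_of` import is needed depends on what the
nested module itself imports, not on what the surrounding file imports.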

> 
>> +
>> +    use kernel::{
>> +        bindings,
>> +        str::CStr,
>> +        transmute::FromBytes, //
>> +    };
>> +
>> +    /// Newtype to provide a [`FromBytes`] implementation.
>> +    #[repr(transparent)]
>> +    struct Elf64Hdr(bindings::elf64_hdr);
>> +    // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
>> +    unsafe impl FromBytes for Elf64Hdr {}
>> +
>> +    #[repr(transparent)]
>> +    struct Elf64SHdr(bindings::elf64_shdr);
>> +    // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
>> +    unsafe impl FromBytes for Elf64SHdr {}
>> +
>> +    /// Tries to extract section with name `name` from the ELF64 image `elf`, and returns it.
>> +    pub(super) fn elf64_section<'a, 'b>(elf: &'a [u8], name: &'b str) -> Option<&'a [u8]> {
>> +        let hdr = &elf
>> +            .get(0..size_of::<bindings::elf64_hdr>())
>> +            .and_then(Elf64Hdr::from_bytes)?
>> +            .0;
>> +
>> +        // Get all the section headers.
>> +        let mut shdr = {
>> +            let shdr_num = usize::from(hdr.e_shnum);
>> +            let shdr_start = usize::try_from(hdr.e_shoff).ok()?;
>> +            let shdr_end = shdr_num
>> +                .checked_mul(size_of::<Elf64SHdr>())
>> +                .and_then(|v| v.checked_add(shdr_start))?;
>> +
>> +            elf.get(shdr_start..shdr_end)
>> +                .map(|slice| slice.chunks_exact(size_of::<Elf64SHdr>()))?
>> +        };
>> +
>> +        // Get the strings table.
>> +        let strhdr = shdr
>> +            .clone()
>> +            .nth(usize::from(hdr.e_shstrndx))
>> +            .and_then(Elf64SHdr::from_bytes)?;
>> +
>> +        // Find the section which name matches `name` and return it.
>> +        shdr.find(|&sh| {
>> +            let Some(hdr) = Elf64SHdr::from_bytes(sh) else {
>> +                return false;
>> +            };
>> +
>> +            let Some(name_idx) = strhdr
>> +                .0
>> +                .sh_offset
>> +                .checked_add(u64::from(hdr.0.sh_name))
>> +                .and_then(|idx| usize::try_from(idx).ok())
>> +            else {
>> +                return false;
>> +            };
>> +
>> +            // Get the start of the name.
>> +            elf.get(name_idx..)
>> +                .and_then(|nstr| CStr::from_bytes_until_nul(nstr).ok())
>> +                // Convert into str.
>> +                .and_then(|c_str| c_str.to_str().ok())
>> +                // Check that the name matches.
>> +                .map(|str| str == name)
>> +                .unwrap_or(false)
>> +        })
>> +        // Return the slice containing the section.
>> +        .and_then(|sh| {
>> +            let hdr = Elf64SHdr::from_bytes(sh)?;
>> +            let start = usize::try_from(hdr.0.sh_offset).ok()?;
>> +            let end = usize::try_from(hdr.0.sh_size)
>> +                .ok()
>> +                .and_then(|sh_size| start.checked_add(sh_size))?;
>> +
>> +            elf.get(start..end)
>> +        })
>> +    }
>> +}
>> diff --git a/drivers/gpu/nova-core/firmware/gsp.rs b/drivers/gpu/nova-core/firmware/gsp.rs
>> index 8bbc3809c640..c6e71339b28e 100644
>> --- a/drivers/gpu/nova-core/firmware/gsp.rs
>> +++ b/drivers/gpu/nova-core/firmware/gsp.rs
>> @@ -1,5 +1,7 @@
>>  // SPDX-License-Identifier: GPL-2.0
>>  
>> +use core::mem::size_of_val;
> 
> And this one is unneeded as well. Actually I mentioned that in my v6
> review [1].

Same situation: firmware/gsp.rs doesn't import the kernel prelude, so
`size_of_val` needs the explicit import.

> 
> [1] https://lore.kernel.org/all/DGZ150DHI878.2YXL15FY7W0GG@nvidia.com/
> 

thanks,
-- 
John Hubbard


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v7 07/31] gpu: nova-core: move firmware image parsing code to firmware.rs
  2026-03-25  3:30     ` John Hubbard
@ 2026-03-25 11:06       ` Alexandre Courbot
  2026-03-25 11:18         ` Miguel Ojeda
  2026-03-25 11:16       ` Miguel Ojeda
  1 sibling, 1 reply; 66+ messages in thread
From: Alexandre Courbot @ 2026-03-25 11:06 UTC (permalink / raw)
  To: John Hubbard
  Cc: Danilo Krummrich, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On Wed Mar 25, 2026 at 12:30 PM JST, John Hubbard wrote:
> On 3/23/26 6:19 AM, Alexandre Courbot wrote:
>> On Wed Mar 18, 2026 at 7:53 AM JST, John Hubbard wrote:
>>> Up until now, only the GSP required parsing of its firmware headers.
>>> However, upcoming support for Hopper/Blackwell+ adds another firmware
>>> image (FMC), along with another format (ELF32).
>>>
>>> Therefore, the current ELF64 section parsing support needs to be moved
>>> up a level, so that both of the above can use it.
>>>
>>> There are no functional changes. This is pure code movement.
>>>
>>> Reviewed-by: Gary Guo <gary@garyguo.net>
>>> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
>>> ---
>>>  drivers/gpu/nova-core/firmware.rs     | 88 +++++++++++++++++++++++++
>>>  drivers/gpu/nova-core/firmware/gsp.rs | 93 ++-------------------------
>>>  2 files changed, 94 insertions(+), 87 deletions(-)
>>>
>>> diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
>>> index 2bb20081befd..177b8ede151c 100644
>>> --- a/drivers/gpu/nova-core/firmware.rs
>>> +++ b/drivers/gpu/nova-core/firmware.rs
>>> @@ -457,3 +457,91 @@ pub(crate) const fn create(
>>>          this.0
>>>      }
>>>  }
>>> +
>>> +/// Ad-hoc and temporary module to extract sections from ELF images.
>>> +///
>>> +/// Some firmware images are currently packaged as ELF files, where sections names are used as keys
>>> +/// to specific and related bits of data. Future firmware versions are scheduled to move away from
>>> +/// that scheme before nova-core becomes stable, which means this module will eventually be
>>> +/// removed.
>>> +mod elf {
>>> +    use core::mem::size_of;
>> 
>> This import is not needed, `size_of` is already in the prelude.
>
>
> The `mod elf` block is a nested module that doesn't have `use
> kernel::prelude::*;` in scope, so `size_of` isn't available without the
> explicit import.
>
> Or so it seems. I'm not experienced with the module scoping game,
> and so I may have it wrong.

No, you are correct; removing them works fine with the latest Rust, so I
assumed they were unneeded - but 1.78 fails to build without them.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v7 07/31] gpu: nova-core: move firmware image parsing code to firmware.rs
  2026-03-25 11:06       ` Alexandre Courbot
@ 2026-03-25 11:18         ` Miguel Ojeda
  0 siblings, 0 replies; 66+ messages in thread
From: Miguel Ojeda @ 2026-03-25 11:18 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: John Hubbard, Danilo Krummrich, Joel Fernandes, Timur Tabi,
	Alistair Popple, Eliot Courtney, Shashank Sharma, Zhi Wang,
	David Airlie, Simona Vetter, Bjorn Helgaas, Miguel Ojeda,
	Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alice Ryhl, Trevor Gross,
	rust-for-linux, LKML

On Wed, Mar 25, 2026 at 12:06 PM Alexandre Courbot <acourbot@nvidia.com> wrote:
>
> No, you are correct; removing them works fine with the latest Rust, so I
> assumed they were unneeded - but 1.78 fails to build without them.

So that wouldn't work in general, but it happens that `size_of` and
friends were added to the standard library prelude in 1.80.0.

So, to minimize confusion, I added those to our own prelude so that 1.78
works too if one imports our prelude:

  72b04a8af7fb ("rust: prelude: re-export `core::mem::{align,size}_of{,_val}`")

I hope that clarifies!

Cheers,
Miguel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v7 07/31] gpu: nova-core: move firmware image parsing code to firmware.rs
  2026-03-25  3:30     ` John Hubbard
  2026-03-25 11:06       ` Alexandre Courbot
@ 2026-03-25 11:16       ` Miguel Ojeda
  1 sibling, 0 replies; 66+ messages in thread
From: Miguel Ojeda @ 2026-03-25 11:16 UTC (permalink / raw)
  To: John Hubbard
  Cc: Alexandre Courbot, Danilo Krummrich, Joel Fernandes, Timur Tabi,
	Alistair Popple, Eliot Courtney, Shashank Sharma, Zhi Wang,
	David Airlie, Simona Vetter, Bjorn Helgaas, Miguel Ojeda,
	Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alice Ryhl, Trevor Gross,
	rust-for-linux, LKML

On Wed, Mar 25, 2026 at 4:30 AM John Hubbard <jhubbard@nvidia.com> wrote:
>
> The `mod elf` block is a nested module that doesn't have `use
> kernel::prelude::*;` in scope, so `size_of` isn't available without the
> explicit import.
>
> Or so it seems. I'm not experienced with the module scoping game,
> and so I may have it wrong.

Yeah, our "prelude" so far is a normal crate in Rust.

With the "custom prelude" feature that I have been requesting from
upstream Rust for a long time, that need would go away, i.e. we would be
able to define something like the standard library prelude -- I have a
bit more context in the wish list:

  https://github.com/Rust-for-Linux/linux/issues/354

For doctests, I add it "manually" (but only at the top-level, i.e.
same issue, but at least we save most of those `use`s).

Having said that, instead of importing things from the prelude
manually, even in a submodule, if you need something from it, please
just import the prelude again, like we do at the top-level.
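
As a rough, standalone illustration of that suggestion (not kernel code):
the `prelude_stub` module below is a hypothetical stand-in for
`kernel::prelude`, and the function names are made up for the example. It
only shows that a glob import in a parent module is not inherited by a
nested module, so the nested module imports the prelude again.

```rust
// Hypothetical stand-in for `kernel::prelude`; it re-exports `size_of`,
// which the standard prelude only provides from Rust 1.80.0 onwards.
mod prelude_stub {
    pub use core::mem::size_of;
}

mod firmware {
    // The parent module imports the prelude as usual.
    use super::prelude_stub::*;

    pub fn parent_len() -> usize {
        size_of::<u32>()
    }

    pub mod elf {
        // The parent's glob import is not visible here, so the nested
        // module imports the whole prelude again instead of naming
        // individual items such as `core::mem::size_of`.
        use super::super::prelude_stub::*;

        pub fn header_len() -> usize {
            size_of::<[u8; 64]>()
        }
    }
}
```

With a recent compiler the nested import may look redundant, because
`size_of` also comes from the standard prelude there; it matters for older
toolchains such as 1.78.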

Thanks!

Cheers,
Miguel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* [PATCH v7 08/31] gpu: nova-core: factor out an elf_str() function
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (6 preceding siblings ...)
  2026-03-17 22:53 ` [PATCH v7 07/31] gpu: nova-core: move firmware image parsing code to firmware.rs John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-17 22:53 ` [PATCH v7 09/31] gpu: nova-core: don't assume 64-bit firmware images John Hubbard
                   ` (23 subsequent siblings)
  31 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Factor out a chunk of complexity into a new subroutine. This is an
incremental step in adding ELF32 support to the existing ELF64 section
support, for handling GPU firmware.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs | 40 ++++++++++++-------------------
 1 file changed, 15 insertions(+), 25 deletions(-)
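
For readers who want to try the helper outside the kernel tree, here is a
minimal userspace sketch of the same lookup. It assumes `core::ffi::CStr`
as a substitute for `kernel::str::CStr`; otherwise it follows the shape of
elf_str() in the diff below.

```rust
use core::ffi::CStr;

/// Returns the NUL-terminated UTF-8 string that starts at `offset` in `elf`.
///
/// Yields `None` if the offset does not fit in `usize`, lies past the end
/// of the image, has no terminating NUL after it, or is not valid UTF-8.
fn elf_str(elf: &[u8], offset: u64) -> Option<&str> {
    let idx = usize::try_from(offset).ok()?;
    let bytes = elf.get(idx..)?;
    CStr::from_bytes_until_nul(bytes).ok()?.to_str().ok()
}
```

Given a string table such as `b"\0.text\0.fwimage\0"`, offset 1 yields
".text" and offset 7 yields ".fwimage", while an out-of-range offset
yields `None` rather than panicking.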

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 177b8ede151c..6c2ab69cb605 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -484,6 +484,13 @@ unsafe impl FromBytes for Elf64Hdr {}
     // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
     unsafe impl FromBytes for Elf64SHdr {}
 
+    /// Returns a NULL-terminated string from the ELF image at `offset`.
+    fn elf_str(elf: &[u8], offset: u64) -> Option<&str> {
+        let idx = usize::try_from(offset).ok()?;
+        let bytes = elf.get(idx..)?;
+        CStr::from_bytes_until_nul(bytes).ok()?.to_str().ok()
+    }
+
     /// Tries to extract section with name `name` from the ELF64 image `elf`, and returns it.
     pub(super) fn elf64_section<'a, 'b>(elf: &'a [u8], name: &'b str) -> Option<&'a [u8]> {
         let hdr = &elf
@@ -510,32 +517,15 @@ pub(super) fn elf64_section<'a, 'b>(elf: &'a [u8], name: &'b str) -> Option<&'a
             .and_then(Elf64SHdr::from_bytes)?;
 
         // Find the section which name matches `name` and return it.
-        shdr.find(|&sh| {
-            let Some(hdr) = Elf64SHdr::from_bytes(sh) else {
-                return false;
-            };
-
-            let Some(name_idx) = strhdr
-                .0
-                .sh_offset
-                .checked_add(u64::from(hdr.0.sh_name))
-                .and_then(|idx| usize::try_from(idx).ok())
-            else {
-                return false;
-            };
-
-            // Get the start of the name.
-            elf.get(name_idx..)
-                .and_then(|nstr| CStr::from_bytes_until_nul(nstr).ok())
-                // Convert into str.
-                .and_then(|c_str| c_str.to_str().ok())
-                // Check that the name matches.
-                .map(|str| str == name)
-                .unwrap_or(false)
-        })
-        // Return the slice containing the section.
-        .and_then(|sh| {
+        shdr.find_map(|sh| {
             let hdr = Elf64SHdr::from_bytes(sh)?;
+            let name_offset = strhdr.0.sh_offset.checked_add(u64::from(hdr.0.sh_name))?;
+            let section_name = elf_str(elf, name_offset)?;
+
+            if section_name != name {
+                return None;
+            }
+
             let start = usize::try_from(hdr.0.sh_offset).ok()?;
             let end = usize::try_from(hdr.0.sh_size)
                 .ok()
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v7 09/31] gpu: nova-core: don't assume 64-bit firmware images
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (7 preceding siblings ...)
  2026-03-17 22:53 ` [PATCH v7 08/31] gpu: nova-core: factor out an elf_str() function John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-17 22:53 ` [PATCH v7 10/31] gpu: nova-core: add support for 32-bit " John Hubbard
                   ` (22 subsequent siblings)
  31 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Introduce a single ELF format abstraction that ties each ELF header
type to its matching section-header type. This keeps the shared
section parser ready for upcoming ELF32 support and avoids mixing
32-bit and 64-bit ELF layouts by mistake.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs | 111 ++++++++++++++++++++++--------
 1 file changed, 84 insertions(+), 27 deletions(-)
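
The pairing idea can be sketched in isolation as follows. This is a toy
model, not the driver code: the `Parse` trait is a hypothetical stand-in
for `FromBytes`, and each "header" is reduced to a single little-endian
integer so that the example stays short. The point is only that one type
parameter selects a matching header and section-header pair.

```rust
/// Hypothetical stand-in for the kernel's `FromBytes`.
trait Parse: Sized {
    const SIZE: usize;
    fn parse(bytes: &[u8]) -> Option<Self>;
}

trait ElfHeader: Parse {
    fn shoff(&self) -> u64;
}

trait ElfSectionHeader: Parse {
    fn offset(&self) -> u64;
}

/// Ties a header type to the section-header type of the same ELF class.
trait ElfFormat {
    type Header: ElfHeader;
    type SectionHeader: ElfSectionHeader;
}

// Toy layouts: each structure is one little-endian integer.
struct Hdr32(u32);
struct SHdr32(u32);
struct Hdr64(u64);
struct SHdr64(u64);

macro_rules! impl_parse {
    ($ty:ident, $int:ty) => {
        impl Parse for $ty {
            const SIZE: usize = core::mem::size_of::<$int>();
            fn parse(bytes: &[u8]) -> Option<Self> {
                let raw: [u8; core::mem::size_of::<$int>()] =
                    bytes.get(..Self::SIZE)?.try_into().ok()?;
                Some(Self(<$int>::from_le_bytes(raw)))
            }
        }
    };
}

impl_parse!(Hdr32, u32);
impl_parse!(SHdr32, u32);
impl_parse!(Hdr64, u64);
impl_parse!(SHdr64, u64);

impl ElfHeader for Hdr32 {
    fn shoff(&self) -> u64 {
        u64::from(self.0)
    }
}
impl ElfHeader for Hdr64 {
    fn shoff(&self) -> u64 {
        self.0
    }
}
impl ElfSectionHeader for SHdr32 {
    fn offset(&self) -> u64 {
        u64::from(self.0)
    }
}
impl ElfSectionHeader for SHdr64 {
    fn offset(&self) -> u64 {
        self.0
    }
}

struct Elf32Format;
struct Elf64Format;

impl ElfFormat for Elf32Format {
    type Header = Hdr32;
    type SectionHeader = SHdr32;
}
impl ElfFormat for Elf64Format {
    type Header = Hdr64;
    type SectionHeader = SHdr64;
}

/// Reads the header, then the first section header it points to.
fn first_section_offset<F: ElfFormat>(image: &[u8]) -> Option<u64> {
    let hdr = F::Header::parse(image)?;
    let start = usize::try_from(hdr.shoff()).ok()?;
    let sh = F::SectionHeader::parse(image.get(start..)?)?;
    Some(sh.offset())
}
```

Because the caller names only the format, there is no way to combine a
32-bit header with a 64-bit section header by accident.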

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 6c2ab69cb605..46c26d749a65 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -473,17 +473,72 @@ mod elf {
         transmute::FromBytes, //
     };
 
+    /// Trait to abstract over ELF header differences.
+    trait ElfHeader: FromBytes {
+        fn shnum(&self) -> u16;
+        fn shoff(&self) -> u64;
+        fn shstrndx(&self) -> u16;
+    }
+
+    /// Trait to abstract over ELF section-header differences.
+    trait ElfSectionHeader: FromBytes {
+        fn name(&self) -> u32;
+        fn offset(&self) -> u64;
+        fn size(&self) -> u64;
+    }
+
+    /// Trait describing a matching ELF header and section-header format.
+    trait ElfFormat {
+        type Header: ElfHeader;
+        type SectionHeader: ElfSectionHeader;
+    }
+
     /// Newtype to provide a [`FromBytes`] implementation.
     #[repr(transparent)]
     struct Elf64Hdr(bindings::elf64_hdr);
     // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
     unsafe impl FromBytes for Elf64Hdr {}
 
+    impl ElfHeader for Elf64Hdr {
+        fn shnum(&self) -> u16 {
+            self.0.e_shnum
+        }
+
+        fn shoff(&self) -> u64 {
+            self.0.e_shoff
+        }
+
+        fn shstrndx(&self) -> u16 {
+            self.0.e_shstrndx
+        }
+    }
+
     #[repr(transparent)]
     struct Elf64SHdr(bindings::elf64_shdr);
     // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
     unsafe impl FromBytes for Elf64SHdr {}
 
+    impl ElfSectionHeader for Elf64SHdr {
+        fn name(&self) -> u32 {
+            self.0.sh_name
+        }
+
+        fn offset(&self) -> u64 {
+            self.0.sh_offset
+        }
+
+        fn size(&self) -> u64 {
+            self.0.sh_size
+        }
+    }
+
+    struct Elf64Format;
+
+    impl ElfFormat for Elf64Format {
+        type Header = Elf64Hdr;
+        type SectionHeader = Elf64SHdr;
+    }
+
     /// Returns a NULL-terminated string from the ELF image at `offset`.
     fn elf_str(elf: &[u8], offset: u64) -> Option<&str> {
         let idx = usize::try_from(offset).ok()?;
@@ -491,47 +546,49 @@ fn elf_str(elf: &[u8], offset: u64) -> Option<&str> {
         CStr::from_bytes_until_nul(bytes).ok()?.to_str().ok()
     }
 
-    /// Tries to extract section with name `name` from the ELF64 image `elf`, and returns it.
-    pub(super) fn elf64_section<'a, 'b>(elf: &'a [u8], name: &'b str) -> Option<&'a [u8]> {
-        let hdr = &elf
-            .get(0..size_of::<bindings::elf64_hdr>())
-            .and_then(Elf64Hdr::from_bytes)?
-            .0;
-
-        // Get all the section headers.
-        let mut shdr = {
-            let shdr_num = usize::from(hdr.e_shnum);
-            let shdr_start = usize::try_from(hdr.e_shoff).ok()?;
-            let shdr_end = shdr_num
-                .checked_mul(size_of::<Elf64SHdr>())
-                .and_then(|v| v.checked_add(shdr_start))?;
-
-            elf.get(shdr_start..shdr_end)
-                .map(|slice| slice.chunks_exact(size_of::<Elf64SHdr>()))?
-        };
+    fn elf_section_generic<'a, F>(elf: &'a [u8], name: &str) -> Option<&'a [u8]>
+    where
+        F: ElfFormat,
+    {
+        let hdr = F::Header::from_bytes(elf.get(0..size_of::<F::Header>())?)?;
+
+        let shdr_num = usize::from(hdr.shnum());
+        let shdr_start = usize::try_from(hdr.shoff()).ok()?;
+        let shdr_end = shdr_num
+            .checked_mul(size_of::<F::SectionHeader>())
+            .and_then(|v| v.checked_add(shdr_start))?;
+
+        // Get all the section headers as an iterator over byte chunks.
+        let shdr_bytes = elf.get(shdr_start..shdr_end)?;
+        let mut shdr_iter = shdr_bytes.chunks_exact(size_of::<F::SectionHeader>());
 
         // Get the strings table.
-        let strhdr = shdr
+        let strhdr = shdr_iter
             .clone()
-            .nth(usize::from(hdr.e_shstrndx))
-            .and_then(Elf64SHdr::from_bytes)?;
+            .nth(usize::from(hdr.shstrndx()))
+            .and_then(F::SectionHeader::from_bytes)?;
 
         // Find the section which name matches `name` and return it.
-        shdr.find_map(|sh| {
-            let hdr = Elf64SHdr::from_bytes(sh)?;
-            let name_offset = strhdr.0.sh_offset.checked_add(u64::from(hdr.0.sh_name))?;
+        shdr_iter.find_map(|sh_bytes| {
+            let sh = F::SectionHeader::from_bytes(sh_bytes)?;
+            let name_offset = strhdr.offset().checked_add(u64::from(sh.name()))?;
             let section_name = elf_str(elf, name_offset)?;
 
             if section_name != name {
                 return None;
             }
 
-            let start = usize::try_from(hdr.0.sh_offset).ok()?;
-            let end = usize::try_from(hdr.0.sh_size)
+            let start = usize::try_from(sh.offset()).ok()?;
+            let end = usize::try_from(sh.size())
                 .ok()
-                .and_then(|sh_size| start.checked_add(sh_size))?;
+                .and_then(|sz| start.checked_add(sz))?;
 
             elf.get(start..end)
         })
     }
+
+    /// Tries to extract section with name `name` from the ELF64 image `elf`, and returns it.
+    pub(super) fn elf64_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
+        elf_section_generic::<Elf64Format>(elf, name)
+    }
 }
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v7 10/31] gpu: nova-core: add support for 32-bit firmware images
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (8 preceding siblings ...)
  2026-03-17 22:53 ` [PATCH v7 09/31] gpu: nova-core: don't assume 64-bit firmware images John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-17 22:53 ` [PATCH v7 11/31] gpu: nova-core: add auto-detection of 32-bit, 64-bit " John Hubbard
                   ` (21 subsequent siblings)
  31 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Add ELF32 header and section header newtypes with ElfHeader and
ElfSectionHeader trait implementations, mirroring the existing ELF64
support. Add elf32_section() for extracting sections from ELF32 images.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs | 53 +++++++++++++++++++++++++++++++
 1 file changed, 53 insertions(+)
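
A small standalone sketch of why the ELF32 accessors widen with
`u64::from`: once the 32-bit fields are widened, the same checked
arithmetic serves both classes. The function name `section_range` and its
signature are made up for this example and are not part of the patch.

```rust
use core::ops::Range;

/// Computes the byte range of a section from 32-bit ELF fields, or `None`
/// if the range overflows or extends past an image of `image_len` bytes.
fn section_range(sh_offset: u32, sh_size: u32, image_len: usize) -> Option<Range<usize>> {
    // Widen first (lossless), then narrow to `usize` with an explicit check.
    let start = usize::try_from(u64::from(sh_offset)).ok()?;
    let size = usize::try_from(u64::from(sh_size)).ok()?;
    let end = start.checked_add(size)?;
    (end <= image_len).then_some(start..end)
}
```

In the patch itself the final bounds check is done by `elf.get(start..end)`;
the explicit comparison above plays that role here.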

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 46c26d749a65..a0745c332d4d 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -539,6 +539,53 @@ impl ElfFormat for Elf64Format {
         type SectionHeader = Elf64SHdr;
     }
 
+    /// Newtype to provide [`FromBytes`] and [`ElfHeader`] implementations for ELF32.
+    #[repr(transparent)]
+    struct Elf32Hdr(bindings::elf32_hdr);
+    // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
+    unsafe impl FromBytes for Elf32Hdr {}
+
+    impl ElfHeader for Elf32Hdr {
+        fn shnum(&self) -> u16 {
+            self.0.e_shnum
+        }
+
+        fn shoff(&self) -> u64 {
+            u64::from(self.0.e_shoff)
+        }
+
+        fn shstrndx(&self) -> u16 {
+            self.0.e_shstrndx
+        }
+    }
+
+    /// Newtype to provide [`FromBytes`] and [`ElfSectionHeader`] implementations for ELF32.
+    #[repr(transparent)]
+    struct Elf32SHdr(bindings::elf32_shdr);
+    // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
+    unsafe impl FromBytes for Elf32SHdr {}
+
+    impl ElfSectionHeader for Elf32SHdr {
+        fn name(&self) -> u32 {
+            self.0.sh_name
+        }
+
+        fn offset(&self) -> u64 {
+            u64::from(self.0.sh_offset)
+        }
+
+        fn size(&self) -> u64 {
+            u64::from(self.0.sh_size)
+        }
+    }
+
+    struct Elf32Format;
+
+    impl ElfFormat for Elf32Format {
+        type Header = Elf32Hdr;
+        type SectionHeader = Elf32SHdr;
+    }
+
     /// Returns a NULL-terminated string from the ELF image at `offset`.
     fn elf_str(elf: &[u8], offset: u64) -> Option<&str> {
         let idx = usize::try_from(offset).ok()?;
@@ -591,4 +638,10 @@ fn elf_section_generic<'a, F>(elf: &'a [u8], name: &str) -> Option<&'a [u8]>
     pub(super) fn elf64_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
         elf_section_generic::<Elf64Format>(elf, name)
     }
+
+    /// Extract the section with name `name` from the ELF32 image `elf`.
+    #[expect(dead_code)]
+    pub(super) fn elf32_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
+        elf_section_generic::<Elf32Format>(elf, name)
+    }
 }
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v7 11/31] gpu: nova-core: add auto-detection of 32-bit, 64-bit firmware images
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (9 preceding siblings ...)
  2026-03-17 22:53 ` [PATCH v7 10/31] gpu: nova-core: add support for 32-bit " John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-17 22:53 ` [PATCH v7 12/31] gpu: nova-core: Hopper/Blackwell: add FMC firmware image, in support of FSP John Hubbard
                   ` (20 subsequent siblings)
  31 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Add elf_section() which automatically detects ELF32 vs ELF64 based on
the ELF header's class byte, and dispatches to the appropriate parser.
Switch gsp.rs callers from elf64_section() to elf_section(), making
both elf32_section() and elf64_section() private.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs     | 22 ++++++++++++++++++----
 drivers/gpu/nova-core/firmware/gsp.rs |  4 ++--
 2 files changed, 20 insertions(+), 6 deletions(-)
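
The detection step on its own can be sketched as below. The `ElfClass`
enum and the `elf_class` function are hypothetical names used only for
this illustration; the sketch relies on `get()` for all bounds checking.

```rust
#[derive(Debug, PartialEq, Eq)]
enum ElfClass {
    Elf32,
    Elf64,
}

/// Classifies an image by its ELF magic and class byte (`e_ident[4]`).
fn elf_class(elf: &[u8]) -> Option<ElfClass> {
    // The first four bytes must be the ELF magic.
    if elf.get(0..4)? != b"\x7fELF" {
        return None;
    }

    // Class byte: 1 = 32-bit, 2 = 64-bit; anything else is rejected.
    match *elf.get(4)? {
        1 => Some(ElfClass::Elf32),
        2 => Some(ElfClass::Elf64),
        _ => None,
    }
}
```

An image that is too short, has the wrong magic, or carries an unknown
class byte yields `None`, which matches how elf_section() in the diff
below reports failure.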

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index a0745c332d4d..bc217bfc225f 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -634,14 +634,28 @@ fn elf_section_generic<'a, F>(elf: &'a [u8], name: &str) -> Option<&'a [u8]>
         })
     }
 
-    /// Tries to extract section with name `name` from the ELF64 image `elf`, and returns it.
-    pub(super) fn elf64_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
+    /// Extract the section with name `name` from the ELF64 image `elf`.
+    fn elf64_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
         elf_section_generic::<Elf64Format>(elf, name)
     }
 
     /// Extract the section with name `name` from the ELF32 image `elf`.
-    #[expect(dead_code)]
-    pub(super) fn elf32_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
+    fn elf32_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
         elf_section_generic::<Elf32Format>(elf, name)
     }
+
+    /// Automatically detects ELF32 vs ELF64 based on the ELF header.
+    pub(super) fn elf_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
+        // Check ELF magic.
+        if elf.len() < 5 || elf.get(0..4)? != b"\x7fELF" {
+            return None;
+        }
+
+        // Check ELF class: 1 = 32-bit, 2 = 64-bit.
+        match elf.get(4)? {
+            1 => elf32_section(elf, name),
+            2 => elf64_section(elf, name),
+            _ => None,
+        }
+    }
 }
diff --git a/drivers/gpu/nova-core/firmware/gsp.rs b/drivers/gpu/nova-core/firmware/gsp.rs
index c6e71339b28e..360cb7014073 100644
--- a/drivers/gpu/nova-core/firmware/gsp.rs
+++ b/drivers/gpu/nova-core/firmware/gsp.rs
@@ -93,7 +93,7 @@ pub(crate) fn new<'a>(
         pin_init::pin_init_scope(move || {
             let firmware = super::request_firmware(dev, chipset, "gsp", ver)?;
 
-            let fw_section = elf::elf64_section(firmware.data(), ".fwimage").ok_or(EINVAL)?;
+            let fw_section = elf::elf_section(firmware.data(), ".fwimage").ok_or(EINVAL)?;
 
             let size = fw_section.len();
 
@@ -150,7 +150,7 @@ pub(crate) fn new<'a>(
                 signatures: {
                     let sigs_section = Self::find_gsp_sigs_section(chipset).ok_or(ENOTSUPP)?;
 
-                    elf::elf64_section(firmware.data(), sigs_section)
+                    elf::elf_section(firmware.data(), sigs_section)
                         .ok_or(EINVAL)
                         .and_then(|data| DmaObject::from_data(dev, data))?
                 },
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v7 12/31] gpu: nova-core: Hopper/Blackwell: add FMC firmware image, in support of FSP
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (10 preceding siblings ...)
  2026-03-17 22:53 ` [PATCH v7 11/31] gpu: nova-core: add auto-detection of 32-bit, 64-bit " John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-17 22:53 ` [PATCH v7 13/31] gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub John Hubbard
                   ` (19 subsequent siblings)
  31 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

FSP is the Falcon that runs FMC firmware on Hopper and Blackwell.
Load the FMC ELF in two forms: the image section that FSP boots from,
and a CPU-side copy of the full ELF for later signature extraction
during Chain of Trust verification.

Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs     |  1 +
 drivers/gpu/nova-core/firmware/fsp.rs | 47 +++++++++++++++++++++++++++
 2 files changed, 48 insertions(+)
 create mode 100644 drivers/gpu/nova-core/firmware/fsp.rs

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index bc217bfc225f..bc26807116e4 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -27,6 +27,7 @@
 };
 
 pub(crate) mod booter;
+pub(crate) mod fsp;
 pub(crate) mod fwsec;
 pub(crate) mod gsp;
 pub(crate) mod riscv;
diff --git a/drivers/gpu/nova-core/firmware/fsp.rs b/drivers/gpu/nova-core/firmware/fsp.rs
new file mode 100644
index 000000000000..5aedee8e6d41
--- /dev/null
+++ b/drivers/gpu/nova-core/firmware/fsp.rs
@@ -0,0 +1,47 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! FSP is a hardware unit that runs FMC firmware.
+
+use kernel::{
+    alloc::KVec,
+    device,
+    prelude::*, //
+};
+
+use crate::{
+    dma::DmaObject,
+    firmware::elf,
+    gpu::Chipset, //
+};
+
+#[expect(unused)]
+pub(crate) struct FspFirmware {
+    /// FMC firmware image data (only the "image" ELF section).
+    fmc_image: DmaObject,
+    /// Full FMC ELF data (for signature extraction).
+    pub(crate) fmc_full: KVec<u8>,
+}
+
+impl FspFirmware {
+    #[expect(unused)]
+    pub(crate) fn new(
+        dev: &device::Device<device::Bound>,
+        chipset: Chipset,
+        ver: &str,
+    ) -> Result<Self> {
+        let fw = super::request_firmware(dev, chipset, "fmc", ver)?;
+        let mut fmc_full = KVec::with_capacity(fw.data().len(), GFP_KERNEL)?;
+        fmc_full.extend_from_slice(fw.data(), GFP_KERNEL)?;
+
+        // FSP expects only the "image" section, not the entire ELF file.
+        let fmc_image_data = elf::elf_section(fw.data(), "image").ok_or_else(|| {
+            dev_err!(dev, "FMC ELF file missing 'image' section\n");
+            EINVAL
+        })?;
+
+        Ok(Self {
+            fmc_image: DmaObject::from_data(dev, fmc_image_data)?,
+            fmc_full,
+        })
+    }
+}
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v7 13/31] gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (11 preceding siblings ...)
  2026-03-17 22:53 ` [PATCH v7 12/31] gpu: nova-core: Hopper/Blackwell: add FMC firmware image, in support of FSP John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-17 22:53 ` [PATCH v7 14/31] gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations John Hubbard
                   ` (18 subsequent siblings)
  31 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Add the FSP (Firmware System Processor) falcon engine type that will
handle secure boot and Chain of Trust operations on Hopper and Blackwell
architectures.

The FSP falcon replaces SEC2's role in the boot sequence for these newer
architectures. This initial stub just defines the falcon type and its
base address.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/falcon.rs     |  1 +
 drivers/gpu/nova-core/falcon/fsp.rs | 30 +++++++++++++++++++++++++++++
 2 files changed, 31 insertions(+)
 create mode 100644 drivers/gpu/nova-core/falcon/fsp.rs

diff --git a/drivers/gpu/nova-core/falcon.rs b/drivers/gpu/nova-core/falcon.rs
index 7097a206ec3c..f515a4ff2f5f 100644
--- a/drivers/gpu/nova-core/falcon.rs
+++ b/drivers/gpu/nova-core/falcon.rs
@@ -33,6 +33,7 @@
     regs::macros::RegisterBase, //
 };
 
+pub(crate) mod fsp;
 pub(crate) mod gsp;
 mod hal;
 pub(crate) mod sec2;
diff --git a/drivers/gpu/nova-core/falcon/fsp.rs b/drivers/gpu/nova-core/falcon/fsp.rs
new file mode 100644
index 000000000000..c5ba1c2412cd
--- /dev/null
+++ b/drivers/gpu/nova-core/falcon/fsp.rs
@@ -0,0 +1,30 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! FSP (Firmware System Processor) falcon engine for Hopper/Blackwell GPUs.
+//!
+//! The FSP falcon handles secure boot and Chain of Trust operations
+//! on Hopper and Blackwell architectures, replacing SEC2's role.
+
+use crate::{
+    falcon::{
+        FalconEngine,
+        PFalcon2Base,
+        PFalconBase, //
+    },
+    regs::macros::RegisterBase,
+};
+
+/// Type specifying the `Fsp` falcon engine. Cannot be instantiated.
+pub(crate) struct Fsp(());
+
+impl RegisterBase<PFalconBase> for Fsp {
+    const BASE: usize = 0x8f2000;
+}
+
+impl RegisterBase<PFalcon2Base> for Fsp {
+    const BASE: usize = 0x8f3000;
+}
+
+impl FalconEngine for Fsp {
+    const ID: Self = Fsp(());
+}
-- 
2.53.0


* [PATCH v7 14/31] gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (12 preceding siblings ...)
  2026-03-17 22:53 ` [PATCH v7 13/31] gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-17 22:53 ` [PATCH v7 15/31] gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure John Hubbard
                   ` (17 subsequent siblings)
  31 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Add external memory (EMEM) read/write operations to the GPU's FSP falcon
engine. These operations use Falcon PIO (Programmed I/O) to communicate
with the FSP through indirect memory access.

Cc: Gary Guo <gary@garyguo.net>
Cc: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/falcon/fsp.rs | 120 +++++++++++++++++++++++++++-
 drivers/gpu/nova-core/regs.rs       |  12 +++
 2 files changed, 131 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/nova-core/falcon/fsp.rs b/drivers/gpu/nova-core/falcon/fsp.rs
index c5ba1c2412cd..29a68d6934a9 100644
--- a/drivers/gpu/nova-core/falcon/fsp.rs
+++ b/drivers/gpu/nova-core/falcon/fsp.rs
@@ -5,13 +5,26 @@
 //! The FSP falcon handles secure boot and Chain of Trust operations
 //! on Hopper and Blackwell architectures, replacing SEC2's role.
 
+use kernel::{
+    io::{
+        Io,
+        IoCapable, //
+    },
+    prelude::*, //
+};
+
 use crate::{
+    driver::Bar0,
     falcon::{
+        Falcon,
         FalconEngine,
         PFalcon2Base,
         PFalconBase, //
     },
-    regs::macros::RegisterBase,
+    regs::{
+        self,
+        macros::RegisterBase, //
+    },
 };
 
 /// Type specifying the `Fsp` falcon engine. Cannot be instantiated.
@@ -28,3 +41,108 @@ impl RegisterBase<PFalcon2Base> for Fsp {
 impl FalconEngine for Fsp {
     const ID: Self = Fsp(());
 }
+
+/// Maximum addressable EMEM size, derived from the 24-bit offset field
+/// in NV_PFALCON_FALCON_EMEM_CTL.
+const EMEM_MAX_SIZE: usize = 1 << 24;
+
+/// I/O backend for the FSP falcon's external memory (EMEM).
+///
+/// Each 32-bit access programs a byte offset via the EMEM_CTL register,
+/// then reads or writes through the EMEM_DATA register.
+pub(crate) struct Emem<'a> {
+    bar: &'a Bar0,
+}
+
+impl<'a> Emem<'a> {
+    fn new(bar: &'a Bar0) -> Self {
+        Self { bar }
+    }
+}
+
+impl IoCapable<u32> for Emem<'_> {
+    unsafe fn io_read(&self, address: usize) -> u32 {
+        // The Io trait validates that EMEM accesses fit within the 24-bit offset field.
+        let offset = address as u32;
+
+        regs::NV_PFALCON_FALCON_EMEM_CTL::default()
+            .set_rd_mode(true)
+            .set_offset(offset)
+            .write(self.bar, &Fsp::ID);
+
+        regs::NV_PFALCON_FALCON_EMEM_DATA::read(self.bar, &Fsp::ID).data()
+    }
+
+    unsafe fn io_write(&self, value: u32, address: usize) {
+        // The Io trait validates that EMEM accesses fit within the 24-bit offset field.
+        let offset = address as u32;
+
+        regs::NV_PFALCON_FALCON_EMEM_CTL::default()
+            .set_wr_mode(true)
+            .set_offset(offset)
+            .write(self.bar, &Fsp::ID);
+
+        regs::NV_PFALCON_FALCON_EMEM_DATA::default()
+            .set_data(value)
+            .write(self.bar, &Fsp::ID);
+    }
+}
+
+impl Io for Emem<'_> {
+    fn addr(&self) -> usize {
+        0
+    }
+
+    fn maxsize(&self) -> usize {
+        EMEM_MAX_SIZE
+    }
+}
+
+impl Falcon<Fsp> {
+    /// Returns an EMEM I/O accessor for this FSP falcon.
+    pub(crate) fn emem<'a>(&self, bar: &'a Bar0) -> Emem<'a> {
+        Emem::new(bar)
+    }
+
+    /// Writes `data` to FSP external memory at byte `offset`.
+    ///
+    /// Data is interpreted as little-endian 32-bit words.
+    /// Returns `EINVAL` if offset or data length is not 4-byte aligned.
+    #[expect(unused)]
+    pub(crate) fn write_emem(&self, bar: &Bar0, offset: u32, data: &[u8]) -> Result {
+        if offset % 4 != 0 || data.len() % 4 != 0 {
+            return Err(EINVAL);
+        }
+
+        let emem = self.emem(bar);
+        let mut off = offset as usize;
+        for chunk in data.chunks_exact(4) {
+            let word = u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
+            emem.try_write32(word, off)?;
+            off += 4;
+        }
+
+        Ok(())
+    }
+
+    /// Reads FSP external memory at byte `offset` into `data`.
+    ///
+    /// Data is stored as little-endian 32-bit words.
+    /// Returns `EINVAL` if offset or data length is not 4-byte aligned.
+    #[expect(unused)]
+    pub(crate) fn read_emem(&self, bar: &Bar0, offset: u32, data: &mut [u8]) -> Result {
+        if offset % 4 != 0 || data.len() % 4 != 0 {
+            return Err(EINVAL);
+        }
+
+        let emem = self.emem(bar);
+        let mut off = offset as usize;
+        for chunk in data.chunks_exact_mut(4) {
+            let word = emem.try_read32(off)?;
+            chunk.copy_from_slice(&word.to_le_bytes());
+            off += 4;
+        }
+
+        Ok(())
+    }
+}
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index 53f412f0ca32..f577800db3e3 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -461,6 +461,18 @@ pub(crate) fn reset_engine<E: FalconEngine>(bar: &Bar0) {
     8:8     br_fetch as bool;
 });
 
+// Falcon EMEM PIO registers (used by FSP on Hopper/Blackwell).
+// These provide the falcon external memory communication interface.
+register!(NV_PFALCON_FALCON_EMEM_CTL @ PFalconBase[0x00000ac0] {
+    23:0    offset as u32;      // EMEM byte offset (must be 4-byte aligned)
+    24:24   wr_mode as bool;    // Write mode
+    25:25   rd_mode as bool;    // Read mode
+});
+
+register!(NV_PFALCON_FALCON_EMEM_DATA @ PFalconBase[0x00000ac4] {
+    31:0    data as u32;        // EMEM data register
+});
+
 // The modules below provide registers that are not identical on all supported chips. They should
 // only be used in HAL modules.
 
-- 
2.53.0
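
For readers following the EMEM patch above, the indirect access pattern can be modeled outside the kernel. This is only a sketch under stated assumptions: `MockEmem`, its `Vec<u32>` backing store, and the `&'static str` error type are hypothetical stand-ins, not the nova-core `Emem`, `Io`, or `Bar0` types. It mirrors the alignment check and the little-endian word packing that `write_emem()` and `read_emem()` perform.

```rust
/// Hypothetical stand-in for the EMEM_CTL/EMEM_DATA register pair:
/// an offset latch plus a word-addressed backing store.
struct MockEmem {
    ctl_offset: usize, // last byte offset programmed via the "control register"
    words: Vec<u32>,   // simulated external memory, one entry per 32-bit word
}

impl MockEmem {
    fn new(size_bytes: usize) -> Self {
        Self { ctl_offset: 0, words: vec![0; size_bytes / 4] }
    }

    /// Program the byte offset, then write through the "data register".
    fn write32(&mut self, value: u32, offset: usize) -> Result<(), &'static str> {
        self.ctl_offset = offset;
        let slot = self
            .words
            .get_mut(self.ctl_offset / 4)
            .ok_or("offset out of range")?;
        *slot = value;
        Ok(())
    }

    /// Program the byte offset, then read through the "data register".
    fn read32(&mut self, offset: usize) -> Result<u32, &'static str> {
        self.ctl_offset = offset;
        self.words
            .get(self.ctl_offset / 4)
            .copied()
            .ok_or("offset out of range")
    }
}

/// Write `data` at byte `offset` as little-endian 32-bit words.
fn write_emem(emem: &mut MockEmem, offset: u32, data: &[u8]) -> Result<(), &'static str> {
    if offset % 4 != 0 || data.len() % 4 != 0 {
        return Err("offset and length must be 4-byte aligned");
    }
    let mut off = offset as usize;
    for chunk in data.chunks_exact(4) {
        let word = u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
        emem.write32(word, off)?;
        off += 4;
    }
    Ok(())
}

/// Read from byte `offset` into `data`, unpacking little-endian 32-bit words.
fn read_emem(emem: &mut MockEmem, offset: u32, data: &mut [u8]) -> Result<(), &'static str> {
    if offset % 4 != 0 || data.len() % 4 != 0 {
        return Err("offset and length must be 4-byte aligned");
    }
    let mut off = offset as usize;
    for chunk in data.chunks_exact_mut(4) {
        let word = emem.read32(off)?;
        chunk.copy_from_slice(&word.to_le_bytes());
        off += 4;
    }
    Ok(())
}
```

In this model an unaligned offset or length is rejected before any access is made, and an out-of-range offset is rejected per word, which is roughly the division of labour the patch describes between the helper functions and the `Io` bounds check.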


* [PATCH v7 15/31] gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (13 preceding siblings ...)
  2026-03-17 22:53 ` [PATCH v7 14/31] gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-17 22:53 ` [PATCH v7 16/31] rust: ptr: add const_align_up() John Hubbard
                   ` (16 subsequent siblings)
  31 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Add the FSP messaging infrastructure needed for Chain of Trust
communication on Hopper/Blackwell GPUs.

Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/falcon/fsp.rs | 79 ++++++++++++++++++++++++++++-
 drivers/gpu/nova-core/regs.rs       | 18 +++++++
 2 files changed, 95 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/nova-core/falcon/fsp.rs b/drivers/gpu/nova-core/falcon/fsp.rs
index 29a68d6934a9..faf923246ae9 100644
--- a/drivers/gpu/nova-core/falcon/fsp.rs
+++ b/drivers/gpu/nova-core/falcon/fsp.rs
@@ -108,7 +108,6 @@ pub(crate) fn emem<'a>(&self, bar: &'a Bar0) -> Emem<'a> {
     ///
     /// Data is interpreted as little-endian 32-bit words.
     /// Returns `EINVAL` if offset or data length is not 4-byte aligned.
-    #[expect(unused)]
     pub(crate) fn write_emem(&self, bar: &Bar0, offset: u32, data: &[u8]) -> Result {
         if offset % 4 != 0 || data.len() % 4 != 0 {
             return Err(EINVAL);
@@ -129,7 +128,6 @@ pub(crate) fn write_emem(&self, bar: &Bar0, offset: u32, data: &[u8]) -> Result
     ///
     /// Data is stored as little-endian 32-bit words.
     /// Returns `EINVAL` if offset or data length is not 4-byte aligned.
-    #[expect(unused)]
     pub(crate) fn read_emem(&self, bar: &Bar0, offset: u32, data: &mut [u8]) -> Result {
         if offset % 4 != 0 || data.len() % 4 != 0 {
             return Err(EINVAL);
@@ -145,4 +143,81 @@ pub(crate) fn read_emem(&self, bar: &Bar0, offset: u32, data: &mut [u8]) -> Resu
 
         Ok(())
     }
+
+    /// Poll FSP for incoming data.
+    ///
+    /// Returns the size of available data in bytes, or 0 if no data is available.
+    ///
+    /// The FSP message queue is not circular - pointers are reset to 0 after each
+    /// message exchange, so `tail >= head` is always true when data is present.
+    #[expect(unused)]
+    pub(crate) fn poll_msgq(&self, bar: &Bar0) -> u32 {
+        let head = regs::NV_PFSP_MSGQ_HEAD::read(bar).address();
+        let tail = regs::NV_PFSP_MSGQ_TAIL::read(bar).address();
+
+        if head == tail {
+            return 0;
+        }
+
+        // TAIL points at last DWORD written, so add 4 to get total size
+        tail.saturating_sub(head) + 4
+    }
+
+    /// Send message to FSP.
+    ///
+    /// Writes a message to FSP EMEM and updates queue pointers to notify FSP.
+    ///
+    /// # Arguments
+    /// * `bar` - BAR0 memory mapping
+    /// * `packet` - Message data (must be 4-byte aligned in length)
+    ///
+    /// # Returns
+    /// `Ok(())` on success, `Err(EINVAL)` if packet is empty or not 4-byte aligned
+    #[expect(unused)]
+    pub(crate) fn send_msg(&self, bar: &Bar0, packet: &[u8]) -> Result {
+        if packet.is_empty() {
+            return Err(EINVAL);
+        }
+
+        // Write message to EMEM at offset 0 (validates 4-byte alignment)
+        self.write_emem(bar, 0, packet)?;
+
+        // Update queue pointers - TAIL points at last DWORD written
+        let tail_offset = u32::try_from(packet.len() - 4).map_err(|_| EINVAL)?;
+        regs::NV_PFSP_QUEUE_TAIL::default()
+            .set_address(tail_offset)
+            .write(bar);
+        regs::NV_PFSP_QUEUE_HEAD::default()
+            .set_address(0)
+            .write(bar);
+
+        Ok(())
+    }
+
+    /// Receive message from FSP.
+    ///
+    /// Reads a message from FSP EMEM and resets queue pointers.
+    ///
+    /// # Arguments
+    /// * `bar` - BAR0 memory mapping
+    /// * `buffer` - Buffer to receive message data
+    /// * `size` - Size of message to read in bytes (from `poll_msgq`)
+    ///
+    /// # Returns
+    /// `Ok(bytes_read)` on success, `Err(EINVAL)` if size is 0, exceeds buffer, or not aligned
+    #[expect(unused)]
+    pub(crate) fn recv_msg(&self, bar: &Bar0, buffer: &mut [u8], size: usize) -> Result<usize> {
+        if size == 0 || size > buffer.len() {
+            return Err(EINVAL);
+        }
+
+        // Read response from EMEM at offset 0 (validates 4-byte alignment)
+        self.read_emem(bar, 0, &mut buffer[..size])?;
+
+        // Reset message queue pointers after reading
+        regs::NV_PFSP_MSGQ_TAIL::default().set_address(0).write(bar);
+        regs::NV_PFSP_MSGQ_HEAD::default().set_address(0).write(bar);
+
+        Ok(size)
+    }
 }
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index f577800db3e3..686556bb9f38 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -473,6 +473,24 @@ pub(crate) fn reset_engine<E: FalconEngine>(bar: &Bar0) {
     31:0    data as u32;        // EMEM data register
 });
 
+// FSP (Firmware System Processor) queue registers for Hopper/Blackwell Chain of Trust
+// These registers manage falcon EMEM communication queues
+register!(NV_PFSP_QUEUE_HEAD @ 0x008f2c00 {
+    31:0    address as u32;
+});
+
+register!(NV_PFSP_QUEUE_TAIL @ 0x008f2c04 {
+    31:0    address as u32;
+});
+
+register!(NV_PFSP_MSGQ_HEAD @ 0x008f2c80 {
+    31:0    address as u32;
+});
+
+register!(NV_PFSP_MSGQ_TAIL @ 0x008f2c84 {
+    31:0    address as u32;
+});
+
 // The modules below provide registers that are not identical on all supported chips. They should
 // only be used in HAL modules.
 
-- 
2.53.0
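
The queue pointer convention used by `poll_msgq()` and `send_msg()` above (TAIL points at the last DWORD written, not one past it) is easy to get wrong by four bytes. The sketch below restates just that arithmetic in standalone Rust. The function names `pending_bytes` and `tail_for_len` are hypothetical and do not exist in nova-core; no registers are touched.

```rust
/// Number of bytes pending, given HEAD and TAIL byte offsets.
/// TAIL points at the last DWORD written, so 4 is added to cover it.
fn pending_bytes(head: u32, tail: u32) -> u32 {
    if head == tail {
        return 0;
    }
    tail.saturating_sub(head) + 4
}

/// TAIL value to program after writing `len` bytes starting at offset 0.
/// Returns `None` for an empty or non-4-byte-multiple length.
fn tail_for_len(len: usize) -> Option<u32> {
    if len == 0 || len % 4 != 0 {
        return None;
    }
    u32::try_from(len - 4).ok()
}
```

For a 16-byte packet, `tail_for_len(16)` gives 12 and `pending_bytes(0, 12)` gives 16 again, so the two directions are consistent. Under this model a 4-byte packet yields a TAIL of 0, which equals HEAD and therefore reads back as "no data"; whether that case matters depends on the minimum FSP message size, which this sketch does not assume.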


* [PATCH v7 16/31] rust: ptr: add const_align_up()
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (14 preceding siblings ...)
  2026-03-17 22:53 ` [PATCH v7 15/31] gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-20  8:37   ` David Rheinsberg
  2026-03-20  9:48   ` Alice Ryhl
  2026-03-17 22:53 ` [PATCH v7 17/31] gpu: nova-core: Hopper/Blackwell: calculate reserved FB heap size John Hubbard
                   ` (15 subsequent siblings)
  31 siblings, 2 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Add const_align_up() to kernel::ptr as the const-compatible equivalent
of Alignable::align_up().

Suggested-by: Danilo Krummrich <dakr@kernel.org>
Suggested-by: Gary Guo <gary@garyguo.net>
Suggested-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 rust/kernel/ptr.rs | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/rust/kernel/ptr.rs b/rust/kernel/ptr.rs
index bdc2d79ff669..7e99f129543b 100644
--- a/rust/kernel/ptr.rs
+++ b/rust/kernel/ptr.rs
@@ -253,3 +253,27 @@ fn size(p: *const Self) -> usize {
         p.len() * size_of::<T>()
     }
 }
+
+/// Aligns `value` up to `align`.
+///
+/// This is the const-compatible equivalent of [`Alignable::align_up`].
+///
+/// Returns [`None`] on overflow.
+///
+/// # Examples
+///
+/// ```
+/// use kernel::ptr::{const_align_up, Alignment};
+/// use kernel::sizes::SZ_4K;
+///
+/// assert_eq!(const_align_up(0x4f, Alignment::new::<16>()), Some(0x50));
+/// assert_eq!(const_align_up(0x40, Alignment::new::<16>()), Some(0x40));
+/// assert_eq!(const_align_up(1, Alignment::new::<SZ_4K>()), Some(SZ_4K));
+/// ```
+#[inline(always)]
+pub const fn const_align_up(value: usize, align: Alignment) -> Option<usize> {
+    match value.checked_add(align.as_usize() - 1) {
+        Some(v) => Some(v & align.mask()),
+        None => None,
+    }
+}
-- 
2.53.0


* Re: [PATCH v7 16/31] rust: ptr: add const_align_up()
  2026-03-17 22:53 ` [PATCH v7 16/31] rust: ptr: add const_align_up() John Hubbard
@ 2026-03-20  8:37   ` David Rheinsberg
  2026-03-20  8:44     ` Alice Ryhl
  2026-03-20  9:48   ` Alice Ryhl
  1 sibling, 1 reply; 66+ messages in thread
From: David Rheinsberg @ 2026-03-20  8:37 UTC (permalink / raw)
  To: John Hubbard, Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML

Hi

On Tue, Mar 17, 2026, at 11:53 PM, John Hubbard wrote:
> Add const_align_up() to kernel::ptr as the const-compatible equivalent
> of Alignable::align_up().
>
> Suggested-by: Danilo Krummrich <dakr@kernel.org>
> Suggested-by: Gary Guo <gary@garyguo.net>
> Suggested-by: Miguel Ojeda <ojeda@kernel.org>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  rust/kernel/ptr.rs | 24 ++++++++++++++++++++++++
>  1 file changed, 24 insertions(+)
>
> diff --git a/rust/kernel/ptr.rs b/rust/kernel/ptr.rs
> index bdc2d79ff669..7e99f129543b 100644
> --- a/rust/kernel/ptr.rs
> +++ b/rust/kernel/ptr.rs
> @@ -253,3 +253,27 @@ fn size(p: *const Self) -> usize {
>          p.len() * size_of::<T>()
>      }
>  }
> +
> +/// Aligns `value` up to `align`.
> +///
> +/// This is the const-compatible equivalent of [`Alignable::align_up`].
> +///
> +/// Returns [`None`] on overflow.
> +///
> +/// # Examples
> +///
> +/// ```
> +/// use kernel::ptr::{const_align_up, Alignment};
> +/// use kernel::sizes::SZ_4K;
> +///
> +/// assert_eq!(const_align_up(0x4f, Alignment::new::<16>()), Some(0x50));
> +/// assert_eq!(const_align_up(0x40, Alignment::new::<16>()), Some(0x40));
> +/// assert_eq!(const_align_up(1, Alignment::new::<SZ_4K>()), Some(SZ_4K));
> +/// ```
> +#[inline(always)]
> +pub const fn const_align_up(value: usize, align: Alignment) -> Option<usize> {
> +    match value.checked_add(align.as_usize() - 1) {
> +        Some(v) => Some(v & align.mask()),
> +        None => None,
> +    }

This would return `None` if the value is already aligned, but the addition overflows `usize`, right? For instance, this would incorrectly return `None`: `const_align_up(usize::MAX - 1, 2.into())`

FYI, `core` provides `usize::checked_next_multiple_of()` ((const-)stable since 1.73). So an alternative would be:

pub const fn const_align_up(value: usize, align: Alignment) -> Option<usize> {
    value.checked_next_multiple_of(align.as_usize())
}

David

* Re: [PATCH v7 16/31] rust: ptr: add const_align_up()
  2026-03-20  8:37   ` David Rheinsberg
@ 2026-03-20  8:44     ` Alice Ryhl
  2026-03-20  8:58       ` David Rheinsberg
  0 siblings, 1 reply; 66+ messages in thread
From: Alice Ryhl @ 2026-03-20  8:44 UTC (permalink / raw)
  To: David Rheinsberg
  Cc: John Hubbard, Danilo Krummrich, Alexandre Courbot, Joel Fernandes,
	Timur Tabi, Alistair Popple, Eliot Courtney, Shashank Sharma,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Trevor Gross, rust-for-linux, LKML

On Fri, Mar 20, 2026 at 9:38 AM David Rheinsberg <david@readahead.eu> wrote:
>
> Hi
>
> On Tue, Mar 17, 2026, at 11:53 PM, John Hubbard wrote:
> > Add const_align_up() to kernel::ptr as the const-compatible equivalent
> > of Alignable::align_up().
> >
> > Suggested-by: Danilo Krummrich <dakr@kernel.org>
> > Suggested-by: Gary Guo <gary@garyguo.net>
> > Suggested-by: Miguel Ojeda <ojeda@kernel.org>
> > Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> > ---
> >  rust/kernel/ptr.rs | 24 ++++++++++++++++++++++++
> >  1 file changed, 24 insertions(+)
> >
> > diff --git a/rust/kernel/ptr.rs b/rust/kernel/ptr.rs
> > index bdc2d79ff669..7e99f129543b 100644
> > --- a/rust/kernel/ptr.rs
> > +++ b/rust/kernel/ptr.rs
> > @@ -253,3 +253,27 @@ fn size(p: *const Self) -> usize {
> >          p.len() * size_of::<T>()
> >      }
> >  }
> > +
> > +/// Aligns `value` up to `align`.
> > +///
> > +/// This is the const-compatible equivalent of [`Alignable::align_up`].
> > +///
> > +/// Returns [`None`] on overflow.
> > +///
> > +/// # Examples
> > +///
> > +/// ```
> > +/// use kernel::ptr::{const_align_up, Alignment};
> > +/// use kernel::sizes::SZ_4K;
> > +///
> > +/// assert_eq!(const_align_up(0x4f, Alignment::new::<16>()), Some(0x50));
> > +/// assert_eq!(const_align_up(0x40, Alignment::new::<16>()), Some(0x40));
> > +/// assert_eq!(const_align_up(1, Alignment::new::<SZ_4K>()), Some(SZ_4K));
> > +/// ```
> > +#[inline(always)]
> > +pub const fn const_align_up(value: usize, align: Alignment) -> Option<usize> {
> > +    match value.checked_add(align.as_usize() - 1) {
> > +        Some(v) => Some(v & align.mask()),
> > +        None => None,
> > +    }
>
> This would return `None` if the value is already aligned, but the addition overflows `usize`, right? For instance, this would incorrectly return `None`: `const_align_up(usize::MAX - 1, 2.into())`

No, in that case it computes `usize::MAX-1 + (2-1)` which is just
usize::MAX and does not overflow. After applying the mask, it returns
`usize::MAX-1` as the return value.

> FYI, `core` provides `usize::checked_next_multiple_of()` ((const-)stable since 1.73). So an alternative would be:
>
> pub const fn const_align_up(value: usize, align: Alignment) -> Option<usize> {
>     value.checked_next_multiple_of(align.as_usize())
> }

That would return value+align when value is already aligned, which is wrong.

Alice
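
The arithmetic in the reply above can be checked with a standalone model. This is a sketch, not the kernel code: `align_up_pow2` is a hypothetical name, it takes a plain `usize` instead of `Alignment`, and it assumes that `align.mask()` is equivalent to `!(align - 1)` for a power-of-two alignment.

```rust
/// Standalone model of the mask-based align-up from the patch.
/// `align` must be a power of two (the kernel `Alignment` type guarantees
/// this; here it is only asserted).
const fn align_up_pow2(value: usize, align: usize) -> Option<usize> {
    assert!(align.is_power_of_two());
    match value.checked_add(align - 1) {
        // Clearing the low bits rounds the sum down to a multiple of `align`.
        Some(v) => Some(v & !(align - 1)),
        // The rounded-up result would not fit in `usize`.
        None => None,
    }
}
```

With this model, `align_up_pow2(usize::MAX - 1, 2)` adds 1 to reach `usize::MAX` without overflow and masks back down to `usize::MAX - 1`, while `align_up_pow2(usize::MAX, 2)` overflows on the addition and returns `None`.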

* Re: [PATCH v7 16/31] rust: ptr: add const_align_up()
  2026-03-20  8:44     ` Alice Ryhl
@ 2026-03-20  8:58       ` David Rheinsberg
  2026-03-20  9:03         ` Alice Ryhl
  0 siblings, 1 reply; 66+ messages in thread
From: David Rheinsberg @ 2026-03-20  8:58 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: John Hubbard, Danilo Krummrich, Alexandre Courbot, Joel Fernandes,
	Timur Tabi, Alistair Popple, Eliot Courtney, Shashank Sharma,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Trevor Gross, rust-for-linux, LKML

Hi

On Fri, Mar 20, 2026, at 9:44 AM, Alice Ryhl wrote:
> On Fri, Mar 20, 2026 at 9:38 AM David Rheinsberg <david@readahead.eu> wrote:
>>
>> Hi
>>
>> On Tue, Mar 17, 2026, at 11:53 PM, John Hubbard wrote:
>> > Add const_align_up() to kernel::ptr as the const-compatible equivalent
>> > of Alignable::align_up().
>> >
>> > Suggested-by: Danilo Krummrich <dakr@kernel.org>
>> > Suggested-by: Gary Guo <gary@garyguo.net>
>> > Suggested-by: Miguel Ojeda <ojeda@kernel.org>
>> > Signed-off-by: John Hubbard <jhubbard@nvidia.com>
>> > ---
>> >  rust/kernel/ptr.rs | 24 ++++++++++++++++++++++++
>> >  1 file changed, 24 insertions(+)
>> >
>> > diff --git a/rust/kernel/ptr.rs b/rust/kernel/ptr.rs
>> > index bdc2d79ff669..7e99f129543b 100644
>> > --- a/rust/kernel/ptr.rs
>> > +++ b/rust/kernel/ptr.rs
>> > @@ -253,3 +253,27 @@ fn size(p: *const Self) -> usize {
>> >          p.len() * size_of::<T>()
>> >      }
>> >  }
>> > +
>> > +/// Aligns `value` up to `align`.
>> > +///
>> > +/// This is the const-compatible equivalent of [`Alignable::align_up`].
>> > +///
>> > +/// Returns [`None`] on overflow.
>> > +///
>> > +/// # Examples
>> > +///
>> > +/// ```
>> > +/// use kernel::ptr::{const_align_up, Alignment};
>> > +/// use kernel::sizes::SZ_4K;
>> > +///
>> > +/// assert_eq!(const_align_up(0x4f, Alignment::new::<16>()), Some(0x50));
>> > +/// assert_eq!(const_align_up(0x40, Alignment::new::<16>()), Some(0x40));
>> > +/// assert_eq!(const_align_up(1, Alignment::new::<SZ_4K>()), Some(SZ_4K));
>> > +/// ```
>> > +#[inline(always)]
>> > +pub const fn const_align_up(value: usize, align: Alignment) -> Option<usize> {
>> > +    match value.checked_add(align.as_usize() - 1) {
>> > +        Some(v) => Some(v & align.mask()),
>> > +        None => None,
>> > +    }
>>
>> This would return `None` if the value is already aligned, but the addition overflows `usize`, right? For instance, this would incorrectly return `None`: `const_align_up(usize::MAX - 1, 2.into())`
>
> No, in that case it computes `usize::MAX-1 + (2-1)` which is just
> usize::MAX and does not overflow. After applying the mask, it returns
> `usize::MAX-1` as the return value.

My bad! If alignment is below `usize::MAX / 2` then this always works. Once alignment gets bigger, the function fails, though. The following assertion does not hold:

assert_eq!(const_align_up(usize::MAX - 1, (usize::MAX - 1).into()), Some(usize::MAX - 1));

Doesn't matter too much, I guess, but with `checked_next_multiple_of()` this assertion holds.

>> FYI, `core` provides `usize::checked_next_multiple_of()` ((const-)stable since 1.73). So an alternative would be:
>>
>> pub const fn const_align_up(value: usize, align: Alignment) -> Option<usize> {
>>     value.checked_next_multiple_of(align.as_usize())
>> }
>
> That would return value+align when value is already aligned, which is wrong.

You sure? (emphasis mine:)

"Calculates the smallest value greater than or **EQUAL TO** self that is a multiple of rhs."

assert_eq!(16_u64.next_multiple_of(8), 16);

Thanks
David
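
The two approaches discussed in this message can be compared directly in standalone Rust. The sketch below is illustrative only: `mask_align_up` and `agree_for` are hypothetical names, alignments are plain `usize` powers of two instead of the kernel `Alignment` type, and only a sample of values at both ends of the `usize` range is checked.

```rust
/// Mask-based align-up, modeled on the patch; `align` must be a power of two.
fn mask_align_up(value: usize, align: usize) -> Option<usize> {
    debug_assert!(align.is_power_of_two());
    value.checked_add(align - 1).map(|v| v & !(align - 1))
}

/// Returns true if the mask-based version and `checked_next_multiple_of`
/// agree for the `limit` smallest values and the `limit + 1` largest values.
fn agree_for(align: usize, limit: usize) -> bool {
    (0..limit)
        .chain((usize::MAX - limit)..=usize::MAX)
        .all(|v| mask_align_up(v, align) == v.checked_next_multiple_of(align))
}
```

For an already aligned input such as 16 with alignment 8, `checked_next_multiple_of` returns the input unchanged, matching the documentation quoted above. Restricted to power-of-two alignments, the sampled values give the same result from both functions, including the overflow cases near `usize::MAX`.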

* Re: [PATCH v7 16/31] rust: ptr: add const_align_up()
  2026-03-20  8:58       ` David Rheinsberg
@ 2026-03-20  9:03         ` Alice Ryhl
  2026-03-20  9:26           ` David Rheinsberg
  0 siblings, 1 reply; 66+ messages in thread
From: Alice Ryhl @ 2026-03-20  9:03 UTC (permalink / raw)
  To: David Rheinsberg
  Cc: John Hubbard, Danilo Krummrich, Alexandre Courbot, Joel Fernandes,
	Timur Tabi, Alistair Popple, Eliot Courtney, Shashank Sharma,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Trevor Gross, rust-for-linux, LKML

On Fri, Mar 20, 2026 at 9:58 AM David Rheinsberg <david@readahead.eu> wrote:
>
> Hi
>
> On Fri, Mar 20, 2026, at 9:44 AM, Alice Ryhl wrote:
> > On Fri, Mar 20, 2026 at 9:38 AM David Rheinsberg <david@readahead.eu> wrote:
> >>
> >> Hi
> >>
> >> On Tue, Mar 17, 2026, at 11:53 PM, John Hubbard wrote:
> >> > Add const_align_up() to kernel::ptr as the const-compatible equivalent
> >> > of Alignable::align_up().
> >> >
> >> > Suggested-by: Danilo Krummrich <dakr@kernel.org>
> >> > Suggested-by: Gary Guo <gary@garyguo.net>
> >> > Suggested-by: Miguel Ojeda <ojeda@kernel.org>
> >> > Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> >> > ---
> >> >  rust/kernel/ptr.rs | 24 ++++++++++++++++++++++++
> >> >  1 file changed, 24 insertions(+)
> >> >
> >> > diff --git a/rust/kernel/ptr.rs b/rust/kernel/ptr.rs
> >> > index bdc2d79ff669..7e99f129543b 100644
> >> > --- a/rust/kernel/ptr.rs
> >> > +++ b/rust/kernel/ptr.rs
> >> > @@ -253,3 +253,27 @@ fn size(p: *const Self) -> usize {
> >> >          p.len() * size_of::<T>()
> >> >      }
> >> >  }
> >> > +
> >> > +/// Aligns `value` up to `align`.
> >> > +///
> >> > +/// This is the const-compatible equivalent of [`Alignable::align_up`].
> >> > +///
> >> > +/// Returns [`None`] on overflow.
> >> > +///
> >> > +/// # Examples
> >> > +///
> >> > +/// ```
> >> > +/// use kernel::ptr::{const_align_up, Alignment};
> >> > +/// use kernel::sizes::SZ_4K;
> >> > +///
> >> > +/// assert_eq!(const_align_up(0x4f, Alignment::new::<16>()), Some(0x50));
> >> > +/// assert_eq!(const_align_up(0x40, Alignment::new::<16>()), Some(0x40));
> >> > +/// assert_eq!(const_align_up(1, Alignment::new::<SZ_4K>()), Some(SZ_4K));
> >> > +/// ```
> >> > +#[inline(always)]
> >> > +pub const fn const_align_up(value: usize, align: Alignment) -> Option<usize> {
> >> > +    match value.checked_add(align.as_usize() - 1) {
> >> > +        Some(v) => Some(v & align.mask()),
> >> > +        None => None,
> >> > +    }
> >>
> >> This would return `None` if the value is already aligned, but the addition overflows `usize`, right? For instance, this would incorrectly return `None`: `const_align_up(usize::MAX - 1, 2.into())`
> >
> > No, in that case it computes `usize::MAX-1 + (2-1)` which is just
> > usize::MAX and does not overflow. After applying the mask, it returns
> > `usize::MAX-1` as the return value.
>
> My bad! If alignment is below `usize::MAX / 2` then this always works. Once alignment gets bigger, the function fails, though. The following assertion does not hold:
>
> assert_eq!(const_align_up(usize::MAX - 1, (usize::MAX - 1).into()), Some(usize::MAX - 1));
>
> Doesn't matter too much, I guess, but with `checked_next_multiple_of()` this assertion holds.

The Alignment type can only hold values that are a power of two.

> >> FYI, `core` provides `usize::checked_next_multiple_of()` ((const-)stable since 1.73). So an alternative would be:
> >>
> >> pub const fn const_align_up(value: usize, align: Alignment) -> Option<usize> {
> >>     value.checked_next_multiple_of(align.as_usize())
> >> }
> >
> > That would return value+align when value is already aligned, which is wrong.
>
> You sure? (emphasis mine:)
>
> "Calculates the smallest value greater than or **EQUAL TO** self that is a multiple of rhs."
>
> assert_eq!(16_u64.next_multiple_of(8), 16);

Okay, well, weird naming then.

Alice

* Re: [PATCH v7 16/31] rust: ptr: add const_align_up()
  2026-03-20  9:03         ` Alice Ryhl
@ 2026-03-20  9:26           ` David Rheinsberg
  2026-03-20  9:47             ` Alice Ryhl
  0 siblings, 1 reply; 66+ messages in thread
From: David Rheinsberg @ 2026-03-20  9:26 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: John Hubbard, Danilo Krummrich, Alexandre Courbot, Joel Fernandes,
	Timur Tabi, Alistair Popple, Eliot Courtney, Shashank Sharma,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Trevor Gross, rust-for-linux, LKML

Hi!

On Fri, Mar 20, 2026, at 10:03 AM, Alice Ryhl wrote:
> The Alignment type can only hold values that are a power of two.

That solves my concern!

Kinda off-topic: why doesn't `Alignment` store a u8 that represents the exponent, rather than the power? The left-shift when needing the power should be effectively free, shouldn't it? It would avoid all the unsafety in the impl.

>> >> FYI, `core` provides `usize::checked_next_multiple_of()` ((const-)stable since 1.73). So an alternative would be:
>> >>
>> >> pub const fn const_align_up(value: usize, align: Alignment) -> Option<usize> {
>> >>     value.checked_next_multiple_of(align.as_usize())
>> >> }
>> >
>> > That would return value+align when value is already aligned, which is wrong.
>>
>> You sure? (emphasis mine:)
>>
>> "Calculates the smallest value greater than or **EQUAL TO** self that is a multiple of rhs."
>>
>> assert_eq!(16_u64.next_multiple_of(8), 16);
>
> Okay, well, weird naming then.

Do you need this helper then at all? I assume it is added because `Alignable` cannot be used in const. But it hard-codes `usize` as type, yet does not reflect that in the name. It comes down to which one is more readable, I guess:

    const_align_up(value, align)
vs
    value.checked_next_multiple_of(align.as_usize())

Meh, I don't mind too much. Just wanted to point out that the standard library provides this exactly.
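
As a minimal sketch of that point (assuming a plain `usize` alignment rather than the kernel's `Alignment` type, and using the hypothetical name `align_up_std`), the `core` method already leaves aligned values untouched and reports overflow as `None`:

```rust
/// Hypothetical stand-in for the helper under discussion. It takes a plain
/// `usize` alignment instead of the kernel's `Alignment` type.
///
/// Returns `None` if `align` is zero or if rounding up would overflow.
pub const fn align_up_std(value: usize, align: usize) -> Option<usize> {
    // "Smallest value greater than or equal to `value` that is a multiple
    // of `align`", so an already aligned value is returned unchanged.
    value.checked_next_multiple_of(align)
}
```

With this, `align_up_std(16, 8)` is `Some(16)` and `align_up_std(17, 8)` is `Some(24)`.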

David

* Re: [PATCH v7 16/31] rust: ptr: add const_align_up()
  2026-03-20  9:26           ` David Rheinsberg
@ 2026-03-20  9:47             ` Alice Ryhl
  2026-03-20 10:27               ` David Rheinsberg
  0 siblings, 1 reply; 66+ messages in thread
From: Alice Ryhl @ 2026-03-20  9:47 UTC (permalink / raw)
  To: David Rheinsberg
  Cc: John Hubbard, Danilo Krummrich, Alexandre Courbot, Joel Fernandes,
	Timur Tabi, Alistair Popple, Eliot Courtney, Shashank Sharma,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Trevor Gross, rust-for-linux, LKML

On Fri, Mar 20, 2026 at 10:27 AM David Rheinsberg <david@readahead.eu> wrote:
>
> Hi!
>
> On Fri, Mar 20, 2026, at 10:03 AM, Alice Ryhl wrote:
> > The Alignment type can only hold values that are a power of two.
>
> That solves my concern!
>
> Kinda off-topic: why doesn't `Alignment` store a u8 that represents the exponent, rather than the power? The left-shift when needing the power should be effectively free, shouldn't it? It would avoid all the unsafety in the impl.

For one, it mirrors the design of the unstable stdlib Alignment type.
For another, it'd make Alignment::of and similar a pain to implement.

> >> >> FYI, `core` provides `usize::checked_next_multiple_of()` ((const-)stable since 1.73). So an alternative would be:
> >> >>
> >> >> pub const fn const_align_up(value: usize, align: Alignment) -> Option<usize> {
> >> >>     value.checked_next_multiple_of(align.as_usize())
> >> >> }
> >> >
> >> > That would return value+align when value is already aligned, which is wrong.
> >>
> >> You sure? (emphasis mine:)
> >>
> >> "Calculates the smallest value greater than or **EQUAL TO** self that is a multiple of rhs."
> >>
> >> assert_eq!(16_u64.next_multiple_of(8), 16);
> >
> > Okay, well, weird naming then.
>
> Do you need this helper then at all? I assume it is added because `Alignable` cannot be used in const. But it hard-codes `usize` as type, yet does not reflect that in the name. It comes down to which one is more readable, I guess:
>
>     const_align_up(value, align)
> vs
>     value.checked_next_multiple_of(align.as_usize())
>
> Meh, I don't mind too much. Just wanted to point out that the standard library provides this exactly.

The stdlib implementation probably invokes an expensive division
operation in place of our bitwise-and operation when the argument is
not a compile-time constant value.
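
For comparison, a bitwise version could look roughly like the sketch below. This is not the code from the patch: `align_up_mask` is a hypothetical name, and it takes a plain `usize` that it checks for being a power of two, where the kernel's `Alignment` type would guarantee that by construction.

```rust
/// Hypothetical mask-based align-up for power-of-two alignments.
///
/// Returns `None` if `align` is not a power of two or if the addition
/// overflows.
pub const fn align_up_mask(value: usize, align: usize) -> Option<usize> {
    if !align.is_power_of_two() {
        return None;
    }
    // For a power of two, `align - 1` has exactly the low bits set.
    let mask = align - 1;
    // Add the mask, then clear the low bits. No division is involved.
    match value.checked_add(mask) {
        Some(v) => Some(v & !mask),
        None => None,
    }
}
```

Here `align_up_mask(16, 8)` is `Some(16)` and `align_up_mask(17, 8)` is `Some(24)`, the same results the division-based definition gives for power-of-two alignments.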

Alice

* Re: [PATCH v7 16/31] rust: ptr: add const_align_up()
  2026-03-20  9:47             ` Alice Ryhl
@ 2026-03-20 10:27               ` David Rheinsberg
  2026-03-20 11:12                 ` Alice Ryhl
  0 siblings, 1 reply; 66+ messages in thread
From: David Rheinsberg @ 2026-03-20 10:27 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: John Hubbard, Danilo Krummrich, Alexandre Courbot, Joel Fernandes,
	Timur Tabi, Alistair Popple, Eliot Courtney, Shashank Sharma,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Trevor Gross, rust-for-linux, LKML

Hi Alice!

On Fri, Mar 20, 2026, at 10:47 AM, Alice Ryhl wrote:
> On Fri, Mar 20, 2026 at 10:27 AM David Rheinsberg <david@readahead.eu> wrote:
>> Kinda off-topic: why doesn't `Alignment` store a u8 that represents the exponent, rather than the power? The left-shift when needing the power should be effectively free, shouldn't it? It would avoid all the unsafety in the impl.
>
> For one, it mirrors the design of the unstable stdlib Alignment type.
> For another, it'd make Alignment::of and similar a pain to implement.

Fair enough!

>> Do you need this helper then at all? I assume it is added because `Alignable` cannot be used in const. But it hard-codes `usize` as type, yet does not reflect that in the name. It comes down to which one is more readable, I guess:
>>
>>     const_align_up(value, align)
>> vs
>>     value.checked_next_multiple_of(align.as_usize())
>>
>> Meh, I don't mind too much. Just wanted to point out that the standard library provides this exactly.
>
> The stdlib implementation probably invokes an expensive division
> operation in place of our bitwise-and operation when the argument is
> not a compile-time constant value.

Fortunately, it does not! Yay! `Alignment::as_nonzero()` has enough annotations to produce competitive assembly. If you care, see my example on x86_64 below, which uses:

align_up1: v.checked_next_multiple_of(align.as_nonzero())
align_up2: const_align_up(v, align)

Thanks
David


align_up1:
        lea     rax, [rsi - 1]
        and     rax, rdi
        sub     rsi, rax
        lea     rdx, [rsi + rdi]
        cmp     rdx, rdi
        setae   cl
        test    rax, rax
        sete    al
        cmove   rdx, rdi
        or      al, cl
        movzx   eax, al
        ret

align_up2:
        lea     rdx, [rdi + rsi]
        dec     rdx
        xor     eax, eax
        cmp     rdx, rdi
        setae   al
        neg     rsi
        and     rdx, rsi
        ret

* Re: [PATCH v7 16/31] rust: ptr: add const_align_up()
  2026-03-20 10:27               ` David Rheinsberg
@ 2026-03-20 11:12                 ` Alice Ryhl
  2026-03-20 13:14                   ` David Rheinsberg
  0 siblings, 1 reply; 66+ messages in thread
From: Alice Ryhl @ 2026-03-20 11:12 UTC (permalink / raw)
  To: David Rheinsberg
  Cc: John Hubbard, Danilo Krummrich, Alexandre Courbot, Joel Fernandes,
	Timur Tabi, Alistair Popple, Eliot Courtney, Shashank Sharma,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Trevor Gross, rust-for-linux, LKML

On Fri, Mar 20, 2026 at 11:27 AM David Rheinsberg <david@readahead.eu> wrote:
>
> Hi Alice!
>
> On Fri, Mar 20, 2026, at 10:47 AM, Alice Ryhl wrote:
> > On Fri, Mar 20, 2026 at 10:27 AM David Rheinsberg <david@readahead.eu> wrote:
> >> Kinda off-topic: why doesn't `Alignment` store a u8 that represents the exponent, rather than the power? The left-shift when needing the power should be effectively free, shouldn't it? It would avoid all the unsafety in the impl.
> >
> > For one, it mirrors the design of the unstable stdlib Alignment type.
> > For another, it'd make Alignment::of and similar a pain to implement.
>
> Fair enough!
>
> >> Do you need this helper then at all? I assume it is added because `Alignable` cannot be used in const. But it hard-codes `usize` as type, yet does not reflect that in the name. It comes down to which one is more readable, I guess:
> >>
> >>     const_align_up(value, align)
> >> vs
> >>     value.checked_next_multiple_of(align.as_usize())
> >>
> >> Meh, I don't mind too much. Just wanted to point out that the standard library provides this exactly.
> >
> > The stdlib implementation probably invokes an expensive division
> > operation in place of our bitwise-and operation when the argument is
> > not a compile-time constant value.
>
> Fortunately, it does not! Yay! `Alignment::as_nonzero()` has enough annotations to produce competitive assembly. If you care, see my example on x86_64 below, which uses:
>
> align_up1: v.checked_next_multiple_of(align.as_nonzero())
> align_up2: const_align_up(v, align)
>
> Thanks
> David
>
>
> align_up1:
>         lea     rax, [rsi - 1]
>         and     rax, rdi
>         sub     rsi, rax
>         lea     rdx, [rsi + rdi]
>         cmp     rdx, rdi
>         setae   cl
>         test    rax, rax
>         sete    al
>         cmove   rdx, rdi
>         or      al, cl
>         movzx   eax, al
>         ret
>
> align_up2:
>         lea     rdx, [rdi + rsi]
>         dec     rdx
>         xor     eax, eax
>         cmp     rdx, rdi
>         setae   al
>         neg     rsi
>         and     rdx, rsi
>         ret

The conditional move instruction still seems worse as it breaks data
dependencies in the cpu pipeline, right?

Alice

* Re: [PATCH v7 16/31] rust: ptr: add const_align_up()
  2026-03-20 11:12                 ` Alice Ryhl
@ 2026-03-20 13:14                   ` David Rheinsberg
  2026-03-20 13:16                     ` Miguel Ojeda
  0 siblings, 1 reply; 66+ messages in thread
From: David Rheinsberg @ 2026-03-20 13:14 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: John Hubbard, Danilo Krummrich, Alexandre Courbot, Joel Fernandes,
	Timur Tabi, Alistair Popple, Eliot Courtney, Shashank Sharma,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Trevor Gross, rust-for-linux, LKML

Hi Alice!

On Fri, Mar 20, 2026, at 12:12 PM, Alice Ryhl wrote:
> On Fri, Mar 20, 2026 at 11:27 AM David Rheinsberg <david@readahead.eu> wrote:
>> align_up1:
>>         lea     rax, [rsi - 1]
>>         and     rax, rdi
>>         sub     rsi, rax
>>         lea     rdx, [rsi + rdi]
>>         cmp     rdx, rdi
>>         setae   cl
>>         test    rax, rax
>>         sete    al
>>         cmove   rdx, rdi
>>         or      al, cl
>>         movzx   eax, al
>>         ret
>>
>> align_up2:
>>         lea     rdx, [rdi + rsi]
>>         dec     rdx
>>         xor     eax, eax
>>         cmp     rdx, rdi
>>         setae   al
>>         neg     rsi
>>         and     rdx, rsi
>>         ret
>
> The conditional move instruction still seems worse as it breaks data
> dependencies in the cpu pipeline, right?

Oh yeah, it is definitely worse, but I think on an acceptable level. I would argue the cmove is a non-issue, because it is a reg<->reg move, but... meh, not sure I wanna dig deeper.

David

* Re: [PATCH v7 16/31] rust: ptr: add const_align_up()
  2026-03-20 13:14                   ` David Rheinsberg
@ 2026-03-20 13:16                     ` Miguel Ojeda
  2026-03-20 13:26                       ` Alice Ryhl
  0 siblings, 1 reply; 66+ messages in thread
From: Miguel Ojeda @ 2026-03-20 13:16 UTC (permalink / raw)
  To: David Rheinsberg
  Cc: Alice Ryhl, John Hubbard, Danilo Krummrich, Alexandre Courbot,
	Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Trevor Gross, rust-for-linux, LKML

On Fri, Mar 20, 2026 at 2:15 PM David Rheinsberg <david@readahead.eu> wrote:
>
> Oh yeah, it is definitely worse, but I think on an acceptable level. I would argue the cmove is a non-issue, because it is a reg<->reg move, but... meh, not sure I wanna dig deeper.

If this is mostly meant to be used in `const` contexts like in this
patch series, then the assembly doesn't matter much to begin with.

In any case, the goal is to eventually use the `core` type, yeah, i.e.
we have `ptr_alignment_type` in our wishlist.

Cheers,
Miguel

* Re: [PATCH v7 16/31] rust: ptr: add const_align_up()
  2026-03-20 13:16                     ` Miguel Ojeda
@ 2026-03-20 13:26                       ` Alice Ryhl
  0 siblings, 0 replies; 66+ messages in thread
From: Alice Ryhl @ 2026-03-20 13:26 UTC (permalink / raw)
  To: Miguel Ojeda
  Cc: David Rheinsberg, John Hubbard, Danilo Krummrich,
	Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Trevor Gross, rust-for-linux, LKML

On Fri, Mar 20, 2026 at 2:16 PM Miguel Ojeda
<miguel.ojeda.sandonis@gmail.com> wrote:
>
> On Fri, Mar 20, 2026 at 2:15 PM David Rheinsberg <david@readahead.eu> wrote:
> >
> > Oh yeah, it is definitely worse, but I think on an acceptable level. I would argue the cmove is a non-issue, because it is a reg<->reg move, but... meh, not sure I wanna dig deeper.
>
> If this is mostly meant to be used in `const` contexts like in this
> patch series, then the assembly doesn't matter much to begin with.

We took away all the elements that make it suitable for const only
(panics), so I did make this suggestion:
https://lore.kernel.org/all/CAH5fLgjbZZaxRaAvodUM4xkKY4YOz=n-hcWG-JraLi_Acu7u8g@mail.gmail.com/

> In any case, the goal is to eventually use the `core` type, yeah, i.e.
> we have `ptr_alignment_type` in our wishlist.
>
> Cheers,
> Miguel

* Re: [PATCH v7 16/31] rust: ptr: add const_align_up()
  2026-03-17 22:53 ` [PATCH v7 16/31] rust: ptr: add const_align_up() John Hubbard
  2026-03-20  8:37   ` David Rheinsberg
@ 2026-03-20  9:48   ` Alice Ryhl
  2026-03-20 13:36     ` Gary Guo
  1 sibling, 1 reply; 66+ messages in thread
From: Alice Ryhl @ 2026-03-20  9:48 UTC (permalink / raw)
  To: John Hubbard
  Cc: Danilo Krummrich, Alexandre Courbot, Joel Fernandes, Timur Tabi,
	Alistair Popple, Eliot Courtney, Shashank Sharma, Zhi Wang,
	David Airlie, Simona Vetter, Bjorn Helgaas, Miguel Ojeda,
	Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Trevor Gross, rust-for-linux,
	LKML

On Tue, Mar 17, 2026 at 11:54 PM John Hubbard <jhubbard@nvidia.com> wrote:
>
> Add const_align_up() to kernel::ptr as the const-compatible equivalent
> of Alignable::align_up().
>
> Suggested-by: Danilo Krummrich <dakr@kernel.org>
> Suggested-by: Gary Guo <gary@garyguo.net>
> Suggested-by: Miguel Ojeda <ojeda@kernel.org>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>

I think it would be nice to implement the trait versions of this in
terms of this method, so we only have one implementation.

Alice

* Re: [PATCH v7 16/31] rust: ptr: add const_align_up()
  2026-03-20  9:48   ` Alice Ryhl
@ 2026-03-20 13:36     ` Gary Guo
  0 siblings, 0 replies; 66+ messages in thread
From: Gary Guo @ 2026-03-20 13:36 UTC (permalink / raw)
  To: Alice Ryhl, John Hubbard
  Cc: Danilo Krummrich, Alexandre Courbot, Joel Fernandes, Timur Tabi,
	Alistair Popple, Eliot Courtney, Shashank Sharma, Zhi Wang,
	David Airlie, Simona Vetter, Bjorn Helgaas, Miguel Ojeda,
	Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Trevor Gross, rust-for-linux,
	LKML

On Fri Mar 20, 2026 at 9:48 AM GMT, Alice Ryhl wrote:
> On Tue, Mar 17, 2026 at 11:54 PM John Hubbard <jhubbard@nvidia.com> wrote:
>>
>> Add const_align_up() to kernel::ptr as the const-compatible equivalent
>> of Alignable::align_up().
>>
>> Suggested-by: Danilo Krummrich <dakr@kernel.org>
>> Suggested-by: Gary Guo <gary@garyguo.net>
>> Suggested-by: Miguel Ojeda <ojeda@kernel.org>
>> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
>
> I think it would be nice to implement the trait versions of this in
> terms of this method, so we only have one implementation.
>
> Alice

The `align_up` is implemented for multiple integer types with macros, so I think
what we have right now is fine.

Best,
Gary

* [PATCH v7 17/31] gpu: nova-core: Hopper/Blackwell: calculate reserved FB heap size
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (15 preceding siblings ...)
  2026-03-17 22:53 ` [PATCH v7 16/31] rust: ptr: add const_align_up() John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-17 22:53 ` [PATCH v7 18/31] gpu: nova-core: add MCTP/NVDM protocol types for firmware communication John Hubbard
                   ` (14 subsequent siblings)
  31 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Various "reserved" areas of FB (frame buffer: vidmem) have to be
calculated, because the GSP booting process needs this information.

PMU_RESERVED_SIZE is computed at compile time using const_align_up().
The total reserved size is computed at runtime using Alignable::align_up
because it depends on the heap layout.
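
As a worked example of the compile-time part, here is a standalone sketch. The `SZ_*` constants are redefined locally, and `align_up` is a hypothetical stand-in built on `usize::checked_next_multiple_of()`, not the kernel's const_align_up():

```rust
// Local stand-ins for the kernel's size constants (assumed values).
pub const SZ_4K: usize = 4 << 10;
pub const SZ_128K: usize = 128 << 10;
pub const SZ_8M: usize = 8 << 20;
pub const SZ_16M: usize = 16 << 20;

/// Hypothetical const align-up helper, used here only for illustration.
pub const fn align_up(value: usize, align: usize) -> Option<usize> {
    value.checked_next_multiple_of(align)
}

/// Evaluated entirely at compile time. An overflow would fail the build
/// instead of producing a wrong value at runtime.
pub const PMU_RESERVED_SIZE: u32 = match align_up(SZ_8M + SZ_16M + SZ_4K, SZ_128K) {
    Some(v) => v as u32,
    None => panic!("PMU_RESERVED_SIZE: alignment overflow"),
};
```

8M + 16M + 4K is 0x1801000, which rounds up to the next 128K boundary at 0x1820000 (24M + 128K).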

Cc: Timur Tabi <ttabi@nvidia.com>
Cc: Gary Guo <gary@garyguo.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/fb.rs     | 8 ++++++++
 drivers/gpu/nova-core/gsp/fw.rs | 6 +++++-
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/nova-core/fb.rs b/drivers/gpu/nova-core/fb.rs
index 6536d0035cb1..ffb996b918f8 100644
--- a/drivers/gpu/nova-core/fb.rs
+++ b/drivers/gpu/nova-core/fb.rs
@@ -10,6 +10,7 @@
     fmt,
     prelude::*,
     ptr::{
+        const_align_up,
         Alignable,
         Alignment, //
     },
@@ -270,3 +271,10 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<
         })
     }
 }
+
+/// PMU reserved size, aligned to 128KB.
+pub(crate) const PMU_RESERVED_SIZE: u32 =
+    match const_align_up(SZ_8M + SZ_16M + SZ_4K, Alignment::new::<SZ_128K>()) {
+        Some(v) => v as u32,
+        None => panic!("PMU_RESERVED_SIZE: alignment overflow"),
+    };
diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index a061131b5412..92335e7fc34a 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -26,7 +26,10 @@
 };
 
 use crate::{
-    fb::FbLayout,
+    fb::{
+        FbLayout,
+        PMU_RESERVED_SIZE, //
+    },
     firmware::gsp::GspFirmware,
     gpu::Chipset,
     gsp::{
@@ -255,6 +258,7 @@ pub(crate) fn new(gsp_firmware: &GspFirmware, fb_layout: &FbLayout) -> Self {
             fbSize: fb_layout.fb.end - fb_layout.fb.start,
             vgaWorkspaceOffset: fb_layout.vga_workspace.start,
             vgaWorkspaceSize: fb_layout.vga_workspace.end - fb_layout.vga_workspace.start,
+            pmuReservedSize: PMU_RESERVED_SIZE,
             ..Default::default()
         })
     }
-- 
2.53.0


* [PATCH v7 18/31] gpu: nova-core: add MCTP/NVDM protocol types for firmware communication
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (16 preceding siblings ...)
  2026-03-17 22:53 ` [PATCH v7 17/31] gpu: nova-core: Hopper/Blackwell: calculate reserved FB heap size John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-18  0:01   ` John Hubbard
  2026-03-17 22:53 ` [PATCH v7 19/31] gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting John Hubbard
                   ` (13 subsequent siblings)
  31 siblings, 1 reply; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Add the MCTP (Management Component Transport Protocol) and NVDM (NVIDIA
Device Management) wire-format types used for communication between the
kernel driver and GPU firmware processors.

This includes typed MCTP transport headers, NVDM message headers, and
NVDM message type identifiers. Both the FSP boot path and the upcoming
GSP RPC message queue share this protocol layer.
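
To make the wire format concrete, the sketch below packs the same fields by hand, without the bitfield!() macro. The bit positions and constant values are taken from the definitions in this patch. The function names are hypothetical and exist only for this illustration:

```rust
// Values as defined in the patch.
pub const MSG_TYPE_VENDOR_PCI: u32 = 0x7e;
pub const VENDOR_ID_NV: u32 = 0x10de;
pub const NVDM_TYPE_COT: u32 = 0x14;

/// MCTP header for a single-packet message: SOM (bit 31) and EOM (bit 30)
/// set, SEQ (29:28) and SEID (23:16) left at zero.
pub const fn mctp_single_packet() -> u32 {
    (1u32 << 31) | (1u32 << 30)
}

/// NVDM header: type in 31:24, PCI vendor ID in 23:8, MCTP message type
/// in 6:0.
pub const fn nvdm_header(nvdm_type: u32) -> u32 {
    ((nvdm_type & 0xff) << 24) | ((VENDOR_ID_NV & 0xffff) << 8) | (MSG_TYPE_VENDOR_PCI & 0x7f)
}

/// Extract the raw NVDM type (bits 31:24) from a packed header.
pub const fn nvdm_type_of(raw: u32) -> u32 {
    raw >> 24
}
```

Under these definitions, a single-packet MCTP header is 0xc0000000 and the NVDM header for a Chain of Trust message is 0x1410de7e.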

Cc: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/mctp.rs      | 126 +++++++++++++++++++++++++++++
 drivers/gpu/nova-core/nova_core.rs |   1 +
 2 files changed, 127 insertions(+)
 create mode 100644 drivers/gpu/nova-core/mctp.rs

diff --git a/drivers/gpu/nova-core/mctp.rs b/drivers/gpu/nova-core/mctp.rs
new file mode 100644
index 000000000000..9e052d916e79
--- /dev/null
+++ b/drivers/gpu/nova-core/mctp.rs
@@ -0,0 +1,126 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! MCTP/NVDM protocol types for NVIDIA GPU firmware communication.
+//!
+//! MCTP (Management Component Transport Protocol) carries NVDM (NVIDIA
+//! Device Management) messages between the kernel driver and GPU firmware
+//! processors such as FSP and GSP.
+
+#![expect(dead_code)]
+
+/// NVDM message type identifiers carried over MCTP.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+#[repr(u8)]
+pub(crate) enum NvdmType {
+    /// Chain of Trust boot message.
+    Cot = 0x14,
+    /// FSP command response.
+    FspResponse = 0x15,
+}
+
+impl TryFrom<u8> for NvdmType {
+    type Error = u8;
+
+    fn try_from(value: u8) -> Result<Self, Self::Error> {
+        match value {
+            x if x == Self::Cot as u8 => Ok(Self::Cot),
+            x if x == Self::FspResponse as u8 => Ok(Self::FspResponse),
+            _ => Err(value),
+        }
+    }
+}
+
+impl From<NvdmType> for u8 {
+    fn from(value: NvdmType) -> Self {
+        value as u8
+    }
+}
+
+bitfield! {
+    pub(crate) struct MctpHeader(u32), "MCTP transport header for NVIDIA firmware messages." {
+        31:31 som as bool, "Start-of-message bit.";
+        30:30 eom as bool, "End-of-message bit.";
+        29:28 seq as u8, "Packet sequence number.";
+        23:16 seid as u8, "Source endpoint ID.";
+    }
+}
+
+impl MctpHeader {
+    /// Build a single-packet MCTP header (SOM=1, EOM=1, SEQ=0, SEID=0).
+    pub(crate) fn single_packet() -> Self {
+        Self::default().set_som(true).set_eom(true)
+    }
+
+    /// Return the raw packed u32.
+    pub(crate) const fn raw(self) -> u32 {
+        self.0
+    }
+
+    /// Check if this is a complete single-packet message (SOM=1 and EOM=1).
+    pub(crate) fn is_single_packet(self) -> bool {
+        self.som() && self.eom()
+    }
+}
+
+impl From<u32> for MctpHeader {
+    fn from(raw: u32) -> Self {
+        Self(raw)
+    }
+}
+
+/// MCTP message type for PCI vendor-defined messages.
+const MSG_TYPE_VENDOR_PCI: u8 = 0x7e;
+
+/// NVIDIA PCI vendor ID.
+const VENDOR_ID_NV: u16 = 0x10de;
+
+bitfield! {
+    pub(crate) struct NvdmHeader(u32), "NVIDIA Vendor-Defined Message header over MCTP." {
+        31:24 raw_nvdm_type as u8, "Raw NVDM message type.";
+        23:8 vendor_id as u16, "PCI vendor ID.";
+        6:0 msg_type as u8, "MCTP vendor-defined message type.";
+    }
+}
+
+impl NvdmHeader {
+    /// Build an NVDM header for the given message type.
+    pub(crate) fn new(nvdm_type: NvdmType) -> Self {
+        Self::default()
+            .set_msg_type(MSG_TYPE_VENDOR_PCI)
+            .set_vendor_id(VENDOR_ID_NV)
+            .set_nvdm_type(nvdm_type)
+    }
+
+    /// Return the raw packed u32.
+    pub(crate) const fn raw(self) -> u32 {
+        self.0
+    }
+
+    /// Extract the NVDM type field as a typed value.
+    pub(crate) fn nvdm_type(self) -> core::result::Result<NvdmType, u8> {
+        NvdmType::try_from(self.raw_nvdm_type())
+    }
+
+    /// Extract the NVDM type field as a raw value.
+    pub(crate) fn nvdm_type_raw(self) -> u32 {
+        u32::from(self.raw_nvdm_type())
+    }
+
+    /// Set the NVDM type field from a typed value.
+    pub(crate) fn set_nvdm_type(self, nvdm_type: NvdmType) -> Self {
+        self.set_raw_nvdm_type(u8::from(nvdm_type))
+    }
+
+    /// Validate this header against the expected NVIDIA NVDM format and type.
+    pub(crate) fn validate(self, expected_type: NvdmType) -> bool {
+        self.msg_type() == MSG_TYPE_VENDOR_PCI
+            && self.vendor_id() == VENDOR_ID_NV
+            && matches!(self.nvdm_type(), Ok(nvdm_type) if nvdm_type == expected_type)
+    }
+}
+
+impl From<u32> for NvdmHeader {
+    fn from(raw: u32) -> Self {
+        Self(raw)
+    }
+}
diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs
index b5caf1044697..3bd9b1dd0264 100644
--- a/drivers/gpu/nova-core/nova_core.rs
+++ b/drivers/gpu/nova-core/nova_core.rs
@@ -13,6 +13,7 @@
 mod gfw;
 mod gpu;
 mod gsp;
+mod mctp;
 mod num;
 mod regs;
 mod sbuffer;
-- 
2.53.0


* Re: [PATCH v7 18/31] gpu: nova-core: add MCTP/NVDM protocol types for firmware communication
  2026-03-17 22:53 ` [PATCH v7 18/31] gpu: nova-core: add MCTP/NVDM protocol types for firmware communication John Hubbard
@ 2026-03-18  0:01   ` John Hubbard
  2026-03-18  0:21     ` Danilo Krummrich
  0 siblings, 1 reply; 66+ messages in thread
From: John Hubbard @ 2026-03-18  0:01 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML

On 3/17/26 3:53 PM, John Hubbard wrote:
...
> +bitfield! {
> +    pub(crate) struct MctpHeader(u32), "MCTP transport header for NVIDIA firmware messages." {
> +        31:31 som as bool, "Start-of-message bit.";
> +        30:30 eom as bool, "End-of-message bit.";
> +        29:28 seq as u8, "Packet sequence number.";
> +        23:16 seid as u8, "Source endpoint ID.";

hmmm, I seem to remember my very slightly younger self insisting
that fields be listed from lowest to highest bits. And now I've
violated that in both headers in this patch. arghh

I'll fix it if there is a v8 required.

thanks,
-- 
John Hubbard

* Re: [PATCH v7 18/31] gpu: nova-core: add MCTP/NVDM protocol types for firmware communication
  2026-03-18  0:01   ` John Hubbard
@ 2026-03-18  0:21     ` Danilo Krummrich
  2026-03-18  0:56       ` Alexandre Courbot
  2026-03-18 12:36       ` Gary Guo
  0 siblings, 2 replies; 66+ messages in thread
From: Danilo Krummrich @ 2026-03-18  0:21 UTC (permalink / raw)
  To: John Hubbard
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On Wed Mar 18, 2026 at 1:01 AM CET, John Hubbard wrote:
> On 3/17/26 3:53 PM, John Hubbard wrote:
> ...
>> +bitfield! {
>> +    pub(crate) struct MctpHeader(u32), "MCTP transport header for NVIDIA firmware messages." {
>> +        31:31 som as bool, "Start-of-message bit.";
>> +        30:30 eom as bool, "End-of-message bit.";
>> +        29:28 seq as u8, "Packet sequence number.";
>> +        23:16 seid as u8, "Source endpoint ID.";
>
> hmmm, I seem to remember my very slightly younger self insisting
> that fields be listed from lowest to highest bits. And now I've
> violated that in both headers in this patch. arghh

My now slightly older self still thinks that what you have above is actually the
way to go. So, I think your current self intuitively did the right thing. :P

It should be either

	31:16
	15:0

or it should be

	0:15
	16:31

with a strong preference for the former, but this

	15:0
	31:16

still looks pretty odd to me.

* Re: [PATCH v7 18/31] gpu: nova-core: add MCTP/NVDM protocol types for firmware communication
  2026-03-18  0:21     ` Danilo Krummrich
@ 2026-03-18  0:56       ` Alexandre Courbot
  2026-03-18 12:36       ` Gary Guo
  1 sibling, 0 replies; 66+ messages in thread
From: Alexandre Courbot @ 2026-03-18  0:56 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: John Hubbard, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On Wed Mar 18, 2026 at 9:21 AM JST, Danilo Krummrich wrote:
> On Wed Mar 18, 2026 at 1:01 AM CET, John Hubbard wrote:
>> On 3/17/26 3:53 PM, John Hubbard wrote:
>> ...
>>> +bitfield! {
>>> +    pub(crate) struct MctpHeader(u32), "MCTP transport header for NVIDIA firmware messages." {
>>> +        31:31 som as bool, "Start-of-message bit.";
>>> +        30:30 eom as bool, "End-of-message bit.";
>>> +        29:28 seq as u8, "Packet sequence number.";
>>> +        23:16 seid as u8, "Source endpoint ID.";
>>
>> hmmm, I seem to remember my very slightly younger self insisting
>> that fields be listed from lowest to highest bits. And now I've
>> violated that in both headers in this patch. arghh
>
> My now slightly older self still thinks that what you have above is actually the
> way to go. So, I think your current self intuitively did the right thing. :P
>
> It should be either
>
> 	31:16
> 	15:0
>
> or it should be
>
> 	0:15
> 	16:31
>
> with a strong preference for the former, but this
>
> 	15:0
> 	31:16
>
> still looks pretty odd to me.

Mmm, that's the order `regs.rs` currently uses. I don't have any
particular problem with it tbh.

The second form (`0:15`) is going to be rejected by the macro - it
expects the high bit first.

* Re: [PATCH v7 18/31] gpu: nova-core: add MCTP/NVDM protocol types for firmware communication
  2026-03-18  0:21     ` Danilo Krummrich
  2026-03-18  0:56       ` Alexandre Courbot
@ 2026-03-18 12:36       ` Gary Guo
  2026-03-18 19:14         ` John Hubbard
  1 sibling, 1 reply; 66+ messages in thread
From: Gary Guo @ 2026-03-18 12:36 UTC (permalink / raw)
  To: Danilo Krummrich, John Hubbard
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On Wed Mar 18, 2026 at 12:21 AM GMT, Danilo Krummrich wrote:
> On Wed Mar 18, 2026 at 1:01 AM CET, John Hubbard wrote:
>> On 3/17/26 3:53 PM, John Hubbard wrote:
>> ...
>>> +bitfield! {
>>> +    pub(crate) struct MctpHeader(u32), "MCTP transport header for NVIDIA firmware messages." {
>>> +        31:31 som as bool, "Start-of-message bit.";
>>> +        30:30 eom as bool, "End-of-message bit.";
>>> +        29:28 seq as u8, "Packet sequence number.";
>>> +        23:16 seid as u8, "Source endpoint ID.";
>>
>> hmmm, I seem to remember my very slightly younger self insisting
>> that fields be listed from lowest to highest bits. And now I've
>> violated that in both headers in this patch. arghh
>
> My now slightly older self still thinks that what you have above is actually the
> way to go. So, I think your current self intuitively did the right thing. :P
>
> It should be either
>
> 	31:16
> 	15:0

This is my preferred form, as it is closer to what the hardware world uses and
the order is consistent with most data sheets.

Best,
Gary

>
> or it should be
>
> 	0:15
> 	16:31
>
> with a strong preference for the former, but this
>
> 	15:0
> 	31:16
>
> still looks pretty odd to me.




* Re: [PATCH v7 18/31] gpu: nova-core: add MCTP/NVDM protocol types for firmware communication
  2026-03-18 12:36       ` Gary Guo
@ 2026-03-18 19:14         ` John Hubbard
  0 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-18 19:14 UTC (permalink / raw)
  To: Gary Guo, Danilo Krummrich
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On 3/18/26 5:36 AM, Gary Guo wrote:
> On Wed Mar 18, 2026 at 12:21 AM GMT, Danilo Krummrich wrote:
>> On Wed Mar 18, 2026 at 1:01 AM CET, John Hubbard wrote:
>>> On 3/17/26 3:53 PM, John Hubbard wrote:
>>> ...
>>>> +bitfield! {
>>>> +    pub(crate) struct MctpHeader(u32), "MCTP transport header for NVIDIA firmware messages." {
>>>> +        31:31 som as bool, "Start-of-message bit.";
>>>> +        30:30 eom as bool, "End-of-message bit.";
>>>> +        29:28 seq as u8, "Packet sequence number.";
>>>> +        23:16 seid as u8, "Source endpoint ID.";
>>>
>>> hmmm, I seem to remember my very slightly younger self insisting
>>> that fields be listed from lowest to highest bits. And now I've
>>> violated that in both headers in this patch. arghh
>>
>> My now slightly older self still thinks that what you have above is actually the
>> way to go. So, I think your current self intuitively did the right thing. :P
>>
>> It should be either
>>
>> 	31:16
>> 	15:0
> 
> This is my preferred form as it is closer to what the hardware world uses, and
> the order is consistent with most data sheets.
> 
> Best,
> Gary
> 

I'll let this patch alone in v8, then, as the most important
thing for me is the bit order. I don't worry so much about the
vertical order in which bitfields are listed.
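
To illustrate why the listing order is only cosmetic, here is a minimal
sketch in plain userspace Rust. It does not use the kernel bitfield! macro;
the helpers bits() and decode_mctp() and the MctpFields struct are made up
for illustration. Only the four hi:lo ranges are taken from the quoted
MctpHeader definition. Each field depends on its bit range alone, so the
fields can be decoded in any order:

```rust
/// Extract bits `hi..=lo` (inclusive, hi >= lo) from a 32-bit word.
fn bits(word: u32, hi: u32, lo: u32) -> u32 {
    debug_assert!(hi >= lo && hi < 32);
    let width = hi - lo + 1;
    // A full-width mask needs special handling: `1 << 32` would overflow.
    let mask = if width == 32 { u32::MAX } else { (1u32 << width) - 1 };
    (word >> lo) & mask
}

/// Hypothetical plain-struct view of the four quoted MctpHeader fields.
#[derive(Debug, PartialEq)]
struct MctpFields {
    som: bool,
    eom: bool,
    seq: u8,
    seid: u8,
}

/// Decode the fields; the order of the lines below does not matter.
fn decode_mctp(word: u32) -> MctpFields {
    MctpFields {
        seid: bits(word, 23, 16) as u8,
        seq: bits(word, 29, 28) as u8,
        eom: bits(word, 30, 30) != 0,
        som: bits(word, 31, 31) != 0,
    }
}
```

The fields above are deliberately decoded from lowest to highest bit, the
reverse of the quoted listing, and the result is the same either way.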

Whew, I do hope these (along with my recent rustfmtcheck sins)
are the only remaining items. Getting close...

>>
>> or it should be
>>
>> 	0:15
>> 	16:31
>>
>> with a strong preference for the former, but this
>>
>> 	15:0
>> 	31:16
>>
>> still looks pretty odd to me.
> 

I will concede that the above is an acquired taste. :)

thanks,
-- 
John Hubbard


^ permalink raw reply	[flat|nested] 66+ messages in thread

* [PATCH v7 19/31] gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (17 preceding siblings ...)
  2026-03-17 22:53 ` [PATCH v7 18/31] gpu: nova-core: add MCTP/NVDM protocol types for firmware communication John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-17 22:53 ` [PATCH v7 20/31] gpu: nova-core: Hopper/Blackwell: add FMC signature extraction John Hubbard
                   ` (12 subsequent siblings)
  31 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Add the FSP (Firmware System Processor) module for Hopper/Blackwell GPUs.
These architectures use a simplified firmware boot sequence:

    FMC --> FSP --> GSP, with no SEC2 involvement.

This commit adds the ability to wait for FSP secure boot completion by
polling the I2CS thermal scratch register until FSP signals success.
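
The polling pattern can be sketched in plain userspace Rust as follows.
This is only an illustration under stated assumptions: poll_until() and
PollError are hypothetical names, std::thread::sleep stands in for the
kernel's delay, and the read closure is infallible here, unlike the one
passed to the kernel's read_poll_timeout() in the patch.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum PollError {
    TimedOut,
}

/// Call `read` repeatedly until `done` accepts the value or `timeout` expires.
/// Sleeps for `interval` between attempts.
fn poll_until<T>(
    mut read: impl FnMut() -> T,
    done: impl Fn(&T) -> bool,
    interval: Duration,
    timeout: Duration,
) -> Result<T, PollError> {
    let deadline = Instant::now() + timeout;
    loop {
        let value = read();
        if done(&value) {
            return Ok(value);
        }
        // The value was read before the deadline check, so a success that
        // arrives right at the deadline is still reported as success.
        if Instant::now() >= deadline {
            return Err(PollError::TimedOut);
        }
        sleep(interval);
    }
}
```

With a 10 ms interval and a 4000 ms timeout, as used in the patch, a loop
like this would read the status register at most a few hundred times.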

Cc: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/fsp.rs       | 141 +++++++++++++++++++++++++++++
 drivers/gpu/nova-core/nova_core.rs |   1 +
 drivers/gpu/nova-core/regs.rs      |  29 ++++++
 3 files changed, 171 insertions(+)
 create mode 100644 drivers/gpu/nova-core/fsp.rs

diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
new file mode 100644
index 000000000000..d464ad325881
--- /dev/null
+++ b/drivers/gpu/nova-core/fsp.rs
@@ -0,0 +1,141 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! FSP (Firmware System Processor) interface for Hopper/Blackwell GPUs.
+//!
+//! Hopper/Blackwell use a simplified firmware boot sequence: FMC --> FSP --> GSP.
+//! Unlike Turing/Ampere/Ada, there is NO SEC2 (Security Engine 2) usage.
+//! FSP handles secure boot directly using FMC firmware + Chain of Trust.
+
+use kernel::{
+    device,
+    io::poll::read_poll_timeout,
+    prelude::*,
+    time::Delta,
+    transmute::{
+        AsBytes,
+        FromBytes, //
+    },
+};
+
+use crate::regs;
+
+/// FSP secure boot completion timeout in milliseconds.
+const FSP_SECURE_BOOT_TIMEOUT_MS: i64 = 4000;
+
+/// GSP FMC initialization parameters.
+#[repr(C)]
+#[derive(Debug, Clone, Copy, Default)]
+struct GspFmcInitParams {
+    /// CC initialization "registry keys".
+    regkeys: u32,
+}
+
+// SAFETY: GspFmcInitParams is a simple C struct with only primitive types.
+unsafe impl AsBytes for GspFmcInitParams {}
+// SAFETY: All bit patterns are valid for the primitive fields.
+unsafe impl FromBytes for GspFmcInitParams {}
+
+/// GSP ACR (Access Controlled Regions) boot parameters.
+#[repr(C)]
+#[derive(Debug, Clone, Copy, Default)]
+struct GspAcrBootGspRmParams {
+    /// Physical memory aperture through which gspRmDescPa is accessed.
+    target: u32,
+    /// Size in bytes of the GSP-RM descriptor structure.
+    gsp_rm_desc_size: u32,
+    /// Physical offset in the target aperture of the GSP-RM descriptor structure.
+    gsp_rm_desc_offset: u64,
+    /// Physical offset in FB to set the start of the WPR containing GSP-RM.
+    wpr_carveout_offset: u64,
+    /// Size in bytes of the WPR containing GSP-RM.
+    wpr_carveout_size: u32,
+    /// Whether to boot GSP-RM or GSP-Proxy through ACR.
+    b_is_gsp_rm_boot: u32,
+}
+
+// SAFETY: GspAcrBootGspRmParams is a simple C struct with only primitive types.
+unsafe impl AsBytes for GspAcrBootGspRmParams {}
+// SAFETY: All bit patterns are valid for the primitive fields.
+unsafe impl FromBytes for GspAcrBootGspRmParams {}
+
+/// GSP RM boot parameters.
+#[repr(C)]
+#[derive(Debug, Clone, Copy, Default)]
+struct GspRmParams {
+    /// Physical memory aperture through which bootArgsOffset is accessed.
+    target: u32,
+    /// Physical offset in the memory aperture that will be passed to GSP-RM.
+    boot_args_offset: u64,
+}
+
+// SAFETY: GspRmParams is a simple C struct with only primitive types.
+unsafe impl AsBytes for GspRmParams {}
+// SAFETY: All bit patterns are valid for the primitive fields.
+unsafe impl FromBytes for GspRmParams {}
+
+/// GSP SPDM (Security Protocol and Data Model) parameters.
+#[repr(C)]
+#[derive(Debug, Clone, Copy, Default)]
+struct GspSpdmParams {
+    /// Physical memory aperture through which all addresses are accessed.
+    target: u32,
+    /// Physical offset in the memory aperture where SPDM payload buffer is stored.
+    payload_buffer_offset: u64,
+    /// Size of the above payload buffer.
+    payload_buffer_size: u32,
+}
+
+// SAFETY: GspSpdmParams is a simple C struct with only primitive types.
+unsafe impl AsBytes for GspSpdmParams {}
+// SAFETY: All bit patterns are valid for the primitive fields.
+unsafe impl FromBytes for GspSpdmParams {}
+
+/// Complete GSP FMC boot parameters passed to FSP.
+#[repr(C)]
+#[derive(Debug, Clone, Copy, Default)]
+pub(crate) struct GspFmcBootParams {
+    init_params: GspFmcInitParams,
+    boot_gsp_rm_params: GspAcrBootGspRmParams,
+    gsp_rm_params: GspRmParams,
+    gsp_spdm_params: GspSpdmParams,
+}
+
+// SAFETY: GspFmcBootParams is composed of C structs with only primitive types.
+unsafe impl AsBytes for GspFmcBootParams {}
+// SAFETY: All bit patterns are valid for the primitive fields.
+unsafe impl FromBytes for GspFmcBootParams {}
+
+/// FSP interface for Hopper/Blackwell GPUs.
+pub(crate) struct Fsp;
+
+impl Fsp {
+    /// Wait for FSP secure boot completion.
+    ///
+    /// Polls the thermal scratch register until FSP signals boot completion
+    /// or timeout occurs.
+    #[expect(dead_code)]
+    pub(crate) fn wait_secure_boot(
+        dev: &device::Device<device::Bound>,
+        bar: &crate::driver::Bar0,
+        arch: crate::gpu::Architecture,
+    ) -> Result {
+        debug_assert!(
+            regs::read_fsp_boot_complete_status(bar, arch).is_some(),
+            "wait_secure_boot called on non-FSP architecture"
+        );
+
+        let timeout = Delta::from_millis(FSP_SECURE_BOOT_TIMEOUT_MS);
+
+        read_poll_timeout(
+            || regs::read_fsp_boot_complete_status(bar, arch).ok_or(ENOTSUPP),
+            |&status| status == regs::FSP_BOOT_COMPLETE_SUCCESS,
+            Delta::from_millis(10),
+            timeout,
+        )
+        .map_err(|_| {
+            dev_err!(dev, "FSP secure boot completion timeout\n");
+            ETIMEDOUT
+        })
+        .map(|_| ())
+    }
+}
diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs
index 3bd9b1dd0264..bdbe7136f873 100644
--- a/drivers/gpu/nova-core/nova_core.rs
+++ b/drivers/gpu/nova-core/nova_core.rs
@@ -10,6 +10,7 @@
 mod falcon;
 mod fb;
 mod firmware;
+mod fsp;
 mod gfw;
 mod gpu;
 mod gsp;
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index 686556bb9f38..183915a3bb31 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -8,6 +8,7 @@
 pub(crate) mod macros;
 
 use kernel::{
+    io::Io,
     prelude::*,
     time, //
 };
@@ -491,6 +492,34 @@ pub(crate) fn reset_engine<E: FalconEngine>(bar: &Bar0) {
     31:0    address as u32;
 });
 
+// PTHERM registers
+
+// FSP secure boot completion status register used by FSP to signal boot completion.
+// This is the NV_THERM_I2CS_SCRATCH register.
+// Different architectures use different addresses:
+// - Hopper (GH100): 0x000200bc
+// - Blackwell (GB202): 0x00ad00bc
+pub(crate) fn fsp_thermal_scratch_reg_addr(arch: Architecture) -> Result<usize> {
+    match arch {
+        Architecture::Hopper => Ok(0x000200bc),
+        Architecture::Blackwell => Ok(0x00ad00bc),
+        _ => Err(kernel::error::code::ENOTSUPP),
+    }
+}
+
+/// FSP writes this value to indicate successful boot completion.
+pub(crate) const FSP_BOOT_COMPLETE_SUCCESS: u32 = 0xff;
+
+/// Read FSP boot completion status from the architecture-specific thermal scratch register.
+///
+/// Returns `None` if the architecture does not have an FSP.
+pub(crate) fn read_fsp_boot_complete_status(
+    bar: &crate::driver::Bar0,
+    arch: Architecture,
+) -> Option<u32> {
+    let addr = fsp_thermal_scratch_reg_addr(arch).ok()?;
+    Some(bar.read32(addr))
+}
 // The modules below provide registers that are not identical on all supported chips. They should
 // only be used in HAL modules.
 
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v7 20/31] gpu: nova-core: Hopper/Blackwell: add FMC signature extraction
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (18 preceding siblings ...)
  2026-03-17 22:53 ` [PATCH v7 19/31] gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-17 22:53 ` [PATCH v7 21/31] gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging John Hubbard
                   ` (11 subsequent siblings)
  31 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Add extract_fmc_signatures() which extracts SHA-384 hash, RSA public
key, and RSA signature from FMC ELF32 firmware sections. These are
needed for FSP Chain of Trust verification.
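
The size handling can be sketched in plain userspace Rust as below. The
names copy_exact(), copy_padded() and SizeError are hypothetical and not
part of the patch; the sketch only shows the two policies the patch
applies: the hash must match its buffer size exactly, while the public key
and signature may be shorter and are zero-padded.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum SizeError {
    WrongSize { got: usize, expected: usize },
    TooLarge { got: usize, max: usize },
}

/// Copy `src` into a fixed-size array, requiring an exact length match.
fn copy_exact<const N: usize>(src: &[u8]) -> Result<[u8; N], SizeError> {
    <[u8; N]>::try_from(src).map_err(|_| SizeError::WrongSize {
        got: src.len(),
        expected: N,
    })
}

/// Copy `src` into a fixed-size array, zero-padding the tail if `src` is short.
fn copy_padded<const N: usize>(src: &[u8]) -> Result<[u8; N], SizeError> {
    if src.len() > N {
        return Err(SizeError::TooLarge { got: src.len(), max: N });
    }
    let mut out = [0u8; N];
    // The range is in bounds because src.len() <= N was checked above.
    out[..src.len()].copy_from_slice(src);
    Ok(out)
}
```

For example, copy_exact::<48>() would model the SHA-384 hash check, and
copy_padded::<384>() the RSA-3072 public key and signature checks.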

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs |  3 +-
 drivers/gpu/nova-core/fsp.rs      | 79 +++++++++++++++++++++++++++++++
 2 files changed, 81 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index bc26807116e4..6d07715b3a49 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -26,6 +26,7 @@
     },
 };
 
+pub(crate) use elf::elf_section;
 pub(crate) mod booter;
 pub(crate) mod fsp;
 pub(crate) mod fwsec;
@@ -646,7 +647,7 @@ fn elf32_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
     }
 
     /// Automatically detects ELF32 vs ELF64 based on the ELF header.
-    pub(super) fn elf_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
+    pub(crate) fn elf_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
         // Check ELF magic.
         if elf.len() < 5 || elf.get(0..4)? != b"\x7fELF" {
             return None;
diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
index d464ad325881..a13d883373f0 100644
--- a/drivers/gpu/nova-core/fsp.rs
+++ b/drivers/gpu/nova-core/fsp.rs
@@ -105,6 +105,18 @@ unsafe impl AsBytes for GspFmcBootParams {}
 // SAFETY: All bit patterns are valid for the primitive fields.
 unsafe impl FromBytes for GspFmcBootParams {}
 
+/// Size constraints for FSP security signatures (Hopper/Blackwell).
+const FSP_HASH_SIZE: usize = 48; // SHA-384 hash
+const FSP_PKEY_SIZE: usize = 384; // RSA-3072 public key
+const FSP_SIG_SIZE: usize = 384; // RSA-3072 signature
+
+/// Structure to hold FMC signatures.
+#[derive(Debug, Clone, Copy)]
+pub(crate) struct FmcSignatures {
+    hash384: [u8; FSP_HASH_SIZE],
+    public_key: [u8; FSP_PKEY_SIZE],
+    signature: [u8; FSP_SIG_SIZE],
+}
 /// FSP interface for Hopper/Blackwell GPUs.
 pub(crate) struct Fsp;
 
@@ -138,4 +150,71 @@ pub(crate) fn wait_secure_boot(
         })
         .map(|_| ())
     }
+
+    /// Extract FMC firmware signatures for Chain of Trust verification.
+    ///
+    /// Extracts real cryptographic signatures from FMC ELF32 firmware sections.
+    /// Returns signatures in a heap-allocated structure to prevent stack overflow.
+    #[expect(dead_code)]
+    pub(crate) fn extract_fmc_signatures(
+        dev: &device::Device<device::Bound>,
+        fmc_fw_data: &[u8],
+    ) -> Result<KBox<FmcSignatures>> {
+        let hash_section = crate::firmware::elf_section(fmc_fw_data, "hash")
+            .ok_or(EINVAL)
+            .inspect_err(|_| dev_err!(dev, "FMC firmware missing 'hash' section\n"))?;
+
+        let pkey_section = crate::firmware::elf_section(fmc_fw_data, "publickey")
+            .ok_or(EINVAL)
+            .inspect_err(|_| dev_err!(dev, "FMC firmware missing 'publickey' section\n"))?;
+
+        let sig_section = crate::firmware::elf_section(fmc_fw_data, "signature")
+            .ok_or(EINVAL)
+            .inspect_err(|_| dev_err!(dev, "FMC firmware missing 'signature' section\n"))?;
+
+        if hash_section.len() != FSP_HASH_SIZE {
+            dev_err!(
+                dev,
+                "FMC hash section size {} != expected {}\n",
+                hash_section.len(),
+                FSP_HASH_SIZE
+            );
+            return Err(EINVAL);
+        }
+
+        if pkey_section.len() > FSP_PKEY_SIZE {
+            dev_err!(
+                dev,
+                "FMC publickey section size {} > maximum {}\n",
+                pkey_section.len(),
+                FSP_PKEY_SIZE
+            );
+            return Err(EINVAL);
+        }
+
+        if sig_section.len() > FSP_SIG_SIZE {
+            dev_err!(
+                dev,
+                "FMC signature section size {} > maximum {}\n",
+                sig_section.len(),
+                FSP_SIG_SIZE
+            );
+            return Err(EINVAL);
+        }
+
+        let mut signatures = KBox::new(
+            FmcSignatures {
+                hash384: [0u8; FSP_HASH_SIZE],
+                public_key: [0u8; FSP_PKEY_SIZE],
+                signature: [0u8; FSP_SIG_SIZE],
+            },
+            GFP_KERNEL,
+        )?;
+
+        signatures.hash384.copy_from_slice(hash_section);
+        signatures.public_key[..pkey_section.len()].copy_from_slice(pkey_section);
+        signatures.signature[..sig_section.len()].copy_from_slice(sig_section);
+
+        Ok(signatures)
+    }
 }
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v7 21/31] gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (19 preceding siblings ...)
  2026-03-17 22:53 ` [PATCH v7 20/31] gpu: nova-core: Hopper/Blackwell: add FMC signature extraction John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-17 22:53 ` [PATCH v7 22/31] gpu: nova-core: Hopper/Blackwell: add FspCotVersion type John Hubbard
                   ` (10 subsequent siblings)
  31 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Add send_sync_fsp() which sends an MCTP/NVDM message to FSP and waits
for the response. Response validation uses the typed MctpHeader and
NvdmHeader wrappers from the previous commit.

A MessageToFsp trait provides the NVDM type constant for each message
struct, so send_sync_fsp() can verify that the response matches the
request.
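
A rough userspace sketch of the payload part of that validation follows.
It assumes the 20-byte packed layout of FspResponse from this patch (two
u32 headers, then task_id, command_nvdm_type and error_code) and assumes
little-endian encoding. check_reply(), le_u32() and ReplyError are
hypothetical names. The MCTP and NVDM header checks are left out, because
they depend on the typed wrappers from the previous commit.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum ReplyError {
    TooShort(usize),
    WrongCommand { expected: u32, got: u32 },
    Failed(u32),
}

/// Read a little-endian u32 at `offset`, or None if out of bounds.
fn le_u32(buf: &[u8], offset: usize) -> Option<u32> {
    let bytes: [u8; 4] = buf.get(offset..offset + 4)?.try_into().ok()?;
    Some(u32::from_le_bytes(bytes))
}

/// Check that a reply echoes the expected command type and reports no error.
fn check_reply(buf: &[u8], expected_type: u32) -> Result<(), ReplyError> {
    // mctp_header(4) + nvdm_header(4) + task_id(4) + type(4) + error(4).
    const RESPONSE_SIZE: usize = 20;
    if buf.len() < RESPONSE_SIZE {
        return Err(ReplyError::TooShort(buf.len()));
    }
    let got = le_u32(buf, 12).ok_or(ReplyError::TooShort(buf.len()))?;
    let error_code = le_u32(buf, 16).ok_or(ReplyError::TooShort(buf.len()))?;
    if got != expected_type {
        return Err(ReplyError::WrongCommand { expected: expected_type, got });
    }
    if error_code != 0 {
        return Err(ReplyError::Failed(error_code));
    }
    Ok(())
}
```

In the patch itself, expected_type corresponds to M::NVDM_TYPE from the
MessageToFsp trait, which is how the reply is tied back to the request.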

Cc: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/falcon/fsp.rs |   3 -
 drivers/gpu/nova-core/fsp.rs        | 123 ++++++++++++++++++++++++++++
 2 files changed, 123 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/nova-core/falcon/fsp.rs b/drivers/gpu/nova-core/falcon/fsp.rs
index faf923246ae9..1dcfd155b99c 100644
--- a/drivers/gpu/nova-core/falcon/fsp.rs
+++ b/drivers/gpu/nova-core/falcon/fsp.rs
@@ -150,7 +150,6 @@ pub(crate) fn read_emem(&self, bar: &Bar0, offset: u32, data: &mut [u8]) -> Resu
     ///
     /// The FSP message queue is not circular - pointers are reset to 0 after each
     /// message exchange, so `tail >= head` is always true when data is present.
-    #[expect(unused)]
     pub(crate) fn poll_msgq(&self, bar: &Bar0) -> u32 {
         let head = regs::NV_PFSP_MSGQ_HEAD::read(bar).address();
         let tail = regs::NV_PFSP_MSGQ_TAIL::read(bar).address();
@@ -173,7 +172,6 @@ pub(crate) fn poll_msgq(&self, bar: &Bar0) -> u32 {
     ///
     /// # Returns
     /// `Ok(())` on success, `Err(EINVAL)` if packet is empty or not 4-byte aligned
-    #[expect(unused)]
     pub(crate) fn send_msg(&self, bar: &Bar0, packet: &[u8]) -> Result {
         if packet.is_empty() {
             return Err(EINVAL);
@@ -205,7 +203,6 @@ pub(crate) fn send_msg(&self, bar: &Bar0, packet: &[u8]) -> Result {
     ///
     /// # Returns
     /// `Ok(bytes_read)` on success, `Err(EINVAL)` if size is 0, exceeds buffer, or not aligned
-    #[expect(unused)]
     pub(crate) fn recv_msg(&self, bar: &Bar0, buffer: &mut [u8], size: usize) -> Result<usize> {
         if size == 0 || size > buffer.len() {
             return Err(EINVAL);
diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
index a13d883373f0..4fb932f91da2 100644
--- a/drivers/gpu/nova-core/fsp.rs
+++ b/drivers/gpu/nova-core/fsp.rs
@@ -19,6 +19,15 @@
 
 use crate::regs;
 
+use crate::mctp::{
+    MctpHeader,
+    NvdmHeader,
+    NvdmType, //
+};
+
+/// FSP message timeout in milliseconds.
+const FSP_MSG_TIMEOUT_MS: i64 = 2000;
+
 /// FSP secure boot completion timeout in milliseconds.
 const FSP_SECURE_BOOT_TIMEOUT_MS: i64 = 4000;
 
@@ -117,6 +126,37 @@ pub(crate) struct FmcSignatures {
     public_key: [u8; FSP_PKEY_SIZE],
     signature: [u8; FSP_SIG_SIZE],
 }
+
+/// FSP command response payload.
+/// Corresponds to the NVDM_PAYLOAD_COMMAND_RESPONSE structure.
+#[repr(C, packed)]
+#[derive(Clone, Copy)]
+struct NvdmPayloadCommandResponse {
+    task_id: u32,
+    command_nvdm_type: u32,
+    error_code: u32,
+}
+
+/// Complete FSP response structure with MCTP and NVDM headers.
+#[repr(C, packed)]
+#[derive(Clone, Copy)]
+struct FspResponse {
+    mctp_header: u32,
+    nvdm_header: u32,
+    response: NvdmPayloadCommandResponse,
+}
+
+// SAFETY: FspResponse is a packed C struct with only integral fields.
+unsafe impl FromBytes for FspResponse {}
+
+/// Trait implemented by types representing a message to send to FSP.
+///
+/// This provides [`Fsp::send_sync_fsp`] with the information it needs to send
+/// a given message, following the same pattern as GSP's `CommandToGsp`.
+pub(crate) trait MessageToFsp: AsBytes {
+    /// NVDM type identifying this message to FSP.
+    const NVDM_TYPE: u32;
+}
 /// FSP interface for Hopper/Blackwell GPUs.
 pub(crate) struct Fsp;
 
@@ -217,4 +257,87 @@ pub(crate) fn extract_fmc_signatures(
 
         Ok(signatures)
     }
+
+    /// Send message to FSP and wait for response.
+    #[expect(dead_code)]
+    fn send_sync_fsp<M>(
+        dev: &device::Device<device::Bound>,
+        bar: &crate::driver::Bar0,
+        fsp_falcon: &crate::falcon::Falcon<crate::falcon::fsp::Fsp>,
+        msg: &M,
+    ) -> Result
+    where
+        M: MessageToFsp,
+    {
+        fsp_falcon.send_msg(bar, msg.as_bytes())?;
+
+        let timeout = Delta::from_millis(FSP_MSG_TIMEOUT_MS);
+        let packet_size = read_poll_timeout(
+            || Ok(fsp_falcon.poll_msgq(bar)),
+            |&size| size > 0,
+            Delta::from_millis(10),
+            timeout,
+        )
+        .map_err(|_| {
+            dev_err!(dev, "FSP response timeout\n");
+            ETIMEDOUT
+        })?;
+
+        let packet_size = packet_size as usize;
+        let mut response_buf = KVec::<u8>::new();
+        response_buf.resize(packet_size, 0, GFP_KERNEL)?;
+        fsp_falcon.recv_msg(bar, &mut response_buf, packet_size)?;
+
+        if response_buf.len() < core::mem::size_of::<FspResponse>() {
+            dev_err!(dev, "FSP response too small: {}\n", response_buf.len());
+            return Err(EIO);
+        }
+
+        let response = FspResponse::from_bytes(&response_buf[..]).ok_or(EIO)?;
+
+        let mctp_header: MctpHeader = response.mctp_header.into();
+        let nvdm_header: NvdmHeader = response.nvdm_header.into();
+        let command_nvdm_type = response.response.command_nvdm_type;
+        let error_code = response.response.error_code;
+
+        if !mctp_header.is_single_packet() {
+            dev_err!(
+                dev,
+                "Unexpected MCTP header in FSP reply: {:#x}\n",
+                mctp_header.raw()
+            );
+            return Err(EIO);
+        }
+
+        if !nvdm_header.validate(NvdmType::FspResponse) {
+            dev_err!(
+                dev,
+                "Unexpected NVDM header in FSP reply: {:#x}\n",
+                nvdm_header.raw()
+            );
+            return Err(EIO);
+        }
+
+        if command_nvdm_type != M::NVDM_TYPE {
+            dev_err!(
+                dev,
+                "Expected NVDM type {:#x} in reply, got {:#x}\n",
+                M::NVDM_TYPE,
+                command_nvdm_type
+            );
+            return Err(EIO);
+        }
+
+        if error_code != 0 {
+            dev_err!(
+                dev,
+                "NVDM command {:#x} failed with error {:#x}\n",
+                M::NVDM_TYPE,
+                error_code
+            );
+            return Err(EIO);
+        }
+
+        Ok(())
+    }
 }
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v7 22/31] gpu: nova-core: Hopper/Blackwell: add FspCotVersion type
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (20 preceding siblings ...)
  2026-03-17 22:53 ` [PATCH v7 21/31] gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-17 22:53 ` [PATCH v7 23/31] gpu: nova-core: Hopper/Blackwell: larger non-WPR heap John Hubbard
                   ` (9 subsequent siblings)
  31 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Add FspCotVersion to represent the FSP Chain of Trust protocol version,
and Chipset::fsp_cot_version() which returns the version for each
architecture. Hopper uses version 1, Blackwell uses version 2.
Non-FSP architectures return None.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/fsp.rs | 19 +++++++++++++++++++
 drivers/gpu/nova-core/gpu.rs | 14 ++++++++++++++
 2 files changed, 33 insertions(+)

diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
index 4fb932f91da2..18edf7a1a8e4 100644
--- a/drivers/gpu/nova-core/fsp.rs
+++ b/drivers/gpu/nova-core/fsp.rs
@@ -25,6 +25,25 @@
     NvdmType, //
 };
 
+/// FSP Chain of Trust protocol version.
+///
+/// Hopper (GH100) uses version 1, Blackwell uses version 2.
+#[derive(Debug, Clone, Copy)]
+pub(crate) struct FspCotVersion(u16);
+
+impl FspCotVersion {
+    /// Create a new FSP COT version.
+    pub(crate) const fn new(version: u16) -> Self {
+        Self(version)
+    }
+
+    /// Return the raw protocol version number for the wire format.
+    #[expect(dead_code)]
+    pub(crate) const fn raw(self) -> u16 {
+        self.0
+    }
+}
+
 /// FSP message timeout in milliseconds.
 const FSP_MSG_TIMEOUT_MS: i64 = 2000;
 
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 93f861ba20f3..1d25513fef20 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -18,6 +18,7 @@
         Falcon, //
     },
     fb::SysmemFlush,
+    fsp::FspCotVersion,
     gfw,
     gsp::Gsp,
     regs,
@@ -133,6 +134,19 @@ pub(crate) const fn arch(&self) -> Architecture {
     pub(crate) const fn needs_fwsec_bootloader(self) -> bool {
         matches!(self.arch(), Architecture::Turing) || matches!(self, Self::GA100)
     }
+
+    /// Returns the FSP Chain of Trust (COT) protocol version for this chipset.
+    ///
+    /// Hopper (GH100) uses version 1, Blackwell uses version 2.
+    /// Returns `None` for architectures that do not use FSP.
+    #[expect(dead_code)]
+    pub(crate) const fn fsp_cot_version(&self) -> Option<FspCotVersion> {
+        match self.arch() {
+            Architecture::Hopper => Some(FspCotVersion::new(1)),
+            Architecture::Blackwell => Some(FspCotVersion::new(2)),
+            _ => None,
+        }
+    }
 }
 
 // TODO
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v7 23/31] gpu: nova-core: Hopper/Blackwell: larger non-WPR heap
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (21 preceding siblings ...)
  2026-03-17 22:53 ` [PATCH v7 22/31] gpu: nova-core: Hopper/Blackwell: add FspCotVersion type John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-17 22:53 ` [PATCH v7 24/31] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot John Hubbard
                   ` (8 subsequent siblings)
  31 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Add dedicated FB HALs for Hopper (GH100) and Blackwell (GB100) with
architecture-specific non-WPR heap sizes. Hopper uses 2 MiB, Blackwell
uses 2 MiB + 128 KiB. These are needed for the larger reserved memory
regions that Hopper/Blackwell GPUs require.

Also add the non_wpr_heap_size() method to the FbHal trait, and
the total_reserved_size field to FbLayout.
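
The size selection can be sketched in plain userspace Rust as below. The
Arch enum and both functions are simplified stand-ins for the driver's
Architecture type and FbHal trait, not the actual code. The 1 MiB
fallback is the value the patch keeps for the other architectures.

```rust
const SZ_128K: u32 = 128 << 10;
const SZ_1M: u32 = 1 << 20;

/// Simplified stand-in for the driver's architecture enum.
#[derive(Debug, Clone, Copy)]
enum Arch {
    Turing,
    Ampere,
    Ada,
    Hopper,
    Blackwell,
}

/// Per-architecture override; `None` means "use the default heap size".
fn non_wpr_heap_size(arch: Arch) -> Option<u32> {
    match arch {
        Arch::Hopper => Some(2 * SZ_1M),
        Arch::Blackwell => Some(2 * SZ_1M + SZ_128K),
        Arch::Turing | Arch::Ampere | Arch::Ada => None,
    }
}

/// Resolve the heap size in bytes, falling back to 1 MiB.
fn calc_non_wpr_heap_size(arch: Arch) -> u64 {
    non_wpr_heap_size(arch)
        .map(u64::from)
        .unwrap_or(u64::from(SZ_1M))
}
```

Note that 2 MiB + 128 KiB is 0x200000 + 0x20000 = 0x220000, which matches
the constant used in the GB100 HAL.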

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/fb.rs           | 16 ++++++++---
 drivers/gpu/nova-core/fb/hal.rs       | 16 ++++++++---
 drivers/gpu/nova-core/fb/hal/ga102.rs |  2 +-
 drivers/gpu/nova-core/fb/hal/gb100.rs | 38 +++++++++++++++++++++++++++
 drivers/gpu/nova-core/fb/hal/gh100.rs | 38 +++++++++++++++++++++++++++
 5 files changed, 102 insertions(+), 8 deletions(-)
 create mode 100644 drivers/gpu/nova-core/fb/hal/gb100.rs
 create mode 100644 drivers/gpu/nova-core/fb/hal/gh100.rs

diff --git a/drivers/gpu/nova-core/fb.rs b/drivers/gpu/nova-core/fb.rs
index ffb996b918f8..c12705f5f742 100644
--- a/drivers/gpu/nova-core/fb.rs
+++ b/drivers/gpu/nova-core/fb.rs
@@ -31,7 +31,7 @@
     regs,
 };
 
-mod hal;
+pub(crate) mod hal;
 
 /// Type holding the sysmem flush memory page, a page of memory to be written into the
 /// `NV_PFB_NISO_FLUSH_SYSMEM_ADDR*` registers and used to maintain memory coherency.
@@ -99,6 +99,15 @@ pub(crate) fn unregister(&self, bar: &Bar0) {
     }
 }
 
+/// Calculate non-WPR heap size based on chipset architecture.
+/// This matches the logic used in FSP for consistency.
+pub(crate) fn calc_non_wpr_heap_size(chipset: Chipset) -> u64 {
+    hal::fb_hal(chipset)
+        .non_wpr_heap_size()
+        .map(u64::from)
+        .unwrap_or(usize_as_u64(SZ_1M))
+}
+
 pub(crate) struct FbRange(Range<u64>);
 
 impl FbRange {
@@ -253,9 +262,8 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<
         };
 
         let heap = {
-            const HEAP_SIZE: u64 = usize_as_u64(SZ_1M);
-
-            FbRange(wpr2.start - HEAP_SIZE..wpr2.start)
+            let heap_size = calc_non_wpr_heap_size(chipset);
+            FbRange(wpr2.start - heap_size..wpr2.start)
         };
 
         Ok(Self {
diff --git a/drivers/gpu/nova-core/fb/hal.rs b/drivers/gpu/nova-core/fb/hal.rs
index d33ca0f96417..ebd12247f771 100644
--- a/drivers/gpu/nova-core/fb/hal.rs
+++ b/drivers/gpu/nova-core/fb/hal.rs
@@ -12,6 +12,8 @@
 
 mod ga100;
 mod ga102;
+mod gb100;
+mod gh100;
 mod tu102;
 
 pub(crate) trait FbHal {
@@ -28,14 +30,22 @@ pub(crate) trait FbHal {
 
     /// Returns the VRAM size, in bytes.
     fn vidmem_size(&self, bar: &Bar0) -> u64;
+
+    /// Returns the non-WPR heap size for GPUs that need large reserved memory.
+    ///
+    /// Returns `None` for GPUs that don't need extra reserved memory.
+    fn non_wpr_heap_size(&self) -> Option<u32> {
+        None
+    }
 }
 
 /// Returns the HAL corresponding to `chipset`.
-pub(super) fn fb_hal(chipset: Chipset) -> &'static dyn FbHal {
+pub(crate) fn fb_hal(chipset: Chipset) -> &'static dyn FbHal {
     match chipset.arch() {
         Architecture::Turing => tu102::TU102_HAL,
         Architecture::Ampere if chipset == Chipset::GA100 => ga100::GA100_HAL,
-        Architecture::Ampere => ga102::GA102_HAL,
-        Architecture::Ada | Architecture::Hopper | Architecture::Blackwell => ga102::GA102_HAL,
+        Architecture::Ampere | Architecture::Ada => ga102::GA102_HAL,
+        Architecture::Hopper => gh100::GH100_HAL,
+        Architecture::Blackwell => gb100::GB100_HAL,
     }
 }
diff --git a/drivers/gpu/nova-core/fb/hal/ga102.rs b/drivers/gpu/nova-core/fb/hal/ga102.rs
index 734605905031..f8d8f01e3c5d 100644
--- a/drivers/gpu/nova-core/fb/hal/ga102.rs
+++ b/drivers/gpu/nova-core/fb/hal/ga102.rs
@@ -8,7 +8,7 @@
     regs, //
 };
 
-fn vidmem_size_ga102(bar: &Bar0) -> u64 {
+pub(super) fn vidmem_size_ga102(bar: &Bar0) -> u64 {
     regs::NV_USABLE_FB_SIZE_IN_MB::read(bar).usable_fb_size()
 }
 
diff --git a/drivers/gpu/nova-core/fb/hal/gb100.rs b/drivers/gpu/nova-core/fb/hal/gb100.rs
new file mode 100644
index 000000000000..bead99a6ca76
--- /dev/null
+++ b/drivers/gpu/nova-core/fb/hal/gb100.rs
@@ -0,0 +1,38 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use kernel::prelude::*;
+
+use crate::{
+    driver::Bar0,
+    fb::hal::FbHal, //
+};
+
+struct Gb100;
+
+impl FbHal for Gb100 {
+    fn read_sysmem_flush_page(&self, bar: &Bar0) -> u64 {
+        super::ga100::read_sysmem_flush_page_ga100(bar)
+    }
+
+    fn write_sysmem_flush_page(&self, bar: &Bar0, addr: u64) -> Result {
+        super::ga100::write_sysmem_flush_page_ga100(bar, addr);
+
+        Ok(())
+    }
+
+    fn supports_display(&self, bar: &Bar0) -> bool {
+        super::ga100::display_enabled_ga100(bar)
+    }
+
+    fn vidmem_size(&self, bar: &Bar0) -> u64 {
+        super::ga102::vidmem_size_ga102(bar)
+    }
+
+    fn non_wpr_heap_size(&self) -> Option<u32> {
+        // 2 MiB + 128 KiB non-WPR heap for Blackwell (see Open RM: kgspCalculateFbLayout_GB100).
+        Some(0x220000)
+    }
+}
+
+const GB100: Gb100 = Gb100;
+pub(super) const GB100_HAL: &dyn FbHal = &GB100;
diff --git a/drivers/gpu/nova-core/fb/hal/gh100.rs b/drivers/gpu/nova-core/fb/hal/gh100.rs
new file mode 100644
index 000000000000..32d7414e6243
--- /dev/null
+++ b/drivers/gpu/nova-core/fb/hal/gh100.rs
@@ -0,0 +1,38 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use kernel::prelude::*;
+
+use crate::{
+    driver::Bar0,
+    fb::hal::FbHal, //
+};
+
+struct Gh100;
+
+impl FbHal for Gh100 {
+    fn read_sysmem_flush_page(&self, bar: &Bar0) -> u64 {
+        super::ga100::read_sysmem_flush_page_ga100(bar)
+    }
+
+    fn write_sysmem_flush_page(&self, bar: &Bar0, addr: u64) -> Result {
+        super::ga100::write_sysmem_flush_page_ga100(bar, addr);
+
+        Ok(())
+    }
+
+    fn supports_display(&self, bar: &Bar0) -> bool {
+        super::ga100::display_enabled_ga100(bar)
+    }
+
+    fn vidmem_size(&self, bar: &Bar0) -> u64 {
+        super::ga102::vidmem_size_ga102(bar)
+    }
+
+    fn non_wpr_heap_size(&self) -> Option<u32> {
+        // 2 MiB non-WPR heap for Hopper (see Open RM: kgspCalculateFbLayout_GH100).
+        Some(0x200000)
+    }
+}
+
+const GH100: Gh100 = Gh100;
+pub(super) const GH100_HAL: &dyn FbHal = &GH100;
-- 
2.53.0



* [PATCH v7 24/31] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (22 preceding siblings ...)
  2026-03-17 22:53 ` [PATCH v7 23/31] gpu: nova-core: Hopper/Blackwell: larger non-WPR heap John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-17 22:53 ` [PATCH v7 25/31] gpu: nova-core: Blackwell: use correct sysmem flush registers John Hubbard
                   ` (7 subsequent siblings)
  31 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Add boot_fmc(), which builds and sends the Chain of Trust message to FSP,
and FmcBootArgs, which bundles the DMA-coherent boot parameters that FSP
reads at boot time. The FspFirmware fmc_image field becomes pub(crate),
and the fmc_full copy (a KVec<u8> kept for CPU-side signature extraction)
is now made after the "image" section lookup.

Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
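
For readers following the FRTS placement in boot_fmc(), here is a minimal
standalone sketch of the offset arithmetic, not the driver code itself. It
assumes plain u64 values instead of the kernel's Alignable/Alignment helpers
and returns Option instead of Result; `frts_region`, `align_up` and the
`pmu_reserved_size` parameter (standing in for fb::PMU_RESERVED_SIZE, whose
value is not shown in this patch) are names made up for illustration.

```rust
/// 1 MiB and 2 MiB, mirroring the kernel's `SZ_1M` / `SZ_2M` constants.
const SZ_1M: u64 = 1 << 20;
const SZ_2M: u64 = 2 << 20;

/// Round `value` up to a multiple of `align` (a power of two).
/// Returns `None` if the rounding would overflow.
fn align_up(value: u64, align: u64) -> Option<u64> {
    debug_assert!(align.is_power_of_two());
    value.checked_add(align - 1).map(|v| v & !(align - 1))
}

/// Hypothetical helper: returns `(frts_vidmem_offset, frts_vidmem_size)`.
///
/// The offset is relative to the end of the framebuffer, i.e.
/// FRTS_location = FB_END - offset. `pmu_reserved_size` is an assumed
/// stand-in for `fb::PMU_RESERVED_SIZE`.
fn frts_region(non_wpr_heap_size: u64, pmu_reserved_size: u64, resume: bool) -> Option<(u64, u32)> {
    if resume {
        // On resume, both FRTS fields are sent as zero.
        return Some((0, 0));
    }
    // Reserved area above FRTS: non-WPR heap plus the PMU reservation,
    // rounded up to a 2 MiB boundary.
    let reserved = non_wpr_heap_size.checked_add(pmu_reserved_size)?;
    let offset = align_up(reserved, SZ_2M)?;
    Some((offset, SZ_1M as u32))
}
```

With the Blackwell non-WPR heap of 0x220000 and an assumed 1 MiB PMU
reservation, the reserved size 0x320000 rounds up to 0x400000.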
 drivers/gpu/nova-core/firmware/fsp.rs |  12 +-
 drivers/gpu/nova-core/fsp.rs          | 174 +++++++++++++++++++++++++-
 drivers/gpu/nova-core/gpu.rs          |   1 -
 drivers/gpu/nova-core/mctp.rs         |   7 --
 4 files changed, 179 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware/fsp.rs b/drivers/gpu/nova-core/firmware/fsp.rs
index 5aedee8e6d41..e5059d59a4b7 100644
--- a/drivers/gpu/nova-core/firmware/fsp.rs
+++ b/drivers/gpu/nova-core/firmware/fsp.rs
@@ -14,24 +14,22 @@
     gpu::Chipset, //
 };
 
-#[expect(unused)]
+#[expect(dead_code)]
 pub(crate) struct FspFirmware {
     /// FMC firmware image data (only the "image" ELF section).
-    fmc_image: DmaObject,
+    pub(crate) fmc_image: DmaObject,
     /// Full FMC ELF data (for signature extraction).
     pub(crate) fmc_full: KVec<u8>,
 }
 
 impl FspFirmware {
-    #[expect(unused)]
+    #[expect(dead_code)]
     pub(crate) fn new(
         dev: &device::Device<device::Bound>,
         chipset: Chipset,
         ver: &str,
     ) -> Result<Self> {
         let fw = super::request_firmware(dev, chipset, "fmc", ver)?;
-        let mut fmc_full = KVec::with_capacity(fw.data().len(), GFP_KERNEL)?;
-        fmc_full.extend_from_slice(fw.data(), GFP_KERNEL)?;
 
         // FSP expects only the "image" section, not the entire ELF file.
         let fmc_image_data = elf::elf_section(fw.data(), "image").ok_or_else(|| {
@@ -39,6 +37,10 @@ pub(crate) fn new(
             EINVAL
         })?;
 
+        // Copy the full ELF into a kernel vector for CPU-side signature extraction.
+        let mut fmc_full = KVec::with_capacity(fw.data().len(), GFP_KERNEL)?;
+        fmc_full.extend_from_slice(fw.data(), GFP_KERNEL)?;
+
         Ok(Self {
             fmc_image: DmaObject::from_data(dev, fmc_image_data)?,
             fmc_full,
diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
index 18edf7a1a8e4..68bcfe45aec6 100644
--- a/drivers/gpu/nova-core/fsp.rs
+++ b/drivers/gpu/nova-core/fsp.rs
@@ -8,8 +8,14 @@
 
 use kernel::{
     device,
+    dma::CoherentAllocation,
     io::poll::read_poll_timeout,
     prelude::*,
+    ptr::{
+        Alignable,
+        Alignment, //
+    },
+    sizes::{SZ_1M, SZ_2M},
     time::Delta,
     transmute::{
         AsBytes,
@@ -38,7 +44,6 @@ pub(crate) const fn new(version: u16) -> Self {
     }
 
     /// Return the raw protocol version number for the wire format.
-    #[expect(dead_code)]
     pub(crate) const fn raw(self) -> u16 {
         self.0
     }
@@ -156,6 +161,35 @@ struct NvdmPayloadCommandResponse {
     error_code: u32,
 }
 
+/// NVDM (NVIDIA Device Management) COT (Chain of Trust) payload structure.
+/// This is the main message payload sent to FSP for Chain of Trust.
+#[repr(C, packed)]
+#[derive(Clone, Copy)]
+struct NvdmPayloadCot {
+    version: u16,
+    size: u16,
+    gsp_fmc_sysmem_offset: u64,
+    frts_sysmem_offset: u64,
+    frts_sysmem_size: u32,
+    frts_vidmem_offset: u64,
+    frts_vidmem_size: u32,
+    hash384: [u8; FSP_HASH_SIZE],
+    public_key: [u8; FSP_PKEY_SIZE],
+    signature: [u8; FSP_SIG_SIZE],
+    gsp_boot_args_sysmem_offset: u64,
+}
+
+/// Complete FSP message structure with MCTP and NVDM headers.
+#[repr(C, packed)]
+#[derive(Clone, Copy)]
+struct FspMessage {
+    mctp_header: u32,
+    nvdm_header: u32,
+    cot: NvdmPayloadCot,
+}
+
+// SAFETY: FspMessage is a packed C struct with only integral fields.
+unsafe impl AsBytes for FspMessage {}
 /// Complete FSP response structure with MCTP and NVDM headers.
 #[repr(C, packed)]
 #[derive(Clone, Copy)]
@@ -176,6 +210,84 @@ pub(crate) trait MessageToFsp: AsBytes {
     /// NVDM type identifying this message to FSP.
     const NVDM_TYPE: u32;
 }
+
+impl MessageToFsp for FspMessage {
+    const NVDM_TYPE: u32 = NvdmType::Cot as u32;
+}
+
+/// Bundled arguments for FMC boot via FSP Chain of Trust.
+pub(crate) struct FmcBootArgs<'a> {
+    chipset: crate::gpu::Chipset,
+    fmc_image_fw: &'a crate::dma::DmaObject,
+    fmc_boot_params: CoherentAllocation<GspFmcBootParams>,
+    resume: bool,
+    signatures: &'a FmcSignatures,
+}
+
+impl<'a> FmcBootArgs<'a> {
+    /// Build FMC boot arguments, allocating the DMA-coherent boot parameter
+    /// structure that FSP will read.
+    #[expect(dead_code)]
+    #[allow(clippy::too_many_arguments)]
+    pub(crate) fn new(
+        dev: &device::Device<device::Bound>,
+        chipset: crate::gpu::Chipset,
+        fmc_image_fw: &'a crate::dma::DmaObject,
+        wpr_meta_addr: u64,
+        wpr_meta_size: u32,
+        libos_addr: u64,
+        resume: bool,
+        signatures: &'a FmcSignatures,
+    ) -> Result<Self> {
+        // `GSP_DMA_TARGET_*` is not in the current Rust bindings yet.
+        const GSP_DMA_TARGET_COHERENT_SYSTEM: u32 = 1;
+        const GSP_DMA_TARGET_NONCOHERENT_SYSTEM: u32 = 2;
+
+        let fmc_boot_params = CoherentAllocation::<GspFmcBootParams>::alloc_coherent(
+            dev,
+            1,
+            GFP_KERNEL | __GFP_ZERO,
+        )?;
+
+        // Blackwell FSP expects wpr_carveout_offset and wpr_carveout_size to be zero;
+        // it obtains WPR info from other sources.
+        kernel::dma_write!(
+            fmc_boot_params,
+            [0]?.boot_gsp_rm_params,
+            GspAcrBootGspRmParams {
+                target: GSP_DMA_TARGET_COHERENT_SYSTEM,
+                gsp_rm_desc_size: wpr_meta_size,
+                gsp_rm_desc_offset: wpr_meta_addr,
+                b_is_gsp_rm_boot: 1,
+                ..Default::default()
+            }
+        );
+
+        kernel::dma_write!(
+            fmc_boot_params,
+            [0]?.gsp_rm_params,
+            GspRmParams {
+                target: GSP_DMA_TARGET_NONCOHERENT_SYSTEM,
+                boot_args_offset: libos_addr,
+            }
+        );
+
+        Ok(Self {
+            chipset,
+            fmc_image_fw,
+            fmc_boot_params,
+            resume,
+            signatures,
+        })
+    }
+
+    /// DMA address of the FMC boot parameters, needed after boot for lockdown
+    /// release polling.
+    #[expect(dead_code)]
+    pub(crate) fn boot_params_dma_handle(&self) -> u64 {
+        self.fmc_boot_params.dma_handle()
+    }
+}
 /// FSP interface for Hopper/Blackwell GPUs.
 pub(crate) struct Fsp;
 
@@ -277,8 +389,66 @@ pub(crate) fn extract_fmc_signatures(
         Ok(signatures)
     }
 
-    /// Send message to FSP and wait for response.
+    /// Boot GSP FMC via FSP Chain of Trust.
+    ///
+    /// Builds the COT message from the pre-configured [`FmcBootArgs`], sends it
+    /// to FSP, and waits for the response.
     #[expect(dead_code)]
+    pub(crate) fn boot_fmc(
+        dev: &device::Device<device::Bound>,
+        bar: &crate::driver::Bar0,
+        fsp_falcon: &crate::falcon::Falcon<crate::falcon::fsp::Fsp>,
+        args: &FmcBootArgs<'_>,
+    ) -> Result {
+        dev_dbg!(dev, "Starting FSP boot sequence for {}\n", args.chipset);
+
+        let fmc_addr = args.fmc_image_fw.dma_handle();
+        let fmc_boot_params_addr = args.fmc_boot_params.dma_handle();
+
+        // frts_offset is relative to FB end: FRTS_location = FB_END - frts_offset
+        let frts_offset = if !args.resume {
+            let frts_reserved_size = crate::fb::calc_non_wpr_heap_size(args.chipset)
+                .checked_add(u64::from(crate::fb::PMU_RESERVED_SIZE))
+                .ok_or(EINVAL)?;
+
+            frts_reserved_size
+                .align_up(Alignment::new::<SZ_2M>())
+                .ok_or(EINVAL)?
+        } else {
+            0
+        };
+        let frts_size: u32 = if !args.resume { SZ_1M as u32 } else { 0 };
+
+        let msg = KBox::new(
+            FspMessage {
+                mctp_header: MctpHeader::single_packet().raw(),
+                nvdm_header: NvdmHeader::new(NvdmType::Cot).raw(),
+
+                cot: NvdmPayloadCot {
+                    version: args.chipset.fsp_cot_version().ok_or(ENOTSUPP)?.raw(),
+                    size: u16::try_from(core::mem::size_of::<NvdmPayloadCot>())
+                        .map_err(|_| EINVAL)?,
+                    gsp_fmc_sysmem_offset: fmc_addr,
+                    frts_sysmem_offset: 0,
+                    frts_sysmem_size: 0,
+                    frts_vidmem_offset: frts_offset,
+                    frts_vidmem_size: frts_size,
+                    hash384: args.signatures.hash384,
+                    public_key: args.signatures.public_key,
+                    signature: args.signatures.signature,
+                    gsp_boot_args_sysmem_offset: fmc_boot_params_addr,
+                },
+            },
+            GFP_KERNEL,
+        )?;
+
+        Self::send_sync_fsp(dev, bar, fsp_falcon, &*msg)?;
+
+        dev_dbg!(dev, "FSP Chain of Trust completed successfully\n");
+        Ok(())
+    }
+
+    /// Send message to FSP and wait for response.
     fn send_sync_fsp<M>(
         dev: &device::Device<device::Bound>,
         bar: &crate::driver::Bar0,
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 1d25513fef20..066bf1e03652 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -139,7 +139,6 @@ pub(crate) const fn needs_fwsec_bootloader(self) -> bool {
     ///
     /// Hopper (GH100) uses version 1, Blackwell uses version 2.
     /// Returns `None` for architectures that do not use FSP.
-    #[expect(dead_code)]
     pub(crate) const fn fsp_cot_version(&self) -> Option<FspCotVersion> {
         match self.arch() {
             Architecture::Hopper => Some(FspCotVersion::new(1)),
diff --git a/drivers/gpu/nova-core/mctp.rs b/drivers/gpu/nova-core/mctp.rs
index 9e052d916e79..c23e8ec69636 100644
--- a/drivers/gpu/nova-core/mctp.rs
+++ b/drivers/gpu/nova-core/mctp.rs
@@ -6,8 +6,6 @@
 //! Device Management) messages between the kernel driver and GPU firmware
 //! processors such as FSP and GSP.
 
-#![expect(dead_code)]
-
 /// NVDM message type identifiers carried over MCTP.
 #[derive(Debug, Clone, Copy, PartialEq, Eq)]
 #[repr(u8)]
@@ -101,11 +99,6 @@ pub(crate) fn nvdm_type(self) -> core::result::Result<NvdmType, u8> {
         NvdmType::try_from(self.raw_nvdm_type())
     }
 
-    /// Extract the NVDM type field as a raw value.
-    pub(crate) fn nvdm_type_raw(self) -> u32 {
-        u32::from(self.raw_nvdm_type())
-    }
-
     /// Set the NVDM type field from a typed value.
     pub(crate) fn set_nvdm_type(self, nvdm_type: NvdmType) -> Self {
         self.set_raw_nvdm_type(u8::from(nvdm_type))
-- 
2.53.0



* [PATCH v7 25/31] gpu: nova-core: Blackwell: use correct sysmem flush registers
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (23 preceding siblings ...)
  2026-03-17 22:53 ` [PATCH v7 24/31] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-17 22:53 ` [PATCH v7 26/31] gpu: nova-core: make WPR heap sizing fallible John Hubbard
                   ` (6 subsequent siblings)
  31 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Blackwell GPUs moved the sysmem flush page registers away from the
legacy NV_PFB_NISO_FLUSH_SYSMEM_ADDR used by Ampere/Ada.

GB10x uses HSHUB0 registers, with both a primary and EG (egress) pair
that must be programmed to the same address. GB20x uses FBHUB0
registers.

Add a GB202 fb HAL, give the GB100 fb HAL its own HSHUB0 accessors, and
split the Blackwell HAL dispatch so that each uses its respective registers.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
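
As a rough illustration of the LO/HI register encoding described above, the
sketch below splits a 64-bit address into the two register values and joins
them back. It is a standalone approximation under stated assumptions: the
function names and `HI_FIELD_MASK` are invented here, and the explicit mask
models the 20-bit `adr` field of the HI register, which the register!() field
setter is assumed to apply in the real driver.

```rust
/// Mask for the 20-bit `adr` field (bits 19:0) of the HI register.
const HI_FIELD_MASK: u32 = (1 << 20) - 1;

/// Hypothetical helper: split an address into `(lo, hi)` register values.
fn split_flush_addr(addr: u64) -> (u32, u32) {
    // Lower 32 bits go to LO as-is; hardware ignores bits 7:0.
    let lo = addr as u32;
    // Upper 32 bits go to HI, truncated to the 20-bit field.
    let hi = ((addr >> 32) as u32) & HI_FIELD_MASK;
    (lo, hi)
}

/// Hypothetical helper: rebuild the address from the two register values,
/// as the read path does with `lo | (hi << 32)`.
fn join_flush_addr(lo: u32, hi: u32) -> u64 {
    u64::from(lo) | (u64::from(hi) << 32)
}
```

Unlike the older NV_PFB_NISO_FLUSH_SYSMEM_ADDR encoding, no 8-bit shift is
applied, so any address that fits in 52 bits round-trips unchanged.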
 drivers/gpu/nova-core/fb/hal.rs       | 10 ++++-
 drivers/gpu/nova-core/fb/hal/gb100.rs | 47 +++++++++++++++++---
 drivers/gpu/nova-core/fb/hal/gb202.rs | 62 +++++++++++++++++++++++++++
 drivers/gpu/nova-core/regs.rs         | 36 ++++++++++++++++
 4 files changed, 149 insertions(+), 6 deletions(-)
 create mode 100644 drivers/gpu/nova-core/fb/hal/gb202.rs

diff --git a/drivers/gpu/nova-core/fb/hal.rs b/drivers/gpu/nova-core/fb/hal.rs
index ebd12247f771..844b00868832 100644
--- a/drivers/gpu/nova-core/fb/hal.rs
+++ b/drivers/gpu/nova-core/fb/hal.rs
@@ -13,9 +13,14 @@
 mod ga100;
 mod ga102;
 mod gb100;
+mod gb202;
 mod gh100;
 mod tu102;
 
+/// Non-WPR heap size for Blackwell (2 MiB + 128 KiB).
+/// See Open RM: kgspCalculateFbLayout_GB100.
+const BLACKWELL_NON_WPR_HEAP_SIZE: u32 = 0x220000;
+
 pub(crate) trait FbHal {
     /// Returns the address of the currently-registered sysmem flush page.
     fn read_sysmem_flush_page(&self, bar: &Bar0) -> u64;
@@ -46,6 +51,9 @@ pub(crate) fn fb_hal(chipset: Chipset) -> &'static dyn FbHal {
         Architecture::Ampere if chipset == Chipset::GA100 => ga100::GA100_HAL,
         Architecture::Ampere | Architecture::Ada => ga102::GA102_HAL,
         Architecture::Hopper => gh100::GH100_HAL,
-        Architecture::Blackwell => gb100::GB100_HAL,
+        Architecture::Blackwell => match chipset {
+            Chipset::GB100 | Chipset::GB102 => gb100::GB100_HAL,
+            _ => gb202::GB202_HAL,
+        },
     }
 }
diff --git a/drivers/gpu/nova-core/fb/hal/gb100.rs b/drivers/gpu/nova-core/fb/hal/gb100.rs
index bead99a6ca76..831a058a388b 100644
--- a/drivers/gpu/nova-core/fb/hal/gb100.rs
+++ b/drivers/gpu/nova-core/fb/hal/gb100.rs
@@ -1,21 +1,59 @@
 // SPDX-License-Identifier: GPL-2.0
 
+//! Blackwell GB10x framebuffer HAL.
+//!
+//! GB10x GPUs use HSHUB0 registers for the sysmem flush page. Both the primary and EG (egress)
+//! register pairs must be programmed to the same address, as required by hardware.
+
 use kernel::prelude::*;
 
 use crate::{
     driver::Bar0,
-    fb::hal::FbHal, //
+    fb::hal::FbHal,
+    regs, //
 };
 
 struct Gb100;
 
+fn read_sysmem_flush_page_gb100(bar: &Bar0) -> u64 {
+    let lo = u64::from(regs::NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO::read(bar).adr());
+    let hi = u64::from(regs::NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI::read(bar).adr());
+
+    lo | (hi << 32)
+}
+
+fn write_sysmem_flush_page_gb100(bar: &Bar0, addr: u64) {
+    // CAST: lower 32 bits. Hardware ignores bits 7:0.
+    let addr_lo = addr as u32;
+    // CAST: upper 32 bits, then masked to 20 bits by the register field.
+    let addr_hi = (addr >> 32) as u32;
+
+    // Write HI first. The hardware will trigger the flush on the LO write.
+
+    // Primary HSHUB pair.
+    regs::NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI::default()
+        .set_adr(addr_hi)
+        .write(bar);
+    regs::NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO::default()
+        .set_adr(addr_lo)
+        .write(bar);
+
+    // EG (egress) pair -- must match the primary pair.
+    regs::NV_PFB_HSHUB0_EG_PCIE_FLUSH_SYSMEM_ADDR_HI::default()
+        .set_adr(addr_hi)
+        .write(bar);
+    regs::NV_PFB_HSHUB0_EG_PCIE_FLUSH_SYSMEM_ADDR_LO::default()
+        .set_adr(addr_lo)
+        .write(bar);
+}
+
 impl FbHal for Gb100 {
     fn read_sysmem_flush_page(&self, bar: &Bar0) -> u64 {
-        super::ga100::read_sysmem_flush_page_ga100(bar)
+        read_sysmem_flush_page_gb100(bar)
     }
 
     fn write_sysmem_flush_page(&self, bar: &Bar0, addr: u64) -> Result {
-        super::ga100::write_sysmem_flush_page_ga100(bar, addr);
+        write_sysmem_flush_page_gb100(bar, addr);
 
         Ok(())
     }
@@ -29,8 +67,7 @@ fn vidmem_size(&self, bar: &Bar0) -> u64 {
     }
 
     fn non_wpr_heap_size(&self) -> Option<u32> {
-        // 2 MiB + 128 KiB non-WPR heap for Blackwell (see Open RM: kgspCalculateFbLayout_GB100).
-        Some(0x220000)
+        Some(super::BLACKWELL_NON_WPR_HEAP_SIZE)
     }
 }
 
diff --git a/drivers/gpu/nova-core/fb/hal/gb202.rs b/drivers/gpu/nova-core/fb/hal/gb202.rs
new file mode 100644
index 000000000000..2a4c3e7961b2
--- /dev/null
+++ b/drivers/gpu/nova-core/fb/hal/gb202.rs
@@ -0,0 +1,62 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Blackwell GB20x framebuffer HAL.
+//!
+//! GB20x GPUs moved the sysmem flush registers from `NV_PFB_NISO_FLUSH_SYSMEM_ADDR` to
+//! `NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_{LO,HI}`.
+
+use kernel::prelude::*;
+
+use crate::{
+    driver::Bar0,
+    fb::hal::FbHal,
+    regs, //
+};
+
+struct Gb202;
+
+fn read_sysmem_flush_page_gb202(bar: &Bar0) -> u64 {
+    let lo = u64::from(regs::NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO::read(bar).adr());
+    let hi = u64::from(regs::NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI::read(bar).adr());
+
+    lo | (hi << 32)
+}
+
+fn write_sysmem_flush_page_gb202(bar: &Bar0, addr: u64) {
+    // Write HI first. The hardware will trigger the flush on the LO write.
+    regs::NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI::default()
+        // CAST: upper 32 bits, then masked to 20 bits by the register field.
+        .set_adr((addr >> 32) as u32)
+        .write(bar);
+    regs::NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO::default()
+        // CAST: lower 32 bits. Hardware ignores bits 7:0.
+        .set_adr(addr as u32)
+        .write(bar);
+}
+
+impl FbHal for Gb202 {
+    fn read_sysmem_flush_page(&self, bar: &Bar0) -> u64 {
+        read_sysmem_flush_page_gb202(bar)
+    }
+
+    fn write_sysmem_flush_page(&self, bar: &Bar0, addr: u64) -> Result {
+        write_sysmem_flush_page_gb202(bar, addr);
+
+        Ok(())
+    }
+
+    fn supports_display(&self, bar: &Bar0) -> bool {
+        super::ga100::display_enabled_ga100(bar)
+    }
+
+    fn vidmem_size(&self, bar: &Bar0) -> u64 {
+        super::ga102::vidmem_size_ga102(bar)
+    }
+
+    fn non_wpr_heap_size(&self) -> Option<u32> {
+        Some(super::BLACKWELL_NON_WPR_HEAP_SIZE)
+    }
+}
+
+const GB202: Gb202 = Gb202;
+pub(super) const GB202_HAL: &dyn FbHal = &GB202;
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index 183915a3bb31..e70be122e1c9 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -116,6 +116,42 @@ fn fmt(&self, f: &mut kernel::fmt::Formatter<'_>) -> kernel::fmt::Result {
     23:0    adr_63_40 as u32;
 });
 
+// Blackwell GB10x sysmem flush registers (HSHUB0).
+//
+// GB10x GPUs use two pairs of HSHUB registers for sysmembar: a primary pair and an EG
+// (egress) pair. Both must be programmed to the same address. Hardware ignores bits 7:0
+// of each LO register. HSHUB0 base is 0x00891000.
+
+register!(NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO @ 0x00891e50 {
+    31:0    adr as u32;
+});
+
+register!(NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI @ 0x00891e54 {
+    19:0    adr as u32;
+});
+
+register!(NV_PFB_HSHUB0_EG_PCIE_FLUSH_SYSMEM_ADDR_LO @ 0x008916c0 {
+    31:0    adr as u32;
+});
+
+register!(NV_PFB_HSHUB0_EG_PCIE_FLUSH_SYSMEM_ADDR_HI @ 0x008916c4 {
+    19:0    adr as u32;
+});
+
+// Blackwell GB20x sysmem flush registers (FBHUB0).
+//
+// Unlike the older NV_PFB_NISO_FLUSH_SYSMEM_ADDR registers which encode the address with an
+// 8-bit right-shift, these registers take the raw address split into lower/upper 32-bit halves.
+// The hardware ignores bits 7:0 of the LO register.
+
+register!(NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO @ 0x008a1d58 {
+    31:0    adr as u32;
+});
+
+register!(NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI @ 0x008a1d5c {
+    19:0    adr as u32;
+});
+
 register!(NV_PFB_PRI_MMU_LOCAL_MEMORY_RANGE @ 0x00100ce0 {
     3:0     lower_scale as u8;
     9:4     lower_mag as u8;
-- 
2.53.0



* [PATCH v7 26/31] gpu: nova-core: make WPR heap sizing fallible
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (24 preceding siblings ...)
  2026-03-17 22:53 ` [PATCH v7 25/31] gpu: nova-core: Blackwell: use correct sysmem flush registers John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-17 22:53 ` [PATCH v7 27/31] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap John Hubbard
                   ` (5 subsequent siblings)
  31 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Make management_overhead() fail on multiplication or alignment
overflow instead of silently saturating. Propagate that failure through
wpr_heap_size() and the framebuffer layout code that consumes it.

This is not Blackwell-specific, so keep it separate from the larger WPR2
heap change that follows.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
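
To show the behavioral difference in isolation, here is a small standalone
sketch of the checked computation. It is an approximation, not the driver
code: it uses Option where the driver returns Result with EINVAL, and the
`size_per_gb` parameter is a stand-in for
bindings::GSP_FW_HEAP_PARAM_SIZE_PER_GB_FB, whose value is not shown here.

```rust
const SZ_1M: u64 = 1 << 20;
const SZ_1G: u64 = 1 << 30;

/// Round `value` up to a multiple of `align` (a power of two).
/// Returns `None` if the rounding would overflow.
fn align_up(value: u64, align: u64) -> Option<u64> {
    debug_assert!(align.is_power_of_two());
    value.checked_add(align - 1).map(|v| v & !(align - 1))
}

/// Hypothetical standalone version of `management_overhead()`.
///
/// `size_per_gb` is an assumed per-GiB overhead in bytes. Both the
/// multiplication and the 1 MiB alignment report overflow as `None`
/// instead of saturating to `u64::MAX`.
fn management_overhead(fb_size: u64, size_per_gb: u64) -> Option<u64> {
    let fb_size_gb = fb_size.div_ceil(SZ_1G);
    size_per_gb
        .checked_mul(fb_size_gb)
        .and_then(|bytes| align_up(bytes, SZ_1M))
}
```

A caller then propagates the failure with `?`, as wpr_heap_size() does in
this patch, rather than receiving a saturated value that the later clamp
would silently hide.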
 drivers/gpu/nova-core/fb.rs     |  2 +-
 drivers/gpu/nova-core/gsp/fw.rs | 16 +++++++++-------
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/nova-core/fb.rs b/drivers/gpu/nova-core/fb.rs
index c12705f5f742..5943db2b619b 100644
--- a/drivers/gpu/nova-core/fb.rs
+++ b/drivers/gpu/nova-core/fb.rs
@@ -247,7 +247,7 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<
         let wpr2_heap = {
             const WPR2_HEAP_DOWN_ALIGN: Alignment = Alignment::new::<SZ_1M>();
             let wpr2_heap_size =
-                gsp::LibosParams::from_chipset(chipset).wpr_heap_size(chipset, fb.end);
+                gsp::LibosParams::from_chipset(chipset).wpr_heap_size(chipset, fb.end)?;
             let wpr2_heap_addr = (elf.start - wpr2_heap_size).align_down(WPR2_HEAP_DOWN_ALIGN);
 
             FbRange(wpr2_heap_addr..(elf.start).align_down(WPR2_HEAP_DOWN_ALIGN))
diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index 92335e7fc34a..4a8ba2721dd1 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -140,13 +140,14 @@ fn client_alloc_size() -> u64 {
 
     /// Returns the amount of memory to reserve for management purposes for a framebuffer of size
     /// `fb_size`.
-    fn management_overhead(fb_size: u64) -> u64 {
+    fn management_overhead(fb_size: u64) -> Result<u64> {
         let fb_size_gb = fb_size.div_ceil(u64::from_safe_cast(kernel::sizes::SZ_1G));
 
         u64::from(bindings::GSP_FW_HEAP_PARAM_SIZE_PER_GB_FB)
-            .saturating_mul(fb_size_gb)
+            .checked_mul(fb_size_gb)
+            .ok_or(EINVAL)?
             .align_up(GSP_HEAP_ALIGNMENT)
-            .unwrap_or(u64::MAX)
+            .ok_or(EINVAL)
     }
 }
 
@@ -189,18 +190,19 @@ pub(crate) fn from_chipset(chipset: Chipset) -> &'static LibosParams {
 
     /// Returns the amount of memory (in bytes) to allocate for the WPR heap for a framebuffer size
     /// of `fb_size` (in bytes) for `chipset`.
-    pub(crate) fn wpr_heap_size(&self, chipset: Chipset, fb_size: u64) -> u64 {
+    pub(crate) fn wpr_heap_size(&self, chipset: Chipset, fb_size: u64) -> Result<u64> {
         // The WPR heap will contain the following:
         // LIBOS carveout,
-        self.carveout_size
+        Ok(self
+            .carveout_size
             // RM boot working memory,
             .saturating_add(GspFwHeapParams::base_rm_size(chipset))
             // One RM client,
             .saturating_add(GspFwHeapParams::client_alloc_size())
             // Overhead for memory management.
-            .saturating_add(GspFwHeapParams::management_overhead(fb_size))
+            .saturating_add(GspFwHeapParams::management_overhead(fb_size)?)
             // Clamp to the supported heap sizes.
-            .clamp(self.allowed_heap_size.start, self.allowed_heap_size.end - 1)
+            .clamp(self.allowed_heap_size.start, self.allowed_heap_size.end - 1))
     }
 }
 
-- 
2.53.0



* [PATCH v7 27/31] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (25 preceding siblings ...)
  2026-03-17 22:53 ` [PATCH v7 26/31] gpu: nova-core: make WPR heap sizing fallible John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-18 16:12   ` kernel test robot
  2026-03-17 22:53 ` [PATCH v7 28/31] gpu: nova-core: refactor SEC2 booter loading into BooterFirmware::run() John Hubbard
                   ` (4 subsequent siblings)
  31 siblings, 1 reply; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Hopper, Blackwell and later GPUs require a larger heap for WPR2.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
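
The effect of the new constants can be sketched as follows. This is a
simplified standalone model, not the driver code: `carveout`, `overhead` and
`max_heap_exclusive` are parameters invented for illustration (the LIBOS3
carveout size and the MAX_MB binding value are not shown in this patch),
while the 14 MiB, 142 MiB and 170 MiB figures come from the constants added
below.

```rust
const SZ_1M: u64 = 1 << 20;

/// Hopper/Blackwell values taken from the constants in this patch.
const BASE_RM_SIZE_GH100: u64 = 14 * SZ_1M;
const CLIENT_ALLOC_SIZE_GH100: u64 = 142 * SZ_1M;
const MIN_HEAP_SIZE_HOPPER: u64 = 170 * SZ_1M;

/// Hypothetical model of `wpr_heap_size()` for Hopper/Blackwell.
///
/// `carveout` and `overhead` are assumed inputs; `max_heap_exclusive` is the
/// assumed exclusive upper bound of the allowed heap range and must be
/// greater than `MIN_HEAP_SIZE_HOPPER`.
fn wpr_heap_size_hopper(carveout: u64, overhead: u64, max_heap_exclusive: u64) -> u64 {
    carveout
        // RM boot working memory.
        .saturating_add(BASE_RM_SIZE_GH100)
        // One RM client (already 1 MiB aligned).
        .saturating_add(CLIENT_ALLOC_SIZE_GH100)
        // Overhead for memory management.
        .saturating_add(overhead)
        // Clamp to the supported heap sizes (the range end is exclusive).
        .clamp(MIN_HEAP_SIZE_HOPPER, max_heap_exclusive - 1)
}
```

With an assumed 22 MiB carveout and 3 MiB of overhead the sum is 181 MiB,
which already exceeds the 170 MiB floor, so the clamp only matters for small
inputs or for sums above the upper bound.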
 drivers/gpu/nova-core/gsp/fw.rs | 61 +++++++++++++++++++++++++--------
 1 file changed, 47 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index 4a8ba2721dd1..c2eee984bd4d 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -121,21 +121,41 @@ enum GspFwHeapParams {}
 /// Minimum required alignment for the GSP heap.
 const GSP_HEAP_ALIGNMENT: Alignment = Alignment::new::<{ 1 << 20 }>();
 
+// These constants override the generated bindings for architecture-specific heap sizing.
+// See Open RM: kgspCalculateGspFwHeapSize and related functions.
+//
+// 14MB for Hopper/Blackwell+.
+const GSP_FW_HEAP_PARAM_BASE_RM_SIZE_GH100: u64 = 14 * num::usize_as_u64(SZ_1M);
+// 142MB client alloc for ~188MB total.
+const GSP_FW_HEAP_PARAM_CLIENT_ALLOC_SIZE_GH100: u64 = 142 * num::usize_as_u64(SZ_1M);
+// Hopper/Blackwell+ minimum heap size: 170MB (88 + 12 + 70).
+// See Open RM: GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS3_BAREMETAL_MIN_MB for the base 88MB,
+// plus Hopper+ additions in kgspCalculateGspFwHeapSize_GH100.
+const GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS3_BAREMETAL_MIN_MB_HOPPER: u64 = 170;
+
 impl GspFwHeapParams {
     /// Returns the amount of GSP-RM heap memory used during GSP-RM boot and initialization (up to
     /// and including the first client subdevice allocation).
-    fn base_rm_size(_chipset: Chipset) -> u64 {
-        // TODO: this needs to be updated to return the correct value for Hopper+ once support for
-        // them is added:
-        // u64::from(bindings::GSP_FW_HEAP_PARAM_BASE_RM_SIZE_GH100)
-        u64::from(bindings::GSP_FW_HEAP_PARAM_BASE_RM_SIZE_TU10X)
+    fn base_rm_size(chipset: Chipset) -> u64 {
+        use crate::gpu::Architecture;
+        match chipset.arch() {
+            Architecture::Hopper | Architecture::Blackwell => {
+                GSP_FW_HEAP_PARAM_BASE_RM_SIZE_GH100
+            }
+            _ => u64::from(bindings::GSP_FW_HEAP_PARAM_BASE_RM_SIZE_TU10X),
+        }
     }
 
     /// Returns the amount of heap memory required to support a single channel allocation.
-    fn client_alloc_size() -> u64 {
-        u64::from(bindings::GSP_FW_HEAP_PARAM_CLIENT_ALLOC_SIZE)
-            .align_up(GSP_HEAP_ALIGNMENT)
-            .unwrap_or(u64::MAX)
+    fn client_alloc_size(chipset: Chipset) -> Result<u64> {
+        use crate::gpu::Architecture;
+        let size = match chipset.arch() {
+            Architecture::Hopper | Architecture::Blackwell => {
+                GSP_FW_HEAP_PARAM_CLIENT_ALLOC_SIZE_GH100
+            }
+            _ => u64::from(bindings::GSP_FW_HEAP_PARAM_CLIENT_ALLOC_SIZE),
+        };
+        size.align_up(GSP_HEAP_ALIGNMENT).ok_or(EINVAL)
     }
 
     /// Returns the amount of memory to reserve for management purposes for a framebuffer of size
@@ -179,12 +199,25 @@ impl LibosParams {
                 * num::usize_as_u64(SZ_1M),
     };
 
+    /// Hopper/Blackwell+ GPUs need a larger minimum heap size than the bindings specify.
+    /// The r570 bindings set LIBOS3_BAREMETAL_MIN_MB to 88MB, but Hopper/Blackwell+ actually
+    /// requires 170MB (88 + 12 + 70).
+    const LIBOS_HOPPER: LibosParams = LibosParams {
+        carveout_size: num::u32_as_u64(bindings::GSP_FW_HEAP_PARAM_OS_SIZE_LIBOS3_BAREMETAL),
+        allowed_heap_size: GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS3_BAREMETAL_MIN_MB_HOPPER
+            * num::usize_as_u64(SZ_1M)
+            ..num::u32_as_u64(bindings::GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS3_BAREMETAL_MAX_MB)
+                * num::usize_as_u64(SZ_1M),
+    };
+
     /// Returns the libos parameters corresponding to `chipset`.
     pub(crate) fn from_chipset(chipset: Chipset) -> &'static LibosParams {
-        if chipset < Chipset::GA102 {
-            &Self::LIBOS2
-        } else {
-            &Self::LIBOS3
+        use crate::gpu::Architecture;
+        match chipset.arch() {
+            Architecture::Turing => &Self::LIBOS2,
+            Architecture::Ampere if chipset == Chipset::GA100 => &Self::LIBOS2,
+            Architecture::Ampere | Architecture::Ada => &Self::LIBOS3,
+            Architecture::Hopper | Architecture::Blackwell => &Self::LIBOS_HOPPER,
         }
     }
 
@@ -198,7 +231,7 @@ pub(crate) fn wpr_heap_size(&self, chipset: Chipset, fb_size: u64) -> Result<u64
             // RM boot working memory,
             .saturating_add(GspFwHeapParams::base_rm_size(chipset))
             // One RM client,
-            .saturating_add(GspFwHeapParams::client_alloc_size())
+            .saturating_add(GspFwHeapParams::client_alloc_size(chipset)?)
             // Overhead for memory management.
             .saturating_add(GspFwHeapParams::management_overhead(fb_size)?)
             // Clamp to the supported heap sizes.
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* Re: [PATCH v7 27/31] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap
  2026-03-17 22:53 ` [PATCH v7 27/31] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap John Hubbard
@ 2026-03-18 16:12   ` kernel test robot
  2026-03-18 17:59     ` John Hubbard
  0 siblings, 1 reply; 66+ messages in thread
From: kernel test robot @ 2026-03-18 16:12 UTC (permalink / raw)
  To: John Hubbard, Danilo Krummrich, Alexandre Courbot
  Cc: oe-kbuild-all, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML,
	John Hubbard

Hi John,

kernel test robot noticed the following build errors:

[auto build test ERROR on d19ab42867ae7c68be84ed957d95712b7934773f]

url:    https://github.com/intel-lab-lkp/linux/commits/John-Hubbard/gpu-nova-core-Hopper-Blackwell-basic-GPU-identification/20260318-203344
base:   d19ab42867ae7c68be84ed957d95712b7934773f
patch link:    https://lore.kernel.org/r/20260317225355.549853-28-jhubbard%40nvidia.com
patch subject: [PATCH v7 27/31] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap
config: x86_64-rhel-9.4-rust (https://download.01.org/0day-ci/archive/20260318/202603181742.8HLcTchk-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
rustc: rustc 1.88.0 (6b00bc388 2025-06-23)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260318/202603181742.8HLcTchk-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202603181742.8HLcTchk-lkp@intel.com/

All errors (new ones prefixed by >>):

   PATH=/opt/cross/clang-20/bin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
   INFO PATH=/opt/cross/rustc-1.88.0-bindgen-0.72.1/cargo/bin:/opt/cross/clang-20/bin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
   /usr/bin/timeout -k 100 12h /usr/bin/make KCFLAGS=\ -fno-crash-diagnostics\ -Wno-error=return-type\ -Wreturn-type\ -funsigned-char\ -Wundef\ -falign-functions=64 W=1 --keep-going LLVM=1 -j32 -C source O=/kbuild/obj/consumer/x86_64-rhel-9.4-rust ARCH=x86_64 SHELL=/bin/bash rustfmtcheck 
   make: Entering directory '/kbuild/src/consumer'
   make[1]: Entering directory '/kbuild/obj/consumer/x86_64-rhel-9.4-rust'
>> Diff in drivers/gpu/nova-core/gsp/fw.rs:139:
        fn base_rm_size(chipset: Chipset) -> u64 {
            use crate::gpu::Architecture;
            match chipset.arch() {
   -            Architecture::Hopper | Architecture::Blackwell => {
   -                GSP_FW_HEAP_PARAM_BASE_RM_SIZE_GH100
   -            }
   +            Architecture::Hopper | Architecture::Blackwell => GSP_FW_HEAP_PARAM_BASE_RM_SIZE_GH100,
                _ => u64::from(bindings::GSP_FW_HEAP_PARAM_BASE_RM_SIZE_TU10X),
            }
        }
   make[2]: *** [Makefile:1916: rustfmt] Error 123
   make[2]: Target 'rustfmtcheck' not remade because of errors.
   make[1]: Leaving directory '/kbuild/obj/consumer/x86_64-rhel-9.4-rust'
   make[1]: *** [Makefile:248: __sub-make] Error 2
   make[1]: Target 'rustfmtcheck' not remade because of errors.
   make: *** [Makefile:248: __sub-make] Error 2
   make: Target 'rustfmtcheck' not remade because of errors.
   make: Leaving directory '/kbuild/src/consumer'

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v7 27/31] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap
  2026-03-18 16:12   ` kernel test robot
@ 2026-03-18 17:59     ` John Hubbard
  0 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-18 17:59 UTC (permalink / raw)
  To: kernel test robot, Danilo Krummrich, Alexandre Courbot
  Cc: oe-kbuild-all, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On 3/18/26 9:12 AM, kernel test robot wrote:
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202603181742.8HLcTchk-lkp@intel.com/

OK, some rustfmtcheck failures. This has revealed a major gap in my
build flow: although I have rustfmt(1) set up to run upon saving
files in my code editor, not all of my scripts run it.

The format-on-save approach worked so well that I completely forgot
about "make rustfmt" and "make rustfmtcheck". Both are now part of
all of my build and test scripts here.

Both issues (this one in gsp/fw.rs, and the one in gsp/boot.rs
reported against patch 31/31) are fixed in v8.

Sorry about the failures.

thanks,
-- 
John Hubbard

> 
> All errors (new ones prefixed by >>):
> 
>    PATH=/opt/cross/clang-20/bin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
>    INFO PATH=/opt/cross/rustc-1.88.0-bindgen-0.72.1/cargo/bin:/opt/cross/clang-20/bin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
>    /usr/bin/timeout -k 100 12h /usr/bin/make KCFLAGS=\ -fno-crash-diagnostics\ -Wno-error=return-type\ -Wreturn-type\ -funsigned-char\ -Wundef\ -falign-functions=64 W=1 --keep-going LLVM=1 -j32 -C source O=/kbuild/obj/consumer/x86_64-rhel-9.4-rust ARCH=x86_64 SHELL=/bin/bash rustfmtcheck 
>    make: Entering directory '/kbuild/src/consumer'
>    make[1]: Entering directory '/kbuild/obj/consumer/x86_64-rhel-9.4-rust'
>>> Diff in drivers/gpu/nova-core/gsp/fw.rs:139:
>         fn base_rm_size(chipset: Chipset) -> u64 {
>             use crate::gpu::Architecture;
>             match chipset.arch() {
>    -            Architecture::Hopper | Architecture::Blackwell => {
>    -                GSP_FW_HEAP_PARAM_BASE_RM_SIZE_GH100
>    -            }
>    +            Architecture::Hopper | Architecture::Blackwell => GSP_FW_HEAP_PARAM_BASE_RM_SIZE_GH100,
>                 _ => u64::from(bindings::GSP_FW_HEAP_PARAM_BASE_RM_SIZE_TU10X),
>             }
>         }
>    make[2]: *** [Makefile:1916: rustfmt] Error 123
>    make[2]: Target 'rustfmtcheck' not remade because of errors.
>    make[1]: Leaving directory '/kbuild/obj/consumer/x86_64-rhel-9.4-rust'
>    make[1]: *** [Makefile:248: __sub-make] Error 2
>    make[1]: Target 'rustfmtcheck' not remade because of errors.
>    make: *** [Makefile:248: __sub-make] Error 2
>    make: Target 'rustfmtcheck' not remade because of errors.
>    make: Leaving directory '/kbuild/src/consumer'
> 



^ permalink raw reply	[flat|nested] 66+ messages in thread

* [PATCH v7 28/31] gpu: nova-core: refactor SEC2 booter loading into BooterFirmware::run()
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (26 preceding siblings ...)
  2026-03-17 22:53 ` [PATCH v7 27/31] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-17 22:53 ` [PATCH v7 29/31] gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling John Hubbard
                   ` (3 subsequent siblings)
  31 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Move the SEC2 reset/load/boot sequence into a BooterFirmware::run()
method, and call it from a thin run_booter() helper on Gsp. This is
almost a pure refactoring with no behavior change, done in preparation
for adding an alternative FSP boot path. The one slight difference is
that an MBOX1 printing typo is fixed:

Previous output:

NovaCore 0000:e1:00.0: SEC2 MBOX0: 0x0, MBOX10x1

Fixed output:

NovaCore 0000:e1:00.0: SEC2 MBOX0: 0x0, MBOX1: 0x1
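
For illustration only (not part of this patch), the difference comes
down to a missing ": " in the format string. Below is a standalone
sketch using std format!(), on the assumption that dev_dbg!() renders
{:#x} the same way; the two helper names are hypothetical:

```rust
/// Hypothetical helper mirroring the old format string, which lacked
/// the ": " separator after "MBOX1".
fn old_mbox_line(mbox0: u32, mbox1: u32) -> String {
    format!("SEC2 MBOX0: {:#x}, MBOX1{:#x}", mbox0, mbox1)
}

/// Hypothetical helper mirroring the corrected format string.
fn fixed_mbox_line(mbox0: u32, mbox1: u32) -> String {
    format!("SEC2 MBOX0: {:#x}, MBOX1: {:#x}", mbox0, mbox1)
}
```

Calling both helpers with (0, 1) reproduces the two lines above,
minus the device prefix.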

Cc: Timur Tabi <ttabi@nvidia.com>
Suggested-by: Danilo Krummrich <dakr@kernel.org>
Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/firmware/booter.rs | 35 ++++++++++++++++++-
 drivers/gpu/nova-core/gsp/boot.rs        | 43 +++++++++++-------------
 2 files changed, 54 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware/booter.rs b/drivers/gpu/nova-core/firmware/booter.rs
index de2a4536b532..7595af8acfd8 100644
--- a/drivers/gpu/nova-core/firmware/booter.rs
+++ b/drivers/gpu/nova-core/firmware/booter.rs
@@ -8,8 +8,12 @@
 
 use kernel::{
     device,
+    dma::CoherentAllocation,
     prelude::*,
-    transmute::FromBytes, //
+    transmute::{
+        AsBytes,
+        FromBytes, //
+    },
 };
 
 use crate::{
@@ -396,6 +400,35 @@ pub(crate) fn new(
             ucode: ucode_signed,
         })
     }
+
+    /// Load and run the booter firmware on SEC2.
+    ///
+    /// Resets SEC2, loads this firmware image, then boots with the WPR metadata
+    /// address passed via the SEC2 mailboxes.
+    pub(crate) fn run<T: AsBytes + FromBytes>(
+        &self,
+        dev: &device::Device<device::Bound>,
+        bar: &Bar0,
+        sec2_falcon: &Falcon<Sec2>,
+        wpr_meta: &CoherentAllocation<T>,
+    ) -> Result {
+        sec2_falcon.reset(bar)?;
+        sec2_falcon.load(dev, bar, self)?;
+        let wpr_handle = wpr_meta.dma_handle();
+        let (mbox0, mbox1) = sec2_falcon.boot(
+            bar,
+            Some(wpr_handle as u32),
+            Some((wpr_handle >> 32) as u32),
+        )?;
+        dev_dbg!(dev, "SEC2 MBOX0: {:#x}, MBOX1: {:#x}\n", mbox0, mbox1);
+
+        if mbox0 != 0 {
+            dev_err!(dev, "Booter-load failed with error {:#x}\n", mbox0);
+            return Err(ENODEV);
+        }
+
+        Ok(())
+    }
 }
 
 impl FalconDmaLoadable for BooterFirmware {
diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
index 6db2decbc6f5..ad0344db66b2 100644
--- a/drivers/gpu/nova-core/gsp/boot.rs
+++ b/drivers/gpu/nova-core/gsp/boot.rs
@@ -129,6 +129,25 @@ fn run_fwsec_frts(
         }
     }
 
+    fn run_booter(
+        dev: &device::Device<device::Bound>,
+        bar: &Bar0,
+        chipset: Chipset,
+        sec2_falcon: &Falcon<Sec2>,
+        wpr_meta: &CoherentAllocation<GspFwWprMeta>,
+    ) -> Result {
+        let booter = BooterFirmware::new(
+            dev,
+            BooterKind::Loader,
+            chipset,
+            FIRMWARE_VERSION,
+            sec2_falcon,
+            bar,
+        )?;
+
+        booter.run(dev, bar, sec2_falcon, wpr_meta)
+    }
+
     /// Attempt to boot the GSP.
     ///
     /// This is a GPU-dependent and complex procedure that involves loading firmware files from
@@ -155,15 +174,6 @@ pub(crate) fn boot(
 
         Self::run_fwsec_frts(dev, chipset, gsp_falcon, bar, &bios, &fb_layout)?;
 
-        let booter_loader = BooterFirmware::new(
-            dev,
-            BooterKind::Loader,
-            chipset,
-            FIRMWARE_VERSION,
-            sec2_falcon,
-            bar,
-        )?;
-
         let wpr_meta =
             CoherentAllocation::<GspFwWprMeta>::alloc_coherent(dev, 1, GFP_KERNEL | __GFP_ZERO)?;
         dma_write!(wpr_meta, [0]?, GspFwWprMeta::new(&gsp_fw, &fb_layout));
@@ -186,20 +196,7 @@ pub(crate) fn boot(
             "Using SEC2 to load and run the booter_load firmware...\n"
         );
 
-        sec2_falcon.reset(bar)?;
-        sec2_falcon.load(dev, bar, &booter_loader)?;
-        let wpr_handle = wpr_meta.dma_handle();
-        let (mbox0, mbox1) = sec2_falcon.boot(
-            bar,
-            Some(wpr_handle as u32),
-            Some((wpr_handle >> 32) as u32),
-        )?;
-        dev_dbg!(pdev, "SEC2 MBOX0: {:#x}, MBOX1{:#x}\n", mbox0, mbox1);
-
-        if mbox0 != 0 {
-            dev_err!(pdev, "Booter-load failed with error {:#x}\n", mbox0);
-            return Err(ENODEV);
-        }
+        Self::run_booter(dev, bar, chipset, sec2_falcon, &wpr_meta)?;
 
         gsp_falcon.write_os_version(bar, gsp_fw.bootloader.app_version);
 
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v7 29/31] gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (27 preceding siblings ...)
  2026-03-17 22:53 ` [PATCH v7 28/31] gpu: nova-core: refactor SEC2 booter loading into BooterFirmware::run() John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-17 22:53 ` [PATCH v7 30/31] gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror John Hubbard
                   ` (2 subsequent siblings)
  31 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

On Hopper and Blackwell, FSP boots GSP with hardware lockdown enabled.
After FSP Chain of Trust completes, the driver must poll for lockdown
release before proceeding with GSP initialization. Add the register
bit and helper functions needed for this polling.
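
As a rough standalone model of the release check (a sketch, not the
driver code): the mailbox values and the HWCFG2 lockdown bit are
passed in as plain values here, whereas the driver reads them from
BAR0. The Mbox type and the hwcfg2_lockdown parameter are assumptions
made for this illustration; the constants match the diff in this
patch.

```rust
/// Pattern and mask as used in this patch: the low byte of mbox0
/// varies, the upper 24 bits are fixed while lockdown is active.
const GSP_LOCKDOWN_PATTERN: u32 = 0xbadf_4100;
const GSP_LOCKDOWN_MASK: u32 = 0xffff_ff00;

/// Snapshot of the two GSP mailboxes (hypothetical stand-in type).
struct Mbox {
    mbox0: u32,
    mbox1: u32,
}

impl Mbox {
    /// True while mbox0 still carries the lockdown pattern.
    fn is_locked_down(&self) -> bool {
        self.mbox0 != 0 && (self.mbox0 & GSP_LOCKDOWN_MASK) == GSP_LOCKDOWN_PATTERN
    }

    /// mbox1 holds the upper 32 bits, mbox0 the lower 32 bits.
    fn combined_addr(&self) -> u64 {
        (u64::from(self.mbox1) << 32) | u64::from(self.mbox0)
    }

    /// `hwcfg2_lockdown` stands in for the HWCFG2 register bit.
    fn lockdown_released(&self, hwcfg2_lockdown: bool, boot_params_addr: u64) -> bool {
        if self.is_locked_down() {
            return false;
        }
        // A non-zero mbox0 that is not the boot params address is
        // treated as an error code: stop polling and let the caller
        // report it.
        if self.mbox0 != 0 && self.combined_addr() != boot_params_addr {
            return true;
        }
        !hwcfg2_lockdown
    }
}
```

The second early return is why the caller still has to inspect mbox0
once polling finishes: "released" may also mean that GSP-FMC left an
error code in the mailbox.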

Cc: Gary Guo <gary@garyguo.net>
Cc: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/gsp/boot.rs | 80 ++++++++++++++++++++++++++++++-
 drivers/gpu/nova-core/regs.rs     |  1 +
 2 files changed, 80 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
index ad0344db66b2..a3ab0bd7a317 100644
--- a/drivers/gpu/nova-core/gsp/boot.rs
+++ b/drivers/gpu/nova-core/gsp/boot.rs
@@ -15,7 +15,8 @@
     falcon::{
         gsp::Gsp,
         sec2::Sec2,
-        Falcon, //
+        Falcon,
+        FalconEngine, //
     },
     fb::FbLayout,
     firmware::{
@@ -44,6 +45,54 @@
     vbios::Vbios,
 };
 
+/// GSP lockdown pattern written by firmware to mbox0 while RISC-V branch privilege
+/// lockdown is active. The low byte varies, the upper 24 bits are fixed.
+const GSP_LOCKDOWN_PATTERN: u32 = 0xbadf4100;
+const GSP_LOCKDOWN_MASK: u32 = 0xffffff00;
+
+/// GSP falcon mailbox state, used to track lockdown release status.
+struct GspMbox {
+    mbox0: u32,
+    mbox1: u32,
+}
+
+impl GspMbox {
+    /// Read both mailboxes from the GSP falcon.
+    fn read(gsp_falcon: &Falcon<Gsp>, bar: &Bar0) -> Self {
+        Self {
+            mbox0: gsp_falcon.read_mailbox0(bar),
+            mbox1: gsp_falcon.read_mailbox1(bar),
+        }
+    }
+
+    /// Returns true if the lockdown pattern is present in mbox0.
+    fn is_locked_down(&self) -> bool {
+        self.mbox0 != 0 && (self.mbox0 & GSP_LOCKDOWN_MASK) == GSP_LOCKDOWN_PATTERN
+    }
+
+    /// Combines mailbox0 and mailbox1 into a 64-bit address.
+    fn combined_addr(&self) -> u64 {
+        (u64::from(self.mbox1) << 32) | u64::from(self.mbox0)
+    }
+
+    /// Returns true if GSP lockdown has been released.
+    ///
+    /// Checks the lockdown pattern, validates the boot params address,
+    /// and verifies the HWCFG2 lockdown bit is clear.
+    fn lockdown_released(&self, bar: &Bar0, fmc_boot_params_addr: u64) -> bool {
+        if self.is_locked_down() {
+            return false;
+        }
+
+        if self.mbox0 != 0 && self.combined_addr() != fmc_boot_params_addr {
+            return true;
+        }
+
+        let hwcfg2 = regs::NV_PFALCON_FALCON_HWCFG2::read(bar, &crate::falcon::gsp::Gsp::ID);
+        !hwcfg2.riscv_br_priv_lockdown()
+    }
+}
+
 impl super::Gsp {
     /// Helper function to load and run the FWSEC-FRTS firmware and confirm that it has properly
     /// created the WPR2 region.
@@ -148,6 +197,35 @@ fn run_booter(
         booter.run(dev, bar, sec2_falcon, wpr_meta)
     }
 
+    /// Wait for GSP lockdown to be released after FSP Chain of Trust.
+    #[expect(dead_code)]
+    fn wait_for_gsp_lockdown_release(
+        dev: &device::Device<device::Bound>,
+        bar: &Bar0,
+        gsp_falcon: &Falcon<Gsp>,
+        fmc_boot_params_addr: u64,
+    ) -> Result {
+        dev_dbg!(dev, "Waiting for GSP lockdown release\n");
+
+        let mbox = read_poll_timeout(
+            || Ok(GspMbox::read(gsp_falcon, bar)),
+            |mbox| mbox.lockdown_released(bar, fmc_boot_params_addr),
+            Delta::from_millis(10),
+            Delta::from_millis(4000),
+        )
+        .inspect_err(|_| {
+            dev_err!(dev, "GSP lockdown release timeout\n");
+        })?;
+
+        if mbox.mbox0 != 0 {
+            dev_err!(dev, "GSP-FMC boot failed (mbox: {:#x})\n", mbox.mbox0);
+            return Err(EIO);
+        }
+
+        dev_dbg!(dev, "GSP lockdown released\n");
+        Ok(())
+    }
+
     /// Attempt to boot the GSP.
     ///
     /// This is a GPU-dependent and complex procedure that involves loading firmware files from
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index e70be122e1c9..e59d413dae06 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -321,6 +321,7 @@ pub(crate) fn vga_workspace_addr(self) -> Option<u64> {
 register!(NV_PFALCON_FALCON_HWCFG2 @ PFalconBase[0x000000f4] {
     10:10   riscv as bool;
     12:12   mem_scrubbing as bool, "Set to 0 after memory scrubbing is completed";
+    13:13   riscv_br_priv_lockdown as bool, "RISC-V branch privilege lockdown bit";
     31:31   reset_ready as bool, "Signal indicating that reset is completed (GA102+)";
 });
 
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v7 30/31] gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (28 preceding siblings ...)
  2026-03-17 22:53 ` [PATCH v7 29/31] gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-17 22:53 ` [PATCH v7 31/31] gpu: nova-core: Hopper/Blackwell: integrate FSP boot path into boot() John Hubbard
  2026-03-18 20:25 ` [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
  31 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Hopper and Blackwell GPUs use a different PCI config space mirror
address (0x092000) compared to older architectures (0x088000). Update
SetSystemInfo to accept a chipset parameter and select the correct
address based on architecture.
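
A minimal standalone sketch of that selection, using the values from
the diff in this patch. The local Architecture enum and the function
name are stand-ins for this illustration, not the driver's own
definitions:

```rust
/// Stand-in for the driver's architecture enum (assumption).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Architecture {
    Turing,
    Ampere,
    Ada,
    Hopper,
    Blackwell,
}

/// Hypothetical helper: PCI config mirror base per architecture.
/// The exhaustive match means a newly added architecture fails to
/// compile until it is given an explicit address.
const fn pci_config_mirror_base(arch: Architecture) -> u32 {
    match arch {
        Architecture::Turing | Architecture::Ampere | Architecture::Ada => 0x088000,
        Architecture::Hopper | Architecture::Blackwell => 0x092000,
    }
}
```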

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/gsp/boot.rs        |  2 +-
 drivers/gpu/nova-core/gsp/commands.rs    |  8 +++++---
 drivers/gpu/nova-core/gsp/fw/commands.rs | 20 +++++++++++++++++---
 3 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
index a3ab0bd7a317..7db811e90825 100644
--- a/drivers/gpu/nova-core/gsp/boot.rs
+++ b/drivers/gpu/nova-core/gsp/boot.rs
@@ -257,7 +257,7 @@ pub(crate) fn boot(
         dma_write!(wpr_meta, [0]?, GspFwWprMeta::new(&gsp_fw, &fb_layout));
 
         self.cmdq
-            .send_command(bar, commands::SetSystemInfo::new(pdev))?;
+            .send_command(bar, commands::SetSystemInfo::new(pdev, chipset))?;
         self.cmdq.send_command(bar, commands::SetRegistry::new())?;
 
         gsp_falcon.reset(bar)?;
diff --git a/drivers/gpu/nova-core/gsp/commands.rs b/drivers/gpu/nova-core/gsp/commands.rs
index 8f270eca33be..e6a9a1fc6296 100644
--- a/drivers/gpu/nova-core/gsp/commands.rs
+++ b/drivers/gpu/nova-core/gsp/commands.rs
@@ -20,6 +20,7 @@
 
 use crate::{
     driver::Bar0,
+    gpu::Chipset,
     gsp::{
         cmdq::{
             Cmdq,
@@ -37,12 +38,13 @@
 /// The `GspSetSystemInfo` command.
 pub(crate) struct SetSystemInfo<'a> {
     pdev: &'a pci::Device<device::Bound>,
+    chipset: Chipset,
 }
 
 impl<'a> SetSystemInfo<'a> {
     /// Creates a new `GspSetSystemInfo` command using the parameters of `pdev`.
-    pub(crate) fn new(pdev: &'a pci::Device<device::Bound>) -> Self {
-        Self { pdev }
+    pub(crate) fn new(pdev: &'a pci::Device<device::Bound>, chipset: Chipset) -> Self {
+        Self { pdev, chipset }
     }
 }
 
@@ -52,7 +54,7 @@ impl<'a> CommandToGsp for SetSystemInfo<'a> {
     type InitError = Error;
 
     fn init(&self) -> impl Init<Self::Command, Self::InitError> {
-        GspSetSystemInfo::init(self.pdev)
+        GspSetSystemInfo::init(self.pdev, self.chipset)
     }
 }
 
diff --git a/drivers/gpu/nova-core/gsp/fw/commands.rs b/drivers/gpu/nova-core/gsp/fw/commands.rs
index db46276430be..1dca2552ed54 100644
--- a/drivers/gpu/nova-core/gsp/fw/commands.rs
+++ b/drivers/gpu/nova-core/gsp/fw/commands.rs
@@ -10,7 +10,13 @@
     }, //
 };
 
-use crate::gsp::GSP_PAGE_SIZE;
+use crate::{
+    gpu::{
+        Architecture,
+        Chipset, //
+    },
+    gsp::GSP_PAGE_SIZE, //
+};
 
 use super::bindings;
 
@@ -24,7 +30,10 @@ pub(crate) struct GspSetSystemInfo {
 impl GspSetSystemInfo {
     /// Returns an in-place initializer for the `GspSetSystemInfo` command.
     #[allow(non_snake_case)]
-    pub(crate) fn init<'a>(dev: &'a pci::Device<device::Bound>) -> impl Init<Self, Error> + 'a {
+    pub(crate) fn init<'a>(
+        dev: &'a pci::Device<device::Bound>,
+        chipset: Chipset,
+    ) -> impl Init<Self, Error> + 'a {
         type InnerGspSystemInfo = bindings::GspSystemInfo;
         let init_inner = try_init!(InnerGspSystemInfo {
             gpuPhysAddr: dev.resource_start(0)?,
@@ -35,7 +44,12 @@ pub(crate) fn init<'a>(dev: &'a pci::Device<device::Bound>) -> impl Init<Self, E
             // Using TASK_SIZE in r535_gsp_rpc_set_system_info() seems wrong because
             // TASK_SIZE is per-task. That's probably a design issue in GSP-RM though.
             maxUserVa: (1 << 47) - 4096,
-            pciConfigMirrorBase: 0x088000,
+            // Hopper, Blackwell, and later moved the PCI config mirror window to 0x092000.
+            // Older architectures continue to use the legacy window at 0x088000.
+            pciConfigMirrorBase: match chipset.arch() {
+                Architecture::Turing | Architecture::Ampere | Architecture::Ada => 0x088000,
+                Architecture::Hopper | Architecture::Blackwell => 0x092000,
+            },
             pciConfigMirrorSize: 0x001000,
 
             PCIDeviceID: (u32::from(dev.device_id()) << 16) | u32::from(dev.vendor_id().as_raw()),
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH v7 31/31] gpu: nova-core: Hopper/Blackwell: integrate FSP boot path into boot()
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (29 preceding siblings ...)
  2026-03-17 22:53 ` [PATCH v7 30/31] gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror John Hubbard
@ 2026-03-17 22:53 ` John Hubbard
  2026-03-18 17:02   ` kernel test robot
  2026-03-18 20:25 ` [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
  31 siblings, 1 reply; 66+ messages in thread
From: John Hubbard @ 2026-03-17 22:53 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Add the FSP boot path for Hopper and Blackwell GPUs. These architectures
use FSP with FMC firmware for Chain of Trust boot, rather than SEC2.

boot() now dispatches to boot_via_sec2() or boot_via_fsp() based on
architecture. The SEC2 path keeps its original command ordering. The
FSP path sends SetSystemInfo/SetRegistry after GSP becomes active.
The GSP sequencer only runs for SEC2-based architectures.
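
The dispatch rule can be modeled roughly as follows. This is a
standalone sketch under stated assumptions: BootPath, boot_path() and
runs_gsp_sequencer() are hypothetical names used only to illustrate
the rule, and the local Architecture enum stands in for the driver's.

```rust
/// Stand-in for the driver's architecture enum (assumption).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Architecture {
    Turing,
    Ampere,
    Ada,
    Hopper,
    Blackwell,
}

/// Hypothetical tag for the two boot flows described above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum BootPath {
    /// FWSEC-FRTS plus SEC2 booter (Turing/Ampere/Ada).
    Sec2,
    /// FSP Chain of Trust with FMC firmware (Hopper/Blackwell).
    Fsp,
}

/// Selects the boot flow from the architecture.
fn boot_path(arch: Architecture) -> BootPath {
    match arch {
        Architecture::Turing | Architecture::Ampere | Architecture::Ada => BootPath::Sec2,
        Architecture::Hopper | Architecture::Blackwell => BootPath::Fsp,
    }
}

/// The GSP sequencer only runs on the SEC2-based flow.
fn runs_gsp_sequencer(path: BootPath) -> bool {
    matches!(path, BootPath::Sec2)
}
```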

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/firmware/fsp.rs |   2 -
 drivers/gpu/nova-core/fsp.rs          |   5 -
 drivers/gpu/nova-core/gsp/boot.rs     | 181 ++++++++++++++++++++------
 3 files changed, 144 insertions(+), 44 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware/fsp.rs b/drivers/gpu/nova-core/firmware/fsp.rs
index e5059d59a4b7..e981f2316d01 100644
--- a/drivers/gpu/nova-core/firmware/fsp.rs
+++ b/drivers/gpu/nova-core/firmware/fsp.rs
@@ -14,7 +14,6 @@
     gpu::Chipset, //
 };
 
-#[expect(dead_code)]
 pub(crate) struct FspFirmware {
     /// FMC firmware image data (only the "image" ELF section).
     pub(crate) fmc_image: DmaObject,
@@ -23,7 +22,6 @@ pub(crate) struct FspFirmware {
 }
 
 impl FspFirmware {
-    #[expect(dead_code)]
     pub(crate) fn new(
         dev: &device::Device<device::Bound>,
         chipset: Chipset,
diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
index 68bcfe45aec6..06909217e564 100644
--- a/drivers/gpu/nova-core/fsp.rs
+++ b/drivers/gpu/nova-core/fsp.rs
@@ -227,7 +227,6 @@ pub(crate) struct FmcBootArgs<'a> {
 impl<'a> FmcBootArgs<'a> {
     /// Build FMC boot arguments, allocating the DMA-coherent boot parameter
     /// structure that FSP will read.
-    #[expect(dead_code)]
     #[allow(clippy::too_many_arguments)]
     pub(crate) fn new(
         dev: &device::Device<device::Bound>,
@@ -283,7 +282,6 @@ pub(crate) fn new(
 
     /// DMA address of the FMC boot parameters, needed after boot for lockdown
     /// release polling.
-    #[expect(dead_code)]
     pub(crate) fn boot_params_dma_handle(&self) -> u64 {
         self.fmc_boot_params.dma_handle()
     }
@@ -296,7 +294,6 @@ impl Fsp {
     ///
     /// Polls the thermal scratch register until FSP signals boot completion
     /// or timeout occurs.
-    #[expect(dead_code)]
     pub(crate) fn wait_secure_boot(
         dev: &device::Device<device::Bound>,
         bar: &crate::driver::Bar0,
@@ -326,7 +323,6 @@ pub(crate) fn wait_secure_boot(
     ///
     /// Extracts real cryptographic signatures from FMC ELF32 firmware sections.
     /// Returns signatures in a heap-allocated structure to prevent stack overflow.
-    #[expect(dead_code)]
     pub(crate) fn extract_fmc_signatures(
         dev: &device::Device<device::Bound>,
         fmc_fw_data: &[u8],
@@ -393,7 +389,6 @@ pub(crate) fn extract_fmc_signatures(
     ///
     /// Builds the COT message from the pre-configured [`FmcBootArgs`], sends it
     /// to FSP, and waits for the response.
-    #[expect(dead_code)]
     pub(crate) fn boot_fmc(
         dev: &device::Device<device::Bound>,
         bar: &crate::driver::Bar0,
diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
index 7db811e90825..4bd226573b89 100644
--- a/drivers/gpu/nova-core/gsp/boot.rs
+++ b/drivers/gpu/nova-core/gsp/boot.rs
@@ -13,6 +13,7 @@
 use crate::{
     driver::Bar0,
     falcon::{
+        fsp::Fsp as FspEngine,
         gsp::Gsp,
         sec2::Sec2,
         Falcon,
@@ -24,6 +25,7 @@
             BooterFirmware,
             BooterKind, //
         },
+        fsp::FspFirmware,
         fwsec::{
             bootloader::FwsecFirmwareWithBl,
             FwsecCommand,
@@ -32,9 +34,17 @@
         gsp::GspFirmware,
         FIRMWARE_VERSION, //
     },
-    gpu::Chipset,
+    fsp::{
+        FmcBootArgs,
+        Fsp, //
+    },
+    gpu::{
+        Architecture,
+        Chipset, //
+    },
     gsp::{
         commands,
+        fw::LibosMemoryRegionInitArgument,
         sequencer::{
             GspSequencer,
             GspSequencerParams, //
@@ -197,8 +207,83 @@ fn run_booter(
         booter.run(dev, bar, sec2_falcon, wpr_meta)
     }
 
+    /// Boot GSP via SEC2 booter firmware (Turing/Ampere/Ada path).
+    ///
+    /// This path uses FWSEC-FRTS to set up WPR2, then boots GSP directly,
+    /// then uses SEC2 to run the booter firmware.
+    #[allow(clippy::too_many_arguments)]
+    fn boot_via_sec2(
+        dev: &device::Device<device::Bound>,
+        bar: &Bar0,
+        chipset: Chipset,
+        gsp_falcon: &Falcon<Gsp>,
+        sec2_falcon: &Falcon<Sec2>,
+        fb_layout: &FbLayout,
+        libos: &CoherentAllocation<LibosMemoryRegionInitArgument>,
+        wpr_meta: &CoherentAllocation<GspFwWprMeta>,
+    ) -> Result {
+        // Run FWSEC-FRTS to set up the WPR2 region
+        let bios = Vbios::new(dev, bar)?;
+        Self::run_fwsec_frts(dev, chipset, gsp_falcon, bar, &bios, fb_layout)?;
+
+        // Reset and boot GSP before SEC2
+        gsp_falcon.reset(bar)?;
+        let libos_handle = libos.dma_handle();
+        let (mbox0, mbox1) = gsp_falcon.boot(
+            bar,
+            Some(libos_handle as u32),
+            Some((libos_handle >> 32) as u32),
+        )?;
+        dev_dbg!(dev, "GSP MBOX0: {:#x}, MBOX1: {:#x}\n", mbox0, mbox1);
+        dev_dbg!(
+            dev,
+            "Using SEC2 to load and run the booter_load firmware...\n"
+        );
+
+        // Run booter via SEC2
+        Self::run_booter(dev, bar, chipset, sec2_falcon, wpr_meta)
+    }
+
+    /// Boot GSP via FSP Chain of Trust (Hopper/Blackwell+ path).
+    ///
+    /// This path uses FSP to establish a chain of trust and boot GSP-FMC. FSP handles
+    /// the GSP boot internally - no manual GSP reset/boot is needed.
+    fn boot_via_fsp(
+        dev: &device::Device<device::Bound>,
+        bar: &Bar0,
+        chipset: Chipset,
+        gsp_falcon: &Falcon<Gsp>,
+        wpr_meta: &CoherentAllocation<GspFwWprMeta>,
+        libos: &CoherentAllocation<LibosMemoryRegionInitArgument>,
+    ) -> Result {
+        let fsp_falcon = Falcon::<FspEngine>::new(dev, chipset)?;
+
+        Fsp::wait_secure_boot(dev, bar, chipset.arch())?;
+
+        let fsp_fw = FspFirmware::new(dev, chipset, FIRMWARE_VERSION)?;
+
+        let signatures = Fsp::extract_fmc_signatures(dev, &fsp_fw.fmc_full)?;
+
+        let args = FmcBootArgs::new(
+            dev,
+            chipset,
+            &fsp_fw.fmc_image,
+            wpr_meta.dma_handle(),
+            core::mem::size_of::<GspFwWprMeta>() as u32,
+            libos.dma_handle(),
+            false,
+            &signatures,
+        )?;
+
+        Fsp::boot_fmc(dev, bar, &fsp_falcon, &args)?;
+
+        let fmc_boot_params_addr = args.boot_params_dma_handle();
+        Self::wait_for_gsp_lockdown_release(dev, bar, gsp_falcon, fmc_boot_params_addr)?;
+
+        Ok(())
+    }
+
     /// Wait for GSP lockdown to be released after FSP Chain of Trust.
-    #[expect(dead_code)]
     fn wait_for_gsp_lockdown_release(
         dev: &device::Device<device::Bound>,
         bar: &Bar0,
@@ -242,40 +327,49 @@ pub(crate) fn boot(
         sec2_falcon: &Falcon<Sec2>,
     ) -> Result {
         let dev = pdev.as_ref();
-
-        let bios = Vbios::new(dev, bar)?;
+        let uses_sec2 = matches!(
+            chipset.arch(),
+            Architecture::Turing | Architecture::Ampere | Architecture::Ada
+        );
 
         let gsp_fw = KBox::pin_init(GspFirmware::new(dev, chipset, FIRMWARE_VERSION), GFP_KERNEL)?;
 
         let fb_layout = FbLayout::new(chipset, bar, &gsp_fw)?;
         dev_dbg!(dev, "{:#x?}\n", fb_layout);
 
-        Self::run_fwsec_frts(dev, chipset, gsp_falcon, bar, &bios, &fb_layout)?;
-
         let wpr_meta =
             CoherentAllocation::<GspFwWprMeta>::alloc_coherent(dev, 1, GFP_KERNEL | __GFP_ZERO)?;
         dma_write!(wpr_meta, [0]?, GspFwWprMeta::new(&gsp_fw, &fb_layout));
 
-        self.cmdq
-            .send_command(bar, commands::SetSystemInfo::new(pdev, chipset))?;
-        self.cmdq.send_command(bar, commands::SetRegistry::new())?;
+        // Architecture-specific boot path
+        if uses_sec2 {
+            // SEC2 path: send commands before GSP reset/boot (original order).
+            self.cmdq
+                .send_command(bar, commands::SetSystemInfo::new(pdev, chipset))?;
+            self.cmdq.send_command(bar, commands::SetRegistry::new())?;
 
-        gsp_falcon.reset(bar)?;
-        let libos_handle = self.libos.dma_handle();
-        let (mbox0, mbox1) = gsp_falcon.boot(
-            bar,
-            Some(libos_handle as u32),
-            Some((libos_handle >> 32) as u32),
-        )?;
-        dev_dbg!(pdev, "GSP MBOX0: {:#x}, MBOX1: {:#x}\n", mbox0, mbox1);
-
-        dev_dbg!(
-            pdev,
-            "Using SEC2 to load and run the booter_load firmware...\n"
-        );
-
-        Self::run_booter(dev, bar, chipset, sec2_falcon, &wpr_meta)?;
+            Self::boot_via_sec2(
+                dev,
+                bar,
+                chipset,
+                gsp_falcon,
+                sec2_falcon,
+                &fb_layout,
+                &self.libos,
+                &wpr_meta,
+            )?;
+        } else {
+            Self::boot_via_fsp(
+                dev,
+                bar,
+                chipset,
+                gsp_falcon,
+                &wpr_meta,
+                &self.libos,
+            )?;
+        }
 
+        // Common post-boot initialization
         gsp_falcon.write_os_version(bar, gsp_fw.bootloader.app_version);
 
         // Poll for RISC-V to become active before running sequencer
@@ -286,18 +380,31 @@ pub(crate) fn boot(
             Delta::from_secs(5),
         )?;
 
-        dev_dbg!(pdev, "RISC-V active? {}\n", gsp_falcon.is_riscv_active(bar),);
+        dev_dbg!(dev, "RISC-V active? {}\n", gsp_falcon.is_riscv_active(bar));
 
-        // Create and run the GSP sequencer.
-        let seq_params = GspSequencerParams {
-            bootloader_app_version: gsp_fw.bootloader.app_version,
-            libos_dma_handle: libos_handle,
-            gsp_falcon,
-            sec2_falcon,
-            dev: pdev.as_ref().into(),
-            bar,
-        };
-        GspSequencer::run(&mut self.cmdq, seq_params)?;
+        // For FSP path, send commands after GSP becomes active.
+        if matches!(
+            chipset.arch(),
+            Architecture::Hopper | Architecture::Blackwell
+        ) {
+            self.cmdq
+                .send_command(bar, commands::SetSystemInfo::new(pdev, chipset))?;
+            self.cmdq.send_command(bar, commands::SetRegistry::new())?;
+        }
+
+        // SEC2-based architectures need to run the GSP sequencer
+        if uses_sec2 {
+            let libos_handle = self.libos.dma_handle();
+            let seq_params = GspSequencerParams {
+                bootloader_app_version: gsp_fw.bootloader.app_version,
+                libos_dma_handle: libos_handle,
+                gsp_falcon,
+                sec2_falcon,
+                dev: dev.into(),
+                bar,
+            };
+            GspSequencer::run(&mut self.cmdq, seq_params)?;
+        }
 
         // Wait until GSP is fully initialized.
         commands::wait_gsp_init_done(&mut self.cmdq)?;
@@ -305,8 +412,8 @@ pub(crate) fn boot(
         // Obtain and display basic GPU information.
         let info = commands::get_gsp_info(&mut self.cmdq, bar)?;
         match info.gpu_name() {
-            Ok(name) => dev_info!(pdev, "GPU name: {}\n", name),
-            Err(e) => dev_warn!(pdev, "GPU name unavailable: {:?}\n", e),
+            Ok(name) => dev_info!(dev, "GPU name: {}\n", name),
+            Err(e) => dev_warn!(dev, "GPU name unavailable: {:?}\n", e),
         }
 
         Ok(())
-- 
2.53.0
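
The ordering that this patch gives boot() can be summarized in a small standalone sketch. This is not driver code: the step strings and the boot_steps() helper are hypothetical names invented for illustration, and only the Architecture variants and the uses_sec2 test mirror the diff above. It assumes the step order can be read directly from the hunks in gsp/boot.rs.

```rust
/// Stand-in for the driver's `Architecture` enum (only the variants
/// that appear in the patch are listed here).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Architecture {
    Turing,
    Ampere,
    Ada,
    Hopper,
    Blackwell,
}

/// Mirrors the `uses_sec2` test in `boot()`.
fn uses_sec2(arch: Architecture) -> bool {
    matches!(
        arch,
        Architecture::Turing | Architecture::Ampere | Architecture::Ada
    )
}

/// Hypothetical helper: lists the boot steps in the order the patch
/// performs them. The strings are labels, not real function names.
fn boot_steps(arch: Architecture) -> Vec<&'static str> {
    let mut steps = Vec::new();

    if uses_sec2(arch) {
        // SEC2 path: commands are queued before GSP reset/boot.
        steps.extend_from_slice(&[
            "send SetSystemInfo and SetRegistry",
            "run FWSEC-FRTS",
            "reset GSP falcon",
            "boot GSP falcon",
            "run booter via SEC2",
        ]);
    } else {
        // FSP path: FSP boots GSP-FMC, so no manual GSP reset/boot.
        steps.extend_from_slice(&[
            "wait for FSP secure boot",
            "load FMC firmware",
            "extract FMC signatures",
            "build FMC boot args",
            "boot FMC via FSP",
            "wait for GSP lockdown release",
        ]);
    }

    // Common post-boot initialization.
    steps.push("write OS version");
    steps.push("wait for RISC-V active");

    // FSP path sends the commands only after GSP is active.
    if matches!(arch, Architecture::Hopper | Architecture::Blackwell) {
        steps.push("send SetSystemInfo and SetRegistry");
    }

    // Only the SEC2-based architectures run the GSP sequencer.
    if uses_sec2(arch) {
        steps.push("run GSP sequencer");
    }

    steps.push("wait for GSP init done");
    steps
}
```

Calling boot_steps(Architecture::Blackwell) and boot_steps(Architecture::Ampere) side by side shows the two differences that matter for review: where the SetSystemInfo/SetRegistry commands are sent, and whether the sequencer runs at all.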



* Re: [PATCH v7 31/31] gpu: nova-core: Hopper/Blackwell: integrate FSP boot path into boot()
  2026-03-17 22:53 ` [PATCH v7 31/31] gpu: nova-core: Hopper/Blackwell: integrate FSP boot path into boot() John Hubbard
@ 2026-03-18 17:02   ` kernel test robot
  2026-03-18 17:59     ` John Hubbard
  0 siblings, 1 reply; 66+ messages in thread
From: kernel test robot @ 2026-03-18 17:02 UTC (permalink / raw)
  To: John Hubbard, Danilo Krummrich, Alexandre Courbot
  Cc: oe-kbuild-all, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML,
	John Hubbard

Hi John,

kernel test robot noticed the following build errors:

[auto build test ERROR on d19ab42867ae7c68be84ed957d95712b7934773f]

url:    https://github.com/intel-lab-lkp/linux/commits/John-Hubbard/gpu-nova-core-Hopper-Blackwell-basic-GPU-identification/20260318-203344
base:   d19ab42867ae7c68be84ed957d95712b7934773f
patch link:    https://lore.kernel.org/r/20260317225355.549853-32-jhubbard%40nvidia.com
patch subject: [PATCH v7 31/31] gpu: nova-core: Hopper/Blackwell: integrate FSP boot path into boot()
config: x86_64-rhel-9.4-rust (https://download.01.org/0day-ci/archive/20260318/202603181705.C57qvOSk-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
rustc: rustc 1.88.0 (6b00bc388 2025-06-23)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260318/202603181705.C57qvOSk-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202603181705.C57qvOSk-lkp@intel.com/

All errors (new ones prefixed by >>):

   PATH=/opt/cross/clang-20/bin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
   INFO PATH=/opt/cross/rustc-1.88.0-bindgen-0.72.1/cargo/bin:/opt/cross/clang-20/bin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
   /usr/bin/timeout -k 100 12h /usr/bin/make KCFLAGS=\ -fno-crash-diagnostics\ -Wno-error=return-type\ -Wreturn-type\ -funsigned-char\ -Wundef\ -falign-functions=64 W=1 --keep-going LLVM=1 -j32 -C source O=/kbuild/obj/consumer/x86_64-rhel-9.4-rust ARCH=x86_64 SHELL=/bin/bash rustfmtcheck 
   make: Entering directory '/kbuild/src/consumer'
   make[1]: Entering directory '/kbuild/obj/consumer/x86_64-rhel-9.4-rust'
>> Diff in drivers/gpu/nova-core/gsp/boot.rs:359:
                    &wpr_meta,
                )?;
            } else {
   -            Self::boot_via_fsp(
   -                dev,
   -                bar,
   -                chipset,
   -                gsp_falcon,
   -                &wpr_meta,
   -                &self.libos,
   -            )?;
   +            Self::boot_via_fsp(dev, bar, chipset, gsp_falcon, &wpr_meta, &self.libos)?;
            }
    
            // Common post-boot initialization
   Diff in drivers/gpu/nova-core/gsp/fw.rs:139:
        fn base_rm_size(chipset: Chipset) -> u64 {
            use crate::gpu::Architecture;
            match chipset.arch() {
   -            Architecture::Hopper | Architecture::Blackwell => {
   -                GSP_FW_HEAP_PARAM_BASE_RM_SIZE_GH100
   -            }
   +            Architecture::Hopper | Architecture::Blackwell => GSP_FW_HEAP_PARAM_BASE_RM_SIZE_GH100,
                _ => u64::from(bindings::GSP_FW_HEAP_PARAM_BASE_RM_SIZE_TU10X),
            }
        }
   make[2]: *** [Makefile:1916: rustfmt] Error 123
   make[2]: Target 'rustfmtcheck' not remade because of errors.
   make[1]: Leaving directory '/kbuild/obj/consumer/x86_64-rhel-9.4-rust'
   make[1]: *** [Makefile:248: __sub-make] Error 2
   make[1]: Target 'rustfmtcheck' not remade because of errors.
   make: *** [Makefile:248: __sub-make] Error 2
   make: Target 'rustfmtcheck' not remade because of errors.
   make: Leaving directory '/kbuild/src/consumer'

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH v7 31/31] gpu: nova-core: Hopper/Blackwell: integrate FSP boot path into boot()
  2026-03-18 17:02   ` kernel test robot
@ 2026-03-18 17:59     ` John Hubbard
  0 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-18 17:59 UTC (permalink / raw)
  To: kernel test robot, Danilo Krummrich, Alexandre Courbot
  Cc: oe-kbuild-all, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On 3/18/26 10:02 AM, kernel test robot wrote:
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202603181705.C57qvOSk-lkp@intel.com/
> 

Fixed in v8, along with the gsp/fw.rs issue reported against patch
27/31.

thanks,
-- 
John Hubbard



* Re: [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support
  2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (30 preceding siblings ...)
  2026-03-17 22:53 ` [PATCH v7 31/31] gpu: nova-core: Hopper/Blackwell: integrate FSP boot path into boot() John Hubbard
@ 2026-03-18 20:25 ` John Hubbard
  31 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-03-18 20:25 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML

On 3/17/26 3:53 PM, John Hubbard wrote:
> This is based on today's drm-rust-next, which has Alex's register!()
> macro series. A git branch is here:
> 
>     https://github.com/johnhubbard/linux/tree/nova-core-blackwell-v7

Hi Alex, Danilo,

I've fixed up patches 27 and 31 with the "make rustfmt" changes that
the kernel bot reported, and that's standing by here:

    https://github.com/johnhubbard/linux/commits/nova-core-blackwell-v8/

Otherwise, I'm hoping that we're about ready to merge, on top of
Alex's register!() patchset [1].

Let me know if or when I should post anything, such as v8.

[1] https://lore.kernel.org/20260318-b4-nova-register-v1-0-22a358aa4c63@nvidia.com

thanks,
-- 
John Hubbard

> 
> It's been re-tested on Turing, Ampere and Blackwell:
> 
>     NovaCore 0000:e1:00.0: GPU name: NVIDIA GeForce GTX 1650
>     NovaCore 0000:e1:00.0: GPU name: NVIDIA RTX A4000
>     NovaCore 0000:01:00.0: GPU name: NVIDIA RTX PRO 6000 Blackwell Max-Q
>     Workstation Edition
> 
> Changes in v7:
> * Rebased onto Alexandre Courbot's rust register!() series in
>   drm-rust-next, including the related generic I/O accessor and
>   IoCapable changes.
> 
> * Rebased onto drm-rust-next (v7.0-rc4 based).
> 
> * Dropped the v6 patches that are already in drm-rust-next: the
>   aux-device fix, the pdev helper macro patch, and the one-item-per-line
>   use cleanup.
> 
> * Reworked the GPU init pieces per review. DMA mask setup now stays in
>   driver probe, with the mask width selected by GPU architecture, and
>   the GFW boot policy now lives in a dedicated GPU HAL.
> 
> * Reworked firmware image parsing per review around a single ElfFormat
>   trait with associated header types. Also added support for both ELF32
>   and ELF64 images, with automatic format detection.
> 
> * Reworked the MCTP/NVDM protocol code to use bitfield! and typed
>   accessors, removing the open-coded bit handling.
> 
> * Reworked the FSP messaging part of the series so that the message
>   structures are introduced in the first patches that use them, instead
>   of as a standalone dead-code-only patch. Also changed fmc_full to use
>   KVec<u8> from the start.
> 
> * Split the WPR heap overflow handling out into a separate prep patch.
>   That patch makes management_overhead() and wpr_heap_size() fallible,
>   uses checked arithmetic, and leaves the larger WPR2 heap patch with
>   only the Hopper and Blackwell sizing changes.
> 
> * Added a code comment documenting the Hopper and Blackwell PCI config
>   mirror base change.
> 
> Changes in v6:
> 
> * Rebased onto drm-rust-next (v7.0-rc1 based).
> 
> * Dropped the first two patches from v5 (aux device fix and pdev
>   macros), which have since been merged independently.
> 
> * const_align_up(): reworked per review from Gary Guo, Miguel Ojeda,
>   and Danilo Krummrich: now returns Option<usize> instead of panicking,
>   takes an Alignment argument instead of a const generic, and no longer
>   needs the inline_const feature addition in scripts/Makefile.build.
> 
> * The rust/sizes and SZ_*_U64 patches from v5 are no longer included.
>   I plan to post those as a separate series that depends on this one.
> 
> Changes in v5:
> 
> * Rebased onto linux.git master.
> 
> * Split MCTP protocol into its own module and file.
> 
> * Many Rust-based improvements: more use of types, especially. Also
>   used Result and Option more.
> 
> * Lots of cleanup of comments and print output and error handling.
> 
> * Added const_align_up() to rust/ and used it in nova-core. This
>   required enabling a Rust feature: inline_const, as recommended by
>   Miguel Ojeda.
> 
> * Refactoring various things, such as Gpu::new() to own Spec creation,
>   and several more such things.
> 
> * Fixed three Delta::ZERO busy-polls (patches 21, 24, 31) to use
>   non-zero sleep intervals (after just realizing that it was a bad
>   choice to have zero in there).
> 
> * Reduced GH100/GB100 HAL duplication. Made FSP_PKEY_SIZE/FSP_SIG_SIZE
>   consistent across patches. Replaced fragile architecture checks with
>   chipset.arch(). Renamed LIBOS_BLACKWELL.
> 
> * Narrowed the scope of some of the #![expect(dead_code)] cases,
>   although that really only matters within the series, not once it is
>   fully applied.
> 
> John Hubbard (31):
>   gpu: nova-core: Hopper/Blackwell: basic GPU identification
>   gpu: nova-core: factor .fwsignature* selection into a new
>     find_gsp_sigs_section()
>   gpu: nova-core: use GPU Architecture to simplify HAL selections
>   gpu: nova-core: move GPU init into Gpu::new()
>   gpu: nova-core: set DMA mask width based on GPU architecture
>   gpu: nova-core: Hopper/Blackwell: skip GFW boot waiting
>   gpu: nova-core: move firmware image parsing code to firmware.rs
>   gpu: nova-core: factor out an elf_str() function
>   gpu: nova-core: don't assume 64-bit firmware images
>   gpu: nova-core: add support for 32-bit firmware images
>   gpu: nova-core: add auto-detection of 32-bit, 64-bit firmware images
>   gpu: nova-core: Hopper/Blackwell: add FMC firmware image, in support
>     of FSP
>   gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub
>   gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations
>   gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure
>   rust: ptr: add const_align_up()
>   gpu: nova-core: Hopper/Blackwell: calculate reserved FB heap size
>   gpu: nova-core: add MCTP/NVDM protocol types for firmware
>     communication
>   gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion
>     waiting
>   gpu: nova-core: Hopper/Blackwell: add FMC signature extraction
>   gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging
>   gpu: nova-core: Hopper/Blackwell: add FspCotVersion type
>   gpu: nova-core: Hopper/Blackwell: larger non-WPR heap
>   gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot
>   gpu: nova-core: Blackwell: use correct sysmem flush registers
>   gpu: nova-core: make WPR heap sizing fallible
>   gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap
>   gpu: nova-core: refactor SEC2 booter loading into
>     BooterFirmware::run()
>   gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling
>   gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror
>   gpu: nova-core: Hopper/Blackwell: integrate FSP boot path into boot()
> 
>  drivers/gpu/nova-core/driver.rs          |  28 +-
>  drivers/gpu/nova-core/falcon.rs          |   1 +
>  drivers/gpu/nova-core/falcon/fsp.rs      | 220 ++++++++++
>  drivers/gpu/nova-core/falcon/hal.rs      |  20 +-
>  drivers/gpu/nova-core/fb.rs              |  26 +-
>  drivers/gpu/nova-core/fb/hal.rs          |  38 +-
>  drivers/gpu/nova-core/fb/hal/ga102.rs    |   2 +-
>  drivers/gpu/nova-core/fb/hal/gb100.rs    |  75 ++++
>  drivers/gpu/nova-core/fb/hal/gb202.rs    |  62 +++
>  drivers/gpu/nova-core/fb/hal/gh100.rs    |  38 ++
>  drivers/gpu/nova-core/firmware.rs        | 204 +++++++++
>  drivers/gpu/nova-core/firmware/booter.rs |  35 +-
>  drivers/gpu/nova-core/firmware/fsp.rs    |  47 ++
>  drivers/gpu/nova-core/firmware/gsp.rs    | 128 ++----
>  drivers/gpu/nova-core/fsp.rs             | 527 +++++++++++++++++++++++
>  drivers/gpu/nova-core/gpu.rs             |  86 +++-
>  drivers/gpu/nova-core/gpu/hal.rs         |  54 +++
>  drivers/gpu/nova-core/gsp/boot.rs        | 298 ++++++++++---
>  drivers/gpu/nova-core/gsp/commands.rs    |   8 +-
>  drivers/gpu/nova-core/gsp/fw.rs          |  83 +++-
>  drivers/gpu/nova-core/gsp/fw/commands.rs |  20 +-
>  drivers/gpu/nova-core/mctp.rs            | 119 +++++
>  drivers/gpu/nova-core/nova_core.rs       |   2 +
>  drivers/gpu/nova-core/regs.rs            |  96 +++++
>  rust/kernel/ptr.rs                       |  24 ++
>  25 files changed, 2001 insertions(+), 240 deletions(-)
>  create mode 100644 drivers/gpu/nova-core/falcon/fsp.rs
>  create mode 100644 drivers/gpu/nova-core/fb/hal/gb100.rs
>  create mode 100644 drivers/gpu/nova-core/fb/hal/gb202.rs
>  create mode 100644 drivers/gpu/nova-core/fb/hal/gh100.rs
>  create mode 100644 drivers/gpu/nova-core/firmware/fsp.rs
>  create mode 100644 drivers/gpu/nova-core/fsp.rs
>  create mode 100644 drivers/gpu/nova-core/gpu/hal.rs
>  create mode 100644 drivers/gpu/nova-core/mctp.rs
> 
> 
> base-commit: d19ab42867ae7c68be84ed957d95712b7934773f
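
The v6 changelog quoted above says const_align_up() now returns Option<usize> instead of panicking and takes an Alignment argument. A minimal standalone sketch of that shape follows. It is an assumption-laden illustration, not the code in rust/kernel/ptr.rs: the Alignment newtype here is a local stand-in, and the body is just one plausible way to implement an overflow-checked round-up in a const fn.

```rust
/// Local stand-in for an alignment type; holds a power of two.
/// (Illustrative only, not the kernel's `Alignment`.)
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Alignment(usize);

impl Alignment {
    /// Returns `None` unless `align` is a non-zero power of two.
    pub const fn new(align: usize) -> Option<Self> {
        if align.is_power_of_two() {
            Some(Self(align))
        } else {
            None
        }
    }

    /// The alignment as a plain integer.
    pub const fn as_usize(self) -> usize {
        self.0
    }
}

/// Rounds `value` up to the next multiple of `align`.
///
/// Returns `None` if the rounding would overflow `usize`, rather than
/// panicking, so callers in const or fallible contexts can decide how
/// to handle it.
pub const fn const_align_up(value: usize, align: Alignment) -> Option<usize> {
    // `align` is a power of two, so `align - 1` is a mask of low bits.
    let mask = align.as_usize() - 1;
    match value.checked_add(mask) {
        Some(v) => Some(v & !mask),
        None => None,
    }
}
```

Because the alignment is validated when the Alignment value is built, the round-up itself has only one failure mode left, which is arithmetic overflow.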




end of thread, other threads:[~2026-03-25 11:19 UTC | newest]

Thread overview: 66+ messages
2026-03-17 22:53 [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
2026-03-17 22:53 ` [PATCH v7 01/31] gpu: nova-core: Hopper/Blackwell: basic GPU identification John Hubbard
2026-03-17 22:53 ` [PATCH v7 02/31] gpu: nova-core: factor .fwsignature* selection into a new find_gsp_sigs_section() John Hubbard
2026-03-17 22:53 ` [PATCH v7 03/31] gpu: nova-core: use GPU Architecture to simplify HAL selections John Hubbard
2026-03-17 22:53 ` [PATCH v7 04/31] gpu: nova-core: move GPU init into Gpu::new() John Hubbard
2026-03-23 12:45   ` Alexandre Courbot
2026-03-25  3:23     ` John Hubbard
2026-03-17 22:53 ` [PATCH v7 05/31] gpu: nova-core: set DMA mask width based on GPU architecture John Hubbard
2026-03-23 13:02   ` Alexandre Courbot
2026-03-25  3:26     ` John Hubbard
2026-03-17 22:53 ` [PATCH v7 06/31] gpu: nova-core: Hopper/Blackwell: skip GFW boot waiting John Hubbard
2026-03-23 13:13   ` Alexandre Courbot
2026-03-25  3:26     ` John Hubbard
2026-03-17 22:53 ` [PATCH v7 07/31] gpu: nova-core: move firmware image parsing code to firmware.rs John Hubbard
2026-03-23 13:19   ` Alexandre Courbot
2026-03-25  3:30     ` John Hubbard
2026-03-25 11:06       ` Alexandre Courbot
2026-03-25 11:18         ` Miguel Ojeda
2026-03-25 11:16       ` Miguel Ojeda
2026-03-17 22:53 ` [PATCH v7 08/31] gpu: nova-core: factor out an elf_str() function John Hubbard
2026-03-17 22:53 ` [PATCH v7 09/31] gpu: nova-core: don't assume 64-bit firmware images John Hubbard
2026-03-17 22:53 ` [PATCH v7 10/31] gpu: nova-core: add support for 32-bit " John Hubbard
2026-03-17 22:53 ` [PATCH v7 11/31] gpu: nova-core: add auto-detection of 32-bit, 64-bit " John Hubbard
2026-03-17 22:53 ` [PATCH v7 12/31] gpu: nova-core: Hopper/Blackwell: add FMC firmware image, in support of FSP John Hubbard
2026-03-17 22:53 ` [PATCH v7 13/31] gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub John Hubbard
2026-03-17 22:53 ` [PATCH v7 14/31] gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations John Hubbard
2026-03-17 22:53 ` [PATCH v7 15/31] gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure John Hubbard
2026-03-17 22:53 ` [PATCH v7 16/31] rust: ptr: add const_align_up() John Hubbard
2026-03-20  8:37   ` David Rheinsberg
2026-03-20  8:44     ` Alice Ryhl
2026-03-20  8:58       ` David Rheinsberg
2026-03-20  9:03         ` Alice Ryhl
2026-03-20  9:26           ` David Rheinsberg
2026-03-20  9:47             ` Alice Ryhl
2026-03-20 10:27               ` David Rheinsberg
2026-03-20 11:12                 ` Alice Ryhl
2026-03-20 13:14                   ` David Rheinsberg
2026-03-20 13:16                     ` Miguel Ojeda
2026-03-20 13:26                       ` Alice Ryhl
2026-03-20  9:48   ` Alice Ryhl
2026-03-20 13:36     ` Gary Guo
2026-03-17 22:53 ` [PATCH v7 17/31] gpu: nova-core: Hopper/Blackwell: calculate reserved FB heap size John Hubbard
2026-03-17 22:53 ` [PATCH v7 18/31] gpu: nova-core: add MCTP/NVDM protocol types for firmware communication John Hubbard
2026-03-18  0:01   ` John Hubbard
2026-03-18  0:21     ` Danilo Krummrich
2026-03-18  0:56       ` Alexandre Courbot
2026-03-18 12:36       ` Gary Guo
2026-03-18 19:14         ` John Hubbard
2026-03-17 22:53 ` [PATCH v7 19/31] gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting John Hubbard
2026-03-17 22:53 ` [PATCH v7 20/31] gpu: nova-core: Hopper/Blackwell: add FMC signature extraction John Hubbard
2026-03-17 22:53 ` [PATCH v7 21/31] gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging John Hubbard
2026-03-17 22:53 ` [PATCH v7 22/31] gpu: nova-core: Hopper/Blackwell: add FspCotVersion type John Hubbard
2026-03-17 22:53 ` [PATCH v7 23/31] gpu: nova-core: Hopper/Blackwell: larger non-WPR heap John Hubbard
2026-03-17 22:53 ` [PATCH v7 24/31] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot John Hubbard
2026-03-17 22:53 ` [PATCH v7 25/31] gpu: nova-core: Blackwell: use correct sysmem flush registers John Hubbard
2026-03-17 22:53 ` [PATCH v7 26/31] gpu: nova-core: make WPR heap sizing fallible John Hubbard
2026-03-17 22:53 ` [PATCH v7 27/31] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap John Hubbard
2026-03-18 16:12   ` kernel test robot
2026-03-18 17:59     ` John Hubbard
2026-03-17 22:53 ` [PATCH v7 28/31] gpu: nova-core: refactor SEC2 booter loading into BooterFirmware::run() John Hubbard
2026-03-17 22:53 ` [PATCH v7 29/31] gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling John Hubbard
2026-03-17 22:53 ` [PATCH v7 30/31] gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror John Hubbard
2026-03-17 22:53 ` [PATCH v7 31/31] gpu: nova-core: Hopper/Blackwell: integrate FSP boot path into boot() John Hubbard
2026-03-18 17:02   ` kernel test robot
2026-03-18 17:59     ` John Hubbard
2026-03-18 20:25 ` [PATCH v7 00/31] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
