[PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support

public inbox for linux-kernel@vger.kernel.org

* [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support
@ 2026-02-10  2:45 John Hubbard
  2026-02-10  2:45 ` [PATCH v4 01/33] gpu: nova-core: pass pdev directly to dev_* logging macros John Hubbard
                   ` (33 more replies)
  0 siblings, 34 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Hi,

This is based on the Feb 5, 2026 linux-next: commit 9845cf73f7db ("Add
linux-next specific files for 20260205"). That's new enough to have the
pdev.as_ref() changes (see below for details), but not so new as to
include the current merge window churn for Linux 7.0.

I've re-tested on Ampere (GA104) and Blackwell (GB202) RTX GPUs.

Data center GPUs remain as TODO items: GA100 needs some additional code,
Hopper/GH100 might work but is not yet tested, and I haven't even
thought about Blackwell data center GPUs.

So, even though many patches say Hopper/Blackwell, there may be some
test-and-fix work remaining there.

Changes in v4:

* Fixed the IOMMU page faults on address 0x0 that I was seeing on v3 and
  earlier, for the iommu enabled case. These were due to the sysmem
  flush buffer being in a different location for Blackwell, so I've
  HAL-ified that aspect.

* Added a patch (0001) to pass pdev directly to dev_* logging macros.
  Then converted the remaining patches to also use pdev directly,
  instead of pdev.as_ref(). This is only possible in branches that have
  commit a38cd1fea989 ("rust: device: support `dev_printk` on all
  devices"), which in turn is why this v4 is based on a linux-next
  commit.

* Changed FmcSignatures fields from [u32; N] to [u8; N] arrays because
  the data is not treated as 32-bit integers. This eliminates the need
  for .as_bytes_mut() in the FMC signature extraction patch and allows
  using named constants like [u8; FSP_HASH_SIZE]. (From Timur Tabi's
  review.)

* Changed .unwrap_or(u64::MAX) to .expect("...") for alignment overflow
  in client_alloc_size() and management_overhead(). A panic is warranted
  here since the values are compile-time constants and overflow is
  impossible. (From Timur Tabi's review.)

* Added a patch at the end that I actually expect will get merged
  earlier, separately. But for now, it avoids nova-drm aux bus
  registration failure on multi-GPU systems, which in turn keeps the
  driver alive, which in turn avoids a driver teardown missing feature
  (pre-existing), which in turn avoids IOMMU page faults at non-zero
  addresses. whew. :)
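
To illustrate the .unwrap_or(u64::MAX) to .expect("...") change listed
above, here is a minimal standalone sketch. It is not the nova-core
code: align_up_checked(), EXAMPLE_SIZE and EXAMPLE_ALIGN are
hypothetical names with made-up values, chosen only to show why a panic
is the more honest reaction when the inputs are constants.

```rust
/// Hypothetical helper: round `value` up to a power-of-two `align`,
/// returning `None` if the intermediate addition overflows.
const fn align_up_checked(value: u64, align: u64) -> Option<u64> {
    let mask = align - 1;
    match value.checked_add(mask) {
        Some(v) => Some(v & !mask),
        None => None,
    }
}

/// Made-up values, for illustration only.
const EXAMPLE_SIZE: u64 = 0x1234;
const EXAMPLE_ALIGN: u64 = 0x1000;

/// Old style: an overflow would silently become a huge, wrong size.
fn aligned_size_old() -> u64 {
    align_up_checked(EXAMPLE_SIZE, EXAMPLE_ALIGN).unwrap_or(u64::MAX)
}

/// New style: overflow cannot happen for these constants, so if it ever
/// did, that would be a programming error worth a panic.
fn aligned_size_new() -> u64 {
    align_up_checked(EXAMPLE_SIZE, EXAMPLE_ALIGN)
        .expect("aligning a small constant cannot overflow")
}
```

Both functions return the same value for valid inputs. They differ only
in how an impossible overflow would surface.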

Changes in v3:

* Rebased onto linux-next (20260205), which includes several
  rust-for-linux updates that affected nova-core.

* Removed redundant .as_ref() from dev_*!() macro call sites, since the
  dev_printk!() macro now calls .as_ref() internally (Gary Guo's
  "remove redundant .as_ref() for dev_* print" series).

* Added a `use kernel::io::Io` import in regs.rs, needed after the
  upstream separation of generic I/O helpers from the MMIO
  implementation.

Changes in v2:

v2 is here:
    https://lore.kernel.org/20260131005604.454172-1-jhubbard@nvidia.com

* GA100 (an Ampere chip whose firmware boot steps are closer to Turing
  than to other Amperes) returns ENOTSUPP for now because it is *known*
  to not work yet.

* FSP: use the new Chipset::fsp_cot_version() method instead of a
  hardcoded constant. This fixes a known wrongness on GH100.

* Changed to a HAL approach to handle the slightly different non-WPR
  heap sizes, for Hopper vs. Blackwell.

* Return Option instead of Result from get_gsp_sigs_section() since
  the failure case is simply "not found".

* Return DmaMask directly from dma_mask() instead of returning a bit
  count.

* Change fmc_full from DmaObject to KVec<u8> since it's only used for
  CPU-side signature extraction and is never submitted to hardware
  (only fmc_image is). This eliminates the need for unsafe code and
  the associated SAFETY comment entirely.

* Use as_bytes_mut() instead of unsafe core::slice::from_raw_parts_mut()
  for copying FMC signature data (hash, public_key, signature arrays).

* Refactor wait_for_gsp_lockdown_release() to use early return with ?
  instead of chained .inspect_err().map().and_then() pattern.

* Removed many dev_dbg! statements.

* Use IEC binary prefix "MiB" instead of "MB" for memory size output.
  Also improved display of small sizes (e.g., "24 KiB" instead of
  "0 MB") and fixed a typo ("suprising" -> "surprising").

* Reordered the "skip GFW boot waiting" commit to appear earlier in the
  series.

* Series has been reduced from 31 to 30 patches, because the "needs
  large reserved mem" patch was absorbed into the non-WPR heap size
  patch.
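
As a rough illustration of the wait_for_gsp_lockdown_release() refactor
mentioned above (early return with ? instead of a chained
.inspect_err().map().and_then()), here is a standalone sketch. All
names here (BootError, poll_status(), check_status()) are hypothetical
stand-ins, not the driver's functions.

```rust
/// Hypothetical error type standing in for the kernel's `Error`.
#[derive(Debug, PartialEq)]
enum BootError {
    Timeout,
    BadStatus(u32),
}

/// Hypothetical stand-in for polling a hardware register.
fn poll_status(ready: bool) -> Result<u32, BootError> {
    if ready { Ok(0) } else { Err(BootError::Timeout) }
}

/// Hypothetical check of the value that was read.
fn check_status(status: u32) -> Result<(), BootError> {
    if status == 0 { Ok(()) } else { Err(BootError::BadStatus(status)) }
}

/// Chained style: the whole flow is one expression.
fn wait_chained(ready: bool) -> Result<(), BootError> {
    poll_status(ready)
        .inspect_err(|e| eprintln!("poll failed: {e:?}"))
        .map(|status| status & 0xff)
        .and_then(check_status)
}

/// Early-return style: bail out with `?`, then continue with plain
/// statements on the success path.
fn wait_early_return(ready: bool) -> Result<(), BootError> {
    let status = poll_status(ready).inspect_err(|e| eprintln!("poll failed: {e:?}"))?;

    check_status(status & 0xff)
}
```

The two versions behave identically. The second tends to read more
easily once more steps are added to the success path.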

John Hubbard (33):
  gpu: nova-core: pass pdev directly to dev_* logging macros
  gpu: nova-core: print FB sizes, along with ranges
  gpu: nova-core: add FbRange.len() and use it in boot.rs
  gpu: nova-core: Hopper/Blackwell: basic GPU identification
  gpu: nova-core: factor .fwsignature* selection into a new
    get_gsp_sigs_section()
  gpu: nova-core: use GPU Architecture to simplify HAL selections
  gpu: nova-core: apply the one "use" item per line policy to
    commands.rs
  gpu: nova-core: set DMA mask width based on GPU architecture
  gpu: nova-core: Hopper/Blackwell: skip GFW boot waiting
  gpu: nova-core: move firmware image parsing code to firmware.rs
  gpu: nova-core: factor out a section_name_eq() function
  gpu: nova-core: don't assume 64-bit firmware images
  gpu: nova-core: add support for 32-bit firmware images
  gpu: nova-core: add auto-detection of 32-bit, 64-bit firmware images
  gpu: nova-core: Hopper/Blackwell: add FMC firmware image, in support
    of FSP
  gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub
  gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations
  gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure
  gpu: nova-core: Hopper/Blackwell: calculate reserved FB heap size
  gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion
    waiting
  gpu: nova-core: Hopper/Blackwell: add FSP message structures
  gpu: nova-core: Hopper/Blackwell: add FMC signature extraction
  gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging
  gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot
  gpu: nova-core: Hopper/Blackwell: larger non-WPR heap
  gpu: nova-core: Blackwell: use correct sysmem flush registers
  gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap
  gpu: nova-core: refactor SEC2 booter loading into run_booter() helper
  gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling
  gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot path
  gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror
  gpu: nova-core: clarify the GPU firmware boot steps
  gpu: nova-core: fix aux device registration for multi-GPU systems

 drivers/gpu/nova-core/driver.rs          |  48 +-
 drivers/gpu/nova-core/falcon.rs          |   1 +
 drivers/gpu/nova-core/falcon/fsp.rs      | 160 +++++++
 drivers/gpu/nova-core/falcon/hal.rs      |  20 +-
 drivers/gpu/nova-core/fb.rs              | 118 ++++-
 drivers/gpu/nova-core/fb/hal.rs          |  34 +-
 drivers/gpu/nova-core/fb/hal/ga102.rs    |   2 +-
 drivers/gpu/nova-core/fb/hal/gb100.rs    |  73 +++
 drivers/gpu/nova-core/fb/hal/gb202.rs    |  62 +++
 drivers/gpu/nova-core/fb/hal/gh100.rs    |  37 ++
 drivers/gpu/nova-core/firmware.rs        | 186 ++++++++
 drivers/gpu/nova-core/firmware/fsp.rs    |  47 ++
 drivers/gpu/nova-core/firmware/gsp.rs    | 140 ++----
 drivers/gpu/nova-core/fsp.rs             | 561 +++++++++++++++++++++++
 drivers/gpu/nova-core/gpu.rs             |  87 +++-
 drivers/gpu/nova-core/gsp/boot.rs        | 337 +++++++++++---
 drivers/gpu/nova-core/gsp/commands.rs    |   8 +-
 drivers/gpu/nova-core/gsp/fw.rs          |  63 ++-
 drivers/gpu/nova-core/gsp/fw/commands.rs |  32 +-
 drivers/gpu/nova-core/nova_core.rs       |   1 +
 drivers/gpu/nova-core/num.rs             |  10 +
 drivers/gpu/nova-core/regs.rs            |  95 ++++
 22 files changed, 1856 insertions(+), 266 deletions(-)
 create mode 100644 drivers/gpu/nova-core/falcon/fsp.rs
 create mode 100644 drivers/gpu/nova-core/fb/hal/gb100.rs
 create mode 100644 drivers/gpu/nova-core/fb/hal/gb202.rs
 create mode 100644 drivers/gpu/nova-core/fb/hal/gh100.rs
 create mode 100644 drivers/gpu/nova-core/firmware/fsp.rs
 create mode 100644 drivers/gpu/nova-core/fsp.rs

base-commit: 9845cf73f7db6094c0d8419d6adb848028f4a921
-- 
2.53.0


* [PATCH v4 01/33] gpu: nova-core: pass pdev directly to dev_* logging macros
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-11 10:06   ` Danilo Krummrich
  2026-02-10  2:45 ` [PATCH v4 02/33] gpu: nova-core: print FB sizes, along with ranges John Hubbard
                   ` (32 subsequent siblings)
  33 siblings, 1 reply; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

The dev_dbg!, dev_info!, dev_err!, and dev_warn! macros now accept
pci::Device directly without requiring an explicit .as_ref()
conversion to device::Device, thanks to commit a38cd1fea989
("rust: device: support `dev_printk` on all devices").
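
For readers unfamiliar with the mechanism: the cover letter notes that
the logging macros now call .as_ref() internally. A toy, userspace-only
sketch of that idea follows. Device, PciDevice, format_for() and
dev_msg!() are hypothetical stand-ins, not the kernel's types or macros,
and the real implementation may differ.

```rust
/// Hypothetical stand-in for the generic `device::Device`.
struct Device {
    name: &'static str,
}

/// Hypothetical stand-in for `pci::Device`, which wraps a generic device.
struct PciDevice {
    dev: Device,
}

impl AsRef<Device> for PciDevice {
    fn as_ref(&self) -> &Device {
        &self.dev
    }
}

impl AsRef<Device> for Device {
    fn as_ref(&self) -> &Device {
        self
    }
}

/// Hypothetical helper that prefixes a message with the device name.
fn format_for(dev: &Device, msg: &str) -> String {
    format!("{}: {}", dev.name, msg)
}

/// Toy macro: the conversion to `&Device` happens inside the macro, so
/// call sites can pass any type that implements `AsRef<Device>`.
macro_rules! dev_msg {
    ($dev:expr, $($arg:tt)*) => {
        format_for(AsRef::<Device>::as_ref($dev), &format!($($arg)*))
    };
}
```

With this shape, a call site can pass either &PciDevice or &Device, and
an explicit .as_ref() at the call site becomes redundant.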

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/driver.rs   |  2 +-
 drivers/gpu/nova-core/gpu.rs      |  4 ++--
 drivers/gpu/nova-core/gsp/boot.rs | 14 +++++++-------
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs
index 5a4cc047bcfc..e39885c0d5ca 100644
--- a/drivers/gpu/nova-core/driver.rs
+++ b/drivers/gpu/nova-core/driver.rs
@@ -70,7 +70,7 @@ impl pci::Driver for NovaCore {
 
     fn probe(pdev: &pci::Device<Core>, _info: &Self::IdInfo) -> impl PinInit<Self, Error> {
         pin_init::pin_init_scope(move || {
-            dev_dbg!(pdev.as_ref(), "Probe Nova Core GPU driver.\n");
+            dev_dbg!(pdev, "Probe Nova Core GPU driver.\n");
 
             pdev.enable_device_mem()?;
             pdev.set_master();
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 9b042ef1a308..f5907c31a66d 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -262,13 +262,13 @@ pub(crate) fn new<'a>(
     ) -> impl PinInit<Self, Error> + 'a {
         try_pin_init!(Self {
             spec: Spec::new(pdev.as_ref(), bar).inspect(|spec| {
-                dev_info!(pdev.as_ref(),"NVIDIA ({})\n", spec);
+                dev_info!(pdev, "NVIDIA ({})\n", spec);
             })?,
 
             // We must wait for GFW_BOOT completion before doing any significant setup on the GPU.
             _: {
                 gfw::wait_gfw_boot_completion(bar)
-                    .inspect_err(|_| dev_err!(pdev.as_ref(), "GFW boot did not complete\n"))?;
+                    .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
             },
 
             sysmem_flush: SysmemFlush::register(pdev.as_ref(), bar, spec.chipset)?,
diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
index be427fe26a58..bd6e6dc57e85 100644
--- a/drivers/gpu/nova-core/gsp/boot.rs
+++ b/drivers/gpu/nova-core/gsp/boot.rs
@@ -171,14 +171,14 @@ pub(crate) fn boot(
             Some((libos_handle >> 32) as u32),
         )?;
         dev_dbg!(
-            pdev.as_ref(),
+            pdev,
             "GSP MBOX0: {:#x}, MBOX1: {:#x}\n",
             mbox0,
             mbox1
         );
 
         dev_dbg!(
-            pdev.as_ref(),
+            pdev,
             "Using SEC2 to load and run the booter_load firmware...\n"
         );
 
@@ -191,7 +191,7 @@ pub(crate) fn boot(
             Some((wpr_handle >> 32) as u32),
         )?;
         dev_dbg!(
-            pdev.as_ref(),
+            pdev,
             "SEC2 MBOX0: {:#x}, MBOX1{:#x}\n",
             mbox0,
             mbox1
@@ -199,7 +199,7 @@ pub(crate) fn boot(
 
         if mbox0 != 0 {
             dev_err!(
-                pdev.as_ref(),
+                pdev,
                 "Booter-load failed with error {:#x}\n",
                 mbox0
             );
@@ -217,7 +217,7 @@ pub(crate) fn boot(
         )?;
 
         dev_dbg!(
-            pdev.as_ref(),
+            pdev,
             "RISC-V active? {}\n",
             gsp_falcon.is_riscv_active(bar),
         );
@@ -239,8 +239,8 @@ pub(crate) fn boot(
         // Obtain and display basic GPU information.
         let info = commands::get_gsp_info(&mut self.cmdq, bar)?;
         match info.gpu_name() {
-            Ok(name) => dev_info!(pdev.as_ref(), "GPU name: {}\n", name),
-            Err(e) => dev_warn!(pdev.as_ref(), "GPU name unavailable: {:?}\n", e),
+            Ok(name) => dev_info!(pdev, "GPU name: {}\n", name),
+            Err(e) => dev_warn!(pdev, "GPU name unavailable: {:?}\n", e),
         }
 
         Ok(())
-- 
2.53.0



* [PATCH v4 02/33] gpu: nova-core: print FB sizes, along with ranges
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
  2026-02-10  2:45 ` [PATCH v4 01/33] gpu: nova-core: pass pdev directly to dev_* logging macros John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-10  2:45 ` [PATCH v4 03/33] gpu: nova-core: add FbRange.len() and use it in boot.rs John Hubbard
                   ` (31 subsequent siblings)
  33 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

For convenience of the reader: now you can directly see the sizes of
each range. It is surprising just how much this helps.

Sample output (using an Ampere GA104):

NovaCore 0000:e1:00.0: FbLayout {
    fb: 0x0..0x3ff800000 (16376 MiB),
    vga_workspace: 0x3ff700000..0x3ff800000 (1 MiB),
    frts: 0x3ff600000..0x3ff700000 (1 MiB),
    boot: 0x3ff5fa000..0x3ff600000 (24 KiB),
    elf: 0x3fb960000..0x3ff5f9000 (60 MiB),
    wpr2_heap: 0x3f3900000..0x3fb900000 (128 MiB),
    wpr2: 0x3f3800000..0x3ff700000 (191 MiB),
    heap: 0x3f3700000..0x3f3800000 (1 MiB),
    vf_partition_count: 0x0,
    total_reserved_size: 0x1a00000,
}
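
The sizes in the sample output can be cross-checked with a small
userspace sketch of the same formatting rule. It uses std instead of the
kernel crate and defines its own SZ_1K and SZ_1M, so it is an
approximation of the patch below, not a copy of it.

```rust
use std::fmt;
use std::ops::Range;

const SZ_1K: u64 = 1 << 10;
const SZ_1M: u64 = 1 << 20;

/// Userspace approximation of the `FbRange` newtype.
struct FbRange(Range<u64>);

impl fmt::Debug for FbRange {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let (start, end) = (self.0.start, self.0.end);

        // Compact format ({:?}): just the range.
        if !f.alternate() {
            return write!(f, "{start:#x}..{end:#x}");
        }

        // Alternate format ({:#?}): append the size, in KiB below 1 MiB
        // and in MiB otherwise. Integer division rounds down.
        let size = end - start;
        if size < SZ_1M {
            write!(f, "{start:#x}..{end:#x} ({} KiB)", size / SZ_1K)
        } else {
            write!(f, "{start:#x}..{end:#x} ({} MiB)", size / SZ_1M)
        }
    }
}
```

For example, the boot range 0x3ff5fa000..0x3ff600000 spans 0x6000 bytes,
which is 24 KiB, and the elf range spans 0x3c99000 bytes, which rounds
down to 60 MiB.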

Cc: Timur Tabi <ttabi@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/fb.rs | 83 +++++++++++++++++++++++++++++--------
 1 file changed, 66 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/nova-core/fb.rs b/drivers/gpu/nova-core/fb.rs
index c62abcaed547..6fb804c118c6 100644
--- a/drivers/gpu/nova-core/fb.rs
+++ b/drivers/gpu/nova-core/fb.rs
@@ -1,9 +1,13 @@
 // SPDX-License-Identifier: GPL-2.0
 
-use core::ops::Range;
+use core::ops::{
+    Deref,
+    Range, //
+};
 
 use kernel::{
     device,
+    fmt,
     prelude::*,
     ptr::{
         Alignable,
@@ -94,26 +98,71 @@ pub(crate) fn unregister(&self, bar: &Bar0) {
     }
 }
 
+pub(crate) struct FbRange(Range<u64>);
+
+impl From<Range<u64>> for FbRange {
+    fn from(range: Range<u64>) -> Self {
+        Self(range)
+    }
+}
+
+impl Deref for FbRange {
+    type Target = Range<u64>;
+
+    fn deref(&self) -> &Self::Target {
+        &self.0
+    }
+}
+
+impl fmt::Debug for FbRange {
+    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
+        // Use alternate format ({:#?}) to include size, compact format ({:?}) for just the range.
+        if f.alternate() {
+            let size = self.0.end - self.0.start;
+
+            if size < usize_as_u64(SZ_1M) {
+                let size_kib = size / usize_as_u64(SZ_1K);
+                f.write_fmt(fmt!(
+                    "{:#x}..{:#x} ({} KiB)",
+                    self.0.start,
+                    self.0.end,
+                    size_kib
+                ))
+            } else {
+                let size_mib = size / usize_as_u64(SZ_1M);
+                f.write_fmt(fmt!(
+                    "{:#x}..{:#x} ({} MiB)",
+                    self.0.start,
+                    self.0.end,
+                    size_mib
+                ))
+            }
+        } else {
+            f.write_fmt(fmt!("{:#x}..{:#x}", self.0.start, self.0.end))
+        }
+    }
+}
+
 /// Layout of the GPU framebuffer memory.
 ///
 /// Contains ranges of GPU memory reserved for a given purpose during the GSP boot process.
 #[derive(Debug)]
 pub(crate) struct FbLayout {
     /// Range of the framebuffer. Starts at `0`.
-    pub(crate) fb: Range<u64>,
+    pub(crate) fb: FbRange,
     /// VGA workspace, small area of reserved memory at the end of the framebuffer.
-    pub(crate) vga_workspace: Range<u64>,
+    pub(crate) vga_workspace: FbRange,
     /// FRTS range.
-    pub(crate) frts: Range<u64>,
+    pub(crate) frts: FbRange,
     /// Memory area containing the GSP bootloader image.
-    pub(crate) boot: Range<u64>,
+    pub(crate) boot: FbRange,
     /// Memory area containing the GSP firmware image.
-    pub(crate) elf: Range<u64>,
+    pub(crate) elf: FbRange,
     /// WPR2 heap.
-    pub(crate) wpr2_heap: Range<u64>,
+    pub(crate) wpr2_heap: FbRange,
     /// WPR2 region range, starting with an instance of `GspFwWprMeta`.
-    pub(crate) wpr2: Range<u64>,
-    pub(crate) heap: Range<u64>,
+    pub(crate) wpr2: FbRange,
+    pub(crate) heap: FbRange,
     pub(crate) vf_partition_count: u8,
 }
 
@@ -125,7 +174,7 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<
         let fb = {
             let fb_size = hal.vidmem_size(bar);
 
-            0..fb_size
+            FbRange(0..fb_size)
         };
 
         let vga_workspace = {
@@ -152,7 +201,7 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<
                 }
             };
 
-            vga_base..fb.end
+            FbRange(vga_base..fb.end)
         };
 
         let frts = {
@@ -160,7 +209,7 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<
             const FRTS_SIZE: u64 = usize_as_u64(SZ_1M);
             let frts_base = vga_workspace.start.align_down(FRTS_DOWN_ALIGN) - FRTS_SIZE;
 
-            frts_base..frts_base + FRTS_SIZE
+            FbRange(frts_base..frts_base + FRTS_SIZE)
         };
 
         let boot = {
@@ -168,7 +217,7 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<
             let bootloader_size = u64::from_safe_cast(gsp_fw.bootloader.ucode.size());
             let bootloader_base = (frts.start - bootloader_size).align_down(BOOTLOADER_DOWN_ALIGN);
 
-            bootloader_base..bootloader_base + bootloader_size
+            FbRange(bootloader_base..bootloader_base + bootloader_size)
         };
 
         let elf = {
@@ -176,7 +225,7 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<
             let elf_size = u64::from_safe_cast(gsp_fw.size);
             let elf_addr = (boot.start - elf_size).align_down(ELF_DOWN_ALIGN);
 
-            elf_addr..elf_addr + elf_size
+            FbRange(elf_addr..elf_addr + elf_size)
         };
 
         let wpr2_heap = {
@@ -185,7 +234,7 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<
                 gsp::LibosParams::from_chipset(chipset).wpr_heap_size(chipset, fb.end);
             let wpr2_heap_addr = (elf.start - wpr2_heap_size).align_down(WPR2_HEAP_DOWN_ALIGN);
 
-            wpr2_heap_addr..(elf.start).align_down(WPR2_HEAP_DOWN_ALIGN)
+            FbRange(wpr2_heap_addr..(elf.start).align_down(WPR2_HEAP_DOWN_ALIGN))
         };
 
         let wpr2 = {
@@ -193,13 +242,13 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<
             let wpr2_addr = (wpr2_heap.start - u64::from_safe_cast(size_of::<gsp::GspFwWprMeta>()))
                 .align_down(WPR2_DOWN_ALIGN);
 
-            wpr2_addr..frts.end
+            FbRange(wpr2_addr..frts.end)
         };
 
         let heap = {
             const HEAP_SIZE: u64 = usize_as_u64(SZ_1M);
 
-            wpr2.start - HEAP_SIZE..wpr2.start
+            FbRange(wpr2.start - HEAP_SIZE..wpr2.start)
         };
 
         Ok(Self {
-- 
2.53.0



* [PATCH v4 03/33] gpu: nova-core: add FbRange.len() and use it in boot.rs
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
  2026-02-10  2:45 ` [PATCH v4 01/33] gpu: nova-core: pass pdev directly to dev_* logging macros John Hubbard
  2026-02-10  2:45 ` [PATCH v4 02/33] gpu: nova-core: print FB sizes, along with ranges John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-10  2:45 ` [PATCH v4 04/33] gpu: nova-core: Hopper/Blackwell: basic GPU identification John Hubbard
                   ` (30 subsequent siblings)
  33 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

A tiny simplification: now that FbLayout uses its own specific FbRange
type, add an FbRange.len() method, and use that to (very slightly)
simplify the initialization of frts_size in FwsecCommand::Frts.

Suggested-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/fb.rs       | 6 ++++++
 drivers/gpu/nova-core/gsp/boot.rs | 2 +-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/nova-core/fb.rs b/drivers/gpu/nova-core/fb.rs
index 6fb804c118c6..e803e6e0cdb9 100644
--- a/drivers/gpu/nova-core/fb.rs
+++ b/drivers/gpu/nova-core/fb.rs
@@ -100,6 +100,12 @@ pub(crate) fn unregister(&self, bar: &Bar0) {
 
 pub(crate) struct FbRange(Range<u64>);
 
+impl FbRange {
+    pub(crate) fn len(&self) -> u64 {
+        self.0.end - self.0.start
+    }
+}
+
 impl From<Range<u64>> for FbRange {
     fn from(range: Range<u64>) -> Self {
         Self(range)
diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
index bd6e6dc57e85..465c18e4c888 100644
--- a/drivers/gpu/nova-core/gsp/boot.rs
+++ b/drivers/gpu/nova-core/gsp/boot.rs
@@ -70,7 +70,7 @@ fn run_fwsec_frts(
             bios,
             FwsecCommand::Frts {
                 frts_addr: fb_layout.frts.start,
-                frts_size: fb_layout.frts.end - fb_layout.frts.start,
+                frts_size: fb_layout.frts.len(),
             },
         )?;
 
-- 
2.53.0



* [PATCH v4 04/33] gpu: nova-core: Hopper/Blackwell: basic GPU identification
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (2 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 03/33] gpu: nova-core: add FbRange.len() and use it in boot.rs John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-10  2:45 ` [PATCH v4 05/33] gpu: nova-core: factor .fwsignature* selection into a new get_gsp_sigs_section() John Hubbard
                   ` (29 subsequent siblings)
  33 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Add basic identification of Hopper (GH100) and Blackwell GPUs, including
selection of the ELF .fwsignature_* sections. For now, these chipsets
reuse the GA102 falcon and FB HALs.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/falcon/hal.rs   |  3 ++-
 drivers/gpu/nova-core/fb/hal.rs       |  5 ++---
 drivers/gpu/nova-core/firmware/gsp.rs | 17 +++++++++++++++++
 drivers/gpu/nova-core/gpu.rs          | 22 ++++++++++++++++++++++
 4 files changed, 43 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/nova-core/falcon/hal.rs b/drivers/gpu/nova-core/falcon/hal.rs
index 89babd5f9325..444c95fd4ece 100644
--- a/drivers/gpu/nova-core/falcon/hal.rs
+++ b/drivers/gpu/nova-core/falcon/hal.rs
@@ -76,7 +76,8 @@ pub(super) fn falcon_hal<E: FalconEngine + 'static>(
         TU102 | TU104 | TU106 | TU116 | TU117 => {
             KBox::new(tu102::Tu102::<E>::new(), GFP_KERNEL)? as KBox<dyn FalconHal<E>>
         }
-        GA102 | GA103 | GA104 | GA106 | GA107 | AD102 | AD103 | AD104 | AD106 | AD107 => {
+        GA102 | GA103 | GA104 | GA106 | GA107 | AD102 | AD103 | AD104 | AD106 | AD107 | GH100
+        | GB100 | GB102 | GB202 | GB203 | GB205 | GB206 | GB207 => {
             KBox::new(ga102::Ga102::<E>::new(), GFP_KERNEL)? as KBox<dyn FalconHal<E>>
         }
         _ => return Err(ENOTSUPP),
diff --git a/drivers/gpu/nova-core/fb/hal.rs b/drivers/gpu/nova-core/fb/hal.rs
index aba0abd8ee00..71fa92d1b709 100644
--- a/drivers/gpu/nova-core/fb/hal.rs
+++ b/drivers/gpu/nova-core/fb/hal.rs
@@ -34,8 +34,7 @@ pub(super) fn fb_hal(chipset: Chipset) -> &'static dyn FbHal {
     match chipset {
         TU102 | TU104 | TU106 | TU117 | TU116 => tu102::TU102_HAL,
         GA100 => ga100::GA100_HAL,
-        GA102 | GA103 | GA104 | GA106 | GA107 | AD102 | AD103 | AD104 | AD106 | AD107 => {
-            ga102::GA102_HAL
-        }
+        GA102 | GA103 | GA104 | GA106 | GA107 | GH100 | AD102 | AD103 | AD104 | AD106 | AD107
+        | GB100 | GB102 | GB202 | GB203 | GB205 | GB206 | GB207 => ga102::GA102_HAL,
     }
 }
diff --git a/drivers/gpu/nova-core/firmware/gsp.rs b/drivers/gpu/nova-core/firmware/gsp.rs
index 9488a626352f..bc2243450989 100644
--- a/drivers/gpu/nova-core/firmware/gsp.rs
+++ b/drivers/gpu/nova-core/firmware/gsp.rs
@@ -222,6 +222,23 @@ pub(crate) fn new<'a>(
                         Architecture::Ampere if chipset == Chipset::GA100 => ".fwsignature_tu10x",
                         Architecture::Ampere => ".fwsignature_ga10x",
                         Architecture::Ada => ".fwsignature_ad10x",
+                        Architecture::Hopper => ".fwsignature_gh10x",
+                        Architecture::Blackwell => {
+                            // Distinguish between GB10x and GB20x series
+                            match chipset {
+                                // GB10x series: GB100, GB102
+                                Chipset::GB100 | Chipset::GB102 => ".fwsignature_gb10x",
+                                // GB20x series: GB202, GB203, GB205, GB206, GB207
+                                Chipset::GB202
+                                | Chipset::GB203
+                                | Chipset::GB205
+                                | Chipset::GB206
+                                | Chipset::GB207 => ".fwsignature_gb20x",
+                                // It's not possible to get here with a non-Blackwell chipset, but
+                                // Rust doesn't know that.
+                                _ => return Err(ENOTSUPP),
+                            }
+                        }
                     };
 
                     elf::elf64_section(firmware.data(), sigs_section)
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index f5907c31a66d..b6a898008a59 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -83,12 +83,22 @@ fn try_from(value: u32) -> Result<Self, Self::Error> {
     GA104 = 0x174,
     GA106 = 0x176,
     GA107 = 0x177,
+    // Hopper
+    GH100 = 0x180,
     // Ada
     AD102 = 0x192,
     AD103 = 0x193,
     AD104 = 0x194,
     AD106 = 0x196,
     AD107 = 0x197,
+    // Blackwell
+    GB100 = 0x1a0,
+    GB102 = 0x1a2,
+    GB202 = 0x1b2,
+    GB203 = 0x1b3,
+    GB205 = 0x1b5,
+    GB206 = 0x1b6,
+    GB207 = 0x1b7,
 });
 
 impl Chipset {
@@ -100,9 +110,17 @@ pub(crate) fn arch(&self) -> Architecture {
             Self::GA100 | Self::GA102 | Self::GA103 | Self::GA104 | Self::GA106 | Self::GA107 => {
                 Architecture::Ampere
             }
+            Self::GH100 => Architecture::Hopper,
             Self::AD102 | Self::AD103 | Self::AD104 | Self::AD106 | Self::AD107 => {
                 Architecture::Ada
             }
+            Self::GB100
+            | Self::GB102
+            | Self::GB202
+            | Self::GB203
+            | Self::GB205
+            | Self::GB206
+            | Self::GB207 => Architecture::Blackwell,
         }
     }
 }
@@ -132,7 +150,9 @@ pub(crate) enum Architecture {
     #[default]
     Turing = 0x16,
     Ampere = 0x17,
+    Hopper = 0x18,
     Ada = 0x19,
+    Blackwell = 0x1b,
 }
 
 impl TryFrom<u8> for Architecture {
@@ -142,7 +162,9 @@ fn try_from(value: u8) -> Result<Self> {
         match value {
             0x16 => Ok(Self::Turing),
             0x17 => Ok(Self::Ampere),
+            0x18 => Ok(Self::Hopper),
             0x19 => Ok(Self::Ada),
+            0x1b => Ok(Self::Blackwell),
             _ => Err(ENODEV),
         }
     }
-- 
2.53.0



* [PATCH v4 05/33] gpu: nova-core: factor .fwsignature* selection into a new get_gsp_sigs_section()
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (3 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 04/33] gpu: nova-core: Hopper/Blackwell: basic GPU identification John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-11 10:16   ` Danilo Krummrich
  2026-02-10  2:45 ` [PATCH v4 06/33] gpu: nova-core: use GPU Architecture to simplify HAL selections John Hubbard
                   ` (28 subsequent siblings)
  33 siblings, 1 reply; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Keep GspFirmware::new() from getting too cluttered, by factoring out the
selection of .fwsignature* items. That selection will continue to grow
as we add GPUs.
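
A reduced, standalone sketch of the resulting shape is below. The
chipset list is cut down to four entries, LoadError is a hypothetical
stand-in for the kernel's error codes, and load_signatures() with its
.ok_or() conversion is an assumption about how a caller might consume
the Option; the actual call site may differ.

```rust
/// Reduced, illustrative subset of the chipsets in this series.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Chipset {
    GA104,
    GH100,
    GB100,
    GB202,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum Architecture {
    Ampere,
    Hopper,
    Blackwell,
}

/// Hypothetical error type standing in for the kernel's error codes.
#[derive(Debug, PartialEq)]
enum LoadError {
    NotSupported,
}

impl Chipset {
    fn arch(self) -> Architecture {
        match self {
            Chipset::GA104 => Architecture::Ampere,
            Chipset::GH100 => Architecture::Hopper,
            Chipset::GB100 | Chipset::GB202 => Architecture::Blackwell,
        }
    }
}

/// "Not found" is the only failure mode, so `Option` is enough here.
fn sigs_section(chipset: Chipset) -> Option<&'static str> {
    match chipset.arch() {
        Architecture::Ampere => Some(".fwsignature_ga10x"),
        Architecture::Hopper => Some(".fwsignature_gh10x"),
        Architecture::Blackwell => match chipset {
            Chipset::GB100 => Some(".fwsignature_gb10x"),
            Chipset::GB202 => Some(".fwsignature_gb20x"),
            // Not reachable with a non-Blackwell chipset, but the
            // compiler cannot prove that.
            _ => None,
        },
    }
}

/// Assumed caller: picks the error code at the point of use.
fn load_signatures(chipset: Chipset) -> Result<&'static str, LoadError> {
    sigs_section(chipset).ok_or(LoadError::NotSupported)
}
```

Returning Option keeps the lookup free of error-code policy, which is
left to whoever calls it.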

Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/firmware/gsp.rs | 60 ++++++++++++++-------------
 1 file changed, 31 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware/gsp.rs b/drivers/gpu/nova-core/firmware/gsp.rs
index bc2243450989..10761716ed93 100644
--- a/drivers/gpu/nova-core/firmware/gsp.rs
+++ b/drivers/gpu/nova-core/firmware/gsp.rs
@@ -146,6 +146,36 @@ pub(crate) struct GspFirmware {
 }
 
 impl GspFirmware {
+    fn get_gsp_sigs_section(chipset: Chipset) -> Option<&'static str> {
+        match chipset.arch() {
+            Architecture::Turing if matches!(chipset, Chipset::TU116 | Chipset::TU117) => {
+                Some(".fwsignature_tu11x")
+            }
+            Architecture::Turing => Some(".fwsignature_tu10x"),
+            // GA100 uses the same firmware as Turing
+            Architecture::Ampere if chipset == Chipset::GA100 => Some(".fwsignature_tu10x"),
+            Architecture::Ampere => Some(".fwsignature_ga10x"),
+            Architecture::Ada => Some(".fwsignature_ad10x"),
+            Architecture::Hopper => Some(".fwsignature_gh10x"),
+            Architecture::Blackwell => {
+                // Distinguish between GB10x and GB20x series
+                match chipset {
+                    // GB10x series: GB100, GB102
+                    Chipset::GB100 | Chipset::GB102 => Some(".fwsignature_gb10x"),
+                    // GB20x series: GB202, GB203, GB205, GB206, GB207
+                    Chipset::GB202
+                    | Chipset::GB203
+                    | Chipset::GB205
+                    | Chipset::GB206
+                    | Chipset::GB207 => Some(".fwsignature_gb20x"),
+                    // It's not possible to get here with a non-Blackwell chipset, but Rust doesn't
+                    // know that.
+                    _ => None,
+                }
+            }
+        }
+    }
+
     /// Loads the GSP firmware binaries, map them into `dev`'s address-space, and creates the page
     /// tables expected by the GSP bootloader to load it.
     pub(crate) fn new<'a>(
@@ -211,35 +241,7 @@ pub(crate) fn new<'a>(
                 },
                 size,
                 signatures: {
-                    let sigs_section = match chipset.arch() {
-                        Architecture::Turing
-                            if matches!(chipset, Chipset::TU116 | Chipset::TU117) =>
-                        {
-                            ".fwsignature_tu11x"
-                        }
-                        Architecture::Turing => ".fwsignature_tu10x",
-                        // GA100 uses the same firmware as Turing
-                        Architecture::Ampere if chipset == Chipset::GA100 => ".fwsignature_tu10x",
-                        Architecture::Ampere => ".fwsignature_ga10x",
-                        Architecture::Ada => ".fwsignature_ad10x",
-                        Architecture::Hopper => ".fwsignature_gh10x",
-                        Architecture::Blackwell => {
-                            // Distinguish between GB10x and GB20x series
-                            match chipset {
-                                // GB10x series: GB100, GB102
-                                Chipset::GB100 | Chipset::GB102 => ".fwsignature_gb10x",
-                                // GB20x series: GB202, GB203, GB205, GB206, GB207
-                                Chipset::GB202
-                                | Chipset::GB203
-                                | Chipset::GB205
-                                | Chipset::GB206
-                                | Chipset::GB207 => ".fwsignature_gb20x",
-                                // It's not possible to get here with a non-Blackwell chipset, but
-                                // Rust doesn't know that.
-                                _ => return Err(ENOTSUPP),
-                            }
-                        }
-                    };
+                    let sigs_section = Self::get_gsp_sigs_section(chipset).ok_or(ENOTSUPP)?;
 
                     elf::elf64_section(firmware.data(), sigs_section)
                         .ok_or(EINVAL)
-- 
2.53.0



* [PATCH v4 06/33] gpu: nova-core: use GPU Architecture to simplify HAL selections
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (4 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 05/33] gpu: nova-core: factor .fwsignature* selection into a new get_gsp_sigs_section() John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-10  2:45 ` [PATCH v4 07/33] gpu: nova-core: apply the one "use" item per line policy to commands.rs John Hubbard
                   ` (27 subsequent siblings)
  33 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Instead of long, exhaustive lists of GPUs ("Chipsets"), use entire
GPU Architectures, such as "Blackwell" or "Turing", to make HAL choices.

Note: Left a // TODO for GA100, hinting at the remaining work in order
to bring up that chipset.

A tiny side effect: moved a "use" statement out of function scope, in
each file, up to the top of the file, as per Rust for Linux conventions.
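
As a rough illustration of the architecture-based selection described
above, here is a minimal userspace sketch. The enums, the FbHal trait
with its name() method, and the static HAL objects are simplified,
hypothetical stand-ins rather than the driver's real definitions. The
point it tries to show: matching on the architecture with no "_" arm
means a newly added Architecture variant fails to compile until a HAL
is chosen for it, while a match guard still handles a single-chipset
exception such as GA100.

```rust
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Architecture {
    Turing,
    Ampere,
    Ada,
    Hopper,
    Blackwell,
}

// Hypothetical, reduced chipset list (one or two chips per architecture).
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Chipset {
    TU102,
    GA100,
    GA102,
    AD102,
    GH100,
    GB202,
}

impl Chipset {
    const fn arch(self) -> Architecture {
        match self {
            Chipset::TU102 => Architecture::Turing,
            Chipset::GA100 | Chipset::GA102 => Architecture::Ampere,
            Chipset::AD102 => Architecture::Ada,
            Chipset::GH100 => Architecture::Hopper,
            Chipset::GB202 => Architecture::Blackwell,
        }
    }
}

// Simplified HAL trait; name() exists only to make the choice observable.
trait FbHal {
    fn name(&self) -> &'static str;
}

struct Tu102;
struct Ga100;
struct Ga102;

impl FbHal for Tu102 {
    fn name(&self) -> &'static str {
        "tu102"
    }
}

impl FbHal for Ga100 {
    fn name(&self) -> &'static str {
        "ga100"
    }
}

impl FbHal for Ga102 {
    fn name(&self) -> &'static str {
        "ga102"
    }
}

static TU102_HAL: Tu102 = Tu102;
static GA100_HAL: Ga100 = Ga100;
static GA102_HAL: Ga102 = Ga102;

// No wildcard arm: every architecture must be listed explicitly.
fn fb_hal(chipset: Chipset) -> &'static dyn FbHal {
    match chipset.arch() {
        Architecture::Turing => &TU102_HAL,
        // Single-chipset exception, expressed as a guard on the architecture arm.
        Architecture::Ampere if chipset == Chipset::GA100 => &GA100_HAL,
        Architecture::Ampere
        | Architecture::Ada
        | Architecture::Hopper
        | Architecture::Blackwell => &GA102_HAL,
    }
}
```

With these stand-ins, fb_hal(Chipset::GA100).name() is "ga100", and
every other Ampere-or-later chipset resolves to "ga102".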

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/falcon/hal.rs | 21 +++++++++++++--------
 drivers/gpu/nova-core/fb/hal.rs     | 17 +++++++++--------
 2 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/nova-core/falcon/hal.rs b/drivers/gpu/nova-core/falcon/hal.rs
index 444c95fd4ece..edf4d27d54f7 100644
--- a/drivers/gpu/nova-core/falcon/hal.rs
+++ b/drivers/gpu/nova-core/falcon/hal.rs
@@ -9,7 +9,10 @@
         FalconBromParams,
         FalconEngine, //
     },
-    gpu::Chipset,
+    gpu::{
+        Architecture,
+        Chipset, //
+    },
 };
 
 mod ga102;
@@ -70,17 +73,19 @@ fn signature_reg_fuse_version(
 pub(super) fn falcon_hal<E: FalconEngine + 'static>(
     chipset: Chipset,
 ) -> Result<KBox<dyn FalconHal<E>>> {
-    use Chipset::*;
-
-    let hal = match chipset {
-        TU102 | TU104 | TU106 | TU116 | TU117 => {
+    let hal = match chipset.arch() {
+        Architecture::Turing => {
             KBox::new(tu102::Tu102::<E>::new(), GFP_KERNEL)? as KBox<dyn FalconHal<E>>
         }
-        GA102 | GA103 | GA104 | GA106 | GA107 | AD102 | AD103 | AD104 | AD106 | AD107 | GH100
-        | GB100 | GB102 | GB202 | GB203 | GB205 | GB206 | GB207 => {
+        // TODO: support GA100. Its boot sequence is a lot like Turing, except that it handles the
+        // FRTS steps differently (specifically, it skips FWSEC-FRTS).
+        Architecture::Ampere if chipset == Chipset::GA100 => return Err(ENOTSUPP),
+        Architecture::Ampere
+        | Architecture::Hopper
+        | Architecture::Ada
+        | Architecture::Blackwell => {
             KBox::new(ga102::Ga102::<E>::new(), GFP_KERNEL)? as KBox<dyn FalconHal<E>>
         }
-        _ => return Err(ENOTSUPP),
     };
 
     Ok(hal)
diff --git a/drivers/gpu/nova-core/fb/hal.rs b/drivers/gpu/nova-core/fb/hal.rs
index 71fa92d1b709..d795ef7ee65d 100644
--- a/drivers/gpu/nova-core/fb/hal.rs
+++ b/drivers/gpu/nova-core/fb/hal.rs
@@ -4,7 +4,10 @@
 
 use crate::{
     driver::Bar0,
-    gpu::Chipset, //
+    gpu::{
+        Architecture,
+        Chipset, //
+    },
 };
 
 mod ga100;
@@ -29,12 +32,10 @@ pub(crate) trait FbHal {
 
 /// Returns the HAL corresponding to `chipset`.
 pub(super) fn fb_hal(chipset: Chipset) -> &'static dyn FbHal {
-    use Chipset::*;
-
-    match chipset {
-        TU102 | TU104 | TU106 | TU117 | TU116 => tu102::TU102_HAL,
-        GA100 => ga100::GA100_HAL,
-        GA102 | GA103 | GA104 | GA106 | GA107 | GH100 | AD102 | AD103 | AD104 | AD106 | AD107
-        | GB100 | GB102 | GB202 | GB203 | GB205 | GB206 | GB207 => ga102::GA102_HAL,
+    match chipset.arch() {
+        Architecture::Turing => tu102::TU102_HAL,
+        Architecture::Ampere if chipset == Chipset::GA100 => ga100::GA100_HAL,
+        Architecture::Ampere => ga102::GA102_HAL,
+        Architecture::Hopper | Architecture::Ada | Architecture::Blackwell => ga102::GA102_HAL,
     }
 }
-- 
2.53.0



* [PATCH v4 07/33] gpu: nova-core: apply the one "use" item per line policy to commands.rs
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (5 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 06/33] gpu: nova-core: use GPU Architecture to simplify HAL selections John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-10  2:45 ` [PATCH v4 08/33] gpu: nova-core: set DMA mask width based on GPU architecture John Hubbard
                   ` (26 subsequent siblings)
  33 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

As per [1], we need one "use" item per line, in order to reduce merge
conflicts. Furthermore, we need a trailing ", //" in order to tell
rustfmt(1) to leave it alone.
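
For illustration only, the same layout applied to a small userspace
example that uses std instead of the kernel crate. The word_counts()
function is a hypothetical helper that exists just so that every
import is used. Note the trailing "//" after the last item of each
brace group, which is what keeps rustfmt from collapsing the group
back onto one line.

```rust
use std::{
    collections::{
        BTreeMap,
        BTreeSet, //
    },
    fmt::Write, //
};

/// Formats "word=count " for each distinct word, in sorted order.
fn word_counts(words: &[&str]) -> String {
    let unique: BTreeSet<&str> = words.iter().copied().collect();

    let mut counts: BTreeMap<&str, usize> = BTreeMap::new();
    for word in words {
        *counts.entry(*word).or_insert(0) += 1;
    }

    let mut out = String::new();
    for word in &unique {
        // Writing to a String cannot fail, so the result is ignored.
        let _ = write!(out, "{}={} ", word, counts[word]);
    }
    out
}
```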

This does that for commands.rs, which is the only file in nova-core that
has any remaining instances of the old style.

[1] https://docs.kernel.org/rust/coding-guidelines.html#imports

Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/gsp/fw/commands.rs | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/nova-core/gsp/fw/commands.rs b/drivers/gpu/nova-core/gsp/fw/commands.rs
index 21be44199693..470d8edb62ff 100644
--- a/drivers/gpu/nova-core/gsp/fw/commands.rs
+++ b/drivers/gpu/nova-core/gsp/fw/commands.rs
@@ -1,8 +1,14 @@
 // SPDX-License-Identifier: GPL-2.0
 
-use kernel::prelude::*;
-use kernel::transmute::{AsBytes, FromBytes};
-use kernel::{device, pci};
+use kernel::{
+    device,
+    pci,
+    prelude::*,
+    transmute::{
+        AsBytes,
+        FromBytes, //
+    }, //
+};
 
 use crate::gsp::GSP_PAGE_SIZE;
 
-- 
2.53.0



* [PATCH v4 08/33] gpu: nova-core: set DMA mask width based on GPU architecture
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (6 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 07/33] gpu: nova-core: apply the one "use" item per line policy to commands.rs John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-11 10:28   ` Danilo Krummrich
  2026-02-10  2:45 ` [PATCH v4 09/33] gpu: nova-core: Hopper/Blackwell: skip GFW boot waiting John Hubbard
                   ` (25 subsequent siblings)
  33 siblings, 1 reply; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

This removes a "TODO" item in the code, which hardcoded a 47-bit DMA
address width (correct for Turing, Ampere and Ada GPUs). Hopper/Blackwell+
support 52-bit DMA addresses, so do an early read of boot42 in order to
pick the correct value.
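
To make the widths concrete, here is a small userspace sketch of the
per-architecture choice. The Architecture enum is a simplified
stand-in, and dma_bits()/dma_mask() are hypothetical helpers that
compute the raw mask value directly; the driver itself goes through
the kernel's DmaMask type instead.

```rust
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Architecture {
    Turing,
    Ampere,
    Ada,
    Hopper,
    Blackwell,
}

/// Number of DMA address bits assumed for each architecture in this sketch.
const fn dma_bits(arch: Architecture) -> u32 {
    match arch {
        Architecture::Turing | Architecture::Ampere | Architecture::Ada => 47,
        Architecture::Hopper | Architecture::Blackwell => 52,
    }
}

/// Builds a mask with the low `bits` bits set, saturating at 64 bits.
const fn dma_mask(bits: u32) -> u64 {
    if bits >= 64 {
        u64::MAX
    } else {
        (1u64 << bits) - 1
    }
}
```

Here dma_mask(dma_bits(Architecture::Ada)) is 0x7fff_ffff_ffff and
dma_mask(dma_bits(Architecture::Blackwell)) is 0xf_ffff_ffff_ffff.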

Cc: Gary Guo <gary@garyguo.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/driver.rs | 33 ++++++++++++++--------------
 drivers/gpu/nova-core/gpu.rs    | 38 ++++++++++++++++++++++++---------
 2 files changed, 44 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs
index e39885c0d5ca..4ff07b643db6 100644
--- a/drivers/gpu/nova-core/driver.rs
+++ b/drivers/gpu/nova-core/driver.rs
@@ -5,7 +5,6 @@
     device::Core,
     devres::Devres,
     dma::Device,
-    dma::DmaMask,
     pci,
     pci::{
         Class,
@@ -17,7 +16,10 @@
     sync::Arc, //
 };
 
-use crate::gpu::Gpu;
+use crate::gpu::{
+    Gpu,
+    Spec, //
+};
 
 #[pin_data]
 pub(crate) struct NovaCore {
@@ -29,14 +31,6 @@ pub(crate) struct NovaCore {
 
 const BAR0_SIZE: usize = SZ_16M;
 
-// For now we only support Ampere which can use up to 47-bit DMA addresses.
-//
-// TODO: Add an abstraction for this to support newer GPUs which may support
-// larger DMA addresses. Limiting these GPUs to smaller address widths won't
-// have any adverse affects, unless installed on systems which require larger
-// DMA addresses. These systems should be quite rare.
-const GPU_DMA_BITS: u32 = 47;
-
 pub(crate) type Bar0 = pci::Bar<BAR0_SIZE>;
 
 kernel::pci_device_table!(
@@ -75,18 +69,23 @@ fn probe(pdev: &pci::Device<Core>, _info: &Self::IdInfo) -> impl PinInit<Self, E
             pdev.enable_device_mem()?;
             pdev.set_master();
 
-            // SAFETY: No concurrent DMA allocations or mappings can be made because
-            // the device is still being probed and therefore isn't being used by
-            // other threads of execution.
-            unsafe { pdev.dma_set_mask_and_coherent(DmaMask::new::<GPU_DMA_BITS>())? };
-
-            let bar = Arc::pin_init(
+            let devres_bar = Arc::pin_init(
                 pdev.iomap_region_sized::<BAR0_SIZE>(0, c"nova-core/bar0"),
                 GFP_KERNEL,
             )?;
 
+            // Read the GPU spec early to determine the correct DMA address width.
+            // Hopper/Blackwell+ support 52-bit DMA addresses, earlier architectures use 47-bit.
+            let spec = Spec::new(pdev.as_ref(), devres_bar.access(pdev.as_ref())?)?;
+            dev_info!(pdev.as_ref(), "NVIDIA ({})\n", spec);
+
+            // SAFETY: No concurrent DMA allocations or mappings can be made because
+            // the device is still being probed and therefore isn't being used by
+            // other threads of execution.
+            unsafe { pdev.dma_set_mask_and_coherent(spec.chipset().arch().dma_mask())? };
+
             Ok(try_pin_init!(Self {
-                gpu <- Gpu::new(pdev, bar.clone(), bar.access(pdev.as_ref())?),
+                gpu <- Gpu::new(pdev, devres_bar.clone(), devres_bar.access(pdev.as_ref())?, spec),
                 _reg <- auxiliary::Registration::new(
                     pdev.as_ref(),
                     c"nova-drm",
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index b6a898008a59..24feb0e8723e 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -3,6 +3,7 @@
 use kernel::{
     device,
     devres::Devres,
+    dma::DmaMask,
     fmt,
     pci,
     prelude::*,
@@ -102,7 +103,7 @@ fn try_from(value: u32) -> Result<Self, Self::Error> {
 });
 
 impl Chipset {
-    pub(crate) fn arch(&self) -> Architecture {
+    pub(crate) const fn arch(&self) -> Architecture {
         match self {
             Self::TU102 | Self::TU104 | Self::TU106 | Self::TU117 | Self::TU116 => {
                 Architecture::Turing
@@ -155,6 +156,19 @@ pub(crate) enum Architecture {
     Blackwell = 0x1b,
 }
 
+impl Architecture {
+    /// Returns the DMA mask supported by this architecture.
+    ///
+    /// Hopper and Blackwell support 52-bit DMA addresses, while earlier architectures
+    /// (Turing, Ampere, Ada) support 47-bit DMA addresses.
+    pub(crate) const fn dma_mask(&self) -> DmaMask {
+        match self {
+            Self::Turing | Self::Ampere | Self::Ada => DmaMask::new::<47>(),
+            Self::Hopper | Self::Blackwell => DmaMask::new::<52>(),
+        }
+    }
+}
+
 impl TryFrom<u8> for Architecture {
     type Error = Error;
 
@@ -204,7 +218,7 @@ pub(crate) struct Spec {
 }
 
 impl Spec {
-    fn new(dev: &device::Device, bar: &Bar0) -> Result<Spec> {
+    pub(crate) fn new(dev: &device::Device, bar: &Bar0) -> Result<Spec> {
         // Some brief notes about boot0 and boot42, in chronological order:
         //
         // NV04 through NV50:
@@ -234,6 +248,10 @@ fn new(dev: &device::Device, bar: &Bar0) -> Result<Spec> {
             dev_err!(dev, "Unsupported chipset: {}\n", boot42);
         })
     }
+
+    pub(crate) fn chipset(&self) -> Chipset {
+        self.chipset
+    }
 }
 
 impl TryFrom<regs::NV_PMC_BOOT_42> for Spec {
@@ -281,33 +299,33 @@ pub(crate) fn new<'a>(
         pdev: &'a pci::Device<device::Bound>,
         devres_bar: Arc<Devres<Bar0>>,
         bar: &'a Bar0,
+        spec: Spec,
     ) -> impl PinInit<Self, Error> + 'a {
-        try_pin_init!(Self {
-            spec: Spec::new(pdev.as_ref(), bar).inspect(|spec| {
-                dev_info!(pdev, "NVIDIA ({})\n", spec);
-            })?,
+        let chipset = spec.chipset();
 
+        try_pin_init!(Self {
             // We must wait for GFW_BOOT completion before doing any significant setup on the GPU.
             _: {
                 gfw::wait_gfw_boot_completion(bar)
                     .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
             },
 
-            sysmem_flush: SysmemFlush::register(pdev.as_ref(), bar, spec.chipset)?,
+            sysmem_flush: SysmemFlush::register(pdev.as_ref(), bar, chipset)?,
 
             gsp_falcon: Falcon::new(
                 pdev.as_ref(),
-                spec.chipset,
+                chipset,
             )
             .inspect(|falcon| falcon.clear_swgen0_intr(bar))?,
 
-            sec2_falcon: Falcon::new(pdev.as_ref(), spec.chipset)?,
+            sec2_falcon: Falcon::new(pdev.as_ref(), chipset)?,
 
             gsp <- Gsp::new(pdev),
 
-            _: { gsp.boot(pdev, bar, spec.chipset, gsp_falcon, sec2_falcon)? },
+            _: { gsp.boot(pdev, bar, chipset, gsp_falcon, sec2_falcon)? },
 
             bar: devres_bar,
+            spec,
         })
     }
 
-- 
2.53.0



* [PATCH v4 09/33] gpu: nova-core: Hopper/Blackwell: skip GFW boot waiting
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (7 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 08/33] gpu: nova-core: set DMA mask width based on GPU architecture John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-11 10:09   ` Danilo Krummrich
  2026-02-10  2:45 ` [PATCH v4 10/33] gpu: nova-core: move firmware image parsing code to firmware.rs John Hubbard
                   ` (24 subsequent siblings)
  33 siblings, 1 reply; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Hopper and Blackwell GPUs use FSP-based secure boot and do not require
waiting for GFW_BOOT completion. Skip this step for these architectures.
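
A minimal userspace sketch of that conditional step follows. The
Architecture enum is a simplified stand-in, and needs_gfw_boot_wait()
and early_init() are hypothetical names; the wait itself is passed in
as a closure so that the example stays self-contained.

```rust
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Architecture {
    Turing,
    Ampere,
    Ada,
    Hopper,
    Blackwell,
}

/// Only the pre-Hopper architectures wait for GFW_BOOT in this sketch.
fn needs_gfw_boot_wait(arch: Architecture) -> bool {
    matches!(
        arch,
        Architecture::Turing | Architecture::Ampere | Architecture::Ada
    )
}

/// Runs `wait_gfw_boot` only when the architecture requires it.
fn early_init<F>(arch: Architecture, wait_gfw_boot: F) -> Result<(), &'static str>
where
    F: FnOnce() -> Result<(), &'static str>,
{
    if needs_gfw_boot_wait(arch) {
        wait_gfw_boot()?;
    }
    Ok(())
}
```

Listing the architectures that do wait, rather than those that skip,
mirrors the patch: an architecture not named in the list skips the
wait by default.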

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/gpu.rs | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 24feb0e8723e..f04e2a795e90 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -304,10 +304,19 @@ pub(crate) fn new<'a>(
         let chipset = spec.chipset();
 
         try_pin_init!(Self {
-            // We must wait for GFW_BOOT completion before doing any significant setup on the GPU.
+            // Turing, Ampere, Ada: we must wait for GFW_BOOT completion before doing any
+            // significant setup on the GPU.
+            //
+            // Hopper/Blackwell: skip GFW_BOOT completion waiting entirely, and use the simpler FSP
+            // Chain of Trust boot path (elsewhere) instead.
             _: {
-                gfw::wait_gfw_boot_completion(bar)
-                    .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
+                if matches!(
+                    chipset.arch(),
+                    Architecture::Turing | Architecture::Ampere | Architecture::Ada
+                ) {
+                    gfw::wait_gfw_boot_completion(bar)
+                        .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
+                }
             },
 
             sysmem_flush: SysmemFlush::register(pdev.as_ref(), bar, chipset)?,
-- 
2.53.0



* [PATCH v4 10/33] gpu: nova-core: move firmware image parsing code to firmware.rs
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (8 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 09/33] gpu: nova-core: Hopper/Blackwell: skip GFW boot waiting John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-10  2:45 ` [PATCH v4 11/33] gpu: nova-core: factor out a section_name_eq() function John Hubbard
                   ` (23 subsequent siblings)
  33 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Up until now, only the GSP required parsing of its firmware headers.
However, upcoming support for Hopper/Blackwell+ adds another firmware
image (FMC), along with another format (ELF32).

Therefore, the current ELF64 section parsing support needs to be moved
up a level, so that both of the above can use it.

There are no functional changes. This is pure code movement.

Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs     | 88 +++++++++++++++++++++++++
 drivers/gpu/nova-core/firmware/gsp.rs | 93 ++-------------------------
 2 files changed, 94 insertions(+), 87 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 68779540aa28..a0201ac8ccb4 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -437,3 +437,91 @@ pub(crate) const fn create(
         this.0
     }
 }
+
+/// Ad-hoc and temporary module to extract sections from ELF images.
+///
+/// Some firmware images are currently packaged as ELF files, where sections names are used as keys
+/// to specific and related bits of data. Future firmware versions are scheduled to move away from
+/// that scheme before nova-core becomes stable, which means this module will eventually be
+/// removed.
+mod elf {
+    use core::mem::size_of;
+
+    use kernel::{
+        bindings,
+        str::CStr,
+        transmute::FromBytes, //
+    };
+
+    /// Newtype to provide a [`FromBytes`] implementation.
+    #[repr(transparent)]
+    struct Elf64Hdr(bindings::elf64_hdr);
+    // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
+    unsafe impl FromBytes for Elf64Hdr {}
+
+    #[repr(transparent)]
+    struct Elf64SHdr(bindings::elf64_shdr);
+    // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
+    unsafe impl FromBytes for Elf64SHdr {}
+
+    /// Tries to extract section with name `name` from the ELF64 image `elf`, and returns it.
+    pub(super) fn elf64_section<'a, 'b>(elf: &'a [u8], name: &'b str) -> Option<&'a [u8]> {
+        let hdr = &elf
+            .get(0..size_of::<bindings::elf64_hdr>())
+            .and_then(Elf64Hdr::from_bytes)?
+            .0;
+
+        // Get all the section headers.
+        let mut shdr = {
+            let shdr_num = usize::from(hdr.e_shnum);
+            let shdr_start = usize::try_from(hdr.e_shoff).ok()?;
+            let shdr_end = shdr_num
+                .checked_mul(size_of::<Elf64SHdr>())
+                .and_then(|v| v.checked_add(shdr_start))?;
+
+            elf.get(shdr_start..shdr_end)
+                .map(|slice| slice.chunks_exact(size_of::<Elf64SHdr>()))?
+        };
+
+        // Get the strings table.
+        let strhdr = shdr
+            .clone()
+            .nth(usize::from(hdr.e_shstrndx))
+            .and_then(Elf64SHdr::from_bytes)?;
+
+        // Find the section which name matches `name` and return it.
+        shdr.find(|&sh| {
+            let Some(hdr) = Elf64SHdr::from_bytes(sh) else {
+                return false;
+            };
+
+            let Some(name_idx) = strhdr
+                .0
+                .sh_offset
+                .checked_add(u64::from(hdr.0.sh_name))
+                .and_then(|idx| usize::try_from(idx).ok())
+            else {
+                return false;
+            };
+
+            // Get the start of the name.
+            elf.get(name_idx..)
+                .and_then(|nstr| CStr::from_bytes_until_nul(nstr).ok())
+                // Convert into str.
+                .and_then(|c_str| c_str.to_str().ok())
+                // Check that the name matches.
+                .map(|str| str == name)
+                .unwrap_or(false)
+        })
+        // Return the slice containing the section.
+        .and_then(|sh| {
+            let hdr = Elf64SHdr::from_bytes(sh)?;
+            let start = usize::try_from(hdr.0.sh_offset).ok()?;
+            let end = usize::try_from(hdr.0.sh_size)
+                .ok()
+                .and_then(|sh_size| start.checked_add(sh_size))?;
+
+            elf.get(start..end)
+        })
+    }
+}
diff --git a/drivers/gpu/nova-core/firmware/gsp.rs b/drivers/gpu/nova-core/firmware/gsp.rs
index 10761716ed93..173b16cdfb16 100644
--- a/drivers/gpu/nova-core/firmware/gsp.rs
+++ b/drivers/gpu/nova-core/firmware/gsp.rs
@@ -1,5 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 
+use core::mem::size_of_val;
+
 use kernel::{
     device,
     dma::{
@@ -16,7 +18,10 @@
 
 use crate::{
     dma::DmaObject,
-    firmware::riscv::RiscvFirmware,
+    firmware::{
+        elf,
+        riscv::RiscvFirmware, //
+    },
     gpu::{
         Architecture,
         Chipset, //
@@ -25,92 +30,6 @@
     num::FromSafeCast,
 };
 
-/// Ad-hoc and temporary module to extract sections from ELF images.
-///
-/// Some firmware images are currently packaged as ELF files, where sections names are used as keys
-/// to specific and related bits of data. Future firmware versions are scheduled to move away from
-/// that scheme before nova-core becomes stable, which means this module will eventually be
-/// removed.
-mod elf {
-    use kernel::{
-        bindings,
-        prelude::*,
-        transmute::FromBytes, //
-    };
-
-    /// Newtype to provide a [`FromBytes`] implementation.
-    #[repr(transparent)]
-    struct Elf64Hdr(bindings::elf64_hdr);
-    // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
-    unsafe impl FromBytes for Elf64Hdr {}
-
-    #[repr(transparent)]
-    struct Elf64SHdr(bindings::elf64_shdr);
-    // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
-    unsafe impl FromBytes for Elf64SHdr {}
-
-    /// Tries to extract section with name `name` from the ELF64 image `elf`, and returns it.
-    pub(super) fn elf64_section<'a, 'b>(elf: &'a [u8], name: &'b str) -> Option<&'a [u8]> {
-        let hdr = &elf
-            .get(0..size_of::<bindings::elf64_hdr>())
-            .and_then(Elf64Hdr::from_bytes)?
-            .0;
-
-        // Get all the section headers.
-        let mut shdr = {
-            let shdr_num = usize::from(hdr.e_shnum);
-            let shdr_start = usize::try_from(hdr.e_shoff).ok()?;
-            let shdr_end = shdr_num
-                .checked_mul(size_of::<Elf64SHdr>())
-                .and_then(|v| v.checked_add(shdr_start))?;
-
-            elf.get(shdr_start..shdr_end)
-                .map(|slice| slice.chunks_exact(size_of::<Elf64SHdr>()))?
-        };
-
-        // Get the strings table.
-        let strhdr = shdr
-            .clone()
-            .nth(usize::from(hdr.e_shstrndx))
-            .and_then(Elf64SHdr::from_bytes)?;
-
-        // Find the section which name matches `name` and return it.
-        shdr.find(|&sh| {
-            let Some(hdr) = Elf64SHdr::from_bytes(sh) else {
-                return false;
-            };
-
-            let Some(name_idx) = strhdr
-                .0
-                .sh_offset
-                .checked_add(u64::from(hdr.0.sh_name))
-                .and_then(|idx| usize::try_from(idx).ok())
-            else {
-                return false;
-            };
-
-            // Get the start of the name.
-            elf.get(name_idx..)
-                .and_then(|nstr| CStr::from_bytes_until_nul(nstr).ok())
-                // Convert into str.
-                .and_then(|c_str| c_str.to_str().ok())
-                // Check that the name matches.
-                .map(|str| str == name)
-                .unwrap_or(false)
-        })
-        // Return the slice containing the section.
-        .and_then(|sh| {
-            let hdr = Elf64SHdr::from_bytes(sh)?;
-            let start = usize::try_from(hdr.0.sh_offset).ok()?;
-            let end = usize::try_from(hdr.0.sh_size)
-                .ok()
-                .and_then(|sh_size| start.checked_add(sh_size))?;
-
-            elf.get(start..end)
-        })
-    }
-}
-
 /// GSP firmware with 3-level radix page tables for the GSP bootloader.
 ///
 /// The bootloader expects firmware to be mapped starting at address 0 in GSP's virtual address
-- 
2.53.0



* [PATCH v4 11/33] gpu: nova-core: factor out a section_name_eq() function
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (9 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 10/33] gpu: nova-core: move firmware image parsing code to firmware.rs John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-10  2:45 ` [PATCH v4 12/33] gpu: nova-core: don't assume 64-bit firmware images John Hubbard
                   ` (22 subsequent siblings)
  33 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Factor the section-name string lookup out of elf64_section() into a new
helper, elf_str(). This is an incremental step in adding ELF32 support
to the existing ELF64 section support, for handling GPU firmware.
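
The string lookup at the heart of this change can be tried out in
userspace with std's CStr. The sketch below is an approximation under
stated assumptions: it uses std::ffi::CStr rather than the kernel's
CStr, and name_matches() is a hypothetical wrapper added for the
example only. Every step is bounds-checked, so a corrupt offset yields
None instead of a panic.

```rust
use std::convert::TryFrom;
use std::ffi::CStr;

/// Returns the NUL-terminated UTF-8 string starting at `offset` in `image`.
fn elf_str(image: &[u8], offset: u64) -> Option<&str> {
    let idx = usize::try_from(offset).ok()?;
    // `get` returns None when `idx` lies past the end of the image.
    let bytes = image.get(idx..)?;
    CStr::from_bytes_until_nul(bytes).ok()?.to_str().ok()
}

/// Hypothetical helper: compares the name at `strtab_offset + sh_name` to `wanted`.
fn name_matches(image: &[u8], strtab_offset: u64, sh_name: u32, wanted: &str) -> bool {
    strtab_offset
        .checked_add(u64::from(sh_name))
        .and_then(|offset| elf_str(image, offset))
        .map_or(false, |name| name == wanted)
}
```

For a string table such as b"\0.text\0.fwsignature_ga10x\0", offset 1
resolves to ".text" and offset 7 to ".fwsignature_ga10x", while an
offset beyond the table resolves to None.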

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs | 40 ++++++++++++-------------------
 1 file changed, 15 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index a0201ac8ccb4..72cefc3142ea 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -464,6 +464,13 @@ unsafe impl FromBytes for Elf64Hdr {}
     // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
     unsafe impl FromBytes for Elf64SHdr {}
 
+    /// Returns a NUL-terminated string from the ELF image at `offset`.
+    fn elf_str(elf: &[u8], offset: u64) -> Option<&str> {
+        let idx = usize::try_from(offset).ok()?;
+        let bytes = elf.get(idx..)?;
+        CStr::from_bytes_until_nul(bytes).ok()?.to_str().ok()
+    }
+
     /// Tries to extract section with name `name` from the ELF64 image `elf`, and returns it.
     pub(super) fn elf64_section<'a, 'b>(elf: &'a [u8], name: &'b str) -> Option<&'a [u8]> {
         let hdr = &elf
@@ -490,32 +497,15 @@ pub(super) fn elf64_section<'a, 'b>(elf: &'a [u8], name: &'b str) -> Option<&'a
             .and_then(Elf64SHdr::from_bytes)?;
 
         // Find the section which name matches `name` and return it.
-        shdr.find(|&sh| {
-            let Some(hdr) = Elf64SHdr::from_bytes(sh) else {
-                return false;
-            };
-
-            let Some(name_idx) = strhdr
-                .0
-                .sh_offset
-                .checked_add(u64::from(hdr.0.sh_name))
-                .and_then(|idx| usize::try_from(idx).ok())
-            else {
-                return false;
-            };
-
-            // Get the start of the name.
-            elf.get(name_idx..)
-                .and_then(|nstr| CStr::from_bytes_until_nul(nstr).ok())
-                // Convert into str.
-                .and_then(|c_str| c_str.to_str().ok())
-                // Check that the name matches.
-                .map(|str| str == name)
-                .unwrap_or(false)
-        })
-        // Return the slice containing the section.
-        .and_then(|sh| {
+        shdr.find_map(|sh| {
             let hdr = Elf64SHdr::from_bytes(sh)?;
+            let name_offset = strhdr.0.sh_offset.checked_add(u64::from(hdr.0.sh_name))?;
+            let section_name = elf_str(elf, name_offset)?;
+
+            if section_name != name {
+                return None;
+            }
+
             let start = usize::try_from(hdr.0.sh_offset).ok()?;
             let end = usize::try_from(hdr.0.sh_size)
                 .ok()
-- 
2.53.0



* [PATCH v4 12/33] gpu: nova-core: don't assume 64-bit firmware images
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (10 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 11/33] gpu: nova-core: factor out a section_name_eq() function John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-10  2:45 ` [PATCH v4 13/33] gpu: nova-core: add support for 32-bit " John Hubbard
                   ` (21 subsequent siblings)
  33 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Add ElfHeader and ElfSectionHeader traits to abstract out differences
between ELF32 and ELF64. Implement these for ELF64.

This is in preparation for upcoming ELF32 section support, and for
auto-selecting ELF32 or ELF64.
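
The idea of hiding the field widths behind a trait can be sketched as
follows. This is a toy illustration under stated assumptions: Shdr32 and
Shdr64 use a made-up two-field little-endian layout (not the real ELF
section header), and parse() stands in for the kernel's FromBytes.

```rust
use std::convert::TryFrom;

/// Reads a little-endian u32 at byte position `at`, if in range.
fn le_u32(b: &[u8], at: usize) -> Option<u32> {
    let s = b.get(at..at.checked_add(4)?)?;
    Some(u32::from_le_bytes([s[0], s[1], s[2], s[3]]))
}

/// Reads a little-endian u64 at byte position `at`, if in range.
fn le_u64(b: &[u8], at: usize) -> Option<u64> {
    let s = b.get(at..at.checked_add(8)?)?;
    Some(u64::from_le_bytes([s[0], s[1], s[2], s[3], s[4], s[5], s[6], s[7]]))
}

/// Accessors shared by both widths; narrower fields are widened to u64.
trait SectionHeader: Sized {
    /// Size in bytes of the encoded header (toy layout).
    const SIZE: usize;
    fn parse(bytes: &[u8]) -> Option<Self>;
    fn offset(&self) -> u64;
    fn size(&self) -> u64;
}

struct Shdr32 { offset: u32, size: u32 }
struct Shdr64 { offset: u64, size: u64 }

impl SectionHeader for Shdr32 {
    const SIZE: usize = 8;
    fn parse(bytes: &[u8]) -> Option<Self> {
        Some(Shdr32 { offset: le_u32(bytes, 0)?, size: le_u32(bytes, 4)? })
    }
    fn offset(&self) -> u64 { u64::from(self.offset) }
    fn size(&self) -> u64 { u64::from(self.size) }
}

impl SectionHeader for Shdr64 {
    const SIZE: usize = 16;
    fn parse(bytes: &[u8]) -> Option<Self> {
        Some(Shdr64 { offset: le_u64(bytes, 0)?, size: le_u64(bytes, 8)? })
    }
    fn offset(&self) -> u64 { self.offset }
    fn size(&self) -> u64 { self.size }
}

/// Width-independent: parses the header at `hdr_at` and returns the bytes
/// it describes, using checked arithmetic so bad input yields `None`.
fn section_bytes<S: SectionHeader>(image: &[u8], hdr_at: usize) -> Option<&[u8]> {
    let hdr = S::parse(image.get(hdr_at..hdr_at.checked_add(S::SIZE)?)?)?;
    let start = usize::try_from(hdr.offset()).ok()?;
    let end = start.checked_add(usize::try_from(hdr.size()).ok()?)?;
    image.get(start..end)
}
```

The generic function never names a concrete width; callers select one
with a turbofish such as section_bytes::<Shdr32>(image, 0).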

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs | 99 ++++++++++++++++++++++---------
 1 file changed, 72 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 72cefc3142ea..6ed76a7e15f1 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -453,17 +453,60 @@ mod elf {
         transmute::FromBytes, //
     };
 
-    /// Newtype to provide a [`FromBytes`] implementation.
+    /// Trait to abstract over ELF header differences (32-bit vs 64-bit).
+    trait ElfHeader: FromBytes {
+        fn shnum(&self) -> u16;
+        fn shoff(&self) -> u64;
+        fn shstrndx(&self) -> u16;
+    }
+
+    /// Trait to abstract over ELF section header differences (32-bit vs 64-bit).
+    trait ElfSectionHeader: FromBytes {
+        fn name(&self) -> u32;
+        fn offset(&self) -> u64;
+        fn size(&self) -> u64;
+    }
+
+    /// Newtype to provide [`FromBytes`] and [`ElfHeader`] implementations.
     #[repr(transparent)]
     struct Elf64Hdr(bindings::elf64_hdr);
     // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
     unsafe impl FromBytes for Elf64Hdr {}
 
+    impl ElfHeader for Elf64Hdr {
+        fn shnum(&self) -> u16 {
+            self.0.e_shnum
+        }
+
+        fn shoff(&self) -> u64 {
+            self.0.e_shoff
+        }
+
+        fn shstrndx(&self) -> u16 {
+            self.0.e_shstrndx
+        }
+    }
+
+    /// Newtype to provide [`FromBytes`] and [`ElfSectionHeader`] implementations.
     #[repr(transparent)]
     struct Elf64SHdr(bindings::elf64_shdr);
     // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
     unsafe impl FromBytes for Elf64SHdr {}
 
+    impl ElfSectionHeader for Elf64SHdr {
+        fn name(&self) -> u32 {
+            self.0.sh_name
+        }
+
+        fn offset(&self) -> u64 {
+            self.0.sh_offset
+        }
+
+        fn size(&self) -> u64 {
+            self.0.sh_size
+        }
+    }
+
     /// Returns a NULL-terminated string from the ELF image at `offset`.
     fn elf_str(elf: &[u8], offset: u64) -> Option<&str> {
         let idx = usize::try_from(offset).ok()?;
@@ -471,47 +514,49 @@ fn elf_str(elf: &[u8], offset: u64) -> Option<&str> {
         CStr::from_bytes_until_nul(bytes).ok()?.to_str().ok()
     }
 
-    /// Tries to extract section with name `name` from the ELF64 image `elf`, and returns it.
-    pub(super) fn elf64_section<'a, 'b>(elf: &'a [u8], name: &'b str) -> Option<&'a [u8]> {
-        let hdr = &elf
-            .get(0..size_of::<bindings::elf64_hdr>())
-            .and_then(Elf64Hdr::from_bytes)?
-            .0;
+    fn elf_section_generic<'a, H, S>(elf: &'a [u8], name: &str) -> Option<&'a [u8]>
+    where
+        H: ElfHeader,
+        S: ElfSectionHeader,
+    {
+        let hdr = H::from_bytes(elf.get(0..size_of::<H>())?)?;
 
-        // Get all the section headers.
-        let mut shdr = {
-            let shdr_num = usize::from(hdr.e_shnum);
-            let shdr_start = usize::try_from(hdr.e_shoff).ok()?;
-            let shdr_end = shdr_num
-                .checked_mul(size_of::<Elf64SHdr>())
-                .and_then(|v| v.checked_add(shdr_start))?;
+        let shdr_num = usize::from(hdr.shnum());
+        let shdr_start = usize::try_from(hdr.shoff()).ok()?;
+        let shdr_end = shdr_num
+            .checked_mul(size_of::<S>())
+            .and_then(|v| v.checked_add(shdr_start))?;
 
-            elf.get(shdr_start..shdr_end)
-                .map(|slice| slice.chunks_exact(size_of::<Elf64SHdr>()))?
-        };
+        // Get all the section headers as an iterator over byte chunks.
+        let shdr_bytes = elf.get(shdr_start..shdr_end)?;
+        let mut shdr_iter = shdr_bytes.chunks_exact(size_of::<S>());
 
         // Get the strings table.
-        let strhdr = shdr
+        let strhdr = shdr_iter
             .clone()
-            .nth(usize::from(hdr.e_shstrndx))
-            .and_then(Elf64SHdr::from_bytes)?;
+            .nth(usize::from(hdr.shstrndx()))
+            .and_then(S::from_bytes)?;
 
         // Find the section which name matches `name` and return it.
-        shdr.find_map(|sh| {
-            let hdr = Elf64SHdr::from_bytes(sh)?;
-            let name_offset = strhdr.0.sh_offset.checked_add(u64::from(hdr.0.sh_name))?;
+        shdr_iter.find_map(|sh_bytes| {
+            let sh = S::from_bytes(sh_bytes)?;
+            let name_offset = strhdr.offset().checked_add(u64::from(sh.name()))?;
             let section_name = elf_str(elf, name_offset)?;
 
             if section_name != name {
                 return None;
             }
 
-            let start = usize::try_from(hdr.0.sh_offset).ok()?;
-            let end = usize::try_from(hdr.0.sh_size)
+            let start = usize::try_from(sh.offset()).ok()?;
+            let end = usize::try_from(sh.size())
                 .ok()
-                .and_then(|sh_size| start.checked_add(sh_size))?;
-
+                .and_then(|sz| start.checked_add(sz))?;
             elf.get(start..end)
         })
     }
+
+    /// Extract the section with name `name` from the ELF64 image `elf`.
+    pub(super) fn elf64_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
+        elf_section_generic::<Elf64Hdr, Elf64SHdr>(elf, name)
+    }
 }
-- 
2.53.0



* [PATCH v4 13/33] gpu: nova-core: add support for 32-bit firmware images
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (11 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 12/33] gpu: nova-core: don't assume 64-bit firmware images John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-10  2:45 ` [PATCH v4 14/33] gpu: nova-core: add auto-detection of 32-bit, 64-bit " John Hubbard
                   ` (20 subsequent siblings)
  33 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Add Elf32Hdr and Elf32SHdr newtypes, implement the ElfHeader and
ElfSectionHeader traits for them, and add elf32_section().

This mirrors the existing ELF64 support, using the same generic
infrastructure.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs | 46 +++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 6ed76a7e15f1..5f3f878eef71 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -507,6 +507,46 @@ fn size(&self) -> u64 {
         }
     }
 
+    /// Newtype to provide [`FromBytes`] and [`ElfHeader`] implementations for ELF32.
+    #[repr(transparent)]
+    struct Elf32Hdr(bindings::elf32_hdr);
+    // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
+    unsafe impl FromBytes for Elf32Hdr {}
+
+    impl ElfHeader for Elf32Hdr {
+        fn shnum(&self) -> u16 {
+            self.0.e_shnum
+        }
+
+        fn shoff(&self) -> u64 {
+            u64::from(self.0.e_shoff)
+        }
+
+        fn shstrndx(&self) -> u16 {
+            self.0.e_shstrndx
+        }
+    }
+
+    /// Newtype to provide [`FromBytes`] and [`ElfSectionHeader`] implementations for ELF32.
+    #[repr(transparent)]
+    struct Elf32SHdr(bindings::elf32_shdr);
+    // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
+    unsafe impl FromBytes for Elf32SHdr {}
+
+    impl ElfSectionHeader for Elf32SHdr {
+        fn name(&self) -> u32 {
+            self.0.sh_name
+        }
+
+        fn offset(&self) -> u64 {
+            u64::from(self.0.sh_offset)
+        }
+
+        fn size(&self) -> u64 {
+            u64::from(self.0.sh_size)
+        }
+    }
+
     /// Returns a NULL-terminated string from the ELF image at `offset`.
     fn elf_str(elf: &[u8], offset: u64) -> Option<&str> {
         let idx = usize::try_from(offset).ok()?;
@@ -559,4 +599,10 @@ fn elf_section_generic<'a, H, S>(elf: &'a [u8], name: &str) -> Option<&'a [u8]>
     pub(super) fn elf64_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
         elf_section_generic::<Elf64Hdr, Elf64SHdr>(elf, name)
     }
+
+    /// Extract section with name `name` from the ELF32 image `elf`.
+    #[expect(dead_code)]
+    pub(super) fn elf32_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
+        elf_section_generic::<Elf32Hdr, Elf32SHdr>(elf, name)
+    }
 }
-- 
2.53.0



* [PATCH v4 14/33] gpu: nova-core: add auto-detection of 32-bit, 64-bit firmware images
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (12 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 13/33] gpu: nova-core: add support for 32-bit " John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-10  2:45 ` [PATCH v4 15/33] gpu: nova-core: Hopper/Blackwell: add FMC firmware image, in support of FSP John Hubbard
                   ` (19 subsequent siblings)
  33 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Add elf_section() which checks the ELF magic and class byte to
automatically dispatch to elf32_section() or elf64_section().

Update existing callers to use elf_section() instead of calling
elf64_section() directly.
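
The detection step on its own can be sketched like this. ElfClass and
elf_class() are illustrative names that do not appear in the patch; the
byte values follow the ELF identification header, where the first four
bytes are the magic and byte 4 is the class (1 = 32-bit, 2 = 64-bit).

```rust
/// Word size announced by the ELF identification bytes.
#[derive(Debug, PartialEq, Eq)]
enum ElfClass {
    Elf32,
    Elf64,
}

/// Returns the ELF class of `image`, or `None` if the magic is wrong,
/// the image is shorter than 5 bytes, or the class byte is unknown.
fn elf_class(image: &[u8]) -> Option<ElfClass> {
    // `get` already returns None for short input, so no separate
    // length check is needed in this sketch.
    if image.get(0..4)? != b"\x7fELF" {
        return None;
    }

    match *image.get(4)? {
        1 => Some(ElfClass::Elf32),
        2 => Some(ElfClass::Elf64),
        _ => None,
    }
}
```

A caller would then match on the result and hand the image to the
32-bit or 64-bit section parser, as elf_section() does.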

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs     | 20 +++++++++++++++++---
 drivers/gpu/nova-core/firmware/gsp.rs |  4 ++--
 2 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 5f3f878eef71..43a4e70aeedc 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -596,13 +596,27 @@ fn elf_section_generic<'a, H, S>(elf: &'a [u8], name: &str) -> Option<&'a [u8]>
     }
 
     /// Extract the section with name `name` from the ELF64 image `elf`.
-    pub(super) fn elf64_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
+    fn elf64_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
         elf_section_generic::<Elf64Hdr, Elf64SHdr>(elf, name)
     }
 
     /// Extract section with name `name` from the ELF32 image `elf`.
-    #[expect(dead_code)]
-    pub(super) fn elf32_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
+    fn elf32_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
         elf_section_generic::<Elf32Hdr, Elf32SHdr>(elf, name)
     }
+
+    /// Automatically detects ELF32 vs ELF64 based on the ELF header.
+    pub(super) fn elf_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
+        // Check ELF magic.
+        if elf.len() < 5 || elf.get(0..4)? != b"\x7fELF" {
+            return None;
+        }
+
+        // Check ELF class: 1 = 32-bit, 2 = 64-bit.
+        match elf.get(4)? {
+            1 => elf32_section(elf, name),
+            2 => elf64_section(elf, name),
+            _ => None,
+        }
+    }
 }
diff --git a/drivers/gpu/nova-core/firmware/gsp.rs b/drivers/gpu/nova-core/firmware/gsp.rs
index 173b16cdfb16..f100b5675b7e 100644
--- a/drivers/gpu/nova-core/firmware/gsp.rs
+++ b/drivers/gpu/nova-core/firmware/gsp.rs
@@ -105,7 +105,7 @@ pub(crate) fn new<'a>(
         pin_init::pin_init_scope(move || {
             let firmware = super::request_firmware(dev, chipset, "gsp", ver)?;
 
-            let fw_section = elf::elf64_section(firmware.data(), ".fwimage").ok_or(EINVAL)?;
+            let fw_section = elf::elf_section(firmware.data(), ".fwimage").ok_or(EINVAL)?;
 
             let size = fw_section.len();
 
@@ -162,7 +162,7 @@ pub(crate) fn new<'a>(
                 signatures: {
                     let sigs_section = Self::get_gsp_sigs_section(chipset).ok_or(ENOTSUPP)?;
 
-                    elf::elf64_section(firmware.data(), sigs_section)
+                    elf::elf_section(firmware.data(), sigs_section)
                         .ok_or(EINVAL)
                         .and_then(|data| DmaObject::from_data(dev, data))?
                 },
-- 
2.53.0



* [PATCH v4 15/33] gpu: nova-core: Hopper/Blackwell: add FMC firmware image, in support of FSP
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (13 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 14/33] gpu: nova-core: add auto-detection of 32-bit, 64-bit " John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-10  2:45 ` [PATCH v4 16/33] gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub John Hubbard
                   ` (18 subsequent siblings)
  33 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

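FSP is a hardware unit that runs FMC firmware. Add FspFirmware, which
requests the "fmc" firmware file and keeps both its "image" ELF section
and the full ELF file (for signature extraction) as DMA objects.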

Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs     |  1 +
 drivers/gpu/nova-core/firmware/fsp.rs | 44 +++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)
 create mode 100644 drivers/gpu/nova-core/firmware/fsp.rs

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 43a4e70aeedc..9a9b969aaf79 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -28,6 +28,7 @@
 };
 
 pub(crate) mod booter;
+pub(crate) mod fsp;
 pub(crate) mod fwsec;
 pub(crate) mod gsp;
 pub(crate) mod riscv;
diff --git a/drivers/gpu/nova-core/firmware/fsp.rs b/drivers/gpu/nova-core/firmware/fsp.rs
new file mode 100644
index 000000000000..80401b964488
--- /dev/null
+++ b/drivers/gpu/nova-core/firmware/fsp.rs
@@ -0,0 +1,44 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! FSP is a hardware unit that runs FMC firmware.
+
+use kernel::{
+    device,
+    prelude::*, //
+};
+
+use crate::{
+    dma::DmaObject,
+    firmware::elf,
+    gpu::Chipset, //
+};
+
+#[expect(unused)]
+pub(crate) struct FspFirmware {
+    /// FMC firmware image data (only the .image section)
+    fmc_image: DmaObject,
+    /// Full FMC ELF data (for signature extraction)
+    fmc_full: DmaObject,
+}
+
+impl FspFirmware {
+    #[expect(unused)]
+    pub(crate) fn new(
+        dev: &device::Device<device::Bound>,
+        chipset: Chipset,
+        ver: &str,
+    ) -> Result<Self> {
+        let fw = super::request_firmware(dev, chipset, "fmc", ver)?;
+
+        // FSP expects only the .image section, not the entire ELF file
+        let fmc_image_data = elf::elf_section(fw.data(), "image").ok_or_else(|| {
+            dev_err!(dev, "FMC ELF file missing 'image' section\n");
+            EINVAL
+        })?;
+
+        Ok(Self {
+            fmc_image: DmaObject::from_data(dev, fmc_image_data)?,
+            fmc_full: DmaObject::from_data(dev, fw.data())?,
+        })
+    }
+}
-- 
2.53.0



* [PATCH v4 16/33] gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (14 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 15/33] gpu: nova-core: Hopper/Blackwell: add FMC firmware image, in support of FSP John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-10  2:45 ` [PATCH v4 17/33] gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations John Hubbard
                   ` (17 subsequent siblings)
  33 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Add the FSP (Firmware System Processor) falcon engine type that will
handle secure boot and Chain of Trust operations on Hopper and Blackwell
architectures.

The FSP falcon replaces SEC2's role in the boot sequence for these newer
architectures. This initial stub just defines the falcon type and its
base address.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/falcon.rs     |  1 +
 drivers/gpu/nova-core/falcon/fsp.rs | 31 +++++++++++++++++++++++++++++
 2 files changed, 32 insertions(+)
 create mode 100644 drivers/gpu/nova-core/falcon/fsp.rs

diff --git a/drivers/gpu/nova-core/falcon.rs b/drivers/gpu/nova-core/falcon.rs
index 37bfee1d0949..a0cfb4442df1 100644
--- a/drivers/gpu/nova-core/falcon.rs
+++ b/drivers/gpu/nova-core/falcon.rs
@@ -33,6 +33,7 @@
     regs::macros::RegisterBase, //
 };
 
+pub(crate) mod fsp;
 pub(crate) mod gsp;
 mod hal;
 pub(crate) mod sec2;
diff --git a/drivers/gpu/nova-core/falcon/fsp.rs b/drivers/gpu/nova-core/falcon/fsp.rs
new file mode 100644
index 000000000000..cc3fc3cf2f6a
--- /dev/null
+++ b/drivers/gpu/nova-core/falcon/fsp.rs
@@ -0,0 +1,31 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! FSP (Firmware System Processor) falcon engine for Hopper/Blackwell GPUs.
+//!
+//! The FSP falcon handles secure boot and Chain of Trust operations
+//! on Hopper and Blackwell architectures, replacing SEC2's role.
+
+use crate::{
+    falcon::{
+        FalconEngine,
+        PFalcon2Base,
+        PFalconBase, //
+    },
+    regs::macros::RegisterBase,
+};
+
+/// Type specifying the `Fsp` falcon engine. Cannot be instantiated.
+pub(crate) struct Fsp(());
+
+impl RegisterBase<PFalconBase> for Fsp {
+    // FSP falcon base address for Blackwell
+    const BASE: usize = 0x8f2000;
+}
+
+impl RegisterBase<PFalcon2Base> for Fsp {
+    const BASE: usize = 0x8f3000;
+}
+
+impl FalconEngine for Fsp {
+    const ID: Self = Fsp(());
+}
-- 
2.53.0



* [PATCH v4 17/33] gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (15 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 16/33] gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-11 10:57   ` Danilo Krummrich
  2026-02-10  2:45 ` [PATCH v4 18/33] gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure John Hubbard
                   ` (16 subsequent siblings)
  33 siblings, 1 reply; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Add external memory (EMEM) read/write operations to the GPU's FSP falcon
engine. These operations use Falcon PIO (Programmed I/O) to communicate
with the FSP through indirect memory access.
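
The access pattern can be tried out against a mock. This is a sketch
under assumptions: MockEmem is a hypothetical in-memory stand-in for the
EMEM control and data registers, and it assumes the data port advances
the offset by 4 after each access, which is what programming the offset
once and then looping on the data register suggests.

```rust
/// Hypothetical stand-in for the EMEM window (not the real hardware).
struct MockEmem {
    mem: Vec<u8>,
    cursor: usize,
}

impl MockEmem {
    fn new(size: usize) -> Self {
        MockEmem { mem: vec![0; size], cursor: 0 }
    }

    /// Models writing the offset field of the control register.
    fn set_offset(&mut self, offset: u32) {
        self.cursor = offset as usize;
    }

    /// Models one data-register write; assumed to auto-increment.
    /// Panics if the cursor runs past the mock's memory.
    fn write_data(&mut self, word: u32) {
        self.mem[self.cursor..self.cursor + 4].copy_from_slice(&word.to_le_bytes());
        self.cursor += 4;
    }

    /// Models one data-register read; assumed to auto-increment.
    fn read_data(&mut self) -> u32 {
        let b = &self.mem[self.cursor..self.cursor + 4];
        let word = u32::from_le_bytes([b[0], b[1], b[2], b[3]]);
        self.cursor += 4;
        word
    }
}

/// Error for an offset or length that is not a multiple of 4.
#[derive(Debug, PartialEq)]
struct Misaligned;

/// Packs `data` into little-endian 32-bit words and writes them out.
fn write_emem(emem: &mut MockEmem, offset: u32, data: &[u8]) -> Result<(), Misaligned> {
    if offset % 4 != 0 || data.len() % 4 != 0 {
        return Err(Misaligned);
    }
    emem.set_offset(offset);
    for chunk in data.chunks_exact(4) {
        emem.write_data(u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]));
    }
    Ok(())
}

/// Reads 32-bit words and unpacks them into `out` as little-endian bytes.
fn read_emem(emem: &mut MockEmem, offset: u32, out: &mut [u8]) -> Result<(), Misaligned> {
    if offset % 4 != 0 || out.len() % 4 != 0 {
        return Err(Misaligned);
    }
    emem.set_offset(offset);
    for chunk in out.chunks_exact_mut(4) {
        chunk.copy_from_slice(&emem.read_data().to_le_bytes());
    }
    Ok(())
}
```

Writing eight bytes at offset 4 and reading them back from the same
offset returns the same bytes, while a misaligned offset or length is
rejected before any register access.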

Cc: Gary Guo <gary@garyguo.net>
Cc: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/falcon/fsp.rs | 59 ++++++++++++++++++++++++++++-
 drivers/gpu/nova-core/regs.rs       | 13 +++++++
 2 files changed, 71 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/nova-core/falcon/fsp.rs b/drivers/gpu/nova-core/falcon/fsp.rs
index cc3fc3cf2f6a..fb1c8c89d2ff 100644
--- a/drivers/gpu/nova-core/falcon/fsp.rs
+++ b/drivers/gpu/nova-core/falcon/fsp.rs
@@ -5,13 +5,20 @@
 //! The FSP falcon handles secure boot and Chain of Trust operations
 //! on Hopper and Blackwell architectures, replacing SEC2's role.
 
+use kernel::prelude::*;
+
 use crate::{
+    driver::Bar0,
     falcon::{
+        Falcon,
         FalconEngine,
         PFalcon2Base,
         PFalconBase, //
     },
-    regs::macros::RegisterBase,
+    regs::{
+        self,
+        macros::RegisterBase, //
+    },
 };
 
 /// Type specifying the `Fsp` falcon engine. Cannot be instantiated.
@@ -29,3 +36,53 @@ impl RegisterBase<PFalcon2Base> for Fsp {
 impl FalconEngine for Fsp {
     const ID: Self = Fsp(());
 }
+
+impl Falcon<Fsp> {
+    /// Writes `data` to FSP external memory at byte `offset` using Falcon PIO.
+    ///
+    /// Returns `EINVAL` if offset or data length is not 4-byte aligned.
+    #[expect(unused)]
+    pub(crate) fn write_emem(&self, bar: &Bar0, offset: u32, data: &[u8]) -> Result {
+        // TODO: replace with `is_multiple_of` once the MSRV is >= 1.82.
+        if offset % 4 != 0 || data.len() % 4 != 0 {
+            return Err(EINVAL);
+        }
+
+        regs::NV_PFALCON_FALCON_EMEM_CTL::default()
+            .set_wr_mode(true)
+            .set_offset(offset)
+            .write(bar, &Fsp::ID);
+
+        for chunk in data.chunks_exact(4) {
+            let word = u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
+            regs::NV_PFALCON_FALCON_EMEM_DATA::default()
+                .set_data(word)
+                .write(bar, &Fsp::ID);
+        }
+
+        Ok(())
+    }
+
+    /// Reads FSP external memory at byte `offset` into `data` using Falcon PIO.
+    ///
+    /// Returns `EINVAL` if offset or data length is not 4-byte aligned.
+    #[expect(unused)]
+    pub(crate) fn read_emem(&self, bar: &Bar0, offset: u32, data: &mut [u8]) -> Result {
+        // TODO: replace with `is_multiple_of` once the MSRV is >= 1.82.
+        if offset % 4 != 0 || data.len() % 4 != 0 {
+            return Err(EINVAL);
+        }
+
+        regs::NV_PFALCON_FALCON_EMEM_CTL::default()
+            .set_rd_mode(true)
+            .set_offset(offset)
+            .write(bar, &Fsp::ID);
+
+        for chunk in data.chunks_exact_mut(4) {
+            let word = regs::NV_PFALCON_FALCON_EMEM_DATA::read(bar, &Fsp::ID).data();
+            chunk.copy_from_slice(&word.to_le_bytes());
+        }
+
+        Ok(())
+    }
+}
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index ea0d32f5396c..1ae57cc42a9f 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -8,6 +8,7 @@
 pub(crate) mod macros;
 
 use kernel::{
+    io::Io,
     prelude::*,
     time, //
 };
@@ -431,6 +432,18 @@ pub(crate) fn reset_engine<E: FalconEngine>(bar: &Bar0) {
     8:8     br_fetch as bool;
 });
 
+// GP102 EMEM PIO registers (used by FSP for Hopper/Blackwell)
+// These registers provide falcon external memory communication interface
+register!(NV_PFALCON_FALCON_EMEM_CTL @ PFalconBase[0x00000ac0] {
+    23:0    offset as u32;      // EMEM byte offset (must be 4-byte aligned)
+    24:24   wr_mode as bool;    // Write mode
+    25:25   rd_mode as bool;    // Read mode
+});
+
+register!(NV_PFALCON_FALCON_EMEM_DATA @ PFalconBase[0x00000ac4] {
+    31:0    data as u32;        // EMEM data register
+});
+
 // The modules below provide registers that are not identical on all supported chips. They should
 // only be used in HAL modules.
 
-- 
2.53.0



* [PATCH v4 18/33] gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (16 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 17/33] gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-17 16:28   ` Danilo Krummrich
  2026-02-10  2:45 ` [PATCH v4 19/33] gpu: nova-core: Hopper/Blackwell: calculate reserved FB heap size John Hubbard
                   ` (15 subsequent siblings)
  33 siblings, 1 reply; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

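Add the FSP messaging infrastructure needed for Chain of Trust
communication on Hopper/Blackwell GPUs: poll_msgq(), send_msg() and
recv_msg() on Falcon<Fsp>, plus the NV_PFSP queue and message queue
head/tail registers they use.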
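
The pointer arithmetic can be checked in isolation. In this sketch,
pending_bytes() and tail_for_len() are illustrative names, not functions
from the patch. They assume, as the patch comments state, that TAIL
holds the byte offset of the last DWORD written and that messages start
at offset 0. The saturating add is a defensive choice of the sketch.

```rust
use std::convert::TryFrom;

/// Bytes waiting in the queue, given head and tail byte offsets.
/// `tail` addresses the last DWORD written, hence the extra 4.
fn pending_bytes(head: u32, tail: u32) -> u32 {
    if head == tail {
        return 0;
    }
    tail.saturating_sub(head).saturating_add(4)
}

/// Tail offset to publish after writing `len` bytes at offset 0.
/// Returns `None` for an empty or non-DWORD-sized message.
fn tail_for_len(len: usize) -> Option<u32> {
    if len == 0 || len % 4 != 0 {
        return None;
    }
    u32::try_from(len - 4).ok()
}
```

A 16-byte message publishes a tail of 12, and polling with head 0 and
tail 12 reports 16 bytes. One consequence of this convention is that a
single-DWORD message at offset 0 gives head == tail == 0, which these
formulas cannot distinguish from an empty queue.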

Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/falcon/fsp.rs | 79 ++++++++++++++++++++++++++++-
 drivers/gpu/nova-core/regs.rs       | 47 +++++++++++++++++
 2 files changed, 124 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/nova-core/falcon/fsp.rs b/drivers/gpu/nova-core/falcon/fsp.rs
index fb1c8c89d2ff..51dae900267f 100644
--- a/drivers/gpu/nova-core/falcon/fsp.rs
+++ b/drivers/gpu/nova-core/falcon/fsp.rs
@@ -41,7 +41,6 @@ impl Falcon<Fsp> {
     /// Writes `data` to FSP external memory at byte `offset` using Falcon PIO.
     ///
     /// Returns `EINVAL` if offset or data length is not 4-byte aligned.
-    #[expect(unused)]
     pub(crate) fn write_emem(&self, bar: &Bar0, offset: u32, data: &[u8]) -> Result {
         // TODO: replace with `is_multiple_of` once the MSRV is >= 1.82.
         if offset % 4 != 0 || data.len() % 4 != 0 {
@@ -66,7 +65,6 @@ pub(crate) fn write_emem(&self, bar: &Bar0, offset: u32, data: &[u8]) -> Result
     /// Reads FSP external memory at byte `offset` into `data` using Falcon PIO.
     ///
     /// Returns `EINVAL` if offset or data length is not 4-byte aligned.
-    #[expect(unused)]
     pub(crate) fn read_emem(&self, bar: &Bar0, offset: u32, data: &mut [u8]) -> Result {
         // TODO: replace with `is_multiple_of` once the MSRV is >= 1.82.
         if offset % 4 != 0 || data.len() % 4 != 0 {
@@ -85,4 +83,81 @@ pub(crate) fn read_emem(&self, bar: &Bar0, offset: u32, data: &mut [u8]) -> Resu
 
         Ok(())
     }
+
+    /// Poll FSP for incoming data.
+    ///
+    /// Returns the size of available data in bytes, or 0 if no data is available.
+    ///
+    /// The FSP message queue is not circular - pointers are reset to 0 after each
+    /// message exchange, so `tail >= head` is always true when data is present.
+    #[expect(unused)]
+    pub(crate) fn poll_msgq(&self, bar: &Bar0) -> u32 {
+        let head = regs::NV_PFSP_MSGQ_HEAD::read(bar).address();
+        let tail = regs::NV_PFSP_MSGQ_TAIL::read(bar).address();
+
+        if head == tail {
+            return 0;
+        }
+
+        // TAIL points at last DWORD written, so add 4 to get total size
+        tail.saturating_sub(head) + 4
+    }
+
+    /// Send message to FSP.
+    ///
+    /// Writes a message to FSP EMEM and updates queue pointers to notify FSP.
+    ///
+    /// # Arguments
+    /// * `bar` - BAR0 memory mapping
+    /// * `packet` - Message data (must be 4-byte aligned in length)
+    ///
+    /// # Returns
+    /// `Ok(())` on success, `Err(EINVAL)` if packet is empty or not 4-byte aligned
+    #[expect(unused)]
+    pub(crate) fn send_msg(&self, bar: &Bar0, packet: &[u8]) -> Result {
+        if packet.is_empty() {
+            return Err(EINVAL);
+        }
+
+        // Write message to EMEM at offset 0 (validates 4-byte alignment)
+        self.write_emem(bar, 0, packet)?;
+
+        // Update queue pointers - TAIL points at last DWORD written
+        let tail_offset = u32::try_from(packet.len() - 4).map_err(|_| EINVAL)?;
+        regs::NV_PFSP_QUEUE_TAIL::default()
+            .set_address(tail_offset)
+            .write(bar);
+        regs::NV_PFSP_QUEUE_HEAD::default()
+            .set_address(0)
+            .write(bar);
+
+        Ok(())
+    }
+
+    /// Receive message from FSP.
+    ///
+    /// Reads a message from FSP EMEM and resets queue pointers.
+    ///
+    /// # Arguments
+    /// * `bar` - BAR0 memory mapping
+    /// * `buffer` - Buffer to receive message data
+    /// * `size` - Size of message to read in bytes (from `poll_msgq`)
+    ///
+    /// # Returns
+    /// `Ok(bytes_read)` on success, `Err(EINVAL)` if size is 0, exceeds buffer, or not aligned
+    #[expect(unused)]
+    pub(crate) fn recv_msg(&self, bar: &Bar0, buffer: &mut [u8], size: usize) -> Result<usize> {
+        if size == 0 || size > buffer.len() {
+            return Err(EINVAL);
+        }
+
+        // Read response from EMEM at offset 0 (validates 4-byte alignment)
+        self.read_emem(bar, 0, &mut buffer[..size])?;
+
+        // Reset message queue pointers after reading
+        regs::NV_PFSP_MSGQ_TAIL::default().set_address(0).write(bar);
+        regs::NV_PFSP_MSGQ_HEAD::default().set_address(0).write(bar);
+
+        Ok(size)
+    }
 }
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index 1ae57cc42a9f..f63a61324960 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -444,6 +444,53 @@ pub(crate) fn reset_engine<E: FalconEngine>(bar: &Bar0) {
     31:0    data as u32;        // EMEM data register
 });
 
+// FSP (Firmware System Processor) queue registers for Hopper/Blackwell Chain of Trust
+// These registers manage falcon EMEM communication queues
+register!(NV_PFSP_QUEUE_HEAD @ 0x008f2c00 {
+    31:0    address as u32;
+});
+
+register!(NV_PFSP_QUEUE_TAIL @ 0x008f2c04 {
+    31:0    address as u32;
+});
+
+register!(NV_PFSP_MSGQ_HEAD @ 0x008f2c80 {
+    31:0    address as u32;
+});
+
+register!(NV_PFSP_MSGQ_TAIL @ 0x008f2c84 {
+    31:0    address as u32;
+});
+
+// PTHERM registers
+
+// FSP secure boot completion status register used by FSP to signal boot completion.
+// This is the NV_THERM_I2CS_SCRATCH register.
+// Different architectures use different addresses:
+// - Hopper (GH100): 0x000200bc
+// - Blackwell (GB202): 0x00ad00bc
+pub(crate) fn fsp_thermal_scratch_reg_addr(arch: Architecture) -> Result<usize> {
+    match arch {
+        Architecture::Hopper => Ok(0x000200bc),
+        Architecture::Blackwell => Ok(0x00ad00bc),
+        _ => Err(kernel::error::code::ENOTSUPP),
+    }
+}
+
+/// FSP writes this value to indicate successful boot completion.
+#[expect(unused)]
+pub(crate) const FSP_BOOT_COMPLETE_SUCCESS: u32 = 0xff;
+
+// Helper function to read FSP boot completion status from the correct register
+#[expect(unused)]
+pub(crate) fn read_fsp_boot_complete_status(
+    bar: &crate::driver::Bar0,
+    arch: Architecture,
+) -> Result<u32> {
+    let addr = fsp_thermal_scratch_reg_addr(arch)?;
+    Ok(bar.read32(addr))
+}
+
 // The modules below provide registers that are not identical on all supported chips. They should
 // only be used in HAL modules.
 
-- 
2.53.0
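
Note on the queue pointer arithmetic in send_msg() and poll_msgq() above:
TAIL holds the offset of the last DWORD written, so the sender programs
`len - 4` and the receiver adds 4 back. A minimal sketch of that round trip
follows. It is not driver code: `tail_for_packet` and `available_bytes` are
hypothetical names, and plain integers stand in for the NV_PFSP_* register
values.

```rust
use std::convert::TryFrom;

/// Offset to program into TAIL for a packet written at EMEM offset 0.
/// TAIL points at the last DWORD written, not one past the end.
/// Returns `None` for empty or non-4-byte-aligned packets.
fn tail_for_packet(len: usize) -> Option<u32> {
    if len == 0 || len % 4 != 0 {
        return None;
    }
    u32::try_from(len - 4).ok()
}

/// Bytes available for a given HEAD and TAIL, following poll_msgq():
/// equal pointers mean "no data"; otherwise the size includes the DWORD
/// that TAIL points at.
fn available_bytes(head: u32, tail: u32) -> u32 {
    if head == tail {
        return 0;
    }
    tail.saturating_sub(head) + 4
}
```

With HEAD at 0, a 20-byte packet gives TAIL = 16 and reads back as 20
available bytes. Under this model a single-DWORD message at offset 0 would
leave HEAD == TAIL and so be reported as empty.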


* [PATCH v4 19/33] gpu: nova-core: Hopper/Blackwell: calculate reserved FB heap size
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (17 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 18/33] gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-17 16:39   ` Danilo Krummrich
  2026-02-10  2:45 ` [PATCH v4 20/33] gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting John Hubbard
                   ` (14 subsequent siblings)
  33 siblings, 1 reply; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Various "reserved" areas of FB (frame buffer: vidmem) have to be
calculated, because the GSP booting process needs this information.

The calculations are const, so a new const-compatible alignment function
is also added to num.rs, in order to align the reserved areas.

Cc: Timur Tabi <ttabi@nvidia.com>
Cc: Gary Guo <gary@garyguo.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
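
For illustration, the PMU reservation arithmetic described above can be
checked outside the kernel. This is a sketch under stated assumptions, not
the driver code: the SZ_* constants are redefined locally, and a plain
`assert!` stands in for the kernel's `build_assert!`.

```rust
const SZ_4K: usize = 4 << 10;
const SZ_128K: usize = 128 << 10;
const SZ_8M: usize = 8 << 20;
const SZ_16M: usize = 16 << 20;

/// Rounds `value` up to the next multiple of `ALIGN` (a power of two).
/// Userspace stand-in: `assert!` replaces the kernel's `build_assert!`.
const fn const_align_up<const ALIGN: usize>(value: usize) -> usize {
    assert!(ALIGN.is_power_of_two());
    (value + (ALIGN - 1)) & !(ALIGN - 1)
}

/// Same expression as PMU_RESERVED_SIZE in the patch.
const PMU_RESERVED_SIZE: u32 = const_align_up::<SZ_128K>(SZ_8M + SZ_16M + SZ_4K) as u32;

// Evaluated at compile time: 0x0180_1000 rounds up to 0x0182_0000.
const _: () = assert!(PMU_RESERVED_SIZE == 0x0182_0000);
```

The unaligned sum is 24 MiB + 4 KiB; rounding up to 128 KiB adds 124 KiB of
padding.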
 drivers/gpu/nova-core/fb.rs     | 18 ++++++++++++++++++
 drivers/gpu/nova-core/gsp/fw.rs |  6 +++++-
 drivers/gpu/nova-core/num.rs    | 10 ++++++++++
 3 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/nova-core/fb.rs b/drivers/gpu/nova-core/fb.rs
index e803e6e0cdb9..9b4407338724 100644
--- a/drivers/gpu/nova-core/fb.rs
+++ b/drivers/gpu/nova-core/fb.rs
@@ -170,6 +170,9 @@ pub(crate) struct FbLayout {
     pub(crate) wpr2: FbRange,
     pub(crate) heap: FbRange,
     pub(crate) vf_partition_count: u8,
+    /// Total reserved size (heap + PMU reserved), aligned to 2MB.
+    #[expect(unused)]
+    pub(crate) total_reserved_size: u32,
 }
 
 impl FbLayout {
@@ -257,6 +260,16 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<
             FbRange(wpr2.start - HEAP_SIZE..wpr2.start)
         };
 
+        // Calculate reserved sizes. PMU reservation is a subset of the total reserved size.
+        let heap_size = (heap.end - heap.start) as u64;
+        let pmu_reserved_size = u64::from(PMU_RESERVED_SIZE);
+
+        let total_reserved_size = {
+            let total = heap_size + pmu_reserved_size;
+            const RSVD_ALIGN: Alignment = Alignment::new::<SZ_2M>();
+            total.align_up(RSVD_ALIGN).ok_or(EINVAL)?
+        };
+
         Ok(Self {
             fb,
             vga_workspace,
@@ -267,6 +280,11 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<
             wpr2,
             heap,
             vf_partition_count: 0,
+            total_reserved_size: total_reserved_size as u32,
         })
     }
 }
+
+/// PMU reserved size, aligned to 128KB.
+pub(crate) const PMU_RESERVED_SIZE: u32 =
+    crate::num::const_align_up::<SZ_128K>(SZ_8M + SZ_16M + SZ_4K) as u32;
diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index 83ff91614e36..086153edfa86 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -27,7 +27,10 @@
 };
 
 use crate::{
-    fb::FbLayout,
+    fb::{
+        FbLayout,
+        PMU_RESERVED_SIZE, //
+    },
     firmware::gsp::GspFirmware,
     gpu::Chipset,
     gsp::{
@@ -183,6 +186,7 @@ pub(crate) fn new(gsp_firmware: &GspFirmware, fb_layout: &FbLayout) -> Self {
             fbSize: fb_layout.fb.end - fb_layout.fb.start,
             vgaWorkspaceOffset: fb_layout.vga_workspace.start,
             vgaWorkspaceSize: fb_layout.vga_workspace.end - fb_layout.vga_workspace.start,
+            pmuReservedSize: PMU_RESERVED_SIZE,
             ..Default::default()
         })
     }
diff --git a/drivers/gpu/nova-core/num.rs b/drivers/gpu/nova-core/num.rs
index c952a834e662..f068722c5bdf 100644
--- a/drivers/gpu/nova-core/num.rs
+++ b/drivers/gpu/nova-core/num.rs
@@ -215,3 +215,13 @@ pub(crate) const fn [<$from _into_ $into>]<const N: $from>() -> $into {
 impl_const_into!(u64 => { u8, u16, u32 });
 impl_const_into!(u32 => { u8, u16 });
 impl_const_into!(u16 => { u8 });
+
+/// Aligns `value` up to `ALIGN` at compile time.
+///
+/// This is the const-compatible equivalent of [`kernel::ptr::Alignable::align_up`].
+/// `ALIGN` must be a power of two (enforced at compile time).
+#[inline(always)]
+pub(crate) const fn const_align_up<const ALIGN: usize>(value: usize) -> usize {
+    build_assert!(ALIGN.is_power_of_two());
+    (value + (ALIGN - 1)) & !(ALIGN - 1)
+}
-- 
2.53.0


* [PATCH v4 20/33] gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (18 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 19/33] gpu: nova-core: Hopper/Blackwell: calculate reserved FB heap size John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-17 17:13   ` Danilo Krummrich
  2026-02-10  2:45 ` [PATCH v4 21/33] gpu: nova-core: Hopper/Blackwell: add FSP message structures John Hubbard
                   ` (13 subsequent siblings)
  33 siblings, 1 reply; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Add the FSP (Firmware System Processor) module for Hopper/Blackwell GPUs.
These architectures use a simplified firmware boot sequence:

    FMC --> FSP --> GSP, with no SEC2 involvement.

Add the ability to wait for FSP secure boot completion by polling the
I2CS thermal scratch register until FSP signals success or a timeout
expires.

Cc: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
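
The wait described above is a poll-until-match loop with a deadline. A
minimal userspace sketch of that pattern follows. It is not the kernel's
read_poll_timeout(): `poll_until` is a hypothetical helper, and the register
read is modeled as a closure returning a `u32`.

```rust
use std::time::{Duration, Instant};

/// Value FSP writes to signal successful boot completion (from the patch).
const FSP_BOOT_COMPLETE_SUCCESS: u32 = 0xff;

/// Repeatedly calls `read` until `done` accepts the value or `timeout`
/// elapses. Returns the accepted value, or `None` on timeout.
fn poll_until<R, D>(mut read: R, done: D, timeout: Duration) -> Option<u32>
where
    R: FnMut() -> u32,
    D: Fn(u32) -> bool,
{
    let deadline = Instant::now() + timeout;
    loop {
        let value = read();
        if done(value) {
            return Some(value);
        }
        if Instant::now() >= deadline {
            return None;
        }
        // Busy-wait between reads, mirroring the zero sleep interval.
        std::hint::spin_loop();
    }
}
```

The condition is always checked at least once before the deadline test, so a
register that already holds the success value returns immediately.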
 drivers/gpu/nova-core/fsp.rs       | 168 +++++++++++++++++++++++++++++
 drivers/gpu/nova-core/nova_core.rs |   1 +
 drivers/gpu/nova-core/regs.rs      |   2 -
 3 files changed, 169 insertions(+), 2 deletions(-)
 create mode 100644 drivers/gpu/nova-core/fsp.rs

diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
new file mode 100644
index 000000000000..5476259d224c
--- /dev/null
+++ b/drivers/gpu/nova-core/fsp.rs
@@ -0,0 +1,168 @@
+// SPDX-License-Identifier: GPL-2.0
+
+// TODO: remove this once the code is fully functional
+#![expect(dead_code)]
+
+//! FSP (Firmware System Processor) interface for Hopper/Blackwell GPUs.
+//!
+//! Hopper/Blackwell use a simplified firmware boot sequence: FMC --> FSP --> GSP.
+//! Unlike Turing/Ampere/Ada, there is NO SEC2 (Security Engine 2) usage.
+//! FSP handles secure boot directly using FMC firmware + Chain of Trust.
+
+use kernel::{
+    device,
+    io::poll::read_poll_timeout,
+    prelude::*,
+    time::Delta,
+    transmute::{
+        AsBytes,
+        FromBytes, //
+    },
+};
+
+use crate::regs::FSP_BOOT_COMPLETE_SUCCESS;
+
+/// FSP secure boot completion timeout in milliseconds.
+const FSP_SECURE_BOOT_TIMEOUT_MS: i64 = 4000;
+
+/// MCTP (Management Component Transport Protocol) header values for FSP communication.
+pub(crate) mod mctp {
+    pub(super) const HEADER_SOM: u32 = 1; // Start of Message
+    pub(super) const HEADER_EOM: u32 = 1; // End of Message
+    pub(super) const HEADER_SEID: u32 = 0; // Source Endpoint ID
+    pub(super) const HEADER_SEQ: u32 = 0; // Sequence number
+
+    pub(super) const MSG_TYPE_VENDOR_PCI: u32 = 0x7e;
+    pub(super) const VENDOR_ID_NV: u32 = 0x10de;
+    pub(super) const NVDM_TYPE_COT: u32 = 0x14;
+    pub(super) const NVDM_TYPE_FSP_RESPONSE: u32 = 0x15;
+}
+
+/// GSP FMC initialization parameters.
+///
+/// First member of [`GspFmcBootParams`], the structure FSP expects for booting GSP-RM.
+#[repr(C)]
+#[derive(Debug, Clone, Copy, Default)]
+struct GspFmcInitParams {
+    /// CC initialization "registry keys"
+    regkeys: u32,
+}
+
+// SAFETY: GspFmcInitParams is a simple C struct with only primitive types.
+unsafe impl AsBytes for GspFmcInitParams {}
+// SAFETY: All bit patterns are valid for the primitive fields.
+unsafe impl FromBytes for GspFmcInitParams {}
+
+/// GSP ACR (Authenticated Code RAM) boot parameters.
+#[repr(C)]
+#[derive(Debug, Clone, Copy, Default)]
+struct GspAcrBootGspRmParams {
+    /// Physical memory aperture through which gspRmDescPa is accessed
+    target: u32,
+    /// Size in bytes of the GSP-RM descriptor structure
+    gsp_rm_desc_size: u32,
+    /// Physical offset in the target aperture of the GSP-RM descriptor structure
+    gsp_rm_desc_offset: u64,
+    /// Physical offset in FB to set the start of the WPR containing GSP-RM
+    wpr_carveout_offset: u64,
+    /// Size in bytes of the WPR containing GSP-RM
+    wpr_carveout_size: u32,
+    /// Whether to boot GSP-RM or GSP-Proxy through ACR
+    b_is_gsp_rm_boot: u32,
+}
+
+// SAFETY: GspAcrBootGspRmParams is a simple C struct with only primitive types.
+unsafe impl AsBytes for GspAcrBootGspRmParams {}
+// SAFETY: All bit patterns are valid for the primitive fields.
+unsafe impl FromBytes for GspAcrBootGspRmParams {}
+
+/// GSP RM boot parameters.
+#[repr(C)]
+#[derive(Debug, Clone, Copy, Default)]
+struct GspRmParams {
+    /// Physical memory aperture through which bootArgsOffset is accessed
+    target: u32,
+    /// Physical offset in the memory aperture that will be passed to GSP-RM
+    boot_args_offset: u64,
+}
+
+// SAFETY: GspRmParams is a simple C struct with only primitive types.
+unsafe impl AsBytes for GspRmParams {}
+// SAFETY: All bit patterns are valid for the primitive fields.
+unsafe impl FromBytes for GspRmParams {}
+
+/// GSP SPDM (Security Protocol and Data Model) parameters.
+#[repr(C)]
+#[derive(Debug, Clone, Copy, Default)]
+struct GspSpdmParams {
+    /// Physical Memory Aperture through which all addresses are accessed
+    target: u32,
+    /// Physical offset in the memory aperture where SPDM payload buffer is stored
+    payload_buffer_offset: u64,
+    /// Size of the above payload buffer
+    payload_buffer_size: u32,
+}
+
+// SAFETY: GspSpdmParams is a simple C struct with only primitive types.
+unsafe impl AsBytes for GspSpdmParams {}
+// SAFETY: All bit patterns are valid for the primitive fields.
+unsafe impl FromBytes for GspSpdmParams {}
+
+/// Complete GSP FMC boot parameters structure.
+/// This is what FSP expects to receive - NOT a raw libos address!
+#[repr(C)]
+#[derive(Debug, Clone, Copy, Default)]
+pub(crate) struct GspFmcBootParams {
+    init_params: GspFmcInitParams,
+    boot_gsp_rm_params: GspAcrBootGspRmParams,
+    gsp_rm_params: GspRmParams,
+    gsp_spdm_params: GspSpdmParams,
+}
+
+// SAFETY: GspFmcBootParams is composed of C structs with only primitive types.
+unsafe impl AsBytes for GspFmcBootParams {}
+// SAFETY: All bit patterns are valid for the primitive fields.
+unsafe impl FromBytes for GspFmcBootParams {}
+
+/// FSP interface for Hopper/Blackwell GPUs.
+pub(crate) struct Fsp;
+
+impl Fsp {
+    /// Wait for FSP secure boot completion.
+    ///
+    /// Polls the thermal scratch register until FSP signals boot completion
+    /// or timeout occurs.
+    pub(crate) fn wait_secure_boot(
+        dev: &device::Device<device::Bound>,
+        bar: &crate::driver::Bar0,
+        arch: crate::gpu::Architecture,
+    ) -> Result<()> {
+        let timeout = Delta::from_millis(FSP_SECURE_BOOT_TIMEOUT_MS);
+
+        read_poll_timeout(
+            || crate::regs::read_fsp_boot_complete_status(bar, arch),
+            |&status| {
+                dev_dbg!(
+                    dev,
+                    "FSP I2CS scratch register status: {:#x} (expected: {:#x})\n",
+                    status,
+                    FSP_BOOT_COMPLETE_SUCCESS
+                );
+                status == FSP_BOOT_COMPLETE_SUCCESS
+            },
+            Delta::ZERO,
+            timeout,
+        )
+        .map_err(|_| {
+            let final_status =
+                crate::regs::read_fsp_boot_complete_status(bar, arch).unwrap_or(0xDEADBEEF);
+            dev_err!(
+                dev,
+                "FSP secure boot completion timeout - final status: {:#x}\n",
+                final_status
+            );
+            ETIMEDOUT
+        })
+        .map(|_| ())
+    }
+}
diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs
index c1121e7c64c5..af6f562779d6 100644
--- a/drivers/gpu/nova-core/nova_core.rs
+++ b/drivers/gpu/nova-core/nova_core.rs
@@ -10,6 +10,7 @@
 mod falcon;
 mod fb;
 mod firmware;
+mod fsp;
 mod gfw;
 mod gpu;
 mod gsp;
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index f63a61324960..0d3ad4755a81 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -478,11 +478,9 @@ pub(crate) fn fsp_thermal_scratch_reg_addr(arch: Architecture) -> Result<usize>
 }
 
 /// FSP writes this value to indicate successful boot completion.
-#[expect(unused)]
 pub(crate) const FSP_BOOT_COMPLETE_SUCCESS: u32 = 0xff;
 
 // Helper function to read FSP boot completion status from the correct register
-#[expect(unused)]
 pub(crate) fn read_fsp_boot_complete_status(
     bar: &crate::driver::Bar0,
     arch: Architecture,
-- 
2.53.0


* [PATCH v4 21/33] gpu: nova-core: Hopper/Blackwell: add FSP message structures
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (19 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 20/33] gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-10  2:45 ` [PATCH v4 22/33] gpu: nova-core: Hopper/Blackwell: add FMC signature extraction John Hubbard
                   ` (12 subsequent siblings)
  33 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Add the data structures for FSP Chain of Trust communication. These
include the FMC signature container (hash, public key, signature) and
the NVDM payload structures for sending COT messages and receiving
responses.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
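
The per-field offset comments on NvdmPayloadCot can be verified at compile
time. The sketch below is a standalone check, not part of the patch. It
assumes 384-byte public key and signature fields, which is what the offset
comments imply and what patch 22/33 sets; with the 97- and 96-byte constants
in this patch, the offsets after `public_key` would differ.

```rust
use core::mem::{offset_of, size_of};

const FSP_HASH_SIZE: usize = 48;
// Assumption: 384-byte key and signature fields, as the per-field offset
// comments in the patch imply.
const FSP_PKEY_SIZE: usize = 384;
const FSP_SIG_SIZE: usize = 384;

#[repr(C, packed)]
#[derive(Clone, Copy)]
struct NvdmPayloadCot {
    version: u16,
    size: u16,
    gsp_fmc_sysmem_offset: u64,
    frts_sysmem_offset: u64,
    frts_sysmem_size: u32,
    frts_vidmem_offset: u64,
    frts_vidmem_size: u32,
    hash384: [u8; FSP_HASH_SIZE],
    public_key: [u8; FSP_PKEY_SIZE],
    signature: [u8; FSP_SIG_SIZE],
    gsp_boot_args_sysmem_offset: u64,
}

// Compile-time layout checks against the offsets listed in the patch.
const _: () = {
    assert!(offset_of!(NvdmPayloadCot, gsp_fmc_sysmem_offset) == 0x4);
    assert!(offset_of!(NvdmPayloadCot, hash384) == 0x24);
    assert!(offset_of!(NvdmPayloadCot, public_key) == 0x54);
    assert!(offset_of!(NvdmPayloadCot, signature) == 0x1d4);
    assert!(offset_of!(NvdmPayloadCot, gsp_boot_args_sysmem_offset) == 0x354);
    assert!(size_of::<NvdmPayloadCot>() == 0x35c);
};
```

Because the struct is packed, there is no padding between or after fields,
so the total size is the last offset plus 8 bytes.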
 drivers/gpu/nova-core/fsp.rs | 76 ++++++++++++++++++++++++++++++++++++
 1 file changed, 76 insertions(+)

diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
index 5476259d224c..7775f1b673c0 100644
--- a/drivers/gpu/nova-core/fsp.rs
+++ b/drivers/gpu/nova-core/fsp.rs
@@ -124,6 +124,82 @@ unsafe impl AsBytes for GspFmcBootParams {}
 // SAFETY: All bit patterns are valid for the primitive fields.
 unsafe impl FromBytes for GspFmcBootParams {}
 
+/// Size constraints for FSP security signatures.
+const FSP_HASH_SIZE: usize = 48; // SHA-384 hash
+const FSP_PKEY_SIZE: usize = 97; // Public key size for GB202 (not 384!)
+const FSP_SIG_SIZE: usize = 96; // Signature size for GB202 (not 384!)
+
+/// Structure to hold FMC signatures.
+#[derive(Debug, Clone, Copy)]
+pub(crate) struct FmcSignatures {
+    hash384: [u8; FSP_HASH_SIZE], // SHA-384 hash (48 bytes)
+    public_key: [u8; 384],        // RSA public key (384 bytes)
+    signature: [u8; 384],         // RSA signature (384 bytes)
+}
+
+impl Default for FmcSignatures {
+    fn default() -> Self {
+        Self {
+            hash384: [0u8; FSP_HASH_SIZE],
+            public_key: [0u8; 384],
+            signature: [0u8; 384],
+        }
+    }
+}
+
+/// FSP command response payload.
+/// Corresponds to the `NVDM_PAYLOAD_COMMAND_RESPONSE` structure.
+#[repr(C, packed)]
+#[derive(Clone, Copy)]
+struct NvdmPayloadCommandResponse {
+    task_id: u32,
+    command_nvdm_type: u32,
+    error_code: u32,
+}
+
+/// NVDM (NVIDIA Device Management) COT (Chain of Trust) payload structure.
+/// This is the main message payload sent to FSP for Chain of Trust.
+#[repr(C, packed)]
+#[derive(Clone, Copy)]
+struct NvdmPayloadCot {
+    version: u16,               // offset 0x0, size 2
+    size: u16,                  // offset 0x2, size 2
+    gsp_fmc_sysmem_offset: u64, // offset 0x4, size 8
+    frts_sysmem_offset: u64,    // offset 0xC, size 8
+    frts_sysmem_size: u32,      // offset 0x14, size 4
+    frts_vidmem_offset: u64,    // offset 0x18, size 8
+    frts_vidmem_size: u32,      // offset 0x20, size 4
+    // Authentication related fields
+    hash384: [u8; FSP_HASH_SIZE],     // offset 0x24, size 48 (0x30)
+    public_key: [u8; FSP_PKEY_SIZE],  // offset 0x54, size 384 (0x180)
+    signature: [u8; FSP_SIG_SIZE],    // offset 0x1D4, size 384 (0x180)
+    gsp_boot_args_sysmem_offset: u64, // offset 0x354, size 8
+}
+
+/// Complete FSP message structure with MCTP and NVDM headers.
+#[repr(C, packed)]
+#[derive(Clone, Copy)]
+struct FspMessage {
+    mctp_header: u32,
+    nvdm_header: u32,
+    cot: NvdmPayloadCot,
+}
+
+// SAFETY: FspMessage is a packed C struct with only integral fields.
+unsafe impl AsBytes for FspMessage {}
+
+/// Complete FSP response structure with MCTP and NVDM headers.
+#[repr(C, packed)]
+#[derive(Clone, Copy)]
+struct FspResponse {
+    mctp_header: u32,
+    nvdm_header: u32,
+    response: NvdmPayloadCommandResponse,
+}
+
+// SAFETY: FspResponse is a packed C struct with only integral fields.
+unsafe impl FromBytes for FspResponse {}
+
 /// FSP interface for Hopper/Blackwell GPUs.
 pub(crate) struct Fsp;
 
-- 
2.53.0


* [PATCH v4 22/33] gpu: nova-core: Hopper/Blackwell: add FMC signature extraction
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (20 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 21/33] gpu: nova-core: Hopper/Blackwell: add FSP message structures John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-10  2:45 ` [PATCH v4 23/33] gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging John Hubbard
                   ` (11 subsequent siblings)
  33 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Add extract_fmc_signatures_static() to parse cryptographic signatures
from FMC ELF firmware sections. This extracts the SHA-384 hash, RSA
public key, and signature needed for Chain of Trust verification.

Also expose the elf_section() helper from firmware.rs for use by FSP.

Cc: Joel Fernandes <joelagnelf@nvidia.com>
Cc: Gary Guo <gary@garyguo.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
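
The public key and signature sections may be shorter than the fixed-size
arrays they are copied into. A minimal sketch of the check-then-copy step
follows, assuming a hypothetical helper name `copy_zero_padded`; the patch
itself performs the same steps inline on a heap-allocated FmcSignatures.

```rust
/// Copies `src` into a zero-initialized `N`-byte array, leaving any
/// remaining bytes as zero padding. Returns `None` if `src` is too long.
fn copy_zero_padded<const N: usize>(src: &[u8]) -> Option<[u8; N]> {
    if src.len() > N {
        return None;
    }
    let mut out = [0u8; N];
    // Slicing the destination to `src.len()` keeps copy_from_slice from
    // panicking on a length mismatch.
    out[..src.len()].copy_from_slice(src);
    Some(out)
}
```

Note that the original section length is not retained after the copy, so a
consumer sees only the padded array.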
 drivers/gpu/nova-core/firmware.rs |  4 +-
 drivers/gpu/nova-core/fsp.rs      | 76 +++++++++++++++++++++++++++++--
 2 files changed, 76 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 9a9b969aaf79..e9101a08511f 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -27,6 +27,8 @@
     },
 };
 
+pub(crate) use elf::elf_section;
+
 pub(crate) mod booter;
 pub(crate) mod fsp;
 pub(crate) mod fwsec;
@@ -607,7 +609,7 @@ fn elf32_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
     }
 
     /// Automatically detects ELF32 vs ELF64 based on the ELF header.
-    pub(super) fn elf_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
+    pub(crate) fn elf_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
         // Check ELF magic.
         if elf.len() < 5 || elf.get(0..4)? != b"\x7fELF" {
             return None;
diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
index 7775f1b673c0..fca03e25134d 100644
--- a/drivers/gpu/nova-core/fsp.rs
+++ b/drivers/gpu/nova-core/fsp.rs
@@ -124,10 +124,10 @@ unsafe impl AsBytes for GspFmcBootParams {}
 // SAFETY: All bit patterns are valid for the primitive fields.
 unsafe impl FromBytes for GspFmcBootParams {}
 
-/// Size constraints for FSP security signatures.
+/// Size constraints for FSP security signatures (in bytes).
 const FSP_HASH_SIZE: usize = 48; // SHA-384 hash
-const FSP_PKEY_SIZE: usize = 97; // Public key size for GB202 (not 384!)
-const FSP_SIG_SIZE: usize = 96; // Signature size for GB202 (not 384!)
+const FSP_PKEY_SIZE: usize = 384; // RSA public key
+const FSP_SIG_SIZE: usize = 384; // RSA signature
 
 /// Structure to hold FMC signatures.
 #[derive(Debug, Clone, Copy)]
@@ -241,4 +241,74 @@ pub(crate) fn wait_secure_boot(
         })
         .map(|_| ())
     }
+
+    /// Extract FMC firmware signatures for Chain of Trust verification.
+    ///
+    /// Extracts real cryptographic signatures from FMC ELF firmware sections.
+    /// Returns signatures in a heap-allocated structure to prevent stack overflow.
+    pub(crate) fn extract_fmc_signatures_static(
+        dev: &device::Device<device::Bound>,
+        fmc_fw_data: &[u8],
+    ) -> Result<KBox<FmcSignatures>> {
+        // Extract hash section (SHA-384)
+        let hash_section = crate::firmware::elf_section(fmc_fw_data, "hash")
+            .ok_or(EINVAL)
+            .inspect_err(|_| dev_err!(dev, "FMC firmware missing 'hash' section\n"))?;
+
+        // Extract public key section (RSA public key)
+        let pkey_section = crate::firmware::elf_section(fmc_fw_data, "publickey")
+            .ok_or(EINVAL)
+            .inspect_err(|_| dev_err!(dev, "FMC firmware missing 'publickey' section\n"))?;
+
+        // Extract signature section (RSA signature)
+        let sig_section = crate::firmware::elf_section(fmc_fw_data, "signature")
+            .ok_or(EINVAL)
+            .inspect_err(|_| dev_err!(dev, "FMC firmware missing 'signature' section\n"))?;
+
+        // Validate section sizes - hash must be exactly 48 bytes
+        if hash_section.len() != FSP_HASH_SIZE {
+            dev_err!(
+                dev,
+                "FMC hash section size {} != expected {}\n",
+                hash_section.len(),
+                FSP_HASH_SIZE
+            );
+            return Err(EINVAL);
+        }
+
+        // Public key and signature can be smaller than the fixed array sizes.
+        if pkey_section.len() > FSP_PKEY_SIZE {
+            dev_err!(
+                dev,
+                "FMC publickey section size {} > maximum {}\n",
+                pkey_section.len(),
+                FSP_PKEY_SIZE
+            );
+            return Err(EINVAL);
+        }
+
+        if sig_section.len() > FSP_SIG_SIZE {
+            dev_err!(
+                dev,
+                "FMC signature section size {} > maximum {}\n",
+                sig_section.len(),
+                FSP_SIG_SIZE
+            );
+            return Err(EINVAL);
+        }
+
+        // Allocate signature structure on heap to avoid stack overflow
+        let mut signatures = KBox::new(FmcSignatures::default(), GFP_KERNEL)?;
+
+        // Copy hash section directly (48 bytes exactly)
+        signatures.hash384.copy_from_slice(hash_section);
+
+        // Copy public key section (up to 384 bytes, zero-padded)
+        signatures.public_key[..pkey_section.len()].copy_from_slice(pkey_section);
+
+        // Copy signature section (up to 384 bytes, zero-padded)
+        signatures.signature[..sig_section.len()].copy_from_slice(sig_section);
+
+        Ok(signatures)
+    }
 }
-- 
2.53.0


* [PATCH v4 23/33] gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (21 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 22/33] gpu: nova-core: Hopper/Blackwell: add FMC signature extraction John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-10  2:45 ` [PATCH v4 24/33] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot John Hubbard
                   ` (10 subsequent siblings)
  33 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Add send_sync_fsp() which sends an MCTP/NVDM message to FSP and waits
for the response. This handles the low-level protocol details including
header validation, error checking, and timeout handling.

Cc: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
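
The header validation mentioned above reduces to a few shifts and masks. The
following standalone sketch uses the bit positions that send_sync_fsp()
checks in the reply. The function names are hypothetical, and `nvdm_header`
is an assumed inverse of the decode, included only to build test input.

```rust
const MSG_TYPE_VENDOR_PCI: u32 = 0x7e;
const VENDOR_ID_NV: u32 = 0x10de;
const NVDM_TYPE_COT: u32 = 0x14;
const NVDM_TYPE_FSP_RESPONSE: u32 = 0x15;

/// True if both SOM (bit 31) and EOM (bit 30) are set, i.e. the message
/// starts and ends in this packet.
fn mctp_is_single_packet(mctp_header: u32) -> bool {
    let som = (mctp_header >> 31) & 1;
    let eom = (mctp_header >> 30) & 1;
    som == 1 && eom == 1
}

/// Splits an NVDM header into (message type, vendor ID, NVDM type).
fn nvdm_fields(nvdm_header: u32) -> (u32, u32, u32) {
    (
        nvdm_header & 0x7f,
        (nvdm_header >> 8) & 0xffff,
        (nvdm_header >> 24) & 0xff,
    )
}

/// Hypothetical inverse of `nvdm_fields`, used here only to build input.
fn nvdm_header(nvdm_type: u32) -> u32 {
    MSG_TYPE_VENDOR_PCI | (VENDOR_ID_NV << 8) | (nvdm_type << 24)
}
```

A reply passes the NVDM check only when all three fields match; the command
type echoed in the response payload is compared separately.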
 drivers/gpu/nova-core/falcon/fsp.rs |   3 -
 drivers/gpu/nova-core/fsp.rs        | 101 ++++++++++++++++++++++++++++
 2 files changed, 101 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/nova-core/falcon/fsp.rs b/drivers/gpu/nova-core/falcon/fsp.rs
index 51dae900267f..08b382a32cc0 100644
--- a/drivers/gpu/nova-core/falcon/fsp.rs
+++ b/drivers/gpu/nova-core/falcon/fsp.rs
@@ -90,7 +90,6 @@ pub(crate) fn read_emem(&self, bar: &Bar0, offset: u32, data: &mut [u8]) -> Resu
     ///
     /// The FSP message queue is not circular - pointers are reset to 0 after each
     /// message exchange, so `tail >= head` is always true when data is present.
-    #[expect(unused)]
     pub(crate) fn poll_msgq(&self, bar: &Bar0) -> u32 {
         let head = regs::NV_PFSP_MSGQ_HEAD::read(bar).address();
         let tail = regs::NV_PFSP_MSGQ_TAIL::read(bar).address();
@@ -113,7 +112,6 @@ pub(crate) fn poll_msgq(&self, bar: &Bar0) -> u32 {
     ///
     /// # Returns
     /// `Ok(())` on success, `Err(EINVAL)` if packet is empty or not 4-byte aligned
-    #[expect(unused)]
     pub(crate) fn send_msg(&self, bar: &Bar0, packet: &[u8]) -> Result {
         if packet.is_empty() {
             return Err(EINVAL);
@@ -145,7 +143,6 @@ pub(crate) fn send_msg(&self, bar: &Bar0, packet: &[u8]) -> Result {
     ///
     /// # Returns
     /// `Ok(bytes_read)` on success, `Err(EINVAL)` if size is 0, exceeds buffer, or not aligned
-    #[expect(unused)]
     pub(crate) fn recv_msg(&self, bar: &Bar0, buffer: &mut [u8], size: usize) -> Result<usize> {
         if size == 0 || size > buffer.len() {
             return Err(EINVAL);
diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
index fca03e25134d..479b97eed7bd 100644
--- a/drivers/gpu/nova-core/fsp.rs
+++ b/drivers/gpu/nova-core/fsp.rs
@@ -22,6 +22,9 @@
 
 use crate::regs::FSP_BOOT_COMPLETE_SUCCESS;
 
+/// FSP message timeout in milliseconds.
+const FSP_MSG_TIMEOUT_MS: i64 = 2000;
+
 /// FSP secure boot completion timeout in milliseconds.
 const FSP_SECURE_BOOT_TIMEOUT_MS: i64 = 4000;
 
@@ -311,4 +314,102 @@ pub(crate) fn extract_fmc_signatures_static(
 
         Ok(signatures)
     }
+
+    /// Send message to FSP and wait for response.
+    fn send_sync_fsp(
+        dev: &device::Device<device::Bound>,
+        bar: &crate::driver::Bar0,
+        fsp_falcon: &crate::falcon::Falcon<crate::falcon::fsp::Fsp>,
+        nvdm_type: u32,
+        packet: &[u8],
+    ) -> Result<()> {
+        // Send message
+        fsp_falcon.send_msg(bar, packet)?;
+
+        // Wait for response
+        let timeout = Delta::from_millis(FSP_MSG_TIMEOUT_MS);
+        let packet_size = read_poll_timeout(
+            || Ok(fsp_falcon.poll_msgq(bar)),
+            |&size| size > 0,
+            Delta::ZERO,
+            timeout,
+        )
+        .map_err(|_| {
+            dev_err!(dev, "FSP response timeout\n");
+            ETIMEDOUT
+        })?;
+
+        // Receive response
+        let packet_size = packet_size as usize;
+        let mut response_buf = KVec::<u8>::new();
+        response_buf.resize(packet_size, 0, GFP_KERNEL)?;
+        fsp_falcon.recv_msg(bar, &mut response_buf, packet_size)?;
+
+        // Parse response
+        if response_buf.len() < core::mem::size_of::<FspResponse>() {
+            dev_err!(dev, "FSP response too small: {}\n", response_buf.len());
+            return Err(EIO);
+        }
+
+        let response = FspResponse::from_bytes(&response_buf[..]).ok_or(EIO)?;
+
+        // Copy packed struct fields to avoid alignment issues
+        let mctp_header = response.mctp_header;
+        let nvdm_header = response.nvdm_header;
+        let command_nvdm_type = response.response.command_nvdm_type;
+        let error_code = response.response.error_code;
+
+        // Validate MCTP header
+        let mctp_som = (mctp_header >> 31) & 1;
+        let mctp_eom = (mctp_header >> 30) & 1;
+        if mctp_som != 1 || mctp_eom != 1 {
+            dev_err!(
+                dev,
+                "Unexpected MCTP header in FSP reply: {:#x}\n",
+                mctp_header
+            );
+            return Err(EIO);
+        }
+
+        // Validate NVDM header
+        let nvdm_msg_type = nvdm_header & 0x7f;
+        let nvdm_vendor_id = (nvdm_header >> 8) & 0xffff;
+        let nvdm_type_resp = (nvdm_header >> 24) & 0xff;
+
+        if nvdm_msg_type != mctp::MSG_TYPE_VENDOR_PCI
+            || nvdm_vendor_id != mctp::VENDOR_ID_NV
+            || nvdm_type_resp != mctp::NVDM_TYPE_FSP_RESPONSE
+        {
+            dev_err!(
+                dev,
+                "Unexpected NVDM header in FSP reply: {:#x}\n",
+                nvdm_header
+            );
+            return Err(EIO);
+        }
+
+        // Check command type matches
+        if command_nvdm_type != nvdm_type {
+            dev_err!(
+                dev,
+                "Expected NVDM type {:#x} in reply, got {:#x}\n",
+                nvdm_type,
+                command_nvdm_type
+            );
+            return Err(EIO);
+        }
+
+        // Check for errors
+        if error_code != 0 {
+            dev_err!(
+                dev,
+                "NVDM command {:#x} failed with error {:#x}\n",
+                nvdm_type,
+                error_code
+            );
+            return Err(EIO);
+        }
+
+        Ok(())
+    }
 }
-- 
2.53.0



* [PATCH v4 24/33] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (22 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 23/33] gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-17 18:16   ` Danilo Krummrich
  2026-02-10  2:45 ` [PATCH v4 25/33] gpu: nova-core: Hopper/Blackwell: larger non-WPR heap John Hubbard
                   ` (9 subsequent siblings)
  33 siblings, 1 reply; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Add the boot functions that construct FMC boot parameters and send the
Chain of Trust message to FSP. This completes the FSP communication
infrastructure needed to boot GSP firmware on Hopper/Blackwell GPUs.
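
As a rough illustration of how the two 32-bit headers of the Chain of
Trust message are packed, here is a minimal standalone sketch. The shift
values, message type, vendor ID and NVDM type are the ones added in this
patch; the SOM and EOM values of 1 are an assumption (inferred from the
reply validation, which expects both bits set), and the function names
are hypothetical and not part of the driver.

```rust
// Bit positions and values as they appear in this patch's `mctp` module.
const HEADER_SOM_SHIFT: u32 = 31;
const HEADER_EOM_SHIFT: u32 = 30;
const NVDM_VENDOR_ID_SHIFT: u32 = 8;
const NVDM_TYPE_SHIFT: u32 = 24;
const MSG_TYPE_VENDOR_PCI: u32 = 0x7e;
const VENDOR_ID_NV: u32 = 0x10de;
pub const NVDM_TYPE_COT: u32 = 0x14;

// Assumption: a single-packet message sets both start-of-message and
// end-of-message to 1. These values are not shown in this patch.
const HEADER_SOM: u32 = 1;
const HEADER_EOM: u32 = 1;

/// Packs the MCTP transport header for a single-packet message.
/// SEID and SEQ are both 0 in the patch, so they contribute no bits.
pub fn pack_mctp_header() -> u32 {
    (HEADER_SOM << HEADER_SOM_SHIFT) | (HEADER_EOM << HEADER_EOM_SHIFT)
}

/// Packs the NVDM header: message type in bits 6:0, PCI vendor ID in
/// bits 23:8, NVDM type in bits 31:24.
pub fn pack_nvdm_header(nvdm_type: u32) -> u32 {
    MSG_TYPE_VENDOR_PCI
        | (VENDOR_ID_NV << NVDM_VENDOR_ID_SHIFT)
        | (nvdm_type << NVDM_TYPE_SHIFT)
}

/// Splits an NVDM header into (message type, vendor ID, NVDM type),
/// using the same masks as the reply validation in send_sync_fsp().
pub fn unpack_nvdm_header(header: u32) -> (u32, u32, u32) {
    (
        header & 0x7f,
        (header >> NVDM_VENDOR_ID_SHIFT) & 0xffff,
        (header >> NVDM_TYPE_SHIFT) & 0xff,
    )
}
```

Packing and unpacking use the same field layout, so a reply header can be
checked with the inverse of the operation used to build the request.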

Cc: Joel Fernandes <joelagnelf@nvidia.com>
Cc: Gary Guo <gary@garyguo.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/fb.rs           |   2 +-
 drivers/gpu/nova-core/fb/hal.rs       |   9 +-
 drivers/gpu/nova-core/firmware/fsp.rs |  19 +--
 drivers/gpu/nova-core/fsp.rs          | 166 ++++++++++++++++++++++++--
 drivers/gpu/nova-core/gpu.rs          |  12 ++
 5 files changed, 191 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/nova-core/fb.rs b/drivers/gpu/nova-core/fb.rs
index 9b4407338724..3a2b79a5c107 100644
--- a/drivers/gpu/nova-core/fb.rs
+++ b/drivers/gpu/nova-core/fb.rs
@@ -30,7 +30,7 @@
     regs,
 };
 
-mod hal;
+pub(crate) mod hal;
 
 /// Type holding the sysmem flush memory page, a page of memory to be written into the
 /// `NV_PFB_NISO_FLUSH_SYSMEM_ADDR*` registers and used to maintain memory coherency.
diff --git a/drivers/gpu/nova-core/fb/hal.rs b/drivers/gpu/nova-core/fb/hal.rs
index d795ef7ee65d..eaa545fe9b08 100644
--- a/drivers/gpu/nova-core/fb/hal.rs
+++ b/drivers/gpu/nova-core/fb/hal.rs
@@ -28,10 +28,17 @@ pub(crate) trait FbHal {
 
     /// Returns the VRAM size, in bytes.
     fn vidmem_size(&self, bar: &Bar0) -> u64;
+
+    /// Returns the non-WPR heap size for GPUs that need large reserved memory.
+    ///
+    /// Returns `None` for GPUs that don't need extra reserved memory.
+    fn non_wpr_heap_size(&self) -> Option<u32> {
+        None
+    }
 }
 
 /// Returns the HAL corresponding to `chipset`.
-pub(super) fn fb_hal(chipset: Chipset) -> &'static dyn FbHal {
+pub(crate) fn fb_hal(chipset: Chipset) -> &'static dyn FbHal {
     match chipset.arch() {
         Architecture::Turing => tu102::TU102_HAL,
         Architecture::Ampere if chipset == Chipset::GA100 => ga100::GA100_HAL,
diff --git a/drivers/gpu/nova-core/firmware/fsp.rs b/drivers/gpu/nova-core/firmware/fsp.rs
index 80401b964488..edcc173c2fa6 100644
--- a/drivers/gpu/nova-core/firmware/fsp.rs
+++ b/drivers/gpu/nova-core/firmware/fsp.rs
@@ -3,6 +3,7 @@
 //! FSP is a hardware unit that runs FMC firmware.
 
 use kernel::{
+    alloc::KVec,
     device,
     prelude::*, //
 };
@@ -13,16 +14,16 @@
     gpu::Chipset, //
 };
 
-#[expect(unused)]
+#[expect(dead_code)]
 pub(crate) struct FspFirmware {
-    /// FMC firmware image data (only the .image section)
-    fmc_image: DmaObject,
-    /// Full FMC ELF data (for signature extraction)
-    fmc_full: DmaObject,
+    /// FMC firmware image data (only the .image section) - submitted to hardware
+    pub(crate) fmc_image: DmaObject,
+    /// Full FMC ELF data (for signature extraction) - CPU-only access
+    pub(crate) fmc_full: KVec<u8>,
 }
 
 impl FspFirmware {
-    #[expect(unused)]
+    #[expect(dead_code)]
     pub(crate) fn new(
         dev: &device::Device<device::Bound>,
         chipset: Chipset,
@@ -36,9 +37,13 @@ pub(crate) fn new(
             EINVAL
         })?;
 
+        // Copy the full ELF into a kernel vector for CPU-side signature extraction
+        let mut fmc_full = KVec::with_capacity(fw.data().len(), GFP_KERNEL)?;
+        fmc_full.extend_from_slice(fw.data(), GFP_KERNEL)?;
+
         Ok(Self {
             fmc_image: DmaObject::from_data(dev, fmc_image_data)?,
-            fmc_full: DmaObject::from_data(dev, fw.data())?,
+            fmc_full,
         })
     }
 }
diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
index 479b97eed7bd..35169f3bc446 100644
--- a/drivers/gpu/nova-core/fsp.rs
+++ b/drivers/gpu/nova-core/fsp.rs
@@ -13,6 +13,11 @@
     device,
     io::poll::read_poll_timeout,
     prelude::*,
+    ptr::{
+        Alignable,
+        Alignment, //
+    },
+    sizes::{SZ_1M, SZ_2M},
     time::Delta,
     transmute::{
         AsBytes,
@@ -35,6 +40,16 @@ pub(crate) mod mctp {
     pub(super) const HEADER_SEID: u32 = 0; // Source Endpoint ID
     pub(super) const HEADER_SEQ: u32 = 0; // Sequence number
 
+    // MCTP header bit positions
+    pub(super) const HEADER_SOM_SHIFT: u32 = 31;
+    pub(super) const HEADER_EOM_SHIFT: u32 = 30;
+    pub(super) const HEADER_SEQ_SHIFT: u32 = 28;
+    pub(super) const HEADER_SEID_SHIFT: u32 = 16;
+
+    // NVDM header bit positions
+    pub(super) const NVDM_VENDOR_ID_SHIFT: u32 = 8;
+    pub(super) const NVDM_TYPE_SHIFT: u32 = 24;
+
     pub(super) const MSG_TYPE_VENDOR_PCI: u32 = 0x7e;
     pub(super) const VENDOR_ID_NV: u32 = 0x10de;
     pub(super) const NVDM_TYPE_COT: u32 = 0x14;
@@ -203,6 +218,19 @@ struct FspResponse {
 // SAFETY: FspResponse is a packed C struct with only integral fields.
 unsafe impl FromBytes for FspResponse {}
 
+/// Trait implemented by types representing a message to send to FSP.
+///
+/// This provides [`Fsp::send_sync_fsp`] with the information it needs to send
+/// a given message, following the same pattern as GSP's `CommandToGsp`.
+pub(crate) trait MessageToFsp: AsBytes {
+    /// NVDM type identifying this message to FSP.
+    const NVDM_TYPE: u32;
+}
+
+impl MessageToFsp for FspMessage {
+    const NVDM_TYPE: u32 = mctp::NVDM_TYPE_COT;
+}
+
 /// FSP interface for Hopper/Blackwell GPUs.
 pub(crate) struct Fsp;
 
@@ -315,16 +343,138 @@ pub(crate) fn extract_fmc_signatures_static(
         Ok(signatures)
     }
 
-    /// Send message to FSP and wait for response.
-    fn send_sync_fsp(
+    /// Creates FMC boot parameters structure for FSP.
+    ///
+    /// This structure tells FSP how to boot GSP-RM with the correct memory layout.
+    pub(crate) fn create_fmc_boot_params(
+        dev: &device::Device<device::Bound>,
+        wpr_meta_addr: u64,
+        wpr_meta_size: u32,
+        libos_addr: u64,
+    ) -> Result<kernel::dma::CoherentAllocation<GspFmcBootParams>> {
+        use kernel::dma::CoherentAllocation;
+
+        const GSP_DMA_TARGET_COHERENT_SYSTEM: u32 = 1;
+        const GSP_DMA_TARGET_NONCOHERENT_SYSTEM: u32 = 2;
+
+        let fmc_boot_params = CoherentAllocation::<GspFmcBootParams>::alloc_coherent(
+            dev,
+            1,
+            GFP_KERNEL | __GFP_ZERO,
+        )?;
+
+        // Configure ACR boot parameters (WPR metadata location) using dma_write! macro
+        kernel::dma_write!(
+            fmc_boot_params[0].boot_gsp_rm_params.target = GSP_DMA_TARGET_COHERENT_SYSTEM
+        )?;
+        kernel::dma_write!(
+            fmc_boot_params[0].boot_gsp_rm_params.gsp_rm_desc_offset = wpr_meta_addr
+        )?;
+        kernel::dma_write!(fmc_boot_params[0].boot_gsp_rm_params.gsp_rm_desc_size = wpr_meta_size)?;
+
+        // Blackwell FSP expects wpr_carveout_offset and wpr_carveout_size to be zero, which the
+        // zeroed allocation above already guarantees; it obtains WPR info from other sources.
+        kernel::dma_write!(fmc_boot_params[0].boot_gsp_rm_params.b_is_gsp_rm_boot = 1)?;
+
+        // Configure RM parameters (libos location) using dma_write! macro
+        kernel::dma_write!(
+            fmc_boot_params[0].gsp_rm_params.target = GSP_DMA_TARGET_NONCOHERENT_SYSTEM
+        )?;
+        kernel::dma_write!(fmc_boot_params[0].gsp_rm_params.boot_args_offset = libos_addr)?;
+
+        Ok(fmc_boot_params)
+    }
+
+    /// Boot GSP FMC with pre-extracted signatures.
+    ///
+    /// This version takes pre-extracted signatures and FMC image data.
+    /// Used when signatures are extracted separately from the full ELF file.
+    #[allow(clippy::too_many_arguments)]
+    pub(crate) fn boot_gsp_fmc_with_signatures(
         dev: &device::Device<device::Bound>,
         bar: &crate::driver::Bar0,
+        chipset: crate::gpu::Chipset,
+        fmc_image_fw: &crate::dma::DmaObject, // Contains only the image section
+        fmc_boot_params: &kernel::dma::CoherentAllocation<GspFmcBootParams>,
+        total_reserved_size: u64,
+        resume: bool,
         fsp_falcon: &crate::falcon::Falcon<crate::falcon::fsp::Fsp>,
-        nvdm_type: u32,
-        packet: &[u8],
+        signatures: &FmcSignatures,
     ) -> Result<()> {
+        dev_dbg!(dev, "Starting FSP boot sequence for {}\n", chipset);
+
+        // Build FSP Chain of Trust message
+        let fmc_addr = fmc_image_fw.dma_handle(); // Now points to image data only
+        let fmc_boot_params_addr = fmc_boot_params.dma_handle();
+
+        // frts_offset is relative to FB end: FRTS_location = FB_END - frts_offset
+        let frts_offset = if !resume {
+            let mut frts_reserved_size =
+                if let Some(heap_size) = crate::fb::hal::fb_hal(chipset).non_wpr_heap_size() {
+                    u64::from(heap_size)
+                } else {
+                    total_reserved_size
+                };
+
+            // Add PMU reserved size
+            frts_reserved_size += u64::from(crate::fb::PMU_RESERVED_SIZE);
+
+            frts_reserved_size
+                .align_up(Alignment::new::<SZ_2M>())
+                .unwrap_or(frts_reserved_size)
+        } else {
+            0
+        };
+        let frts_size = if !resume { SZ_1M as u32 } else { 0 };
+
+        // Build the FSP message
+        let msg = KBox::new(
+            FspMessage {
+                mctp_header: (mctp::HEADER_SOM << mctp::HEADER_SOM_SHIFT)
+                    | (mctp::HEADER_EOM << mctp::HEADER_EOM_SHIFT)
+                    | (mctp::HEADER_SEID << mctp::HEADER_SEID_SHIFT)
+                    | (mctp::HEADER_SEQ << mctp::HEADER_SEQ_SHIFT),
+
+                nvdm_header: (mctp::MSG_TYPE_VENDOR_PCI)
+                    | (mctp::VENDOR_ID_NV << mctp::NVDM_VENDOR_ID_SHIFT)
+                    | (mctp::NVDM_TYPE_COT << mctp::NVDM_TYPE_SHIFT),
+
+                cot: NvdmPayloadCot {
+                    version: chipset.fsp_cot_version(),
+                    size: core::mem::size_of::<NvdmPayloadCot>() as u16,
+                    gsp_fmc_sysmem_offset: fmc_addr,
+                    frts_sysmem_offset: 0,
+                    frts_sysmem_size: 0,
+                    frts_vidmem_offset: frts_offset,
+                    frts_vidmem_size: frts_size,
+                    hash384: signatures.hash384,
+                    public_key: signatures.public_key,
+                    signature: signatures.signature,
+                    gsp_boot_args_sysmem_offset: fmc_boot_params_addr,
+                },
+            },
+            GFP_KERNEL,
+        )?;
+
+        // Send COT message to FSP and wait for response
+        Self::send_sync_fsp(dev, bar, fsp_falcon, &*msg)?;
+
+        dev_dbg!(dev, "FSP Chain of Trust completed successfully\n");
+        Ok(())
+    }
+
+    /// Send message to FSP and wait for response.
+    fn send_sync_fsp<M>(
+        dev: &device::Device<device::Bound>,
+        bar: &crate::driver::Bar0,
+        fsp_falcon: &crate::falcon::Falcon<crate::falcon::fsp::Fsp>,
+        msg: &M,
+    ) -> Result<()>
+    where
+        M: MessageToFsp,
+    {
         // Send message
-        fsp_falcon.send_msg(bar, packet)?;
+        fsp_falcon.send_msg(bar, msg.as_bytes())?;
 
         // Wait for response
         let timeout = Delta::from_millis(FSP_MSG_TIMEOUT_MS);
@@ -389,11 +539,11 @@ fn send_sync_fsp(
         }
 
         // Check command type matches
-        if command_nvdm_type != nvdm_type {
+        if command_nvdm_type != M::NVDM_TYPE {
             dev_err!(
                 dev,
                 "Expected NVDM type {:#x} in reply, got {:#x}\n",
-                nvdm_type,
+                M::NVDM_TYPE,
                 command_nvdm_type
             );
             return Err(EIO);
@@ -404,7 +554,7 @@ fn send_sync_fsp(
             dev_err!(
                 dev,
                 "NVDM command {:#x} failed with error {:#x}\n",
-                nvdm_type,
+                M::NVDM_TYPE,
                 error_code
             );
             return Err(EIO);
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index f04e2a795e90..88b1546e3cb4 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -124,6 +124,18 @@ pub(crate) const fn arch(&self) -> Architecture {
             | Self::GB207 => Architecture::Blackwell,
         }
     }
+
+    /// Returns the FSP Chain of Trust (COT) protocol version for this chipset.
+    ///
+    /// Hopper (GH100) uses version 1, Blackwell uses version 2.
+    pub(crate) const fn fsp_cot_version(&self) -> u16 {
+        match self.arch() {
+            Architecture::Hopper => 1,
+            Architecture::Blackwell => 2,
+            // Other architectures don't use FSP COT
+            _ => 0,
+        }
+    }
 }
 
 // TODO
-- 
2.53.0



* [PATCH v4 25/33] gpu: nova-core: Hopper/Blackwell: larger non-WPR heap
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (23 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 24/33] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-17 20:04   ` Danilo Krummrich
  2026-02-10  2:45 ` [PATCH v4 26/33] gpu: nova-core: Blackwell: use correct sysmem flush registers John Hubbard
                   ` (8 subsequent siblings)
  33 siblings, 1 reply; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Hopper, Blackwell and later GPUs require more space for the non-WPR heap
than the 1MB used so far: 0x200000 bytes on Hopper and 0x220000 bytes on
Blackwell.

Add a new FbHal method to return the non-WPR heap size, and create a new
GH100 HAL for Hopper and GB100 HAL for Blackwell that return the
appropriate value for each GPU architecture.
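
The dispatch described above can be sketched outside the kernel as
follows. This is a simplified model, not the driver code: the trait is
reduced to the one new method, the unit structs stand in for the real
HALs, and only the heap sizes (0x200000, 0x220000) and the 1MB fallback
are taken from this patch.

```rust
const SZ_1M: u64 = 1 << 20;

/// Reduced stand-in for the driver's FbHal trait.
pub trait FbHal {
    /// Non-WPR heap size, or `None` if the GPU uses the default.
    fn non_wpr_heap_size(&self) -> Option<u32> {
        None
    }
}

pub struct Ga102;
pub struct Gh100;
pub struct Gb100;

// Older GPUs keep the default implementation and thus return `None`.
impl FbHal for Ga102 {}

impl FbHal for Gh100 {
    fn non_wpr_heap_size(&self) -> Option<u32> {
        Some(0x200000)
    }
}

impl FbHal for Gb100 {
    fn non_wpr_heap_size(&self) -> Option<u32> {
        Some(0x220000)
    }
}

/// Mirrors calc_non_wpr_heap_size(): use the HAL value if present,
/// otherwise fall back to 1MB.
pub fn calc_non_wpr_heap_size(hal: &dyn FbHal) -> u64 {
    hal.non_wpr_heap_size().map(u64::from).unwrap_or(SZ_1M)
}
```

Using a default method that returns `None` means the existing HALs need
no changes, and callers can also use `is_some()` to tell whether a GPU
belongs to the group that needs the larger reservation.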

Cc: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/fb.rs           | 14 +++++++---
 drivers/gpu/nova-core/fb/hal.rs       |  7 +++--
 drivers/gpu/nova-core/fb/hal/ga102.rs |  2 +-
 drivers/gpu/nova-core/fb/hal/gb100.rs | 37 +++++++++++++++++++++++++++
 drivers/gpu/nova-core/fb/hal/gh100.rs | 37 +++++++++++++++++++++++++++
 5 files changed, 91 insertions(+), 6 deletions(-)
 create mode 100644 drivers/gpu/nova-core/fb/hal/gb100.rs
 create mode 100644 drivers/gpu/nova-core/fb/hal/gh100.rs

diff --git a/drivers/gpu/nova-core/fb.rs b/drivers/gpu/nova-core/fb.rs
index 3a2b79a5c107..7c502f15622c 100644
--- a/drivers/gpu/nova-core/fb.rs
+++ b/drivers/gpu/nova-core/fb.rs
@@ -98,6 +98,15 @@ pub(crate) fn unregister(&self, bar: &Bar0) {
     }
 }
 
+/// Calculate non-WPR heap size based on chipset architecture.
+/// This matches the logic used in FSP for consistency.
+pub(crate) fn calc_non_wpr_heap_size(chipset: Chipset) -> u64 {
+    hal::fb_hal(chipset)
+        .non_wpr_heap_size()
+        .map(u64::from)
+        .unwrap_or(SZ_1M as u64)
+}
+
 pub(crate) struct FbRange(Range<u64>);
 
 impl FbRange {
@@ -255,9 +264,8 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<
         };
 
         let heap = {
-            const HEAP_SIZE: u64 = usize_as_u64(SZ_1M);
-
-            FbRange(wpr2.start - HEAP_SIZE..wpr2.start)
+            let heap_size = calc_non_wpr_heap_size(chipset);
+            FbRange(wpr2.start - heap_size..wpr2.start)
         };
 
         // Calculate reserved sizes. PMU reservation is a subset of the total reserved size.
diff --git a/drivers/gpu/nova-core/fb/hal.rs b/drivers/gpu/nova-core/fb/hal.rs
index eaa545fe9b08..ebd12247f771 100644
--- a/drivers/gpu/nova-core/fb/hal.rs
+++ b/drivers/gpu/nova-core/fb/hal.rs
@@ -12,6 +12,8 @@
 
 mod ga100;
 mod ga102;
+mod gb100;
+mod gh100;
 mod tu102;
 
 pub(crate) trait FbHal {
@@ -42,7 +44,8 @@ pub(crate) fn fb_hal(chipset: Chipset) -> &'static dyn FbHal {
     match chipset.arch() {
         Architecture::Turing => tu102::TU102_HAL,
         Architecture::Ampere if chipset == Chipset::GA100 => ga100::GA100_HAL,
-        Architecture::Ampere => ga102::GA102_HAL,
-        Architecture::Hopper | Architecture::Ada | Architecture::Blackwell => ga102::GA102_HAL,
+        Architecture::Ampere | Architecture::Ada => ga102::GA102_HAL,
+        Architecture::Hopper => gh100::GH100_HAL,
+        Architecture::Blackwell => gb100::GB100_HAL,
     }
 }
diff --git a/drivers/gpu/nova-core/fb/hal/ga102.rs b/drivers/gpu/nova-core/fb/hal/ga102.rs
index 734605905031..f8d8f01e3c5d 100644
--- a/drivers/gpu/nova-core/fb/hal/ga102.rs
+++ b/drivers/gpu/nova-core/fb/hal/ga102.rs
@@ -8,7 +8,7 @@
     regs, //
 };
 
-fn vidmem_size_ga102(bar: &Bar0) -> u64 {
+pub(super) fn vidmem_size_ga102(bar: &Bar0) -> u64 {
     regs::NV_USABLE_FB_SIZE_IN_MB::read(bar).usable_fb_size()
 }
 
diff --git a/drivers/gpu/nova-core/fb/hal/gb100.rs b/drivers/gpu/nova-core/fb/hal/gb100.rs
new file mode 100644
index 000000000000..eaab3f934f6e
--- /dev/null
+++ b/drivers/gpu/nova-core/fb/hal/gb100.rs
@@ -0,0 +1,37 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use kernel::prelude::*;
+
+use crate::{
+    driver::Bar0,
+    fb::hal::FbHal, //
+};
+
+struct Gb100;
+
+impl FbHal for Gb100 {
+    fn read_sysmem_flush_page(&self, bar: &Bar0) -> u64 {
+        super::ga100::read_sysmem_flush_page_ga100(bar)
+    }
+
+    fn write_sysmem_flush_page(&self, bar: &Bar0, addr: u64) -> Result {
+        super::ga100::write_sysmem_flush_page_ga100(bar, addr);
+
+        Ok(())
+    }
+
+    fn supports_display(&self, bar: &Bar0) -> bool {
+        super::ga100::display_enabled_ga100(bar)
+    }
+
+    fn vidmem_size(&self, bar: &Bar0) -> u64 {
+        super::ga102::vidmem_size_ga102(bar)
+    }
+
+    fn non_wpr_heap_size(&self) -> Option<u32> {
+        Some(0x220000)
+    }
+}
+
+const GB100: Gb100 = Gb100;
+pub(super) const GB100_HAL: &dyn FbHal = &GB100;
diff --git a/drivers/gpu/nova-core/fb/hal/gh100.rs b/drivers/gpu/nova-core/fb/hal/gh100.rs
new file mode 100644
index 000000000000..6c56b8439276
--- /dev/null
+++ b/drivers/gpu/nova-core/fb/hal/gh100.rs
@@ -0,0 +1,37 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use kernel::prelude::*;
+
+use crate::{
+    driver::Bar0,
+    fb::hal::FbHal, //
+};
+
+struct Gh100;
+
+impl FbHal for Gh100 {
+    fn read_sysmem_flush_page(&self, bar: &Bar0) -> u64 {
+        super::ga100::read_sysmem_flush_page_ga100(bar)
+    }
+
+    fn write_sysmem_flush_page(&self, bar: &Bar0, addr: u64) -> Result {
+        super::ga100::write_sysmem_flush_page_ga100(bar, addr);
+
+        Ok(())
+    }
+
+    fn supports_display(&self, bar: &Bar0) -> bool {
+        super::ga100::display_enabled_ga100(bar)
+    }
+
+    fn vidmem_size(&self, bar: &Bar0) -> u64 {
+        super::ga102::vidmem_size_ga102(bar)
+    }
+
+    fn non_wpr_heap_size(&self) -> Option<u32> {
+        Some(0x200000)
+    }
+}
+
+const GH100: Gh100 = Gh100;
+pub(super) const GH100_HAL: &dyn FbHal = &GH100;
-- 
2.53.0



* [PATCH v4 26/33] gpu: nova-core: Blackwell: use correct sysmem flush registers
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (24 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 25/33] gpu: nova-core: Hopper/Blackwell: larger non-WPR heap John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-10  2:45 ` [PATCH v4 27/33] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap John Hubbard
                   ` (7 subsequent siblings)
  33 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Blackwell GPUs moved the sysmem flush page registers away from the
legacy NV_PFB_NISO_FLUSH_SYSMEM_ADDR used by Ampere/Ada.

GB10x uses HSHUB0 registers, with both a primary and EG (egress) pair
that must be programmed to the same address. GB20x uses FBHUB0
registers.

Add separate GB100 and GB202 fb HALs, and split the Blackwell HAL
dispatch so that each uses its respective registers.
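
For reference, the address handling shared by both new register layouts
can be modelled in plain Rust as below. This is only a sketch: the
function names are hypothetical, and the explicit 20-bit mask stands in
for what the register field definition (bits 19:0 of the HI register)
does in the driver.

```rust
/// The HI register's `adr` field covers bits 19:0.
const HI_FIELD_MASK: u32 = (1 << 20) - 1;

/// Splits a DMA address into the (HI, LO) values to be written.
/// HI is listed first because the driver writes it first.
pub fn split_flush_addr(addr: u64) -> (u32, u32) {
    // Upper 32 bits, limited to the 20-bit register field.
    let hi = ((addr >> 32) as u32) & HI_FIELD_MASK;
    // Lower 32 bits; hardware ignores bits 7:0 of this value.
    let lo = addr as u32;
    (hi, lo)
}

/// Rebuilds the address from the two register values, as the
/// read_sysmem_flush_page_*() helpers do.
pub fn join_flush_addr(hi: u32, lo: u32) -> u64 {
    u64::from(lo) | (u64::from(hi) << 32)
}
```

With this layout any address below 2^52 round-trips unchanged through
split and join, which differs from the legacy registers that store the
address shifted right by 8 bits.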

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/fb/hal.rs       |  6 ++-
 drivers/gpu/nova-core/fb/hal/gb100.rs | 42 ++++++++++++++++--
 drivers/gpu/nova-core/fb/hal/gb202.rs | 62 +++++++++++++++++++++++++++
 drivers/gpu/nova-core/regs.rs         | 36 ++++++++++++++++
 4 files changed, 142 insertions(+), 4 deletions(-)
 create mode 100644 drivers/gpu/nova-core/fb/hal/gb202.rs

diff --git a/drivers/gpu/nova-core/fb/hal.rs b/drivers/gpu/nova-core/fb/hal.rs
index ebd12247f771..01f19a00685c 100644
--- a/drivers/gpu/nova-core/fb/hal.rs
+++ b/drivers/gpu/nova-core/fb/hal.rs
@@ -13,6 +13,7 @@
 mod ga100;
 mod ga102;
 mod gb100;
+mod gb202;
 mod gh100;
 mod tu102;
 
@@ -46,6 +47,9 @@ pub(crate) fn fb_hal(chipset: Chipset) -> &'static dyn FbHal {
         Architecture::Ampere if chipset == Chipset::GA100 => ga100::GA100_HAL,
         Architecture::Ampere | Architecture::Ada => ga102::GA102_HAL,
         Architecture::Hopper => gh100::GH100_HAL,
-        Architecture::Blackwell => gb100::GB100_HAL,
+        Architecture::Blackwell => match chipset {
+            Chipset::GB100 | Chipset::GB102 => gb100::GB100_HAL,
+            _ => gb202::GB202_HAL,
+        },
     }
 }
diff --git a/drivers/gpu/nova-core/fb/hal/gb100.rs b/drivers/gpu/nova-core/fb/hal/gb100.rs
index eaab3f934f6e..6039f66b12cc 100644
--- a/drivers/gpu/nova-core/fb/hal/gb100.rs
+++ b/drivers/gpu/nova-core/fb/hal/gb100.rs
@@ -1,21 +1,57 @@
 // SPDX-License-Identifier: GPL-2.0
 
+//! Blackwell GB10x framebuffer HAL.
+//!
+//! GB10x GPUs use HSHUB0 registers for the sysmem flush page. Both the primary and EG (egress)
+//! register pairs must be programmed to the same address, as required by hardware.
+
 use kernel::prelude::*;
 
 use crate::{
     driver::Bar0,
-    fb::hal::FbHal, //
+    fb::hal::FbHal,
+    regs, //
 };
 
 struct Gb100;
 
+fn read_sysmem_flush_page_gb100(bar: &Bar0) -> u64 {
+    let lo = u64::from(regs::NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO::read(bar).adr());
+    let hi = u64::from(regs::NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI::read(bar).adr());
+
+    lo | (hi << 32)
+}
+
+fn write_sysmem_flush_page_gb100(bar: &Bar0, addr: u64) {
+    let addr_lo = addr as u32;
+    let addr_hi = (addr >> 32) as u32;
+
+    // Write HI first: the hardware may trigger the flush on the LO write.
+
+    // Primary HSHUB pair.
+    regs::NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI::default()
+        .set_adr(addr_hi)
+        .write(bar);
+    regs::NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO::default()
+        .set_adr(addr_lo)
+        .write(bar);
+
+    // EG (egress) pair -- must match the primary pair.
+    regs::NV_PFB_HSHUB0_EG_PCIE_FLUSH_SYSMEM_ADDR_HI::default()
+        .set_adr(addr_hi)
+        .write(bar);
+    regs::NV_PFB_HSHUB0_EG_PCIE_FLUSH_SYSMEM_ADDR_LO::default()
+        .set_adr(addr_lo)
+        .write(bar);
+}
+
 impl FbHal for Gb100 {
     fn read_sysmem_flush_page(&self, bar: &Bar0) -> u64 {
-        super::ga100::read_sysmem_flush_page_ga100(bar)
+        read_sysmem_flush_page_gb100(bar)
     }
 
     fn write_sysmem_flush_page(&self, bar: &Bar0, addr: u64) -> Result {
-        super::ga100::write_sysmem_flush_page_ga100(bar, addr);
+        write_sysmem_flush_page_gb100(bar, addr);
 
         Ok(())
     }
diff --git a/drivers/gpu/nova-core/fb/hal/gb202.rs b/drivers/gpu/nova-core/fb/hal/gb202.rs
new file mode 100644
index 000000000000..7fa9a49d2cca
--- /dev/null
+++ b/drivers/gpu/nova-core/fb/hal/gb202.rs
@@ -0,0 +1,62 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Blackwell GB20x framebuffer HAL.
+//!
+//! GB20x GPUs moved the sysmem flush registers from `NV_PFB_NISO_FLUSH_SYSMEM_ADDR` to
+//! `NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_{LO,HI}`.
+
+use kernel::prelude::*;
+
+use crate::{
+    driver::Bar0,
+    fb::hal::FbHal,
+    regs, //
+};
+
+struct Gb202;
+
+fn read_sysmem_flush_page_gb202(bar: &Bar0) -> u64 {
+    let lo = u64::from(regs::NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO::read(bar).adr());
+    let hi = u64::from(regs::NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI::read(bar).adr());
+
+    lo | (hi << 32)
+}
+
+fn write_sysmem_flush_page_gb202(bar: &Bar0, addr: u64) {
+    // Write HI first. The hardware will trigger the flush on the LO write.
+    regs::NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI::default()
+        // CAST: upper 32 bits, then masked to 20 bits by the register field.
+        .set_adr((addr >> 32) as u32)
+        .write(bar);
+    regs::NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO::default()
+        // CAST: lower 32 bits. Hardware ignores bits 7:0.
+        .set_adr(addr as u32)
+        .write(bar);
+}
+
+impl FbHal for Gb202 {
+    fn read_sysmem_flush_page(&self, bar: &Bar0) -> u64 {
+        read_sysmem_flush_page_gb202(bar)
+    }
+
+    fn write_sysmem_flush_page(&self, bar: &Bar0, addr: u64) -> Result {
+        write_sysmem_flush_page_gb202(bar, addr);
+
+        Ok(())
+    }
+
+    fn supports_display(&self, bar: &Bar0) -> bool {
+        super::ga100::display_enabled_ga100(bar)
+    }
+
+    fn vidmem_size(&self, bar: &Bar0) -> u64 {
+        super::ga102::vidmem_size_ga102(bar)
+    }
+
+    fn non_wpr_heap_size(&self) -> Option<u32> {
+        Some(0x220000)
+    }
+}
+
+const GB202: Gb202 = Gb202;
+pub(super) const GB202_HAL: &dyn FbHal = &GB202;
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index 0d3ad4755a81..dfc30247bca6 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -116,6 +116,42 @@ fn fmt(&self, f: &mut kernel::fmt::Formatter<'_>) -> kernel::fmt::Result {
     23:0    adr_63_40 as u32;
 });
 
+// Blackwell GB10x sysmem flush registers (HSHUB0).
+//
+// GB10x GPUs use two pairs of HSHUB registers for sysmembar: a primary pair and an EG
+// (egress) pair. Both must be programmed to the same address. Hardware ignores bits 7:0
+// of each LO register. HSHUB0 base is 0x00891000.
+
+register!(NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO @ 0x00891e50 {
+    31:0    adr as u32;
+});
+
+register!(NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI @ 0x00891e54 {
+    19:0    adr as u32;
+});
+
+register!(NV_PFB_HSHUB0_EG_PCIE_FLUSH_SYSMEM_ADDR_LO @ 0x008916c0 {
+    31:0    adr as u32;
+});
+
+register!(NV_PFB_HSHUB0_EG_PCIE_FLUSH_SYSMEM_ADDR_HI @ 0x008916c4 {
+    19:0    adr as u32;
+});
+
+// Blackwell GB20x sysmem flush registers (FBHUB0).
+//
+// Unlike the older NV_PFB_NISO_FLUSH_SYSMEM_ADDR registers which encode the address with an
+// 8-bit right-shift, these registers take the raw address split into lower/upper 32-bit halves.
+// The hardware ignores bits 7:0 of the LO register.
+
+register!(NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO @ 0x008a1d58 {
+    31:0    adr as u32;
+});
+
+register!(NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI @ 0x008a1d5c {
+    19:0    adr as u32;
+});
+
 register!(NV_PFB_PRI_MMU_LOCAL_MEMORY_RANGE @ 0x00100ce0 {
     3:0     lower_scale as u8;
     9:4     lower_mag as u8;
-- 
2.53.0



* [PATCH v4 27/33] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (25 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 26/33] gpu: nova-core: Blackwell: use correct sysmem flush registers John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-17 20:10   ` Danilo Krummrich
  2026-02-10  2:45 ` [PATCH v4 28/33] gpu: nova-core: refactor SEC2 booter loading into run_booter() helper John Hubbard
                   ` (6 subsequent siblings)
  33 siblings, 1 reply; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

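Hopper, Blackwell and later GPUs require a larger heap for WPR2: a 14MB
base RM size and a 142MB client allocation size, with a 170MB minimum
heap size on Blackwell.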

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/gsp/fw.rs | 57 ++++++++++++++++++++++++++-------
 1 file changed, 45 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index 086153edfa86..927bcee6a5a5 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -49,21 +49,41 @@ enum GspFwHeapParams {}
 /// Minimum required alignment for the GSP heap.
 const GSP_HEAP_ALIGNMENT: Alignment = Alignment::new::<{ 1 << 20 }>();
 
+// These constants override the generated bindings for architecture-specific heap sizing.
+//
+// 14MB for Hopper/Blackwell+.
+const GSP_FW_HEAP_PARAM_BASE_RM_SIZE_GH100: u64 = 14 * SZ_1M as u64;
+// 142MB client alloc for ~188MB total.
+const GSP_FW_HEAP_PARAM_CLIENT_ALLOC_SIZE_GH100: u64 = 142 * SZ_1M as u64;
+// Blackwell-specific minimum heap size (88 + 12 + 70 = 170MB)
+const GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS3_BAREMETAL_MIN_MB_BLACKWELL: u64 = 170;
+
 impl GspFwHeapParams {
     /// Returns the amount of GSP-RM heap memory used during GSP-RM boot and initialization (up to
     /// and including the first client subdevice allocation).
-    fn base_rm_size(_chipset: Chipset) -> u64 {
-        // TODO: this needs to be updated to return the correct value for Hopper+ once support for
-        // them is added:
-        // u64::from(bindings::GSP_FW_HEAP_PARAM_BASE_RM_SIZE_GH100)
-        u64::from(bindings::GSP_FW_HEAP_PARAM_BASE_RM_SIZE_TU10X)
+    fn base_rm_size(chipset: Chipset) -> u64 {
+        if crate::fb::hal::fb_hal(chipset)
+            .non_wpr_heap_size()
+            .is_some()
+        {
+            GSP_FW_HEAP_PARAM_BASE_RM_SIZE_GH100
+        } else {
+            u64::from(bindings::GSP_FW_HEAP_PARAM_BASE_RM_SIZE_TU10X)
+        }
     }
 
     /// Returns the amount of heap memory required to support a single channel allocation.
-    fn client_alloc_size() -> u64 {
-        u64::from(bindings::GSP_FW_HEAP_PARAM_CLIENT_ALLOC_SIZE)
-            .align_up(GSP_HEAP_ALIGNMENT)
-            .unwrap_or(u64::MAX)
+    fn client_alloc_size(chipset: Chipset) -> u64 {
+        if crate::fb::hal::fb_hal(chipset)
+            .non_wpr_heap_size()
+            .is_some()
+        {
+            GSP_FW_HEAP_PARAM_CLIENT_ALLOC_SIZE_GH100
+        } else {
+            u64::from(bindings::GSP_FW_HEAP_PARAM_CLIENT_ALLOC_SIZE)
+        }
+        .align_up(GSP_HEAP_ALIGNMENT)
+        .expect("client_alloc_size alignment overflow")
     }
 
     /// Returns the amount of memory to reserve for management purposes for a framebuffer of size
@@ -74,7 +94,7 @@ fn management_overhead(fb_size: u64) -> u64 {
         u64::from(bindings::GSP_FW_HEAP_PARAM_SIZE_PER_GB_FB)
             .saturating_mul(fb_size_gb)
             .align_up(GSP_HEAP_ALIGNMENT)
-            .unwrap_or(u64::MAX)
+            .expect("management_overhead alignment overflow")
     }
 }
 
@@ -106,12 +126,25 @@ impl LibosParams {
                 * num::usize_as_u64(SZ_1M),
     };
 
+    /// Hopper/Blackwell+ GPUs need a larger minimum heap size than the bindings specify.
+    /// The r570 bindings set LIBOS3_BAREMETAL_MIN_MB to 88MB, but Hopper/Blackwell+ actually
+    /// requires 170MB (88 + 12 + 70).
+    const LIBOS_BLACKWELL: LibosParams = LibosParams {
+        carveout_size: num::u32_as_u64(bindings::GSP_FW_HEAP_PARAM_OS_SIZE_LIBOS3_BAREMETAL),
+        allowed_heap_size: GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS3_BAREMETAL_MIN_MB_BLACKWELL
+            * num::usize_as_u64(SZ_1M)
+            ..num::u32_as_u64(bindings::GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS3_BAREMETAL_MAX_MB)
+                * num::usize_as_u64(SZ_1M),
+    };
+
     /// Returns the libos parameters corresponding to `chipset`.
     pub(crate) fn from_chipset(chipset: Chipset) -> &'static LibosParams {
         if chipset < Chipset::GA102 {
             &Self::LIBOS2
-        } else {
+        } else if chipset < Chipset::GH100 {
             &Self::LIBOS3
+        } else {
+            &Self::LIBOS_BLACKWELL
         }
     }
 
@@ -124,7 +157,7 @@ pub(crate) fn wpr_heap_size(&self, chipset: Chipset, fb_size: u64) -> u64 {
             // RM boot working memory,
             .saturating_add(GspFwHeapParams::base_rm_size(chipset))
             // One RM client,
-            .saturating_add(GspFwHeapParams::client_alloc_size())
+            .saturating_add(GspFwHeapParams::client_alloc_size(chipset))
             // Overhead for memory management.
             .saturating_add(GspFwHeapParams::management_overhead(fb_size))
             // Clamp to the supported heap sizes.
-- 
2.53.0



* [PATCH v4 28/33] gpu: nova-core: refactor SEC2 booter loading into run_booter() helper
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (26 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 27/33] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-17 20:12   ` Danilo Krummrich
  2026-02-10  2:45 ` [PATCH v4 29/33] gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling John Hubbard
                   ` (5 subsequent siblings)
  33 siblings, 1 reply; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Extract the SEC2 booter loading sequence into a dedicated helper
function. This is an *almost* pure refactoring with no behavior change,
done in preparation for adding an alternative FSP boot path. The one
slight difference is that an MBOX1 printing typo is fixed:

Previous output:

NovaCore 0000:e1:00.0: SEC2 MBOX0: 0x0, MBOX10x1

Fixed output:

NovaCore 0000:e1:00.0: SEC2 MBOX0: 0x0, MBOX1: 0x1
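
The difference comes down to a missing ": " in the format string. A
minimal standalone sketch reproduces both message bodies; it uses plain
format!() rather than the kernel's dev_dbg!(), and the two function
names are made up for illustration:

```rust
/// Format string as it was before the refactoring: no ": " after MBOX1.
fn old_mbox_line(mbox0: u32, mbox1: u32) -> String {
    format!("SEC2 MBOX0: {:#x}, MBOX1{:#x}", mbox0, mbox1)
}

/// Format string as used in run_booter().
fn new_mbox_line(mbox0: u32, mbox1: u32) -> String {
    format!("SEC2 MBOX0: {:#x}, MBOX1: {:#x}", mbox0, mbox1)
}
```

Calling these with (0x0, 0x1) yields the message text shown above,
without the device prefix.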

Cc: Timur Tabi <ttabi@nvidia.com>
Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/gsp/boot.rs | 67 ++++++++++++++++---------------
 1 file changed, 35 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
index 465c18e4c888..6191986fc6b5 100644
--- a/drivers/gpu/nova-core/gsp/boot.rs
+++ b/drivers/gpu/nova-core/gsp/boot.rs
@@ -120,6 +120,40 @@ fn run_fwsec_frts(
         }
     }
 
+    fn run_booter(
+        dev: &device::Device<device::Bound>,
+        bar: &Bar0,
+        chipset: Chipset,
+        sec2_falcon: &Falcon<Sec2>,
+        wpr_meta: &CoherentAllocation<GspFwWprMeta>,
+    ) -> Result {
+        let booter_loader = BooterFirmware::new(
+            dev,
+            BooterKind::Loader,
+            chipset,
+            FIRMWARE_VERSION,
+            sec2_falcon,
+            bar,
+        )?;
+
+        sec2_falcon.reset(bar)?;
+        sec2_falcon.load(bar, &booter_loader)?;
+        let wpr_handle = wpr_meta.dma_handle();
+        let (mbox0, mbox1) = sec2_falcon.boot(
+            bar,
+            Some(wpr_handle as u32),
+            Some((wpr_handle >> 32) as u32),
+        )?;
+        dev_dbg!(dev, "SEC2 MBOX0: {:#x}, MBOX1: {:#x}\n", mbox0, mbox1);
+
+        if mbox0 != 0 {
+            dev_err!(dev, "Booter-load failed with error {:#x}\n", mbox0);
+            return Err(ENODEV);
+        }
+
+        Ok(())
+    }
+
     /// Attempt to boot the GSP.
     ///
     /// This is a GPU-dependent and complex procedure that involves loading firmware files from
@@ -146,15 +180,6 @@ pub(crate) fn boot(
 
         Self::run_fwsec_frts(dev, gsp_falcon, bar, &bios, &fb_layout)?;
 
-        let booter_loader = BooterFirmware::new(
-            dev,
-            BooterKind::Loader,
-            chipset,
-            FIRMWARE_VERSION,
-            sec2_falcon,
-            bar,
-        )?;
-
         let wpr_meta =
             CoherentAllocation::<GspFwWprMeta>::alloc_coherent(dev, 1, GFP_KERNEL | __GFP_ZERO)?;
         dma_write!(wpr_meta[0] = GspFwWprMeta::new(&gsp_fw, &fb_layout))?;
@@ -182,29 +207,7 @@ pub(crate) fn boot(
             "Using SEC2 to load and run the booter_load firmware...\n"
         );
 
-        sec2_falcon.reset(bar)?;
-        sec2_falcon.load(bar, &booter_loader)?;
-        let wpr_handle = wpr_meta.dma_handle();
-        let (mbox0, mbox1) = sec2_falcon.boot(
-            bar,
-            Some(wpr_handle as u32),
-            Some((wpr_handle >> 32) as u32),
-        )?;
-        dev_dbg!(
-            pdev,
-            "SEC2 MBOX0: {:#x}, MBOX1{:#x}\n",
-            mbox0,
-            mbox1
-        );
-
-        if mbox0 != 0 {
-            dev_err!(
-                pdev,
-                "Booter-load failed with error {:#x}\n",
-                mbox0
-            );
-            return Err(ENODEV);
-        }
+        Self::run_booter(dev, bar, chipset, sec2_falcon, &wpr_meta)?;
 
         gsp_falcon.write_os_version(bar, gsp_fw.bootloader.app_version);
 
-- 
2.53.0



* [PATCH v4 29/33] gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (27 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 28/33] gpu: nova-core: refactor SEC2 booter loading into run_booter() helper John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-17 20:20   ` Danilo Krummrich
  2026-02-10  2:45 ` [PATCH v4 30/33] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot path John Hubbard
                   ` (4 subsequent siblings)
  33 siblings, 1 reply; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

On Hopper and Blackwell, FSP boots GSP with hardware lockdown enabled.
After FSP Chain of Trust completes, the driver must poll for lockdown
release before proceeding with GSP initialization. Add the register
bit and helper functions needed for this polling.
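
The per-poll decision can be sketched as a pure function over the values
the driver reads from hardware. This is only an illustration of the
three checks in this patch: the enum, the function name, and passing the
register values in as plain arguments are assumptions made so that the
logic stands alone without MMIO access.

```rust
/// Result of one polling step (illustrative names, not from the patch).
#[derive(Debug, PartialEq, Eq)]
enum LockdownPoll {
    /// GSP is still locked down: keep polling.
    StillLocked,
    /// Lockdown bit is clear and mailbox0 did not look like an error.
    Released,
    /// Stop polling: mailbox0 holds something other than the boot
    /// parameters address, which is treated as an error code.
    ReleasedWithError,
}

fn classify_lockdown(
    mbox0: u32,
    mbox1: u32,
    br_priv_lockdown: bool,
    fmc_boot_params_addr: u64,
) -> LockdownPoll {
    // Check 1: a 0xbadf41xx pattern in mailbox0 means "still locked".
    if (mbox0 & 0xffff_ff00) == 0xbadf_4100 {
        return LockdownPoll::StillLocked;
    }

    // Check 2: a non-zero mailbox0 is an error unless mailbox1:mailbox0
    // still holds the FMC boot parameters address.
    if mbox0 != 0 {
        let combined = (u64::from(mbox1) << 32) | u64::from(mbox0);
        if combined != fmc_boot_params_addr {
            return LockdownPoll::ReleasedWithError;
        }
    }

    // Check 3: otherwise the HWCFG2 lockdown bit decides.
    if br_priv_lockdown {
        LockdownPoll::StillLocked
    } else {
        LockdownPoll::Released
    }
}
```

The patch itself folds the last two outcomes into a single `true`
return and lets the caller inspect mailbox0 afterwards.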

Cc: Gary Guo <gary@garyguo.net>
Cc: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/gsp/boot.rs | 88 ++++++++++++++++++++++++++++++-
 drivers/gpu/nova-core/regs.rs     |  1 +
 2 files changed, 88 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
index 6191986fc6b5..1e8a7306e078 100644
--- a/drivers/gpu/nova-core/gsp/boot.rs
+++ b/drivers/gpu/nova-core/gsp/boot.rs
@@ -15,7 +15,8 @@
     falcon::{
         gsp::Gsp,
         sec2::Sec2,
-        Falcon, //
+        Falcon,
+        FalconEngine, //
     },
     fb::FbLayout,
     firmware::{
@@ -154,6 +155,91 @@ fn run_booter(
         Ok(())
     }
 
+    /// Check if GSP lockdown has been released after FSP Chain of Trust
+    fn gsp_lockdown_released(
+        dev: &device::Device,
+        gsp_falcon: &Falcon<Gsp>,
+        bar: &Bar0,
+        fmc_boot_params_addr: u64,
+        mbox0: &mut u32,
+    ) -> bool {
+        // Read GSP falcon mailbox0
+        *mbox0 = gsp_falcon.read_mailbox0(bar);
+
+        // Check 1: If mbox0 has 0xbadf4100 pattern, GSP is still locked down
+        if *mbox0 != 0 && (*mbox0 & 0xffffff00) == 0xbadf4100 {
+            return false;
+        }
+
+        // Check 2: If mbox0 has a value, check if it's an error
+        if *mbox0 != 0 {
+            let mbox1 = gsp_falcon.read_mailbox1(bar);
+
+            let combined_addr = (u64::from(mbox1) << 32) | u64::from(*mbox0);
+            if combined_addr != fmc_boot_params_addr {
+                // Address doesn't match - GSP wrote an error code
+                // Return TRUE (lockdown released) with error
+                dev_dbg!(
+                    dev,
+                    "GSP lockdown error: mbox0={:#x}, combined_addr={:#x}, expected={:#x}\n",
+                    *mbox0,
+                    combined_addr,
+                    fmc_boot_params_addr
+                );
+                return true;
+            }
+        }
+
+        // Check 3: Verify HWCFG2 RISCV_BR_PRIV_LOCKDOWN bit is clear
+        let hwcfg2 = regs::NV_PFALCON_FALCON_HWCFG2::read(bar, &crate::falcon::gsp::Gsp::ID);
+        !hwcfg2.riscv_br_priv_lockdown()
+    }
+
+    /// Wait for GSP lockdown to be released after FSP Chain of Trust
+    #[expect(dead_code)]
+    fn wait_for_gsp_lockdown_release(
+        dev: &device::Device,
+        bar: &Bar0,
+        gsp_falcon: &Falcon<Gsp>,
+        fmc_boot_params_addr: u64,
+    ) -> Result<u32> {
+        dev_dbg!(dev, "Waiting for GSP lockdown release\n");
+
+        let mut mbox0: u32 = 0;
+
+        let (_, mbox0) = read_poll_timeout(
+            || {
+                let released = Self::gsp_lockdown_released(
+                    dev,
+                    gsp_falcon,
+                    bar,
+                    fmc_boot_params_addr,
+                    &mut mbox0,
+                );
+
+                Ok((released, mbox0))
+            },
+            |(released, _)| *released,
+            Delta::ZERO,
+            Delta::from_millis(4000),
+        )
+        .inspect_err(|_| {
+            dev_err!(dev, "GSP lockdown release timeout\n");
+        })?;
+
+        // Check mbox0 for error after wait completion
+        if mbox0 != 0 {
+            dev_err!(dev, "GSP-FMC boot failed (mbox: {:#x})\n", mbox0);
+            return Err(EIO);
+        }
+
+        dev_dbg!(
+            dev,
+            "GSP hardware lockdown fully released, proceeding with initialization\n"
+        );
+        Ok(mbox0)
+    }
+
     /// Attempt to boot the GSP.
     ///
     /// This is a GPU-dependent and complex procedure that involves loading firmware files from
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index dfc30247bca6..01788aa8d1f1 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -321,6 +321,7 @@ pub(crate) fn vga_workspace_addr(self) -> Option<u64> {
 register!(NV_PFALCON_FALCON_HWCFG2 @ PFalconBase[0x000000f4] {
     10:10   riscv as bool;
     12:12   mem_scrubbing as bool, "Set to 0 after memory scrubbing is completed";
+    13:13   riscv_br_priv_lockdown as bool, "RISC-V branch privilege lockdown bit";
     31:31   reset_ready as bool, "Signal indicating that reset is completed (GA102+)";
 });
 
-- 
2.53.0



* [PATCH v4 30/33] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot path
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (28 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 29/33] gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-10  2:45 ` [PATCH v4 31/33] gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror John Hubbard
                   ` (3 subsequent siblings)
  33 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Add the FSP-based boot path for Hopper and Blackwell GPUs. Unlike
Turing/Ampere/Ada, which use SEC2 to load the booter firmware, Hopper
and Blackwell use FSP (Firmware System Processor) with FMC firmware
to establish a Chain of Trust and boot GSP directly.

The boot() function now dispatches to either run_booter() (SEC2 path)
or run_fsp() (FSP path) based on the GPU architecture. The cmdq
commands are moved to after GSP boot, and the GSP sequencer is only
run for SEC2-based architectures.
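
The dispatch rule can be summarized in a few lines. The following is a
simplified standalone sketch, not the driver code: the local
Architecture enum mirrors the variants matched in this patch, while
BootPath, boot_path() and runs_gsp_sequencer() are hypothetical names
introduced only for illustration.

```rust
#[derive(Clone, Copy, Debug)]
enum Architecture {
    Turing,
    Ampere,
    Ada,
    Hopper,
    Blackwell,
}

/// Which firmware path brings up the GSP (illustrative type).
#[derive(Debug, PartialEq, Eq)]
enum BootPath {
    /// FWSEC-FRTS, then the SEC2 booter.
    Sec2Booter,
    /// FSP Chain of Trust with FMC firmware.
    Fsp,
}

fn boot_path(arch: Architecture) -> BootPath {
    match arch {
        Architecture::Turing | Architecture::Ampere | Architecture::Ada => BootPath::Sec2Booter,
        Architecture::Hopper | Architecture::Blackwell => BootPath::Fsp,
    }
}

/// The GSP sequencer is only run on the SEC2 path.
fn runs_gsp_sequencer(arch: Architecture) -> bool {
    boot_path(arch) == BootPath::Sec2Booter
}
```

An exhaustive match, as opposed to a chipset comparison, means a new
Architecture variant fails to compile until a boot path is chosen for
it.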

Cc: Gary Guo <gary@garyguo.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/fb.rs           |   1 -
 drivers/gpu/nova-core/firmware/fsp.rs |   2 -
 drivers/gpu/nova-core/fsp.rs          |   6 +-
 drivers/gpu/nova-core/gsp/boot.rs     | 158 ++++++++++++++++++++------
 4 files changed, 123 insertions(+), 44 deletions(-)

diff --git a/drivers/gpu/nova-core/fb.rs b/drivers/gpu/nova-core/fb.rs
index 7c502f15622c..a860f43ec5af 100644
--- a/drivers/gpu/nova-core/fb.rs
+++ b/drivers/gpu/nova-core/fb.rs
@@ -180,7 +180,6 @@ pub(crate) struct FbLayout {
     pub(crate) heap: FbRange,
     pub(crate) vf_partition_count: u8,
     /// Total reserved size (heap + PMU reserved), aligned to 2MB.
-    #[expect(unused)]
     pub(crate) total_reserved_size: u32,
 }
 
diff --git a/drivers/gpu/nova-core/firmware/fsp.rs b/drivers/gpu/nova-core/firmware/fsp.rs
index edcc173c2fa6..e10954aa146a 100644
--- a/drivers/gpu/nova-core/firmware/fsp.rs
+++ b/drivers/gpu/nova-core/firmware/fsp.rs
@@ -14,7 +14,6 @@
     gpu::Chipset, //
 };
 
-#[expect(dead_code)]
 pub(crate) struct FspFirmware {
     /// FMC firmware image data (only the .image section) - submitted to hardware
     pub(crate) fmc_image: DmaObject,
@@ -23,7 +22,6 @@ pub(crate) struct FspFirmware {
 }
 
 impl FspFirmware {
-    #[expect(dead_code)]
     pub(crate) fn new(
         dev: &device::Device<device::Bound>,
         chipset: Chipset,
diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
index 35169f3bc446..5e3c15a71e2d 100644
--- a/drivers/gpu/nova-core/fsp.rs
+++ b/drivers/gpu/nova-core/fsp.rs
@@ -1,8 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
 
-// TODO: remove this once the code is fully functional
-#![expect(dead_code)]
-
 //! FSP (Firmware System Processor) interface for Hopper/Blackwell GPUs.
 //!
 //! Hopper/Blackwell use a simplified firmware boot sequence: FMC --> FSP --> GSP.
@@ -11,6 +8,7 @@
 
 use kernel::{
     device,
+    dma::CoherentAllocation,
     io::poll::read_poll_timeout,
     prelude::*,
     ptr::{
@@ -352,8 +350,6 @@ pub(crate) fn create_fmc_boot_params(
         wpr_meta_size: u32,
         libos_addr: u64,
     ) -> Result<kernel::dma::CoherentAllocation<GspFmcBootParams>> {
-        use kernel::dma::CoherentAllocation;
-
         const GSP_DMA_TARGET_COHERENT_SYSTEM: u32 = 1;
         const GSP_DMA_TARGET_NONCOHERENT_SYSTEM: u32 = 2;
 
diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
index 1e8a7306e078..80d57a54c0c9 100644
--- a/drivers/gpu/nova-core/gsp/boot.rs
+++ b/drivers/gpu/nova-core/gsp/boot.rs
@@ -13,6 +13,7 @@
 use crate::{
     driver::Bar0,
     falcon::{
+        fsp::Fsp as FspEngine,
         gsp::Gsp,
         sec2::Sec2,
         Falcon,
@@ -24,6 +25,7 @@
             BooterFirmware,
             BooterKind, //
         },
+        fsp::FspFirmware,
         fwsec::{
             FwsecCommand,
             FwsecFirmware, //
@@ -31,9 +33,14 @@
         gsp::GspFirmware,
         FIRMWARE_VERSION, //
     },
-    gpu::Chipset,
+    fsp::Fsp,
+    gpu::{
+        Architecture,
+        Chipset, //
+    },
     gsp::{
         commands,
+        fw::LibosMemoryRegionInitArgument,
         sequencer::{
             GspSequencer,
             GspSequencerParams, //
@@ -155,6 +162,55 @@ fn run_booter(
         Ok(())
     }
 
+    fn run_fsp(
+        dev: &device::Device<device::Bound>,
+        bar: &Bar0,
+        chipset: Chipset,
+        gsp_falcon: &Falcon<Gsp>,
+        wpr_meta: &CoherentAllocation<GspFwWprMeta>,
+        libos: &CoherentAllocation<LibosMemoryRegionInitArgument>,
+        fb_layout: &FbLayout,
+    ) -> Result {
+        let fsp_falcon = Falcon::<FspEngine>::new(dev, chipset)?;
+
+        Fsp::wait_secure_boot(dev, bar, chipset.arch())?;
+
+        let fsp_fw = FspFirmware::new(dev, chipset, FIRMWARE_VERSION)?;
+
+        // fmc_full is a KVec<u8> for CPU-side signature extraction only.
+        // A separate buffer, fsp_fw.fmc_image, is what gets submitted to the hardware.
+        let signatures = Fsp::extract_fmc_signatures_static(dev, &fsp_fw.fmc_full)?;
+
+        // Create FMC boot parameters
+        let fmc_boot_params = Fsp::create_fmc_boot_params(
+            dev,
+            wpr_meta.dma_handle(),
+            core::mem::size_of::<GspFwWprMeta>() as u32,
+            libos.dma_handle(),
+        )?;
+
+        // Execute FSP Chain of Trust
+        // NOTE: FSP Chain of Trust handles GSP boot internally - we do NOT reset or boot GSP
+        Fsp::boot_gsp_fmc_with_signatures(
+            dev,
+            bar,
+            chipset,
+            &fsp_fw.fmc_image,
+            &fmc_boot_params,
+            u64::from(fb_layout.total_reserved_size),
+            false, // not resuming
+            &fsp_falcon,
+            &signatures,
+        )?;
+
+        // Wait for GSP lockdown to be released
+        let fmc_boot_params_addr = fmc_boot_params.dma_handle();
+        let _mbox0 =
+            Self::wait_for_gsp_lockdown_release(dev, bar, gsp_falcon, fmc_boot_params_addr)?;
+
+        Ok(())
+    }
+
     /// Check if GSP lockdown has been released after FSP Chain of Trust
     fn gsp_lockdown_released(
         dev: &device::Device,
@@ -196,7 +252,6 @@ fn gsp_lockdown_released(
     }
 
     /// Wait for GSP lockdown to be released after FSP Chain of Trust
-    #[expect(dead_code)]
     fn wait_for_gsp_lockdown_release(
         dev: &device::Device,
         bar: &Bar0,
@@ -257,43 +312,63 @@ pub(crate) fn boot(
     ) -> Result {
         let dev = pdev.as_ref();
 
-        let bios = Vbios::new(dev, bar)?;
-
         let gsp_fw = KBox::pin_init(GspFirmware::new(dev, chipset, FIRMWARE_VERSION), GFP_KERNEL)?;
 
         let fb_layout = FbLayout::new(chipset, bar, &gsp_fw)?;
         dev_dbg!(dev, "{:#x?}\n", fb_layout);
 
-        Self::run_fwsec_frts(dev, gsp_falcon, bar, &bios, &fb_layout)?;
+        if matches!(
+            chipset.arch(),
+            Architecture::Turing | Architecture::Ampere | Architecture::Ada
+        ) {
+            let bios = Vbios::new(dev, bar)?;
+            Self::run_fwsec_frts(dev, gsp_falcon, bar, &bios, &fb_layout)?;
+        }
 
         let wpr_meta =
             CoherentAllocation::<GspFwWprMeta>::alloc_coherent(dev, 1, GFP_KERNEL | __GFP_ZERO)?;
         dma_write!(wpr_meta[0] = GspFwWprMeta::new(&gsp_fw, &fb_layout))?;
 
-        self.cmdq
-            .send_command(bar, commands::SetSystemInfo::new(pdev))?;
-        self.cmdq.send_command(bar, commands::SetRegistry::new())?;
+        // For SEC2-based architectures, reset GSP and boot it before SEC2
+        if matches!(
+            chipset.arch(),
+            Architecture::Turing | Architecture::Ampere | Architecture::Ada
+        ) {
+            gsp_falcon.reset(bar)?;
+            let libos_handle = self.libos.dma_handle();
+            let (mbox0, mbox1) = gsp_falcon.boot(
+                bar,
+                Some(libos_handle as u32),
+                Some((libos_handle >> 32) as u32),
+            )?;
+            dev_dbg!(
+                pdev,
+                "GSP MBOX0: {:#x}, MBOX1: {:#x}\n",
+                mbox0,
+                mbox1
+            );
 
-        gsp_falcon.reset(bar)?;
-        let libos_handle = self.libos.dma_handle();
-        let (mbox0, mbox1) = gsp_falcon.boot(
-            bar,
-            Some(libos_handle as u32),
-            Some((libos_handle >> 32) as u32),
-        )?;
-        dev_dbg!(
-            pdev,
-            "GSP MBOX0: {:#x}, MBOX1: {:#x}\n",
-            mbox0,
-            mbox1
-        );
+            dev_dbg!(
+                pdev,
+                "Using SEC2 to load and run the booter_load firmware...\n"
+            );
+        }
 
-        dev_dbg!(
-            pdev,
-            "Using SEC2 to load and run the booter_load firmware...\n"
-        );
+        match chipset.arch() {
+            Architecture::Turing | Architecture::Ampere | Architecture::Ada => {
+                Self::run_booter(dev, bar, chipset, sec2_falcon, &wpr_meta)?
+            }
 
-        Self::run_booter(dev, bar, chipset, sec2_falcon, &wpr_meta)?;
+            Architecture::Hopper | Architecture::Blackwell => Self::run_fsp(
+                dev,
+                bar,
+                chipset,
+                gsp_falcon,
+                &wpr_meta,
+                &self.libos,
+                &fb_layout,
+            )?,
+        }
 
         gsp_falcon.write_os_version(bar, gsp_fw.bootloader.app_version);
 
@@ -311,16 +386,27 @@ pub(crate) fn boot(
             gsp_falcon.is_riscv_active(bar),
         );
 
-        // Create and run the GSP sequencer.
-        let seq_params = GspSequencerParams {
-            bootloader_app_version: gsp_fw.bootloader.app_version,
-            libos_dma_handle: libos_handle,
-            gsp_falcon,
-            sec2_falcon,
-            dev: pdev.as_ref().into(),
-            bar,
-        };
-        GspSequencer::run(&mut self.cmdq, seq_params)?;
+        // Now that GSP is active, send system info and registry
+        self.cmdq
+            .send_command(bar, commands::SetSystemInfo::new(pdev))?;
+        self.cmdq.send_command(bar, commands::SetRegistry::new())?;
+
+        if matches!(
+            chipset.arch(),
+            Architecture::Turing | Architecture::Ampere | Architecture::Ada
+        ) {
+            let libos_handle = self.libos.dma_handle();
+            // Create and run the GSP sequencer.
+            let seq_params = GspSequencerParams {
+                bootloader_app_version: gsp_fw.bootloader.app_version,
+                libos_dma_handle: libos_handle,
+                gsp_falcon,
+                sec2_falcon,
+                dev: pdev.as_ref().into(),
+                bar,
+            };
+            GspSequencer::run(&mut self.cmdq, seq_params)?;
+        }
 
         // Wait until GSP is fully initialized.
         commands::wait_gsp_init_done(&mut self.cmdq)?;
-- 
2.53.0



* [PATCH v4 31/33] gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (29 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 30/33] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot path John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-10  2:45 ` [PATCH v4 32/33] gpu: nova-core: clarify the GPU firmware boot steps John Hubbard
                   ` (2 subsequent siblings)
  33 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Hopper and Blackwell GPUs use a different PCI config mirror base address
(0x092000) compared to earlier architectures (0x088000). Pass the chipset
through to GspSetSystemInfo::init() so it can select the correct address.
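
In isolation, the selection looks roughly like the sketch below. It is
self-contained for illustration: the local Architecture enum stands in
for the driver's own type, and pci_config_mirror_base() is a
hypothetical helper, whereas the patch performs the match inline in the
initializer.

```rust
#[derive(Clone, Copy, Debug)]
enum Architecture {
    Turing,
    Ampere,
    Ada,
    Hopper,
    Blackwell,
}

/// Returns the PCI config mirror base address for `arch`, using the
/// two addresses named in this patch.
fn pci_config_mirror_base(arch: Architecture) -> u32 {
    match arch {
        Architecture::Turing | Architecture::Ampere | Architecture::Ada => 0x088000,
        Architecture::Hopper | Architecture::Blackwell => 0x092000,
    }
}
```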

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/gsp/boot.rs        |  2 +-
 drivers/gpu/nova-core/gsp/commands.rs    |  8 +++++---
 drivers/gpu/nova-core/gsp/fw/commands.rs | 18 +++++++++++++++---
 3 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
index 80d57a54c0c9..aae5ffac262a 100644
--- a/drivers/gpu/nova-core/gsp/boot.rs
+++ b/drivers/gpu/nova-core/gsp/boot.rs
@@ -388,7 +388,7 @@ pub(crate) fn boot(
 
         // Now that GSP is active, send system info and registry
         self.cmdq
-            .send_command(bar, commands::SetSystemInfo::new(pdev))?;
+            .send_command(bar, commands::SetSystemInfo::new(pdev, chipset))?;
         self.cmdq.send_command(bar, commands::SetRegistry::new())?;
 
         if matches!(
diff --git a/drivers/gpu/nova-core/gsp/commands.rs b/drivers/gpu/nova-core/gsp/commands.rs
index 8f270eca33be..e6a9a1fc6296 100644
--- a/drivers/gpu/nova-core/gsp/commands.rs
+++ b/drivers/gpu/nova-core/gsp/commands.rs
@@ -20,6 +20,7 @@
 
 use crate::{
     driver::Bar0,
+    gpu::Chipset,
     gsp::{
         cmdq::{
             Cmdq,
@@ -37,12 +38,13 @@
 /// The `GspSetSystemInfo` command.
 pub(crate) struct SetSystemInfo<'a> {
     pdev: &'a pci::Device<device::Bound>,
+    chipset: Chipset,
 }
 
 impl<'a> SetSystemInfo<'a> {
     /// Creates a new `GspSetSystemInfo` command using the parameters of `pdev`.
-    pub(crate) fn new(pdev: &'a pci::Device<device::Bound>) -> Self {
-        Self { pdev }
+    pub(crate) fn new(pdev: &'a pci::Device<device::Bound>, chipset: Chipset) -> Self {
+        Self { pdev, chipset }
     }
 }
 
@@ -52,7 +54,7 @@ impl<'a> CommandToGsp for SetSystemInfo<'a> {
     type InitError = Error;
 
     fn init(&self) -> impl Init<Self::Command, Self::InitError> {
-        GspSetSystemInfo::init(self.pdev)
+        GspSetSystemInfo::init(self.pdev, self.chipset)
     }
 }
 
diff --git a/drivers/gpu/nova-core/gsp/fw/commands.rs b/drivers/gpu/nova-core/gsp/fw/commands.rs
index 470d8edb62ff..fe8f56ba3e80 100644
--- a/drivers/gpu/nova-core/gsp/fw/commands.rs
+++ b/drivers/gpu/nova-core/gsp/fw/commands.rs
@@ -10,7 +10,13 @@
     }, //
 };
 
-use crate::gsp::GSP_PAGE_SIZE;
+use crate::{
+    gpu::{
+        Architecture,
+        Chipset, //
+    },
+    gsp::GSP_PAGE_SIZE, //
+};
 
 use super::bindings;
 
@@ -24,7 +30,10 @@ pub(crate) struct GspSetSystemInfo {
 impl GspSetSystemInfo {
     /// Returns an in-place initializer for the `GspSetSystemInfo` command.
     #[allow(non_snake_case)]
-    pub(crate) fn init<'a>(dev: &'a pci::Device<device::Bound>) -> impl Init<Self, Error> + 'a {
+    pub(crate) fn init<'a>(
+        dev: &'a pci::Device<device::Bound>,
+        chipset: Chipset,
+    ) -> impl Init<Self, Error> + 'a {
         type InnerGspSystemInfo = bindings::GspSystemInfo;
         let init_inner = try_init!(InnerGspSystemInfo {
             gpuPhysAddr: dev.resource_start(0)?,
@@ -35,7 +44,10 @@ pub(crate) fn init<'a>(dev: &'a pci::Device<device::Bound>) -> impl Init<Self, E
             // Using TASK_SIZE in r535_gsp_rpc_set_system_info() seems wrong because
             // TASK_SIZE is per-task. That's probably a design issue in GSP-RM though.
             maxUserVa: (1 << 47) - 4096,
-            pciConfigMirrorBase: 0x088000,
+            pciConfigMirrorBase: match chipset.arch() {
+                Architecture::Turing | Architecture::Ampere | Architecture::Ada => 0x088000,
+                Architecture::Hopper | Architecture::Blackwell => 0x092000,
+            },
             pciConfigMirrorSize: 0x001000,
 
             PCIDeviceID: (u32::from(dev.device_id()) << 16) | u32::from(dev.vendor_id().as_raw()),
-- 
2.53.0



* [PATCH v4 32/33] gpu: nova-core: clarify the GPU firmware boot steps
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (30 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 31/33] gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror John Hubbard
@ 2026-02-10  2:45 ` John Hubbard
  2026-02-10  2:46 ` [PATCH v4 33/33] gpu: nova-core: fix aux device registration for multi-GPU systems John Hubbard
  2026-02-10 22:27 ` [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
  33 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:45 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

Now that Hopper/Blackwell GSP is up and running, it's clear how to
factor out the common code and the per-architecture code, for booting
up firmware. The key is that, for Turing, Ampere, and Ada, the SEC2
firmware is used and a CPU "sequencer" must be run. For Hopper,
Blackwell and later GPUs, there is no SEC2, no sequencer, but there is
an FSP to get running instead.

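This change makes that split clearly visible in boot(): the two paths
now live in boot_via_sec2() and boot_via_fsp(), selected by a single
uses_sec2 check.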

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
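
For illustration only (not part of the patch): a standalone sketch of the
control flow that boot() ends up with after this change. The step strings
and the boot_steps() helper are hypothetical names invented for this
sketch; only the Architecture split and the uses_sec2 condition come from
the diff below.

```rust
/// GPU architectures named in this series.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Architecture {
    Turing,
    Ampere,
    Ada,
    Hopper,
    Blackwell,
}

/// Mirrors the `uses_sec2` condition computed at the top of `boot()`.
fn uses_sec2(arch: Architecture) -> bool {
    matches!(
        arch,
        Architecture::Turing | Architecture::Ampere | Architecture::Ada
    )
}

/// Hypothetical helper: lists the boot steps in the order `boot()` runs them.
fn boot_steps(arch: Architecture) -> Vec<&'static str> {
    let mut steps = Vec::new();

    if uses_sec2(arch) {
        // Corresponds to boot_via_sec2().
        steps.extend(["fwsec-frts", "gsp reset and boot", "sec2 booter"]);
    } else {
        // Corresponds to boot_via_fsp().
        steps.push("fsp chain of trust");
    }

    // Common post-boot initialization.
    steps.extend(["wait for riscv active", "set system info", "set registry"]);

    // Only the SEC2-based architectures run the GSP sequencer.
    if uses_sec2(arch) {
        steps.push("gsp sequencer");
    }

    steps
}
```

Calling boot_steps(Architecture::Blackwell) yields the FSP step followed
by the three common steps, with no sequencer entry.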
 drivers/gpu/nova-core/gsp/boot.rs | 118 +++++++++++++++++-------------
 1 file changed, 66 insertions(+), 52 deletions(-)

diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
index aae5ffac262a..02eec2961b5f 100644
--- a/drivers/gpu/nova-core/gsp/boot.rs
+++ b/drivers/gpu/nova-core/gsp/boot.rs
@@ -162,7 +162,48 @@ fn run_booter(
         Ok(())
     }
 
-    fn run_fsp(
+    /// Boot GSP via SEC2 booter firmware (Turing/Ampere/Ada path).
+    ///
+    /// This path uses FWSEC-FRTS to set up WPR2, then boots GSP directly,
+    /// then uses SEC2 to run the booter firmware.
+    #[allow(clippy::too_many_arguments)]
+    fn boot_via_sec2(
+        dev: &device::Device<device::Bound>,
+        bar: &Bar0,
+        chipset: Chipset,
+        gsp_falcon: &Falcon<Gsp>,
+        sec2_falcon: &Falcon<Sec2>,
+        fb_layout: &FbLayout,
+        libos: &CoherentAllocation<LibosMemoryRegionInitArgument>,
+        wpr_meta: &CoherentAllocation<GspFwWprMeta>,
+    ) -> Result {
+        // Run FWSEC-FRTS to set up the WPR2 region
+        let bios = Vbios::new(dev, bar)?;
+        Self::run_fwsec_frts(dev, gsp_falcon, bar, &bios, fb_layout)?;
+
+        // Reset and boot GSP before SEC2
+        gsp_falcon.reset(bar)?;
+        let libos_handle = libos.dma_handle();
+        let (mbox0, mbox1) = gsp_falcon.boot(
+            bar,
+            Some(libos_handle as u32),
+            Some((libos_handle >> 32) as u32),
+        )?;
+        dev_dbg!(dev, "GSP MBOX0: {:#x}, MBOX1: {:#x}\n", mbox0, mbox1);
+        dev_dbg!(
+            dev,
+            "Using SEC2 to load and run the booter_load firmware...\n"
+        );
+
+        // Run booter via SEC2
+        Self::run_booter(dev, bar, chipset, sec2_falcon, wpr_meta)
+    }
+
+    /// Boot GSP via FSP Chain of Trust (Hopper/Blackwell+ path).
+    ///
+    /// This path uses FSP to establish a chain of trust and boot GSP-FMC. FSP handles
+    /// the GSP boot internally - no manual GSP reset/boot is needed.
+    fn boot_via_fsp(
         dev: &device::Device<device::Bound>,
         bar: &Bar0,
         chipset: Chipset,
@@ -311,55 +352,34 @@ pub(crate) fn boot(
         sec2_falcon: &Falcon<Sec2>,
     ) -> Result {
         let dev = pdev.as_ref();
+        let uses_sec2 = matches!(
+            chipset.arch(),
+            Architecture::Turing | Architecture::Ampere | Architecture::Ada
+        );
 
         let gsp_fw = KBox::pin_init(GspFirmware::new(dev, chipset, FIRMWARE_VERSION), GFP_KERNEL)?;
 
         let fb_layout = FbLayout::new(chipset, bar, &gsp_fw)?;
         dev_dbg!(dev, "{:#x?}\n", fb_layout);
 
-        if matches!(
-            chipset.arch(),
-            Architecture::Turing | Architecture::Ampere | Architecture::Ada
-        ) {
-            let bios = Vbios::new(dev, bar)?;
-            Self::run_fwsec_frts(dev, gsp_falcon, bar, &bios, &fb_layout)?;
-        }
-
         let wpr_meta =
             CoherentAllocation::<GspFwWprMeta>::alloc_coherent(dev, 1, GFP_KERNEL | __GFP_ZERO)?;
         dma_write!(wpr_meta[0] = GspFwWprMeta::new(&gsp_fw, &fb_layout))?;
 
-        // For SEC2-based architectures, reset GSP and boot it before SEC2
-        if matches!(
-            chipset.arch(),
-            Architecture::Turing | Architecture::Ampere | Architecture::Ada
-        ) {
-            gsp_falcon.reset(bar)?;
-            let libos_handle = self.libos.dma_handle();
-            let (mbox0, mbox1) = gsp_falcon.boot(
+        // Architecture-specific boot path
+        if uses_sec2 {
+            Self::boot_via_sec2(
+                dev,
                 bar,
-                Some(libos_handle as u32),
-                Some((libos_handle >> 32) as u32),
+                chipset,
+                gsp_falcon,
+                sec2_falcon,
+                &fb_layout,
+                &self.libos,
+                &wpr_meta,
             )?;
-            dev_dbg!(
-                pdev,
-                "GSP MBOX0: {:#x}, MBOX1: {:#x}\n",
-                mbox0,
-                mbox1
-            );
-
-            dev_dbg!(
-                pdev,
-                "Using SEC2 to load and run the booter_load firmware...\n"
-            );
-        }
-
-        match chipset.arch() {
-            Architecture::Turing | Architecture::Ampere | Architecture::Ada => {
-                Self::run_booter(dev, bar, chipset, sec2_falcon, &wpr_meta)?
-            }
-
-            Architecture::Hopper | Architecture::Blackwell => Self::run_fsp(
+        } else {
+            Self::boot_via_fsp(
                 dev,
                 bar,
                 chipset,
@@ -367,9 +387,10 @@ pub(crate) fn boot(
                 &wpr_meta,
                 &self.libos,
                 &fb_layout,
-            )?,
+            )?;
         }
 
+        // Common post-boot initialization
         gsp_falcon.write_os_version(bar, gsp_fw.bootloader.app_version);
 
         // Poll for RISC-V to become active before running sequencer
@@ -380,29 +401,22 @@ pub(crate) fn boot(
             Delta::from_secs(5),
         )?;
 
-        dev_dbg!(
-            pdev,
-            "RISC-V active? {}\n",
-            gsp_falcon.is_riscv_active(bar),
-        );
+        dev_dbg!(dev, "RISC-V active? {}\n", gsp_falcon.is_riscv_active(bar));
 
         // Now that GSP is active, send system info and registry
         self.cmdq
             .send_command(bar, commands::SetSystemInfo::new(pdev, chipset))?;
         self.cmdq.send_command(bar, commands::SetRegistry::new())?;
 
-        if matches!(
-            chipset.arch(),
-            Architecture::Turing | Architecture::Ampere | Architecture::Ada
-        ) {
+        // SEC2-based architectures need to run the GSP sequencer
+        if uses_sec2 {
             let libos_handle = self.libos.dma_handle();
-            // Create and run the GSP sequencer.
             let seq_params = GspSequencerParams {
                 bootloader_app_version: gsp_fw.bootloader.app_version,
                 libos_dma_handle: libos_handle,
                 gsp_falcon,
                 sec2_falcon,
-                dev: pdev.as_ref().into(),
+                dev: dev.into(),
                 bar,
             };
             GspSequencer::run(&mut self.cmdq, seq_params)?;
@@ -414,8 +428,8 @@ pub(crate) fn boot(
         // Obtain and display basic GPU information.
         let info = commands::get_gsp_info(&mut self.cmdq, bar)?;
         match info.gpu_name() {
-            Ok(name) => dev_info!(pdev, "GPU name: {}\n", name),
-            Err(e) => dev_warn!(pdev, "GPU name unavailable: {:?}\n", e),
+            Ok(name) => dev_info!(dev, "GPU name: {}\n", name),
+            Err(e) => dev_warn!(dev, "GPU name unavailable: {:?}\n", e),
         }
 
         Ok(())
-- 
2.53.0



* [PATCH v4 33/33] gpu: nova-core: fix aux device registration for multi-GPU systems
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (31 preceding siblings ...)
  2026-02-10  2:45 ` [PATCH v4 32/33] gpu: nova-core: clarify the GPU firmware boot steps John Hubbard
@ 2026-02-10  2:46 ` John Hubbard
  2026-02-10 22:27 ` [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
  33 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-10  2:46 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, John Hubbard

The auxiliary device registration was using a hardcoded ID of 0, which
caused probe() to fail on multi-GPU systems with:

   sysfs: cannot create duplicate filename '/bus/auxiliary/devices/NovaCore.nova-drm.0'

Fix this by using an atomic counter to generate unique IDs for each
GPU's aux device registration. The TODO item to eventually use XArray
for recycling aux device IDs is retained, but for now, this works very
nicely.

This has the side effect of making debugfs[1] work on multi-GPU systems.

[1] https://lore.kernel.org/20260203224757.871729-1-ttabi@nvidia.com

Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
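
For illustration only (not part of the patch): a userspace sketch of the
same ID scheme, using std::sync::atomic in place of core::sync::atomic.
The next_aux_id() and aux_device_name() helpers are hypothetical; the name
format is only inferred from the sysfs error message quoted above.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Counter for generating unique auxiliary device IDs.
static AUXILIARY_ID_COUNTER: AtomicU32 = AtomicU32::new(0);

/// Hypothetical helper: returns the next ID. IDs are never recycled, and
/// `fetch_add` wraps around on overflow.
fn next_aux_id() -> u32 {
    // Relaxed is enough here: only the uniqueness of the returned value
    // matters, not its ordering relative to other memory operations.
    AUXILIARY_ID_COUNTER.fetch_add(1, Ordering::Relaxed)
}

/// Hypothetical helper: builds a name shaped like the one in the sysfs
/// error message, "<module>.<device>.<id>".
fn aux_device_name(module: &str, device: &str, id: u32) -> String {
    format!("{module}.{device}.{id}")
}
```

With a hardcoded ID of 0, every GPU would ask for the same name; with the
counter, each probe() gets a distinct trailing number.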
 drivers/gpu/nova-core/driver.rs | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs
index 4ff07b643db6..52729799725b 100644
--- a/drivers/gpu/nova-core/driver.rs
+++ b/drivers/gpu/nova-core/driver.rs
@@ -1,5 +1,10 @@
 // SPDX-License-Identifier: GPL-2.0
 
+use core::sync::atomic::{
+    AtomicU32,
+    Ordering, //
+};
+
 use kernel::{
     auxiliary,
     device::Core,
@@ -21,6 +26,9 @@
     Spec, //
 };
 
+/// Counter for generating unique auxiliary device IDs.
+static AUXILIARY_ID_COUNTER: AtomicU32 = AtomicU32::new(0);
+
 #[pin_data]
 pub(crate) struct NovaCore {
     #[pin]
@@ -84,12 +92,17 @@ fn probe(pdev: &pci::Device<Core>, _info: &Self::IdInfo) -> impl PinInit<Self, E
             // other threads of execution.
             unsafe { pdev.dma_set_mask_and_coherent(spec.chipset().arch().dma_mask())? };
 
+            // TODO[XARR]: Use XArray for proper ID allocation/recycling. Until then, use a simple
+            // atomic counter which never recycles IDs. A unique ID is required for multi-GPU
+            // systems, because without it, probe() would fail for all but the first GPU.
+            let aux_id = AUXILIARY_ID_COUNTER.fetch_add(1, Ordering::Relaxed);
+
             Ok(try_pin_init!(Self {
                 gpu <- Gpu::new(pdev, devres_bar.clone(), devres_bar.access(pdev.as_ref())?, spec),
                 _reg <- auxiliary::Registration::new(
                     pdev.as_ref(),
                     c"nova-drm",
-                    0, // TODO[XARR]: Once it lands, use XArray; for now we don't use the ID.
+                    aux_id,
                     crate::MODULE_NAME
                 ),
             }))
-- 
2.53.0



* Re: [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support
  2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (32 preceding siblings ...)
  2026-02-10  2:46 ` [PATCH v4 33/33] gpu: nova-core: fix aux device registration for multi-GPU systems John Hubbard
@ 2026-02-10 22:27 ` John Hubbard
  33 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-10 22:27 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Zhi Wang, David Airlie, Simona Vetter, Bjorn Helgaas,
	Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On 2/9/26 6:45 PM, John Hubbard wrote:
> Hi,
> 
> This is based on the Feb 5, 2026 linux-next: commit 9845cf73f7db ("Add
> linux-next specific files for 20260205"). That's new enough to have the
> pdev.as_ref() changes (see below for details), but not so new as to
> include the current merge window churn for Linux 7.0.

Forgot to mention that there is a git branch here, for convenience:

 https://github.com/johnhubbard/linux/tree/nova-core-blackwell-linux-next-v4

thanks,
-- 
John Hubbard

> 
> I've re-tested on Ampere (GA104) and Blackwell (GB202) RTX GPUs.
> 
> Data center GPUs remain as TODO items: GA100 needs some additional code,
> Hopper/GH100 might work but is not yet tested, and I haven't even
> thought about Blackwell data center GPUs.
> 
> So, even though many patches say Hopper/Blackwell, there may be some
> test-and-fix work remaining there.
> 
> Changes in v4:
> 
> * Fixed the IOMMU page faults on address 0x0 that I was seeing on v3 and
>   earlier, for the iommu enabled case. These were due to the sysmem
>   flush buffer being in a different location for Blackwell, so I've
>   HAL-ified that aspect.
> 
> * Added a patch (0001) to pass pdev directly to dev_* logging macros.
>   Then converted the remaining patches to also use pdev directly,
>   instead of pdev.as_ref(). This is only possible in branches that have
>   commit a38cd1fea989 ("rust: device: support `dev_printk` on all
>   devices"), which in turn is why this v4 is based on a linux-next
>   commit.
> 
> * Changed FmcSignatures fields from [u32; N] to [u8; N] arrays because
>   the data is not treated as 32-bit integers. This eliminates the need
>   for .as_bytes_mut() in the FMC signature extraction patch and allows
>   using named constants like [u8; FSP_HASH_SIZE]. (From Timur Tabi's
>   review.)
> 
> * Changed .unwrap_or(u64::MAX) to .expect("...") for alignment overflow
>   in client_alloc_size() and management_overhead(). A panic is warranted
>   here since the values are compile-time constants and overflow is
>   impossible. (From Timur Tabi's review.)
> 
> * Added a patch at the end that I actually expect will get merged
>   earlier, separately. But for now, it avoids nova-drm aux bus
>   registration failure on multi-GPU systems, which in turn keeps the
>   driver alive, which in turn avoids a driver teardown missing feature
>   (pre-existing), which in turn avoids IOMMU page faults at non-zero
>   addresses. whew. :)
> 
> Changes in v3:
> 
> * Rebased onto linux-next (20260205), which includes several
>   rust-for-linux updates that affected nova-core.
> 
> * Removed redundant .as_ref() from dev_*!() macro call sites, since the
>   dev_printk!() macro now calls .as_ref() internally (Gary Guo's
>   "remove redundant .as_ref() for dev_* print" series).
> 
> * Added a `use kernel::io::Io` import in regs.rs, needed after the
>   upstream separation of generic I/O helpers from the MMIO
>   implementation.
> 
> Changes in v2:
> 
> v2 is here:
>     https://lore.kernel.org/20260131005604.454172-1-jhubbard@nvidia.com
> 
> * GA100 (an Ampere chip whose firmware boot steps are closer to Turing
>   than to other Amperes) returns ENOTSUPP for now because it is *known*
>   to not work yet.
> 
> * FSP: use the new Chipset::fsp_cot_version() method instead of a
>   hardcoded constant. This fixes a known wrongness on GH100.
> 
> * Changed to a HAL approach to handle the slightly different non-WPR
>   heap sizes, for Hopper vs. Blackwell.
> 
> * Return Option instead of Result from get_gsp_sigs_section() since
>   the failure case is simply "not found".
> 
> * Return DmaMask directly from dma_mask() instead of returning a bit
>   count.
> 
> * Change fmc_full from DmaObject to KVec<u8> since it's only used for
>   CPU-side signature extraction and is never submitted to hardware
>   (only fmc_image is). This eliminates the need for unsafe code and
>   the associated SAFETY comment entirely.
> 
> * Use as_bytes_mut() instead of unsafe core::slice::from_raw_parts_mut()
>   for copying FMC signature data (hash, public_key, signature arrays).
> 
> * Refactor wait_for_gsp_lockdown_release() to use early return with ?
>   instead of chained .inspect_err().map().and_then() pattern.
> 
> * Removed many dev_dbg! statements.
> 
> * Use IEC binary prefix "MiB" instead of "MB" for memory size output.
>   Also improved display of small sizes (e.g., "24 KiB" instead of
>   "0 MB") and fixed a typo ("suprising" -> "surprising").
> 
> * Reordered the "skip GFW boot waiting" commit to appear earlier in the
>   series.
> 
> * Series has been reduced from 31 to 30 patches, because the "needs
>   large reserved mem" patch was absorbed into the non-WPR heap size
>   patch.
> 
> John Hubbard (33):
>   gpu: nova-core: pass pdev directly to dev_* logging macros
>   gpu: nova-core: print FB sizes, along with ranges
>   gpu: nova-core: add FbRange.len() and use it in boot.rs
>   gpu: nova-core: Hopper/Blackwell: basic GPU identification
>   gpu: nova-core: factor .fwsignature* selection into a new
>     get_gsp_sigs_section()
>   gpu: nova-core: use GPU Architecture to simplify HAL selections
>   gpu: nova-core: apply the one "use" item per line policy to
>     commands.rs
>   gpu: nova-core: set DMA mask width based on GPU architecture
>   gpu: nova-core: Hopper/Blackwell: skip GFW boot waiting
>   gpu: nova-core: move firmware image parsing code to firmware.rs
>   gpu: nova-core: factor out a section_name_eq() function
>   gpu: nova-core: don't assume 64-bit firmware images
>   gpu: nova-core: add support for 32-bit firmware images
>   gpu: nova-core: add auto-detection of 32-bit, 64-bit firmware images
>   gpu: nova-core: Hopper/Blackwell: add FMC firmware image, in support
>     of FSP
>   gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub
>   gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations
>   gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure
>   gpu: nova-core: Hopper/Blackwell: calculate reserved FB heap size
>   gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion
>     waiting
>   gpu: nova-core: Hopper/Blackwell: add FSP message structures
>   gpu: nova-core: Hopper/Blackwell: add FMC signature extraction
>   gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging
>   gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot
>   gpu: nova-core: Hopper/Blackwell: larger non-WPR heap
>   gpu: nova-core: Blackwell: use correct sysmem flush registers
>   gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap
>   gpu: nova-core: refactor SEC2 booter loading into run_booter() helper
>   gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling
>   gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot path
>   gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror
>   gpu: nova-core: clarify the GPU firmware boot steps
>   gpu: nova-core: fix aux device registration for multi-GPU systems
> 
>  drivers/gpu/nova-core/driver.rs          |  48 +-
>  drivers/gpu/nova-core/falcon.rs          |   1 +
>  drivers/gpu/nova-core/falcon/fsp.rs      | 160 +++++++
>  drivers/gpu/nova-core/falcon/hal.rs      |  20 +-
>  drivers/gpu/nova-core/fb.rs              | 118 ++++-
>  drivers/gpu/nova-core/fb/hal.rs          |  34 +-
>  drivers/gpu/nova-core/fb/hal/ga102.rs    |   2 +-
>  drivers/gpu/nova-core/fb/hal/gb100.rs    |  73 +++
>  drivers/gpu/nova-core/fb/hal/gb202.rs    |  62 +++
>  drivers/gpu/nova-core/fb/hal/gh100.rs    |  37 ++
>  drivers/gpu/nova-core/firmware.rs        | 186 ++++++++
>  drivers/gpu/nova-core/firmware/fsp.rs    |  47 ++
>  drivers/gpu/nova-core/firmware/gsp.rs    | 140 ++----
>  drivers/gpu/nova-core/fsp.rs             | 561 +++++++++++++++++++++++
>  drivers/gpu/nova-core/gpu.rs             |  87 +++-
>  drivers/gpu/nova-core/gsp/boot.rs        | 337 +++++++++++---
>  drivers/gpu/nova-core/gsp/commands.rs    |   8 +-
>  drivers/gpu/nova-core/gsp/fw.rs          |  63 ++-
>  drivers/gpu/nova-core/gsp/fw/commands.rs |  32 +-
>  drivers/gpu/nova-core/nova_core.rs       |   1 +
>  drivers/gpu/nova-core/num.rs             |  10 +
>  drivers/gpu/nova-core/regs.rs            |  95 ++++
>  22 files changed, 1856 insertions(+), 266 deletions(-)
>  create mode 100644 drivers/gpu/nova-core/falcon/fsp.rs
>  create mode 100644 drivers/gpu/nova-core/fb/hal/gb100.rs
>  create mode 100644 drivers/gpu/nova-core/fb/hal/gb202.rs
>  create mode 100644 drivers/gpu/nova-core/fb/hal/gh100.rs
>  create mode 100644 drivers/gpu/nova-core/firmware/fsp.rs
>  create mode 100644 drivers/gpu/nova-core/fsp.rs
> 
> 
> base-commit: 9845cf73f7db6094c0d8419d6adb848028f4a921




* Re: [PATCH v4 01/33] gpu: nova-core: pass pdev directly to dev_* logging macros
  2026-02-10  2:45 ` [PATCH v4 01/33] gpu: nova-core: pass pdev directly to dev_* logging macros John Hubbard
@ 2026-02-11 10:06   ` Danilo Krummrich
  2026-02-11 18:48     ` John Hubbard
  0 siblings, 1 reply; 66+ messages in thread
From: Danilo Krummrich @ 2026-02-11 10:06 UTC (permalink / raw)
  To: John Hubbard
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On Tue Feb 10, 2026 at 3:45 AM CET, John Hubbard wrote:
> The dev_dbg!, dev_info!, dev_err!, and dev_warn! macros now accept
> pci::Device directly without requiring an explicit .as_ref()
> conversion to device::Device, thanks to commit a38cd1fea989
> ("rust: device: support `dev_printk` on all devices").
>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>

There is already [1], which I queued up for when v7.0-rc1 is out.

[1] https://lore.kernel.org/all/20260123175854.176735-7-gary@kernel.org/


* Re: [PATCH v4 09/33] gpu: nova-core: Hopper/Blackwell: skip GFW boot waiting
  2026-02-10  2:45 ` [PATCH v4 09/33] gpu: nova-core: Hopper/Blackwell: skip GFW boot waiting John Hubbard
@ 2026-02-11 10:09   ` Danilo Krummrich
  2026-02-12  1:49     ` John Hubbard
  0 siblings, 1 reply; 66+ messages in thread
From: Danilo Krummrich @ 2026-02-11 10:09 UTC (permalink / raw)
  To: John Hubbard
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On Tue Feb 10, 2026 at 3:45 AM CET, John Hubbard wrote:
> Hopper and Blackwell GPUs use FSP-based secure boot and do not require
> waiting for GFW_BOOT completion. Skip this step for these architectures.
>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  drivers/gpu/nova-core/gpu.rs | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
> index 24feb0e8723e..f04e2a795e90 100644
> --- a/drivers/gpu/nova-core/gpu.rs
> +++ b/drivers/gpu/nova-core/gpu.rs
> @@ -304,10 +304,19 @@ pub(crate) fn new<'a>(
>          let chipset = spec.chipset();
>  
>          try_pin_init!(Self {
> -            // We must wait for GFW_BOOT completion before doing any significant setup on the GPU.
> +            // Turing, Ampere, Ada: we must wait for GFW_BOOT completion before doing any
> +            // significant setup on the GPU.
> +            //
> +            // Hopper/Blackwell: skip GFW_BOOT completion waiting entirely, and use the simpler FSP
> +            // Chain of Trust boot path (elsewhere) instead.
>              _: {
> -                gfw::wait_gfw_boot_completion(bar)
> -                    .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
> +                if matches!(
> +                    chipset.arch(),
> +                    Architecture::Turing | Architecture::Ampere | Architecture::Ada

I assume Blackwell is not an exception and we expect this to be the case for
future architectures as well? I.e. checking for "!Architecture::Blackwell" makes
no sense?
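
For illustration only (a sketch, not code from this series): one way to
make the intent explicit is an exhaustive match behind a named predicate,
so that adding a new Architecture variant fails to compile until someone
decides which side it belongs on. The needs_gfw_boot_wait() name is
hypothetical.

```rust
/// GPU architectures named in this series.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Architecture {
    Turing,
    Ampere,
    Ada,
    Hopper,
    Blackwell,
}

impl Architecture {
    /// Hypothetical helper: whether GFW_BOOT completion must be awaited
    /// before doing any significant setup on the GPU.
    fn needs_gfw_boot_wait(self) -> bool {
        // No wildcard arm on purpose: a new variant forces a decision here.
        match self {
            Self::Turing | Self::Ampere | Self::Ada => true,
            Self::Hopper | Self::Blackwell => false,
        }
    }
}
```

Compared with matches!() on the older architectures, this trades a silent
default for future chips against a compile error at the point of decision.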

> +                ) {
> +                    gfw::wait_gfw_boot_completion(bar)
> +                        .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
> +                }
>              },


* Re: [PATCH v4 05/33] gpu: nova-core: factor .fwsignature* selection into a new get_gsp_sigs_section()
  2026-02-10  2:45 ` [PATCH v4 05/33] gpu: nova-core: factor .fwsignature* selection into a new get_gsp_sigs_section() John Hubbard
@ 2026-02-11 10:16   ` Danilo Krummrich
  2026-02-12  0:39     ` John Hubbard
  0 siblings, 1 reply; 66+ messages in thread
From: Danilo Krummrich @ 2026-02-11 10:16 UTC (permalink / raw)
  To: John Hubbard
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On Tue Feb 10, 2026 at 3:45 AM CET, John Hubbard wrote:
> Keep Gsp::new() from getting too cluttered, by factoring out the
> selection of .fwsignature* items. This will continue to grow as we add
> GPUs.
>
> Reviewed-by: Gary Guo <gary@garyguo.net>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  drivers/gpu/nova-core/firmware/gsp.rs | 60 ++++++++++++++-------------
>  1 file changed, 31 insertions(+), 29 deletions(-)
>
> diff --git a/drivers/gpu/nova-core/firmware/gsp.rs b/drivers/gpu/nova-core/firmware/gsp.rs
> index bc2243450989..10761716ed93 100644
> --- a/drivers/gpu/nova-core/firmware/gsp.rs
> +++ b/drivers/gpu/nova-core/firmware/gsp.rs
> @@ -146,6 +146,36 @@ pub(crate) struct GspFirmware {
>  }
>  
>  impl GspFirmware {
> +    fn get_gsp_sigs_section(chipset: Chipset) -> Option<&'static str> {

Please don't use the 'get' prefix, as it commonly indicates taking a reference
count.

Let's use something like find_gsp_sigs_section().
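
For illustration only, a standalone sketch of the suggested shape: a
find_*() lookup that returns Option because "not found" is the only
failure case. The reduced Chipset enum, its Unsupported variant and the
section name strings are placeholders assumed for this sketch, not the
driver's actual tables.

```rust
/// Reduced, hypothetical stand-in for the driver's `Chipset` type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Chipset {
    Tu102,
    Ga102,
    Ad102,
    /// Placeholder for a chipset that has no signature section.
    Unsupported,
}

/// Looks up the name of the firmware signature section for `chipset`.
///
/// The `find` prefix signals a plain lookup; nothing is acquired and no
/// reference count is taken. The strings below are illustrative only.
fn find_gsp_sigs_section(chipset: Chipset) -> Option<&'static str> {
    match chipset {
        Chipset::Tu102 => Some(".fwsignature_tu10x"),
        Chipset::Ga102 => Some(".fwsignature_ga10x"),
        Chipset::Ad102 => Some(".fwsignature_ad10x"),
        Chipset::Unsupported => None,
    }
}
```

A caller would typically turn None into an error at the call site, for
example with ok_or() and whatever error code fits there.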


* Re: [PATCH v4 08/33] gpu: nova-core: set DMA mask width based on GPU architecture
  2026-02-10  2:45 ` [PATCH v4 08/33] gpu: nova-core: set DMA mask width based on GPU architecture John Hubbard
@ 2026-02-11 10:28   ` Danilo Krummrich
  2026-02-12  2:06     ` John Hubbard
  0 siblings, 1 reply; 66+ messages in thread
From: Danilo Krummrich @ 2026-02-11 10:28 UTC (permalink / raw)
  To: John Hubbard
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On Tue Feb 10, 2026 at 3:45 AM CET, John Hubbard wrote:
> This removes a "TODO" item in the code, which was hardcoded to work on
> Ampere and Ada GPUs. Hopper/Blackwell+ have a larger width, so do an
> early read of boot42, in order to pick the correct value.
>
> Cc: Gary Guo <gary@garyguo.net>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  drivers/gpu/nova-core/driver.rs | 33 ++++++++++++++--------------
>  drivers/gpu/nova-core/gpu.rs    | 38 ++++++++++++++++++++++++---------
>  2 files changed, 44 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs
> index e39885c0d5ca..4ff07b643db6 100644
> --- a/drivers/gpu/nova-core/driver.rs
> +++ b/drivers/gpu/nova-core/driver.rs
> @@ -5,7 +5,6 @@
>      device::Core,
>      devres::Devres,
>      dma::Device,
> -    dma::DmaMask,
>      pci,
>      pci::{
>          Class,
> @@ -17,7 +16,10 @@
>      sync::Arc, //
>  };
>  
> -use crate::gpu::Gpu;
> +use crate::gpu::{
> +    Gpu,
> +    Spec, //
> +};
>  
>  #[pin_data]
>  pub(crate) struct NovaCore {
> @@ -29,14 +31,6 @@ pub(crate) struct NovaCore {
>  
>  const BAR0_SIZE: usize = SZ_16M;
>  
> -// For now we only support Ampere which can use up to 47-bit DMA addresses.
> -//
> -// TODO: Add an abstraction for this to support newer GPUs which may support
> -// larger DMA addresses. Limiting these GPUs to smaller address widths won't
> -// have any adverse affects, unless installed on systems which require larger
> -// DMA addresses. These systems should be quite rare.
> -const GPU_DMA_BITS: u32 = 47;
> -
>  pub(crate) type Bar0 = pci::Bar<BAR0_SIZE>;
>  
>  kernel::pci_device_table!(
> @@ -75,18 +69,23 @@ fn probe(pdev: &pci::Device<Core>, _info: &Self::IdInfo) -> impl PinInit<Self, E
>              pdev.enable_device_mem()?;
>              pdev.set_master();
>  
> -            // SAFETY: No concurrent DMA allocations or mappings can be made because
> -            // the device is still being probed and therefore isn't being used by
> -            // other threads of execution.
> -            unsafe { pdev.dma_set_mask_and_coherent(DmaMask::new::<GPU_DMA_BITS>())? };
> -
> -            let bar = Arc::pin_init(

Spurious rename.

> +            let devres_bar = Arc::pin_init(
>                  pdev.iomap_region_sized::<BAR0_SIZE>(0, c"nova-core/bar0"),
>                  GFP_KERNEL,
>              )?;
>  
> +            // Read the GPU spec early to determine the correct DMA address width.

Hm.. we should move the dma_set_mask_and_coherent() call into Gpu::new(), so all
GPU specific initialization remains in the constructor of Gpu.

> +            // Hopper/Blackwell+ support 52-bit DMA addresses, earlier architectures use 47-bit.

I'd move this down to the dma_set_mask_and_coherent() call, or maybe just remove
it as well, since we have the very same comment for Architecture::dma_mask().
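
For illustration only: a standalone sketch of the width selection the
comment above describes (52 bits for Hopper/Blackwell, 47 bits for the
earlier architectures). The dma_bits() and dma_mask_value() helpers are
hypothetical and return plain integers; they do not model the kernel's
DmaMask type.

```rust
/// GPU architectures named in this series.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Architecture {
    Turing,
    Ampere,
    Ada,
    Hopper,
    Blackwell,
}

/// Hypothetical helper: DMA address width in bits for each architecture,
/// using the widths stated in the quoted comment.
fn dma_bits(arch: Architecture) -> u32 {
    match arch {
        Architecture::Turing | Architecture::Ampere | Architecture::Ada => 47,
        Architecture::Hopper | Architecture::Blackwell => 52,
    }
}

/// Hypothetical helper: the mask with the low `bits` bits set.
fn dma_mask_value(bits: u32) -> u64 {
    if bits >= 64 {
        // Shifting a u64 by 64 or more would overflow, so saturate.
        u64::MAX
    } else {
        (1u64 << bits) - 1
    }
}
```

Keeping the width in one per-architecture function means the call site,
wherever it ends up, needs no comment of its own.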

> +            let spec = Spec::new(pdev.as_ref(), devres_bar.access(pdev.as_ref())?)?;
> +            dev_info!(pdev.as_ref(), "NVIDIA ({})\n", spec);

This re-introduces pdev.as_ref().

> +
> +            // SAFETY: No concurrent DMA allocations or mappings can be made because
> +            // the device is still being probed and therefore isn't being used by
> +            // other threads of execution.
> +            unsafe { pdev.dma_set_mask_and_coherent(spec.chipset().arch().dma_mask())? };
> +
>              Ok(try_pin_init!(Self {
> -                gpu <- Gpu::new(pdev, bar.clone(), bar.access(pdev.as_ref())?),
> +                gpu <- Gpu::new(pdev, devres_bar.clone(), devres_bar.access(pdev.as_ref())?, spec),
>                  _reg <- auxiliary::Registration::new(
>                      pdev.as_ref(),
>                      c"nova-drm",


* Re: [PATCH v4 17/33] gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations
  2026-02-10  2:45 ` [PATCH v4 17/33] gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations John Hubbard
@ 2026-02-11 10:57   ` Danilo Krummrich
  2026-02-12  2:09     ` John Hubbard
  0 siblings, 1 reply; 66+ messages in thread
From: Danilo Krummrich @ 2026-02-11 10:57 UTC (permalink / raw)
  To: John Hubbard
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On Tue Feb 10, 2026 at 3:45 AM CET, John Hubbard wrote:
> Add external memory (EMEM) read/write operations to the GPU's FSP falcon
> engine. These operations use Falcon PIO (Programmed I/O) to communicate
> with the FSP through indirect memory access.
>
> Cc: Gary Guo <gary@garyguo.net>
> Cc: Timur Tabi <ttabi@nvidia.com>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  drivers/gpu/nova-core/falcon/fsp.rs | 59 ++++++++++++++++++++++++++++-
>  drivers/gpu/nova-core/regs.rs       | 13 +++++++
>  2 files changed, 71 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/nova-core/falcon/fsp.rs b/drivers/gpu/nova-core/falcon/fsp.rs
> index cc3fc3cf2f6a..fb1c8c89d2ff 100644
> --- a/drivers/gpu/nova-core/falcon/fsp.rs
> +++ b/drivers/gpu/nova-core/falcon/fsp.rs
> @@ -5,13 +5,20 @@
>  //! The FSP falcon handles secure boot and Chain of Trust operations
>  //! on Hopper and Blackwell architectures, replacing SEC2's role.
>  
> +use kernel::prelude::*;
> +
>  use crate::{
> +    driver::Bar0,
>      falcon::{
> +        Falcon,
>          FalconEngine,
>          PFalcon2Base,
>          PFalconBase, //
>      },
> -    regs::macros::RegisterBase,
> +    regs::{
> +        self,
> +        macros::RegisterBase, //
> +    },
>  };
>  
>  /// Type specifying the `Fsp` falcon engine. Cannot be instantiated.
> @@ -29,3 +36,53 @@ impl RegisterBase<PFalcon2Base> for Fsp {
>  impl FalconEngine for Fsp {
>      const ID: Self = Fsp(());
>  }
> +
> +impl Falcon<Fsp> {
> +    /// Writes `data` to FSP external memory at byte `offset` using Falcon PIO.
> +    ///
> +    /// Returns `EINVAL` if offset or data length is not 4-byte aligned.
> +    #[expect(unused)]
> +    pub(crate) fn write_emem(&self, bar: &Bar0, offset: u32, data: &[u8]) -> Result {
> +        // TODO: replace with `is_multiple_of` once the MSRV is >= 1.82.
> +        if offset % 4 != 0 || data.len() % 4 != 0 {
> +            return Err(EINVAL);
> +        }
> +
> +        regs::NV_PFALCON_FALCON_EMEM_CTL::default()
> +            .set_wr_mode(true)
> +            .set_offset(offset)
> +            .write(bar, &Fsp::ID);
> +
> +        for chunk in data.chunks_exact(4) {
> +            let word = u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
> +            regs::NV_PFALCON_FALCON_EMEM_DATA::default()
> +                .set_data(word)
> +                .write(bar, &Fsp::ID);
> +        }
> +
> +        Ok(())
> +    }
> +
> +    /// Reads FSP external memory at byte `offset` into `data` using Falcon PIO.
> +    ///
> +    /// Returns `EINVAL` if offset or data length is not 4-byte aligned.
> +    #[expect(unused)]
> +    pub(crate) fn read_emem(&self, bar: &Bar0, offset: u32, data: &mut [u8]) -> Result {
> +        // TODO: replace with `is_multiple_of` once the MSRV is >= 1.82.
> +        if offset % 4 != 0 || data.len() % 4 != 0 {
> +            return Err(EINVAL);
> +        }
> +
> +        regs::NV_PFALCON_FALCON_EMEM_CTL::default()
> +            .set_rd_mode(true)
> +            .set_offset(offset)
> +            .write(bar, &Fsp::ID);
> +
> +        for chunk in data.chunks_exact_mut(4) {
> +            let word = regs::NV_PFALCON_FALCON_EMEM_DATA::read(bar, &Fsp::ID).data();
> +            chunk.copy_from_slice(&word.to_le_bytes());
> +        }
> +
> +        Ok(())
> +    }
> +}

Technically, we could represent this as a separate I/O backend and use IoView /
IoSlice (once we have it).

So, you could have Falcon<Fsp>::emem(), which returns an &Emem that implements
Io [1].

This way we would get IoView and register!() for free on top of it. IoView will
allow you to modify fields of the FSP structures similar to what we have for DMA
with dma_read!() and dma_write!().

I just briefly glanced at the subsequent patches, but it looks like this could
save quite some code.

We may not get the full potential right away, as IoView is still WIP, but I
think it makes sense to consider it for a follow-up.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core.git/tree/rust/kernel/io.rs?h=driver-core-next#n303
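
As a rough standalone illustration of the "separate I/O backend" idea (plain
userspace Rust, not kernel code): the `WordIo` trait, `MockEmem` and `EmemError`
below are hypothetical names and are not the kernel's `Io` trait or the
nova-core register API. The mock also assumes that the data port
auto-increments after each access, which is what the loops in the patch imply.

```rust
/// Hypothetical word-granular I/O backend; a stand-in for, not a copy of,
/// the kernel's `Io` trait.
pub trait WordIo {
    /// Size of the addressable window in bytes.
    fn len_bytes(&self) -> usize;
    /// Programs the byte offset for subsequent data-port accesses.
    fn set_offset(&mut self, offset: u32);
    /// Writes one word through the data port.
    fn write_word(&mut self, word: u32);
    /// Reads one word through the data port.
    fn read_word(&mut self) -> u32;
}

/// In-memory mock of a control/data register pair.
/// Assumption: the data port advances by one word per access.
pub struct MockEmem {
    words: Vec<u32>,
    cursor: usize,
}

impl MockEmem {
    pub fn new(size_bytes: usize) -> Self {
        Self { words: vec![0; size_bytes / 4], cursor: 0 }
    }
}

impl WordIo for MockEmem {
    fn len_bytes(&self) -> usize {
        self.words.len() * 4
    }
    fn set_offset(&mut self, offset: u32) {
        self.cursor = (offset / 4) as usize;
    }
    fn write_word(&mut self, word: u32) {
        self.words[self.cursor] = word;
        self.cursor += 1;
    }
    fn read_word(&mut self) -> u32 {
        let word = self.words[self.cursor];
        self.cursor += 1;
        word
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum EmemError {
    Unaligned,
    OutOfRange,
}

/// Shared validation: 4-byte alignment (as in the patch) plus a bounds check.
fn check<B: WordIo>(io: &B, offset: u32, len: usize) -> Result<(), EmemError> {
    if offset % 4 != 0 || len % 4 != 0 {
        return Err(EmemError::Unaligned);
    }
    let end = (offset as usize).checked_add(len).ok_or(EmemError::OutOfRange)?;
    if end > io.len_bytes() {
        return Err(EmemError::OutOfRange);
    }
    Ok(())
}

/// Writes `data` at byte `offset`, one little-endian word at a time.
pub fn write_emem<B: WordIo>(io: &mut B, offset: u32, data: &[u8]) -> Result<(), EmemError> {
    check(io, offset, data.len())?;
    io.set_offset(offset);
    for chunk in data.chunks_exact(4) {
        io.write_word(u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]));
    }
    Ok(())
}

/// Reads into `data` from byte `offset`, one little-endian word at a time.
pub fn read_emem<B: WordIo>(io: &mut B, offset: u32, data: &mut [u8]) -> Result<(), EmemError> {
    check(io, offset, data.len())?;
    io.set_offset(offset);
    for chunk in data.chunks_exact_mut(4) {
        chunk.copy_from_slice(&io.read_word().to_le_bytes());
    }
    Ok(())
}
```

With the access pattern behind a trait, the read/write helpers no longer need
to know about the specific registers, which is roughly the benefit a real Io
backend would bring.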

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 01/33] gpu: nova-core: pass pdev directly to dev_* logging macros
  2026-02-11 10:06   ` Danilo Krummrich
@ 2026-02-11 18:48     ` John Hubbard
  0 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-11 18:48 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On 2/11/26 2:06 AM, Danilo Krummrich wrote:
> On Tue Feb 10, 2026 at 3:45 AM CET, John Hubbard wrote:
>> The dev_dbg!, dev_info!, dev_err!, and dev_warn! macros now accept
>> pci::Device directly without requiring an explicit .as_ref()
>> conversion to device::Device, thanks to commit a38cd1fea989
>> ("rust: device: support `dev_printk` on all devices").
>>
>> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> 
> There is already [1], which I queued up for when v7.0-rc1 is out.
> 
> [1] https://lore.kernel.org/all/20260123175854.176735-7-gary@kernel.org/

I thought I saw that go by. OK, this will go away when I rebase onto
the upcoming drm-rust-next.

thanks,
-- 
John Hubbard


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 05/33] gpu: nova-core: factor .fwsignature* selection into a new get_gsp_sigs_section()
  2026-02-11 10:16   ` Danilo Krummrich
@ 2026-02-12  0:39     ` John Hubbard
  0 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-12  0:39 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On 2/11/26 2:16 AM, Danilo Krummrich wrote:
> On Tue Feb 10, 2026 at 3:45 AM CET, John Hubbard wrote:
...
>> diff --git a/drivers/gpu/nova-core/firmware/gsp.rs b/drivers/gpu/nova-core/firmware/gsp.rs
>> index bc2243450989..10761716ed93 100644
>> --- a/drivers/gpu/nova-core/firmware/gsp.rs
>> +++ b/drivers/gpu/nova-core/firmware/gsp.rs
>> @@ -146,6 +146,36 @@ pub(crate) struct GspFirmware {
>>  }
>>  
>>  impl GspFirmware {
>> +    fn get_gsp_sigs_section(chipset: Chipset) -> Option<&'static str> {
> 
> Please don't use the 'get' prefix, as it commonly indicates taking a reference
> count.
> 
> Let's use something like find_gsp_sigs_section().

Done, thanks for catching that.

thanks,
-- 
John Hubbard


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 09/33] gpu: nova-core: Hopper/Blackwell: skip GFW boot waiting
  2026-02-11 10:09   ` Danilo Krummrich
@ 2026-02-12  1:49     ` John Hubbard
  0 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-12  1:49 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On 2/11/26 2:09 AM, Danilo Krummrich wrote:
> On Tue Feb 10, 2026 at 3:45 AM CET, John Hubbard wrote:
...
>> diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
>> index 24feb0e8723e..f04e2a795e90 100644
>> --- a/drivers/gpu/nova-core/gpu.rs
>> +++ b/drivers/gpu/nova-core/gpu.rs
>> @@ -304,10 +304,19 @@ pub(crate) fn new<'a>(
>>          let chipset = spec.chipset();
>>  
>>          try_pin_init!(Self {
>> -            // We must wait for GFW_BOOT completion before doing any significant setup on the GPU.
>> +            // Turing, Ampere, Ada: we must wait for GFW_BOOT completion before doing any
>> +            // significant setup on the GPU.
>> +            //
>> +            // Hopper/Blackwell: skip GFW_BOOT completion waiting entirely, and use the simpler FSP
>> +            // Chain of Trust boot path (elsewhere) instead.
>>              _: {
>> -                gfw::wait_gfw_boot_completion(bar)
>> -                    .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
>> +                if matches!(
>> +                    chipset.arch(),
>> +                    Architecture::Turing | Architecture::Ampere | Architecture::Ada
> 
> I assume Blackwell is not an exception and we expect this to be the case for
> future architectures as well? I.e. checking for "!Architecture::Blackwell" makes
> no sense?

You are correct. I've applied this locally, so that future GPUs
continue to skip GFW_BOOT (which I'm thinking of renaming now, but
in a future patchset):

diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index f04e2a795e90..2034c05c04de 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -167,6 +167,18 @@ pub(crate) const fn dma_mask(&self) -> DmaMask {
             Self::Hopper | Self::Blackwell => DmaMask::new::<52>(),
         }
     }
+
+    /// Returns whether the GPU uses GFW_BOOT for firmware loading.
+    ///
+    /// Pre-Hopper architectures (Turing, Ampere, Ada) require waiting for GFW_BOOT completion
+    /// before any significant GPU setup. Hopper and later use the FSP Chain of Trust boot path
+    /// instead.
+    pub(crate) const fn needs_gfw_boot(&self) -> bool {
+        match self {
+            Self::Turing | Self::Ampere | Self::Ada => true,
+            _ => false,
+        }
+    }
 }
 
 impl TryFrom<u8> for Architecture {
@@ -304,16 +316,8 @@ pub(crate) fn new<'a>(
         let chipset = spec.chipset();
 
         try_pin_init!(Self {
-            // Turing, Ampere, Ada: we must wait for GFW_BOOT completion before doing any
-            // significant setup on the GPU.
-            //
-            // Hopper/Blackwell: skip GFW_BOOT completion waiting entirely, and use the simpler FSP
-            // Chain of Trust boot path (elsewhere) instead.
             _: {
-                if matches!(
-                    chipset.arch(),
-                    Architecture::Turing | Architecture::Ampere | Architecture::Ada
-                ) {
+                if chipset.arch().needs_gfw_boot() {
                     gfw::wait_gfw_boot_completion(bar)
                         .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
                 }


> 
>> +                ) {
>> +                    gfw::wait_gfw_boot_completion(bar)
>> +                        .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
>> +                }
>>              },

thanks,
-- 
John Hubbard


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 08/33] gpu: nova-core: set DMA mask width based on GPU architecture
  2026-02-11 10:28   ` Danilo Krummrich
@ 2026-02-12  2:06     ` John Hubbard
  0 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-12  2:06 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On 2/11/26 2:28 AM, Danilo Krummrich wrote:
> On Tue Feb 10, 2026 at 3:45 AM CET, John Hubbard wrote:
>> This removes a "TODO" item in the code, which was hardcoded to work on
>> Ampere and Ada GPUs. Hopper/Blackwell+ have a larger width, so do an
>> early read of boot42, in order to pick the correct value.
>>
>> Cc: Gary Guo <gary@garyguo.net>
>> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
>> ---
>>  drivers/gpu/nova-core/driver.rs | 33 ++++++++++++++--------------
>>  drivers/gpu/nova-core/gpu.rs    | 38 ++++++++++++++++++++++++---------
>>  2 files changed, 44 insertions(+), 27 deletions(-)
>>
>> diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs
>> index e39885c0d5ca..4ff07b643db6 100644
>> --- a/drivers/gpu/nova-core/driver.rs
>> +++ b/drivers/gpu/nova-core/driver.rs
>> @@ -5,7 +5,6 @@
>>      device::Core,
>>      devres::Devres,
>>      dma::Device,
>> -    dma::DmaMask,
>>      pci,
>>      pci::{
>>          Class,
>> @@ -17,7 +16,10 @@
>>      sync::Arc, //
>>  };
>>  
>> -use crate::gpu::Gpu;
>> +use crate::gpu::{
>> +    Gpu,
>> +    Spec, //
>> +};
>>  
>>  #[pin_data]
>>  pub(crate) struct NovaCore {
>> @@ -29,14 +31,6 @@ pub(crate) struct NovaCore {
>>  
>>  const BAR0_SIZE: usize = SZ_16M;
>>  
>> -// For now we only support Ampere which can use up to 47-bit DMA addresses.
>> -//
>> -// TODO: Add an abstraction for this to support newer GPUs which may support
>> -// larger DMA addresses. Limiting these GPUs to smaller address widths won't
>> -// have any adverse affects, unless installed on systems which require larger
>> -// DMA addresses. These systems should be quite rare.
>> -const GPU_DMA_BITS: u32 = 47;
>> -
>>  pub(crate) type Bar0 = pci::Bar<BAR0_SIZE>;
>>  
>>  kernel::pci_device_table!(
>> @@ -75,18 +69,23 @@ fn probe(pdev: &pci::Device<Core>, _info: &Self::IdInfo) -> impl PinInit<Self, E
>>              pdev.enable_device_mem()?;
>>              pdev.set_master();
>>  
>> -            // SAFETY: No concurrent DMA allocations or mappings can be made because
>> -            // the device is still being probed and therefore isn't being used by
>> -            // other threads of execution.
>> -            unsafe { pdev.dma_set_mask_and_coherent(DmaMask::new::<GPU_DMA_BITS>())? };
>> -
>> -            let bar = Arc::pin_init(
> 
> Spurious rename.

Reverted.

> 
>> +            let devres_bar = Arc::pin_init(
>>                  pdev.iomap_region_sized::<BAR0_SIZE>(0, c"nova-core/bar0"),
>>                  GFP_KERNEL,
>>              )?;
>>  
>> +            // Read the GPU spec early to determine the correct DMA address width.
> 
> Hm.. we should move the dma_set_mask_and_coherent() call into Gpu::new(), so all
> GPU specific initialization remains in the constructor of Gpu.

Makes sense, done.

> 
>> +            // Hopper/Blackwell+ support 52-bit DMA addresses, earlier architectures use 47-bit.
> 
> I'd move this down to the dma_set_mask_and_coherent() call, or maybe just remove
> it as well, since we have the very same comment for Architecture::dma_mask().

Done.

> 
>> +            let spec = Spec::new(pdev.as_ref(), devres_bar.access(pdev.as_ref())?)?;
>> +            dev_info!(pdev.as_ref(), "NVIDIA ({})\n", spec);
> 
> This re-introduces pdev.as_ref().

Fixed.

Very helpful, thanks!

thanks,
-- 
John Hubbard


> 
>> +
>> +            // SAFETY: No concurrent DMA allocations or mappings can be made because
>> +            // the device is still being probed and therefore isn't being used by
>> +            // other threads of execution.
>> +            unsafe { pdev.dma_set_mask_and_coherent(spec.chipset().arch().dma_mask())? };
>> +
>>              Ok(try_pin_init!(Self {
>> -                gpu <- Gpu::new(pdev, bar.clone(), bar.access(pdev.as_ref())?),
>> +                gpu <- Gpu::new(pdev, devres_bar.clone(), devres_bar.access(pdev.as_ref())?, spec),
>>                  _reg <- auxiliary::Registration::new(
>>                      pdev.as_ref(),
>>                      c"nova-drm",




^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 17/33] gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations
  2026-02-11 10:57   ` Danilo Krummrich
@ 2026-02-12  2:09     ` John Hubbard
  2026-02-17 15:43       ` Danilo Krummrich
  0 siblings, 1 reply; 66+ messages in thread
From: John Hubbard @ 2026-02-12  2:09 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On 2/11/26 2:57 AM, Danilo Krummrich wrote:
> On Tue Feb 10, 2026 at 3:45 AM CET, John Hubbard wrote:
...
>> +    #[expect(unused)]
>> +    pub(crate) fn read_emem(&self, bar: &Bar0, offset: u32, data: &mut [u8]) -> Result {
>> +        // TODO: replace with `is_multiple_of` once the MSRV is >= 1.82.
>> +        if offset % 4 != 0 || data.len() % 4 != 0 {
>> +            return Err(EINVAL);
>> +        }
>> +
>> +        regs::NV_PFALCON_FALCON_EMEM_CTL::default()
>> +            .set_rd_mode(true)
>> +            .set_offset(offset)
>> +            .write(bar, &Fsp::ID);
>> +
>> +        for chunk in data.chunks_exact_mut(4) {
>> +            let word = regs::NV_PFALCON_FALCON_EMEM_DATA::read(bar, &Fsp::ID).data();
>> +            chunk.copy_from_slice(&word.to_le_bytes());
>> +        }
>> +
>> +        Ok(())
>> +    }
>> +}
> 
> Technically, we could represent this as a separate I/O backend and use IoView /
> IoSlice (once we have it).
> 
> So, you could have Falcon<Fsp>::emem(), which returns an &Emem that implements
> Io [1].
> 
> This way we would get IoView and register!() for free on top of it. IoView will
> allow you to modify fields of the FSP structures similar to what we have for DMA
> with dma_read!() and dma_write!().
> 
> I just briefly glanced at the subsequent patches, but it looks like this could
> save quite some code.
> 
> We may not get the full potential right away, as IoView is still WIP, but I
> think it makes sense to consider it for a follow-up.
> 

Yes, let's keep this in mind.

thanks,
-- 
John Hubbard


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 17/33] gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations
  2026-02-12  2:09     ` John Hubbard
@ 2026-02-17 15:43       ` Danilo Krummrich
  2026-02-19  2:54         ` John Hubbard
  0 siblings, 1 reply; 66+ messages in thread
From: Danilo Krummrich @ 2026-02-17 15:43 UTC (permalink / raw)
  To: John Hubbard
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On Thu Feb 12, 2026 at 3:09 AM CET, John Hubbard wrote:
> On 2/11/26 2:57 AM, Danilo Krummrich wrote:
>> On Tue Feb 10, 2026 at 3:45 AM CET, John Hubbard wrote:
> ...
>>> +    #[expect(unused)]
>>> +    pub(crate) fn read_emem(&self, bar: &Bar0, offset: u32, data: &mut [u8]) -> Result {
>>> +        // TODO: replace with `is_multiple_of` once the MSRV is >= 1.82.
>>> +        if offset % 4 != 0 || data.len() % 4 != 0 {
>>> +            return Err(EINVAL);
>>> +        }
>>> +
>>> +        regs::NV_PFALCON_FALCON_EMEM_CTL::default()
>>> +            .set_rd_mode(true)
>>> +            .set_offset(offset)
>>> +            .write(bar, &Fsp::ID);
>>> +
>>> +        for chunk in data.chunks_exact_mut(4) {
>>> +            let word = regs::NV_PFALCON_FALCON_EMEM_DATA::read(bar, &Fsp::ID).data();
>>> +            chunk.copy_from_slice(&word.to_le_bytes());
>>> +        }
>>> +
>>> +        Ok(())
>>> +    }
>>> +}
>> 
>> Technically, we could represent this as a separate I/O backend and use IoView /
>> IoSlice (once we have it).
>> 
>> So, you could have Falcon<Fsp>::emem(), which returns an &Emem that implements
>> Io [1].
>> 
>> This way we would get IoView and register!() for free on top of it. IoView will
>> allow you to modify fields of the FSP structures similar to what we have for DMA
>> with dma_read!() and dma_write!().
>> 
>> I just briefly glanced at the subsequent patches, but it looks like this could
>> save quite some code.
>> 
>> We may not get the full potential right away, as IoView is still WIP, but I
>> think it makes sense to consider it for a follow-up.
>> 
>
> Yes, let's keep this in mind.

Just to clarify, I mean we should implement Io and IoCapable right away, but
leave the rest as follow-up.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 18/33] gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure
  2026-02-10  2:45 ` [PATCH v4 18/33] gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure John Hubbard
@ 2026-02-17 16:28   ` Danilo Krummrich
  2026-02-20 22:05     ` Tegra notes for Nova: " John Hubbard
  0 siblings, 1 reply; 66+ messages in thread
From: Danilo Krummrich @ 2026-02-17 16:28 UTC (permalink / raw)
  To: John Hubbard
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On Tue Feb 10, 2026 at 3:45 AM CET, John Hubbard wrote:
> +    pub(crate) fn poll_msgq(&self, bar: &Bar0) -> u32 {
> +    pub(crate) fn send_msg(&self, bar: &Bar0, packet: &[u8]) -> Result {
> +    pub(crate) fn recv_msg(&self, bar: &Bar0, buffer: &mut [u8], size: usize) -> Result<usize> {

Just a quick note, since I just reminded myself of this: We should keep in mind
that at some point we have to replace most (if not all) &Bar0 usages with
&Mmio<SIZE>, as nova-core will also support platform devices.

I think Tegra chips with GSP-based GPU IPs have a compatible register layout,
right?

(I will prepare a patch to address this for the existing codebase.)

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 19/33] gpu: nova-core: Hopper/Blackwell: calculate reserved FB heap size
  2026-02-10  2:45 ` [PATCH v4 19/33] gpu: nova-core: Hopper/Blackwell: calculate reserved FB heap size John Hubbard
@ 2026-02-17 16:39   ` Danilo Krummrich
  2026-02-19  3:01     ` John Hubbard
  0 siblings, 1 reply; 66+ messages in thread
From: Danilo Krummrich @ 2026-02-17 16:39 UTC (permalink / raw)
  To: John Hubbard
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On Tue Feb 10, 2026 at 3:45 AM CET, John Hubbard wrote:
> diff --git a/drivers/gpu/nova-core/num.rs b/drivers/gpu/nova-core/num.rs
> index c952a834e662..f068722c5bdf 100644
> --- a/drivers/gpu/nova-core/num.rs
> +++ b/drivers/gpu/nova-core/num.rs
> @@ -215,3 +215,13 @@ pub(crate) const fn [<$from _into_ $into>]<const N: $from>() -> $into {
>  impl_const_into!(u64 => { u8, u16, u32 });
>  impl_const_into!(u32 => { u8, u16 });
>  impl_const_into!(u16 => { u8 });
> +
> +/// Aligns `value` up to `ALIGN` at compile time.
> +///
> +/// This is the const-compatible equivalent of [`kernel::ptr::Alignable::align_up`].
> +/// `ALIGN` must be a power of two (enforced at compile time).
> +#[inline(always)]
> +pub(crate) const fn const_align_up<const ALIGN: usize>(value: usize) -> usize {

We should probably just add this as a function to Alignable.

> +    build_assert!(ALIGN.is_power_of_two());

ALIGN is a const generic, hence this should be:

	const { core::assert!(ALIGN.is_power_of_two(), "ALIGN must be a power of two") };

> +    (value + (ALIGN - 1)) & !(ALIGN - 1)
> +}
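
For reference, a minimal standalone sketch of the const-block variant (plain
Rust with core's assert!, outside the kernel tree; it needs a toolchain with
inline const blocks, stable since Rust 1.79). The function name matches the
patch, while `PAGE_ALIGNED` is only an illustrative constant:

```rust
/// Aligns `value` up to `ALIGN`, which must be a power of two.
///
/// The inline `const` block is evaluated per instantiation of `ALIGN`, so a
/// bad alignment is rejected when the function is monomorphized rather than
/// at run time.
pub const fn const_align_up<const ALIGN: usize>(value: usize) -> usize {
    const { assert!(ALIGN.is_power_of_two(), "ALIGN must be a power of two") };
    // Note: `value + (ALIGN - 1)` can still overflow for values near usize::MAX.
    (value + (ALIGN - 1)) & !(ALIGN - 1)
}

/// Usable in const context as well.
pub const PAGE_ALIGNED: usize = const_align_up::<4096>(4097);
```

An instantiation such as const_align_up::<3>(..) then fails to compile instead
of relying on the optimizer the way build_assert!() does.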

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 20/33] gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting
  2026-02-10  2:45 ` [PATCH v4 20/33] gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting John Hubbard
@ 2026-02-17 17:13   ` Danilo Krummrich
  2026-02-20 23:26     ` John Hubbard
  0 siblings, 1 reply; 66+ messages in thread
From: Danilo Krummrich @ 2026-02-17 17:13 UTC (permalink / raw)
  To: John Hubbard
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On Tue Feb 10, 2026 at 3:45 AM CET, John Hubbard wrote:
> +/// MCTP (Management Component Transport Protocol) header values for FSP communication.
> +pub(crate) mod mctp {
> +    pub(super) const HEADER_SOM: u32 = 1; // Start of Message
> +    pub(super) const HEADER_EOM: u32 = 1; // End of Message
> +    pub(super) const HEADER_SEID: u32 = 0; // Source Endpoint ID
> +    pub(super) const HEADER_SEQ: u32 = 0; // Sequence number

This looks like it should be a MctpMessageType enum instead.

> +
> +    pub(super) const MSG_TYPE_VENDOR_PCI: u32 = 0x7e;
> +    pub(super) const VENDOR_ID_NV: u32 = 0x10de;
> +    pub(super) const NVDM_TYPE_COT: u32 = 0x14;
> +    pub(super) const NVDM_TYPE_FSP_RESPONSE: u32 = 0x15;

This seems like it should be a different type (or even types) as they are
specific header field values.
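
A rough sketch of what typed header fields could look like, as standalone
Rust. The enum and function names are hypothetical. The numeric values are the
ones from the patch, but the two shift constants are assumptions made for
illustration, since their definitions are not part of the quoted hunk:

```rust
/// MCTP message type field (only the value named in the patch).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u32)]
pub enum MctpMessageType {
    VendorPci = 0x7e,
}

/// NVDM type field (only the values named in the patch).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u32)]
pub enum NvdmType {
    Cot = 0x14,
    FspResponse = 0x15,
}

impl TryFrom<u32> for NvdmType {
    type Error = u32;

    /// Decodes a raw field value, handing back the raw value if it is unknown.
    fn try_from(raw: u32) -> Result<Self, Self::Error> {
        match raw {
            0x14 => Ok(Self::Cot),
            0x15 => Ok(Self::FspResponse),
            other => Err(other),
        }
    }
}

pub const VENDOR_ID_NV: u32 = 0x10de;

// Assumed field positions, for illustration only.
const NVDM_VENDOR_ID_SHIFT: u32 = 8;
const NVDM_TYPE_SHIFT: u32 = 24;

/// Packs the NVDM header word from typed fields.
pub fn nvdm_header(msg: MctpMessageType, ty: NvdmType) -> u32 {
    (msg as u32) | (VENDOR_ID_NV << NVDM_VENDOR_ID_SHIFT) | ((ty as u32) << NVDM_TYPE_SHIFT)
}

/// Extracts the NVDM type from a header word, if it is a known one.
pub fn nvdm_type(header: u32) -> Result<NvdmType, u32> {
    NvdmType::try_from((header >> NVDM_TYPE_SHIFT) & 0xff)
}
```

The point is only that callers pass NvdmType::Cot instead of a bare u32, so a
message type cannot be mixed up with a vendor ID at the call site.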

> +}
> +
> +/// GSP FMC boot parameters structure.
> +/// This is what FSP expects to receive for booting GSP-RM.
> +/// GSP FMC initialization parameters.
> +#[repr(C)]
> +#[derive(Debug, Clone, Copy, Default)]
> +struct GspFmcInitParams {
> +    /// CC initialization "registry keys"
> +    regkeys: u32,
> +}
> +
> +// SAFETY: GspFmcInitParams is a simple C struct with only primitive types.
> +unsafe impl AsBytes for GspFmcInitParams {}
> +// SAFETY: All bit patterns are valid for the primitive fields.
> +unsafe impl FromBytes for GspFmcInitParams {}
> +
> +/// GSP ACR (Authenticated Code RAM) boot parameters.
> +#[repr(C)]
> +#[derive(Debug, Clone, Copy, Default)]
> +struct GspAcrBootGspRmParams {
> +    /// Physical memory aperture through which gspRmDescPa is accessed

For the whole series, please end them with a period. Personally, I don't care
too much, but it's a convention we have for Rust code.

> +/// FSP interface for Hopper/Blackwell GPUs.
> +pub(crate) struct Fsp;
> +
> +impl Fsp {
> +    /// Wait for FSP secure boot completion.
> +    ///
> +    /// Polls the thermal scratch register until FSP signals boot completion
> +    /// or timeout occurs.
> +    pub(crate) fn wait_secure_boot(
> +        dev: &device::Device<device::Bound>,
> +        bar: &crate::driver::Bar0,
> +        arch: crate::gpu::Architecture,
> +    ) -> Result<()> {

Please just use 'Result'.

> +        let timeout = Delta::from_millis(FSP_SECURE_BOOT_TIMEOUT_MS);
> +
> +        read_poll_timeout(
> +            || crate::regs::read_fsp_boot_complete_status(bar, arch),

Let's just import regs.

> +            |&status| {
> +                dev_dbg!(
> +                    dev,
> +                    "FSP I2CS scratch register status: {:#x} (expected: {:#x})\n",
> +                    status,
> +                    FSP_BOOT_COMPLETE_SUCCESS
> +                );

I don't think we should have print statements in the condition closure. Even for
debug prints we don't want to spam the console.

> +                status == FSP_BOOT_COMPLETE_SUCCESS
> +            },
> +            Delta::ZERO,
> +            timeout,
> +        )
> +        .map_err(|_| {
> +            let final_status =
> +                crate::regs::read_fsp_boot_complete_status(bar, arch).unwrap_or(0xDEADBEEF);

I think read_fsp_boot_complete_status() should just return an Option.

Also, do we really care about the actual value if we time out? If so, this won't
work reliably, i.e. final_status could have changed to
FSP_BOOT_COMPLETE_SUCCESS.

> +            dev_err!(
> +                dev,
> +                "FSP secure boot completion timeout - final status: {:#x}\n",
> +                final_status
> +            );
> +            ETIMEDOUT

Even if it is the wrong architecture we return ETIMEDOUT? Actually, since FSP is
for Hopper/Blackwell only/plus, we may even want to print a warning and use
debug_assert().

> +        })
> +        .map(|_| ())
> +    }
> +}
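
To illustrate the points above in isolation, here is a userspace sketch using
std timers. `poll_until` and `BootWaitError` are hypothetical names and this is
not the kernel's read_poll_timeout(). The read returns an Option, the condition
closure does not log anything, and "unsupported" is kept distinct from
"timed out":

```rust
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq, Eq)]
pub enum BootWaitError {
    /// The status source does not exist (e.g. wrong architecture).
    Unsupported,
    /// The condition was not met before the deadline.
    TimedOut,
}

/// Repeatedly calls `op` until `cond` accepts its value or `timeout` expires.
///
/// `op` returning `None` aborts immediately with `Unsupported` rather than
/// being folded into the timeout path.
pub fn poll_until<T>(
    mut op: impl FnMut() -> Option<T>,
    mut cond: impl FnMut(&T) -> bool,
    timeout: Duration,
) -> Result<T, BootWaitError> {
    let deadline = Instant::now() + timeout;
    loop {
        let value = op().ok_or(BootWaitError::Unsupported)?;
        if cond(&value) {
            return Ok(value);
        }
        if Instant::now() >= deadline {
            return Err(BootWaitError::TimedOut);
        }
        // Busy-poll, similar in spirit to a zero sleep delta.
        std::thread::yield_now();
    }
}
```

Any diagnostic output would then go at the call site, once, after the result
is known, instead of once per poll iteration.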

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 24/33] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot
  2026-02-10  2:45 ` [PATCH v4 24/33] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot John Hubbard
@ 2026-02-17 18:16   ` Danilo Krummrich
  2026-02-20 23:35     ` John Hubbard
  0 siblings, 1 reply; 66+ messages in thread
From: Danilo Krummrich @ 2026-02-17 18:16 UTC (permalink / raw)
  To: John Hubbard
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On Tue Feb 10, 2026 at 3:45 AM CET, John Hubbard wrote:
> +    /// Creates FMC boot parameters structure for FSP.
> +    ///
> +    /// This structure tells FSP how to boot GSP-RM with the correct memory layout.
> +    pub(crate) fn create_fmc_boot_params(
> +        dev: &device::Device<device::Bound>,
> +        wpr_meta_addr: u64,
> +        wpr_meta_size: u32,
> +        libos_addr: u64,
> +    ) -> Result<kernel::dma::CoherentAllocation<GspFmcBootParams>> {
> +        use kernel::dma::CoherentAllocation;
> +
> +        const GSP_DMA_TARGET_COHERENT_SYSTEM: u32 = 1;
> +        const GSP_DMA_TARGET_NONCOHERENT_SYSTEM: u32 = 2;
> +
> +        let fmc_boot_params = CoherentAllocation::<GspFmcBootParams>::alloc_coherent(
> +            dev,
> +            1,
> +            GFP_KERNEL | __GFP_ZERO,
> +        )?;

I've mentioned this in another context already (where it doesn't work
unfortunately), but I think we should add a constructor that takes a closure
with a &mut [T] argument, so we don't have to use dma_write!() for
initialization. If you want I can prepare a patch.

> +
> +        // Configure ACR boot parameters (WPR metadata location) using dma_write! macro
> +        kernel::dma_write!(
> +            fmc_boot_params[0].boot_gsp_rm_params.target = GSP_DMA_TARGET_COHERENT_SYSTEM
> +        )?;
> +        kernel::dma_write!(
> +            fmc_boot_params[0].boot_gsp_rm_params.gsp_rm_desc_offset = wpr_meta_addr
> +        )?;
> +        kernel::dma_write!(fmc_boot_params[0].boot_gsp_rm_params.gsp_rm_desc_size = wpr_meta_size)?;
> +
> +        // Blackwell FSP expects wpr_carveout_offset and wpr_carveout_size to be zero;
> +        // it obtains WPR info from other sources.
> +        kernel::dma_write!(fmc_boot_params[0].boot_gsp_rm_params.b_is_gsp_rm_boot = 1)?;
> +
> +        // Configure RM parameters (libos location) using dma_write! macro
> +        kernel::dma_write!(
> +            fmc_boot_params[0].gsp_rm_params.target = GSP_DMA_TARGET_NONCOHERENT_SYSTEM
> +        )?;
> +        kernel::dma_write!(fmc_boot_params[0].gsp_rm_params.boot_args_offset = libos_addr)?;
> +
> +        Ok(fmc_boot_params)
> +    }
> +
> +    /// Boot GSP FMC with pre-extracted signatures.
> +    ///
> +    /// This version takes pre-extracted signatures and FMC image data.
> +    /// Used when signatures are extracted separately from the full ELF file.
> +    #[allow(clippy::too_many_arguments)]

Maybe we should just add an FmcBootArgs type with a corresponding constructor.
This should also let us get rid of the helper function create_fmc_boot_params().

> +    pub(crate) fn boot_gsp_fmc_with_signatures(
>          dev: &device::Device<device::Bound>,
>          bar: &crate::driver::Bar0,
> +        chipset: crate::gpu::Chipset,
> +        fmc_image_fw: &crate::dma::DmaObject, // Contains only the image section
> +        fmc_boot_params: &kernel::dma::CoherentAllocation<GspFmcBootParams>,
> +        total_reserved_size: u64,
> +        resume: bool,
>          fsp_falcon: &crate::falcon::Falcon<crate::falcon::fsp::Fsp>,
> -        nvdm_type: u32,
> -        packet: &[u8],
> +        signatures: &FmcSignatures,
>      ) -> Result<()> {
> +        dev_dbg!(dev, "Starting FSP boot sequence for {}\n", chipset);
> +
> +        // Build FSP Chain of Trust message
> +        let fmc_addr = fmc_image_fw.dma_handle(); // Now points to image data only
> +        let fmc_boot_params_addr = fmc_boot_params.dma_handle();
> +
> +        // frts_offset is relative to FB end: FRTS_location = FB_END - frts_offset
> +        let frts_offset = if !resume {
> +            let mut frts_reserved_size =
> +                if let Some(heap_size) = crate::fb::hal::fb_hal(chipset).non_wpr_heap_size() {
> +                    u64::from(heap_size)
> +                } else {
> +                    total_reserved_size
> +                };
> +
> +            // Add PMU reserved size
> +            frts_reserved_size += u64::from(crate::fb::PMU_RESERVED_SIZE);
> +
> +            frts_reserved_size
> +                .align_up(Alignment::new::<SZ_2M>())
> +                .unwrap_or(frts_reserved_size)
> +        } else {
> +            0
> +        };
> +        let frts_size = if !resume { SZ_1M as u32 } else { 0 };
> +
> +        // Build the FSP message

This comment seems superfluous.

> +        let msg = KBox::new(
> +            FspMessage {
> +                mctp_header: (mctp::HEADER_SOM << mctp::HEADER_SOM_SHIFT)
> +                    | (mctp::HEADER_EOM << mctp::HEADER_EOM_SHIFT)
> +                    | (mctp::HEADER_SEID << mctp::HEADER_SEID_SHIFT)
> +                    | (mctp::HEADER_SEQ << mctp::HEADER_SEQ_SHIFT),
> +
> +                nvdm_header: (mctp::MSG_TYPE_VENDOR_PCI)
> +                    | (mctp::VENDOR_ID_NV << mctp::NVDM_VENDOR_ID_SHIFT)
> +                    | (mctp::NVDM_TYPE_COT << mctp::NVDM_TYPE_SHIFT),
> +
> +                cot: NvdmPayloadCot {
> +                    version: chipset.fsp_cot_version(),
> +                    size: core::mem::size_of::<NvdmPayloadCot>() as u16,
> +                    gsp_fmc_sysmem_offset: fmc_addr,
> +                    frts_sysmem_offset: 0,
> +                    frts_sysmem_size: 0,
> +                    frts_vidmem_offset: frts_offset,
> +                    frts_vidmem_size: frts_size,
> +                    hash384: signatures.hash384,
> +                    public_key: signatures.public_key,
> +                    signature: signatures.signature,
> +                    gsp_boot_args_sysmem_offset: fmc_boot_params_addr,
> +                },
> +            },
> +            GFP_KERNEL,
> +        )?;
> +
> +        // Send COT message to FSP and wait for response
> +        Self::send_sync_fsp(dev, bar, fsp_falcon, &*msg)?;
> +
> +        dev_dbg!(dev, "FSP Chain of Trust completed successfully\n");
> +        Ok(())
> +    }

<snip>

> diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
> index f04e2a795e90..88b1546e3cb4 100644
> --- a/drivers/gpu/nova-core/gpu.rs
> +++ b/drivers/gpu/nova-core/gpu.rs
> @@ -124,6 +124,18 @@ pub(crate) const fn arch(&self) -> Architecture {
>              | Self::GB207 => Architecture::Blackwell,
>          }
>      }
> +
> +    /// Returns the FSP Chain of Trust (COT) protocol version for this chipset.
> +    ///
> +    /// Hopper (GH100) uses version 1, Blackwell uses version 2.
> +    pub(crate) const fn fsp_cot_version(&self) -> u16 {
> +        match self.arch() {
> +            Architecture::Hopper => 1,
> +            Architecture::Blackwell => 2,
> +            // Other architectures don't use FSP COT
> +            _ => 0,

I think we should use a new type to represent this version and use Option, i.e.
return Option<FspCotVersion>.
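
A minimal sketch of what that could look like, for illustration only. The
`Architecture` enum here is a cut-down stand-in for the one in gpu.rs, and
`FspCotVersion` and its `raw()` accessor are hypothetical names, not existing
nova-core API:

```rust
/// Cut-down stand-in for nova-core's `Architecture` (assumption).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Architecture {
    Turing,
    Ampere,
    Ada,
    Hopper,
    Blackwell,
}

/// Hypothetical newtype for the FSP Chain of Trust protocol version.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct FspCotVersion(u16);

impl FspCotVersion {
    /// Returns the raw value as it would be written into the COT payload.
    pub const fn raw(self) -> u16 {
        self.0
    }
}

/// Returns the COT version, or `None` for architectures that do not use FSP.
pub const fn fsp_cot_version(arch: Architecture) -> Option<FspCotVersion> {
    match arch {
        Architecture::Hopper => Some(FspCotVersion(1)),
        Architecture::Blackwell => Some(FspCotVersion(2)),
        // No magic 0: callers must handle the "no FSP COT" case explicitly.
        _ => None,
    }
}
```

With this shape, a caller building the COT message would have to decide what
to do with `None` (most likely return an error) instead of silently sending
version 0.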

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 25/33] gpu: nova-core: Hopper/Blackwell: larger non-WPR heap
  2026-02-10  2:45 ` [PATCH v4 25/33] gpu: nova-core: Hopper/Blackwell: larger non-WPR heap John Hubbard
@ 2026-02-17 20:04   ` Danilo Krummrich
  2026-02-20 23:57     ` John Hubbard
  0 siblings, 1 reply; 66+ messages in thread
From: Danilo Krummrich @ 2026-02-17 20:04 UTC (permalink / raw)
  To: John Hubbard
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On Tue Feb 10, 2026 at 3:45 AM CET, John Hubbard wrote:
> Hopper, Blackwell and later require more space for the non-WPR heap.
>
> Add a new FbHal method to return the non-WPR heap size, and create a new
> GH100 HAL for Hopper and GB100 HAL for Blackwell that return the
> appropriate value for each GPU architecture.
>
> Cc: Timur Tabi <ttabi@nvidia.com>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  drivers/gpu/nova-core/fb.rs           | 14 +++++++---
>  drivers/gpu/nova-core/fb/hal.rs       |  7 +++--
>  drivers/gpu/nova-core/fb/hal/ga102.rs |  2 +-
>  drivers/gpu/nova-core/fb/hal/gb100.rs | 37 +++++++++++++++++++++++++++
>  drivers/gpu/nova-core/fb/hal/gh100.rs | 37 +++++++++++++++++++++++++++
>  5 files changed, 91 insertions(+), 6 deletions(-)
>  create mode 100644 drivers/gpu/nova-core/fb/hal/gb100.rs
>  create mode 100644 drivers/gpu/nova-core/fb/hal/gh100.rs
>
> diff --git a/drivers/gpu/nova-core/fb.rs b/drivers/gpu/nova-core/fb.rs
> index 3a2b79a5c107..7c502f15622c 100644
> --- a/drivers/gpu/nova-core/fb.rs
> +++ b/drivers/gpu/nova-core/fb.rs
> @@ -98,6 +98,15 @@ pub(crate) fn unregister(&self, bar: &Bar0) {
>      }
>  }
>  
> +/// Calculate non-WPR heap size based on chipset architecture.
> +/// This matches the logic used in FSP for consistency.
> +pub(crate) fn calc_non_wpr_heap_size(chipset: Chipset) -> u64 {
> +    hal::fb_hal(chipset)
> +        .non_wpr_heap_size()
> +        .map(u64::from)
> +        .unwrap_or(SZ_1M as u64)

This should use u64::from_safe_cast().

Also, as I already brought up in the context of GPU buddy, I wonder if we
should just add SZ_* constants for 64-bit devices. It shouldn't be too hard to
generate the corresponding code.

I think it is a repeating pattern, and having to use u64::from_safe_cast() all
the time seems cumbersome.
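
Roughly what the 64-bit constants idea could look like, as a sketch only. The
`sizes_u64` module name is hypothetical, and the function below takes the HAL
result as a plain `Option<u32>` argument so that the example stands alone:

```rust
/// Hypothetical module holding `u64` twins of the usual SZ_* constants.
pub mod sizes_u64 {
    pub const SZ_1M: u64 = 1 << 20;
    pub const SZ_2M: u64 = 1 << 21;
    pub const SZ_1G: u64 = 1 << 30;
}

/// Stand-alone version of the helper from the patch: `hal_size` plays the
/// role of `hal::fb_hal(chipset).non_wpr_heap_size()`.
pub fn calc_non_wpr_heap_size(hal_size: Option<u32>) -> u64 {
    // No cast needed at the use site, since the constant is already a u64.
    hal_size.map(u64::from).unwrap_or(sizes_u64::SZ_1M)
}
```

The point is only that the cast disappears from every call site; how the
constants would actually be generated is left open.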

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 27/33] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap
  2026-02-10  2:45 ` [PATCH v4 27/33] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap John Hubbard
@ 2026-02-17 20:10   ` Danilo Krummrich
  2026-02-21  1:01     ` John Hubbard
  0 siblings, 1 reply; 66+ messages in thread
From: Danilo Krummrich @ 2026-02-17 20:10 UTC (permalink / raw)
  To: John Hubbard
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On Tue Feb 10, 2026 at 3:45 AM CET, John Hubbard wrote:
> @@ -74,7 +94,7 @@ fn management_overhead(fb_size: u64) -> u64 {
>          u64::from(bindings::GSP_FW_HEAP_PARAM_SIZE_PER_GB_FB)
>              .saturating_mul(fb_size_gb)
>              .align_up(GSP_HEAP_ALIGNMENT)
> -            .unwrap_or(u64::MAX)
> +            .expect("management_overhead alignment overflow")

If I'm not mistaken, the relevant value for this calculation (fb_size)
ultimately comes from the hardware through a register read. We shouldn't panic
on that, but rather handle it as an error if the read value is not plausible.
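
A sketch of the error-returning variant, under stated assumptions: the two
constants are placeholders standing in for GSP_FW_HEAP_PARAM_SIZE_PER_GB_FB
and GSP_HEAP_ALIGNMENT, `HeapError` is a hypothetical error type (the driver
would use a kernel error code), and `checked_mul` replaces the
`saturating_mul` of the patch so that overflow is reported rather than
clamped:

```rust
// Placeholder values (assumptions), not taken from the bindings.
const HEAP_SIZE_PER_GB_FB: u64 = 96 << 10;
const HEAP_ALIGNMENT: u64 = 1 << 20;
const SZ_1G: u64 = 1 << 30;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum HeapError {
    ImplausibleFbSize,
}

/// Aligns `value` up to `align` (a power of two); `None` on overflow.
pub fn align_up(value: u64, align: u64) -> Option<u64> {
    debug_assert!(align.is_power_of_two());
    value.checked_add(align - 1).map(|v| v & !(align - 1))
}

/// Computes the management overhead, returning an error instead of panicking
/// when the arithmetic overflows.
pub fn management_overhead(fb_size: u64) -> Result<u64, HeapError> {
    let fb_size_gb = fb_size.div_ceil(SZ_1G);

    HEAP_SIZE_PER_GB_FB
        .checked_mul(fb_size_gb)
        .and_then(|bytes| align_up(bytes, HEAP_ALIGNMENT))
        .ok_or(HeapError::ImplausibleFbSize)
}
```

The caller can then propagate the error with `?` rather than taking the whole
system down because of an implausible register value.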

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 28/33] gpu: nova-core: refactor SEC2 booter loading into run_booter() helper
  2026-02-10  2:45 ` [PATCH v4 28/33] gpu: nova-core: refactor SEC2 booter loading into run_booter() helper John Hubbard
@ 2026-02-17 20:12   ` Danilo Krummrich
  2026-02-21  1:03     ` John Hubbard
  0 siblings, 1 reply; 66+ messages in thread
From: Danilo Krummrich @ 2026-02-17 20:12 UTC (permalink / raw)
  To: John Hubbard
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On Tue Feb 10, 2026 at 3:45 AM CET, John Hubbard wrote:
> diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
> index 465c18e4c888..6191986fc6b5 100644
> --- a/drivers/gpu/nova-core/gsp/boot.rs
> +++ b/drivers/gpu/nova-core/gsp/boot.rs
> @@ -120,6 +120,40 @@ fn run_fwsec_frts(
>          }
>      }
>  
> +    fn run_booter(
> +        dev: &device::Device<device::Bound>,
> +        bar: &Bar0,
> +        chipset: Chipset,
> +        sec2_falcon: &Falcon<Sec2>,
> +        wpr_meta: &CoherentAllocation<GspFwWprMeta>,
> +    ) -> Result {
> +        let booter_loader = BooterFirmware::new(
> +            dev,
> +            BooterKind::Loader,
> +            chipset,
> +            FIRMWARE_VERSION,
> +            sec2_falcon,
> +            bar,
> +        )?;

Maybe we should just make the part below a method of BooterFirmware, i.e.
BooterFirmware::run()?

> +        sec2_falcon.reset(bar)?;
> +        sec2_falcon.load(bar, &booter_loader)?;
> +        let wpr_handle = wpr_meta.dma_handle();
> +        let (mbox0, mbox1) = sec2_falcon.boot(
> +            bar,
> +            Some(wpr_handle as u32),
> +            Some((wpr_handle >> 32) as u32),
> +        )?;
> +        dev_dbg!(dev, "SEC2 MBOX0: {:#x}, MBOX1: {:#x}\n", mbox0, mbox1);
> +
> +        if mbox0 != 0 {
> +            dev_err!(dev, "Booter-load failed with error {:#x}\n", mbox0);
> +            return Err(ENODEV);
> +        }
> +
> +        Ok(())
> +    }

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 29/33] gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling
  2026-02-10  2:45 ` [PATCH v4 29/33] gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling John Hubbard
@ 2026-02-17 20:20   ` Danilo Krummrich
  2026-02-21  1:06     ` John Hubbard
  0 siblings, 1 reply; 66+ messages in thread
From: Danilo Krummrich @ 2026-02-17 20:20 UTC (permalink / raw)
  To: John Hubbard
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On Tue Feb 10, 2026 at 3:45 AM CET, John Hubbard wrote:
> +    /// Check if GSP lockdown has been released after FSP Chain of Trust
> +    fn gsp_lockdown_released(
> +        dev: &device::Device,
> +        gsp_falcon: &Falcon<Gsp>,
> +        bar: &Bar0,
> +        fmc_boot_params_addr: u64,
> +        mbox0: &mut u32,
> +    ) -> bool {
> +        // Read GSP falcon mailbox0
> +        *mbox0 = gsp_falcon.read_mailbox0(bar);
> +
> +        // Check 1: If mbox0 has 0xbadf4100 pattern, GSP is still locked down
> +        if *mbox0 != 0 && (*mbox0 & 0xffffff00) == 0xbadf4100 {
> +            return false;
> +        }

Hm...we could create a tiny type wrapper around this value, and do the check
with a method, such as Mbox::is_locked_down(&self).

> +        // Check 2: If mbox0 has a value, check if it's an error
> +        if *mbox0 != 0 {
> +            let mbox1 = gsp_falcon.read_mailbox1(bar);
> +
> +            let combined_addr = (u64::from(mbox1) << 32) | u64::from(*mbox0);

This could also be part of the type.
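
To illustrate both remarks, a minimal sketch of such a wrapper. `Mbox0` and
its method names are hypothetical; only the 0xbadf4100 pattern, the
0xffffff00 mask and the mbox1/mbox0 combination are taken from the patch:

```rust
/// Hypothetical wrapper around the raw GSP falcon MAILBOX0 value.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Mbox0(u32);

impl Mbox0 {
    const LOCKDOWN_MASK: u32 = 0xffff_ff00;
    const LOCKDOWN_PATTERN: u32 = 0xbadf_4100;

    pub const fn new(raw: u32) -> Self {
        Self(raw)
    }

    /// True while the mailbox still carries the "locked down" pattern.
    pub const fn is_locked_down(self) -> bool {
        (self.0 & Self::LOCKDOWN_MASK) == Self::LOCKDOWN_PATTERN
    }

    /// True if nothing has been written to the mailbox.
    pub const fn is_empty(self) -> bool {
        self.0 == 0
    }

    /// Combines this value (low 32 bits) with MAILBOX1 (high 32 bits).
    pub const fn combined_addr(self, mbox1: u32) -> u64 {
        ((mbox1 as u64) << 32) | (self.0 as u64)
    }
}
```

The separate `mbox0 != 0` test from the patch is not needed in
`is_locked_down()`, because the pattern itself is non-zero.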

> +            if combined_addr != fmc_boot_params_addr {
> +                // Address doesn't match - GSP wrote an error code
> +                // Return TRUE (lockdown released) with error
> +                dev_dbg!(
> +                    dev,
> +                    "GSP lockdown error: mbox0={:#x}, combined_addr={:#x}, expected={:#x}\n",
> +                    *mbox0,
> +                    combined_addr,
> +                    fmc_boot_params_addr
> +                );
> +                return true;
> +            }
> +        }
> +
> +        // Check 3: Verify HWCFG2 RISCV_BR_PRIV_LOCKDOWN bit is clear
> +        let hwcfg2 = regs::NV_PFALCON_FALCON_HWCFG2::read(bar, &crate::falcon::gsp::Gsp::ID);
> +        !hwcfg2.riscv_br_priv_lockdown()
> +    }

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 17/33] gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations
  2026-02-17 15:43       ` Danilo Krummrich
@ 2026-02-19  2:54         ` John Hubbard
  0 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-19  2:54 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On 2/17/26 7:43 AM, Danilo Krummrich wrote:
> On Thu Feb 12, 2026 at 3:09 AM CET, John Hubbard wrote:
>> On 2/11/26 2:57 AM, Danilo Krummrich wrote:
>>> On Tue Feb 10, 2026 at 3:45 AM CET, John Hubbard wrote:
>> ...
>>> We may not get the full potential right away, as IoView is still WIP, but I
>>> think it makes sense to consider it for a follow-up.
>>>
>>
>> Yes, let's keep this in mind.
> 
> Just to clarify, I mean we should implement Io and IoCapable right away, but
> leave the rest as follow-up.

OK. Done in v5!

thanks,
-- 
John Hubbard


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 19/33] gpu: nova-core: Hopper/Blackwell: calculate reserved FB heap size
  2026-02-17 16:39   ` Danilo Krummrich
@ 2026-02-19  3:01     ` John Hubbard
  2026-02-19  9:01       ` Miguel Ojeda
  0 siblings, 1 reply; 66+ messages in thread
From: John Hubbard @ 2026-02-19  3:01 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On 2/17/26 8:39 AM, Danilo Krummrich wrote:
> On Tue Feb 10, 2026 at 3:45 AM CET, John Hubbard wrote:
>> diff --git a/drivers/gpu/nova-core/num.rs b/drivers/gpu/nova-core/num.rs
>> index c952a834e662..f068722c5bdf 100644
>> --- a/drivers/gpu/nova-core/num.rs
>> +++ b/drivers/gpu/nova-core/num.rs
>> @@ -215,3 +215,13 @@ pub(crate) const fn [<$from _into_ $into>]<const N: $from>() -> $into {
>>  impl_const_into!(u64 => { u8, u16, u32 });
>>  impl_const_into!(u32 => { u8, u16 });
>>  impl_const_into!(u16 => { u8 });
>> +
>> +/// Aligns `value` up to `ALIGN` at compile time.
>> +///
>> +/// This is the const-compatible equivalent of [`kernel::ptr::Alignable::align_up`].
>> +/// `ALIGN` must be a power of two (enforced at compile time).
>> +#[inline(always)]
>> +pub(crate) const fn const_align_up<const ALIGN: usize>(value: usize) -> usize {
> 
> We should probably just add this as a function to Alignable.

So that would go in via a separate git branch? And therefore probably
I should send that change separately?

Or should I just create a new patch for rust/kernel/ptr.rs as part of
this series?

> 
>> +    build_assert!(ALIGN.is_power_of_two());
> 
> ALIGN is a const generic, hence this should be:
> 
> 	const { core::assert!(ALIGN.is_power_of_two(), "ALIGN must be a power of two") };

That doesn't compile on rustc 1.78.0:

error[E0658]: inline-const is experimental
   --> drivers/gpu/nova-core/num.rs:225:5
    |
225 |     const { core::assert!(ALIGN.is_power_of_two(), "ALIGN must be a power of two") };
    |     ^^^^^
    |
    = note: see issue #76001 <https://github.com/rust-lang/rust/issues/76001> for more information
    = help: add `#![feature(inline_const)]` to the crate attributes to enable
    = note: this compiler was built on 2024-04-29; consider upgrading it if it is out of date

error: aborting due to 1 previous error

For more information about this error, try `rustc --explain E0658`.
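
For reference, one way to get a compile-time check on a const generic without
inline const is the associated-const pattern sketched below. This is only an
illustration of the technique, not what the series does; `AssertPowerOfTwo`
is a hypothetical helper name:

```rust
/// Hypothetical helper: evaluating `OK` fails the build unless `ALIGN` is a
/// power of two.
struct AssertPowerOfTwo<const ALIGN: usize>;

impl<const ALIGN: usize> AssertPowerOfTwo<ALIGN> {
    const OK: () = assert!(ALIGN.is_power_of_two(), "ALIGN must be a power of two");
}

/// Aligns `value` up to `ALIGN`, with `ALIGN` validated at compile time.
pub const fn const_align_up<const ALIGN: usize>(value: usize) -> usize {
    // Referencing the associated const forces its evaluation for each
    // instantiation of `ALIGN`.
    let () = AssertPowerOfTwo::<ALIGN>::OK;

    (value + (ALIGN - 1)) & !(ALIGN - 1)
}
```

Something like `const_align_up::<3>(1)` is then rejected at build time, while
`const_align_up::<16>(17)` evaluates to 32.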

> 
>> +    (value + (ALIGN - 1)) & !(ALIGN - 1)
>> +}

thanks,
-- 
John Hubbard


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 19/33] gpu: nova-core: Hopper/Blackwell: calculate reserved FB heap size
  2026-02-19  3:01     ` John Hubbard
@ 2026-02-19  9:01       ` Miguel Ojeda
  2026-02-20 22:08         ` John Hubbard
  0 siblings, 1 reply; 66+ messages in thread
From: Miguel Ojeda @ 2026-02-19  9:01 UTC (permalink / raw)
  To: John Hubbard, Gary Guo
  Cc: Danilo Krummrich, Alexandre Courbot, Joel Fernandes, Timur Tabi,
	Alistair Popple, Eliot Courtney, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Alice Ryhl, Trevor Gross, nouveau, rust-for-linux, LKML

On Thu, Feb 19, 2026 at 4:02 AM John Hubbard <jhubbard@nvidia.com> wrote:
>
> That doesn't compile on rustc 1.78.0:
>
> error[E0658]: inline-const is experimental
>    --> drivers/gpu/nova-core/num.rs:225:5
>     |
> 225 |     const { core::assert!(ALIGN.is_power_of_two(), "ALIGN must be a power of two") };
>     |     ^^^^^
>     |
>     = note: see issue #76001 <https://github.com/rust-lang/rust/issues/76001> for more information
>     = help: add `#![feature(inline_const)]` to the crate attributes to enable
>     = note: this compiler was built on 2024-04-29; consider upgrading it if it is out of date
>
> error: aborting due to 1 previous error
>
> For more information about this error, try `rustc --explain E0658`.

Please feel free to add `inline_const` to `rust_allowed_features` in
`scripts/Makefile.build`.

We are going to do it here:

  https://lore.kernel.org/rust-for-linux/20260206171253.2704684-2-gary@kernel.org/

Cheers,
Miguel

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Tegra notes for Nova: [PATCH v4 18/33] gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure
  2026-02-17 16:28   ` Danilo Krummrich
@ 2026-02-20 22:05     ` John Hubbard
  2026-02-23  3:36       ` Alexandre Courbot
  0 siblings, 1 reply; 66+ messages in thread
From: John Hubbard @ 2026-02-20 22:05 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML, Vikram Sethi

On 2/17/26 8:28 AM, Danilo Krummrich wrote:
> On Tue Feb 10, 2026 at 3:45 AM CET, John Hubbard wrote:
>> +    pub(crate) fn poll_msgq(&self, bar: &Bar0) -> u32 {
>> +    pub(crate) fn send_msg(&self, bar: &Bar0, packet: &[u8]) -> Result {
>> +    pub(crate) fn recv_msg(&self, bar: &Bar0, buffer: &mut [u8], size: usize) -> Result<usize> {
> 
> Just a quick note, since I just reminded myself of this: We should keep in mind
> that at some point we have to replace most (if not all) &Bar0 usages with
> &Mmio<SIZE> as nova-core will also support platform devices.
> 
> I think Tegra chips with GSP-based GPU IPs have a compatible register layout,
> right?

Well...I'd characterize it a little differently. Here are some notes
from a recent discussion with Tegra, to help shed some light on that:

(+Cc: Vikram Sethi, to keep things accurate.)

Some Tegra chips have a Blackwell iGPU that looks like a PCIe device and
acts like a PCIe device, and has GSP, just like a dGPU (discrete GPU).

iGPUs don’t have FSP, nor a separate VBIOS, nor even vidmem. So the boot
sequence for them is different, but only in the first couple of steps,
where the SoC root of trust does the work of FSP, and boot skips any
vidmem-related code.

Tegra display is sufficiently different (with several quirky variations
too) from dGPU that it's an open question whether Nova is even the right
driver to use for that aspect.

The register layouts for Tegra iGPUs vs dGPUs have been similar
(remember that similar doesn't mean "the same") since the Kepler GPUs.

thanks,
-- 
John Hubbard

> 
> (I will prepare a patch to address this for the existing codebase.)

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 19/33] gpu: nova-core: Hopper/Blackwell: calculate reserved FB heap size
  2026-02-19  9:01       ` Miguel Ojeda
@ 2026-02-20 22:08         ` John Hubbard
  0 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-20 22:08 UTC (permalink / raw)
  To: Miguel Ojeda, Gary Guo
  Cc: Danilo Krummrich, Alexandre Courbot, Joel Fernandes, Timur Tabi,
	Alistair Popple, Eliot Courtney, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Alice Ryhl, Trevor Gross, nouveau, rust-for-linux, LKML

On 2/19/26 1:01 AM, Miguel Ojeda wrote:
> On Thu, Feb 19, 2026 at 4:02 AM John Hubbard <jhubbard@nvidia.com> wrote:
>>
>> That doesn't compile on rustc 1.78.0:
>>
>> error[E0658]: inline-const is experimental
>>    --> drivers/gpu/nova-core/num.rs:225:5
>>     |
>> 225 |     const { core::assert!(ALIGN.is_power_of_two(), "ALIGN must be a power of two") };
>>     |     ^^^^^
>>     |
>>     = note: see issue #76001 <https://github.com/rust-lang/rust/issues/76001> for more information
>>     = help: add `#![feature(inline_const)]` to the crate attributes to enable
>>     = note: this compiler was built on 2024-04-29; consider upgrading it if it is out of date
>>
>> error: aborting due to 1 previous error
>>
>> For more information about this error, try `rustc --explain E0658`.
> 
> Please feel free to add `inline_const` to `rust_allowed_features` in
> `scripts/Makefile.build`.
> 

OK, then! Done in v5.

thanks,
-- 
John Hubbard

> We are going to do it here:
> 
>   https://lore.kernel.org/rust-for-linux/20260206171253.2704684-2-gary@kernel.org/
> 
> Cheers,
> Miguel



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 20/33] gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting
  2026-02-17 17:13   ` Danilo Krummrich
@ 2026-02-20 23:26     ` John Hubbard
  0 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-20 23:26 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On 2/17/26 9:13 AM, Danilo Krummrich wrote:
> On Tue Feb 10, 2026 at 3:45 AM CET, John Hubbard wrote:
>> +/// MCTP (Management Component Transport Protocol) header values for FSP communication.
>> +pub(crate) mod mctp {
>> +    pub(super) const HEADER_SOM: u32 = 1; // Start of Message
>> +    pub(super) const HEADER_EOM: u32 = 1; // End of Message
>> +    pub(super) const HEADER_SEID: u32 = 0; // Source Endpoint ID
>> +    pub(super) const HEADER_SEQ: u32 = 0; // Sequence number
> 
> This looks like it should be a MctpMessageType enum instead.

Done. Or its moral equivalent, anyway: I've overhauled this whole
thing, put it in its own module and file and patch, and used types
throughout.

> 
>> +
>> +    pub(super) const MSG_TYPE_VENDOR_PCI: u32 = 0x7e;
>> +    pub(super) const VENDOR_ID_NV: u32 = 0x10de;
>> +    pub(super) const NVDM_TYPE_COT: u32 = 0x14;
>> +    pub(super) const NVDM_TYPE_FSP_RESPONSE: u32 = 0x15;
> 
> This seems like it should be a different type (or even types) as they are
> specific header field values.

I think you'll like the new arrangement in v5.

> 
>> +}
>> +
>> +/// GSP FMC boot parameters structure.
>> +/// This is what FSP expects to receive for booting GSP-RM.
>> +/// GSP FMC initialization parameters.
>> +#[repr(C)]
>> +#[derive(Debug, Clone, Copy, Default)]
>> +struct GspFmcInitParams {
>> +    /// CC initialization "registry keys"
>> +    regkeys: u32,
>> +}
>> +
>> +// SAFETY: GspFmcInitParams is a simple C struct with only primitive types.
>> +unsafe impl AsBytes for GspFmcInitParams {}
>> +// SAFETY: All bit patterns are valid for the primitive fields.
>> +unsafe impl FromBytes for GspFmcInitParams {}
>> +
>> +/// GSP ACR (Authenticated Code RAM) boot parameters.
>> +#[repr(C)]
>> +#[derive(Debug, Clone, Copy, Default)]
>> +struct GspAcrBootGspRmParams {
>> +    /// Physical memory aperture through which gspRmDescPa is accessed
> 
> For the whole series, please end them with a period. Personally, I don't care
> too much, but it's a convention we have for Rust code.

Fixed.

> 
>> +/// FSP interface for Hopper/Blackwell GPUs.
>> +pub(crate) struct Fsp;
>> +
>> +impl Fsp {
>> +    /// Wait for FSP secure boot completion.
>> +    ///
>> +    /// Polls the thermal scratch register until FSP signals boot completion
>> +    /// or timeout occurs.
>> +    pub(crate) fn wait_secure_boot(
>> +        dev: &device::Device<device::Bound>,
>> +        bar: &crate::driver::Bar0,
>> +        arch: crate::gpu::Architecture,
>> +    ) -> Result<()> {
> 
> Please just use 'Result'.

Fixed everywhere. (And more than once, since it came back a couple
of times during a rebase glitch just now, haha.)

> 
>> +        let timeout = Delta::from_millis(FSP_SECURE_BOOT_TIMEOUT_MS);
>> +
>> +        read_poll_timeout(
>> +            || crate::regs::read_fsp_boot_complete_status(bar, arch),
> 
> Let's just import regs.

Done.

> 
>> +            |&status| {
>> +                dev_dbg!(
>> +                    dev,
>> +                    "FSP I2CS scratch register status: {:#x} (expected: {:#x})\n",
>> +                    status,
>> +                    FSP_BOOT_COMPLETE_SUCCESS
>> +                );
> 
> I don't think we should have print statements in the condition closure. Even for
> debug prints we don't want to spam the console.

Done.

> 
>> +                status == FSP_BOOT_COMPLETE_SUCCESS
>> +            },
>> +            Delta::ZERO,
>> +            timeout,
>> +        )
>> +        .map_err(|_| {
>> +            let final_status =
>> +                crate::regs::read_fsp_boot_complete_status(bar, arch).unwrap_or(0xDEADBEEF);
> 
> I think read_fsp_boot_complete_status() should just return an Option.

Done.

> 
> Also, do we really care about the actual value if we time out? If so, this won't
> work reliably, i.e. final_status could have changed to
> FSP_BOOT_COMPLETE_SUCCESS.
> 
>> +            dev_err!(
>> +                dev,
>> +                "FSP secure boot completion timeout - final status: {:#x}\n",
>> +                final_status
>> +            );
>> +            ETIMEDOUT
> 
> Even if it is the wrong architecture we return ETIMEDOUT? Actually, since FSP is
> for Hopper/Blackwell only/plus, we may even want to print a warning and use
> debug_assert().

Right, I added a debug_assert(), and removed the re-read in the error case.

> 
>> +        })
>> +        .map(|_| ())
>> +    }
>> +}

thanks,
-- 
John Hubbard


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 24/33] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot
  2026-02-17 18:16   ` Danilo Krummrich
@ 2026-02-20 23:35     ` John Hubbard
  0 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-20 23:35 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On 2/17/26 10:16 AM, Danilo Krummrich wrote:
> On Tue Feb 10, 2026 at 3:45 AM CET, John Hubbard wrote:
>> +    /// Creates FMC boot parameters structure for FSP.
>> +    ///
>> +    /// This structure tells FSP how to boot GSP-RM with the correct memory layout.
>> +    pub(crate) fn create_fmc_boot_params(
>> +        dev: &device::Device<device::Bound>,
>> +        wpr_meta_addr: u64,
>> +        wpr_meta_size: u32,
>> +        libos_addr: u64,
>> +    ) -> Result<kernel::dma::CoherentAllocation<GspFmcBootParams>> {
>> +        use kernel::dma::CoherentAllocation;
>> +
>> +        const GSP_DMA_TARGET_COHERENT_SYSTEM: u32 = 1;
>> +        const GSP_DMA_TARGET_NONCOHERENT_SYSTEM: u32 = 2;
>> +
>> +        let fmc_boot_params = CoherentAllocation::<GspFmcBootParams>::alloc_coherent(
>> +            dev,
>> +            1,
>> +            GFP_KERNEL | __GFP_ZERO,
>> +        )?;
> 
> I've mentioned this in another context already (where it doesn't work
> unfortunately), but I think we should add a constructor that takes a closure
> with a &mut [T] argument, so we don't have to use dma_write!() for
> initialization. If you want I can prepare a patch.

Yes please. I'm up to 37 patches in this series now and am starting
to worry about it getting even larger.

> 
>> +
>> +        // Configure ACR boot parameters (WPR metadata location) using dma_write! macro
>> +        kernel::dma_write!(
>> +            fmc_boot_params[0].boot_gsp_rm_params.target = GSP_DMA_TARGET_COHERENT_SYSTEM
>> +        )?;
>> +        kernel::dma_write!(
>> +            fmc_boot_params[0].boot_gsp_rm_params.gsp_rm_desc_offset = wpr_meta_addr
>> +        )?;
>> +        kernel::dma_write!(fmc_boot_params[0].boot_gsp_rm_params.gsp_rm_desc_size = wpr_meta_size)?;
>> +
>> +        // Blackwell FSP expects wpr_carveout_offset and wpr_carveout_size to be zero;
>> +        // it obtains WPR info from other sources.
>> +        kernel::dma_write!(fmc_boot_params[0].boot_gsp_rm_params.b_is_gsp_rm_boot = 1)?;
>> +
>> +        // Configure RM parameters (libos location) using dma_write! macro
>> +        kernel::dma_write!(
>> +            fmc_boot_params[0].gsp_rm_params.target = GSP_DMA_TARGET_NONCOHERENT_SYSTEM
>> +        )?;
>> +        kernel::dma_write!(fmc_boot_params[0].gsp_rm_params.boot_args_offset = libos_addr)?;
>> +
>> +        Ok(fmc_boot_params)
>> +    }
>> +
>> +    /// Boot GSP FMC with pre-extracted signatures.
>> +    ///
>> +    /// This version takes pre-extracted signatures and FMC image data.
>> +    /// Used when signatures are extracted separately from the full ELF file.
>> +    #[allow(clippy::too_many_arguments)]
> 
> Maybe we should just add a FmcBootArgs type with a corresponding constructor.
> This should also let us get rid of the helper function create_fmc_boot_params().

Done.

> 
>> +    pub(crate) fn boot_gsp_fmc_with_signatures(
>>          dev: &device::Device<device::Bound>,
>>          bar: &crate::driver::Bar0,
>> +        chipset: crate::gpu::Chipset,
>> +        fmc_image_fw: &crate::dma::DmaObject, // Contains only the image section
>> +        fmc_boot_params: &kernel::dma::CoherentAllocation<GspFmcBootParams>,
>> +        total_reserved_size: u64,
>> +        resume: bool,
>>          fsp_falcon: &crate::falcon::Falcon<crate::falcon::fsp::Fsp>,
>> -        nvdm_type: u32,
>> -        packet: &[u8],
>> +        signatures: &FmcSignatures,
>>      ) -> Result<()> {
>> +        dev_dbg!(dev, "Starting FSP boot sequence for {}\n", chipset);
>> +
>> +        // Build FSP Chain of Trust message
>> +        let fmc_addr = fmc_image_fw.dma_handle(); // Now points to image data only
>> +        let fmc_boot_params_addr = fmc_boot_params.dma_handle();
>> +
>> +        // frts_offset is relative to FB end: FRTS_location = FB_END - frts_offset
>> +        let frts_offset = if !resume {
>> +            let mut frts_reserved_size =
>> +                if let Some(heap_size) = crate::fb::hal::fb_hal(chipset).non_wpr_heap_size() {
>> +                    u64::from(heap_size)
>> +                } else {
>> +                    total_reserved_size
>> +                };
>> +
>> +            // Add PMU reserved size
>> +            frts_reserved_size += u64::from(crate::fb::PMU_RESERVED_SIZE);
>> +
>> +            frts_reserved_size
>> +                .align_up(Alignment::new::<SZ_2M>())
>> +                .unwrap_or(frts_reserved_size)
>> +        } else {
>> +            0
>> +        };
>> +        let frts_size = if !resume { SZ_1M as u32 } else { 0 };
>> +
>> +        // Build the FSP message
> 
> This comment seems superfluous.

Removed.

> 
>> +        let msg = KBox::new(
>> +            FspMessage {
>> +                mctp_header: (mctp::HEADER_SOM << mctp::HEADER_SOM_SHIFT)
>> +                    | (mctp::HEADER_EOM << mctp::HEADER_EOM_SHIFT)
>> +                    | (mctp::HEADER_SEID << mctp::HEADER_SEID_SHIFT)
>> +                    | (mctp::HEADER_SEQ << mctp::HEADER_SEQ_SHIFT),
>> +
>> +                nvdm_header: (mctp::MSG_TYPE_VENDOR_PCI)
>> +                    | (mctp::VENDOR_ID_NV << mctp::NVDM_VENDOR_ID_SHIFT)
>> +                    | (mctp::NVDM_TYPE_COT << mctp::NVDM_TYPE_SHIFT),
>> +
>> +                cot: NvdmPayloadCot {
>> +                    version: chipset.fsp_cot_version(),
>> +                    size: core::mem::size_of::<NvdmPayloadCot>() as u16,
>> +                    gsp_fmc_sysmem_offset: fmc_addr,
>> +                    frts_sysmem_offset: 0,
>> +                    frts_sysmem_size: 0,
>> +                    frts_vidmem_offset: frts_offset,
>> +                    frts_vidmem_size: frts_size,
>> +                    hash384: signatures.hash384,
>> +                    public_key: signatures.public_key,
>> +                    signature: signatures.signature,
>> +                    gsp_boot_args_sysmem_offset: fmc_boot_params_addr,
>> +                },
>> +            },
>> +            GFP_KERNEL,
>> +        )?;
>> +
>> +        // Send COT message to FSP and wait for response
>> +        Self::send_sync_fsp(dev, bar, fsp_falcon, &*msg)?;
>> +
>> +        dev_dbg!(dev, "FSP Chain of Trust completed successfully\n");
>> +        Ok(())
>> +    }
> 
> <snip>
> 
>> diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
>> index f04e2a795e90..88b1546e3cb4 100644
>> --- a/drivers/gpu/nova-core/gpu.rs
>> +++ b/drivers/gpu/nova-core/gpu.rs
>> @@ -124,6 +124,18 @@ pub(crate) const fn arch(&self) -> Architecture {
>>              | Self::GB207 => Architecture::Blackwell,
>>          }
>>      }
>> +
>> +    /// Returns the FSP Chain of Trust (COT) protocol version for this chipset.
>> +    ///
>> +    /// Hopper (GH100) uses version 1, Blackwell uses version 2.
>> +    pub(crate) const fn fsp_cot_version(&self) -> u16 {
>> +        match self.arch() {
>> +            Architecture::Hopper => 1,
>> +            Architecture::Blackwell => 2,
>> +            // Other architectures don't use FSP COT
>> +            _ => 0,
> 
> I think we should use a new type to represent this version and use Option, i.e.
> return Option<FspCotVersion>.

Done.
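
A minimal sketch of the shape suggested above, for anyone following along. The
Architecture enum here is a simplified stand-in for the one in gpu.rs, and the
raw() accessor is an assumption of this example rather than the v5 API.

```rust
/// Simplified stand-in for nova-core's Architecture enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Architecture {
    Turing,
    Ampere,
    Ada,
    Hopper,
    Blackwell,
}

/// FSP Chain of Trust protocol version, kept distinct from a bare u16.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FspCotVersion(u16);

impl FspCotVersion {
    /// Returns the value to place in the COT payload's version field.
    pub const fn raw(self) -> u16 {
        self.0
    }
}

impl Architecture {
    /// Returns None for architectures that do not boot through FSP COT.
    pub const fn fsp_cot_version(self) -> Option<FspCotVersion> {
        match self {
            Architecture::Hopper => Some(FspCotVersion(1)),
            Architecture::Blackwell => Some(FspCotVersion(2)),
            _ => None,
        }
    }
}
```

Returning Option makes the caller handle the "no FSP" case explicitly, so a
version of 0 can never end up in a message by accident.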
 

thanks,
-- 
John Hubbard


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 25/33] gpu: nova-core: Hopper/Blackwell: larger non-WPR heap
  2026-02-17 20:04   ` Danilo Krummrich
@ 2026-02-20 23:57     ` John Hubbard
  0 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-20 23:57 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On 2/17/26 12:04 PM, Danilo Krummrich wrote:
> On Tue Feb 10, 2026 at 3:45 AM CET, John Hubbard wrote:
>> Hopper, Blackwell and later require more space for the non-WPR heap.
>>
>> Add a new FbHal method to return the non-WPR heap size, and create a new
>> GH100 HAL for Hopper and GB100 HAL for Blackwell that return the
>> appropriate value for each GPU architecture.
>>
>> Cc: Timur Tabi <ttabi@nvidia.com>
>> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
>> ---
>>  drivers/gpu/nova-core/fb.rs           | 14 +++++++---
>>  drivers/gpu/nova-core/fb/hal.rs       |  7 +++--
>>  drivers/gpu/nova-core/fb/hal/ga102.rs |  2 +-
>>  drivers/gpu/nova-core/fb/hal/gb100.rs | 37 +++++++++++++++++++++++++++
>>  drivers/gpu/nova-core/fb/hal/gh100.rs | 37 +++++++++++++++++++++++++++
>>  5 files changed, 91 insertions(+), 6 deletions(-)
>>  create mode 100644 drivers/gpu/nova-core/fb/hal/gb100.rs
>>  create mode 100644 drivers/gpu/nova-core/fb/hal/gh100.rs
>>
>> diff --git a/drivers/gpu/nova-core/fb.rs b/drivers/gpu/nova-core/fb.rs
>> index 3a2b79a5c107..7c502f15622c 100644
>> --- a/drivers/gpu/nova-core/fb.rs
>> +++ b/drivers/gpu/nova-core/fb.rs
>> @@ -98,6 +98,15 @@ pub(crate) fn unregister(&self, bar: &Bar0) {
>>      }
>>  }
>>  
>> +/// Calculate non-WPR heap size based on chipset architecture.
>> +/// This matches the logic used in FSP for consistency.
>> +pub(crate) fn calc_non_wpr_heap_size(chipset: Chipset) -> u64 {
>> +    hal::fb_hal(chipset)
>> +        .non_wpr_heap_size()
>> +        .map(u64::from)
>> +        .unwrap_or(SZ_1M as u64)
> 
> This should use u64::from_safe_cast().
> 
> Also, I already brought this up in the context of GPU buddy, I wonder if we
> should just add SZ_* constants for 64-bit devices. Shouldn't be too hard to
> generate the corresponding code.
> 
> I think it is a repeating pattern, and having to use u64::from_safe_cast() all
> the time seems cumbersome.

I've added two patches (for rust/, and for nova-core) to the end of this
series for v5, to just fix all of that.
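
To illustrate the idea of 64-bit SZ_* constants, here is a small sketch. The
module name sizes64 and the define_sizes_u64 macro are made up for this
example and are not the names used in the v5 patches; the helper takes the HAL
result as a plain Option<u32> so that it stands alone.

```rust
/// Generates u64 size constants from `NAME = value;` pairs.
macro_rules! define_sizes_u64 {
    ($($name:ident = $value:expr;)*) => {
        $(pub const $name: u64 = $value;)*
    };
}

/// Hypothetical 64-bit counterparts of the kernel's SZ_* constants.
pub mod sizes64 {
    define_sizes_u64! {
        SZ_1K = 1 << 10;
        SZ_1M = 1 << 20;
        SZ_2M = 2 << 20;
        SZ_1G = 1 << 30;
    }
}

/// Falls back to 1 MiB when the HAL does not report a non-WPR heap size.
pub fn calc_non_wpr_heap_size(hal_size: Option<u32>) -> u64 {
    hal_size.map(u64::from).unwrap_or(sizes64::SZ_1M)
}
```

With constants that are already u64, the fallback needs neither an `as` cast
nor a from_safe_cast() call.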

thanks,
-- 
John Hubbard


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 27/33] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap
  2026-02-17 20:10   ` Danilo Krummrich
@ 2026-02-21  1:01     ` John Hubbard
  0 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-21  1:01 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On 2/17/26 12:10 PM, Danilo Krummrich wrote:
> On Tue Feb 10, 2026 at 3:45 AM CET, John Hubbard wrote:
>> @@ -74,7 +94,7 @@ fn management_overhead(fb_size: u64) -> u64 {
>>          u64::from(bindings::GSP_FW_HEAP_PARAM_SIZE_PER_GB_FB)
>>              .saturating_mul(fb_size_gb)
>>              .align_up(GSP_HEAP_ALIGNMENT)
>> -            .unwrap_or(u64::MAX)
>> +            .expect("management_overhead alignment overflow")
> 
> Ultimately, the relevant value for this calculation (fb_size) comes from the
> hardware through a register read if I'm not mistaken, we shouldn't panic on
> that, but rather handle it as an error if the read value is not plausible.

Yes. Fixed in v5.
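
As a rough illustration of returning an error where the quoted hunk panics,
the sketch below uses checked arithmetic throughout. The constants and the
HeapError type are placeholders assumed for this example; the real code would
use the bindings value, GSP_HEAP_ALIGNMENT and a kernel error code.

```rust
const SZ_1G: u64 = 1 << 30;
// Placeholder values, assumed for this sketch only.
const PARAM_SIZE_PER_GB_FB: u64 = 96 << 10;
const HEAP_ALIGNMENT: u64 = 1 << 20;

/// Hypothetical error type standing in for a kernel error code.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HeapError {
    Overflow,
}

/// Rounds `value` up to a power-of-two `align`, or returns None on overflow.
fn align_up(value: u64, align: u64) -> Option<u64> {
    value.checked_add(align - 1).map(|v| v & !(align - 1))
}

/// Computes the per-GiB management overhead without panicking.
pub fn management_overhead(fb_size: u64) -> Result<u64, HeapError> {
    // fb_size ultimately comes from a register read, so treat it as untrusted.
    let fb_size_gb = fb_size.div_ceil(SZ_1G);

    PARAM_SIZE_PER_GB_FB
        .checked_mul(fb_size_gb)
        .and_then(|size| align_up(size, HEAP_ALIGNMENT))
        .ok_or(HeapError::Overflow)
}
```

The caller can then propagate the failure with `?` rather than taking the
machine down over an implausible register value.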

thanks,
-- 
John Hubbard


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH v4 28/33] gpu: nova-core: refactor SEC2 booter loading into run_booter() helper
  2026-02-17 20:12   ` Danilo Krummrich
@ 2026-02-21  1:03     ` John Hubbard
  0 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-21  1:03 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML

On 2/17/26 12:12 PM, Danilo Krummrich wrote:
> On Tue Feb 10, 2026 at 3:45 AM CET, John Hubbard wrote:
>> diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
>> index 465c18e4c888..6191986fc6b5 100644
>> --- a/drivers/gpu/nova-core/gsp/boot.rs
>> +++ b/drivers/gpu/nova-core/gsp/boot.rs
>> @@ -120,6 +120,40 @@ fn run_fwsec_frts(
>>          }
>>      }
>>  
>> +    fn run_booter(
>> +        dev: &device::Device<device::Bound>,
>> +        bar: &Bar0,
>> +        chipset: Chipset,
>> +        sec2_falcon: &Falcon<Sec2>,
>> +        wpr_meta: &CoherentAllocation<GspFwWprMeta>,
>> +    ) -> Result {
>> +        let booter_loader = BooterFirmware::new(
>> +            dev,
>> +            BooterKind::Loader,
>> +            chipset,
>> +            FIRMWARE_VERSION,
>> +            sec2_falcon,
>> +            bar,
>> +        )?;
> 
> Maybe we should just make the part below a method of BooterFirmware, i.e.
> BooterFirmware::run()?

Done.

thanks,
-- 
John Hubbard

> 
>> +        sec2_falcon.reset(bar)?;
>> +        sec2_falcon.load(bar, &booter_loader)?;
>> +        let wpr_handle = wpr_meta.dma_handle();
>> +        let (mbox0, mbox1) = sec2_falcon.boot(
>> +            bar,
>> +            Some(wpr_handle as u32),
>> +            Some((wpr_handle >> 32) as u32),
>> +        )?;
>> +        dev_dbg!(dev, "SEC2 MBOX0: {:#x}, MBOX1: {:#x}\n", mbox0, mbox1);
>> +
>> +        if mbox0 != 0 {
>> +            dev_err!(dev, "Booter-load failed with error {:#x}\n", mbox0);
>> +            return Err(ENODEV);
>> +        }
>> +
>> +        Ok(())
>> +    }



* Re: [PATCH v4 29/33] gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling
  2026-02-17 20:20   ` Danilo Krummrich
@ 2026-02-21  1:06     ` John Hubbard
  0 siblings, 0 replies; 66+ messages in thread
From: John Hubbard @ 2026-02-21  1:06 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Alexandre Courbot, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, nouveau, rust-for-linux, LKML



On 2/17/26 12:20 PM, Danilo Krummrich wrote:
> On Tue Feb 10, 2026 at 3:45 AM CET, John Hubbard wrote:
>> +    /// Check if GSP lockdown has been released after FSP Chain of Trust
>> +    fn gsp_lockdown_released(
>> +        dev: &device::Device,
>> +        gsp_falcon: &Falcon<Gsp>,
>> +        bar: &Bar0,
>> +        fmc_boot_params_addr: u64,
>> +        mbox0: &mut u32,
>> +    ) -> bool {
>> +        // Read GSP falcon mailbox0
>> +        *mbox0 = gsp_falcon.read_mailbox0(bar);
>> +
>> +        // Check 1: If mbox0 has 0xbadf4100 pattern, GSP is still locked down
>> +        if *mbox0 != 0 && (*mbox0 & 0xffffff00) == 0xbadf4100 {
>> +            return false;
>> +        }
> 
> Hm...we could create a tiny type wrapper around this value, and do the check
> with a method, such as Mbox::is_locked_down(&self).

OK, done in v5.

> 
>> +        // Check 2: If mbox0 has a value, check if it's an error
>> +        if *mbox0 != 0 {
>> +            let mbox1 = gsp_falcon.read_mailbox1(bar);
>> +
>> +            let combined_addr = (u64::from(mbox1) << 32) | u64::from(*mbox0);
> 
> This could also be part of the type.

Yes.

> 
>> +            if combined_addr != fmc_boot_params_addr {
>> +                // Address doesn't match - GSP wrote an error code
>> +                // Return TRUE (lockdown released) with error
>> +                dev_dbg!(
>> +                    dev,
>> +                    "GSP lockdown error: mbox0={:#x}, combined_addr={:#x}, expected={:#x}\n",
>> +                    *mbox0,
>> +                    combined_addr,
>> +                    fmc_boot_params_addr
>> +                );
>> +                return true;
>> +            }
>> +        }
>> +
>> +        // Check 3: Verify HWCFG2 RISCV_BR_PRIV_LOCKDOWN bit is clear
>> +        let hwcfg2 = regs::NV_PFALCON_FALCON_HWCFG2::read(bar, &crate::falcon::gsp::Gsp::ID);
>> +        !hwcfg2.riscv_br_priv_lockdown()
>> +    }
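
The wrapper type suggested in the review comments above could look roughly
like this. The name Mbox and the is_locked_down() method come from the review;
the field layout, new(), combined_addr() and is_error() are assumptions of
this sketch, not the v5 code.

```rust
/// Snapshot of the GSP falcon's two mailbox registers.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Mbox {
    mbox0: u32,
    mbox1: u32,
}

impl Mbox {
    // While locked down, mailbox0 reads back as 0xbadf41xx.
    const LOCKDOWN_MASK: u32 = 0xffff_ff00;
    const LOCKDOWN_PATTERN: u32 = 0xbadf_4100;

    pub const fn new(mbox0: u32, mbox1: u32) -> Self {
        Self { mbox0, mbox1 }
    }

    /// True while mailbox0 still carries the lockdown pattern.
    pub const fn is_locked_down(&self) -> bool {
        (self.mbox0 & Self::LOCKDOWN_MASK) == Self::LOCKDOWN_PATTERN
    }

    /// Joins both mailboxes into one 64-bit value, mailbox1 in the high half.
    pub fn combined_addr(&self) -> u64 {
        (u64::from(self.mbox1) << 32) | u64::from(self.mbox0)
    }

    /// True if mailbox0 is non-zero and the pair is not the expected address.
    pub fn is_error(&self, expected_addr: u64) -> bool {
        self.mbox0 != 0 && self.combined_addr() != expected_addr
    }
}
```

The separate `!= 0` test from the quoted code is not needed in
is_locked_down(), since a zero mailbox can never match the pattern.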

thanks,
-- 
John Hubbard


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Tegra notes for Nova: [PATCH v4 18/33] gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure
  2026-02-20 22:05     ` Tegra notes for Nova: " John Hubbard
@ 2026-02-23  3:36       ` Alexandre Courbot
  0 siblings, 0 replies; 66+ messages in thread
From: Alexandre Courbot @ 2026-02-23  3:36 UTC (permalink / raw)
  To: John Hubbard
  Cc: Danilo Krummrich, Joel Fernandes, Alistair Popple, Eliot Courtney,
	Zhi Wang, Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, nouveau,
	rust-for-linux, LKML, Vikram Sethi

On Sat Feb 21, 2026 at 7:05 AM JST, John Hubbard wrote:
> On 2/17/26 8:28 AM, Danilo Krummrich wrote:
>> On Tue Feb 10, 2026 at 3:45 AM CET, John Hubbard wrote:
>>> +    pub(crate) fn poll_msgq(&self, bar: &Bar0) -> u32 {
>>> +    pub(crate) fn send_msg(&self, bar: &Bar0, packet: &[u8]) -> Result {
>>> +    pub(crate) fn recv_msg(&self, bar: &Bar0, buffer: &mut [u8], size: usize) -> Result<usize> {
>> 
>> Just a quick note, since I just reminded myself of this: We should keep in mind
>> that at some point we have to replace most (if not all) &Bar0 usages with
>> &Mmio<SIZE> as nova-core will also support platform devices.
>> 
>> I think Tegra chips with GSP-based GPU IPs have a compatible register layout,
>> right?
>
> Well...I'd characterize it a little differently. Here are some notes
> from a recent discussion with Tegra, to help shed some light on that:
>
> (+Cc: Vikram Sethi, to keep things accurate.)
>
> Some Tegra chips have a Blackwell iGPU that looks like a PCIe device and
> acts like a PCIe device, and has GSP, just like a dGPU (discrete GPU).
>
> iGPUs don’t have FSP, nor a separate VBIOS, nor even vidmem. So the boot
> sequence for them is different, but only from the first couple of steps
> where SoC root of trust does the work of FSP, and boot skips any vidmem
> related code.
>
> Tegra display is sufficiently different (with several quirky variations
> too) from dGPU that it's an open question whether Nova is even the right
> driver to use for that aspect.

Back in the day (T124/T210), Nouveau was only providing a render node,
with a completely different driver (tegra_drm) driving the display.
Unless the display IP has aligned with that of dGPUs there seems to be
little reason to change that.

Regarding the initial discussion: for layers where the `Bar0`
characterization is not needed, I agree it makes sense to use the lowest
`Io` denominator. We could even consider making that code generic
against `Io`, as it would open the door to things like e.g. emulating
the behavior of some registers for tests that don't require hardware.
Not sure whether we want to go down that path though... :)
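
To make the "generic against `Io`" idea concrete, here is a standalone sketch.
The Io trait below is a simplified stand-in defined for this example, not the
kernel's trait, and the QUEUE_HEAD/QUEUE_TAIL offsets and msgq_pending() are
hypothetical.

```rust
/// Simplified stand-in for an I/O accessor trait.
pub trait Io {
    fn read32(&self, offset: usize) -> u32;
    fn write32(&mut self, value: u32, offset: usize);
}

/// In-memory register file for tests that do not need hardware.
pub struct FakeIo {
    regs: Vec<u32>,
}

impl FakeIo {
    pub fn new(size_bytes: usize) -> Self {
        Self {
            regs: vec![0; size_bytes / 4],
        }
    }
}

impl Io for FakeIo {
    fn read32(&self, offset: usize) -> u32 {
        self.regs[offset / 4]
    }

    fn write32(&mut self, value: u32, offset: usize) {
        self.regs[offset / 4] = value;
    }
}

// Hypothetical register offsets, for illustration only.
pub const QUEUE_HEAD: usize = 0x0;
pub const QUEUE_TAIL: usize = 0x4;

/// Code written against the trait runs on real MMIO or on the fake alike.
pub fn msgq_pending<I: Io>(io: &I) -> bool {
    io.read32(QUEUE_HEAD) != io.read32(QUEUE_TAIL)
}
```

A test can then drive msgq_pending() by writing to a FakeIo, with no GPU
involved.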

^ permalink raw reply	[flat|nested] 66+ messages in thread

end of thread, other threads:[~2026-02-23  3:36 UTC | newest]

Thread overview: 66+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-02-10  2:45 [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
2026-02-10  2:45 ` [PATCH v4 01/33] gpu: nova-core: pass pdev directly to dev_* logging macros John Hubbard
2026-02-11 10:06   ` Danilo Krummrich
2026-02-11 18:48     ` John Hubbard
2026-02-10  2:45 ` [PATCH v4 02/33] gpu: nova-core: print FB sizes, along with ranges John Hubbard
2026-02-10  2:45 ` [PATCH v4 03/33] gpu: nova-core: add FbRange.len() and use it in boot.rs John Hubbard
2026-02-10  2:45 ` [PATCH v4 04/33] gpu: nova-core: Hopper/Blackwell: basic GPU identification John Hubbard
2026-02-10  2:45 ` [PATCH v4 05/33] gpu: nova-core: factor .fwsignature* selection into a new get_gsp_sigs_section() John Hubbard
2026-02-11 10:16   ` Danilo Krummrich
2026-02-12  0:39     ` John Hubbard
2026-02-10  2:45 ` [PATCH v4 06/33] gpu: nova-core: use GPU Architecture to simplify HAL selections John Hubbard
2026-02-10  2:45 ` [PATCH v4 07/33] gpu: nova-core: apply the one "use" item per line policy to commands.rs John Hubbard
2026-02-10  2:45 ` [PATCH v4 08/33] gpu: nova-core: set DMA mask width based on GPU architecture John Hubbard
2026-02-11 10:28   ` Danilo Krummrich
2026-02-12  2:06     ` John Hubbard
2026-02-10  2:45 ` [PATCH v4 09/33] gpu: nova-core: Hopper/Blackwell: skip GFW boot waiting John Hubbard
2026-02-11 10:09   ` Danilo Krummrich
2026-02-12  1:49     ` John Hubbard
2026-02-10  2:45 ` [PATCH v4 10/33] gpu: nova-core: move firmware image parsing code to firmware.rs John Hubbard
2026-02-10  2:45 ` [PATCH v4 11/33] gpu: nova-core: factor out a section_name_eq() function John Hubbard
2026-02-10  2:45 ` [PATCH v4 12/33] gpu: nova-core: don't assume 64-bit firmware images John Hubbard
2026-02-10  2:45 ` [PATCH v4 13/33] gpu: nova-core: add support for 32-bit " John Hubbard
2026-02-10  2:45 ` [PATCH v4 14/33] gpu: nova-core: add auto-detection of 32-bit, 64-bit " John Hubbard
2026-02-10  2:45 ` [PATCH v4 15/33] gpu: nova-core: Hopper/Blackwell: add FMC firmware image, in support of FSP John Hubbard
2026-02-10  2:45 ` [PATCH v4 16/33] gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub John Hubbard
2026-02-10  2:45 ` [PATCH v4 17/33] gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations John Hubbard
2026-02-11 10:57   ` Danilo Krummrich
2026-02-12  2:09     ` John Hubbard
2026-02-17 15:43       ` Danilo Krummrich
2026-02-19  2:54         ` John Hubbard
2026-02-10  2:45 ` [PATCH v4 18/33] gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure John Hubbard
2026-02-17 16:28   ` Danilo Krummrich
2026-02-20 22:05     ` Tegra notes for Nova: " John Hubbard
2026-02-23  3:36       ` Alexandre Courbot
2026-02-10  2:45 ` [PATCH v4 19/33] gpu: nova-core: Hopper/Blackwell: calculate reserved FB heap size John Hubbard
2026-02-17 16:39   ` Danilo Krummrich
2026-02-19  3:01     ` John Hubbard
2026-02-19  9:01       ` Miguel Ojeda
2026-02-20 22:08         ` John Hubbard
2026-02-10  2:45 ` [PATCH v4 20/33] gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting John Hubbard
2026-02-17 17:13   ` Danilo Krummrich
2026-02-20 23:26     ` John Hubbard
2026-02-10  2:45 ` [PATCH v4 21/33] gpu: nova-core: Hopper/Blackwell: add FSP message structures John Hubbard
2026-02-10  2:45 ` [PATCH v4 22/33] gpu: nova-core: Hopper/Blackwell: add FMC signature extraction John Hubbard
2026-02-10  2:45 ` [PATCH v4 23/33] gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging John Hubbard
2026-02-10  2:45 ` [PATCH v4 24/33] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot John Hubbard
2026-02-17 18:16   ` Danilo Krummrich
2026-02-20 23:35     ` John Hubbard
2026-02-10  2:45 ` [PATCH v4 25/33] gpu: nova-core: Hopper/Blackwell: larger non-WPR heap John Hubbard
2026-02-17 20:04   ` Danilo Krummrich
2026-02-20 23:57     ` John Hubbard
2026-02-10  2:45 ` [PATCH v4 26/33] gpu: nova-core: Blackwell: use correct sysmem flush registers John Hubbard
2026-02-10  2:45 ` [PATCH v4 27/33] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap John Hubbard
2026-02-17 20:10   ` Danilo Krummrich
2026-02-21  1:01     ` John Hubbard
2026-02-10  2:45 ` [PATCH v4 28/33] gpu: nova-core: refactor SEC2 booter loading into run_booter() helper John Hubbard
2026-02-17 20:12   ` Danilo Krummrich
2026-02-21  1:03     ` John Hubbard
2026-02-10  2:45 ` [PATCH v4 29/33] gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling John Hubbard
2026-02-17 20:20   ` Danilo Krummrich
2026-02-21  1:06     ` John Hubbard
2026-02-10  2:45 ` [PATCH v4 30/33] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot path John Hubbard
2026-02-10  2:45 ` [PATCH v4 31/33] gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror John Hubbard
2026-02-10  2:45 ` [PATCH v4 32/33] gpu: nova-core: clarify the GPU firmware boot steps John Hubbard
2026-02-10  2:46 ` [PATCH v4 33/33] gpu: nova-core: fix aux device registration for multi-GPU systems John Hubbard
2026-02-10 22:27 ` [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
