[PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support

Rust for Linux List

* [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support
@ 2026-05-30  3:09 John Hubbard
  2026-05-30  3:09 ` [PATCH v11 01/22] gpu: nova-core: set DMA mask width based on GPU architecture John Hubbard
                   ` (22 more replies)
  0 siblings, 23 replies; 56+ messages in thread
From: John Hubbard @ 2026-05-30  3:09 UTC
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Changes in v11:

* Made the FSP messaging path sound. The FSP falcon's EMEM window is a
  stateful register pair (program an offset, then touch the data
  register), so modeling it as a stateless I/O region let aliasing
  accesses corrupt each other's offset with no unsafe at the call site.
  The EMEM accessors and the send/receive helpers now take &mut self, so
  the falcon handle is the exclusive token for an in-flight exchange,
  and the unsafe Io/IoCapable impls and their unreachable! bounds checks
  are gone. The accessors now program the EMEM offset once and stream
  through the data register using the falcon's auto-increment, matching
  Open RM, instead of re-programming the offset for every word.

* Rebased onto a current drm-rust-next that already carries the v10
  preparatory patches, which are dropped from the series.

* Top of the series: the v10 boot-integration patch is replaced by "gsp:
  enable FSP boot path" (Alexandre Courbot) and "add non-sec2 unload
  path" (Eliot Courtney). The Hopper/Blackwell boot path now lives in
  the GSP HAL (gsp/hal/gh100.rs) and returns a BootUnloadGuard.

* Reordered per review: hardware-differences patches first (DMA mask,
  PCI config mirror, PMU-reserved framebuffer, non-WPR heap, WPR2 heap,
  sysmem flush registers), then the FSP/FMC stack, then GSP lockdown
  release polling.

* Hardware-difference patches are now HAL methods instead of inline
  Architecture matches: the PMU-reserved framebuffer size (patch
  retitled from "calculate reserved FB heap size" to "compute
  PMU-reserved framebuffer size"), the non-WPR heap size (now u32 with a
  1 MiB default instead of Option<u32>, per v10 review, with the GB10x
  value in the GB100 HAL and the larger GB20x value in the GB202 HAL),
  and the PCI config mirror range. The larger WPR2 heap pulls its base
  size from the generated bindings, drops the custom constants that have
  no Open RM counterpart, and matches all architectures exhaustively.

* FSP firmware handling moved into firmware/fsp.rs: FspFirmware now
  holds parsed signatures (KBox<FmcSignatures>) instead of a raw ELF
  copy, extracted through a get_section closure (per v10 review).

* FSP secure-boot polling uses a per-chipset FSP HAL
  (fsp/hal/{gh100,gb202}.rs) reading the correct NV_THERM_I2CS register,
  instead of a free function in regs.rs.

* FSP Chain of Trust boot was redone around a new FmcBootArgs type, and
  the response headers are strongly typed (MctpHeader/NvdmHeader instead
  of bare u32), with the vendor ID from kernel::pci::Vendor.

* GB10x/GB20x sysmem flush: the HSHUB0/FBHUB0 register details moved
  from module doc comments onto the write_sysmem_flush_page_* methods.

* Commit message cleanups: dropped stale claims, shortened an
  over-length subject, and fixed trailer ordering.
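
The soundness argument in the first item above can be illustrated with
a small userspace model. This is only a sketch under stated
assumptions: EmemWindow, write_words and read_words are hypothetical
names (not the nova-core API), the register pair is simulated with a
Vec, and the auto-increment is modeled in software.

```rust
/// Userspace model of a stateful window: an offset register plus a data
/// register whose every access advances the offset (auto-increment).
/// All names are hypothetical; this is not the nova-core API.
pub struct EmemWindow {
    mem: Vec<u32>, // backing storage, one entry per 32-bit word
    offset: usize, // simulated offset register, in words
}

impl EmemWindow {
    pub fn new(words: usize) -> Self {
        Self { mem: vec![0; words], offset: 0 }
    }

    /// Programs the offset once, then streams `data` through the data
    /// register. `&mut self` guarantees nobody else can move the offset
    /// while the transfer is in flight.
    pub fn write_words(&mut self, start: usize, data: &[u32]) -> Result<(), &'static str> {
        self.check_range(start, data.len())?;
        self.offset = start;
        for &word in data {
            self.write_data(word);
        }
        Ok(())
    }

    /// Reads also take `&mut self`: touching the data register changes
    /// the offset, so a read is not a side-effect-free operation.
    pub fn read_words(&mut self, start: usize, out: &mut [u32]) -> Result<(), &'static str> {
        self.check_range(start, out.len())?;
        self.offset = start;
        for slot in out.iter_mut() {
            *slot = self.read_data();
        }
        Ok(())
    }

    fn check_range(&self, start: usize, len: usize) -> Result<(), &'static str> {
        match start.checked_add(len) {
            Some(end) if end <= self.mem.len() => Ok(()),
            _ => Err("access outside the window"),
        }
    }

    fn write_data(&mut self, word: u32) {
        self.mem[self.offset] = word;
        self.offset += 1; // simulated auto-increment
    }

    fn read_data(&mut self) -> u32 {
        let word = self.mem[self.offset];
        self.offset += 1; // simulated auto-increment
        word
    }
}
```

With shared (&self) accessors, two callers could interleave their
offset writes; with &mut self the borrow checker rejects that at
compile time, so no unsafe is needed at the call site.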

Changes in v10:

* Reordered per review (and direct assistance--thanks again) from
  Alexandre Courbot: the two refactoring patches (factor .fwsignature*
  selection, use GPU Architecture to simplify HALs) now come first,
  before GPU identification. The boot_via_fsp stub is introduced early
  and completed as FSP features arrive. The SEC2 refactoring, PCI config
  mirror, and reserved heap size patches are moved earlier in the
  series.

* Made pmuReservedSize conditional on Blackwell dGPU architectures.
  Open RM only sets this field for Blackwell (Turing/Ampere/Ada/Hopper
  all leave it zero). Added calc_pmu_reserved_size() helper and
  FbLayout.pmu_reserved_size field to route the value through the
  layout instead of using the constant unconditionally. Replaced
  `as u32` cast with usize_into_u32 for PMU_RESERVED_SIZE. (Alexandre)

* Split the GFW boot wait HAL change into two patches: one that moves
  the existing behavior into a GpuHal trait, and a second that adds the
  Hopper/Blackwell skip.

* Removed the Spec::chipset() accessor (no longer needed after
  restructuring). Updated the Copy/Clone commit message accordingly.

* Rebased onto drm-rust-next-staging, which includes
  const_align_up(), "move firmware image parsing code to firmware.rs",
  "factor out an elf_str() function", and "make WPR heap sizing
  fallible" from the v9 series. Series is now 28 patches (was 31).

* Depends on the "rust: sizes: SizeConstants trait" series[N], which
  adds typed SZ_* constants (u64::SZ_1M, u32::SZ_4K, etc.). The
  nova-core conversion patch ("use SizeConstants trait for u64 size
  constants") will be posted separately, but is already included in my
  git branch. The Blackwell patches that introduce new SZ_* usage
  (larger non-WPR heap, FSP Chain of Trust boot, larger WPR2 heap) use
  the trait form from the start.

* Fixed the PCI config mirror commit message: corrected hex offsets to
  match the code (older architectures use 0x088000, Hopper/Blackwell
  use 0x092000).

* Dropped the never-used nvdm_type_raw() method from the MCTP/NVDM
  introducing patch.

* Removed stale Co-developed-by tag from the FSP Chain of Trust boot
  commit per Alex's request. Rewrote the commit message to remove
  references to the no-longer-existent fmc_full field.

* Added missing #[expect(dead_code)] on GspFmcBootParams in the FSP
  secure boot commit, removed when the struct becomes used in the
  Chain of Trust boot commit.
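
For readers unfamiliar with the typed SZ_* form mentioned above, the
idea can be sketched as a trait with associated constants. This is an
assumption-laden illustration: the trait shape below is a hypothetical
stand-in, the real SizeConstants trait lives in the separate series
and may differ, and default_heap_size() is an invented helper.

```rust
/// Hypothetical stand-in for a SizeConstants-style trait: each integer
/// type provides its own typed size constants, so call sites need no
/// `as` casts.
pub trait SizeConstants {
    const SZ_4K: Self;
    const SZ_1M: Self;
}

impl SizeConstants for u32 {
    const SZ_4K: Self = 4 * 1024;
    const SZ_1M: Self = 1024 * 1024;
}

impl SizeConstants for u64 {
    const SZ_4K: Self = 4 * 1024;
    const SZ_1M: Self = 1024 * 1024;
}

/// Invented example: a size expressed directly in the field's own type.
pub const fn default_heap_size() -> u32 {
    u32::SZ_1M
}
```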

Changes in v9:

* Rebased onto today's drm-rust-next.

* Split Architecture::Blackwell into BlackwellGB10x and BlackwellGB20x,
  after Gary Guo and Sashiko pointed out that GB10x and GB20x are
  distinct enough to warrant separate architecture variants. This
  surfaced several bugs where all Blackwell chips were incorrectly
  treated as a single group:
  * Fixed the FSP boot completion register address for GB10x. GB10x
    uses the same address as Hopper (0x000200bc), not the GB20x
    address (0x00ad00bc).
  * Made the FSP secure boot timeout architecture-dependent. GB20x
    now gets 5000ms while Hopper and GB10x keep 4000ms.
  * Removed chipset-level match arms that were working around the
    single-variant design in fb/hal.rs, firmware/gsp.rs, and regs.rs.

* Simplified find_gsp_sigs_section() to return &'static str instead of
  Option<&'static str>, since the Architecture enum is now exhaustive
  and every variant has a known signature section name.

* Moved dma_set_mask_and_coherent from probe() into Gpu::new(), with
  the unsafe block narrowed to just that call. Gpu::new() now takes
  pci::Device<device::Core> instead of device::Bound to support this.

* Dropped the local `chipset` variable in Gpu::new() and accessed
  spec.chipset() directly, since Spec is now Copy.

* Changed Spec::chipset() to take self instead of &self, since Spec is
  Copy.

* Removed the unnecessary Tu102/Gh100 consts in gpu/hal.rs and used the
  unit structs directly.

* Kept a hold on the Firmware object in FspFirmware instead of copying
  the FMC ELF into a KVec<u8>.

* Moved the dev_info formatting fix and the GFW_BOOT comment removal
  out of the Copy/Clone patch and into the patches that actually touch
  those lines.

* Added Reviewed-by tags from Gary Guo and Alice Ryhl.
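
The effect of the Blackwell split can be sketched with an exhaustive
match. The register addresses and timeouts below are the ones quoted
in this changelog; the FspArch enum and its method names are
hypothetical and only model the FSP-capable subset of the driver's
Architecture enum.

```rust
/// FSP-capable architectures, mirroring the split described above.
/// Hypothetical type: the driver's own enum has more variants.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum FspArch {
    Hopper,
    BlackwellGB10x,
    BlackwellGB20x,
}

impl FspArch {
    /// FSP boot completion register address, as listed in the changelog.
    pub const fn boot_complete_reg(self) -> u32 {
        // No wildcard arm: adding a variant forces this match to be revisited.
        match self {
            FspArch::Hopper | FspArch::BlackwellGB10x => 0x0002_00bc,
            FspArch::BlackwellGB20x => 0x00ad_00bc,
        }
    }

    /// FSP secure boot timeout in milliseconds, as listed in the changelog.
    pub const fn secure_boot_timeout_ms(self) -> u64 {
        match self {
            FspArch::Hopper | FspArch::BlackwellGB10x => 4000,
            FspArch::BlackwellGB20x => 5000,
        }
    }
}
```

Because every variant is matched explicitly, a lookup of this kind can
return a plain value rather than an Option, which is the same reasoning
behind the find_gsp_sigs_section() simplification.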

Changes in v8:

* Added Clone/Copy derives to Spec and Revision. Removed the
  unnecessary pin_init_scope wrapping in Gpu::new() that the lack of
  Copy had forced. Added a Spec::chipset() accessor.

* Removed an implementation-detail sentence from the
  Architecture::dma_mask() doc comment.

* Simplified the GPU HAL to two variants (Tu102, Gh100) instead of
  four. Renamed "Fsp" to "Gh100" to follow the HAL naming convention.
  Removed the spurious GA100 special case. Moved the GFW_BOOT wait into
  the HAL method itself instead of returning a bool.

* Increased the GFW_BOOT wait timeout from 4 seconds to 30 seconds,
  after Joel found that a different Blackwell SKU required extra time.

* Removed stray Cc lines from each patch.

* Fixed rustfmt issues in gsp/fw.rs and gsp/boot.rs reported by the
  kernel test robot against v7 patches 27 and 31.

Changes in v7:

* Rebased onto Alexandre Courbot's rust register!() series in
  drm-rust-next, including the related generic I/O accessor and
  IoCapable changes.

* Rebased onto drm-rust-next (v7.0-rc4 based).

* Dropped the v6 patches that are already in drm-rust-next: the
  aux-device fix, the pdev helper macro patch, and the one-item-per-line
  use cleanup.

* Reworked the GPU init pieces per review. DMA mask setup now stays in
  driver probe, with the mask width selected by GPU architecture, and
  the GFW boot policy now lives in a dedicated GPU HAL.

* Reworked firmware image parsing per review around a single ElfFormat
  trait with associated header types. Also added support for both ELF32
  and ELF64 images, with automatic format detection.

* Reworked the MCTP/NVDM protocol code to use bitfield! and typed
  accessors, removing the open-coded bit handling.

* Reworked the FSP messaging part of the series so that the message
  structures are introduced in the first patches that use them, instead
  of as a standalone dead-code-only patch. Also changed fmc_full to use
  KVec<u8> from the start.

* Split the WPR heap overflow handling out into a separate prep patch.
  That patch makes management_overhead() and wpr_heap_size() fallible,
  uses checked arithmetic, and leaves the larger WPR2 heap patch with
  only the Hopper and Blackwell sizing changes.

* Added a code comment documenting the Hopper and Blackwell PCI config
  mirror base change.
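
As background for the ELF32/ELF64 auto-detection item above: the ELF
format stores its class in the EI_CLASS byte (index 4) of e_ident,
with 1 meaning 32-bit and 2 meaning 64-bit. The sketch below shows one
way to detect the class from that byte. It is not taken from the
series; ElfClass and detect_elf_class are hypothetical names, and the
driver's ElfFormat-based detection may work differently.

```rust
/// ELF class as encoded in the EI_CLASS byte of `e_ident`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ElfClass {
    Elf32,
    Elf64,
}

/// Returns the ELF class of `image`, or `None` if the image is too
/// short, lacks the ELF magic, or carries an unknown class value.
pub fn detect_elf_class(image: &[u8]) -> Option<ElfClass> {
    const ELF_MAGIC: [u8; 4] = [0x7f, b'E', b'L', b'F'];
    const EI_CLASS: usize = 4;

    if !image.starts_with(&ELF_MAGIC) {
        return None;
    }
    match *image.get(EI_CLASS)? {
        1 => Some(ElfClass::Elf32), // ELFCLASS32
        2 => Some(ElfClass::Elf64), // ELFCLASS64
        _ => None,
    }
}
```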

Changes in v6:

* Rebased onto drm-rust-next (v7.0-rc1 based).

* Dropped the first two patches from v5 (aux device fix and pdev
  macros), which have since been merged independently.

* const_align_up(): reworked per review from Gary Guo, Miguel Ojeda,
  and Danilo Krummrich: now returns Option<usize> instead of panicking,
  takes an Alignment argument instead of a const generic, and no longer
  needs the inline_const feature addition in scripts/Makefile.build.

* The rust/sizes and SZ_*_U64 patches from v5 are no longer included.
  I plan to post those as a separate series that depends on this one.
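
The reworked const_align_up() contract (an alignment argument and an
Option<usize> result instead of a panic) can be sketched as follows.
This is a standalone approximation, not the code in rust/: the
Alignment type here is a hypothetical stand-in for the kernel's own
type, and the body is only one plausible implementation.

```rust
/// Hypothetical power-of-two alignment wrapper; a stand-in for the
/// kernel's own Alignment type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Alignment(usize);

impl Alignment {
    /// Returns `None` unless `value` is a power of two.
    pub const fn new(value: usize) -> Option<Self> {
        if value.is_power_of_two() {
            Some(Self(value))
        } else {
            None
        }
    }
}

/// Rounds `value` up to the next multiple of `align`, returning `None`
/// on overflow instead of panicking.
pub const fn const_align_up(value: usize, align: Alignment) -> Option<usize> {
    let mask = align.0 - 1; // safe: a power of two is at least 1
    match value.checked_add(mask) {
        Some(sum) => Some(sum & !mask),
        None => None,
    }
}
```

Since both functions are const fn, the result can still be evaluated
at compile time, with the caller deciding how to handle None.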

Changes in v5:

* Rebased onto linux.git master.

* Split MCTP protocol into its own module and file.

* Many Rust idiom improvements, especially stronger typing and wider
  use of Result and Option.

* Cleaned up comments, print output, and error handling.

* Added const_align_up() to rust/ and used it in nova-core. This
  required enabling a Rust feature: inline_const, as recommended by
  Miguel Ojeda.

* Refactored several areas, for example making Gpu::new() own Spec
  creation.

* Fixed three Delta::ZERO busy-polls (patches 21, 24, 31) to use
  non-zero sleep intervals; a zero interval was a poor choice there.

* Reduced GH100/GB100 HAL duplication. Made FSP_PKEY_SIZE/FSP_SIG_SIZE
  consistent across patches. Replaced fragile architecture checks with
  chipset.arch(). Renamed LIBOS_BLACKWELL.

* Narrowed the scope of some of the #![expect(dead_code)] cases,
  although that really only matters within the series, not once it is
  fully applied.
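
The Delta::ZERO fix above is about polling with a real sleep between
attempts instead of spinning. A userspace analogue is sketched below
using std::thread::sleep. It is not the kernel's read_poll_timeout(),
and poll_until is a hypothetical helper written only to show the
pattern.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Polls `cond` until it returns true or `timeout` elapses, sleeping
/// `interval` between attempts so the CPU is not kept busy.
/// Returns the number of attempts made on success.
pub fn poll_until(
    mut cond: impl FnMut() -> bool,
    interval: Duration,
    timeout: Duration,
) -> Result<u32, &'static str> {
    let deadline = Instant::now() + timeout;
    let mut attempts = 0;
    loop {
        attempts += 1;
        if cond() {
            return Ok(attempts);
        }
        if Instant::now() >= deadline {
            return Err("timed out");
        }
        // A zero interval here would turn the loop into a busy-poll.
        sleep(interval);
    }
}
```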

[1] https://github.com/Gnurou/linux/commits/drm-rust-next-staging/
[2] https://lore.kernel.org/20260411024118.471294-1-jhubbard@nvidia.com

Alexandre Courbot (1):
  gpu: nova-core: gsp: enable FSP boot path

Eliot Courtney (1):
  gpu: nova-core: add non-sec2 unload path

John Hubbard (20):
  gpu: nova-core: set DMA mask width based on GPU architecture
  gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror
  gpu: nova-core: Blackwell: compute PMU-reserved framebuffer size
  gpu: nova-core: Hopper/Blackwell: larger non-WPR heap
  gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap
  gpu: nova-core: Blackwell: use correct sysmem flush registers
  gpu: nova-core: don't assume 64-bit firmware images
  gpu: nova-core: add support for 32-bit firmware images
  gpu: nova-core: add auto-detection of 32-bit, 64-bit firmware images
  gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub
  gpu: nova-core: Hopper/Blackwell: add FMC firmware image
  gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion
    waiting
  gpu: nova-core: Hopper/Blackwell: add FMC signature extraction
  gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations
  gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure
  gpu: nova-core: add MCTP/NVDM protocol types for firmware
    communication
  gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging
  gpu: nova-core: Hopper/Blackwell: add FspCotVersion type
  gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot
  gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling

 drivers/gpu/nova-core/driver.rs               |  15 -
 drivers/gpu/nova-core/falcon.rs               |   1 +
 drivers/gpu/nova-core/falcon/fsp.rs           | 202 +++++++++++
 drivers/gpu/nova-core/fb.rs                   |   8 +-
 drivers/gpu/nova-core/fb/hal.rs               |  28 +-
 drivers/gpu/nova-core/fb/hal/ga100.rs         |   5 +
 drivers/gpu/nova-core/fb/hal/ga102.rs         |   7 +-
 drivers/gpu/nova-core/fb/hal/gb100.rs         | 102 ++++++
 drivers/gpu/nova-core/fb/hal/gb202.rs         |  86 +++++
 drivers/gpu/nova-core/fb/hal/gh100.rs         |  50 +++
 drivers/gpu/nova-core/fb/hal/tu102.rs         |   9 +
 drivers/gpu/nova-core/firmware.rs             | 176 +++++++--
 drivers/gpu/nova-core/firmware/fsp.rs         | 129 +++++++
 drivers/gpu/nova-core/firmware/gsp.rs         |   4 +-
 drivers/gpu/nova-core/fsp.rs                  | 334 ++++++++++++++++++
 drivers/gpu/nova-core/fsp/hal.rs              |  27 ++
 drivers/gpu/nova-core/fsp/hal/gb202.rs        |  23 ++
 drivers/gpu/nova-core/fsp/hal/gh100.rs        |  23 ++
 drivers/gpu/nova-core/gpu.rs                  |  34 +-
 drivers/gpu/nova-core/gpu/hal.rs              |  13 +-
 drivers/gpu/nova-core/gpu/hal/gh100.rs        |  18 +-
 drivers/gpu/nova-core/gpu/hal/tu102.rs        |  14 +
 drivers/gpu/nova-core/gsp.rs                  |   1 +
 drivers/gpu/nova-core/gsp/boot.rs             |   2 +-
 drivers/gpu/nova-core/gsp/commands.rs         |   8 +-
 drivers/gpu/nova-core/gsp/fw.rs               |  85 ++++-
 drivers/gpu/nova-core/gsp/fw/commands.rs      |  15 +-
 .../gpu/nova-core/gsp/fw/r570_144/bindings.rs |  83 +++++
 drivers/gpu/nova-core/gsp/hal/gh100.rs        | 166 ++++++++-
 drivers/gpu/nova-core/mctp.rs                 | 100 ++++++
 drivers/gpu/nova-core/nova_core.rs            |   2 +
 drivers/gpu/nova-core/regs.rs                 | 111 ++++++
 32 files changed, 1800 insertions(+), 81 deletions(-)
 create mode 100644 drivers/gpu/nova-core/falcon/fsp.rs
 create mode 100644 drivers/gpu/nova-core/fb/hal/gb100.rs
 create mode 100644 drivers/gpu/nova-core/fb/hal/gb202.rs
 create mode 100644 drivers/gpu/nova-core/fb/hal/gh100.rs
 create mode 100644 drivers/gpu/nova-core/firmware/fsp.rs
 create mode 100644 drivers/gpu/nova-core/fsp.rs
 create mode 100644 drivers/gpu/nova-core/fsp/hal.rs
 create mode 100644 drivers/gpu/nova-core/fsp/hal/gb202.rs
 create mode 100644 drivers/gpu/nova-core/fsp/hal/gh100.rs
 create mode 100644 drivers/gpu/nova-core/mctp.rs

base-commit: 2cfcf9dfb48e932d46c3fa9ae99f1607d1a80162
-- 
2.54.0


* [PATCH v11 01/22] gpu: nova-core: set DMA mask width based on GPU architecture
  2026-05-30  3:09 [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
@ 2026-05-30  3:09 ` John Hubbard
  2026-06-01  4:01   ` Eliot Courtney
  2026-05-30  3:09 ` [PATCH v11 02/22] gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror John Hubbard
                   ` (21 subsequent siblings)
  22 siblings, 1 reply; 56+ messages in thread
From: John Hubbard @ 2026-05-30  3:09 UTC
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Replace the hardcoded 47-bit DMA mask with a GPU HAL method that
provides the correct value for the architecture.

Set the DMA mask in Gpu::new(). Gpu owns all DMA allocations for
the device, so no concurrent allocations can exist while the
constructor is still running.

Acked-by: Danilo Krummrich <dakr@kernel.org>
Reviewed-by: Gary Guo <gary@garyguo.net>
Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/driver.rs        | 15 ---------------
 drivers/gpu/nova-core/gpu.rs           | 12 ++++++++++--
 drivers/gpu/nova-core/gpu/hal.rs       |  8 +++++++-
 drivers/gpu/nova-core/gpu/hal/gh100.rs |  9 ++++++++-
 drivers/gpu/nova-core/gpu/hal/tu102.rs |  5 +++++
 5 files changed, 30 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs
index cff5034c2dcd..ade73da68be5 100644
--- a/drivers/gpu/nova-core/driver.rs
+++ b/drivers/gpu/nova-core/driver.rs
@@ -3,8 +3,6 @@
 use kernel::{
     auxiliary,
     device::Core,
-    dma::Device,
-    dma::DmaMask,
     pci,
     pci::{
         Class,
@@ -38,14 +36,6 @@ pub(crate) struct NovaCore<'bound> {
 
 const BAR0_SIZE: usize = SZ_16M;
 
-// For now we only support Ampere which can use up to 47-bit DMA addresses.
-//
-// TODO: Add an abstraction for this to support newer GPUs which may support
-// larger DMA addresses. Limiting these GPUs to smaller address widths won't
-// have any adverse affects, unless installed on systems which require larger
-// DMA addresses. These systems should be quite rare.
-const GPU_DMA_BITS: u32 = 47;
-
 pub(crate) type Bar0 = kernel::io::Mmio<BAR0_SIZE>;
 
 kernel::pci_device_table!(
@@ -88,11 +78,6 @@ fn probe<'bound>(
             pdev.enable_device_mem()?;
             pdev.set_master();
 
-            // SAFETY: No concurrent DMA allocations or mappings can be made because
-            // the device is still being probed and therefore isn't being used by
-            // other threads of execution.
-            unsafe { pdev.dma_set_mask_and_coherent(DmaMask::new::<GPU_DMA_BITS>())? };
-
             Ok(try_pin_init!(NovaCore {
                 bar: pdev.iomap_region_sized::<BAR0_SIZE>(0, c"nova-core/bar0")?,
                 // TODO: Use `&bar` self-referential pin-init syntax once available.
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index aed992488db3..38c75df77e16 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -2,6 +2,7 @@
 
 use kernel::{
     device,
+    dma::Device,
     fmt,
     io::Io,
     num::Bounded,
@@ -269,7 +270,7 @@ pub(crate) struct Gpu<'gpu> {
 
 impl<'gpu> Gpu<'gpu> {
     pub(crate) fn new(
-        pdev: &'gpu pci::Device<device::Bound>,
+        pdev: &'gpu pci::Device<device::Core<'_>>,
         bar: &'gpu Bar0,
     ) -> impl PinInit<Self, Error> + 'gpu {
         try_pin_init!(Self {
@@ -280,7 +281,14 @@ pub(crate) fn new(
 
             // We must wait for GFW_BOOT completion before doing any significant setup on the GPU.
             _: {
-                hal::gpu_hal(spec.chipset).wait_gfw_boot_completion(bar)
+                let hal = hal::gpu_hal(spec.chipset);
+                let dma_mask = hal.dma_mask();
+
+                // SAFETY: `Gpu` owns all DMA allocations for this device, and we are
+                // still constructing it, so no concurrent DMA allocations can exist.
+                unsafe { pdev.dma_set_mask_and_coherent(dma_mask)? };
+
+                hal.wait_gfw_boot_completion(bar)
                     .inspect_err(|_| dev_err!(pdev, "GFW boot did not complete\n"))?;
             },
 
diff --git a/drivers/gpu/nova-core/gpu/hal.rs b/drivers/gpu/nova-core/gpu/hal.rs
index 788de20ab5d3..0b636b713593 100644
--- a/drivers/gpu/nova-core/gpu/hal.rs
+++ b/drivers/gpu/nova-core/gpu/hal.rs
@@ -1,6 +1,9 @@
 // SPDX-License-Identifier: GPL-2.0
 
-use kernel::prelude::*;
+use kernel::{
+    dma::DmaMask,
+    prelude::*, //
+};
 
 use crate::{
     driver::Bar0,
@@ -16,6 +19,9 @@
 pub(crate) trait GpuHal {
     /// Waits for GFW_BOOT completion if required by this hardware family.
     fn wait_gfw_boot_completion(&self, bar: &Bar0) -> Result;
+
+    /// Returns the DMA mask for the current architecture.
+    fn dma_mask(&self) -> DmaMask;
 }
 
 pub(super) fn gpu_hal(chipset: Chipset) -> &'static dyn GpuHal {
diff --git a/drivers/gpu/nova-core/gpu/hal/gh100.rs b/drivers/gpu/nova-core/gpu/hal/gh100.rs
index 1ed5bccdda1d..41fbabb04ff8 100644
--- a/drivers/gpu/nova-core/gpu/hal/gh100.rs
+++ b/drivers/gpu/nova-core/gpu/hal/gh100.rs
@@ -1,6 +1,9 @@
 // SPDX-License-Identifier: GPL-2.0
 
-use kernel::prelude::*;
+use kernel::{
+    dma::DmaMask,
+    prelude::*, //
+};
 
 use crate::driver::Bar0;
 
@@ -12,6 +15,10 @@ impl GpuHal for Gh100 {
     fn wait_gfw_boot_completion(&self, _bar: &Bar0) -> Result {
         Ok(())
     }
+
+    fn dma_mask(&self) -> DmaMask {
+        DmaMask::new::<52>()
+    }
 }
 
 const GH100: Gh100 = Gh100;
diff --git a/drivers/gpu/nova-core/gpu/hal/tu102.rs b/drivers/gpu/nova-core/gpu/hal/tu102.rs
index 08dd4434bd72..2881ab03dbcd 100644
--- a/drivers/gpu/nova-core/gpu/hal/tu102.rs
+++ b/drivers/gpu/nova-core/gpu/hal/tu102.rs
@@ -19,6 +19,7 @@
 //! Note that the devinit sequence also needs to run during suspend/resume.
 
 use kernel::{
+    dma::DmaMask,
     io::{
         poll::read_poll_timeout,
         Io, //
@@ -80,6 +81,10 @@ fn wait_gfw_boot_completion(&self, bar: &Bar0) -> Result {
         )
         .map(|_| ())
     }
+
+    fn dma_mask(&self) -> DmaMask {
+        DmaMask::new::<47>()
+    }
 }
 
 const TU102: Tu102 = Tu102;
-- 
2.54.0



* Re: [PATCH v11 01/22] gpu: nova-core: set DMA mask width based on GPU architecture
  2026-05-30  3:09 ` [PATCH v11 01/22] gpu: nova-core: set DMA mask width based on GPU architecture John Hubbard
@ 2026-06-01  4:01   ` Eliot Courtney
  0 siblings, 0 replies; 56+ messages in thread
From: Eliot Courtney @ 2026-06-01  4:01 UTC
  To: John Hubbard, Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML

On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
> Replace the hardcoded 47-bit DMA mask with a GPU HAL method that
> provides the correct value for the architecture.
>
> Set the DMA mask in Gpu::new(). Gpu owns all DMA allocations for
> the device, so no concurrent allocations can exist while the
> constructor is still running.
>
> Acked-by: Danilo Krummrich <dakr@kernel.org>
> Reviewed-by: Gary Guo <gary@garyguo.net>
> Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---

Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>


* [PATCH v11 02/22] gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror
  2026-05-30  3:09 [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
  2026-05-30  3:09 ` [PATCH v11 01/22] gpu: nova-core: set DMA mask width based on GPU architecture John Hubbard
@ 2026-05-30  3:09 ` John Hubbard
  2026-06-01  4:04   ` Eliot Courtney
  2026-05-30  3:09 ` [PATCH v11 03/22] gpu: nova-core: Blackwell: compute PMU-reserved framebuffer size John Hubbard
                   ` (20 subsequent siblings)
  22 siblings, 1 reply; 56+ messages in thread
From: John Hubbard @ 2026-05-30  3:09 UTC
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Hopper and Blackwell GPUs moved the PCI config space mirror from
0x088000 to 0x092000. Select the correct address per architecture
when building the GSP system info command.

Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/gpu.rs             |  7 +++++++
 drivers/gpu/nova-core/gpu/hal.rs         |  5 +++++
 drivers/gpu/nova-core/gpu/hal/gh100.rs   |  9 +++++++++
 drivers/gpu/nova-core/gpu/hal/tu102.rs   |  9 +++++++++
 drivers/gpu/nova-core/gsp/boot.rs        |  2 +-
 drivers/gpu/nova-core/gsp/commands.rs    |  8 +++++---
 drivers/gpu/nova-core/gsp/fw/commands.rs | 15 +++++++++++----
 7 files changed, 47 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 38c75df77e16..7dd736e5b190 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -1,5 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 
+use core::ops::Range;
+
 use kernel::{
     device,
     dma::Device,
@@ -134,6 +136,11 @@ pub(crate) const fn arch(self) -> Architecture {
     pub(crate) const fn needs_fwsec_bootloader(self) -> bool {
         matches!(self.arch(), Architecture::Turing) || matches!(self, Self::GA100)
     }
+
+    /// Returns the address range of the PCI config mirror space.
+    pub(crate) fn pci_config_mirror_range(self) -> Range<u32> {
+        hal::gpu_hal(self).pci_config_mirror_range()
+    }
 }
 
 // TODO
diff --git a/drivers/gpu/nova-core/gpu/hal.rs b/drivers/gpu/nova-core/gpu/hal.rs
index 0b636b713593..cd833bd49b9b 100644
--- a/drivers/gpu/nova-core/gpu/hal.rs
+++ b/drivers/gpu/nova-core/gpu/hal.rs
@@ -1,5 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 
+use core::ops::Range;
+
 use kernel::{
     dma::DmaMask,
     prelude::*, //
@@ -22,6 +24,9 @@ pub(crate) trait GpuHal {
 
     /// Returns the DMA mask for the current architecture.
     fn dma_mask(&self) -> DmaMask;
+
+    /// Returns the address range of the PCI config mirror space.
+    fn pci_config_mirror_range(&self) -> Range<u32>;
 }
 
 pub(super) fn gpu_hal(chipset: Chipset) -> &'static dyn GpuHal {
diff --git a/drivers/gpu/nova-core/gpu/hal/gh100.rs b/drivers/gpu/nova-core/gpu/hal/gh100.rs
index 41fbabb04ff8..17778a618900 100644
--- a/drivers/gpu/nova-core/gpu/hal/gh100.rs
+++ b/drivers/gpu/nova-core/gpu/hal/gh100.rs
@@ -1,5 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 
+use core::ops::Range;
+
 use kernel::{
     dma::DmaMask,
     prelude::*, //
@@ -19,6 +21,13 @@ fn wait_gfw_boot_completion(&self, _bar: &Bar0) -> Result {
     fn dma_mask(&self) -> DmaMask {
         DmaMask::new::<52>()
     }
+
+    fn pci_config_mirror_range(&self) -> Range<u32> {
+        const PCI_CONFIG_MIRROR_START: u32 = 0x092000;
+        const PCI_CONFIG_MIRROR_SIZE: u32 = 0x001000;
+
+        PCI_CONFIG_MIRROR_START..PCI_CONFIG_MIRROR_START + PCI_CONFIG_MIRROR_SIZE
+    }
 }
 
 const GH100: Gh100 = Gh100;
diff --git a/drivers/gpu/nova-core/gpu/hal/tu102.rs b/drivers/gpu/nova-core/gpu/hal/tu102.rs
index 2881ab03dbcd..125478bfe07a 100644
--- a/drivers/gpu/nova-core/gpu/hal/tu102.rs
+++ b/drivers/gpu/nova-core/gpu/hal/tu102.rs
@@ -18,6 +18,8 @@
 //!
 //! Note that the devinit sequence also needs to run during suspend/resume.
 
+use core::ops::Range;
+
 use kernel::{
     dma::DmaMask,
     io::{
@@ -85,6 +87,13 @@ fn wait_gfw_boot_completion(&self, bar: &Bar0) -> Result {
     fn dma_mask(&self) -> DmaMask {
         DmaMask::new::<47>()
     }
+
+    fn pci_config_mirror_range(&self) -> Range<u32> {
+        const PCI_CONFIG_MIRROR_START: u32 = 0x088000;
+        const PCI_CONFIG_MIRROR_SIZE: u32 = 0x001000;
+
+        PCI_CONFIG_MIRROR_START..PCI_CONFIG_MIRROR_START + PCI_CONFIG_MIRROR_SIZE
+    }
 }
 
 const TU102: Tu102 = Tu102;
diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
index 087ee59da6d9..8c316fa2e585 100644
--- a/drivers/gpu/nova-core/gsp/boot.rs
+++ b/drivers/gpu/nova-core/gsp/boot.rs
@@ -144,7 +144,7 @@ pub(crate) fn boot(
         dev_dbg!(pdev, "RISC-V active? {}\n", gsp_falcon.is_riscv_active(bar),);
 
         self.cmdq
-            .send_command_no_wait(bar, commands::SetSystemInfo::new(pdev))?;
+            .send_command_no_wait(bar, commands::SetSystemInfo::new(pdev, chipset))?;
         self.cmdq
             .send_command_no_wait(bar, commands::SetRegistry::new())?;
 
diff --git a/drivers/gpu/nova-core/gsp/commands.rs b/drivers/gpu/nova-core/gsp/commands.rs
index 3a365455d10c..f84de9f4f045 100644
--- a/drivers/gpu/nova-core/gsp/commands.rs
+++ b/drivers/gpu/nova-core/gsp/commands.rs
@@ -19,6 +19,7 @@
 };
 
 use crate::{
+    gpu::Chipset,
     gsp::{
         cmdq::{
             Cmdq,
@@ -37,12 +38,13 @@
 /// The `GspSetSystemInfo` command.
 pub(crate) struct SetSystemInfo<'a> {
     pdev: &'a pci::Device<device::Bound>,
+    chipset: Chipset,
 }
 
 impl<'a> SetSystemInfo<'a> {
     /// Creates a new `GspSetSystemInfo` command using the parameters of `pdev`.
-    pub(crate) fn new(pdev: &'a pci::Device<device::Bound>) -> Self {
-        Self { pdev }
+    pub(crate) fn new(pdev: &'a pci::Device<device::Bound>, chipset: Chipset) -> Self {
+        Self { pdev, chipset }
     }
 }
 
@@ -53,7 +55,7 @@ impl<'a> CommandToGsp for SetSystemInfo<'a> {
     type InitError = Error;
 
     fn init(&self) -> impl Init<Self::Command, Self::InitError> {
-        Self::Command::init(self.pdev)
+        Self::Command::init(self.pdev, self.chipset)
     }
 }
 
diff --git a/drivers/gpu/nova-core/gsp/fw/commands.rs b/drivers/gpu/nova-core/gsp/fw/commands.rs
index 42985d446bae..7bcc41fc7fa0 100644
--- a/drivers/gpu/nova-core/gsp/fw/commands.rs
+++ b/drivers/gpu/nova-core/gsp/fw/commands.rs
@@ -11,7 +11,10 @@
     }, //
 };
 
-use crate::gsp::GSP_PAGE_SIZE;
+use crate::{
+    gpu::Chipset,
+    gsp::GSP_PAGE_SIZE, //
+};
 
 use super::bindings;
 
@@ -25,8 +28,12 @@ pub(crate) struct GspSetSystemInfo {
 impl GspSetSystemInfo {
     /// Returns an in-place initializer for the `GspSetSystemInfo` command.
     #[allow(non_snake_case)]
-    pub(crate) fn init<'a>(dev: &'a pci::Device<device::Bound>) -> impl Init<Self, Error> + 'a {
+    pub(crate) fn init<'a>(
+        dev: &'a pci::Device<device::Bound>,
+        chipset: Chipset,
+    ) -> impl Init<Self, Error> + 'a {
         type InnerGspSystemInfo = bindings::GspSystemInfo;
+        let pci_config_mirror_range = chipset.pci_config_mirror_range();
         let init_inner = try_init!(InnerGspSystemInfo {
             gpuPhysAddr: dev.resource_start(0)?,
             gpuPhysFbAddr: dev.resource_start(1)?,
@@ -36,8 +43,8 @@ pub(crate) fn init<'a>(dev: &'a pci::Device<device::Bound>) -> impl Init<Self, E
             // Using TASK_SIZE in r535_gsp_rpc_set_system_info() seems wrong because
             // TASK_SIZE is per-task. That's probably a design issue in GSP-RM though.
             maxUserVa: (1 << 47) - 4096,
-            pciConfigMirrorBase: 0x088000,
-            pciConfigMirrorSize: 0x001000,
+            pciConfigMirrorBase: pci_config_mirror_range.start,
+            pciConfigMirrorSize: pci_config_mirror_range.end - pci_config_mirror_range.start,
 
             PCIDeviceID: (u32::from(dev.device_id()) << 16) | u32::from(dev.vendor_id().as_raw()),
             PCISubDeviceID: (u32::from(dev.subsystem_device_id()) << 16)
-- 
2.54.0



* Re: [PATCH v11 02/22] gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror
  2026-05-30  3:09 ` [PATCH v11 02/22] gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror John Hubbard
@ 2026-06-01  4:04   ` Eliot Courtney
  0 siblings, 0 replies; 56+ messages in thread
From: Eliot Courtney @ 2026-06-01  4:04 UTC (permalink / raw)
  To: John Hubbard, Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML

On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
> Hopper and Blackwell GPUs moved the PCI config space mirror from
> 0x088000 to 0x092000. Select the correct address per architecture
> when building the GSP system info command.
>
> Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---

Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
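
For readers skimming the thread, the per-architecture selection described
in the commit message can be sketched in standalone Rust as below. This is
only an illustration: the `Architecture` enum is a reduced stand-in, the
free function `pci_config_mirror_range` is hypothetical (the patch calls a
method of that name on `Chipset`), and the sketch assumes the mirror size
stays at 0x1000 on Hopper/Blackwell, which the commit message does not
state.

```rust
use std::ops::Range;

/// Reduced stand-in for the driver's architecture enum (illustration only).
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Architecture {
    Turing,
    Ampere,
    Ada,
    Hopper,
    BlackwellGB10x,
    BlackwellGB20x,
}

/// Mirror size used by the pre-patch code (0x001000).
/// ASSUMPTION: the same size applies to Hopper/Blackwell.
const PCI_CONFIG_MIRROR_SIZE: u32 = 0x1000;

/// Hypothetical helper: returns the register range of the PCI config
/// space mirror for `arch`, as a half-open `start..end` range.
fn pci_config_mirror_range(arch: Architecture) -> Range<u32> {
    let base = match arch {
        // Pre-Hopper location, as hard-coded before the patch.
        Architecture::Turing | Architecture::Ampere | Architecture::Ada => 0x088000,
        // New location for Hopper and Blackwell, per the commit message.
        Architecture::Hopper
        | Architecture::BlackwellGB10x
        | Architecture::BlackwellGB20x => 0x092000,
    };
    base..base + PCI_CONFIG_MIRROR_SIZE
}
```

With a range in hand, the two GSP fields follow the same shape as in the
patch: the base is `range.start` and the size is `range.end - range.start`.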


* [PATCH v11 03/22] gpu: nova-core: Blackwell: compute PMU-reserved framebuffer size
  2026-05-30  3:09 [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
  2026-05-30  3:09 ` [PATCH v11 01/22] gpu: nova-core: set DMA mask width based on GPU architecture John Hubbard
  2026-05-30  3:09 ` [PATCH v11 02/22] gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror John Hubbard
@ 2026-05-30  3:09 ` John Hubbard
  2026-06-01  2:07   ` Alexandre Courbot
  2026-06-01  4:41   ` Eliot Courtney
  2026-05-30  3:09 ` [PATCH v11 04/22] gpu: nova-core: Hopper/Blackwell: larger non-WPR heap John Hubbard
                   ` (19 subsequent siblings)
  22 siblings, 2 replies; 56+ messages in thread
From: John Hubbard @ 2026-05-30  3:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

GSP boot needs to know how much framebuffer memory is reserved for
the PMU. Compute it per architecture: Blackwell dGPUs reserve a
non-zero amount, while earlier architectures leave it at zero,
matching Open RM behavior.

Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
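Note for readers: the Blackwell value in gb100.rs below is computed as
align_up(8 MiB + 16 MiB + 4 KiB, 128 KiB). The arithmetic can be checked
with the standalone sketch that follows. The constants and the `align_up`
helper here are plain stand-ins written for this note, not the kernel's
`kernel::sizes` or `const_align_up` APIs.

```rust
// Plain stand-ins for the kernel size constants (values in bytes).
const SZ_4K: usize = 4 << 10;
const SZ_128K: usize = 128 << 10;
const SZ_8M: usize = 8 << 20;
const SZ_16M: usize = 16 << 20;

/// Rounds `value` up to the next multiple of `align`.
/// `align` must be a power of two; this is checked at compile time when
/// the function is evaluated in a const context.
const fn align_up(value: usize, align: usize) -> usize {
    assert!(align.is_power_of_two());
    (value + align - 1) & !(align - 1)
}

/// 8 MiB + 16 MiB + 4 KiB, rounded up to a 128 KiB boundary.
const PMU_RESERVED_SIZE_GB100: usize = align_up(SZ_8M + SZ_16M + SZ_4K, SZ_128K);
```

The unaligned sum is 24 MiB + 4 KiB (0x1801000). 24 MiB is already a
multiple of 128 KiB, so the extra 4 KiB pushes the result to the next
128 KiB boundary: 24 MiB + 128 KiB, or 0x1820000 bytes.
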
 drivers/gpu/nova-core/fb.rs           |  3 ++
 drivers/gpu/nova-core/fb/hal.rs       | 14 ++++---
 drivers/gpu/nova-core/fb/hal/ga100.rs |  5 +++
 drivers/gpu/nova-core/fb/hal/ga102.rs |  7 +++-
 drivers/gpu/nova-core/fb/hal/gb100.rs | 57 +++++++++++++++++++++++++++
 drivers/gpu/nova-core/fb/hal/gh100.rs | 42 ++++++++++++++++++++
 drivers/gpu/nova-core/fb/hal/tu102.rs |  9 +++++
 drivers/gpu/nova-core/gsp/fw.rs       |  1 +
 8 files changed, 132 insertions(+), 6 deletions(-)
 create mode 100644 drivers/gpu/nova-core/fb/hal/gb100.rs
 create mode 100644 drivers/gpu/nova-core/fb/hal/gh100.rs

diff --git a/drivers/gpu/nova-core/fb.rs b/drivers/gpu/nova-core/fb.rs
index 1fb65d4eb290..d7a4dc944131 100644
--- a/drivers/gpu/nova-core/fb.rs
+++ b/drivers/gpu/nova-core/fb.rs
@@ -165,6 +165,8 @@ pub(crate) struct FbLayout {
     pub(crate) wpr2: FbRange,
     pub(crate) heap: FbRange,
     pub(crate) vf_partition_count: u8,
+    /// PMU reserved memory size, in bytes.
+    pub(crate) pmu_reserved_size: u32,
 }
 
 impl FbLayout {
@@ -265,6 +267,7 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<
             wpr2,
             heap,
             vf_partition_count: 0,
+            pmu_reserved_size: hal.pmu_reserved_size(),
         })
     }
 }
diff --git a/drivers/gpu/nova-core/fb/hal.rs b/drivers/gpu/nova-core/fb/hal.rs
index 8b192a503363..e6ac55bba9b9 100644
--- a/drivers/gpu/nova-core/fb/hal.rs
+++ b/drivers/gpu/nova-core/fb/hal.rs
@@ -1,4 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
+// SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 
 use kernel::prelude::*;
 
@@ -12,6 +13,8 @@
 
 mod ga100;
 mod ga102;
+mod gb100;
+mod gh100;
 mod tu102;
 
 pub(crate) trait FbHal {
@@ -29,6 +32,9 @@ pub(crate) trait FbHal {
     /// Returns the VRAM size, in bytes.
     fn vidmem_size(&self, bar: &Bar0) -> u64;
 
+    /// Returns the amount of VRAM to reserve for the PMU.
+    fn pmu_reserved_size(&self) -> u32;
+
     /// Returns the FRTS size, in bytes.
     fn frts_size(&self) -> u64;
 }
@@ -38,10 +44,8 @@ pub(super) fn fb_hal(chipset: Chipset) -> &'static dyn FbHal {
     match chipset.arch() {
         Architecture::Turing => tu102::TU102_HAL,
         Architecture::Ampere if chipset == Chipset::GA100 => ga100::GA100_HAL,
-        Architecture::Ampere => ga102::GA102_HAL,
-        Architecture::Ada
-        | Architecture::Hopper
-        | Architecture::BlackwellGB10x
-        | Architecture::BlackwellGB20x => ga102::GA102_HAL,
+        Architecture::Ampere | Architecture::Ada => ga102::GA102_HAL,
+        Architecture::Hopper => gh100::GH100_HAL,
+        Architecture::BlackwellGB10x | Architecture::BlackwellGB20x => gb100::GB100_HAL,
     }
 }
diff --git a/drivers/gpu/nova-core/fb/hal/ga100.rs b/drivers/gpu/nova-core/fb/hal/ga100.rs
index 2f5871d915c3..0f5132aa9c31 100644
--- a/drivers/gpu/nova-core/fb/hal/ga100.rs
+++ b/drivers/gpu/nova-core/fb/hal/ga100.rs
@@ -1,4 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
+// SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 
 use kernel::{
     io::Io,
@@ -67,6 +68,10 @@ fn vidmem_size(&self, bar: &Bar0) -> u64 {
         super::tu102::vidmem_size_gp102(bar)
     }
 
+    fn pmu_reserved_size(&self) -> u32 {
+        super::tu102::pmu_reserved_size_tu102()
+    }
+
     // GA100 is a special case where its FRTS region exists, but is empty.  We
     // return a size of 0 because we still need to record where the region starts.
     fn frts_size(&self) -> u64 {
diff --git a/drivers/gpu/nova-core/fb/hal/ga102.rs b/drivers/gpu/nova-core/fb/hal/ga102.rs
index 3bb66f64bef7..17a2fef1ad44 100644
--- a/drivers/gpu/nova-core/fb/hal/ga102.rs
+++ b/drivers/gpu/nova-core/fb/hal/ga102.rs
@@ -1,4 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
+// SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 
 use kernel::{
     io::Io,
@@ -11,7 +12,7 @@
     regs, //
 };
 
-fn vidmem_size_ga102(bar: &Bar0) -> u64 {
+pub(super) fn vidmem_size_ga102(bar: &Bar0) -> u64 {
     bar.read(regs::NV_USABLE_FB_SIZE_IN_MB).usable_fb_size()
 }
 
@@ -36,6 +37,10 @@ fn vidmem_size(&self, bar: &Bar0) -> u64 {
         vidmem_size_ga102(bar)
     }
 
+    fn pmu_reserved_size(&self) -> u32 {
+        super::tu102::pmu_reserved_size_tu102()
+    }
+
     fn frts_size(&self) -> u64 {
         super::tu102::frts_size_tu102()
     }
diff --git a/drivers/gpu/nova-core/fb/hal/gb100.rs b/drivers/gpu/nova-core/fb/hal/gb100.rs
new file mode 100644
index 000000000000..c78027c26a9e
--- /dev/null
+++ b/drivers/gpu/nova-core/fb/hal/gb100.rs
@@ -0,0 +1,57 @@
+// SPDX-License-Identifier: GPL-2.0
+// SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+
+//! Blackwell framebuffer HAL.
+
+use kernel::{
+    prelude::*,
+    ptr::{
+        const_align_up,
+        Alignment, //
+    },
+    sizes::*, //
+};
+
+use crate::{
+    driver::Bar0,
+    fb::hal::FbHal,
+    num::usize_into_u32, //
+};
+
+struct Gb100;
+
+const fn pmu_reserved_size_gb100() -> u32 {
+    usize_into_u32::<{ const_align_up(SZ_8M + SZ_16M + SZ_4K, Alignment::new::<SZ_128K>()).unwrap() }>(
+    )
+}
+
+impl FbHal for Gb100 {
+    fn read_sysmem_flush_page(&self, bar: &Bar0) -> u64 {
+        super::ga100::read_sysmem_flush_page_ga100(bar)
+    }
+
+    fn write_sysmem_flush_page(&self, bar: &Bar0, addr: u64) -> Result {
+        super::ga100::write_sysmem_flush_page_ga100(bar, addr);
+
+        Ok(())
+    }
+
+    fn supports_display(&self, bar: &Bar0) -> bool {
+        super::ga100::display_enabled_ga100(bar)
+    }
+
+    fn vidmem_size(&self, bar: &Bar0) -> u64 {
+        super::ga102::vidmem_size_ga102(bar)
+    }
+
+    fn pmu_reserved_size(&self) -> u32 {
+        pmu_reserved_size_gb100()
+    }
+
+    fn frts_size(&self) -> u64 {
+        super::tu102::frts_size_tu102()
+    }
+}
+
+const GB100: Gb100 = Gb100;
+pub(super) const GB100_HAL: &dyn FbHal = &GB100;
diff --git a/drivers/gpu/nova-core/fb/hal/gh100.rs b/drivers/gpu/nova-core/fb/hal/gh100.rs
new file mode 100644
index 000000000000..c122ac2091f8
--- /dev/null
+++ b/drivers/gpu/nova-core/fb/hal/gh100.rs
@@ -0,0 +1,42 @@
+// SPDX-License-Identifier: GPL-2.0
+// SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+
+use kernel::prelude::*;
+
+use crate::{
+    driver::Bar0,
+    fb::hal::FbHal, //
+};
+
+struct Gh100;
+
+impl FbHal for Gh100 {
+    fn read_sysmem_flush_page(&self, bar: &Bar0) -> u64 {
+        super::ga100::read_sysmem_flush_page_ga100(bar)
+    }
+
+    fn write_sysmem_flush_page(&self, bar: &Bar0, addr: u64) -> Result {
+        super::ga100::write_sysmem_flush_page_ga100(bar, addr);
+
+        Ok(())
+    }
+
+    fn supports_display(&self, bar: &Bar0) -> bool {
+        super::ga100::display_enabled_ga100(bar)
+    }
+
+    fn vidmem_size(&self, bar: &Bar0) -> u64 {
+        super::ga102::vidmem_size_ga102(bar)
+    }
+
+    fn pmu_reserved_size(&self) -> u32 {
+        super::tu102::pmu_reserved_size_tu102()
+    }
+
+    fn frts_size(&self) -> u64 {
+        super::tu102::frts_size_tu102()
+    }
+}
+
+const GH100: Gh100 = Gh100;
+pub(super) const GH100_HAL: &dyn FbHal = &GH100;
diff --git a/drivers/gpu/nova-core/fb/hal/tu102.rs b/drivers/gpu/nova-core/fb/hal/tu102.rs
index 22c174bf1472..1755bbc27866 100644
--- a/drivers/gpu/nova-core/fb/hal/tu102.rs
+++ b/drivers/gpu/nova-core/fb/hal/tu102.rs
@@ -1,4 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
+// SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 
 use kernel::{
     io::Io,
@@ -39,6 +40,10 @@ pub(super) fn vidmem_size_gp102(bar: &Bar0) -> u64 {
         .usable_fb_size()
 }
 
+pub(super) const fn pmu_reserved_size_tu102() -> u32 {
+    0
+}
+
 pub(super) const fn frts_size_tu102() -> u64 {
     u64::SZ_1M
 }
@@ -62,6 +67,10 @@ fn vidmem_size(&self, bar: &Bar0) -> u64 {
         vidmem_size_gp102(bar)
     }
 
+    fn pmu_reserved_size(&self) -> u32 {
+        pmu_reserved_size_tu102()
+    }
+
     fn frts_size(&self) -> u64 {
         frts_size_tu102()
     }
diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index 33c9f5860771..919d3ab00075 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -247,6 +247,7 @@ pub(crate) fn new<'a>(
             fbSize: fb_layout.fb.end - fb_layout.fb.start,
             vgaWorkspaceOffset: fb_layout.vga_workspace.start,
             vgaWorkspaceSize: fb_layout.vga_workspace.end - fb_layout.vga_workspace.start,
+            pmuReservedSize: fb_layout.pmu_reserved_size,
             ..Zeroable::init_zeroed()
         });
 
-- 
2.54.0



* Re: [PATCH v11 03/22] gpu: nova-core: Blackwell: compute PMU-reserved framebuffer size
  2026-05-30  3:09 ` [PATCH v11 03/22] gpu: nova-core: Blackwell: compute PMU-reserved framebuffer size John Hubbard
@ 2026-06-01  2:07   ` Alexandre Courbot
  2026-06-01  5:34     ` Alexandre Courbot
  2026-06-01  4:41   ` Eliot Courtney
  1 sibling, 1 reply; 56+ messages in thread
From: Alexandre Courbot @ 2026-06-01  2:07 UTC (permalink / raw)
  To: John Hubbard
  Cc: Danilo Krummrich, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
<snip>
> diff --git a/drivers/gpu/nova-core/fb/hal/gh100.rs b/drivers/gpu/nova-core/fb/hal/gh100.rs
> new file mode 100644
> index 000000000000..c122ac2091f8
> --- /dev/null
> +++ b/drivers/gpu/nova-core/fb/hal/gh100.rs
> @@ -0,0 +1,42 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
> +
> +use kernel::prelude::*;
> +
> +use crate::{
> +    driver::Bar0,
> +    fb::hal::FbHal, //
> +};
> +
> +struct Gh100;
> +
> +impl FbHal for Gh100 {
> +    fn read_sysmem_flush_page(&self, bar: &Bar0) -> u64 {
> +        super::ga100::read_sysmem_flush_page_ga100(bar)
> +    }
> +
> +    fn write_sysmem_flush_page(&self, bar: &Bar0, addr: u64) -> Result {
> +        super::ga100::write_sysmem_flush_page_ga100(bar, addr);
> +
> +        Ok(())
> +    }
> +
> +    fn supports_display(&self, bar: &Bar0) -> bool {
> +        super::ga100::display_enabled_ga100(bar)
> +    }
> +
> +    fn vidmem_size(&self, bar: &Bar0) -> u64 {
> +        super::ga102::vidmem_size_ga102(bar)
> +    }
> +
> +    fn pmu_reserved_size(&self) -> u32 {
> +        super::tu102::pmu_reserved_size_tu102()
> +    }
> +
> +    fn frts_size(&self) -> u64 {
> +        super::tu102::frts_size_tu102()
> +    }
> +}

As of this patch, this HAL is strictly equivalent to GA102's; the first
difference between the two is introduced by the "larger non-WPR heap"
patch. Could you move the introduction of this HAL to that patch as
well?


* Re: [PATCH v11 03/22] gpu: nova-core: Blackwell: compute PMU-reserved framebuffer size
  2026-06-01  2:07   ` Alexandre Courbot
@ 2026-06-01  5:34     ` Alexandre Courbot
  2026-06-01 18:01       ` John Hubbard
  0 siblings, 1 reply; 56+ messages in thread
From: Alexandre Courbot @ 2026-06-01  5:34 UTC (permalink / raw)
  To: John Hubbard
  Cc: Danilo Krummrich, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On Mon Jun 1, 2026 at 11:07 AM JST, Alexandre Courbot wrote:
> On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
> <snip>
>> diff --git a/drivers/gpu/nova-core/fb/hal/gh100.rs b/drivers/gpu/nova-core/fb/hal/gh100.rs
>> new file mode 100644
>> index 000000000000..c122ac2091f8
>> --- /dev/null
>> +++ b/drivers/gpu/nova-core/fb/hal/gh100.rs
>> @@ -0,0 +1,42 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +// SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
>> +
>> +use kernel::prelude::*;
>> +
>> +use crate::{
>> +    driver::Bar0,
>> +    fb::hal::FbHal, //
>> +};
>> +
>> +struct Gh100;
>> +
>> +impl FbHal for Gh100 {
>> +    fn read_sysmem_flush_page(&self, bar: &Bar0) -> u64 {
>> +        super::ga100::read_sysmem_flush_page_ga100(bar)
>> +    }
>> +
>> +    fn write_sysmem_flush_page(&self, bar: &Bar0, addr: u64) -> Result {
>> +        super::ga100::write_sysmem_flush_page_ga100(bar, addr);
>> +
>> +        Ok(())
>> +    }
>> +
>> +    fn supports_display(&self, bar: &Bar0) -> bool {
>> +        super::ga100::display_enabled_ga100(bar)
>> +    }
>> +
>> +    fn vidmem_size(&self, bar: &Bar0) -> u64 {
>> +        super::ga102::vidmem_size_ga102(bar)
>> +    }
>> +
>> +    fn pmu_reserved_size(&self) -> u32 {
>> +        super::tu102::pmu_reserved_size_tu102()
>> +    }
>> +
>> +    fn frts_size(&self) -> u64 {
>> +        super::tu102::frts_size_tu102()
>> +    }
>> +}
>
> As of this patch, this HAL is strictly equivalent to GA102's; the first
> difference between the two is introduced by the "larger non-WPR heap"
> patch. Could you move the introduction of this HAL to that patch as
> well?

Actually, please let me know if you would prefer that I fold in this
minor change and merge the result (same for the next patch): we already
have another review, and there is little point in respinning just for
that. I just want to confirm that this wasn't done on purpose for a
reason that escaped me.


* Re: [PATCH v11 03/22] gpu: nova-core: Blackwell: compute PMU-reserved framebuffer size
  2026-06-01  5:34     ` Alexandre Courbot
@ 2026-06-01 18:01       ` John Hubbard
  0 siblings, 0 replies; 56+ messages in thread
From: John Hubbard @ 2026-06-01 18:01 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Danilo Krummrich, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On 5/31/26 10:34 PM, Alexandre Courbot wrote:
> On Mon Jun 1, 2026 at 11:07 AM JST, Alexandre Courbot wrote:
>> On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
>> <snip>
>>> diff --git a/drivers/gpu/nova-core/fb/hal/gh100.rs b/drivers/gpu/nova-core/fb/hal/gh100.rs
>>> new file mode 100644
>>> index 000000000000..c122ac2091f8
>>> --- /dev/null
>>> +++ b/drivers/gpu/nova-core/fb/hal/gh100.rs
>>> @@ -0,0 +1,42 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +// SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
>>> +
>>> +use kernel::prelude::*;
>>> +
>>> +use crate::{
>>> +    driver::Bar0,
>>> +    fb::hal::FbHal, //
>>> +};
>>> +
>>> +struct Gh100;
>>> +
>>> +impl FbHal for Gh100 {
>>> +    fn read_sysmem_flush_page(&self, bar: &Bar0) -> u64 {
>>> +        super::ga100::read_sysmem_flush_page_ga100(bar)
>>> +    }
>>> +
>>> +    fn write_sysmem_flush_page(&self, bar: &Bar0, addr: u64) -> Result {
>>> +        super::ga100::write_sysmem_flush_page_ga100(bar, addr);
>>> +
>>> +        Ok(())
>>> +    }
>>> +
>>> +    fn supports_display(&self, bar: &Bar0) -> bool {
>>> +        super::ga100::display_enabled_ga100(bar)
>>> +    }
>>> +
>>> +    fn vidmem_size(&self, bar: &Bar0) -> u64 {
>>> +        super::ga102::vidmem_size_ga102(bar)
>>> +    }
>>> +
>>> +    fn pmu_reserved_size(&self) -> u32 {
>>> +        super::tu102::pmu_reserved_size_tu102()
>>> +    }
>>> +
>>> +    fn frts_size(&self) -> u64 {
>>> +        super::tu102::frts_size_tu102()
>>> +    }
>>> +}
>>
>> As of this patch, this HAL is strictly equivalent to GA102's; the first
>> difference between the two is introduced by the "larger non-WPR heap"
>> patch. Could you move the introduction of this HAL to that patch as
>> well?
> 
> Actually, please let me know if you would prefer that I fold in this
> minor change and merge the result (same for the next patch): we already
> have another review, and there is little point in respinning just for
> that. I just want to confirm that this wasn't done on purpose for a
> reason that escaped me.

Sure, that's fine. There's no particular reason, other than lots
of patch churn over the months.

thanks,
-- 
John Hubbard



* Re: [PATCH v11 03/22] gpu: nova-core: Blackwell: compute PMU-reserved framebuffer size
  2026-05-30  3:09 ` [PATCH v11 03/22] gpu: nova-core: Blackwell: compute PMU-reserved framebuffer size John Hubbard
  2026-06-01  2:07   ` Alexandre Courbot
@ 2026-06-01  4:41   ` Eliot Courtney
  1 sibling, 0 replies; 56+ messages in thread
From: Eliot Courtney @ 2026-06-01  4:41 UTC (permalink / raw)
  To: John Hubbard, Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML

On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
> GSP boot needs to know how much framebuffer memory is reserved for
> the PMU. Compute it per architecture: Blackwell dGPUs reserve a
> non-zero amount, while earlier architectures leave it at zero,
> matching Open RM behavior.
>
> Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---

Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>

Moving the Gh100 HAL to a later patch where it is first needed, as Alex
suggests, sounds reasonable to me.


* [PATCH v11 04/22] gpu: nova-core: Hopper/Blackwell: larger non-WPR heap
  2026-05-30  3:09 [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (2 preceding siblings ...)
  2026-05-30  3:09 ` [PATCH v11 03/22] gpu: nova-core: Blackwell: compute PMU-reserved framebuffer size John Hubbard
@ 2026-05-30  3:09 ` John Hubbard
  2026-06-01  2:24   ` Alexandre Courbot
  2026-06-01  5:01   ` Eliot Courtney
  2026-05-30  3:09 ` [PATCH v11 05/22] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap John Hubbard
                   ` (18 subsequent siblings)
  22 siblings, 2 replies; 56+ messages in thread
From: John Hubbard @ 2026-05-30  3:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Hopper and Blackwell need a larger non-WPR heap than the 1 MiB that
earlier architectures use. Hopper and Blackwell GB10x need 2 MiB, while
Blackwell GB20x needs 2 MiB + 128 KiB. Because GB10x and GB20x diverge
here, give each Blackwell family its own framebuffer HAL and select the
non-WPR heap size per chipset family.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
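Note for readers: the sizes named above, and the way fb.rs places the
heap directly below WPR2, can be sketched in standalone Rust as follows.
The `Family` enum and both functions are hypothetical names for this
note. The sketch also uses `checked_sub` where the patch uses a plain
subtraction; that is a choice made here so the example cannot underflow,
not something the patch does.

```rust
use std::ops::Range;

// Plain stand-ins for the kernel size constants (values in bytes).
const SZ_128K: u32 = 128 << 10;
const SZ_1M: u32 = 1 << 20;
const SZ_2M: u32 = 2 << 20;

/// Hypothetical grouping of chipsets by non-WPR heap requirement.
#[allow(dead_code)]
#[derive(Clone, Copy)]
enum Family {
    /// Turing, Ampere and Ada.
    PreHopper,
    Hopper,
    BlackwellGB10x,
    BlackwellGB20x,
}

/// Non-WPR heap size in bytes, using the values from the commit message.
fn non_wpr_heap_size(family: Family) -> u32 {
    match family {
        Family::PreHopper => SZ_1M,
        Family::Hopper | Family::BlackwellGB10x => SZ_2M,
        Family::BlackwellGB20x => SZ_2M + SZ_128K,
    }
}

/// Places the heap so that it ends exactly where WPR2 starts.
/// Returns `None` if `wpr2_start` is too low to fit the heap below it.
fn heap_range(family: Family, wpr2_start: u64) -> Option<Range<u64>> {
    let size = u64::from(non_wpr_heap_size(family));
    let start = wpr2_start.checked_sub(size)?;
    Some(start..wpr2_start)
}
```

For example, with WPR2 starting at 0x1000_0000, a Hopper heap would span
0x0FE0_0000..0x1000_0000, and a GB20x heap would start 128 KiB lower.
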
 drivers/gpu/nova-core/fb.rs           |  5 ++-
 drivers/gpu/nova-core/fb/hal.rs       | 16 +++++++--
 drivers/gpu/nova-core/fb/hal/gb100.rs |  9 +++--
 drivers/gpu/nova-core/fb/hal/gb202.rs | 52 +++++++++++++++++++++++++++
 drivers/gpu/nova-core/fb/hal/gh100.rs | 10 +++++-
 5 files changed, 84 insertions(+), 8 deletions(-)
 create mode 100644 drivers/gpu/nova-core/fb/hal/gb202.rs

diff --git a/drivers/gpu/nova-core/fb.rs b/drivers/gpu/nova-core/fb.rs
index d7a4dc944131..0aaee718c2c3 100644
--- a/drivers/gpu/nova-core/fb.rs
+++ b/drivers/gpu/nova-core/fb.rs
@@ -252,9 +252,8 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<
         };
 
         let heap = {
-            const HEAP_SIZE: u64 = u64::SZ_1M;
-
-            FbRange(wpr2.start - HEAP_SIZE..wpr2.start)
+            let heap_size = u64::from(hal.non_wpr_heap_size());
+            FbRange(wpr2.start - heap_size..wpr2.start)
         };
 
         Ok(Self {
diff --git a/drivers/gpu/nova-core/fb/hal.rs b/drivers/gpu/nova-core/fb/hal.rs
index e6ac55bba9b9..acb934f9aa9f 100644
--- a/drivers/gpu/nova-core/fb/hal.rs
+++ b/drivers/gpu/nova-core/fb/hal.rs
@@ -1,7 +1,10 @@
 // SPDX-License-Identifier: GPL-2.0
 // SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 
-use kernel::prelude::*;
+use kernel::{
+    prelude::*,
+    sizes::SizeConstants, //
+};
 
 use crate::{
     driver::Bar0,
@@ -14,6 +17,7 @@
 mod ga100;
 mod ga102;
 mod gb100;
+mod gb202;
 mod gh100;
 mod tu102;
 
@@ -37,6 +41,13 @@ pub(crate) trait FbHal {
 
     /// Returns the FRTS size, in bytes.
     fn frts_size(&self) -> u64;
+
+    /// Returns the non-WPR heap size for this chipset, in bytes.
+    ///
+    /// Older architectures use 1 MiB. Hopper and Blackwell override this.
+    fn non_wpr_heap_size(&self) -> u32 {
+        u32::SZ_1M
+    }
 }
 
 /// Returns the HAL corresponding to `chipset`.
@@ -46,6 +57,7 @@ pub(super) fn fb_hal(chipset: Chipset) -> &'static dyn FbHal {
         Architecture::Ampere if chipset == Chipset::GA100 => ga100::GA100_HAL,
         Architecture::Ampere | Architecture::Ada => ga102::GA102_HAL,
         Architecture::Hopper => gh100::GH100_HAL,
-        Architecture::BlackwellGB10x | Architecture::BlackwellGB20x => gb100::GB100_HAL,
+        Architecture::BlackwellGB10x => gb100::GB100_HAL,
+        Architecture::BlackwellGB20x => gb202::GB202_HAL,
     }
 }
diff --git a/drivers/gpu/nova-core/fb/hal/gb100.rs b/drivers/gpu/nova-core/fb/hal/gb100.rs
index c78027c26a9e..8d63350abf8a 100644
--- a/drivers/gpu/nova-core/fb/hal/gb100.rs
+++ b/drivers/gpu/nova-core/fb/hal/gb100.rs
@@ -1,7 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 // SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 
-//! Blackwell framebuffer HAL.
+//! Blackwell GB10x framebuffer HAL.
 
 use kernel::{
     prelude::*,
@@ -20,7 +20,7 @@
 
 struct Gb100;
 
-const fn pmu_reserved_size_gb100() -> u32 {
+pub(super) const fn pmu_reserved_size_gb100() -> u32 {
     usize_into_u32::<{ const_align_up(SZ_8M + SZ_16M + SZ_4K, Alignment::new::<SZ_128K>()).unwrap() }>(
     )
 }
@@ -48,6 +48,11 @@ fn pmu_reserved_size(&self) -> u32 {
         pmu_reserved_size_gb100()
     }
 
+    fn non_wpr_heap_size(&self) -> u32 {
+        // Non-WPR heap for GB10x (see Open RM: kgspGetNonWprHeapSize, GB100/GB102).
+        u32::SZ_2M
+    }
+
     fn frts_size(&self) -> u64 {
         super::tu102::frts_size_tu102()
     }
diff --git a/drivers/gpu/nova-core/fb/hal/gb202.rs b/drivers/gpu/nova-core/fb/hal/gb202.rs
new file mode 100644
index 000000000000..542c1d7429e9
--- /dev/null
+++ b/drivers/gpu/nova-core/fb/hal/gb202.rs
@@ -0,0 +1,52 @@
+// SPDX-License-Identifier: GPL-2.0
+// SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+
+//! Blackwell GB20x framebuffer HAL.
+
+use kernel::{
+    prelude::*,
+    sizes::SizeConstants, //
+};
+
+use crate::{
+    driver::Bar0,
+    fb::hal::FbHal, //
+};
+
+struct Gb202;
+
+impl FbHal for Gb202 {
+    fn read_sysmem_flush_page(&self, bar: &Bar0) -> u64 {
+        super::ga100::read_sysmem_flush_page_ga100(bar)
+    }
+
+    fn write_sysmem_flush_page(&self, bar: &Bar0, addr: u64) -> Result {
+        super::ga100::write_sysmem_flush_page_ga100(bar, addr);
+
+        Ok(())
+    }
+
+    fn supports_display(&self, bar: &Bar0) -> bool {
+        super::ga100::display_enabled_ga100(bar)
+    }
+
+    fn vidmem_size(&self, bar: &Bar0) -> u64 {
+        super::ga102::vidmem_size_ga102(bar)
+    }
+
+    fn pmu_reserved_size(&self) -> u32 {
+        super::gb100::pmu_reserved_size_gb100()
+    }
+
+    fn non_wpr_heap_size(&self) -> u32 {
+        // Non-WPR heap for GB20x (see Open RM: kgspGetNonWprHeapSize, GB202+).
+        u32::SZ_2M + u32::SZ_128K
+    }
+
+    fn frts_size(&self) -> u64 {
+        super::tu102::frts_size_tu102()
+    }
+}
+
+const GB202: Gb202 = Gb202;
+pub(super) const GB202_HAL: &dyn FbHal = &GB202;
diff --git a/drivers/gpu/nova-core/fb/hal/gh100.rs b/drivers/gpu/nova-core/fb/hal/gh100.rs
index c122ac2091f8..8f79c72b1823 100644
--- a/drivers/gpu/nova-core/fb/hal/gh100.rs
+++ b/drivers/gpu/nova-core/fb/hal/gh100.rs
@@ -1,7 +1,10 @@
 // SPDX-License-Identifier: GPL-2.0
 // SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 
-use kernel::prelude::*;
+use kernel::{
+    prelude::*,
+    sizes::SizeConstants, //
+};
 
 use crate::{
     driver::Bar0,
@@ -33,6 +36,11 @@ fn pmu_reserved_size(&self) -> u32 {
         super::tu102::pmu_reserved_size_tu102()
     }
 
+    fn non_wpr_heap_size(&self) -> u32 {
+        // Non-WPR heap for Hopper (see Open RM: kgspCalculateFbLayout_GH100).
+        u32::SZ_2M
+    }
+
     fn frts_size(&self) -> u64 {
         super::tu102::frts_size_tu102()
     }
-- 
2.54.0



* Re: [PATCH v11 04/22] gpu: nova-core: Hopper/Blackwell: larger non-WPR heap
  2026-05-30  3:09 ` [PATCH v11 04/22] gpu: nova-core: Hopper/Blackwell: larger non-WPR heap John Hubbard
@ 2026-06-01  2:24   ` Alexandre Courbot
  2026-06-01 18:03     ` John Hubbard
  2026-06-01  5:01   ` Eliot Courtney
  1 sibling, 1 reply; 56+ messages in thread
From: Alexandre Courbot @ 2026-06-01  2:24 UTC (permalink / raw)
  To: John Hubbard
  Cc: Danilo Krummrich, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
> Hopper and Blackwell need a larger non-WPR heap than the 1 MiB that
> earlier architectures use. Hopper and Blackwell GB10x need 2 MiB, while
> Blackwell GB20x needs 2 MiB + 128 KiB. Because GB10x and GB20x diverge
> here, give each Blackwell family its own framebuffer HAL and select the
> non-WPR heap size per chipset family.
>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  drivers/gpu/nova-core/fb.rs           |  5 ++-
>  drivers/gpu/nova-core/fb/hal.rs       | 16 +++++++--
>  drivers/gpu/nova-core/fb/hal/gb100.rs |  9 +++--
>  drivers/gpu/nova-core/fb/hal/gb202.rs | 52 +++++++++++++++++++++++++++
>  drivers/gpu/nova-core/fb/hal/gh100.rs | 10 +++++-
>  5 files changed, 84 insertions(+), 8 deletions(-)
>  create mode 100644 drivers/gpu/nova-core/fb/hal/gb202.rs
>
> diff --git a/drivers/gpu/nova-core/fb.rs b/drivers/gpu/nova-core/fb.rs
> index d7a4dc944131..0aaee718c2c3 100644
> --- a/drivers/gpu/nova-core/fb.rs
> +++ b/drivers/gpu/nova-core/fb.rs
> @@ -252,9 +252,8 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<
>          };
>  
>          let heap = {
> -            const HEAP_SIZE: u64 = u64::SZ_1M;
> -
> -            FbRange(wpr2.start - HEAP_SIZE..wpr2.start)
> +            let heap_size = u64::from(hal.non_wpr_heap_size());
> +            FbRange(wpr2.start - heap_size..wpr2.start)
>          };
>  
>          Ok(Self {
> diff --git a/drivers/gpu/nova-core/fb/hal.rs b/drivers/gpu/nova-core/fb/hal.rs
> index e6ac55bba9b9..acb934f9aa9f 100644
> --- a/drivers/gpu/nova-core/fb/hal.rs
> +++ b/drivers/gpu/nova-core/fb/hal.rs
> @@ -1,7 +1,10 @@
>  // SPDX-License-Identifier: GPL-2.0
>  // SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
>  
> -use kernel::prelude::*;
> +use kernel::{
> +    prelude::*,
> +    sizes::SizeConstants, //
> +};
>  
>  use crate::{
>      driver::Bar0,
> @@ -14,6 +17,7 @@
>  mod ga100;
>  mod ga102;
>  mod gb100;
> +mod gb202;
>  mod gh100;
>  mod tu102;
>  
> @@ -37,6 +41,13 @@ pub(crate) trait FbHal {
>  
>      /// Returns the FRTS size, in bytes.
>      fn frts_size(&self) -> u64;
> +
> +    /// Returns the non-WPR heap size for this chipset, in bytes.
> +    ///
> +    /// Older architectures use 1 MiB. Hopper and Blackwell override this.
> +    fn non_wpr_heap_size(&self) -> u32 {
> +        u32::SZ_1M
> +    }

I'm not sure that there is a "default" value here - this carries the
risk that future implementations will forget to implement this method
and get the same value as Turing/Ampere. Could you instead use a
`non_wpr_heap_size_tu102` method that is called by ga100/ga102 as well?
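
A minimal sketch of what this suggestion could look like, assuming a
reduced `FbHal` trait and a local `SZ_1M` constant standing in for the
kernel's `u32::SZ_1M`. The struct names and the free helper are
illustrative, not the driver's actual code. The point is that the trait
method has no default body, so a new HAL cannot silently inherit the
Turing/Ampere value:

```rust
/// Stand-in for `u32::SZ_1M` (assumed here so the sketch builds outside the kernel tree).
const SZ_1M: u32 = 1 << 20;

/// Hypothetical shared helper, named after the oldest chipset that uses this value.
const fn non_wpr_heap_size_tu102() -> u32 {
    SZ_1M
}

/// Reduced version of the HAL trait: no default body, so every HAL must pick a value.
trait FbHal {
    /// Returns the non-WPR heap size for this chipset, in bytes.
    fn non_wpr_heap_size(&self) -> u32;
}

struct Tu102;
struct Ga102;
struct Gh100;

impl FbHal for Tu102 {
    fn non_wpr_heap_size(&self) -> u32 {
        non_wpr_heap_size_tu102()
    }
}

impl FbHal for Ga102 {
    fn non_wpr_heap_size(&self) -> u32 {
        // Ampere opts in to the Turing value explicitly instead of inheriting it.
        non_wpr_heap_size_tu102()
    }
}

impl FbHal for Gh100 {
    fn non_wpr_heap_size(&self) -> u32 {
        // 2 MiB on Hopper, per the commit message of this patch.
        2 * SZ_1M
    }
}
```

With this shape, leaving `non_wpr_heap_size` out of a new `impl FbHal`
block is a compile error rather than a silent 1 MiB.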

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v11 04/22] gpu: nova-core: Hopper/Blackwell: larger non-WPR heap
  2026-06-01  2:24   ` Alexandre Courbot
@ 2026-06-01 18:03     ` John Hubbard
  0 siblings, 0 replies; 56+ messages in thread
From: John Hubbard @ 2026-06-01 18:03 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Danilo Krummrich, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On 5/31/26 7:24 PM, Alexandre Courbot wrote:
> On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
...
>> @@ -37,6 +41,13 @@ pub(crate) trait FbHal {
>>  
>>      /// Returns the FRTS size, in bytes.
>>      fn frts_size(&self) -> u64;
>> +
>> +    /// Returns the non-WPR heap size for this chipset, in bytes.
>> +    ///
>> +    /// Older architectures use 1 MiB. Hopper and Blackwell override this.
>> +    fn non_wpr_heap_size(&self) -> u32 {
>> +        u32::SZ_1M
>> +    }
> 
> I'm not sure that there is a "default" value here - this carries the
> risk that future implementations will forget to implement this method
> and get the same value as Turing/Ampere. Could you instead use a
> `non_wpr_heap_size_tu102` method that is called by ga100/ga102 as well?

OK, will do that.


thanks,
-- 
John Hubbard


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v11 04/22] gpu: nova-core: Hopper/Blackwell: larger non-WPR heap
  2026-05-30  3:09 ` [PATCH v11 04/22] gpu: nova-core: Hopper/Blackwell: larger non-WPR heap John Hubbard
  2026-06-01  2:24   ` Alexandre Courbot
@ 2026-06-01  5:01   ` Eliot Courtney
  1 sibling, 0 replies; 56+ messages in thread
From: Eliot Courtney @ 2026-06-01  5:01 UTC (permalink / raw)
  To: John Hubbard, Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML

On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
> Hopper and Blackwell need a larger non-WPR heap than the 1 MiB that
> earlier architectures use. Hopper and Blackwell GB10x need 2 MiB, while
> Blackwell GB20x needs 2 MiB + 128 KiB. Because GB10x and GB20x diverge
> here, give each Blackwell family its own framebuffer HAL and select the
> non-WPR heap size per chipset family.
>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---

I agree with Alex that we should not default impl `non_wpr_heap_size`
on `FbHal`. Other than that,

Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
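
For reference, the sizes in the commit message and the heap placement
from the fb.rs hunk can be worked through as below. This is a sketch
under assumptions: the `Family` enum and both function names are
hypothetical, and the `checked_sub` is an addition for the example (the
patch itself uses a plain subtraction).

```rust
use std::ops::Range;

const SZ_128K: u64 = 128 << 10;
const SZ_1M: u64 = 1 << 20;

/// Chipset families as grouped in the commit message (illustrative, not the driver's enum).
#[derive(Clone, Copy)]
enum Family {
    Earlier,
    Hopper,
    BlackwellGb10x,
    BlackwellGb20x,
}

/// Non-WPR heap size in bytes for each family.
fn non_wpr_heap_size(family: Family) -> u64 {
    match family {
        Family::Earlier => SZ_1M,
        Family::Hopper | Family::BlackwellGb10x => 2 * SZ_1M,
        Family::BlackwellGb20x => 2 * SZ_1M + SZ_128K,
    }
}

/// The heap ends where WPR2 starts. Returns `None` if the subtraction would underflow.
fn heap_range(wpr2_start: u64, family: Family) -> Option<Range<u64>> {
    let start = wpr2_start.checked_sub(non_wpr_heap_size(family))?;
    Some(start..wpr2_start)
}
```

The GB20x value works out to 0x220000 bytes, which is why GB10x and
GB20x cannot share a single HAL value here.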

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [PATCH v11 05/22] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap
  2026-05-30  3:09 [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (3 preceding siblings ...)
  2026-05-30  3:09 ` [PATCH v11 04/22] gpu: nova-core: Hopper/Blackwell: larger non-WPR heap John Hubbard
@ 2026-05-30  3:09 ` John Hubbard
  2026-06-01  5:21   ` Eliot Courtney
  2026-05-30  3:09 ` [PATCH v11 06/22] gpu: nova-core: Blackwell: use correct sysmem flush registers John Hubbard
                   ` (17 subsequent siblings)
  22 siblings, 1 reply; 56+ messages in thread
From: John Hubbard @ 2026-05-30  3:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

The GSP-RM boot working memory portion of the WPR2 heap must be
larger on Hopper and later GPUs than on Turing, Ampere, and Ada.
Select the larger value for those generations.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/gsp/fw.rs               | 20 +++++++++++++------
 .../gpu/nova-core/gsp/fw/r570_144/bindings.rs |  1 +
 2 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index 919d3ab00075..0c54e8bf4bb3 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -1,4 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
+// SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 
 pub(crate) mod commands;
 mod r570_144;
@@ -29,7 +30,10 @@
 use crate::{
     fb::FbLayout,
     firmware::gsp::GspFirmware,
-    gpu::Chipset,
+    gpu::{
+        Architecture,
+        Chipset, //
+    },
     gsp::{
         cmdq::Cmdq, //
         GSP_PAGE_SIZE,
@@ -106,11 +110,15 @@ enum GspFwHeapParams {}
 impl GspFwHeapParams {
     /// Returns the amount of GSP-RM heap memory used during GSP-RM boot and initialization (up to
     /// and including the first client subdevice allocation).
-    fn base_rm_size(_chipset: Chipset) -> u64 {
-        // TODO: this needs to be updated to return the correct value for Hopper+ once support for
-        // them is added:
-        // u64::from(bindings::GSP_FW_HEAP_PARAM_BASE_RM_SIZE_GH100)
-        u64::from(bindings::GSP_FW_HEAP_PARAM_BASE_RM_SIZE_TU10X)
+    fn base_rm_size(chipset: Chipset) -> u64 {
+        match chipset.arch() {
+            Architecture::Turing | Architecture::Ampere | Architecture::Ada => {
+                u64::from(bindings::GSP_FW_HEAP_PARAM_BASE_RM_SIZE_TU10X)
+            }
+            Architecture::Hopper | Architecture::BlackwellGB10x | Architecture::BlackwellGB20x => {
+                u64::from(bindings::GSP_FW_HEAP_PARAM_BASE_RM_SIZE_GH100)
+            }
+        }
     }
 
     /// Returns the amount of heap memory required to support a single channel allocation.
diff --git a/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs b/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs
index f82ed097b283..1d592bd3f9ed 100644
--- a/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs
+++ b/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs
@@ -37,6 +37,7 @@ fn fmt(&self, fmt: &mut ::core::fmt::Formatter<'_>) -> ::core::fmt::Result {
 pub const GSP_FW_HEAP_PARAM_OS_SIZE_LIBOS2: u32 = 0;
 pub const GSP_FW_HEAP_PARAM_OS_SIZE_LIBOS3_BAREMETAL: u32 = 23068672;
 pub const GSP_FW_HEAP_PARAM_BASE_RM_SIZE_TU10X: u32 = 8388608;
+pub const GSP_FW_HEAP_PARAM_BASE_RM_SIZE_GH100: u32 = 14680064;
 pub const GSP_FW_HEAP_PARAM_SIZE_PER_GB_FB: u32 = 98304;
 pub const GSP_FW_HEAP_PARAM_CLIENT_ALLOC_SIZE: u32 = 100663296;
 pub const GSP_FW_HEAP_SIZE_OVERRIDE_LIBOS2_MIN_MB: u32 = 64;
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH v11 05/22] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap
  2026-05-30  3:09 ` [PATCH v11 05/22] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap John Hubbard
@ 2026-06-01  5:21   ` Eliot Courtney
  0 siblings, 0 replies; 56+ messages in thread
From: Eliot Courtney @ 2026-06-01  5:21 UTC (permalink / raw)
  To: John Hubbard, Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML

On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
> The GSP-RM boot working memory portion of the WPR2 heap must be
> larger on Hopper and later GPUs than on Turing, Ampere, and Ada.
> Select the larger value for those generations.
>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---

Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
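
As a worked example of the selection in this patch: the two binding
constants are 8 MiB and 14 MiB respectively, so Hopper and later get
6 MiB more boot working memory. The sketch below copies the constant
values from the bindings hunk; the `Architecture` enum is a local
stand-in for the driver's type, and `base_rm_size` takes it directly
rather than a `Chipset`.

```rust
// Values copied from the r570_144 bindings hunk in this patch.
const GSP_FW_HEAP_PARAM_BASE_RM_SIZE_TU10X: u32 = 8_388_608; // 8 MiB
const GSP_FW_HEAP_PARAM_BASE_RM_SIZE_GH100: u32 = 14_680_064; // 14 MiB

/// Local stand-in for the driver's `Architecture` enum.
#[derive(Clone, Copy)]
enum Architecture {
    Turing,
    Ampere,
    Ada,
    Hopper,
    BlackwellGB10x,
    BlackwellGB20x,
}

/// Heap memory used by GSP-RM during boot and initialization, in bytes.
fn base_rm_size(arch: Architecture) -> u64 {
    match arch {
        Architecture::Turing | Architecture::Ampere | Architecture::Ada => {
            u64::from(GSP_FW_HEAP_PARAM_BASE_RM_SIZE_TU10X)
        }
        Architecture::Hopper | Architecture::BlackwellGB10x | Architecture::BlackwellGB20x => {
            u64::from(GSP_FW_HEAP_PARAM_BASE_RM_SIZE_GH100)
        }
    }
}
```

Because the match is exhaustive with no wildcard arm, adding a new
architecture variant forces a decision at this site.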

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [PATCH v11 06/22] gpu: nova-core: Blackwell: use correct sysmem flush registers
  2026-05-30  3:09 [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (4 preceding siblings ...)
  2026-05-30  3:09 ` [PATCH v11 05/22] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap John Hubbard
@ 2026-05-30  3:09 ` John Hubbard
  2026-06-01  7:01   ` Alexandre Courbot
  2026-06-01  7:33   ` Eliot Courtney
  2026-05-30  3:09 ` [PATCH v11 07/22] gpu: nova-core: don't assume 64-bit firmware images John Hubbard
                   ` (16 subsequent siblings)
  22 siblings, 2 replies; 56+ messages in thread
From: John Hubbard @ 2026-05-30  3:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Blackwell GPUs moved the sysmem flush page registers away from the
Ampere/Ada location. GB10x routes the flush through a pair of HSHUB0
register sets (primary and egress) that must both be programmed to
the same address. GB20x routes it through FBHUB0.

Implement these paths in the GB10x and GB20x framebuffer HALs.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/fb/hal/gb100.rs | 46 +++++++++++++++++++++++++--
 drivers/gpu/nova-core/fb/hal/gb202.rs | 40 +++++++++++++++++++++--
 drivers/gpu/nova-core/regs.rs         | 37 +++++++++++++++++++++
 3 files changed, 117 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/nova-core/fb/hal/gb100.rs b/drivers/gpu/nova-core/fb/hal/gb100.rs
index 8d63350abf8a..70f4c11b1e77 100644
--- a/drivers/gpu/nova-core/fb/hal/gb100.rs
+++ b/drivers/gpu/nova-core/fb/hal/gb100.rs
@@ -4,6 +4,8 @@
 //! Blackwell GB10x framebuffer HAL.
 
 use kernel::{
+    io::Io,
+    num::Bounded,
     prelude::*,
     ptr::{
         const_align_up,
@@ -15,11 +17,45 @@
 use crate::{
     driver::Bar0,
     fb::hal::FbHal,
-    num::usize_into_u32, //
+    num::usize_into_u32,
+    regs, //
 };
 
 struct Gb100;
 
+fn read_sysmem_flush_page_gb100(bar: &Bar0) -> u64 {
+    let lo = u64::from(
+        bar.read(regs::NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO)
+            .adr(),
+    );
+    let hi = u64::from(
+        bar.read(regs::NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI)
+            .adr(),
+    );
+
+    lo | (hi << 32)
+}
+
+/// Write the sysmem flush page address through the GB10x HSHUB0 registers.
+///
+/// Both the primary and EG (egress) register pairs must be programmed to the same address,
+/// as required by hardware.
+fn write_sysmem_flush_page_gb100(bar: &Bar0, addr: Bounded<u64, 52>) {
+    // CAST: lower 32 bits. Hardware ignores bits 7:0.
+    let addr_lo = *addr as u32;
+    let addr_hi = addr.shr::<32, 20>().cast::<u32>();
+
+    // Write HI first. The hardware will trigger the flush on the LO write.
+
+    // Primary HSHUB pair.
+    bar.write_reg(regs::NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI::zeroed().with_adr(addr_hi));
+    bar.write_reg(regs::NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO::zeroed().with_adr(addr_lo));
+
+    // EG (egress) pair -- must match the primary pair.
+    bar.write_reg(regs::NV_PFB_HSHUB0_EG_PCIE_FLUSH_SYSMEM_ADDR_HI::zeroed().with_adr(addr_hi));
+    bar.write_reg(regs::NV_PFB_HSHUB0_EG_PCIE_FLUSH_SYSMEM_ADDR_LO::zeroed().with_adr(addr_lo));
+}
+
 pub(super) const fn pmu_reserved_size_gb100() -> u32 {
     usize_into_u32::<{ const_align_up(SZ_8M + SZ_16M + SZ_4K, Alignment::new::<SZ_128K>()).unwrap() }>(
     )
@@ -27,11 +63,15 @@ pub(super) const fn pmu_reserved_size_gb100() -> u32 {
 
 impl FbHal for Gb100 {
     fn read_sysmem_flush_page(&self, bar: &Bar0) -> u64 {
-        super::ga100::read_sysmem_flush_page_ga100(bar)
+        read_sysmem_flush_page_gb100(bar)
     }
 
     fn write_sysmem_flush_page(&self, bar: &Bar0, addr: u64) -> Result {
-        super::ga100::write_sysmem_flush_page_ga100(bar, addr);
+        let addr: Bounded<u64, 52> = Bounded::<u64, 64>::from(addr)
+            .try_shrink::<52>()
+            .ok_or(EINVAL)?;
+
+        write_sysmem_flush_page_gb100(bar, addr);
 
         Ok(())
     }
diff --git a/drivers/gpu/nova-core/fb/hal/gb202.rs b/drivers/gpu/nova-core/fb/hal/gb202.rs
index 542c1d7429e9..5a6b815eec3d 100644
--- a/drivers/gpu/nova-core/fb/hal/gb202.rs
+++ b/drivers/gpu/nova-core/fb/hal/gb202.rs
@@ -4,24 +4,58 @@
 //! Blackwell GB20x framebuffer HAL.
 
 use kernel::{
+    io::Io,
+    num::Bounded,
     prelude::*,
     sizes::SizeConstants, //
 };
 
 use crate::{
     driver::Bar0,
-    fb::hal::FbHal, //
+    fb::hal::FbHal,
+    regs, //
 };
 
 struct Gb202;
 
+fn read_sysmem_flush_page_gb202(bar: &Bar0) -> u64 {
+    let lo = u64::from(
+        bar.read(regs::NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO)
+            .adr(),
+    );
+    let hi = u64::from(
+        bar.read(regs::NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI)
+            .adr(),
+    );
+
+    lo | (hi << 32)
+}
+
+/// Write the sysmem flush page address through the GB20x FBHUB0 registers.
+fn write_sysmem_flush_page_gb202(bar: &Bar0, addr: Bounded<u64, 52>) {
+    // Write HI first. The hardware will trigger the flush on the LO write.
+    bar.write_reg(
+        regs::NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI::zeroed()
+            .with_adr(addr.shr::<32, 20>().cast::<u32>()),
+    );
+    bar.write_reg(
+        regs::NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO::zeroed()
+            // CAST: lower 32 bits. Hardware ignores bits 7:0.
+            .with_adr(*addr as u32),
+    );
+}
+
 impl FbHal for Gb202 {
     fn read_sysmem_flush_page(&self, bar: &Bar0) -> u64 {
-        super::ga100::read_sysmem_flush_page_ga100(bar)
+        read_sysmem_flush_page_gb202(bar)
     }
 
     fn write_sysmem_flush_page(&self, bar: &Bar0, addr: u64) -> Result {
-        super::ga100::write_sysmem_flush_page_ga100(bar, addr);
+        let addr: Bounded<u64, 52> = Bounded::<u64, 64>::from(addr)
+            .try_shrink::<52>()
+            .ok_or(EINVAL)?;
+
+        write_sysmem_flush_page_gb202(bar, addr);
 
         Ok(())
     }
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index 356fbf364ea5..65be6ec71ed4 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -1,4 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
+// SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 
 use kernel::{
     io::{
@@ -145,6 +146,42 @@ fn fmt(&self, f: &mut kernel::fmt::Formatter<'_>) -> kernel::fmt::Result {
         /// Bits 12..40 of the higher (exclusive) bound of the WPR2 region.
         31:4    hi_val;
     }
+
+    // Blackwell GB10x sysmem flush registers (HSHUB0).
+    //
+    // GB10x GPUs use two pairs of HSHUB registers for sysmembar: a primary pair and an EG
+    // (egress) pair. Both must be programmed to the same address. Hardware ignores bits 7:0
+    // of each LO register. HSHUB0 base is 0x00891000.
+
+    pub(crate) NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO(u32) @ 0x00891e50 {
+        31:0    adr => u32;
+    }
+
+    pub(crate) NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI(u32) @ 0x00891e54 {
+        19:0    adr;
+    }
+
+    pub(crate) NV_PFB_HSHUB0_EG_PCIE_FLUSH_SYSMEM_ADDR_LO(u32) @ 0x008916c0 {
+        31:0    adr => u32;
+    }
+
+    pub(crate) NV_PFB_HSHUB0_EG_PCIE_FLUSH_SYSMEM_ADDR_HI(u32) @ 0x008916c4 {
+        19:0    adr;
+    }
+
+    // Blackwell GB20x sysmem flush registers (FBHUB0).
+    //
+    // Unlike the older NV_PFB_NISO_FLUSH_SYSMEM_ADDR registers which encode the address with an
+    // 8-bit right-shift, these registers take the raw address split into lower/upper 32-bit halves.
+    // The hardware ignores bits 7:0 of the LO register.
+
+    pub(crate) NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO(u32) @ 0x008a1d58 {
+        31:0    adr => u32;
+    }
+
+    pub(crate) NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI(u32) @ 0x008a1d5c {
+        19:0    adr;
+    }
 }
 
 impl NV_PFB_PRI_MMU_LOCAL_MEMORY_RANGE {
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH v11 06/22] gpu: nova-core: Blackwell: use correct sysmem flush registers
  2026-05-30  3:09 ` [PATCH v11 06/22] gpu: nova-core: Blackwell: use correct sysmem flush registers John Hubbard
@ 2026-06-01  7:01   ` Alexandre Courbot
  2026-06-01 18:16     ` John Hubbard
  2026-06-01  7:33   ` Eliot Courtney
  1 sibling, 1 reply; 56+ messages in thread
From: Alexandre Courbot @ 2026-06-01  7:01 UTC (permalink / raw)
  To: John Hubbard
  Cc: Danilo Krummrich, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
<snip>
> diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
> index 356fbf364ea5..65be6ec71ed4 100644
> --- a/drivers/gpu/nova-core/regs.rs
> +++ b/drivers/gpu/nova-core/regs.rs
> @@ -1,4 +1,5 @@
>  // SPDX-License-Identifier: GPL-2.0
> +// SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
>  
>  use kernel::{
>      io::{
> @@ -145,6 +146,42 @@ fn fmt(&self, f: &mut kernel::fmt::Formatter<'_>) -> kernel::fmt::Result {
>          /// Bits 12..40 of the higher (exclusive) bound of the WPR2 region.
>          31:4    hi_val;
>      }
> +
> +    // Blackwell GB10x sysmem flush registers (HSHUB0).
> +    //
> +    // GB10x GPUs use two pairs of HSHUB registers for sysmembar: a primary pair and an EG
> +    // (egress) pair. Both must be programmed to the same address. Hardware ignores bits 7:0
> +    // of each LO register. HSHUB0 base is 0x00891000.
> +
> +    pub(crate) NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO(u32) @ 0x00891e50 {
> +        31:0    adr => u32;
> +    }
> +
> +    pub(crate) NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI(u32) @ 0x00891e54 {
> +        19:0    adr;
> +    }
> +
> +    pub(crate) NV_PFB_HSHUB0_EG_PCIE_FLUSH_SYSMEM_ADDR_LO(u32) @ 0x008916c0 {
> +        31:0    adr => u32;
> +    }
> +
> +    pub(crate) NV_PFB_HSHUB0_EG_PCIE_FLUSH_SYSMEM_ADDR_HI(u32) @ 0x008916c4 {
> +        19:0    adr;
> +    }

I am still a bit uncertain about the addresses of these registers.
OpenRM seems to use `0x00870000` as the base [1][2] and use relative
offsets from it; that would make the address of e.g. `SYSMEM_ADDR_LO` be
`0x00870e50`. I cannot find any reference to a `0x00891000` base. Can
you double-check where it comes from?

Ideally these should also be relative registers with the same names as
OpenRM (e.g. `NV_PFB_HSHUB_PCIE_FLUSH_SYSMEM_ADDR_LO`) for easy lookup.
The `Hshub0` base can be declared in the `gb100` HAL since it's the only
place that uses it for now.

As I mentioned in my v10 review [3], there is more complexity to the
HSHUB module that involves runtime-detected values, but for boot it
looks like it also does rely on HSHUB0 as a stable base, so thankfully
we don't have to worry about this for now.

[1] https://github.com/NVIDIA/open-gpu-kernel-modules/blob/570.148/src/nvidia/src/kernel/gpu/mem_sys/arch/blackwell/kern_mem_sys_gb100.c#L54
[2] https://github.com/NVIDIA/open-gpu-kernel-modules/blob/570.148/src/common/inc/swref/published/blackwell/gb100/dev_hshub_base.h
[3] https://lore.kernel.org/all/DHY1D4IOXGRF.UQMCOXYG78CZ@nvidia.com/
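
The base/offset arithmetic behind this question can be checked with a
short sketch. The absolute addresses and the 0x00891000 base come from
the patch; the 0x00870000 base is the value quoted from OpenRM above and
is not re-verified here. The helper names are hypothetical, and applying
the same relative offset to the EG register is an assumption.

```rust
/// HSHUB0 base as documented in the patch's regs.rs comment.
const PATCH_HSHUB0_BASE: u32 = 0x0089_1000;
/// HSHUB base as read from OpenRM in this review (taken on trust for the example).
const OPENRM_HSHUB_BASE: u32 = 0x0087_0000;

/// Absolute register addresses from the patch.
const PATCH_ADDR_LO: u32 = 0x0089_1e50;
const PATCH_EG_ADDR_LO: u32 = 0x0089_16c0;

/// Offset of an absolute register address relative to a block base.
const fn offset_from(base: u32, absolute: u32) -> u32 {
    absolute - base
}

/// Absolute address of a relative register at `offset` within the block at `base`.
const fn rebase(base: u32, offset: u32) -> u32 {
    base + offset
}
```

With these numbers the primary LO register sits at offset 0xe50 and the
EG LO register at offset 0x6c0; rebased onto 0x00870000 they land at
0x00870e50 and 0x008706c0. Declaring the registers as relative would
confine the open question to a single base constant.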

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v11 06/22] gpu: nova-core: Blackwell: use correct sysmem flush registers
  2026-06-01  7:01   ` Alexandre Courbot
@ 2026-06-01 18:16     ` John Hubbard
  0 siblings, 0 replies; 56+ messages in thread
From: John Hubbard @ 2026-06-01 18:16 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Danilo Krummrich, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On 6/1/26 12:01 AM, Alexandre Courbot wrote:
> On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
> <snip>
...
>> +    pub(crate) NV_PFB_HSHUB0_EG_PCIE_FLUSH_SYSMEM_ADDR_HI(u32) @ 0x008916c4 {
>> +        19:0    adr;
>> +    }
> 
> I am still a bit uncertain about the addresses of these registers.
> OpenRM seems to use `0x00870000` as the base [1][2] and use relative
> offsets from it; that would make the address of e.g. `SYSMEM_ADDR_LO` be
> `0x00870e50`. I cannot find any reference to a `0x00891000` base. Can
> you double-check where it comes from?
> 
> Ideally these should also be relative registers with the same names as
> OpenRM (e.g. `NV_PFB_HSHUB_PCIE_FLUSH_SYSMEM_ADDR_LO`) for easy lookup.
> The `Hshub0` base can be declared in the `gb100` HAL since it's the only
> place that uses it for now.
> 
> As I mentioned in my v10 review [3], there is more complexity to the
> HSHUB module that involves runtime-detected values, but for boot it
> looks like it also does rely on HSHUB0 as a stable base, so thankfully
> we don't have to worry about this for now.
> 

OK, yes, let me get to the bottom of this.

> [1] https://github.com/NVIDIA/open-gpu-kernel-modules/blob/570.148/src/nvidia/src/kernel/gpu/mem_sys/arch/blackwell/kern_mem_sys_gb100.c#L54
> [2] https://github.com/NVIDIA/open-gpu-kernel-modules/blob/570.148/src/common/inc/swref/published/blackwell/gb100/dev_hshub_base.h
> [3] https://lore.kernel.org/all/DHY1D4IOXGRF.UQMCOXYG78CZ@nvidia.com/

thanks,
-- 
John Hubbard


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v11 06/22] gpu: nova-core: Blackwell: use correct sysmem flush registers
  2026-05-30  3:09 ` [PATCH v11 06/22] gpu: nova-core: Blackwell: use correct sysmem flush registers John Hubbard
  2026-06-01  7:01   ` Alexandre Courbot
@ 2026-06-01  7:33   ` Eliot Courtney
  2026-06-01 13:13     ` Alexandre Courbot
  1 sibling, 1 reply; 56+ messages in thread
From: Eliot Courtney @ 2026-06-01  7:33 UTC (permalink / raw)
  To: John Hubbard, Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML

On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
> Blackwell GPUs moved the sysmem flush page registers away from the
> Ampere/Ada location. GB10x routes the flush through a pair of HSHUB0
> register sets (primary and egress) that must both be programmed to
> the same address. GB20x routes it through FBHUB0.
>
> Implement these paths in the GB10x and GB20x framebuffer HALs.
>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  drivers/gpu/nova-core/fb/hal/gb100.rs | 46 +++++++++++++++++++++++++--
>  drivers/gpu/nova-core/fb/hal/gb202.rs | 40 +++++++++++++++++++++--
>  drivers/gpu/nova-core/regs.rs         | 37 +++++++++++++++++++++
>  3 files changed, 117 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/nova-core/fb/hal/gb100.rs b/drivers/gpu/nova-core/fb/hal/gb100.rs
> index 8d63350abf8a..70f4c11b1e77 100644
> --- a/drivers/gpu/nova-core/fb/hal/gb100.rs
> +++ b/drivers/gpu/nova-core/fb/hal/gb100.rs
> @@ -4,6 +4,8 @@
>  //! Blackwell GB10x framebuffer HAL.
>  
>  use kernel::{
> +    io::Io,
> +    num::Bounded,
>      prelude::*,
>      ptr::{
>          const_align_up,
> @@ -15,11 +17,45 @@
>  use crate::{
>      driver::Bar0,
>      fb::hal::FbHal,
> -    num::usize_into_u32, //
> +    num::usize_into_u32,
> +    regs, //
>  };
>  
>  struct Gb100;
>  
> +fn read_sysmem_flush_page_gb100(bar: &Bar0) -> u64 {
> +    let lo = u64::from(
> +        bar.read(regs::NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO)
> +            .adr(),
> +    );
> +    let hi = u64::from(
> +        bar.read(regs::NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI)
> +            .adr(),
> +    );
> +
> +    lo | (hi << 32)
> +}
> +
> +/// Write the sysmem flush page address through the GB10x HSHUB0 registers.
> +///
> +/// Both the primary and EG (egress) register pairs must be programmed to the same address,
> +/// as required by hardware.
> +fn write_sysmem_flush_page_gb100(bar: &Bar0, addr: Bounded<u64, 52>) {
> +    // CAST: lower 32 bits. Hardware ignores bits 7:0.
> +    let addr_lo = *addr as u32;
> +    let addr_hi = addr.shr::<32, 20>().cast::<u32>();
> +
> +    // Write HI first. The hardware will trigger the flush on the LO write.
> +
> +    // Primary HSHUB pair.
> +    bar.write_reg(regs::NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI::zeroed().with_adr(addr_hi));
> +    bar.write_reg(regs::NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO::zeroed().with_adr(addr_lo));
> +
> +    // EG (egress) pair -- must match the primary pair.
> +    bar.write_reg(regs::NV_PFB_HSHUB0_EG_PCIE_FLUSH_SYSMEM_ADDR_HI::zeroed().with_adr(addr_hi));
> +    bar.write_reg(regs::NV_PFB_HSHUB0_EG_PCIE_FLUSH_SYSMEM_ADDR_LO::zeroed().with_adr(addr_lo));
> +}
> +
>  pub(super) const fn pmu_reserved_size_gb100() -> u32 {
>      usize_into_u32::<{ const_align_up(SZ_8M + SZ_16M + SZ_4K, Alignment::new::<SZ_128K>()).unwrap() }>(
>      )
> @@ -27,11 +63,15 @@ pub(super) const fn pmu_reserved_size_gb100() -> u32 {
>  
>  impl FbHal for Gb100 {
>      fn read_sysmem_flush_page(&self, bar: &Bar0) -> u64 {
> -        super::ga100::read_sysmem_flush_page_ga100(bar)
> +        read_sysmem_flush_page_gb100(bar)
>      }
>  
>      fn write_sysmem_flush_page(&self, bar: &Bar0, addr: u64) -> Result {
> -        super::ga100::write_sysmem_flush_page_ga100(bar, addr);
> +        let addr: Bounded<u64, 52> = Bounded::<u64, 64>::from(addr)
> +            .try_shrink::<52>()
> +            .ok_or(EINVAL)?;

Maybe more simply written:
`let addr = Bounded::<u64, 52>::try_new(addr).ok_or(EINVAL)?;`

> +
> +        write_sysmem_flush_page_gb100(bar, addr);
>  
>          Ok(())
>      }
> diff --git a/drivers/gpu/nova-core/fb/hal/gb202.rs b/drivers/gpu/nova-core/fb/hal/gb202.rs
> index 542c1d7429e9..5a6b815eec3d 100644
> --- a/drivers/gpu/nova-core/fb/hal/gb202.rs
> +++ b/drivers/gpu/nova-core/fb/hal/gb202.rs
> @@ -4,24 +4,58 @@
>  //! Blackwell GB20x framebuffer HAL.
>  
>  use kernel::{
> +    io::Io,
> +    num::Bounded,
>      prelude::*,
>      sizes::SizeConstants, //
>  };
>  
>  use crate::{
>      driver::Bar0,
> -    fb::hal::FbHal, //
> +    fb::hal::FbHal,
> +    regs, //
>  };
>  
>  struct Gb202;
>  
> +fn read_sysmem_flush_page_gb202(bar: &Bar0) -> u64 {
> +    let lo = u64::from(
> +        bar.read(regs::NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO)
> +            .adr(),
> +    );
> +    let hi = u64::from(
> +        bar.read(regs::NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI)
> +            .adr(),
> +    );
> +
> +    lo | (hi << 32)
> +}
> +
> +/// Write the sysmem flush page address through the GB20x FBHUB0 registers.
> +fn write_sysmem_flush_page_gb202(bar: &Bar0, addr: Bounded<u64, 52>) {
> +    // Write HI first. The hardware will trigger the flush on the LO write.
> +    bar.write_reg(
> +        regs::NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI::zeroed()
> +            .with_adr(addr.shr::<32, 20>().cast::<u32>()),
> +    );
> +    bar.write_reg(
> +        regs::NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO::zeroed()
> +            // CAST: lower 32 bits. Hardware ignores bits 7:0.
> +            .with_adr(*addr as u32),
> +    );
> +}
> +
>  impl FbHal for Gb202 {
>      fn read_sysmem_flush_page(&self, bar: &Bar0) -> u64 {
> -        super::ga100::read_sysmem_flush_page_ga100(bar)
> +        read_sysmem_flush_page_gb202(bar)
>      }
>  
>      fn write_sysmem_flush_page(&self, bar: &Bar0, addr: u64) -> Result {
> -        super::ga100::write_sysmem_flush_page_ga100(bar, addr);
> +        let addr: Bounded<u64, 52> = Bounded::<u64, 64>::from(addr)
> +            .try_shrink::<52>()
> +            .ok_or(EINVAL)?;

Same here.

> +
> +        write_sysmem_flush_page_gb202(bar, addr);
>  
>          Ok(())
>      }
> diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
> index 356fbf364ea5..65be6ec71ed4 100644
> --- a/drivers/gpu/nova-core/regs.rs
> +++ b/drivers/gpu/nova-core/regs.rs
> @@ -1,4 +1,5 @@
>  // SPDX-License-Identifier: GPL-2.0
> +// SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
>  
>  use kernel::{
>      io::{
> @@ -145,6 +146,42 @@ fn fmt(&self, f: &mut kernel::fmt::Formatter<'_>) -> kernel::fmt::Result {
>          /// Bits 12..40 of the higher (exclusive) bound of the WPR2 region.
>          31:4    hi_val;
>      }
> +
> +    // Blackwell GB10x sysmem flush registers (HSHUB0).
> +    //
> +    // GB10x GPUs use two pairs of HSHUB registers for sysmembar: a primary pair and an EG
> +    // (egress) pair. Both must be programmed to the same address. Hardware ignores bits 7:0
> +    // of each LO register. HSHUB0 base is 0x00891000.
> +
> +    pub(crate) NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO(u32) @ 0x00891e50 {
> +        31:0    adr => u32;
> +    }
> +
> +    pub(crate) NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI(u32) @ 0x00891e54 {
> +        19:0    adr;
> +    }
> +
> +    pub(crate) NV_PFB_HSHUB0_EG_PCIE_FLUSH_SYSMEM_ADDR_LO(u32) @ 0x008916c0 {
> +        31:0    adr => u32;
> +    }
> +
> +    pub(crate) NV_PFB_HSHUB0_EG_PCIE_FLUSH_SYSMEM_ADDR_HI(u32) @ 0x008916c4 {
> +        19:0    adr;
> +    }
> +
> +    // Blackwell GB20x sysmem flush registers (FBHUB0).
> +    //
> +    // Unlike the older NV_PFB_NISO_FLUSH_SYSMEM_ADDR registers which encode the address with an
> +    // 8-bit right-shift, these registers take the raw address split into lower/upper 32-bit halves.
> +    // The hardware ignores bits 7:0 of the LO register.
> +
> +    pub(crate) NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO(u32) @ 0x008a1d58 {
> +        31:0    adr => u32;
> +    }
> +
> +    pub(crate) NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI(u32) @ 0x008a1d5c {
> +        19:0    adr;
> +    }
>  }

May be nice to move these to the place (HAL) they are used if they
aren't used anywhere else (and reduce visibility). I am also curious
about where 0x00891000 comes from.
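
For readers following the GB10x write path, here is a standalone sketch
of the sequence the patch describes: reject addresses wider than 52
bits, split the address into a 32-bit LO and a 20-bit HI part, then
write HI before LO for the primary pair and again for the EG pair.
`MockBar`, `Reg` and every function name below are hypothetical
stand-ins; the real code goes through `Bar0` and `Bounded<u64, 52>`.

```rust
/// Hypothetical register identifiers for the sketch.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Reg {
    Hi,
    Lo,
    EgHi,
    EgLo,
}

/// Records writes in order; stands in for `Bar0`.
#[derive(Default)]
struct MockBar {
    writes: Vec<(Reg, u32)>,
}

impl MockBar {
    fn write(&mut self, reg: Reg, val: u32) {
        self.writes.push((reg, val));
    }
}

/// Returns the address unchanged if it fits in 52 bits, `None` otherwise.
fn check_52bit(addr: u64) -> Option<u64> {
    if addr >> 52 == 0 {
        Some(addr)
    } else {
        None
    }
}

/// Splits a 52-bit address into (LO, HI). The cast keeps the low 32 bits.
fn split(addr: u64) -> (u32, u32) {
    (addr as u32, (addr >> 32) as u32)
}

/// Inverse of `split`, matching how the read path recombines the halves.
fn join(lo: u32, hi: u32) -> u64 {
    u64::from(lo) | (u64::from(hi) << 32)
}

fn write_flush_page(bar: &mut MockBar, addr: u64) -> Result<(), &'static str> {
    let addr = check_52bit(addr).ok_or("address wider than 52 bits")?;
    let (lo, hi) = split(addr);

    // HI first: per the patch comment, the LO write is what triggers the flush.
    bar.write(Reg::Hi, hi);
    bar.write(Reg::Lo, lo);

    // The EG (egress) pair must carry the same address as the primary pair.
    bar.write(Reg::EgHi, hi);
    bar.write(Reg::EgLo, lo);
    Ok(())
}
```

Recording writes in a `Vec` makes the HI-before-LO ordering something a
test can assert on, which a plain MMIO write would not allow.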


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v11 06/22] gpu: nova-core: Blackwell: use correct sysmem flush registers
  2026-06-01  7:33   ` Eliot Courtney
@ 2026-06-01 13:13     ` Alexandre Courbot
  2026-06-01 18:09       ` John Hubbard
  0 siblings, 1 reply; 56+ messages in thread
From: Alexandre Courbot @ 2026-06-01 13:13 UTC (permalink / raw)
  To: Eliot Courtney
  Cc: John Hubbard, Danilo Krummrich, Joel Fernandes, Timur Tabi,
	Alistair Popple, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On Mon Jun 1, 2026 at 4:33 PM JST, Eliot Courtney wrote:
> On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
>> +
>> +    // Blackwell GB10x sysmem flush registers (HSHUB0).
>> +    //
>> +    // GB10x GPUs use two pairs of HSHUB registers for sysmembar: a primary pair and an EG
>> +    // (egress) pair. Both must be programmed to the same address. Hardware ignores bits 7:0
>> +    // of each LO register. HSHUB0 base is 0x00891000.
>> +
>> +    pub(crate) NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO(u32) @ 0x00891e50 {
>> +        31:0    adr => u32;
>> +    }
>> +
>> +    pub(crate) NV_PFB_HSHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI(u32) @ 0x00891e54 {
>> +        19:0    adr;
>> +    }
>> +
>> +    pub(crate) NV_PFB_HSHUB0_EG_PCIE_FLUSH_SYSMEM_ADDR_LO(u32) @ 0x008916c0 {
>> +        31:0    adr => u32;
>> +    }
>> +
>> +    pub(crate) NV_PFB_HSHUB0_EG_PCIE_FLUSH_SYSMEM_ADDR_HI(u32) @ 0x008916c4 {
>> +        19:0    adr;
>> +    }
>> +
>> +    // Blackwell GB20x sysmem flush registers (FBHUB0).
>> +    //
>> +    // Unlike the older NV_PFB_NISO_FLUSH_SYSMEM_ADDR registers which encode the address with an
>> +    // 8-bit right-shift, these registers take the raw address split into lower/upper 32-bit halves.
>> +    // The hardware ignores bits 7:0 of the LO register.
>> +
>> +    pub(crate) NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO(u32) @ 0x008a1d58 {
>> +        31:0    adr => u32;
>> +    }
>> +
>> +    pub(crate) NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI(u32) @ 0x008a1d5c {
>> +        19:0    adr;
>> +    }
>>  }
>
> May be nice to move these to the place (HAL) they are used if they
> aren't used anywhere else (and reduce visibility).

Indeed, we have a thread on Zulip [1] suggesting to move register
definitions into subdevice-level `regs.rs` modules. I guess we could
proactively start doing this with these new registers; on the other
hand, I'm also fine with keeping the current pattern and doing the move
later if John prefers to keep things the current way.

[1] https://rust-for-linux.zulipchat.com/#narrow/channel/509436-Nova/topic/.5BRFC.5D.20Moving.20register.20definitions.20into.20the.20module.20using.20them/with/599138666

* Re: [PATCH v11 06/22] gpu: nova-core: Blackwell: use correct sysmem flush registers
  2026-06-01 13:13     ` Alexandre Courbot
@ 2026-06-01 18:09       ` John Hubbard
  0 siblings, 0 replies; 56+ messages in thread
From: John Hubbard @ 2026-06-01 18:09 UTC (permalink / raw)
  To: Alexandre Courbot, Eliot Courtney
  Cc: Danilo Krummrich, Joel Fernandes, Timur Tabi, Alistair Popple,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML

On 6/1/26 6:13 AM, Alexandre Courbot wrote:
> On Mon Jun 1, 2026 at 4:33 PM JST, Eliot Courtney wrote:
>> On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
...
>>> +    pub(crate) NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO(u32) @ 0x008a1d58 {
>>> +        31:0    adr => u32;
>>> +    }
>>> +
>>> +    pub(crate) NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI(u32) @ 0x008a1d5c {
>>> +        19:0    adr;
>>> +    }
>>>  }
>>
>> May be nice to move these to the place (HAL) they are used if they
>> aren't used anywhere else (and reduce visibility).
> 
> Indeed, we have a thread on Zulip [1] suggesting to move register
> definitions into subdevice-level `regs.rs` modules. I guess we could
> proactively start doing this with these new registers; on the other
> hand, I'm also fine with keeping the current pattern and doing the move
> later if John prefers to keep things the current way.

I very much want us to focus on making a *serious* attempt at merging
Blackwell support in this kernel cycle.

I emphasize that because moving registers is something that can
*definitely* be done later.


thanks,
-- 
John Hubbard

* [PATCH v11 07/22] gpu: nova-core: don't assume 64-bit firmware images
  2026-05-30  3:09 [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (5 preceding siblings ...)
  2026-05-30  3:09 ` [PATCH v11 06/22] gpu: nova-core: Blackwell: use correct sysmem flush registers John Hubbard
@ 2026-05-30  3:09 ` John Hubbard
  2026-06-01  6:36   ` Eliot Courtney
  2026-05-30  3:09 ` [PATCH v11 08/22] gpu: nova-core: add support for 32-bit " John Hubbard
                   ` (15 subsequent siblings)
  22 siblings, 1 reply; 56+ messages in thread
From: John Hubbard @ 2026-05-30  3:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Introduce a single ELF format abstraction that ties each ELF header
type to its matching section-header type. This keeps the shared
section parser ready for upcoming ELF32 support and avoids mixing
32-bit and 64-bit ELF layouts by mistake.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs | 112 +++++++++++++++++++++++-------
 1 file changed, 85 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 3aac073efee2..38088e950980 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -1,4 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
+// SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 
 //! Contains structures and functions dedicated to the parsing, building and patching of firmwares
 //! to be loaded into a given execution unit.
@@ -467,17 +468,72 @@ mod elf {
         transmute::FromBytes, //
     };
 
+    /// Trait to abstract over ELF header differences.
+    trait ElfHeader: FromBytes {
+        fn shnum(&self) -> u16;
+        fn shoff(&self) -> u64;
+        fn shstrndx(&self) -> u16;
+    }
+
+    /// Trait to abstract over ELF section-header differences.
+    trait ElfSectionHeader: FromBytes {
+        fn name(&self) -> u32;
+        fn offset(&self) -> u64;
+        fn size(&self) -> u64;
+    }
+
+    /// Trait describing a matching ELF header and section-header format.
+    trait ElfFormat {
+        type Header: ElfHeader;
+        type SectionHeader: ElfSectionHeader;
+    }
+
     /// Newtype to provide a [`FromBytes`] implementation.
     #[repr(transparent)]
     struct Elf64Hdr(bindings::elf64_hdr);
     // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
     unsafe impl FromBytes for Elf64Hdr {}
 
+    impl ElfHeader for Elf64Hdr {
+        fn shnum(&self) -> u16 {
+            self.0.e_shnum
+        }
+
+        fn shoff(&self) -> u64 {
+            self.0.e_shoff
+        }
+
+        fn shstrndx(&self) -> u16 {
+            self.0.e_shstrndx
+        }
+    }
+
     #[repr(transparent)]
     struct Elf64SHdr(bindings::elf64_shdr);
     // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
     unsafe impl FromBytes for Elf64SHdr {}
 
+    impl ElfSectionHeader for Elf64SHdr {
+        fn name(&self) -> u32 {
+            self.0.sh_name
+        }
+
+        fn offset(&self) -> u64 {
+            self.0.sh_offset
+        }
+
+        fn size(&self) -> u64 {
+            self.0.sh_size
+        }
+    }
+
+    struct Elf64Format;
+
+    impl ElfFormat for Elf64Format {
+        type Header = Elf64Hdr;
+        type SectionHeader = Elf64SHdr;
+    }
+
     /// Returns a NULL-terminated string from the ELF image at `offset`.
     fn elf_str(elf: &[u8], offset: u64) -> Option<&str> {
         let idx = usize::try_from(offset).ok()?;
@@ -485,47 +541,49 @@ fn elf_str(elf: &[u8], offset: u64) -> Option<&str> {
         CStr::from_bytes_until_nul(bytes).ok()?.to_str().ok()
     }
 
-    /// Tries to extract section with name `name` from the ELF64 image `elf`, and returns it.
-    pub(super) fn elf64_section<'a, 'b>(elf: &'a [u8], name: &'b str) -> Option<&'a [u8]> {
-        let hdr = &elf
-            .get(0..size_of::<bindings::elf64_hdr>())
-            .and_then(Elf64Hdr::from_bytes)?
-            .0;
-
-        // Get all the section headers.
-        let mut shdr = {
-            let shdr_num = usize::from(hdr.e_shnum);
-            let shdr_start = usize::try_from(hdr.e_shoff).ok()?;
-            let shdr_end = shdr_num
-                .checked_mul(size_of::<Elf64SHdr>())
-                .and_then(|v| v.checked_add(shdr_start))?;
-
-            elf.get(shdr_start..shdr_end)
-                .map(|slice| slice.chunks_exact(size_of::<Elf64SHdr>()))?
-        };
+    fn elf_section_generic<'a, F>(elf: &'a [u8], name: &str) -> Option<&'a [u8]>
+    where
+        F: ElfFormat,
+    {
+        let hdr = F::Header::from_bytes(elf.get(0..size_of::<F::Header>())?)?;
+
+        let shdr_num = usize::from(hdr.shnum());
+        let shdr_start = usize::try_from(hdr.shoff()).ok()?;
+        let shdr_end = shdr_num
+            .checked_mul(size_of::<F::SectionHeader>())
+            .and_then(|v| v.checked_add(shdr_start))?;
+
+        // Get all the section headers as an iterator over byte chunks.
+        let shdr_bytes = elf.get(shdr_start..shdr_end)?;
+        let mut shdr_iter = shdr_bytes.chunks_exact(size_of::<F::SectionHeader>());
 
         // Get the strings table.
-        let strhdr = shdr
+        let strhdr = shdr_iter
             .clone()
-            .nth(usize::from(hdr.e_shstrndx))
-            .and_then(Elf64SHdr::from_bytes)?;
+            .nth(usize::from(hdr.shstrndx()))
+            .and_then(F::SectionHeader::from_bytes)?;
 
         // Find the section which name matches `name` and return it.
-        shdr.find_map(|sh| {
-            let hdr = Elf64SHdr::from_bytes(sh)?;
-            let name_offset = strhdr.0.sh_offset.checked_add(u64::from(hdr.0.sh_name))?;
+        shdr_iter.find_map(|sh_bytes| {
+            let sh = F::SectionHeader::from_bytes(sh_bytes)?;
+            let name_offset = strhdr.offset().checked_add(u64::from(sh.name()))?;
             let section_name = elf_str(elf, name_offset)?;
 
             if section_name != name {
                 return None;
             }
 
-            let start = usize::try_from(hdr.0.sh_offset).ok()?;
-            let end = usize::try_from(hdr.0.sh_size)
+            let start = usize::try_from(sh.offset()).ok()?;
+            let end = usize::try_from(sh.size())
                 .ok()
-                .and_then(|sh_size| start.checked_add(sh_size))?;
+                .and_then(|sz| start.checked_add(sz))?;
 
             elf.get(start..end)
         })
     }
+
+    /// Tries to extract section with name `name` from the ELF64 image `elf`, and returns it.
+    pub(super) fn elf64_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
+        elf_section_generic::<Elf64Format>(elf, name)
+    }
 }
-- 
2.54.0


* Re: [PATCH v11 07/22] gpu: nova-core: don't assume 64-bit firmware images
  2026-05-30  3:09 ` [PATCH v11 07/22] gpu: nova-core: don't assume 64-bit firmware images John Hubbard
@ 2026-06-01  6:36   ` Eliot Courtney
  0 siblings, 0 replies; 56+ messages in thread
From: Eliot Courtney @ 2026-06-01  6:36 UTC (permalink / raw)
  To: John Hubbard, Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML

On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
> Introduce a single ELF format abstraction that ties each ELF header
> type to its matching section-header type. This keeps the shared
> section parser ready for upcoming ELF32 support and avoids mixing
> 32-bit and 64-bit ELF layouts by mistake.
>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---

Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>

* [PATCH v11 08/22] gpu: nova-core: add support for 32-bit firmware images
  2026-05-30  3:09 [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (6 preceding siblings ...)
  2026-05-30  3:09 ` [PATCH v11 07/22] gpu: nova-core: don't assume 64-bit firmware images John Hubbard
@ 2026-05-30  3:09 ` John Hubbard
  2026-06-01  6:37   ` Eliot Courtney
  2026-05-30  3:09 ` [PATCH v11 09/22] gpu: nova-core: add auto-detection of 32-bit, 64-bit " John Hubbard
                   ` (14 subsequent siblings)
  22 siblings, 1 reply; 56+ messages in thread
From: John Hubbard @ 2026-05-30  3:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Some GPU firmware images are packaged as 32-bit ELF rather than 64-bit.
Add a 32-bit implementation of the shared ELF section-parsing
abstraction so those images can be parsed alongside the existing 64-bit
path.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs | 53 +++++++++++++++++++++++++++++++
 1 file changed, 53 insertions(+)

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 38088e950980..e4dcc9a87b7e 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -534,6 +534,53 @@ impl ElfFormat for Elf64Format {
         type SectionHeader = Elf64SHdr;
     }
 
+    /// Newtype to provide [`FromBytes`] and [`ElfHeader`] implementations for ELF32.
+    #[repr(transparent)]
+    struct Elf32Hdr(bindings::elf32_hdr);
+    // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
+    unsafe impl FromBytes for Elf32Hdr {}
+
+    impl ElfHeader for Elf32Hdr {
+        fn shnum(&self) -> u16 {
+            self.0.e_shnum
+        }
+
+        fn shoff(&self) -> u64 {
+            u64::from(self.0.e_shoff)
+        }
+
+        fn shstrndx(&self) -> u16 {
+            self.0.e_shstrndx
+        }
+    }
+
+    /// Newtype to provide [`FromBytes`] and [`ElfSectionHeader`] implementations for ELF32.
+    #[repr(transparent)]
+    struct Elf32SHdr(bindings::elf32_shdr);
+    // SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
+    unsafe impl FromBytes for Elf32SHdr {}
+
+    impl ElfSectionHeader for Elf32SHdr {
+        fn name(&self) -> u32 {
+            self.0.sh_name
+        }
+
+        fn offset(&self) -> u64 {
+            u64::from(self.0.sh_offset)
+        }
+
+        fn size(&self) -> u64 {
+            u64::from(self.0.sh_size)
+        }
+    }
+
+    struct Elf32Format;
+
+    impl ElfFormat for Elf32Format {
+        type Header = Elf32Hdr;
+        type SectionHeader = Elf32SHdr;
+    }
+
     /// Returns a NULL-terminated string from the ELF image at `offset`.
     fn elf_str(elf: &[u8], offset: u64) -> Option<&str> {
         let idx = usize::try_from(offset).ok()?;
@@ -586,4 +633,10 @@ fn elf_section_generic<'a, F>(elf: &'a [u8], name: &str) -> Option<&'a [u8]>
     pub(super) fn elf64_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
         elf_section_generic::<Elf64Format>(elf, name)
     }
+
+    /// Extract the section with name `name` from the ELF32 image `elf`.
+    #[expect(dead_code)]
+    pub(super) fn elf32_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
+        elf_section_generic::<Elf32Format>(elf, name)
+    }
 }
-- 
2.54.0


* Re: [PATCH v11 08/22] gpu: nova-core: add support for 32-bit firmware images
  2026-05-30  3:09 ` [PATCH v11 08/22] gpu: nova-core: add support for 32-bit " John Hubbard
@ 2026-06-01  6:37   ` Eliot Courtney
  0 siblings, 0 replies; 56+ messages in thread
From: Eliot Courtney @ 2026-06-01  6:37 UTC (permalink / raw)
  To: John Hubbard, Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML

On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
> Some GPU firmware images are packaged as 32-bit ELF rather than 64-bit.
> Add a 32-bit implementation of the shared ELF section-parsing
> abstraction so those images can be parsed alongside the existing 64-bit
> path.
>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---

Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>

* [PATCH v11 09/22] gpu: nova-core: add auto-detection of 32-bit, 64-bit firmware images
  2026-05-30  3:09 [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (7 preceding siblings ...)
  2026-05-30  3:09 ` [PATCH v11 08/22] gpu: nova-core: add support for 32-bit " John Hubbard
@ 2026-05-30  3:09 ` John Hubbard
  2026-06-01  6:49   ` Eliot Courtney
  2026-05-30  3:09 ` [PATCH v11 10/22] gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub John Hubbard
                   ` (13 subsequent siblings)
  22 siblings, 1 reply; 56+ messages in thread
From: John Hubbard @ 2026-05-30  3:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

A firmware image may be either a 32-bit or a 64-bit ELF, and callers
should not have to know which. Detect the ELF class from the image
header at parse time and dispatch to the matching parser, so a single
entry point handles both layouts.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs     | 22 ++++++++++++++++++----
 drivers/gpu/nova-core/firmware/gsp.rs |  4 ++--
 2 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index e4dcc9a87b7e..866bc9b3571e 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -629,14 +629,28 @@ fn elf_section_generic<'a, F>(elf: &'a [u8], name: &str) -> Option<&'a [u8]>
         })
     }
 
-    /// Tries to extract section with name `name` from the ELF64 image `elf`, and returns it.
-    pub(super) fn elf64_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
+    /// Extract the section with name `name` from the ELF64 image `elf`.
+    fn elf64_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
         elf_section_generic::<Elf64Format>(elf, name)
     }
 
     /// Extract the section with name `name` from the ELF32 image `elf`.
-    #[expect(dead_code)]
-    pub(super) fn elf32_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
+    fn elf32_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
         elf_section_generic::<Elf32Format>(elf, name)
     }
+
+    /// Automatically detects ELF32 vs ELF64 based on the ELF header.
+    pub(super) fn elf_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
+        // Check ELF magic.
+        if elf.len() < 5 || elf.get(0..4)? != b"\x7fELF" {
+            return None;
+        }
+
+        // Check ELF class: 1 = 32-bit, 2 = 64-bit.
+        match elf.get(4)? {
+            1 => elf32_section(elf, name),
+            2 => elf64_section(elf, name),
+            _ => None,
+        }
+    }
 }
diff --git a/drivers/gpu/nova-core/firmware/gsp.rs b/drivers/gpu/nova-core/firmware/gsp.rs
index e576bc8a9b1c..99a302bae567 100644
--- a/drivers/gpu/nova-core/firmware/gsp.rs
+++ b/drivers/gpu/nova-core/firmware/gsp.rs
@@ -88,7 +88,7 @@ pub(crate) fn new<'a>(
         pin_init::pin_init_scope(move || {
             let firmware = super::request_firmware(dev, chipset, "gsp", ver)?;
 
-            let fw_section = elf::elf64_section(firmware.data(), ".fwimage").ok_or(EINVAL)?;
+            let fw_section = elf::elf_section(firmware.data(), ".fwimage").ok_or(EINVAL)?;
 
             let size = fw_section.len();
 
@@ -148,7 +148,7 @@ pub(crate) fn new<'a>(
                 signatures: {
                     let sigs_section = Self::find_gsp_sigs_section(chipset);
 
-                    elf::elf64_section(firmware.data(), sigs_section)
+                    elf::elf_section(firmware.data(), sigs_section)
                         .ok_or(EINVAL)
                         .and_then(|data| Coherent::from_slice(dev, data, GFP_KERNEL))?
                 },
-- 
2.54.0


* Re: [PATCH v11 09/22] gpu: nova-core: add auto-detection of 32-bit, 64-bit firmware images
  2026-05-30  3:09 ` [PATCH v11 09/22] gpu: nova-core: add auto-detection of 32-bit, 64-bit " John Hubbard
@ 2026-06-01  6:49   ` Eliot Courtney
  0 siblings, 0 replies; 56+ messages in thread
From: Eliot Courtney @ 2026-06-01  6:49 UTC (permalink / raw)
  To: John Hubbard, Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML

On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
> A firmware image may be either a 32-bit or a 64-bit ELF, and callers
> should not have to know which. Detect the ELF class from the image
> header at parse time and dispatch to the matching parser, so a single
> entry point handles both layouts.
>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  drivers/gpu/nova-core/firmware.rs     | 22 ++++++++++++++++++----
>  drivers/gpu/nova-core/firmware/gsp.rs |  4 ++--
>  2 files changed, 20 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
> index e4dcc9a87b7e..866bc9b3571e 100644
> --- a/drivers/gpu/nova-core/firmware.rs
> +++ b/drivers/gpu/nova-core/firmware.rs
> @@ -629,14 +629,28 @@ fn elf_section_generic<'a, F>(elf: &'a [u8], name: &str) -> Option<&'a [u8]>
>          })
>      }
>  
> -    /// Tries to extract section with name `name` from the ELF64 image `elf`, and returns it.
> -    pub(super) fn elf64_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
> +    /// Extract the section with name `name` from the ELF64 image `elf`.
> +    fn elf64_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
>          elf_section_generic::<Elf64Format>(elf, name)
>      }
>  
>      /// Extract the section with name `name` from the ELF32 image `elf`.
> -    #[expect(dead_code)]
> -    pub(super) fn elf32_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
> +    fn elf32_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
>          elf_section_generic::<Elf32Format>(elf, name)
>      }
> +
> +    /// Automatically detects ELF32 vs ELF64 based on the ELF header.
> +    pub(super) fn elf_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
> +        // Check ELF magic.
> +        if elf.len() < 5 || elf.get(0..4)? != b"\x7fELF" {
> +            return None;
> +        }
> +
> +        // Check ELF class: 1 = 32-bit, 2 = 64-bit.
> +        match elf.get(4)? {
> +            1 => elf32_section(elf, name),
> +            2 => elf64_section(elf, name),
> +            _ => None,
> +        }
> +    }
>  }

What about adding named constants (inline in the function) for these
magic numbers?

The elf.len() check looks unnecessary: .get() already returns None when
the slice is too short, and the function returns an Option. Instead you
could do `if elf.get(0..SELFMAG) != Some(ELFMAG)`.
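
Both suggestions together might look roughly like the sketch below. It
is standalone and untested against the driver: the constant names and
values follow the usual <elf.h> definitions, while the `ElfClass` enum
and the `elf_class` helper are hypothetical and only stand in for the
dispatch to elf32_section()/elf64_section().

```rust
/// ELF class of an image, as read from its identification bytes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ElfClass {
    Elf32,
    Elf64,
}

/// Returns the ELF class of `elf`, or `None` if the magic is missing,
/// the image is too short, or the class byte is unknown.
fn elf_class(elf: &[u8]) -> Option<ElfClass> {
    // Named constants kept inline in the function, mirroring <elf.h>.
    const ELFMAG: &[u8] = b"\x7fELF";
    const SELFMAG: usize = 4;
    const EI_CLASS: usize = 4;
    const ELFCLASS32: u8 = 1;
    const ELFCLASS64: u8 = 2;

    // `.get()` yields `None` for a short slice, so no length check is needed.
    if elf.get(..SELFMAG) != Some(ELFMAG) {
        return None;
    }

    match *elf.get(EI_CLASS)? {
        ELFCLASS32 => Some(ElfClass::Elf32),
        ELFCLASS64 => Some(ElfClass::Elf64),
        _ => None,
    }
}
```

A truncated image such as `b"\x7fEL"` fails the magic comparison, and
one that ends right after the magic fails the `EI_CLASS` lookup, so both
cases fall out of the `.get()` calls without an explicit length test.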

With those resolved,

Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>

* [PATCH v11 10/22] gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub
  2026-05-30  3:09 [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (8 preceding siblings ...)
  2026-05-30  3:09 ` [PATCH v11 09/22] gpu: nova-core: add auto-detection of 32-bit, 64-bit " John Hubbard
@ 2026-05-30  3:09 ` John Hubbard
  2026-06-01  7:47   ` Eliot Courtney
  2026-06-01 16:10   ` Timur Tabi
  2026-05-30  3:09 ` [PATCH v11 11/22] gpu: nova-core: Hopper/Blackwell: add FMC firmware image John Hubbard
                   ` (12 subsequent siblings)
  22 siblings, 2 replies; 56+ messages in thread
From: John Hubbard @ 2026-05-30  3:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Add the FSP (Firmware System Processor) falcon engine type that will
handle secure boot and Chain of Trust operations on Hopper and Blackwell
architectures.

The FSP falcon replaces SEC2's role in the boot sequence for these newer
architectures. This initial stub just defines the falcon type and its
base address.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/falcon.rs        |  1 +
 drivers/gpu/nova-core/falcon/fsp.rs    | 28 ++++++++++++++++++++++++++
 drivers/gpu/nova-core/gsp/hal/gh100.rs |  7 +++++--
 3 files changed, 34 insertions(+), 2 deletions(-)
 create mode 100644 drivers/gpu/nova-core/falcon/fsp.rs

diff --git a/drivers/gpu/nova-core/falcon.rs b/drivers/gpu/nova-core/falcon.rs
index 24cc2c26e28d..053ce5bea6cd 100644
--- a/drivers/gpu/nova-core/falcon.rs
+++ b/drivers/gpu/nova-core/falcon.rs
@@ -40,6 +40,7 @@
     regs,
 };
 
+pub(crate) mod fsp;
 pub(crate) mod gsp;
 mod hal;
 pub(crate) mod sec2;
diff --git a/drivers/gpu/nova-core/falcon/fsp.rs b/drivers/gpu/nova-core/falcon/fsp.rs
new file mode 100644
index 000000000000..73fb73cb73a5
--- /dev/null
+++ b/drivers/gpu/nova-core/falcon/fsp.rs
@@ -0,0 +1,28 @@
+// SPDX-License-Identifier: GPL-2.0
+// SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+
+//! FSP (Firmware System Processor) falcon engine for Hopper/Blackwell GPUs.
+//!
+//! The FSP falcon handles secure boot and Chain of Trust operations
+//! on Hopper and Blackwell architectures, replacing SEC2's role.
+
+use kernel::io::register::RegisterBase;
+
+use crate::falcon::{
+    FalconEngine,
+    PFalcon2Base,
+    PFalconBase, //
+};
+
+/// Type specifying the `Fsp` falcon engine. Cannot be instantiated.
+pub(crate) struct Fsp(());
+
+impl RegisterBase<PFalconBase> for Fsp {
+    const BASE: usize = 0x8f2000;
+}
+
+impl RegisterBase<PFalcon2Base> for Fsp {
+    const BASE: usize = 0x8f3000;
+}
+
+impl FalconEngine for Fsp {}
diff --git a/drivers/gpu/nova-core/gsp/hal/gh100.rs b/drivers/gpu/nova-core/gsp/hal/gh100.rs
index 9a4bb22578b3..fe2689764c8d 100644
--- a/drivers/gpu/nova-core/gsp/hal/gh100.rs
+++ b/drivers/gpu/nova-core/gsp/hal/gh100.rs
@@ -11,6 +11,7 @@
 use crate::{
     driver::Bar0,
     falcon::{
+        fsp::Fsp as FspEngine,
         gsp::Gsp as GspEngine,
         sec2::Sec2,
         Falcon, //
@@ -35,14 +36,16 @@ impl GspHal for Gh100 {
     fn boot<'a>(
         &self,
         _gsp: &'a Gsp,
-        _dev: &'a device::Device<device::Bound>,
+        dev: &'a device::Device<device::Bound>,
         _bar: &'a Bar0,
-        _chipset: Chipset,
+        chipset: Chipset,
         _fb_layout: &FbLayout,
         _wpr_meta: &Coherent<GspFwWprMeta>,
         _gsp_falcon: &'a Falcon<GspEngine>,
         _sec2_falcon: &'a Falcon<Sec2>,
     ) -> Result<BootUnloadGuard<'a>> {
+        let _fsp_falcon = Falcon::<FspEngine>::new(dev, chipset)?;
+
         Err(ENOTSUPP)
     }
 }
-- 
2.54.0


* Re: [PATCH v11 10/22] gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub
  2026-05-30  3:09 ` [PATCH v11 10/22] gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub John Hubbard
@ 2026-06-01  7:47   ` Eliot Courtney
  2026-06-01 16:10   ` Timur Tabi
  1 sibling, 0 replies; 56+ messages in thread
From: Eliot Courtney @ 2026-06-01  7:47 UTC (permalink / raw)
  To: John Hubbard, Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML

On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
> Add the FSP (Firmware System Processor) falcon engine type that will
> handle secure boot and Chain of Trust operations on Hopper and Blackwell
> architectures.
>
> The FSP falcon replaces SEC2's role in the boot sequence for these newer
> architectures. This initial stub just defines the falcon type and its
> base address.
>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---

Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>


* Re: [PATCH v11 10/22] gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub
  2026-05-30  3:09 ` [PATCH v11 10/22] gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub John Hubbard
  2026-06-01  7:47   ` Eliot Courtney
@ 2026-06-01 16:10   ` Timur Tabi
  2026-06-01 18:17     ` John Hubbard
  1 sibling, 1 reply; 56+ messages in thread
From: Timur Tabi @ 2026-06-01 16:10 UTC (permalink / raw)
  To: Alexandre Courbot, dakr@kernel.org, John Hubbard
  Cc: aliceryhl@google.com, lossin@kernel.org, Shashank Sharma,
	boqun.feng@gmail.com, a.hindborg@kernel.org, Zhi Wang,
	simona@ffwll.ch, alex.gaynor@gmail.com, ojeda@kernel.org,
	tmgross@umich.edu, linux-kernel@vger.kernel.org,
	rust-for-linux@vger.kernel.org, bjorn3_gh@protonmail.com,
	Eliot Courtney, airlied@gmail.com, Joel Fernandes,
	bhelgaas@google.com, gary@garyguo.net, Alistair Popple

On Fri, 2026-05-29 at 20:09 -0700, John Hubbard wrote:
> Add the FSP (Firmware System Processor) falcon engine type that will

FYI, FSP = Foundation Security Processor

* Re: [PATCH v11 10/22] gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub
  2026-06-01 16:10   ` Timur Tabi
@ 2026-06-01 18:17     ` John Hubbard
  0 siblings, 0 replies; 56+ messages in thread
From: John Hubbard @ 2026-06-01 18:17 UTC (permalink / raw)
  To: Timur Tabi, Alexandre Courbot, dakr@kernel.org
  Cc: aliceryhl@google.com, lossin@kernel.org, Shashank Sharma,
	boqun.feng@gmail.com, a.hindborg@kernel.org, Zhi Wang,
	simona@ffwll.ch, alex.gaynor@gmail.com, ojeda@kernel.org,
	tmgross@umich.edu, linux-kernel@vger.kernel.org,
	rust-for-linux@vger.kernel.org, bjorn3_gh@protonmail.com,
	Eliot Courtney, airlied@gmail.com, Joel Fernandes,
	bhelgaas@google.com, gary@garyguo.net, Alistair Popple

On 6/1/26 9:10 AM, Timur Tabi wrote:
> On Fri, 2026-05-29 at 20:09 -0700, John Hubbard wrote:
>> Add the FSP (Firmware System Processor) falcon engine type that will
> 
> FYI, FSP = Foundation Security Processor

My first few search passes found the other name, so thanks for digging
up the official name. I'll update it.

thanks,
-- 
John Hubbard


* [PATCH v11 11/22] gpu: nova-core: Hopper/Blackwell: add FMC firmware image
  2026-05-30  3:09 [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (9 preceding siblings ...)
  2026-05-30  3:09 ` [PATCH v11 10/22] gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub John Hubbard
@ 2026-05-30  3:09 ` John Hubbard
  2026-06-01  8:38   ` Eliot Courtney
  2026-05-30  3:09 ` [PATCH v11 12/22] gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting John Hubbard
                   ` (11 subsequent siblings)
  22 siblings, 1 reply; 56+ messages in thread
From: John Hubbard @ 2026-05-30  3:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

FSP is the Falcon that runs FMC firmware on Hopper and Blackwell.
Load the FMC ELF in two forms: the image section that FSP boots from,
and the full Firmware object for later signature extraction during
Chain of Trust verification.

Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs      |  1 +
 drivers/gpu/nova-core/firmware/fsp.rs  | 47 ++++++++++++++++++++++++++
 drivers/gpu/nova-core/gsp/hal/gh100.rs |  5 +++
 3 files changed, 53 insertions(+)
 create mode 100644 drivers/gpu/nova-core/firmware/fsp.rs

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 866bc9b3571e..6edb50b83a29 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -28,6 +28,7 @@
 };
 
 pub(crate) mod booter;
+pub(crate) mod fsp;
 pub(crate) mod fwsec;
 pub(crate) mod gsp;
 pub(crate) mod riscv;
diff --git a/drivers/gpu/nova-core/firmware/fsp.rs b/drivers/gpu/nova-core/firmware/fsp.rs
new file mode 100644
index 000000000000..011be1e571c2
--- /dev/null
+++ b/drivers/gpu/nova-core/firmware/fsp.rs
@@ -0,0 +1,47 @@
+// SPDX-License-Identifier: GPL-2.0
+// SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+
+//! FSP is a hardware unit that runs FMC firmware.
+
+use kernel::{
+    device,
+    dma::Coherent,
+    firmware::Firmware,
+    prelude::*, //
+};
+
+use crate::{
+    firmware::elf,
+    gpu::Chipset, //
+};
+
+pub(crate) struct FspFirmware {
+    /// FMC firmware image data (only the "image" ELF section).
+    #[expect(dead_code)]
+    pub(crate) fmc_image: Coherent<[u8]>,
+    /// Full FMC ELF for signature extraction.
+    #[expect(dead_code)]
+    pub(crate) fmc_elf: Firmware,
+}
+
+impl FspFirmware {
+    pub(crate) fn new(
+        dev: &device::Device<device::Bound>,
+        chipset: Chipset,
+        ver: &str,
+    ) -> Result<Self> {
+        let fw = super::request_firmware(dev, chipset, "fmc", ver)?;
+
+        // FSP expects only the "image" section, not the entire ELF file.
+        let fmc_image_data = elf::elf_section(fw.data(), "image").ok_or_else(|| {
+            dev_err!(dev, "FMC ELF file missing 'image' section\n");
+            EINVAL
+        })?;
+        let fmc_image = Coherent::from_slice(dev, fmc_image_data, GFP_KERNEL)?;
+
+        Ok(Self {
+            fmc_image,
+            fmc_elf: fw,
+        })
+    }
+}
diff --git a/drivers/gpu/nova-core/gsp/hal/gh100.rs b/drivers/gpu/nova-core/gsp/hal/gh100.rs
index fe2689764c8d..c38d88bc42b0 100644
--- a/drivers/gpu/nova-core/gsp/hal/gh100.rs
+++ b/drivers/gpu/nova-core/gsp/hal/gh100.rs
@@ -17,6 +17,10 @@
         Falcon, //
     },
     fb::FbLayout,
+    firmware::{
+        fsp::FspFirmware,
+        FIRMWARE_VERSION, //
+    },
     gpu::Chipset,
     gsp::{
         boot::BootUnloadGuard,
@@ -45,6 +49,7 @@ fn boot<'a>(
         _sec2_falcon: &'a Falcon<Sec2>,
     ) -> Result<BootUnloadGuard<'a>> {
         let _fsp_falcon = Falcon::<FspEngine>::new(dev, chipset)?;
+        let _fsp_fw = FspFirmware::new(dev, chipset, FIRMWARE_VERSION)?;
 
         Err(ENOTSUPP)
     }
-- 
2.54.0


* Re: [PATCH v11 11/22] gpu: nova-core: Hopper/Blackwell: add FMC firmware image
  2026-05-30  3:09 ` [PATCH v11 11/22] gpu: nova-core: Hopper/Blackwell: add FMC firmware image John Hubbard
@ 2026-06-01  8:38   ` Eliot Courtney
  0 siblings, 0 replies; 56+ messages in thread
From: Eliot Courtney @ 2026-06-01  8:38 UTC (permalink / raw)
  To: John Hubbard, Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML

On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
> FSP is the Falcon that runs FMC firmware on Hopper and Blackwell.
> Load the FMC ELF in two forms: the image section that FSP boots from,
> and the full Firmware object for later signature extraction during
> Chain of Trust verification.
>
> Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  drivers/gpu/nova-core/firmware.rs      |  1 +
>  drivers/gpu/nova-core/firmware/fsp.rs  | 47 ++++++++++++++++++++++++++
>  drivers/gpu/nova-core/gsp/hal/gh100.rs |  5 +++
>  3 files changed, 53 insertions(+)
>  create mode 100644 drivers/gpu/nova-core/firmware/fsp.rs
>
> diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
> index 866bc9b3571e..6edb50b83a29 100644
> --- a/drivers/gpu/nova-core/firmware.rs
> +++ b/drivers/gpu/nova-core/firmware.rs
> @@ -28,6 +28,7 @@
>  };
>  
>  pub(crate) mod booter;
> +pub(crate) mod fsp;
>  pub(crate) mod fwsec;
>  pub(crate) mod gsp;
>  pub(crate) mod riscv;
> diff --git a/drivers/gpu/nova-core/firmware/fsp.rs b/drivers/gpu/nova-core/firmware/fsp.rs
> new file mode 100644
> index 000000000000..011be1e571c2
> --- /dev/null
> +++ b/drivers/gpu/nova-core/firmware/fsp.rs
> @@ -0,0 +1,47 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
> +
> +//! FSP is a hardware unit that runs FMC firmware.
> +
> +use kernel::{
> +    device,
> +    dma::Coherent,
> +    firmware::Firmware,
> +    prelude::*, //
> +};
> +
> +use crate::{
> +    firmware::elf,
> +    gpu::Chipset, //
> +};
> +
> +pub(crate) struct FspFirmware {
> +    /// FMC firmware image data (only the "image" ELF section).
> +    #[expect(dead_code)]
> +    pub(crate) fmc_image: Coherent<[u8]>,
> +    /// Full FMC ELF for signature extraction.
> +    #[expect(dead_code)]
> +    pub(crate) fmc_elf: Firmware,
> +}
> +
> +impl FspFirmware {
> +    pub(crate) fn new(
> +        dev: &device::Device<device::Bound>,
> +        chipset: Chipset,
> +        ver: &str,
> +    ) -> Result<Self> {
> +        let fw = super::request_firmware(dev, chipset, "fmc", ver)?;

Do we need to add this to ModInfoBuilder::make_entry_chipset?

* [PATCH v11 12/22] gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting
  2026-05-30  3:09 [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (10 preceding siblings ...)
  2026-05-30  3:09 ` [PATCH v11 11/22] gpu: nova-core: Hopper/Blackwell: add FMC firmware image John Hubbard
@ 2026-05-30  3:09 ` John Hubbard
  2026-06-01  7:48   ` Alexandre Courbot
  2026-05-30  3:09 ` [PATCH v11 13/22] gpu: nova-core: Hopper/Blackwell: add FMC signature extraction John Hubbard
                   ` (10 subsequent siblings)
  22 siblings, 1 reply; 56+ messages in thread
From: John Hubbard @ 2026-05-30  3:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Hopper and Blackwell use FSP instead of SEC2 for secure boot. The
driver must wait for FSP secure boot to complete before continuing
with GSP bring-up. Poll for boot success with a 5-second timeout.

Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/fsp.rs           | 51 ++++++++++++++++++++++++++
 drivers/gpu/nova-core/fsp/hal.rs       | 27 ++++++++++++++
 drivers/gpu/nova-core/fsp/hal/gb202.rs | 23 ++++++++++++
 drivers/gpu/nova-core/fsp/hal/gh100.rs | 23 ++++++++++++
 drivers/gpu/nova-core/gsp/hal/gh100.rs |  5 ++-
 drivers/gpu/nova-core/nova_core.rs     |  1 +
 drivers/gpu/nova-core/regs.rs          | 36 ++++++++++++++++++
 7 files changed, 165 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/nova-core/fsp.rs
 create mode 100644 drivers/gpu/nova-core/fsp/hal.rs
 create mode 100644 drivers/gpu/nova-core/fsp/hal/gb202.rs
 create mode 100644 drivers/gpu/nova-core/fsp/hal/gh100.rs

diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
new file mode 100644
index 000000000000..ee8fc384fe38
--- /dev/null
+++ b/drivers/gpu/nova-core/fsp.rs
@@ -0,0 +1,51 @@
+// SPDX-License-Identifier: GPL-2.0
+// SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+
+//! FSP (Foundation Security Processor) interface for Hopper/Blackwell GPUs.
+//!
+//! Hopper/Blackwell use a simplified firmware boot sequence: FMC, then FSP, then GSP.
+//! Unlike Turing/Ampere/Ada, there is no SEC2 (Security Engine 2) usage.
+//! FSP handles secure boot directly using FMC firmware and Chain of Trust.
+
+use kernel::{
+    device,
+    io::poll::read_poll_timeout,
+    prelude::*,
+    time::Delta, //
+};
+
+use crate::{
+    driver::Bar0,
+    gpu::Chipset,
+    regs, //
+};
+
+mod hal;
+
+/// FSP interface for Hopper/Blackwell GPUs.
+pub(crate) struct Fsp;
+
+impl Fsp {
+    /// Wait for FSP secure boot completion.
+    ///
+    /// Polls the thermal scratch register until FSP signals boot completion
+    /// or timeout occurs.
+    pub(crate) fn wait_secure_boot(dev: &device::Device, bar: &Bar0, chipset: Chipset) -> Result {
+        /// FSP secure boot completion timeout in milliseconds.
+        const FSP_SECURE_BOOT_TIMEOUT_MS: i64 = 5000;
+
+        let hal = hal::fsp_hal(chipset).ok_or(ENOTSUPP)?;
+
+        read_poll_timeout(
+            || Ok(hal.fsp_boot_status(bar)),
+            |&status| status == regs::NV_THERM_I2CS_SCRATCH_FSP_BOOT_COMPLETE_STATUS_SUCCESS,
+            Delta::from_millis(10),
+            Delta::from_millis(FSP_SECURE_BOOT_TIMEOUT_MS),
+        )
+        .map_err(|_| {
+            dev_err!(dev, "FSP secure boot completion timeout\n");
+            ETIMEDOUT
+        })
+        .map(|_| ())
+    }
+}
diff --git a/drivers/gpu/nova-core/fsp/hal.rs b/drivers/gpu/nova-core/fsp/hal.rs
new file mode 100644
index 000000000000..83d1e7daa998
--- /dev/null
+++ b/drivers/gpu/nova-core/fsp/hal.rs
@@ -0,0 +1,27 @@
+// SPDX-License-Identifier: GPL-2.0
+// SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+
+use crate::{
+    driver::Bar0,
+    gpu::{
+        Architecture,
+        Chipset, //
+    },
+};
+
+mod gb202;
+mod gh100;
+
+pub(super) trait FspHal {
+    /// Returns the secure boot status from the architecture-specific `NV_THERM_I2CS_SCRATCH` register.
+    fn fsp_boot_status(&self, bar: &Bar0) -> u32;
+}
+
+/// Returns the FSP HAL, or `None` if the architecture doesn't support FSP.
+pub(crate) fn fsp_hal(chipset: Chipset) -> Option<&'static dyn FspHal> {
+    match chipset.arch() {
+        Architecture::Turing | Architecture::Ampere | Architecture::Ada => None,
+        Architecture::Hopper | Architecture::BlackwellGB10x => Some(gh100::GH100_HAL),
+        Architecture::BlackwellGB20x => Some(gb202::GB202_HAL),
+    }
+}
diff --git a/drivers/gpu/nova-core/fsp/hal/gb202.rs b/drivers/gpu/nova-core/fsp/hal/gb202.rs
new file mode 100644
index 000000000000..2f08b6c9f308
--- /dev/null
+++ b/drivers/gpu/nova-core/fsp/hal/gb202.rs
@@ -0,0 +1,23 @@
+// SPDX-License-Identifier: GPL-2.0
+// SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+
+use kernel::io::Io;
+
+use crate::{
+    driver::Bar0,
+    fsp::hal::FspHal,
+    regs, //
+};
+
+struct Gb202;
+
+impl FspHal for Gb202 {
+    fn fsp_boot_status(&self, bar: &Bar0) -> u32 {
+        bar.read(regs::gb202::NV_THERM_I2CS_SCRATCH_FSP_BOOT_COMPLETE)
+            .fsp_boot_complete()
+            .into()
+    }
+}
+
+const GB202: Gb202 = Gb202;
+pub(super) const GB202_HAL: &dyn FspHal = &GB202;
diff --git a/drivers/gpu/nova-core/fsp/hal/gh100.rs b/drivers/gpu/nova-core/fsp/hal/gh100.rs
new file mode 100644
index 000000000000..290fb55a81da
--- /dev/null
+++ b/drivers/gpu/nova-core/fsp/hal/gh100.rs
@@ -0,0 +1,23 @@
+// SPDX-License-Identifier: GPL-2.0
+// SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+
+use kernel::io::Io;
+
+use crate::{
+    driver::Bar0,
+    fsp::hal::FspHal,
+    regs, //
+};
+
+struct Gh100;
+
+impl FspHal for Gh100 {
+    fn fsp_boot_status(&self, bar: &Bar0) -> u32 {
+        bar.read(regs::gh100::NV_THERM_I2CS_SCRATCH_FSP_BOOT_COMPLETE)
+            .fsp_boot_complete()
+            .into()
+    }
+}
+
+const GH100: Gh100 = Gh100;
+pub(super) const GH100_HAL: &dyn FspHal = &GH100;
diff --git a/drivers/gpu/nova-core/gsp/hal/gh100.rs b/drivers/gpu/nova-core/gsp/hal/gh100.rs
index c38d88bc42b0..151df05e303b 100644
--- a/drivers/gpu/nova-core/gsp/hal/gh100.rs
+++ b/drivers/gpu/nova-core/gsp/hal/gh100.rs
@@ -21,6 +21,7 @@
         fsp::FspFirmware,
         FIRMWARE_VERSION, //
     },
+    fsp::Fsp,
     gpu::Chipset,
     gsp::{
         boot::BootUnloadGuard,
@@ -41,7 +42,7 @@ fn boot<'a>(
         &self,
         _gsp: &'a Gsp,
         dev: &'a device::Device<device::Bound>,
-        _bar: &'a Bar0,
+        bar: &'a Bar0,
         chipset: Chipset,
         _fb_layout: &FbLayout,
         _wpr_meta: &Coherent<GspFwWprMeta>,
@@ -51,6 +52,8 @@ fn boot<'a>(
         let _fsp_falcon = Falcon::<FspEngine>::new(dev, chipset)?;
         let _fsp_fw = FspFirmware::new(dev, chipset, FIRMWARE_VERSION)?;
 
+        Fsp::wait_secure_boot(dev, bar, chipset)?;
+
         Err(ENOTSUPP)
     }
 }
diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs
index 5a260062295f..7b6c331da10e 100644
--- a/drivers/gpu/nova-core/nova_core.rs
+++ b/drivers/gpu/nova-core/nova_core.rs
@@ -17,6 +17,7 @@
 mod falcon;
 mod fb;
 mod firmware;
+mod fsp;
 mod gpu;
 mod gsp;
 #[macro_use]
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index 65be6ec71ed4..270779d31ab3 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -579,3 +579,39 @@ pub(crate) mod ga100 {
         }
     }
 }
+
+pub(crate) const NV_THERM_I2CS_SCRATCH_FSP_BOOT_COMPLETE_STATUS_SUCCESS: u32 = 0xff;
+
+pub(crate) mod gh100 {
+    use kernel::io::register;
+
+    // PTHERM
+
+    register! {
+        pub(crate) NV_THERM_I2CS_SCRATCH(u32) @ 0x000200bc {
+            31:0    data;
+        }
+
+        // Alias to `NV_THERM_I2CS_SCRATCH` when used to check for FSP boot completion.
+        pub(crate) NV_THERM_I2CS_SCRATCH_FSP_BOOT_COMPLETE(u32) => NV_THERM_I2CS_SCRATCH {
+            31:0    fsp_boot_complete;
+        }
+    }
+}
+
+pub(crate) mod gb202 {
+    use kernel::io::register;
+
+    // PTHERM
+
+    register! {
+        pub(crate) NV_THERM_I2CS_SCRATCH(u32) @ 0x00ad00bc {
+            31:0    data;
+        }
+
+        // Alias to `NV_THERM_I2CS_SCRATCH` when used to check for FSP boot completion.
+        pub(crate) NV_THERM_I2CS_SCRATCH_FSP_BOOT_COMPLETE(u32) => NV_THERM_I2CS_SCRATCH {
+            31:0    fsp_boot_complete;
+        }
+    }
+}
-- 
2.54.0


* Re: [PATCH v11 12/22] gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting
  2026-05-30  3:09 ` [PATCH v11 12/22] gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting John Hubbard
@ 2026-06-01  7:48   ` Alexandre Courbot
  2026-06-01  8:32     ` Eliot Courtney
  0 siblings, 1 reply; 56+ messages in thread
From: Alexandre Courbot @ 2026-06-01  7:48 UTC (permalink / raw)
  To: John Hubbard
  Cc: Danilo Krummrich, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
> Hopper and Blackwell use FSP instead of SEC2 for secure boot. The
> driver must wait for FSP secure boot to complete before continuing
> with GSP bring-up. Poll for boot success with a 5-second timeout.
>
> Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  drivers/gpu/nova-core/fsp.rs           | 51 ++++++++++++++++++++++++++
>  drivers/gpu/nova-core/fsp/hal.rs       | 27 ++++++++++++++
>  drivers/gpu/nova-core/fsp/hal/gb202.rs | 23 ++++++++++++
>  drivers/gpu/nova-core/fsp/hal/gh100.rs | 23 ++++++++++++
>  drivers/gpu/nova-core/gsp/hal/gh100.rs |  5 ++-
>  drivers/gpu/nova-core/nova_core.rs     |  1 +
>  drivers/gpu/nova-core/regs.rs          | 36 ++++++++++++++++++
>  7 files changed, 165 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/gpu/nova-core/fsp.rs
>  create mode 100644 drivers/gpu/nova-core/fsp/hal.rs
>  create mode 100644 drivers/gpu/nova-core/fsp/hal/gb202.rs
>  create mode 100644 drivers/gpu/nova-core/fsp/hal/gh100.rs
>
> diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
> new file mode 100644
> index 000000000000..ee8fc384fe38
> --- /dev/null
> +++ b/drivers/gpu/nova-core/fsp.rs
> @@ -0,0 +1,51 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
> +
> +//! FSP (Foundation Security Processor) interface for Hopper/Blackwell GPUs.
> +//!
> +//! Hopper/Blackwell use a simplified firmware boot sequence: FMC, then FSP, then GSP.
> +//! Unlike Turing/Ampere/Ada, there is no SEC2 (Security Engine 2) usage.
> +//! FSP handles secure boot directly using FMC firmware and Chain of Trust.
> +
> +use kernel::{
> +    device,
> +    io::poll::read_poll_timeout,
> +    prelude::*,
> +    time::Delta, //
> +};
> +
> +use crate::{
> +    driver::Bar0,
> +    gpu::Chipset,
> +    regs, //
> +};
> +
> +mod hal;
> +
> +/// FSP interface for Hopper/Blackwell GPUs.
> +pub(crate) struct Fsp;

Throughout the patchset, this type is never instantiated and is only
used as a namespace for static methods - something that the `fsp` module
itself could also do.

But I think it could be useful to create it and pass it as the `&mut
self` parameter of the other methods in the module, as doing so would
make the module more resilient: we could request `&mut self` for its two
other methods added later in the patchset and guarantee that there won't
be any concurrency issue.

This should also probably create and own the `Falcon<Fsp>`, as it is the
only user through this patchset (and this makes sense from an
architectural point of view). The `FSP falcon engine stub` patch could
then refrain from creating the `Falcon<Fsp>`, which would be created by
this patch.

> +
> +impl Fsp {
> +    /// Wait for FSP secure boot completion.
> +    ///
> +    /// Polls the thermal scratch register until FSP signals boot completion
> +    /// or timeout occurs.
> +    pub(crate) fn wait_secure_boot(dev: &device::Device, bar: &Bar0, chipset: Chipset) -> Result {

... and with the design proposed above, this method can return
`Result<Fsp>`: we are not supposed to use `Fsp` until secure boot has
successfully completed, so making it return the instance that enables
the other methods guarantees that this has happened at the API level.
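
To illustrate the design proposed above, here is a minimal standalone
sketch, not the driver code. It uses plain std Rust instead of the kernel
crate, and `MockBar`, `Error`, `send_message` and the `exchanges` counter
are hypothetical names invented for the example. Only the 0xff success
value and the `wait_secure_boot` name come from the patch. The point it
shows: if `Fsp` can only be obtained from `wait_secure_boot`, holding one
proves that secure boot completed, and `&mut self` on the later methods
lets the borrow checker rule out overlapping exchanges.

```rust
mod fsp {
    use std::cell::Cell;
    use std::time::{Duration, Instant};

    /// Scratch value that signals success (0xff in the patch's regs.rs).
    pub const BOOT_SUCCESS: u32 = 0xff;

    /// Hypothetical stand-in for BAR0: reports success on the
    /// `ready_after`-th read of the scratch register.
    pub struct MockBar {
        reads: Cell<u32>,
        ready_after: u32,
    }

    impl MockBar {
        pub fn new(ready_after: u32) -> Self {
            Self { reads: Cell::new(0), ready_after }
        }

        fn fsp_boot_status(&self) -> u32 {
            let n = self.reads.get() + 1;
            self.reads.set(n);
            if n >= self.ready_after { BOOT_SUCCESS } else { 0 }
        }
    }

    #[derive(Debug, PartialEq)]
    pub enum Error {
        TimedOut,
    }

    /// The private field keeps code outside this module from building an
    /// `Fsp` by hand, so the only way to get one is `wait_secure_boot`.
    pub struct Fsp {
        exchanges: u32,
    }

    impl Fsp {
        /// Poll until the mock register reports success or `timeout` elapses.
        pub fn wait_secure_boot(bar: &MockBar, timeout: Duration) -> Result<Fsp, Error> {
            let deadline = Instant::now() + timeout;
            loop {
                if bar.fsp_boot_status() == BOOT_SUCCESS {
                    return Ok(Fsp { exchanges: 0 });
                }
                if Instant::now() >= deadline {
                    return Err(Error::TimedOut);
                }
                std::thread::sleep(Duration::from_millis(1));
            }
        }

        /// Hypothetical exchange: `&mut self` means two exchanges cannot be
        /// in flight at once. Returns the number of exchanges so far.
        pub fn send_message(&mut self, _payload: &[u8]) -> u32 {
            self.exchanges += 1;
            self.exchanges
        }
    }
}
```

A caller would write something like
`let mut fsp = fsp::Fsp::wait_secure_boot(&bar, timeout)?;` and only then
call `fsp.send_message(..)`. Calling `send_message` without a successful
wait does not compile, because there is no other way to get an `Fsp`.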

* Re: [PATCH v11 12/22] gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting
  2026-06-01  7:48   ` Alexandre Courbot
@ 2026-06-01  8:32     ` Eliot Courtney
  2026-06-01 13:07       ` Alexandre Courbot
  0 siblings, 1 reply; 56+ messages in thread
From: Eliot Courtney @ 2026-06-01  8:32 UTC (permalink / raw)
  To: Alexandre Courbot, John Hubbard
  Cc: Danilo Krummrich, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On Mon Jun 1, 2026 at 4:48 PM JST, Alexandre Courbot wrote:
> On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
>> Hopper and Blackwell use FSP instead of SEC2 for secure boot. The
>> driver must wait for FSP secure boot to complete before continuing
>> with GSP bring-up. Poll for boot success with a 5-second timeout.
>>
>> Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
>> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
>> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
>> ---
>>  drivers/gpu/nova-core/fsp.rs           | 51 ++++++++++++++++++++++++++
>>  drivers/gpu/nova-core/fsp/hal.rs       | 27 ++++++++++++++
>>  drivers/gpu/nova-core/fsp/hal/gb202.rs | 23 ++++++++++++
>>  drivers/gpu/nova-core/fsp/hal/gh100.rs | 23 ++++++++++++
>>  drivers/gpu/nova-core/gsp/hal/gh100.rs |  5 ++-
>>  drivers/gpu/nova-core/nova_core.rs     |  1 +
>>  drivers/gpu/nova-core/regs.rs          | 36 ++++++++++++++++++
>>  7 files changed, 165 insertions(+), 1 deletion(-)
>>  create mode 100644 drivers/gpu/nova-core/fsp.rs
>>  create mode 100644 drivers/gpu/nova-core/fsp/hal.rs
>>  create mode 100644 drivers/gpu/nova-core/fsp/hal/gb202.rs
>>  create mode 100644 drivers/gpu/nova-core/fsp/hal/gh100.rs
>>
>> diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
>> new file mode 100644
>> index 000000000000..ee8fc384fe38
>> --- /dev/null
>> +++ b/drivers/gpu/nova-core/fsp.rs
>> @@ -0,0 +1,51 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +// SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
>> +
>> +//! FSP (Foundation Security Processor) interface for Hopper/Blackwell GPUs.
>> +//!
>> +//! Hopper/Blackwell use a simplified firmware boot sequence: FMC, then FSP, then GSP.
>> +//! Unlike Turing/Ampere/Ada, there is no SEC2 (Security Engine 2) usage.
>> +//! FSP handles secure boot directly using FMC firmware and Chain of Trust.
>> +
>> +use kernel::{
>> +    device,
>> +    io::poll::read_poll_timeout,
>> +    prelude::*,
>> +    time::Delta, //
>> +};
>> +
>> +use crate::{
>> +    driver::Bar0,
>> +    gpu::Chipset,
>> +    regs, //
>> +};
>> +
>> +mod hal;
>> +
>> +/// FSP interface for Hopper/Blackwell GPUs.
>> +pub(crate) struct Fsp;
>
> Throughout the patchset, this type is never instantiated and is only
> used as a namespace for static methods - something that the `fsp` module
> itself could also do.
>
> But I think it could be useful to create it and pass it as the `&mut
> self` parameter of the other methods in the module, as doing so would
> make the module more resilient: we could request `&mut self` for its two
> other methods added later in the patchset and guarantee that there won't
> be any concurrency issue.
>
> This should also probably create and own the `Falcon<Fsp>`, as it is the
> only user through this patchset (and this makes sense from an
> architectural point of view). The `FSP falcon engine stub` patch could
> then refrain from creating the `Falcon<Fsp>`, which would be created by
> this patch.
>
>> +
>> +impl Fsp {
>> +    /// Wait for FSP secure boot completion.
>> +    ///
>> +    /// Polls the thermal scratch register until FSP signals boot completion
>> +    /// or timeout occurs.
>> +    pub(crate) fn wait_secure_boot(dev: &device::Device, bar: &Bar0, chipset: Chipset) -> Result {
>
> ... and with the design proposed above, this method can return
> `Result<Fsp>`: we are not supposed to use `Fsp` until secure boot has
> successfully completed, so making it return the instance that enables
> the other methods guarantees that this has happened at the API level.

I think this is a good direction. I think we can also make FspFirmware
owned by this fsp::Fsp object - i.e., move more control of FSP from
Gh100::boot to the Fsp object. You might need to change up FmcBootArgs
and e.g. return something from `boot_fmc` to keep the DMA allocation
alive.
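
A rough standalone sketch of that ownership layout follows, under stated
assumptions: every type here is a hypothetical std-only stand-in
(`DmaBuffer` for a coherent DMA allocation, `Falcon` for the FSP falcon
handle), and `FmcBootGuard`, `args_len`, `is_started` and `image_len` are
names made up for the example. It is not how the series implements
`boot_fmc`. It only shows `Fsp` owning the firmware and the falcon, and
`boot_fmc` returning a value that keeps the boot-argument allocation alive.

```rust
/// Hypothetical stand-in for a coherent DMA allocation.
pub struct DmaBuffer(Vec<u8>);

/// Hypothetical stand-in for `FspFirmware`.
pub struct FspFirmware {
    fmc_image: DmaBuffer,
}

/// Hypothetical stand-in for the FSP falcon handle.
pub struct Falcon {
    started: bool,
}

/// Owns the falcon and the firmware, so the boot code handles one object.
pub struct Fsp {
    falcon: Falcon,
    fw: FspFirmware,
}

/// Returned by `boot_fmc`; the boot arguments stay allocated for as long
/// as the caller keeps this value.
pub struct FmcBootGuard {
    args: DmaBuffer,
}

impl FmcBootGuard {
    pub fn args_len(&self) -> usize {
        self.args.0.len()
    }
}

impl Fsp {
    pub fn new(fmc_image: &[u8]) -> Self {
        Self {
            falcon: Falcon { started: false },
            fw: FspFirmware { fmc_image: DmaBuffer(fmc_image.to_vec()) },
        }
    }

    /// Copies the boot arguments into their own allocation and hands its
    /// ownership back to the caller through the guard.
    pub fn boot_fmc(&mut self, boot_args: &[u8]) -> FmcBootGuard {
        let args = DmaBuffer(boot_args.to_vec());
        // A real driver would pass the addresses of `self.fw.fmc_image`
        // and `args` to FSP here; the sketch only flips a flag.
        self.falcon.started = true;
        FmcBootGuard { args }
    }

    pub fn is_started(&self) -> bool {
        self.falcon.started
    }

    pub fn image_len(&self) -> usize {
        self.fw.fmc_image.0.len()
    }
}
```

With this shape the caller no longer passes the firmware image alongside
the boot arguments, since `Fsp` already holds it.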

* Re: [PATCH v11 12/22] gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting
  2026-06-01  8:32     ` Eliot Courtney
@ 2026-06-01 13:07       ` Alexandre Courbot
  2026-06-01 18:18         ` John Hubbard
  0 siblings, 1 reply; 56+ messages in thread
From: Alexandre Courbot @ 2026-06-01 13:07 UTC (permalink / raw)
  To: Eliot Courtney
  Cc: John Hubbard, Danilo Krummrich, Joel Fernandes, Timur Tabi,
	Alistair Popple, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On Mon Jun 1, 2026 at 5:32 PM JST, Eliot Courtney wrote:
> On Mon Jun 1, 2026 at 4:48 PM JST, Alexandre Courbot wrote:
>> On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
>>> +
>>> +impl Fsp {
>>> +    /// Wait for FSP secure boot completion.
>>> +    ///
>>> +    /// Polls the thermal scratch register until FSP signals boot completion
>>> +    /// or timeout occurs.
>>> +    pub(crate) fn wait_secure_boot(dev: &device::Device, bar: &Bar0, chipset: Chipset) -> Result {
>>
>> ... and with the design proposed above, this method can return
>> `Result<Fsp>`: we are not supposed to use `Fsp` until secure boot has
>> successfully completed, so making it return the instance that enables
>> the other methods guarantees that this has happened at the API level.
>
> I think this is a good direction. I think we can also make FspFirmware
> owned by this fsp::Fsp object - i.e., move more control of FSP from
> Gh100::boot to the Fsp object. You might need to change up FmcBootArgs
> and e.g. return something from `boot_fmc` to keep the DMA allocation
> alive.

Good idea - that would also remove one parameter from `FmcBootArgs` later.

* Re: [PATCH v11 12/22] gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting
  2026-06-01 13:07       ` Alexandre Courbot
@ 2026-06-01 18:18         ` John Hubbard
  0 siblings, 0 replies; 56+ messages in thread
From: John Hubbard @ 2026-06-01 18:18 UTC (permalink / raw)
  To: Alexandre Courbot, Eliot Courtney
  Cc: Danilo Krummrich, Joel Fernandes, Timur Tabi, Alistair Popple,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML

On 6/1/26 6:07 AM, Alexandre Courbot wrote:
> On Mon Jun 1, 2026 at 5:32 PM JST, Eliot Courtney wrote:
>> On Mon Jun 1, 2026 at 4:48 PM JST, Alexandre Courbot wrote:
>>> On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
>>>> +
>>>> +impl Fsp {
>>>> +    /// Wait for FSP secure boot completion.
>>>> +    ///
>>>> +    /// Polls the thermal scratch register until FSP signals boot completion
>>>> +    /// or timeout occurs.
>>>> +    pub(crate) fn wait_secure_boot(dev: &device::Device, bar: &Bar0, chipset: Chipset) -> Result {
>>>
>>> ... and with the design proposed above, this method can return
>>> `Result<Fsp>`: we are not supposed to use `Fsp` until secure boot has
>>> successfully completed, so making it return the instance that enables
>>> the other methods guarantees that this has happened at the API level.
>>
>> I think this is a good direction. I think we can also make FspFirmware
>> owned by this fsp::Fsp object - i.e., move more control of FSP from
>> Gh100::boot to the Fsp object. You might need to change up FmcBootArgs
>> and e.g. return something from `boot_fmc` to keep the DMA allocation
>> alive.
> 
> Good idea - that would also remove one parameter from `FmcBootArgs` later.

OK, I will trot off in that direction, sounds good.

thanks,
-- 
John Hubbard


* [PATCH v11 13/22] gpu: nova-core: Hopper/Blackwell: add FMC signature extraction
  2026-05-30  3:09 [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (11 preceding siblings ...)
  2026-05-30  3:09 ` [PATCH v11 12/22] gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting John Hubbard
@ 2026-05-30  3:09 ` John Hubbard
  2026-06-01  8:55   ` Eliot Courtney
  2026-06-01 14:45   ` Alexandre Courbot
  2026-05-30  3:09 ` [PATCH v11 14/22] gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations John Hubbard
                   ` (9 subsequent siblings)
  22 siblings, 2 replies; 56+ messages in thread
From: John Hubbard @ 2026-05-30  3:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Extract the SHA-384 hash, RSA public key, and RSA signature from the
FMC ELF32 firmware sections. FSP Chain of Trust verification needs
these to validate the FMC image during boot.

Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs     |  2 +-
 drivers/gpu/nova-core/firmware/fsp.rs | 90 ++++++++++++++++++++++++++-
 2 files changed, 88 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 6edb50b83a29..569efee0d4ac 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -641,7 +641,7 @@ fn elf32_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
     }
 
     /// Automatically detects ELF32 vs ELF64 based on the ELF header.
-    pub(super) fn elf_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
+    pub(crate) fn elf_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
         // Check ELF magic.
         if elf.len() < 5 || elf.get(0..4)? != b"\x7fELF" {
             return None;
diff --git a/drivers/gpu/nova-core/firmware/fsp.rs b/drivers/gpu/nova-core/firmware/fsp.rs
index 011be1e571c2..dc28d0cc2d03 100644
--- a/drivers/gpu/nova-core/firmware/fsp.rs
+++ b/drivers/gpu/nova-core/firmware/fsp.rs
@@ -15,13 +15,35 @@
     gpu::Chipset, //
 };
 
+/// Size of the FSP SHA-384 hash, in bytes.
+pub(crate) const FSP_HASH_SIZE: usize = 48;
+/// Maximum size of the FSP public key (RSA-3072), in bytes.
+///
+/// The FMC ELF `publickey` section may be shorter, so the remaining bytes are zero-padded.
+pub(crate) const FSP_PKEY_SIZE: usize = 384;
+/// Maximum size of the FSP signature (RSA-3072), in bytes.
+///
+/// The FMC ELF `signature` section may be shorter, so the remaining bytes are zero-padded.
+pub(crate) const FSP_SIG_SIZE: usize = 384;
+
+/// Structure to hold FMC signatures.
+///
+/// C representation is used because this type is used for communication with the FSP.
+#[derive(Debug, Clone, Copy)]
+#[repr(C)]
+pub(crate) struct FmcSignatures {
+    pub(crate) hash384: [u8; FSP_HASH_SIZE],
+    pub(crate) public_key: [u8; FSP_PKEY_SIZE],
+    pub(crate) signature: [u8; FSP_SIG_SIZE],
+}
+
 pub(crate) struct FspFirmware {
     /// FMC firmware image data (only the "image" ELF section).
     #[expect(dead_code)]
     pub(crate) fmc_image: Coherent<[u8]>,
-    /// Full FMC ELF for signature extraction.
+    /// FMC firmware signatures.
     #[expect(dead_code)]
-    pub(crate) fmc_elf: Firmware,
+    pub(crate) fmc_sigs: KBox<FmcSignatures>,
 }
 
 impl FspFirmware {
@@ -41,7 +63,69 @@ pub(crate) fn new(
 
         Ok(Self {
             fmc_image,
-            fmc_elf: fw,
+            fmc_sigs: Self::extract_fmc_signatures(&fw, dev)?,
         })
     }
+
+    /// Extract FMC firmware signatures for Chain of Trust verification.
+    ///
+    /// Extracts real cryptographic signatures from FMC ELF32 firmware sections.
+    /// Returns signatures in a heap-allocated structure to prevent stack overflow.
+    fn extract_fmc_signatures(
+        fmc_fw: &Firmware,
+        dev: &device::Device,
+    ) -> Result<KBox<FmcSignatures>> {
+        let get_section = |name: &str, max_len: usize| {
+            elf::elf_section(fmc_fw.data(), name)
+                .ok_or(EINVAL)
+                .inspect_err(|_| dev_err!(dev, "FMC firmware missing '{}' section\n", name))
+                .and_then(|section| {
+                    if section.len() > max_len {
+                        dev_err!(
+                            dev,
+                            "FMC {} section size {} > maximum {}\n",
+                            name,
+                            section.len(),
+                            max_len
+                        );
+                        Err(EINVAL)
+                    } else {
+                        Ok(section)
+                    }
+                })
+        };
+
+        let hash_section = get_section("hash", FSP_HASH_SIZE)?;
+        let pkey_section = get_section("publickey", FSP_PKEY_SIZE)?;
+        let sig_section = get_section("signature", FSP_SIG_SIZE)?;
+
+        // The hash section is a SHA-384 output: it must be exactly FSP_HASH_SIZE bytes.
+        if hash_section.len() != FSP_HASH_SIZE {
+            dev_err!(
+                dev,
+                "FMC hash section size {} != expected {}\n",
+                hash_section.len(),
+                FSP_HASH_SIZE
+            );
+            return Err(EINVAL);
+        }
+
+        let mut signatures = KBox::new(
+            FmcSignatures {
+                hash384: [0; _],
+                public_key: [0; _],
+                signature: [0; _],
+            },
+            GFP_KERNEL,
+        )?;
+
+        // PANIC: src and dst lengths are both FSP_HASH_SIZE (verified above).
+        signatures.hash384.copy_from_slice(hash_section);
+        // PANIC: dst is sliced to src.len(); src.len() <= FSP_PKEY_SIZE per `get_section`.
+        signatures.public_key[..pkey_section.len()].copy_from_slice(pkey_section);
+        // PANIC: dst is sliced to src.len(); src.len() <= FSP_SIG_SIZE per `get_section`.
+        signatures.signature[..sig_section.len()].copy_from_slice(sig_section);
+
+        Ok(signatures)
+    }
 }
-- 
2.54.0



* Re: [PATCH v11 13/22] gpu: nova-core: Hopper/Blackwell: add FMC signature extraction
  2026-05-30  3:09 ` [PATCH v11 13/22] gpu: nova-core: Hopper/Blackwell: add FMC signature extraction John Hubbard
@ 2026-06-01  8:55   ` Eliot Courtney
  2026-06-01 14:45   ` Alexandre Courbot
  1 sibling, 0 replies; 56+ messages in thread
From: Eliot Courtney @ 2026-06-01  8:55 UTC (permalink / raw)
  To: John Hubbard, Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML

On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
> Extract the SHA-384 hash, RSA public key, and RSA signature from the
> FMC ELF32 firmware sections. FSP Chain of Trust verification needs
> these to validate the FMC image during boot.
>
> Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  drivers/gpu/nova-core/firmware.rs     |  2 +-
>  drivers/gpu/nova-core/firmware/fsp.rs | 90 ++++++++++++++++++++++++++-
>  2 files changed, 88 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
> index 6edb50b83a29..569efee0d4ac 100644
> --- a/drivers/gpu/nova-core/firmware.rs
> +++ b/drivers/gpu/nova-core/firmware.rs
> @@ -641,7 +641,7 @@ fn elf32_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
>      }
>  
>      /// Automatically detects ELF32 vs ELF64 based on the ELF header.
> -    pub(super) fn elf_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
> +    pub(crate) fn elf_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {

I think we don't need to widen visibility here.

>          // Check ELF magic.
>          if elf.len() < 5 || elf.get(0..4)? != b"\x7fELF" {
>              return None;
> diff --git a/drivers/gpu/nova-core/firmware/fsp.rs b/drivers/gpu/nova-core/firmware/fsp.rs
> index 011be1e571c2..dc28d0cc2d03 100644
> --- a/drivers/gpu/nova-core/firmware/fsp.rs
> +++ b/drivers/gpu/nova-core/firmware/fsp.rs
> @@ -15,13 +15,35 @@
>      gpu::Chipset, //
>  };
>  
> +/// Size of the FSP SHA-384 hash, in bytes.
> +pub(crate) const FSP_HASH_SIZE: usize = 48;
> +/// Maximum size of the FSP public key (RSA-3072), in bytes.
> +///
> +/// The FMC ELF `publickey` section may be shorter, so the remaining bytes are zero-padded.
> +pub(crate) const FSP_PKEY_SIZE: usize = 384;
> +/// Maximum size of the FSP signature (RSA-3072), in bytes.
> +///
> +/// The FMC ELF `signature` section may be shorter, so the remaining bytes are zero-padded.
> +pub(crate) const FSP_SIG_SIZE: usize = 384;

These constants look unused outside of this file to me so we can keep
them private.



* Re: [PATCH v11 13/22] gpu: nova-core: Hopper/Blackwell: add FMC signature extraction
  2026-05-30  3:09 ` [PATCH v11 13/22] gpu: nova-core: Hopper/Blackwell: add FMC signature extraction John Hubbard
  2026-06-01  8:55   ` Eliot Courtney
@ 2026-06-01 14:45   ` Alexandre Courbot
  2026-06-01 14:49     ` Alexandre Courbot
  1 sibling, 1 reply; 56+ messages in thread
From: Alexandre Courbot @ 2026-06-01 14:45 UTC (permalink / raw)
  To: John Hubbard
  Cc: Danilo Krummrich, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
> Extract the SHA-384 hash, RSA public key, and RSA signature from the
> FMC ELF32 firmware sections. FSP Chain of Trust verification needs
> these to validate the FMC image during boot.
>
> Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  drivers/gpu/nova-core/firmware.rs     |  2 +-
>  drivers/gpu/nova-core/firmware/fsp.rs | 90 ++++++++++++++++++++++++++-
>  2 files changed, 88 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
> index 6edb50b83a29..569efee0d4ac 100644
> --- a/drivers/gpu/nova-core/firmware.rs
> +++ b/drivers/gpu/nova-core/firmware.rs
> @@ -641,7 +641,7 @@ fn elf32_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
>      }
>  
>      /// Automatically detects ELF32 vs ELF64 based on the ELF header.
> -    pub(super) fn elf_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
> +    pub(crate) fn elf_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
>          // Check ELF magic.
>          if elf.len() < 5 || elf.get(0..4)? != b"\x7fELF" {
>              return None;
> diff --git a/drivers/gpu/nova-core/firmware/fsp.rs b/drivers/gpu/nova-core/firmware/fsp.rs
> index 011be1e571c2..dc28d0cc2d03 100644
> --- a/drivers/gpu/nova-core/firmware/fsp.rs
> +++ b/drivers/gpu/nova-core/firmware/fsp.rs
> @@ -15,13 +15,35 @@
>      gpu::Chipset, //
>  };
>  
> +/// Size of the FSP SHA-384 hash, in bytes.
> +pub(crate) const FSP_HASH_SIZE: usize = 48;
> +/// Maximum size of the FSP public key (RSA-3072), in bytes.
> +///
> +/// The FMC ELF `publickey` section may be shorter, so the remaining bytes are zero-padded.
> +pub(crate) const FSP_PKEY_SIZE: usize = 384;
> +/// Maximum size of the FSP signature (RSA-3072), in bytes.
> +///
> +/// The FMC ELF `signature` section may be shorter, so the remaining bytes are zero-padded.
> +pub(crate) const FSP_SIG_SIZE: usize = 384;
> +
> +/// Structure to hold FMC signatures.
> +///
> +/// C representation is used because this type is used for communication with the FSP.
> +#[derive(Debug, Clone, Copy)]
> +#[repr(C)]
> +pub(crate) struct FmcSignatures {
> +    pub(crate) hash384: [u8; FSP_HASH_SIZE],
> +    pub(crate) public_key: [u8; FSP_PKEY_SIZE],
> +    pub(crate) signature: [u8; FSP_SIG_SIZE],
> +}
> +
>  pub(crate) struct FspFirmware {
>      /// FMC firmware image data (only the "image" ELF section).
>      #[expect(dead_code)]
>      pub(crate) fmc_image: Coherent<[u8]>,
> -    /// Full FMC ELF for signature extraction.
> +    /// FMC firmware signatures.
>      #[expect(dead_code)]
> -    pub(crate) fmc_elf: Firmware,
> +    pub(crate) fmc_sigs: KBox<FmcSignatures>,
>  }
>  
>  impl FspFirmware {
> @@ -41,7 +63,69 @@ pub(crate) fn new(
>  
>          Ok(Self {
>              fmc_image,
> -            fmc_elf: fw,
> +            fmc_sigs: Self::extract_fmc_signatures(&fw, dev)?,
>          })
>      }
> +
> +    /// Extract FMC firmware signatures for Chain of Trust verification.
> +    ///
> +    /// Extracts real cryptographic signatures from FMC ELF32 firmware sections.
> +    /// Returns signatures in a heap-allocated structure to prevent stack overflow.
> +    fn extract_fmc_signatures(
> +        fmc_fw: &Firmware,
> +        dev: &device::Device,
> +    ) -> Result<KBox<FmcSignatures>> {
> +        let get_section = |name: &str, max_len: usize| {
> +            elf::elf_section(fmc_fw.data(), name)
> +                .ok_or(EINVAL)
> +                .inspect_err(|_| dev_err!(dev, "FMC firmware missing '{}' section\n", name))
> +                .and_then(|section| {
> +                    if section.len() > max_len {
> +                        dev_err!(
> +                            dev,
> +                            "FMC {} section size {} > maximum {}\n",
> +                            name,
> +                            section.len(),
> +                            max_len
> +                        );
> +                        Err(EINVAL)
> +                    } else {
> +                        Ok(section)
> +                    }
> +                })
> +        };
> +
> +        let hash_section = get_section("hash", FSP_HASH_SIZE)?;
> +        let pkey_section = get_section("publickey", FSP_PKEY_SIZE)?;
> +        let sig_section = get_section("signature", FSP_SIG_SIZE)?;
> +
> +        // The hash section is a SHA-384 output: it must be exactly FSP_HASH_SIZE bytes.
> +        if hash_section.len() != FSP_HASH_SIZE {
> +            dev_err!(
> +                dev,
> +                "FMC hash section size {} != expected {}\n",
> +                hash_section.len(),
> +                FSP_HASH_SIZE
> +            );
> +            return Err(EINVAL);
> +        }
> +
> +        let mut signatures = KBox::new(
> +            FmcSignatures {
> +                hash384: [0; _],
> +                public_key: [0; _],
> +                signature: [0; _],
> +            },
> +            GFP_KERNEL,
> +        )?;

This construct may create the 816 bytes long `FmcSignatures` instance on
the stack, where space is at a premium. `KBox::init` guarantees in-place
initialization:

    let mut signatures = KBox::init(
        init!(FmcSignatures {
            hash384: [0; _],
            public_key: [0; _],
            signature: [0; _],
        }),
        GFP_KERNEL,
    )?;

And by chaining the initializer we can also avoid making `signatures`
mutable:

    let signatures = KBox::init(
        init!(FmcSignatures {
            hash384 <- Zeroable::init_zeroed(),
            public_key <- Zeroable::init_zeroed(),
            signature <- Zeroable::init_zeroed(),
        })
        .chain(|sigs| {
            // PANIC: src and dst lengths are both FSP_HASH_SIZE (verified above).
            sigs.hash384.copy_from_slice(hash_section);
            // PANIC: dst is sliced to src.len(); src.len() <= FSP_PKEY_SIZE per `get_section`.
            sigs.public_key[..pkey_section.len()].copy_from_slice(pkey_section);
            // PANIC: dst is sliced to src.len(); src.len() <= FSP_SIG_SIZE per `get_section`.
            sigs.signature[..sig_section.len()].copy_from_slice(sig_section);
            Ok(())
        }),
        GFP_KERNEL,
    )?;
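
As a side note on the 816-byte figure: it follows from the field sizes
(48 + 384 + 384), and because every field is a byte array the struct has
alignment 1 and no padding. A small standalone check, mirroring the struct
from the patch in userspace Rust (the constant name `FMC_SIGNATURES_SIZE`
is an assumption made for this sketch):

```rust
use std::mem::{align_of, size_of};

/// Userspace mirror of the struct from the patch, same field sizes.
#[allow(dead_code)]
#[repr(C)]
struct FmcSignatures {
    hash384: [u8; 48],
    public_key: [u8; 384],
    signature: [u8; 384],
}

/// Hypothetical constant, only used to name the computed size.
const FMC_SIGNATURES_SIZE: usize = size_of::<FmcSignatures>();

// Compile-time checks: the build fails if the layout ever changes.
const _: () = assert!(FMC_SIGNATURES_SIZE == 48 + 384 + 384);
const _: () = assert!(align_of::<FmcSignatures>() == 1);
```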


* Re: [PATCH v11 13/22] gpu: nova-core: Hopper/Blackwell: add FMC signature extraction
  2026-06-01 14:45   ` Alexandre Courbot
@ 2026-06-01 14:49     ` Alexandre Courbot
  2026-06-01 18:21       ` John Hubbard
  0 siblings, 1 reply; 56+ messages in thread
From: Alexandre Courbot @ 2026-06-01 14:49 UTC (permalink / raw)
  To: John Hubbard
  Cc: Danilo Krummrich, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On Mon Jun 1, 2026 at 11:45 PM JST, Alexandre Courbot wrote:
> On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
>> Extract the SHA-384 hash, RSA public key, and RSA signature from the
>> FMC ELF32 firmware sections. FSP Chain of Trust verification needs
>> these to validate the FMC image during boot.
>>
>> Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
>> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
>> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
>> ---
>>  drivers/gpu/nova-core/firmware.rs     |  2 +-
>>  drivers/gpu/nova-core/firmware/fsp.rs | 90 ++++++++++++++++++++++++++-
>>  2 files changed, 88 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
>> index 6edb50b83a29..569efee0d4ac 100644
>> --- a/drivers/gpu/nova-core/firmware.rs
>> +++ b/drivers/gpu/nova-core/firmware.rs
>> @@ -641,7 +641,7 @@ fn elf32_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
>>      }
>>  
>>      /// Automatically detects ELF32 vs ELF64 based on the ELF header.
>> -    pub(super) fn elf_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
>> +    pub(crate) fn elf_section<'a>(elf: &'a [u8], name: &str) -> Option<&'a [u8]> {
>>          // Check ELF magic.
>>          if elf.len() < 5 || elf.get(0..4)? != b"\x7fELF" {
>>              return None;
>> diff --git a/drivers/gpu/nova-core/firmware/fsp.rs b/drivers/gpu/nova-core/firmware/fsp.rs
>> index 011be1e571c2..dc28d0cc2d03 100644
>> --- a/drivers/gpu/nova-core/firmware/fsp.rs
>> +++ b/drivers/gpu/nova-core/firmware/fsp.rs
>> @@ -15,13 +15,35 @@
>>      gpu::Chipset, //
>>  };
>>  
>> +/// Size of the FSP SHA-384 hash, in bytes.
>> +pub(crate) const FSP_HASH_SIZE: usize = 48;
>> +/// Maximum size of the FSP public key (RSA-3072), in bytes.
>> +///
>> +/// The FMC ELF `publickey` section may be shorter, so the remaining bytes are zero-padded.
>> +pub(crate) const FSP_PKEY_SIZE: usize = 384;
>> +/// Maximum size of the FSP signature (RSA-3072), in bytes.
>> +///
>> +/// The FMC ELF `signature` section may be shorter, so the remaining bytes are zero-padded.
>> +pub(crate) const FSP_SIG_SIZE: usize = 384;
>> +
>> +/// Structure to hold FMC signatures.
>> +///
>> +/// C representation is used because this type is used for communication with the FSP.
>> +#[derive(Debug, Clone, Copy)]
>> +#[repr(C)]
>> +pub(crate) struct FmcSignatures {
>> +    pub(crate) hash384: [u8; FSP_HASH_SIZE],
>> +    pub(crate) public_key: [u8; FSP_PKEY_SIZE],
>> +    pub(crate) signature: [u8; FSP_SIG_SIZE],
>> +}
>> +
>>  pub(crate) struct FspFirmware {
>>      /// FMC firmware image data (only the "image" ELF section).
>>      #[expect(dead_code)]
>>      pub(crate) fmc_image: Coherent<[u8]>,
>> -    /// Full FMC ELF for signature extraction.
>> +    /// FMC firmware signatures.
>>      #[expect(dead_code)]
>> -    pub(crate) fmc_elf: Firmware,
>> +    pub(crate) fmc_sigs: KBox<FmcSignatures>,
>>  }
>>  
>>  impl FspFirmware {
>> @@ -41,7 +63,69 @@ pub(crate) fn new(
>>  
>>          Ok(Self {
>>              fmc_image,
>> -            fmc_elf: fw,
>> +            fmc_sigs: Self::extract_fmc_signatures(&fw, dev)?,
>>          })
>>      }
>> +
>> +    /// Extract FMC firmware signatures for Chain of Trust verification.
>> +    ///
>> +    /// Extracts real cryptographic signatures from FMC ELF32 firmware sections.
>> +    /// Returns signatures in a heap-allocated structure to prevent stack overflow.
>> +    fn extract_fmc_signatures(
>> +        fmc_fw: &Firmware,
>> +        dev: &device::Device,
>> +    ) -> Result<KBox<FmcSignatures>> {
>> +        let get_section = |name: &str, max_len: usize| {
>> +            elf::elf_section(fmc_fw.data(), name)
>> +                .ok_or(EINVAL)
>> +                .inspect_err(|_| dev_err!(dev, "FMC firmware missing '{}' section\n", name))
>> +                .and_then(|section| {
>> +                    if section.len() > max_len {
>> +                        dev_err!(
>> +                            dev,
>> +                            "FMC {} section size {} > maximum {}\n",
>> +                            name,
>> +                            section.len(),
>> +                            max_len
>> +                        );
>> +                        Err(EINVAL)
>> +                    } else {
>> +                        Ok(section)
>> +                    }
>> +                })
>> +        };
>> +
>> +        let hash_section = get_section("hash", FSP_HASH_SIZE)?;
>> +        let pkey_section = get_section("publickey", FSP_PKEY_SIZE)?;
>> +        let sig_section = get_section("signature", FSP_SIG_SIZE)?;
>> +
>> +        // The hash section is a SHA-384 output: it must be exactly FSP_HASH_SIZE bytes.
>> +        if hash_section.len() != FSP_HASH_SIZE {
>> +            dev_err!(
>> +                dev,
>> +                "FMC hash section size {} != expected {}\n",
>> +                hash_section.len(),
>> +                FSP_HASH_SIZE
>> +            );
>> +            return Err(EINVAL);
>> +        }
>> +
>> +        let mut signatures = KBox::new(
>> +            FmcSignatures {
>> +                hash384: [0; _],
>> +                public_key: [0; _],
>> +                signature: [0; _],
>> +            },
>> +            GFP_KERNEL,
>> +        )?;
>
> This construct may create the 816 bytes long `FmcSignatures` instance on
> the stack, where space is at a premium. `KBox::init` guarantees in-place
> initialization:
>
>     let mut signatures = KBox::init(
>         init!(FmcSignatures {
>             hash384: [0; _],
>             public_key: [0; _],
>             signature: [0; _],
>         }),
>         GFP_KERNEL,
>     )?;
>
> And by chaining the initializer we can also avoid making `signatures`
> mutable:
>
>     let signatures = KBox::init(
>         init!(FmcSignatures {
>             hash384 <- Zeroable::init_zeroed(),
>             public_key <- Zeroable::init_zeroed(),
>             signature <- Zeroable::init_zeroed(),

Oops, I got a bit ahead of myself. The three lines above should be:

    hash384: [0; _],
    public_key: [0; _],
    signature: [0; _],

The rest should be working as expected.


* Re: [PATCH v11 13/22] gpu: nova-core: Hopper/Blackwell: add FMC signature extraction
  2026-06-01 14:49     ` Alexandre Courbot
@ 2026-06-01 18:21       ` John Hubbard
  0 siblings, 0 replies; 56+ messages in thread
From: John Hubbard @ 2026-06-01 18:21 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Danilo Krummrich, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On 6/1/26 7:49 AM, Alexandre Courbot wrote:
> On Mon Jun 1, 2026 at 11:45 PM JST, Alexandre Courbot wrote:
>> On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
...
>>> +        let mut signatures = KBox::new(
>>> +            FmcSignatures {
>>> +                hash384: [0; _],
>>> +                public_key: [0; _],
>>> +                signature: [0; _],
>>> +            },
>>> +            GFP_KERNEL,
>>> +        )?;
>>
>> This construct may create the 816 bytes long `FmcSignatures` instance on
>> the stack, where space is at a premium. `KBox::init` guarantees in-place

Ouch, that's huge. Good catch.

>> initialization:
>>
>>     let mut signatures = KBox::init(
>>         init!(FmcSignatures {
>>             hash384: [0; _],
>>             public_key: [0; _],
>>             signature: [0; _],
>>         }),
>>         GFP_KERNEL,
>>     )?;
>>
>> And by chaining the initializer we can also avoid making `signatures`
>> mutable:
>>
>>     let signatures = KBox::init(
>>         init!(FmcSignatures {
>>             hash384 <- Zeroable::init_zeroed(),
>>             public_key <- Zeroable::init_zeroed(),
>>             signature <- Zeroable::init_zeroed(),
> 
> Oops, I got a bit ahead of myself. The three lines above should be:
> 
>     hash384: [0; _],
>     public_key: [0; _],
>     signature: [0; _],
> 
> The rest should be working as expected.

OK, will fix this.

thanks,
-- 
John Hubbard



* [PATCH v11 14/22] gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations
  2026-05-30  3:09 [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (12 preceding siblings ...)
  2026-05-30  3:09 ` [PATCH v11 13/22] gpu: nova-core: Hopper/Blackwell: add FMC signature extraction John Hubbard
@ 2026-05-30  3:09 ` John Hubbard
  2026-05-30  3:09 ` [PATCH v11 15/22] gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure John Hubbard
                   ` (8 subsequent siblings)
  22 siblings, 0 replies; 56+ messages in thread
From: John Hubbard @ 2026-05-30  3:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Add external memory (EMEM) read/write operations to the GPU's FSP falcon
engine. These operations use Falcon PIO (Programmed I/O) to communicate
with the FSP through indirect memory access.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/falcon/fsp.rs | 130 ++++++++++++++++++++++++++--
 drivers/gpu/nova-core/regs.rs       |  15 ++++
 2 files changed, 140 insertions(+), 5 deletions(-)
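
To make the access pattern concrete, here is a small software model of a
control/data register pair with auto-increment, written as ordinary
userspace Rust. It is a sketch under assumptions, not the hardware
behaviour or the driver code: `EmemModel`, `write_ctl`, `write_data`,
`read_data` and `check_span` are hypothetical names, the backing store is
a `Vec<u32>`, and the range check against the model's size is an addition
of this sketch (the patch bounds the offset to 24 bits instead).

```rust
/// Hypothetical software model of the EMEM window (illustration only).
struct EmemModel {
    /// Backing store, one entry per 32-bit word.
    words: Vec<u32>,
    /// Current byte offset, as programmed through the control register.
    offset: usize,
    auto_increment_write: bool,
    auto_increment_read: bool,
}

impl EmemModel {
    fn new(num_words: usize) -> Self {
        EmemModel {
            words: vec![0; num_words],
            offset: 0,
            auto_increment_write: false,
            auto_increment_read: false,
        }
    }

    /// Models a write to the control register: start offset plus mode bits.
    fn write_ctl(&mut self, offset: usize, inc_write: bool, inc_read: bool) {
        self.offset = offset;
        self.auto_increment_write = inc_write;
        self.auto_increment_read = inc_read;
    }

    /// Models a write to the data register.
    fn write_data(&mut self, value: u32) {
        self.words[self.offset / 4] = value;
        if self.auto_increment_write {
            self.offset += 4;
        }
    }

    /// Models a read from the data register.
    fn read_data(&mut self) -> u32 {
        let value = self.words[self.offset / 4];
        if self.auto_increment_read {
            self.offset += 4;
        }
        value
    }
}

/// Rejects unaligned spans and spans that fall outside the model.
fn check_span(emem: &EmemModel, offset: usize, len: usize) -> Result<(), &'static str> {
    if offset % 4 != 0 || len % 4 != 0 {
        return Err("unaligned");
    }
    match offset.checked_add(len) {
        Some(end) if end <= emem.words.len() * 4 => Ok(()),
        _ => Err("out of range"),
    }
}

/// Programs the offset once, then streams little-endian words.
fn write_emem(emem: &mut EmemModel, offset: usize, data: &[u8]) -> Result<(), &'static str> {
    check_span(emem, offset, data.len())?;
    emem.write_ctl(offset, true, false);
    for chunk in data.chunks_exact(4) {
        emem.write_data(u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]));
    }
    Ok(())
}

/// Programs the offset once, then reads words back as little-endian bytes.
fn read_emem(emem: &mut EmemModel, offset: usize, data: &mut [u8]) -> Result<(), &'static str> {
    check_span(emem, offset, data.len())?;
    emem.write_ctl(offset, false, true);
    for chunk in data.chunks_exact_mut(4) {
        chunk.copy_from_slice(&emem.read_data().to_le_bytes());
    }
    Ok(())
}
```

Because the offset lives in the model (as it does in the register pair),
both helpers take `&mut EmemModel`: two interleaved bursts on a shared
handle would overwrite each other's offset.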

diff --git a/drivers/gpu/nova-core/falcon/fsp.rs b/drivers/gpu/nova-core/falcon/fsp.rs
index 73fb73cb73a5..7067c1963745 100644
--- a/drivers/gpu/nova-core/falcon/fsp.rs
+++ b/drivers/gpu/nova-core/falcon/fsp.rs
@@ -6,12 +6,28 @@
 //! The FSP falcon handles secure boot and Chain of Trust operations
 //! on Hopper and Blackwell architectures, replacing SEC2's role.
 
-use kernel::io::register::RegisterBase;
+use kernel::{
+    io::{
+        register::{
+            RegisterBase,
+            WithBase, //
+        },
+        Io, //
+    },
+    num::Bounded,
+    prelude::*,
+    ptr::Alignment, //
+};
 
-use crate::falcon::{
-    FalconEngine,
-    PFalcon2Base,
-    PFalconBase, //
+use crate::{
+    driver::Bar0,
+    falcon::{
+        Falcon,
+        FalconEngine,
+        PFalcon2Base,
+        PFalconBase, //
+    },
+    regs,
 };
 
 /// Type specifying the `Fsp` falcon engine. Cannot be instantiated.
@@ -26,3 +42,107 @@ impl RegisterBase<PFalcon2Base> for Fsp {
 }
 
 impl FalconEngine for Fsp {}
+
+/// Maximum addressable EMEM size, derived from the 24-bit offset field
+/// in `NV_PFALCON_FALCON_EMEM_CTL`.
+const EMEM_MAX_SIZE: Alignment = Alignment::new::<{ 1 << 24 }>();
+
+/// I/O backend for the FSP falcon's external memory (EMEM).
+///
+/// `EMEM_CTL` is programmed once with a start offset and an auto-increment
+/// mode, then each access to `EMEM_DATA` advances the offset by one 32-bit
+/// word in hardware.
+struct Emem<'a> {
+    bar: &'a Bar0,
+}
+
+impl<'a> Emem<'a> {
+    fn new(bar: &'a Bar0) -> Self {
+        Self { bar }
+    }
+
+    /// Programs `EMEM_CTL` with the start byte `offset` and the `ctl` mode bits.
+    ///
+    /// Returns `EINVAL` if `offset` is outside the addressable EMEM window.
+    fn program(&mut self, offset: usize, ctl: regs::NV_PFALCON_FALCON_EMEM_CTL) -> Result {
+        let offset = Bounded::<usize, { EMEM_MAX_SIZE.log2() }>::try_new(offset)
+            .map(Bounded::cast::<u32>)
+            .ok_or(EINVAL)?;
+
+        self.bar
+            .write(WithBase::of::<Fsp>(), ctl.with_offset(offset));
+
+        Ok(())
+    }
+
+    /// Begins a write burst at byte `offset`, auto-incrementing on each write.
+    fn begin_write(&mut self, offset: usize) -> Result {
+        self.program(
+            offset,
+            regs::NV_PFALCON_FALCON_EMEM_CTL::zeroed().with_auto_increment_write(true),
+        )
+    }
+
+    /// Begins a read burst at byte `offset`, auto-incrementing on each read.
+    fn begin_read(&mut self, offset: usize) -> Result {
+        self.program(
+            offset,
+            regs::NV_PFALCON_FALCON_EMEM_CTL::zeroed().with_auto_increment_read(true),
+        )
+    }
+
+    /// Writes the next 32-bit `value`; hardware advances the offset.
+    fn write_next(&mut self, value: u32) {
+        self.bar.write(
+            WithBase::of::<Fsp>(),
+            regs::NV_PFALCON_FALCON_EMEM_DATA::zeroed().with_data(value),
+        );
+    }
+
+    /// Reads the next 32-bit word; hardware advances the offset.
+    fn read_next(&mut self) -> u32 {
+        self.bar
+            .read(regs::NV_PFALCON_FALCON_EMEM_DATA::of::<Fsp>())
+            .data()
+    }
+}
+
+impl Falcon<Fsp> {
+    /// Writes `data` to FSP external memory at byte `offset`.
+    ///
+    /// `data` is interpreted as little-endian 32-bit words. Returns `EINVAL`
+    /// if `offset` or the `data` length is not 4-byte aligned.
+    #[expect(dead_code)]
+    fn write_emem(&mut self, bar: &Bar0, offset: u32, data: &[u8]) -> Result {
+        if offset % 4 != 0 || data.len() % 4 != 0 {
+            return Err(EINVAL);
+        }
+
+        let mut emem = Emem::new(bar);
+        emem.begin_write(offset as usize)?;
+        for chunk in data.chunks_exact(4) {
+            emem.write_next(u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]));
+        }
+
+        Ok(())
+    }
+
+    /// Reads FSP external memory at byte `offset` into `data`.
+    ///
+    /// `data` is stored as little-endian 32-bit words. Returns `EINVAL` if
+    /// `offset` or the `data` length is not 4-byte aligned.
+    #[expect(dead_code)]
+    fn read_emem(&mut self, bar: &Bar0, offset: u32, data: &mut [u8]) -> Result {
+        if offset % 4 != 0 || data.len() % 4 != 0 {
+            return Err(EINVAL);
+        }
+
+        let mut emem = Emem::new(bar);
+        emem.begin_read(offset as usize)?;
+        for chunk in data.chunks_exact_mut(4) {
+            chunk.copy_from_slice(&emem.read_next().to_le_bytes());
+        }
+
+        Ok(())
+    }
+}
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index 270779d31ab3..5871bbce7052 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -467,6 +467,21 @@ pub(crate) fn vga_workspace_addr(self) -> Option<u64> {
     pub(crate) NV_PFALCON_FBIF_CTL(u32) @ PFalconBase + 0x00000624 {
         7:7     allow_phys_no_ctx => bool;
     }
+
+    // Falcon EMEM PIO registers (used by FSP on Hopper/Blackwell).
+    // These provide the falcon external memory communication interface.
+    pub(crate) NV_PFALCON_FALCON_EMEM_CTL(u32) @ PFalconBase + 0x00000ac0 {
+        /// EMEM byte offset (must be 4-byte aligned).
+        23:0    offset;
+        /// Auto-increment the offset after each write.
+        24:24   auto_increment_write => bool;
+        /// Auto-increment the offset after each read.
+        25:25   auto_increment_read => bool;
+    }
+
+    pub(crate) NV_PFALCON_FALCON_EMEM_DATA(u32) @ PFalconBase + 0x00000ac4 {
+        31:0    data => u32;
+    }
 }
 
 impl NV_PFALCON_FALCON_DMACTL {
-- 
2.54.0



* [PATCH v11 15/22] gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure
  2026-05-30  3:09 [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (13 preceding siblings ...)
  2026-05-30  3:09 ` [PATCH v11 14/22] gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations John Hubbard
@ 2026-05-30  3:09 ` John Hubbard
  2026-05-30  3:09 ` [PATCH v11 16/22] gpu: nova-core: add MCTP/NVDM protocol types for firmware communication John Hubbard
                   ` (7 subsequent siblings)
  22 siblings, 0 replies; 56+ messages in thread
From: John Hubbard @ 2026-05-30  3:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

FSP communication uses a pair of non-circular queues in the FSP
falcon's EMEM, one for messages from the driver to FSP and one for
replies, with the driver polling for response data. Add the queue
registers and the low-level helpers used by the higher-level FSP
message layer.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/falcon/fsp.rs | 61 ++++++++++++++++++++++++++++-
 drivers/gpu/nova-core/regs.rs       | 21 ++++++++++
 2 files changed, 80 insertions(+), 2 deletions(-)
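
The pointer arithmetic for the non-circular queues can be sketched as two
pure functions in standalone Rust. This is only an illustration of the
arithmetic under the convention that TAIL addresses the last DWORD
written; `pending_bytes` and `tail_for_packet` are hypothetical names and
no registers are touched.

```rust
/// Size in bytes of the pending message, or 0 when HEAD equals TAIL.
///
/// TAIL points at the last DWORD written, so 4 is added to the distance.
fn pending_bytes(head: u32, tail: u32) -> u32 {
    if head == tail {
        return 0;
    }
    tail.saturating_sub(head) + 4
}

/// TAIL value to program after writing `len` bytes at offset 0.
///
/// Returns `None` for an empty packet, a length that is not a multiple of
/// 4, or a length that does not fit in 32 bits.
fn tail_for_packet(len: usize) -> Option<u32> {
    if len == 0 || len % 4 != 0 {
        return None;
    }
    let last_dword = len - 4;
    if last_dword > u32::MAX as usize {
        return None;
    }
    Some(last_dword as u32)
}
```

For a 16-byte packet written at offset 0, HEAD is 0 and TAIL is 12, and
the same pair of values reads back as 16 pending bytes.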

diff --git a/drivers/gpu/nova-core/falcon/fsp.rs b/drivers/gpu/nova-core/falcon/fsp.rs
index 7067c1963745..57880c4289cc 100644
--- a/drivers/gpu/nova-core/falcon/fsp.rs
+++ b/drivers/gpu/nova-core/falcon/fsp.rs
@@ -112,7 +112,6 @@ impl Falcon<Fsp> {
     ///
     /// `data` is interpreted as little-endian 32-bit words. Returns `EINVAL`
     /// if `offset` or the `data` length is not 4-byte aligned.
-    #[expect(dead_code)]
     fn write_emem(&mut self, bar: &Bar0, offset: u32, data: &[u8]) -> Result {
         if offset % 4 != 0 || data.len() % 4 != 0 {
             return Err(EINVAL);
@@ -131,7 +130,6 @@ fn write_emem(&mut self, bar: &Bar0, offset: u32, data: &[u8]) -> Result {
     ///
     /// `data` is stored as little-endian 32-bit words. Returns `EINVAL` if
     /// `offset` or the `data` length is not 4-byte aligned.
-    #[expect(dead_code)]
     fn read_emem(&mut self, bar: &Bar0, offset: u32, data: &mut [u8]) -> Result {
         if offset % 4 != 0 || data.len() % 4 != 0 {
             return Err(EINVAL);
@@ -145,4 +143,63 @@ fn read_emem(&mut self, bar: &Bar0, offset: u32, data: &mut [u8]) -> Result {
 
         Ok(())
     }
+
+    /// Poll FSP for incoming data.
+    ///
+    /// Returns the size of available data in bytes, or 0 if no data is available.
+    ///
+    /// The FSP message queue is not circular. Pointers are reset to 0 after each
+    /// message exchange, so `tail >= head` is always true when data is present.
+    #[expect(dead_code)]
+    pub(crate) fn poll_msgq(&self, bar: &Bar0) -> u32 {
+        let head = bar.read(regs::NV_PFSP_MSGQ_HEAD).address();
+        let tail = bar.read(regs::NV_PFSP_MSGQ_TAIL).address();
+
+        if head == tail {
+            return 0;
+        }
+
+        // TAIL points at the last DWORD written, so add 4 to get the total size.
+        tail.saturating_sub(head) + 4
+    }
+
+    /// Writes `packet` to FSP EMEM and updates the queue pointers to notify FSP.
+    ///
+    /// Returns `EINVAL` if `packet` is empty or its length is not 4-byte aligned.
+    #[expect(dead_code)]
+    pub(crate) fn send_msg(&mut self, bar: &Bar0, packet: &[u8]) -> Result {
+        if packet.is_empty() {
+            return Err(EINVAL);
+        }
+
+        // Write message to EMEM at offset 0 (validates 4-byte alignment)
+        self.write_emem(bar, 0, packet)?;
+
+        // Update queue pointers. TAIL points at the last DWORD written.
+        let tail_offset = u32::try_from(packet.len() - 4).map_err(|_| EINVAL)?;
+        bar.write_reg(regs::NV_PFSP_QUEUE_TAIL::zeroed().with_address(tail_offset));
+        bar.write_reg(regs::NV_PFSP_QUEUE_HEAD::zeroed().with_address(0));
+
+        Ok(())
+    }
+
+    /// Reads `size` bytes from FSP EMEM into `buffer` and resets the queue pointers.
+    ///
+    /// `size` comes from `poll_msgq`. Returns `EINVAL` if `size` is 0, exceeds
+    /// `buffer`, or is not 4-byte aligned.
+    #[expect(dead_code)]
+    pub(crate) fn recv_msg(&mut self, bar: &Bar0, buffer: &mut [u8], size: usize) -> Result {
+        if size == 0 || size > buffer.len() {
+            return Err(EINVAL);
+        }
+
+        // Read response from EMEM at offset 0 (validates 4-byte alignment)
+        self.read_emem(bar, 0, &mut buffer[..size])?;
+
+        // Reset message queue pointers after reading
+        bar.write_reg(regs::NV_PFSP_MSGQ_TAIL::zeroed().with_address(0));
+        bar.write_reg(regs::NV_PFSP_MSGQ_HEAD::zeroed().with_address(0));
+
+        Ok(())
+    }
 }
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index 5871bbce7052..d4067efb8772 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -568,6 +568,27 @@ pub(crate) fn mem_scrubbing_done(self) -> bool {
     }
 }
 
+// FSP (Firmware System Processor) queue registers for Hopper/Blackwell Chain of Trust.
+// These registers manage falcon EMEM communication queues.
+
+register! {
+    pub(crate) NV_PFSP_QUEUE_HEAD(u32) @ 0x008f2c00 {
+        31:0    address => u32;
+    }
+
+    pub(crate) NV_PFSP_QUEUE_TAIL(u32) @ 0x008f2c04 {
+        31:0    address => u32;
+    }
+
+    pub(crate) NV_PFSP_MSGQ_HEAD(u32) @ 0x008f2c80 {
+        31:0    address => u32;
+    }
+
+    pub(crate) NV_PFSP_MSGQ_TAIL(u32) @ 0x008f2c84 {
+        31:0    address => u32;
+    }
+}
+
 // The modules below provide registers that are not identical on all supported chips. They should
 // only be used in HAL modules.
 
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v11 16/22] gpu: nova-core: add MCTP/NVDM protocol types for firmware communication
  2026-05-30  3:09 [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (14 preceding siblings ...)
  2026-05-30  3:09 ` [PATCH v11 15/22] gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure John Hubbard
@ 2026-05-30  3:09 ` John Hubbard
  2026-05-30  3:09 ` [PATCH v11 17/22] gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging John Hubbard
                   ` (6 subsequent siblings)
  22 siblings, 0 replies; 56+ messages in thread
From: John Hubbard @ 2026-05-30  3:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Add the MCTP (Management Component Transport Protocol) and NVDM (NVIDIA
Device Management) wire-format types used for communication between the
kernel driver and GPU firmware processors.

This includes typed MCTP transport headers, NVDM message headers, and
NVDM message type identifiers. Both the FSP boot path and the upcoming
GSP RPC message queue share this protocol layer.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
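Note (below the --- marker, so not part of the commit log): the bit
layouts declared with bitfield! in this patch can be cross-checked with
plain shifts and masks. This is a minimal standalone sketch, not the
generated accessors. The free functions are hypothetical stand-ins, and
0x10de is the NVIDIA PCI vendor ID that Vendor::NVIDIA.as_raw() is
assumed to return.

```rust
/// MCTP message type for PCI vendor-defined messages (bits 6:0).
const MSG_TYPE_VENDOR_PCI: u32 = 0x7e;
/// NVIDIA PCI vendor ID (bits 23:8 of the NVDM header).
const PCI_VENDOR_ID_NVIDIA: u32 = 0x10de;
/// NVDM type for the Chain of Trust boot message (bits 31:24).
const NVDM_TYPE_COT: u32 = 0x14;

/// MCTP header for a single-packet message: SOM (bit 31) and EOM
/// (bit 30) set, SEQ (29:28) and SEID (23:16) left at zero.
fn mctp_single_packet() -> u32 {
    (1u32 << 31) | (1u32 << 30)
}

/// True when both SOM and EOM are set.
fn mctp_is_single_packet(raw: u32) -> bool {
    (raw >> 31) & 1 == 1 && (raw >> 30) & 1 == 1
}

/// Packs an NVDM header for the given 8-bit NVDM type.
fn nvdm_header(nvdm_type: u32) -> u32 {
    ((nvdm_type & 0xff) << 24) | (PCI_VENDOR_ID_NVIDIA << 8) | MSG_TYPE_VENDOR_PCI
}

/// Splits an NVDM header into (nvdm_type, vendor_id, msg_type).
fn nvdm_fields(raw: u32) -> (u32, u32, u32) {
    (raw >> 24, (raw >> 8) & 0xffff, raw & 0x7f)
}
```

Under these assumptions the single-packet MCTP header is 0xc0000000 and
the NVDM header for a CoT message is 0x1410de7e.
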
 drivers/gpu/nova-core/mctp.rs      | 102 +++++++++++++++++++++++++++++
 drivers/gpu/nova-core/nova_core.rs |   1 +
 2 files changed, 103 insertions(+)
 create mode 100644 drivers/gpu/nova-core/mctp.rs

diff --git a/drivers/gpu/nova-core/mctp.rs b/drivers/gpu/nova-core/mctp.rs
new file mode 100644
index 000000000000..a13146dc0cca
--- /dev/null
+++ b/drivers/gpu/nova-core/mctp.rs
@@ -0,0 +1,102 @@
+// SPDX-License-Identifier: GPL-2.0
+// SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+
+//! MCTP/NVDM protocol types for NVIDIA GPU firmware communication.
+//!
+//! MCTP (Management Component Transport Protocol) carries NVDM (NVIDIA
+//! Device Management) messages between the kernel driver and GPU firmware
+//! processors such as FSP and GSP.
+
+#![expect(dead_code)]
+
+use kernel::pci::Vendor;
+
+/// NVDM message type identifiers carried over MCTP.
+#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
+#[repr(u8)]
+pub(crate) enum NvdmType {
+    #[default]
+    /// Chain of Trust boot message.
+    Cot = 0x14,
+    /// FSP command response.
+    FspResponse = 0x15,
+}
+
+impl TryFrom<u8> for NvdmType {
+    type Error = u8;
+
+    fn try_from(value: u8) -> Result<Self, Self::Error> {
+        match value {
+            x if x == u8::from(Self::Cot) => Ok(Self::Cot),
+            x if x == u8::from(Self::FspResponse) => Ok(Self::FspResponse),
+            _ => Err(value),
+        }
+    }
+}
+
+impl From<NvdmType> for u8 {
+    fn from(value: NvdmType) -> Self {
+        value as u8
+    }
+}
+
+bitfield! {
+    pub(crate) struct MctpHeader(u32), "MCTP transport header for NVIDIA firmware messages." {
+        31:31 som as bool, "Start-of-message bit.";
+        30:30 eom as bool, "End-of-message bit.";
+        29:28 seq as u8, "Packet sequence number.";
+        23:16 seid as u8, "Source endpoint ID.";
+    }
+}
+
+impl MctpHeader {
+    /// Builds a single-packet MCTP header (`SOM=1`, `EOM=1`, `SEQ=0`, `SEID=0`).
+    pub(crate) fn single_packet() -> Self {
+        Self::default().set_som(true).set_eom(true)
+    }
+
+    /// Returns whether this is a complete single-packet message (`SOM=1` and `EOM=1`).
+    pub(crate) fn is_single_packet(self) -> bool {
+        self.som() && self.eom()
+    }
+}
+
+impl From<u32> for MctpHeader {
+    fn from(raw: u32) -> Self {
+        Self(raw)
+    }
+}
+
+/// MCTP message type for PCI vendor-defined messages.
+const MSG_TYPE_VENDOR_PCI: u8 = 0x7e;
+
+bitfield! {
+    pub(crate) struct NvdmHeader(u32), "NVIDIA Vendor-Defined Message header over MCTP." {
+        31:24 nvdm_type as u8 ?=> NvdmType, "NVDM message type.";
+        23:8 vendor_id as u16, "PCI vendor ID.";
+        6:0 msg_type as u8, "MCTP vendor-defined message type.";
+    }
+}
+
+impl NvdmHeader {
+    /// Builds an NVDM header for the given message type.
+    pub(crate) fn new(nvdm_type: NvdmType) -> Self {
+        Self::default()
+            .set_msg_type(MSG_TYPE_VENDOR_PCI)
+            .set_vendor_id(Vendor::NVIDIA.as_raw())
+            .set_nvdm_type(nvdm_type)
+    }
+
+    /// Validates this header against the expected NVIDIA NVDM format and type.
+    pub(crate) fn validate(self, expected_type: NvdmType) -> bool {
+        self.msg_type() == MSG_TYPE_VENDOR_PCI
+            && self.vendor_id() == Vendor::NVIDIA.as_raw()
+            && matches!(self.nvdm_type(), Ok(nvdm_type) if nvdm_type == expected_type)
+    }
+}
+
+impl From<u32> for NvdmHeader {
+    fn from(raw: u32) -> Self {
+        Self(raw)
+    }
+}
diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs
index 7b6c331da10e..9f0199f7b38c 100644
--- a/drivers/gpu/nova-core/nova_core.rs
+++ b/drivers/gpu/nova-core/nova_core.rs
@@ -20,6 +20,7 @@
 mod fsp;
 mod gpu;
 mod gsp;
+mod mctp;
 #[macro_use]
 mod num;
 mod regs;
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v11 17/22] gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging
  2026-05-30  3:09 [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (15 preceding siblings ...)
  2026-05-30  3:09 ` [PATCH v11 16/22] gpu: nova-core: add MCTP/NVDM protocol types for firmware communication John Hubbard
@ 2026-05-30  3:09 ` John Hubbard
  2026-05-30  3:09 ` [PATCH v11 18/22] gpu: nova-core: Hopper/Blackwell: add FspCotVersion type John Hubbard
                   ` (5 subsequent siblings)
  22 siblings, 0 replies; 56+ messages in thread
From: John Hubbard @ 2026-05-30  3:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

FSP exchanges are request/response: the driver sends an MCTP/NVDM
message and must match the reply against the request before acting on
it. Add the synchronous send-and-wait path that validates the response
transport and message headers and confirms the reply corresponds to the
request that was sent.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
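Note (below the --- marker, so not part of the commit log): the order
of checks in send_sync_fsp() can be exercised without hardware by
treating the reply as five 32-bit words. This is a minimal standalone
sketch, not driver code. ReplyError, word(), encode_reply() and
check_reply() are hypothetical names, and little-endian word order is
an assumption (the driver itself reads a packed struct via FromBytes).

```rust
/// NVDM type of an FSP command response (`NvdmType::FspResponse`).
const NVDM_TYPE_FSP_RESPONSE: u32 = 0x15;
/// MCTP message type for PCI vendor-defined messages.
const MSG_TYPE_VENDOR_PCI: u32 = 0x7e;
/// NVIDIA PCI vendor ID.
const PCI_VENDOR_ID_NVIDIA: u32 = 0x10de;

#[derive(Debug, PartialEq, Eq)]
enum ReplyError {
    TooShort,
    NotSinglePacket,
    BadNvdmHeader,
    WrongCommand { expected: u32, got: u32 },
    Firmware(u32),
}

/// Reads the `index`-th little-endian 32-bit word, if the buffer holds it.
fn word(buf: &[u8], index: usize) -> Option<u32> {
    let b = buf.get(index * 4..index * 4 + 4)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Serializes five words: MCTP, NVDM, task ID, command type, error code.
fn encode_reply(words: [u32; 5]) -> Vec<u8> {
    words.iter().flat_map(|w| w.to_le_bytes()).collect()
}

/// Applies the same checks, in the same order, as the patch.
fn check_reply(buf: &[u8], sent_nvdm_type: u32) -> Result<(), ReplyError> {
    // Word 2 (task ID) is not inspected, but word 4 existing implies it
    // is present.
    let (mctp, nvdm, command, error) =
        match (word(buf, 0), word(buf, 1), word(buf, 3), word(buf, 4)) {
            (Some(m), Some(n), Some(c), Some(e)) => (m, n, c, e),
            _ => return Err(ReplyError::TooShort),
        };

    // 1. Transport header: SOM (bit 31) and EOM (bit 30) must both be set.
    if mctp >> 30 != 0b11 {
        return Err(ReplyError::NotSinglePacket);
    }
    // 2. NVDM header: message type, vendor ID and response type.
    let header_ok = (nvdm & 0x7f) == MSG_TYPE_VENDOR_PCI
        && ((nvdm >> 8) & 0xffff) == PCI_VENDOR_ID_NVIDIA
        && (nvdm >> 24) == NVDM_TYPE_FSP_RESPONSE;
    if !header_ok {
        return Err(ReplyError::BadNvdmHeader);
    }
    // 3. The reply must name the command that was sent.
    if command != sent_nvdm_type {
        return Err(ReplyError::WrongCommand { expected: sent_nvdm_type, got: command });
    }
    // 4. Finally, the firmware-reported status.
    if error != 0 {
        return Err(ReplyError::Firmware(error));
    }
    Ok(())
}
```

Checking the headers before the error code means a malformed or
mismatched reply is never mistaken for a firmware status.
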
 drivers/gpu/nova-core/falcon/fsp.rs |   3 -
 drivers/gpu/nova-core/fsp.rs        | 129 +++++++++++++++++++++++++++-
 2 files changed, 128 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/nova-core/falcon/fsp.rs b/drivers/gpu/nova-core/falcon/fsp.rs
index 57880c4289cc..a3345121485d 100644
--- a/drivers/gpu/nova-core/falcon/fsp.rs
+++ b/drivers/gpu/nova-core/falcon/fsp.rs
@@ -150,7 +150,6 @@ fn read_emem(&mut self, bar: &Bar0, offset: u32, data: &mut [u8]) -> Result {
     ///
     /// The FSP message queue is not circular. Pointers are reset to 0 after each
     /// message exchange, so `tail >= head` is always true when data is present.
-    #[expect(dead_code)]
     pub(crate) fn poll_msgq(&self, bar: &Bar0) -> u32 {
         let head = bar.read(regs::NV_PFSP_MSGQ_HEAD).address();
         let tail = bar.read(regs::NV_PFSP_MSGQ_TAIL).address();
@@ -166,7 +165,6 @@ pub(crate) fn poll_msgq(&self, bar: &Bar0) -> u32 {
     /// Writes `packet` to FSP EMEM and updates the queue pointers to notify FSP.
     ///
     /// Returns `EINVAL` if `packet` is empty or its length is not 4-byte aligned.
-    #[expect(dead_code)]
     pub(crate) fn send_msg(&mut self, bar: &Bar0, packet: &[u8]) -> Result {
         if packet.is_empty() {
             return Err(EINVAL);
@@ -187,7 +185,6 @@ pub(crate) fn send_msg(&mut self, bar: &Bar0, packet: &[u8]) -> Result {
     ///
     /// `size` comes from `poll_msgq`. Returns `EINVAL` if `size` is 0, exceeds
     /// `buffer`, or is not 4-byte aligned.
-    #[expect(dead_code)]
     pub(crate) fn recv_msg(&mut self, bar: &Bar0, buffer: &mut [u8], size: usize) -> Result {
         if size == 0 || size > buffer.len() {
             return Err(EINVAL);
diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
index ee8fc384fe38..cc2ebc3f6e78 100644
--- a/drivers/gpu/nova-core/fsp.rs
+++ b/drivers/gpu/nova-core/fsp.rs
@@ -11,17 +11,64 @@
     device,
     io::poll::read_poll_timeout,
     prelude::*,
-    time::Delta, //
+    time::Delta,
+    transmute::{
+        AsBytes,
+        FromBytes, //
+    },
 };
 
 use crate::{
     driver::Bar0,
+    falcon::{
+        self,
+        Falcon, //
+    },
     gpu::Chipset,
+    mctp::{
+        MctpHeader,
+        NvdmHeader,
+        NvdmType, //
+    },
+    num,
     regs, //
 };
 
 mod hal;
 
+/// FSP message timeout in milliseconds.
+const FSP_MSG_TIMEOUT_MS: i64 = 2000;
+
+/// FSP command response payload (`NVDM_PAYLOAD_COMMAND_RESPONSE`).
+#[repr(C, packed)]
+#[derive(Clone, Copy)]
+struct NvdmPayloadCommandResponse {
+    task_id: u32,
+    command_nvdm_type: u32,
+    error_code: u32,
+}
+
+/// Complete FSP response structure with MCTP and NVDM headers.
+#[repr(C, packed)]
+#[derive(Clone, Copy)]
+struct FspResponse {
+    mctp_header: MctpHeader,
+    nvdm_header: NvdmHeader,
+    response: NvdmPayloadCommandResponse,
+}
+
+// SAFETY: `FspResponse` is `#[repr(C, packed)]` with only integer fields; any bit pattern is valid.
+unsafe impl FromBytes for FspResponse {}
+
+/// Trait implemented by types representing a message to send to FSP.
+///
+/// This provides [`Fsp::send_sync_fsp`] with the information it needs to send
+/// a given message, following the same pattern as GSP's `CommandToGsp`.
+pub(crate) trait MessageToFsp: AsBytes {
+    /// NVDM type identifying this message to FSP.
+    const NVDM_TYPE: u32;
+}
+
 /// FSP interface for Hopper/Blackwell GPUs.
 pub(crate) struct Fsp;
 
@@ -48,4 +95,84 @@ pub(crate) fn wait_secure_boot(dev: &device::Device, bar: &Bar0, chipset: Chipse
         })
         .map(|_| ())
     }
+
+    /// Sends a message to FSP and waits for the response.
+    #[expect(dead_code)]
+    fn send_sync_fsp<M>(
+        dev: &device::Device,
+        bar: &Bar0,
+        fsp_falcon: &mut Falcon<falcon::fsp::Fsp>,
+        msg: &M,
+    ) -> Result
+    where
+        M: MessageToFsp,
+    {
+        fsp_falcon.send_msg(bar, msg.as_bytes())?;
+
+        let packet_size = read_poll_timeout(
+            || Ok(fsp_falcon.poll_msgq(bar)),
+            |&size| size > 0,
+            Delta::from_millis(10),
+            Delta::from_millis(FSP_MSG_TIMEOUT_MS),
+        )
+        .map_err(|_| {
+            dev_err!(dev, "FSP response timeout\n");
+            ETIMEDOUT
+        })?;
+
+        let packet_size = num::u32_as_usize(packet_size);
+        let mut response_buf = KVec::<u8>::new();
+        response_buf.resize(packet_size, 0, GFP_KERNEL)?;
+        fsp_falcon.recv_msg(bar, &mut response_buf, packet_size)?;
+
+        let (response, _) = FspResponse::from_bytes_prefix(&response_buf[..]).ok_or_else(|| {
+            dev_err!(dev, "FSP response too small: {}\n", response_buf.len());
+            EIO
+        })?;
+
+        let mctp_header = response.mctp_header;
+        let nvdm_header = response.nvdm_header;
+        let command_nvdm_type = response.response.command_nvdm_type;
+        let error_code = response.response.error_code;
+
+        if !mctp_header.is_single_packet() {
+            dev_err!(
+                dev,
+                "Unexpected MCTP header in FSP reply: {:x?}\n",
+                mctp_header,
+            );
+            return Err(EIO);
+        }
+
+        if !nvdm_header.validate(NvdmType::FspResponse) {
+            dev_err!(
+                dev,
+                "Unexpected NVDM header in FSP reply: {:x?}\n",
+                nvdm_header,
+            );
+            return Err(EIO);
+        }
+
+        if command_nvdm_type != M::NVDM_TYPE {
+            dev_err!(
+                dev,
+                "Expected NVDM type {:#x} in reply, got {:#x}\n",
+                M::NVDM_TYPE,
+                command_nvdm_type
+            );
+            return Err(EIO);
+        }
+
+        if error_code != 0 {
+            dev_err!(
+                dev,
+                "NVDM command {:#x} failed with error {:#x}\n",
+                M::NVDM_TYPE,
+                error_code
+            );
+            return Err(EIO);
+        }
+
+        Ok(())
+    }
 }
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v11 18/22] gpu: nova-core: Hopper/Blackwell: add FspCotVersion type
  2026-05-30  3:09 [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (16 preceding siblings ...)
  2026-05-30  3:09 ` [PATCH v11 17/22] gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging John Hubbard
@ 2026-05-30  3:09 ` John Hubbard
  2026-06-01 14:07   ` Alexandre Courbot
  2026-05-30  3:09 ` [PATCH v11 19/22] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot John Hubbard
                   ` (4 subsequent siblings)
  22 siblings, 1 reply; 56+ messages in thread
From: John Hubbard @ 2026-05-30  3:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

The FSP Chain of Trust handshake is versioned, and the version the
driver must advertise depends on the GPU: Hopper speaks version 1 and
Blackwell speaks version 2. Represent that version explicitly and select
it per architecture so the boot message carries the value FSP expects.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/fsp.rs | 19 +++++++++++++++++++
 drivers/gpu/nova-core/gpu.rs | 16 ++++++++++++++++
 2 files changed, 35 insertions(+)

diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
index cc2ebc3f6e78..5aae8282f2f0 100644
--- a/drivers/gpu/nova-core/fsp.rs
+++ b/drivers/gpu/nova-core/fsp.rs
@@ -36,6 +36,25 @@
 
 mod hal;
 
+/// FSP Chain of Trust protocol version.
+///
+/// Hopper (GH100) uses version 1, Blackwell uses version 2.
+#[derive(Debug, Clone, Copy)]
+pub(crate) struct FspCotVersion(u16);
+
+impl FspCotVersion {
+    /// Creates a new FSP CoT version.
+    pub(crate) const fn new(version: u16) -> Self {
+        Self(version)
+    }
+
+    /// Returns the raw protocol version number for the wire format.
+    #[expect(dead_code)]
+    pub(crate) const fn raw(self) -> u16 {
+        self.0
+    }
+}
+
 /// FSP message timeout in milliseconds.
 const FSP_MSG_TIMEOUT_MS: i64 = 2000;
 
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 7dd736e5b190..6cdface3c618 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -21,6 +21,7 @@
         Falcon, //
     },
     fb::SysmemFlush,
+    fsp::FspCotVersion,
     gsp::{
         self,
         Gsp, //
@@ -141,6 +142,21 @@ pub(crate) const fn needs_fwsec_bootloader(self) -> bool {
     pub(crate) fn pci_config_mirror_range(self) -> Range<u32> {
         hal::gpu_hal(self).pci_config_mirror_range()
     }
+
+    /// Returns the FSP Chain of Trust (CoT) protocol version for this chipset.
+    ///
+    /// Hopper (GH100) uses version 1, Blackwell uses version 2.
+    /// Returns `None` for architectures that do not use FSP.
+    #[expect(dead_code)]
+    pub(crate) const fn fsp_cot_version(self) -> Option<FspCotVersion> {
+        match self.arch() {
+            Architecture::Hopper => Some(FspCotVersion::new(1)),
+            Architecture::BlackwellGB10x | Architecture::BlackwellGB20x => {
+                Some(FspCotVersion::new(2))
+            }
+            _ => None,
+        }
+    }
 }
 
 // TODO
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH v11 18/22] gpu: nova-core: Hopper/Blackwell: add FspCotVersion type
  2026-05-30  3:09 ` [PATCH v11 18/22] gpu: nova-core: Hopper/Blackwell: add FspCotVersion type John Hubbard
@ 2026-06-01 14:07   ` Alexandre Courbot
  2026-06-01 18:23     ` John Hubbard
  0 siblings, 1 reply; 56+ messages in thread
From: Alexandre Courbot @ 2026-06-01 14:07 UTC (permalink / raw)
  To: John Hubbard
  Cc: Danilo Krummrich, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
> The FSP Chain of Trust handshake is versioned, and the version the
> driver must advertise depends on the GPU: Hopper speaks version 1 and
> Blackwell speaks version 2. Represent that version explicitly and select
> it per architecture so the boot message carries the value FSP expects.
>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  drivers/gpu/nova-core/fsp.rs | 19 +++++++++++++++++++
>  drivers/gpu/nova-core/gpu.rs | 16 ++++++++++++++++
>  2 files changed, 35 insertions(+)
>
> diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
> index cc2ebc3f6e78..5aae8282f2f0 100644
> --- a/drivers/gpu/nova-core/fsp.rs
> +++ b/drivers/gpu/nova-core/fsp.rs
> @@ -36,6 +36,25 @@
>  
>  mod hal;
>  
> +/// FSP Chain of Trust protocol version.
> +///
> +/// Hopper (GH100) uses version 1, Blackwell uses version 2.
> +#[derive(Debug, Clone, Copy)]
> +pub(crate) struct FspCotVersion(u16);
> +
> +impl FspCotVersion {
> +    /// Creates a new FSP CoT version.
> +    pub(crate) const fn new(version: u16) -> Self {
> +        Self(version)
> +    }
> +
> +    /// Returns the raw protocol version number for the wire format.
> +    #[expect(dead_code)]
> +    pub(crate) const fn raw(self) -> u16 {
> +        self.0
> +    }
> +}

This type seems to just wrap a `u16`, and return its raw value without
any limitation for its range, or other functionality. It is just created
in `Chipset::fsp_cot_version`, to be immediately unpacked by
`Fsp::boot_fmc`. Why not just use a `u16` directly?

> +
>  /// FSP message timeout in milliseconds.
>  const FSP_MSG_TIMEOUT_MS: i64 = 2000;
>  
> diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
> index 7dd736e5b190..6cdface3c618 100644
> --- a/drivers/gpu/nova-core/gpu.rs
> +++ b/drivers/gpu/nova-core/gpu.rs
> @@ -21,6 +21,7 @@
>          Falcon, //
>      },
>      fb::SysmemFlush,
> +    fsp::FspCotVersion,
>      gsp::{
>          self,
>          Gsp, //
> @@ -141,6 +142,21 @@ pub(crate) const fn needs_fwsec_bootloader(self) -> bool {
>      pub(crate) fn pci_config_mirror_range(self) -> Range<u32> {
>          hal::gpu_hal(self).pci_config_mirror_range()
>      }
> +
> +    /// Returns the FSP Chain of Trust (CoT) protocol version for this chipset.
> +    ///
> +    /// Hopper (GH100) uses version 1, Blackwell uses version 2.
> +    /// Returns `None` for architectures that do not use FSP.
> +    #[expect(dead_code)]
> +    pub(crate) const fn fsp_cot_version(self) -> Option<FspCotVersion> {
> +        match self.arch() {
> +            Architecture::Hopper => Some(FspCotVersion::new(1)),
> +            Architecture::BlackwellGB10x | Architecture::BlackwellGB20x => {
> +                Some(FspCotVersion::new(2))
> +            }
> +            _ => None,
> +        }
> +    }

This is only used in the `fsp` module - can we turn this into a FSP HAL
method? That way it also won't need to return an `Option` since
`fsp_hal` will already have filtered architectures that don't support
FSP.
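
For illustration only, a standalone sketch of that shape. Every name
here is hypothetical (the Architecture enum is a stand-in and the trait
is not the actual nova-core `fsp::hal` interface); it only shows how
the `Option` moves from the version lookup to the HAL lookup.

```rust
/// Stand-in for the driver's architecture enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Architecture {
    Turing,
    Ampere,
    Ada,
    Hopper,
    BlackwellGB10x,
    BlackwellGB20x,
}

/// Hypothetical FSP HAL: only implemented for chips that have an FSP.
trait FspHal {
    /// Chain of Trust protocol version, as a plain `u16`.
    fn cot_version(&self) -> u16;
}

struct Gh100;
struct Gb100;

impl FspHal for Gh100 {
    fn cot_version(&self) -> u16 {
        1
    }
}

impl FspHal for Gb100 {
    fn cot_version(&self) -> u16 {
        2
    }
}

const GH100_HAL: &dyn FspHal = &Gh100;
const GB100_HAL: &dyn FspHal = &Gb100;

/// The only place that has to deal with "no FSP on this architecture".
fn fsp_hal(arch: Architecture) -> Option<&'static dyn FspHal> {
    match arch {
        Architecture::Hopper => Some(GH100_HAL),
        Architecture::BlackwellGB10x | Architecture::BlackwellGB20x => Some(GB100_HAL),
        _ => None,
    }
}
```

Callers that already hold a HAL then get the version unconditionally.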

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v11 18/22] gpu: nova-core: Hopper/Blackwell: add FspCotVersion type
  2026-06-01 14:07   ` Alexandre Courbot
@ 2026-06-01 18:23     ` John Hubbard
  0 siblings, 0 replies; 56+ messages in thread
From: John Hubbard @ 2026-06-01 18:23 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Danilo Krummrich, Joel Fernandes, Timur Tabi, Alistair Popple,
	Eliot Courtney, Shashank Sharma, Zhi Wang, David Airlie,
	Simona Vetter, Bjorn Helgaas, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux, LKML

On 6/1/26 7:07 AM, Alexandre Courbot wrote:
> On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
...
>> +impl FspCotVersion {
>> +    /// Creates a new FSP CoT version.
>> +    pub(crate) const fn new(version: u16) -> Self {
>> +        Self(version)
>> +    }
>> +
>> +    /// Returns the raw protocol version number for the wire format.
>> +    #[expect(dead_code)]
>> +    pub(crate) const fn raw(self) -> u16 {
>> +        self.0
>> +    }
>> +}
> 
> This type seems to just wrap a `u16`, and return its raw value without
> any limitation for its range, or other functionality. It is just created
> in `Chipset::fsp_cot_version`, to be immediately unpacked by
> `Fsp::boot_fmc`. Why not just use a `u16` directly?

OK, will do that.

> 
>> +
>>  /// FSP message timeout in milliseconds.
>>  const FSP_MSG_TIMEOUT_MS: i64 = 2000;
>>  
>> diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
>> index 7dd736e5b190..6cdface3c618 100644
>> --- a/drivers/gpu/nova-core/gpu.rs
>> +++ b/drivers/gpu/nova-core/gpu.rs
>> @@ -21,6 +21,7 @@
>>          Falcon, //
>>      },
>>      fb::SysmemFlush,
>> +    fsp::FspCotVersion,
>>      gsp::{
>>          self,
>>          Gsp, //
>> @@ -141,6 +142,21 @@ pub(crate) const fn needs_fwsec_bootloader(self) -> bool {
>>      pub(crate) fn pci_config_mirror_range(self) -> Range<u32> {
>>          hal::gpu_hal(self).pci_config_mirror_range()
>>      }
>> +
>> +    /// Returns the FSP Chain of Trust (CoT) protocol version for this chipset.
>> +    ///
>> +    /// Hopper (GH100) uses version 1, Blackwell uses version 2.
>> +    /// Returns `None` for architectures that do not use FSP.
>> +    #[expect(dead_code)]
>> +    pub(crate) const fn fsp_cot_version(self) -> Option<FspCotVersion> {
>> +        match self.arch() {
>> +            Architecture::Hopper => Some(FspCotVersion::new(1)),
>> +            Architecture::BlackwellGB10x | Architecture::BlackwellGB20x => {
>> +                Some(FspCotVersion::new(2))
>> +            }
>> +            _ => None,
>> +        }
>> +    }
> 
> This is only used in the `fsp` module - can we turn this into a FSP HAL
> method? That way it also won't need to return an `Option` since
> `fsp_hal` will already have filtered architectures that don't support
> FSP.

Yes. This was one of those snippets that I had flagged as "might need
to be a HAL", so I'll go ahead and make it a HAL now that you've also
suggested that.

thanks,
-- 
John Hubbard


^ permalink raw reply	[flat|nested] 56+ messages in thread

* [PATCH v11 19/22] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot
  2026-05-30  3:09 [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (17 preceding siblings ...)
  2026-05-30  3:09 ` [PATCH v11 18/22] gpu: nova-core: Hopper/Blackwell: add FspCotVersion type John Hubbard
@ 2026-05-30  3:09 ` John Hubbard
  2026-05-30  3:09 ` [PATCH v11 20/22] gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling John Hubbard
                   ` (3 subsequent siblings)
  22 siblings, 0 replies; 56+ messages in thread
From: John Hubbard @ 2026-05-30  3:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

Build and send the Chain of Trust message to FSP, bundling the
DMA-coherent boot parameters that FSP reads at boot time.

Co-developed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
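Note (below the --- marker, so not part of the commit log): the
frts_offset computation in boot_fmc() can be checked in isolation. This
is a minimal standalone sketch, not driver code. The helpers align_up(),
frts_offset() and frts_location() are hypothetical, take plain integers
instead of an FbLayout, and use checked arithmetic throughout.

```rust
/// 2 MiB, the alignment applied to the FRTS offset.
const SZ_2M: u64 = 2 * 1024 * 1024;

/// Rounds `value` up to a multiple of `align`, which must be a power of
/// two. Returns `None` if the rounded value does not fit in a `u64`.
fn align_up(value: u64, align: u64) -> Option<u64> {
    let mask = align - 1;
    value.checked_add(mask).map(|v| v & !mask)
}

/// Offset of FRTS from the end of the framebuffer: the heap plus the
/// PMU-reserved area, rounded up to 2 MiB. On resume the offset is 0.
fn frts_offset(heap_len: u64, pmu_reserved_size: u32, resume: bool) -> Option<u64> {
    if resume {
        return Some(0);
    }
    let reserved = heap_len.checked_add(u64::from(pmu_reserved_size))?;
    align_up(reserved, SZ_2M)
}

/// FRTS_location = FB_END - frts_offset, as stated in the code comment.
fn frts_location(fb_end: u64, frts_offset: u64) -> Option<u64> {
    fb_end.checked_sub(frts_offset)
}
```

For example, a 1 MiB heap plus 4 KiB of PMU-reserved space rounds up to
a 2 MiB offset, which places FRTS 2 MiB below the end of the
framebuffer.
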
 drivers/gpu/nova-core/firmware/fsp.rs         |   2 -
 drivers/gpu/nova-core/fsp.rs                  | 142 +++++++++++++++++-
 drivers/gpu/nova-core/gpu.rs                  |   1 -
 drivers/gpu/nova-core/gsp.rs                  |   1 +
 drivers/gpu/nova-core/gsp/fw.rs               |  64 ++++++++
 .../gpu/nova-core/gsp/fw/r570_144/bindings.rs |  82 ++++++++++
 drivers/gpu/nova-core/gsp/hal/gh100.rs        |  26 +++-
 drivers/gpu/nova-core/mctp.rs                 |   2 -
 8 files changed, 307 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware/fsp.rs b/drivers/gpu/nova-core/firmware/fsp.rs
index dc28d0cc2d03..13e6a5b5aeae 100644
--- a/drivers/gpu/nova-core/firmware/fsp.rs
+++ b/drivers/gpu/nova-core/firmware/fsp.rs
@@ -39,10 +39,8 @@ pub(crate) struct FmcSignatures {
 
 pub(crate) struct FspFirmware {
     /// FMC firmware image data (only the "image" ELF section).
-    #[expect(dead_code)]
     pub(crate) fmc_image: Coherent<[u8]>,
     /// FMC firmware signatures.
-    #[expect(dead_code)]
     pub(crate) fmc_sigs: KBox<FmcSignatures>,
 }
 
diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
index 5aae8282f2f0..5878d86323b9 100644
--- a/drivers/gpu/nova-core/fsp.rs
+++ b/drivers/gpu/nova-core/fsp.rs
@@ -9,8 +9,14 @@
 
 use kernel::{
     device,
+    dma::Coherent,
     io::poll::read_poll_timeout,
     prelude::*,
+    ptr::{
+        Alignable,
+        Alignment, //
+    },
+    sizes::SZ_2M,
     time::Delta,
     transmute::{
         AsBytes,
@@ -24,7 +30,13 @@
         self,
         Falcon, //
     },
+    fb::FbLayout,
+    firmware::fsp::{
+        FmcSignatures,
+        FspFirmware, //
+    },
     gpu::Chipset,
+    gsp::GspFmcBootParams,
     mctp::{
         MctpHeader,
         NvdmHeader,
@@ -49,7 +61,6 @@ pub(crate) const fn new(version: u16) -> Self {
     }
 
     /// Returns the raw protocol version number for the wire format.
-    #[expect(dead_code)]
     pub(crate) const fn raw(self) -> u16 {
         self.0
     }
@@ -67,6 +78,35 @@ struct NvdmPayloadCommandResponse {
     error_code: u32,
 }
 
+/// NVDM (NVIDIA Device Management) CoT (Chain of Trust) payload, the main
+/// message body sent to FSP for Chain of Trust boot.
+#[repr(C, packed)]
+#[derive(Clone, Copy)]
+struct NvdmPayloadCot {
+    version: u16,
+    size: u16,
+    gsp_fmc_sysmem_offset: u64,
+    frts_sysmem_offset: u64,
+    frts_sysmem_size: u32,
+    frts_vidmem_offset: u64,
+    frts_vidmem_size: u32,
+    sigs: FmcSignatures,
+    gsp_boot_args_sysmem_offset: u64,
+}
+
+/// Complete FSP message structure with MCTP and NVDM headers.
+#[repr(C, packed)]
+#[derive(Clone, Copy)]
+struct FspMessage {
+    mctp_header: MctpHeader,
+    nvdm_header: NvdmHeader,
+    cot: NvdmPayloadCot,
+}
+
+// SAFETY: `FspMessage` is `#[repr(C, packed)]` with no padding, so all of its
+// bytes are initialized.
+unsafe impl AsBytes for FspMessage {}
+
 /// Complete FSP response structure with MCTP and NVDM headers.
 #[repr(C, packed)]
 #[derive(Clone, Copy)]
@@ -88,6 +128,47 @@ pub(crate) trait MessageToFsp: AsBytes {
     const NVDM_TYPE: u32;
 }
 
+impl MessageToFsp for FspMessage {
+    const NVDM_TYPE: u32 = NvdmType::Cot as u32;
+}
+
+/// Bundled arguments for FMC boot via FSP Chain of Trust.
+pub(crate) struct FmcBootArgs<'a> {
+    chipset: Chipset,
+    fsp_fw: &'a FspFirmware,
+    fmc_boot_params: Coherent<GspFmcBootParams>,
+    resume: bool,
+}
+
+impl<'a> FmcBootArgs<'a> {
+    /// Builds FMC boot arguments, allocating the DMA-coherent boot parameter
+    /// structure that FSP will read.
+    pub(crate) fn new(
+        dev: &device::Device<device::Bound>,
+        chipset: Chipset,
+        fsp_fw: &'a FspFirmware,
+        wpr_meta_addr: u64,
+        libos_addr: u64,
+        resume: bool,
+    ) -> Result<Self> {
+        let init = GspFmcBootParams::new(wpr_meta_addr, libos_addr);
+
+        Ok(Self {
+            chipset,
+            fsp_fw,
+            fmc_boot_params: Coherent::<GspFmcBootParams>::init(dev, GFP_KERNEL, init)?,
+            resume,
+        })
+    }
+
+    /// DMA address of the FMC boot parameters, needed after boot for lockdown
+    /// release polling.
+    #[expect(dead_code)]
+    pub(crate) fn boot_params_dma_handle(&self) -> u64 {
+        self.fmc_boot_params.dma_handle()
+    }
+}
+
 /// FSP interface for Hopper/Blackwell GPUs.
 pub(crate) struct Fsp;
 
@@ -115,8 +196,65 @@ pub(crate) fn wait_secure_boot(dev: &device::Device, bar: &Bar0, chipset: Chipse
         .map(|_| ())
     }
 
+    /// Boots GSP FMC via FSP Chain of Trust.
+    ///
+    /// Builds the CoT message from the pre-configured [`FmcBootArgs`], sends it
+    /// to FSP, and waits for the response.
+    pub(crate) fn boot_fmc(
+        dev: &device::Device<device::Bound>,
+        bar: &Bar0,
+        fb_layout: &FbLayout,
+        fsp_falcon: &mut Falcon<falcon::fsp::Fsp>,
+        args: &FmcBootArgs<'_>,
+    ) -> Result {
+        dev_dbg!(dev, "Starting FSP boot sequence for {}\n", args.chipset);
+
+        let fmc_addr = args.fsp_fw.fmc_image.dma_handle();
+        let fmc_boot_params_addr = args.fmc_boot_params.dma_handle();
+
+        // frts_offset is relative to FB end: FRTS_location = FB_END - frts_offset
+        let frts_offset = if !args.resume {
+            let frts_reserved_size = fb_layout.heap.len() + u64::from(fb_layout.pmu_reserved_size);
+
+            frts_reserved_size
+                .align_up(Alignment::new::<SZ_2M>())
+                .ok_or(EINVAL)?
+        } else {
+            0
+        };
+        let frts_size: u32 = if !args.resume {
+            fb_layout.frts.len().try_into()?
+        } else {
+            0
+        };
+
+        let msg = KBox::new(
+            FspMessage {
+                mctp_header: MctpHeader::single_packet(),
+                nvdm_header: NvdmHeader::new(NvdmType::Cot),
+                cot: NvdmPayloadCot {
+                    version: args.chipset.fsp_cot_version().ok_or(ENOTSUPP)?.raw(),
+                    size: u16::try_from(core::mem::size_of::<NvdmPayloadCot>())
+                        .map_err(|_| EINVAL)?,
+                    gsp_fmc_sysmem_offset: fmc_addr,
+                    frts_sysmem_offset: 0,
+                    frts_sysmem_size: 0,
+                    frts_vidmem_offset: frts_offset,
+                    frts_vidmem_size: frts_size,
+                    sigs: *args.fsp_fw.fmc_sigs,
+                    gsp_boot_args_sysmem_offset: fmc_boot_params_addr,
+                },
+            },
+            GFP_KERNEL,
+        )?;
+
+        Self::send_sync_fsp(dev, bar, fsp_falcon, &*msg)?;
+
+        dev_dbg!(dev, "FSP Chain of Trust completed successfully\n");
+        Ok(())
+    }
+
     /// Sends a message to FSP and waits for the response.
-    #[expect(dead_code)]
     fn send_sync_fsp<M>(
         dev: &device::Device,
         bar: &Bar0,
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 6cdface3c618..43749a62f593 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -147,7 +147,6 @@ pub(crate) fn pci_config_mirror_range(self) -> Range<u32> {
     ///
     /// Hopper (GH100) uses version 1, Blackwell uses version 2.
     /// Returns `None` for architectures that do not use FSP.
-    #[expect(dead_code)]
     pub(crate) const fn fsp_cot_version(self) -> Option<FspCotVersion> {
         match self.arch() {
             Architecture::Hopper => Some(FspCotVersion::new(1)),
diff --git a/drivers/gpu/nova-core/gsp.rs b/drivers/gpu/nova-core/gsp.rs
index 1885cfa5cb38..69175ca3315c 100644
--- a/drivers/gpu/nova-core/gsp.rs
+++ b/drivers/gpu/nova-core/gsp.rs
@@ -25,6 +25,7 @@
 mod sequencer;
 
 pub(crate) use fw::{
+    GspFmcBootParams,
     GspFwWprMeta,
     LibosParams, //
 };
diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index 0c54e8bf4bb3..558b37863f00 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -934,3 +934,67 @@ fn new(cmdq: &Cmdq) -> impl Init<Self> + '_ {
         })
     }
 }
+
+#[repr(u32)]
+pub(crate) enum GspDmaTarget {
+    #[expect(dead_code)]
+    LocalFb = bindings::GSP_DMA_TARGET_GSP_DMA_TARGET_LOCAL_FB,
+    CoherentSystem = bindings::GSP_DMA_TARGET_GSP_DMA_TARGET_COHERENT_SYSTEM,
+    NoncoherentSystem = bindings::GSP_DMA_TARGET_GSP_DMA_TARGET_NONCOHERENT_SYSTEM,
+}
+
+type GspAcrBootGspRmParams = bindings::GSP_ACR_BOOT_GSP_RM_PARAMS;
+
+impl GspAcrBootGspRmParams {
+    fn new(target: GspDmaTarget, wpr_meta_addr: u64) -> impl Init<Self> {
+        #[allow(non_snake_case)]
+        let params = init!(Self {
+            target: target as u32,
+            gspRmDescSize: num::usize_into_u32::<{ size_of::<GspFwWprMeta>() }>(),
+            gspRmDescOffset: wpr_meta_addr,
+            bIsGspRmBoot: 1,
+            wprCarveoutOffset: 0,
+            wprCarveoutSize: 0,
+            __bindgen_padding_0: Default::default(),
+        });
+
+        params
+    }
+}
+
+type GspRmParams = bindings::GSP_RM_PARAMS;
+
+impl GspRmParams {
+    fn new(target: GspDmaTarget, libos_addr: u64) -> impl Init<Self> {
+        #[allow(non_snake_case)]
+        let params = init!(Self {
+            target: target as u32,
+            bootArgsOffset: libos_addr,
+            __bindgen_padding_0: Default::default(),
+        });
+
+        params
+    }
+}
+
+pub(crate) type GspFmcBootParams = bindings::GSP_FMC_BOOT_PARAMS;
+
+// SAFETY: Padding is explicit and will not contain uninitialized data.
+unsafe impl AsBytes for GspFmcBootParams {}
+// SAFETY: This struct only contains integer types for which all bit patterns are valid.
+unsafe impl FromBytes for GspFmcBootParams {}
+
+impl GspFmcBootParams {
+    pub(crate) fn new(wpr_meta_addr: u64, libos_addr: u64) -> impl Init<Self> {
+        #[allow(non_snake_case)]
+        let init = init!(Self {
+            // Blackwell FSP obtains WPR info from other sources, so
+            // wprCarveoutOffset and wprCarveoutSize are left zero.
+            bootGspRmParams <- GspAcrBootGspRmParams::new(GspDmaTarget::CoherentSystem, wpr_meta_addr),
+            gspRmParams <- GspRmParams::new(GspDmaTarget::NoncoherentSystem, libos_addr),
+            ..Zeroable::init_zeroed()
+        });
+
+        init
+    }
+}
diff --git a/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs b/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs
index 1d592bd3f9ed..ea350f9b2cc4 100644
--- a/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs
+++ b/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs
@@ -883,6 +883,88 @@ fn default() -> Self {
         }
     }
 }
+pub const GSP_DMA_TARGET_GSP_DMA_TARGET_LOCAL_FB: GSP_DMA_TARGET = 0;
+pub const GSP_DMA_TARGET_GSP_DMA_TARGET_COHERENT_SYSTEM: GSP_DMA_TARGET = 1;
+pub const GSP_DMA_TARGET_GSP_DMA_TARGET_NONCOHERENT_SYSTEM: GSP_DMA_TARGET = 2;
+pub const GSP_DMA_TARGET_GSP_DMA_TARGET_COUNT: GSP_DMA_TARGET = 3;
+pub type GSP_DMA_TARGET = ffi::c_uint;
+#[repr(C)]
+#[derive(Debug, Default, Copy, Clone, MaybeZeroable)]
+pub struct GSP_FMC_INIT_PARAMS {
+    pub regkeys: u32_,
+}
+#[repr(C)]
+#[derive(Debug, Copy, Clone, MaybeZeroable)]
+pub struct GSP_ACR_BOOT_GSP_RM_PARAMS {
+    pub target: GSP_DMA_TARGET,
+    pub gspRmDescSize: u32_,
+    pub gspRmDescOffset: u64_,
+    pub wprCarveoutOffset: u64_,
+    pub wprCarveoutSize: u32_,
+    pub bIsGspRmBoot: u8_,
+    pub __bindgen_padding_0: [u8; 3usize],
+}
+impl Default for GSP_ACR_BOOT_GSP_RM_PARAMS {
+    fn default() -> Self {
+        let mut s = ::core::mem::MaybeUninit::<Self>::uninit();
+        unsafe {
+            ::core::ptr::write_bytes(s.as_mut_ptr(), 0, 1);
+            s.assume_init()
+        }
+    }
+}
+#[repr(C)]
+#[derive(Debug, Copy, Clone, MaybeZeroable)]
+pub struct GSP_RM_PARAMS {
+    pub target: GSP_DMA_TARGET,
+    pub __bindgen_padding_0: [u8; 4usize],
+    pub bootArgsOffset: u64_,
+}
+impl Default for GSP_RM_PARAMS {
+    fn default() -> Self {
+        let mut s = ::core::mem::MaybeUninit::<Self>::uninit();
+        unsafe {
+            ::core::ptr::write_bytes(s.as_mut_ptr(), 0, 1);
+            s.assume_init()
+        }
+    }
+}
+#[repr(C)]
+#[derive(Debug, Copy, Clone, MaybeZeroable)]
+pub struct GSP_SPDM_PARAMS {
+    pub target: GSP_DMA_TARGET,
+    pub __bindgen_padding_0: [u8; 4usize],
+    pub payloadBufferOffset: u64_,
+    pub payloadBufferSize: u32_,
+    pub __bindgen_padding_1: [u8; 4usize],
+}
+impl Default for GSP_SPDM_PARAMS {
+    fn default() -> Self {
+        let mut s = ::core::mem::MaybeUninit::<Self>::uninit();
+        unsafe {
+            ::core::ptr::write_bytes(s.as_mut_ptr(), 0, 1);
+            s.assume_init()
+        }
+    }
+}
+#[repr(C)]
+#[derive(Debug, Copy, Clone, MaybeZeroable)]
+pub struct GSP_FMC_BOOT_PARAMS {
+    pub initParams: GSP_FMC_INIT_PARAMS,
+    pub __bindgen_padding_0: [u8; 4usize],
+    pub bootGspRmParams: GSP_ACR_BOOT_GSP_RM_PARAMS,
+    pub gspRmParams: GSP_RM_PARAMS,
+    pub gspSpdmParams: GSP_SPDM_PARAMS,
+}
+impl Default for GSP_FMC_BOOT_PARAMS {
+    fn default() -> Self {
+        let mut s = ::core::mem::MaybeUninit::<Self>::uninit();
+        unsafe {
+            ::core::ptr::write_bytes(s.as_mut_ptr(), 0, 1);
+            s.assume_init()
+        }
+    }
+}
 #[repr(C)]
 #[derive(Debug, Default, Copy, Clone, MaybeZeroable)]
 pub struct rpc_unloading_guest_driver_v1F_07 {
diff --git a/drivers/gpu/nova-core/gsp/hal/gh100.rs b/drivers/gpu/nova-core/gsp/hal/gh100.rs
index 151df05e303b..946e88481d6f 100644
--- a/drivers/gpu/nova-core/gsp/hal/gh100.rs
+++ b/drivers/gpu/nova-core/gsp/hal/gh100.rs
@@ -21,7 +21,10 @@
         fsp::FspFirmware,
         FIRMWARE_VERSION, //
     },
-    fsp::Fsp,
+    fsp::{
+        FmcBootArgs,
+        Fsp, //
+    },
     gpu::Chipset,
     gsp::{
         boot::BootUnloadGuard,
@@ -40,20 +43,31 @@ impl GspHal for Gh100 {
     /// the GSP boot internally - no manual GSP reset/boot is needed.
     fn boot<'a>(
         &self,
-        _gsp: &'a Gsp,
+        gsp: &'a Gsp,
         dev: &'a device::Device<device::Bound>,
         bar: &'a Bar0,
         chipset: Chipset,
-        _fb_layout: &FbLayout,
-        _wpr_meta: &Coherent<GspFwWprMeta>,
+        fb_layout: &FbLayout,
+        wpr_meta: &Coherent<GspFwWprMeta>,
         _gsp_falcon: &'a Falcon<GspEngine>,
         _sec2_falcon: &'a Falcon<Sec2>,
     ) -> Result<BootUnloadGuard<'a>> {
-        let _fsp_falcon = Falcon::<FspEngine>::new(dev, chipset)?;
-        let _fsp_fw = FspFirmware::new(dev, chipset, FIRMWARE_VERSION)?;
+        let mut fsp_falcon = Falcon::<FspEngine>::new(dev, chipset)?;
+        let fsp_fw = FspFirmware::new(dev, chipset, FIRMWARE_VERSION)?;
 
         Fsp::wait_secure_boot(dev, bar, chipset)?;
 
+        let args = FmcBootArgs::new(
+            dev,
+            chipset,
+            &fsp_fw,
+            wpr_meta.dma_handle(),
+            gsp.libos.dma_handle(),
+            false,
+        )?;
+
+        Fsp::boot_fmc(dev, bar, fb_layout, &mut fsp_falcon, &args)?;
+
         Err(ENOTSUPP)
     }
 }
diff --git a/drivers/gpu/nova-core/mctp.rs b/drivers/gpu/nova-core/mctp.rs
index a13146dc0cca..be3e757d05a0 100644
--- a/drivers/gpu/nova-core/mctp.rs
+++ b/drivers/gpu/nova-core/mctp.rs
@@ -7,8 +7,6 @@
 //! Device Management) messages between the kernel driver and GPU firmware
 //! processors such as FSP and GSP.
 
-#![expect(dead_code)]
-
 use kernel::pci::Vendor;
 
 /// NVDM message type identifiers carried over MCTP.
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 56+ messages in thread
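
For readers following the FRTS arithmetic in boot_fmc() from patch 19/22, here is a minimal standalone sketch. It assumes plain integers in place of FbLayout, and the helper names align_up_2m() and frts_vidmem_offset() are hypothetical, not nova-core API. It mirrors the comment in the patch: the offset is measured back from the end of the framebuffer, rounded up to 2 MiB, and is zero on resume. Unlike the patch, the sketch uses a checked add for the reserved size.

```rust
/// 2 MiB, the alignment applied to the reserved size.
const SZ_2M: u64 = 2 * 1024 * 1024;

/// Rounds `value` up to the next multiple of 2 MiB; `None` on overflow.
/// (Hypothetical helper, standing in for the kernel's `align_up`.)
fn align_up_2m(value: u64) -> Option<u64> {
    value.checked_add(SZ_2M - 1).map(|v| v & !(SZ_2M - 1))
}

/// Hypothetical stand-in for the offset computation in `boot_fmc()`:
/// FRTS_location = FB_END - returned offset.
fn frts_vidmem_offset(heap_len: u64, pmu_reserved_size: u32, resume: bool) -> Option<u64> {
    if resume {
        // On resume the patch passes a zero offset (and a zero size).
        return Some(0);
    }
    let reserved = heap_len.checked_add(u64::from(pmu_reserved_size))?;
    align_up_2m(reserved)
}
```

With a 1 MiB heap and no PMU reservation this yields a 2 MiB offset; one extra byte past a 2 MiB boundary pushes the result to 4 MiB.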

* [PATCH v11 20/22] gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling
  2026-05-30  3:09 [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (18 preceding siblings ...)
  2026-05-30  3:09 ` [PATCH v11 19/22] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot John Hubbard
@ 2026-05-30  3:09 ` John Hubbard
  2026-05-30  3:09 ` [PATCH v11 21/22] gpu: nova-core: add non-sec2 unload path John Hubbard
                   ` (2 subsequent siblings)
  22 siblings, 0 replies; 56+ messages in thread
From: John Hubbard @ 2026-05-30  3:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

On Hopper and Blackwell, FSP boots GSP with hardware lockdown enabled.
After FSP Chain of Trust completes, the driver must poll for lockdown
release before proceeding with GSP initialization. Add the register
bit and helper functions needed for this polling.
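
As a rough illustration of the polling predicate, here is a standalone model of the check as I read it from the diff below. It takes the two mailbox values and the HWCFG2 lockdown bit as plain arguments instead of reading them through Bar0. The free function and its hwcfg2_lockdown parameter are assumptions made for the sketch; the two constants are taken from the patch.

```rust
const GSP_LOCKDOWN_PATTERN: u32 = 0xbadf_4100;
const GSP_LOCKDOWN_MASK: u32 = 0xffff_ff00;

/// Standalone model of the polling predicate. `hwcfg2_lockdown` stands in
/// for the RISC-V branch privilege lockdown bit read from `HWCFG2`.
fn lockdown_released(
    mbox0: u32,
    mbox1: u32,
    boot_params_addr: u64,
    hwcfg2_lockdown: bool,
) -> bool {
    // Firmware still advertises the lockdown pattern: keep polling.
    if (mbox0 & GSP_LOCKDOWN_MASK) == GSP_LOCKDOWN_PATTERN {
        return false;
    }

    let combined = (u64::from(mbox1) << 32) | u64::from(mbox0);

    // A non-zero mbox0 that is not the boot params address ends polling;
    // the caller then reports the non-zero mbox0 as a boot failure.
    if mbox0 != 0 && combined != boot_params_addr {
        return true;
    }

    // Otherwise the hardware lockdown bit decides.
    !hwcfg2_lockdown
}
```

Note that "released" here only means "stop polling": the caller still has to inspect mbox0 afterwards to tell success from a firmware-reported error.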

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/fsp.rs           |  1 -
 drivers/gpu/nova-core/gsp/hal/gh100.rs | 90 +++++++++++++++++++++++++-
 drivers/gpu/nova-core/regs.rs          |  2 +
 3 files changed, 90 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
index 5878d86323b9..15d4b01c284a 100644
--- a/drivers/gpu/nova-core/fsp.rs
+++ b/drivers/gpu/nova-core/fsp.rs
@@ -163,7 +163,6 @@ pub(crate) fn new(
 
     /// DMA address of the FMC boot parameters, needed after boot for lockdown
     /// release polling.
-    #[expect(dead_code)]
     pub(crate) fn boot_params_dma_handle(&self) -> u64 {
         self.fmc_boot_params.dma_handle()
     }
diff --git a/drivers/gpu/nova-core/gsp/hal/gh100.rs b/drivers/gpu/nova-core/gsp/hal/gh100.rs
index 946e88481d6f..1f333a6f57a0 100644
--- a/drivers/gpu/nova-core/gsp/hal/gh100.rs
+++ b/drivers/gpu/nova-core/gsp/hal/gh100.rs
@@ -5,7 +5,13 @@
 
 use kernel::{
     device,
-    dma::Coherent, //
+    dma::Coherent,
+    io::{
+        poll::read_poll_timeout,
+        register::WithBase,
+        Io, //
+    },
+    time::Delta,
 };
 
 use crate::{
@@ -32,8 +38,85 @@
         Gsp,
         GspFwWprMeta, //
     },
+    regs,
 };
 
+/// GSP lockdown pattern written by firmware to mbox0 while RISC-V branch privilege
+/// lockdown is active. The low byte varies; the upper 24 bits are fixed.
+const GSP_LOCKDOWN_PATTERN: u32 = 0xbadf_4100;
+const GSP_LOCKDOWN_MASK: u32 = 0xffff_ff00;
+
+/// GSP falcon mailbox state, used to track lockdown release status.
+struct GspMbox {
+    mbox0: u32,
+    mbox1: u32,
+}
+
+impl GspMbox {
+    /// Reads both mailboxes from the GSP falcon.
+    fn read(gsp_falcon: &Falcon<GspEngine>, bar: &Bar0) -> Self {
+        Self {
+            mbox0: gsp_falcon.read_mailbox0(bar),
+            mbox1: gsp_falcon.read_mailbox1(bar),
+        }
+    }
+
+    /// Returns `true` if the lockdown pattern is present in `mbox0`.
+    fn is_locked_down(&self) -> bool {
+        (self.mbox0 & GSP_LOCKDOWN_MASK) == GSP_LOCKDOWN_PATTERN
+    }
+
+    /// Combines mailbox0 and mailbox1 into a 64-bit address.
+    fn combined_addr(&self) -> u64 {
+        (u64::from(self.mbox1) << 32) | u64::from(self.mbox0)
+    }
+
+    /// Returns `true` if GSP lockdown has been released.
+    ///
+    /// Checks the lockdown pattern, validates the boot params address,
+    /// and verifies the `HWCFG2` lockdown bit is clear.
+    fn lockdown_released(&self, bar: &Bar0, fmc_boot_params_addr: u64) -> bool {
+        if self.is_locked_down() {
+            return false;
+        }
+
+        if self.mbox0 != 0 && self.combined_addr() != fmc_boot_params_addr {
+            return true;
+        }
+
+        let hwcfg2 = bar.read(regs::NV_PFALCON_FALCON_HWCFG2::of::<GspEngine>());
+        !hwcfg2.riscv_br_priv_lockdown()
+    }
+}
+
+/// Waits for GSP lockdown to be released after FSP Chain of Trust.
+fn wait_for_gsp_lockdown_release(
+    dev: &device::Device<device::Bound>,
+    bar: &Bar0,
+    gsp_falcon: &Falcon<GspEngine>,
+    fmc_boot_params_addr: u64,
+) -> Result {
+    dev_dbg!(dev, "Waiting for GSP lockdown release\n");
+
+    let mbox = read_poll_timeout(
+        || Ok(GspMbox::read(gsp_falcon, bar)),
+        |mbox| mbox.lockdown_released(bar, fmc_boot_params_addr),
+        Delta::from_millis(10),
+        Delta::from_secs(30),
+    )
+    .inspect_err(|_| {
+        dev_err!(dev, "GSP lockdown release timeout\n");
+    })?;
+
+    if mbox.mbox0 != 0 {
+        dev_err!(dev, "GSP-FMC boot failed (mbox: {:#x})\n", mbox.mbox0);
+        return Err(EIO);
+    }
+
+    dev_dbg!(dev, "GSP lockdown released\n");
+    Ok(())
+}
+
 struct Gh100;
 
 impl GspHal for Gh100 {
@@ -49,7 +132,7 @@ fn boot<'a>(
         chipset: Chipset,
         fb_layout: &FbLayout,
         wpr_meta: &Coherent<GspFwWprMeta>,
-        _gsp_falcon: &'a Falcon<GspEngine>,
+        gsp_falcon: &'a Falcon<GspEngine>,
         _sec2_falcon: &'a Falcon<Sec2>,
     ) -> Result<BootUnloadGuard<'a>> {
         let mut fsp_falcon = Falcon::<FspEngine>::new(dev, chipset)?;
@@ -68,6 +151,9 @@ fn boot<'a>(
 
         Fsp::boot_fmc(dev, bar, fb_layout, &mut fsp_falcon, &args)?;
 
+        let fmc_boot_params_addr = args.boot_params_dma_handle();
+        wait_for_gsp_lockdown_release(dev, bar, gsp_falcon, fmc_boot_params_addr)?;
+
         Err(ENOTSUPP)
     }
 }
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index d4067efb8772..0f48e78eebe7 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -355,6 +355,8 @@ pub(crate) fn vga_workspace_addr(self) -> Option<u64> {
     pub(crate) NV_PFALCON_FALCON_HWCFG2(u32) @ PFalconBase + 0x000000f4 {
         /// Signal indicating that reset is completed (GA102+).
         31:31   reset_ready => bool;
+        /// RISC-V branch privilege lockdown bit.
+        13:13   riscv_br_priv_lockdown => bool;
         /// Set to 0 after memory scrubbing is completed.
         12:12   mem_scrubbing => bool;
         10:10   riscv => bool;
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v11 21/22] gpu: nova-core: add non-sec2 unload path
  2026-05-30  3:09 [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (19 preceding siblings ...)
  2026-05-30  3:09 ` [PATCH v11 20/22] gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling John Hubbard
@ 2026-05-30  3:09 ` John Hubbard
  2026-05-30  3:09 ` [PATCH v11 22/22] gpu: nova-core: gsp: enable FSP boot path John Hubbard
  2026-05-30  3:21 ` [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
  22 siblings, 0 replies; 56+ messages in thread
From: John Hubbard @ 2026-05-30  3:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

From: Eliot Courtney <ecourtney@nvidia.com>

On GPUs that do not use SEC2, unloading only requires waiting for the GSP
falcon to halt, because GSP itself does the main work of unloading.
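
The wait-for-halt step is a plain poll-with-timeout loop. Below is a userspace sketch of that pattern using std only. poll_until() is a hypothetical helper written for illustration; it is not the kernel's read_poll_timeout(), whose exact semantics may differ.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Userspace analogue of a poll-with-timeout loop (hypothetical helper).
/// Calls `read` repeatedly until `done` accepts the value or `timeout` expires.
fn poll_until<T>(
    mut read: impl FnMut() -> T,
    mut done: impl FnMut(&T) -> bool,
    interval: Duration,
    timeout: Duration,
) -> Result<T, &'static str> {
    let deadline = Instant::now() + timeout;
    loop {
        let value = read();
        if done(&value) {
            return Ok(value);
        }
        if Instant::now() >= deadline {
            return Err("timed out");
        }
        sleep(interval);
    }
}
```

In the unload path the read closure would return whether the RISC-V core is still active and the condition would be `|&active| !active`, so the loop ends as soon as the falcon reports that it has halted.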

Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
[ jhubbard: use Result instead of Result<()> in the UnloadBundle impl ]
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/gsp/hal/gh100.rs | 37 ++++++++++++++++++++++++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/nova-core/gsp/hal/gh100.rs b/drivers/gpu/nova-core/gsp/hal/gh100.rs
index 1f333a6f57a0..0076ca00a771 100644
--- a/drivers/gpu/nova-core/gsp/hal/gh100.rs
+++ b/drivers/gpu/nova-core/gsp/hal/gh100.rs
@@ -34,7 +34,10 @@
     gpu::Chipset,
     gsp::{
         boot::BootUnloadGuard,
-        hal::GspHal,
+        hal::{
+            GspHal,
+            UnloadBundle, //
+        },
         Gsp,
         GspFwWprMeta, //
     },
@@ -117,6 +120,28 @@ fn wait_for_gsp_lockdown_release(
     Ok(())
 }
 
+struct FspUnloadBundle;
+
+impl UnloadBundle for FspUnloadBundle {
+    fn run(
+        &self,
+        dev: &device::Device<device::Bound>,
+        bar: &Bar0,
+        gsp_falcon: &Falcon<GspEngine>,
+        _sec2_falcon: &Falcon<Sec2>,
+    ) -> Result {
+        // GSP falcon does most of the work of resetting, so just wait for it to finish.
+        read_poll_timeout(
+            || Ok(gsp_falcon.is_riscv_active(bar)),
+            |&active| !active,
+            Delta::from_millis(10),
+            Delta::from_secs(5),
+        )
+        .map(|_| ())
+        .inspect_err(|_| dev_err!(dev, "GSP falcon failed to halt\n"))
+    }
+}
+
 struct Gh100;
 
 impl GspHal for Gh100 {
@@ -133,11 +158,19 @@ fn boot<'a>(
         fb_layout: &FbLayout,
         wpr_meta: &Coherent<GspFwWprMeta>,
         gsp_falcon: &'a Falcon<GspEngine>,
-        _sec2_falcon: &'a Falcon<Sec2>,
+        sec2_falcon: &'a Falcon<Sec2>,
     ) -> Result<BootUnloadGuard<'a>> {
         let mut fsp_falcon = Falcon::<FspEngine>::new(dev, chipset)?;
         let fsp_fw = FspFirmware::new(dev, chipset, FIRMWARE_VERSION)?;
 
+        let unload_bundle = crate::gsp::UnloadBundle(
+            KBox::new(FspUnloadBundle, GFP_KERNEL)? as KBox<dyn UnloadBundle>
+        );
+
+        // Wrap the unload bundle into a drop guard so it is automatically run upon failure.
+        let _unload_guard =
+            BootUnloadGuard::new(gsp, dev, bar, gsp_falcon, sec2_falcon, Some(unload_bundle));
+
         Fsp::wait_secure_boot(dev, bar, chipset)?;
 
         let args = FmcBootArgs::new(
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v11 22/22] gpu: nova-core: gsp: enable FSP boot path
  2026-05-30  3:09 [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (20 preceding siblings ...)
  2026-05-30  3:09 ` [PATCH v11 21/22] gpu: nova-core: add non-sec2 unload path John Hubbard
@ 2026-05-30  3:09 ` John Hubbard
  2026-05-30  3:21 ` [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
  22 siblings, 0 replies; 56+ messages in thread
From: John Hubbard @ 2026-05-30  3:09 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, John Hubbard

From: Alexandre Courbot <acourbot@nvidia.com>

Now that all the elements are in place, enable the FSP boot path so
Hopper and Blackwell can boot.
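
The effect of returning the guard instead of dropping it can be shown with a small standalone sketch. The UnloadGuard type and the log-based boot() below are hypothetical stand-ins for illustration, not the BootUnloadGuard API: the unload step runs on every early error return, while on success ownership of the guard moves to the caller.

```rust
use std::cell::RefCell;

/// Hypothetical guard: records an "unload" step when it is dropped.
struct UnloadGuard<'a> {
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for UnloadGuard<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push("unload");
    }
}

/// Models `boot()`: the guard is created early and returned only on success.
fn boot<'a>(
    log: &'a RefCell<Vec<&'static str>>,
    fail: bool,
) -> Result<UnloadGuard<'a>, &'static str> {
    let guard = UnloadGuard { log };

    if fail {
        // Early return: `guard` is dropped here, so the unload step runs.
        return Err("boot step failed");
    }

    log.borrow_mut().push("booted");
    // Success: the guard is moved out, so nothing is unloaded yet.
    Ok(guard)
}
```

The caller that receives the guard then decides when unloading happens, simply by choosing when to drop it.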

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/gsp/hal/gh100.rs | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/nova-core/gsp/hal/gh100.rs b/drivers/gpu/nova-core/gsp/hal/gh100.rs
index 0076ca00a771..c5434d59db0d 100644
--- a/drivers/gpu/nova-core/gsp/hal/gh100.rs
+++ b/drivers/gpu/nova-core/gsp/hal/gh100.rs
@@ -168,7 +168,7 @@ fn boot<'a>(
         );
 
         // Wrap the unload bundle into a drop guard so it is automatically run upon failure.
-        let _unload_guard =
+        let unload_guard =
             BootUnloadGuard::new(gsp, dev, bar, gsp_falcon, sec2_falcon, Some(unload_bundle));
 
         Fsp::wait_secure_boot(dev, bar, chipset)?;
@@ -187,7 +187,7 @@ fn boot<'a>(
         let fmc_boot_params_addr = args.boot_params_dma_handle();
         wait_for_gsp_lockdown_release(dev, bar, gsp_falcon, fmc_boot_params_addr)?;
 
-        Err(ENOTSUPP)
+        Ok(unload_guard)
     }
 }
 
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support
  2026-05-30  3:09 [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
                   ` (21 preceding siblings ...)
  2026-05-30  3:09 ` [PATCH v11 22/22] gpu: nova-core: gsp: enable FSP boot path John Hubbard
@ 2026-05-30  3:21 ` John Hubbard
  22 siblings, 0 replies; 56+ messages in thread
From: John Hubbard @ 2026-05-30  3:21 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Joel Fernandes, Timur Tabi, Alistair Popple, Eliot Courtney,
	Shashank Sharma, Zhi Wang, David Airlie, Simona Vetter,
	Bjorn Helgaas, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, rust-for-linux, LKML, nova-gpu

Shoot, I seem to have used the older, now-wrong script to send this, because
I've only Cc'd rust-for-linux, and left out our nice new nova-gpu@lists.linux.dev
mailing list.

I'll +Cc nova-gpu@lists.linux.dev just on this cover letter, for now. I hope
this didn't mess up anyone's inbox too badly.

thanks,
John Hubbard

On 5/29/26 8:09 PM, John Hubbard wrote:
> Changes in v11:
> 
> * Made the FSP messaging path sound. The FSP falcon's EMEM window is a
>   stateful register pair (program an offset, then touch the data
>   register), so modeling it as a stateless I/O region let aliasing
>   accesses corrupt each other's offset with no unsafe at the call site.
>   The EMEM accessors and the send/receive helpers now take &mut self, so
>   the falcon handle is the exclusive token for an in-flight exchange,
>   and the unsafe Io/IoCapable impls and their unreachable! bounds checks
>   are gone. The accessors now program the EMEM offset once and stream
>   through the data register using the falcon's auto-increment, matching
>   Open RM, instead of re-programming the offset for every word.
> 
> * Rebased onto a current drm-rust-next that already carries the v10
>   preparatory patches, which are dropped from the series.
> 
> * Top of the series: the v10 boot-integration patch is replaced by "gsp:
>   enable FSP boot path" (Alexandre Courbot) and "add non-sec2 unload
>   path" (Eliot Courtney). The Hopper/Blackwell boot path now lives in
>   the GSP HAL (gsp/hal/gh100.rs) and returns a BootUnloadGuard.
> 
> * Reordered per review: hardware-differences patches first (DMA mask,
>   PCI config mirror, PMU-reserved framebuffer, non-WPR heap, WPR2 heap,
>   sysmem flush registers), then the FSP/FMC stack, then GSP lockdown
>   release polling.
> 
> * Hardware-difference patches are now HAL methods instead of inline
>   Architecture matches: the PMU-reserved framebuffer size (patch
>   retitled from "calculate reserved FB heap size" to "compute
>   PMU-reserved framebuffer size"), the non-WPR heap size (now u32 with a
>   1 MiB default instead of Option<u32>, per v10 review, with the GB10x
>   value in the GB100 HAL and the larger GB20x value in the GB202 HAL),
>   and the PCI config mirror range. The larger WPR2 heap pulls its base
>   size from the generated bindings, drops the custom constants that have
>   no Open RM counterpart, and matches all architectures exhaustively.
> 
> * FSP firmware handling moved into firmware/fsp.rs: FspFirmware now
>   holds parsed signatures (KBox<FmcSignatures>) instead of a raw ELF
>   copy, extracted through a get_section closure (per v10 review).
> 
> * FSP secure-boot polling uses a per-chipset FSP HAL
>   (fsp/hal/{gh100,gb202}.rs) reading the correct NV_THERM_I2CS register,
>   instead of a free function in regs.rs.
> 
> * FSP Chain of Trust boot was redone around a new FmcBootArgs type, and
>   the response headers are strongly typed (MctpHeader/NvdmHeader instead
>   of bare u32), with the vendor ID from kernel::pci::Vendor.
> 
> * GB10x/GB20x sysmem flush: the HSHUB0/FBHUB0 register details moved
>   from module doccomments onto the write_sysmem_flush_page_* methods.
> 
> * Commit message cleanups: dropped stale claims, shortened an
>   over-length subject, and fixed trailer ordering.
> 
> Changes in v10:
> 
> * Reordered per review (and direct assistance--thanks again) from
>   Alexandre Courbot: the two refactoring patches (factor .fwsignature*
>   selection, use GPU Architecture to simplify HALs) now come first,
>   before GPU identification. The boot_via_fsp stub is introduced early
>   and completed as FSP features arrive. The SEC2 refactoring, PCI config
>   mirror, and reserved heap size patches are moved earlier in the
>   series.
> 
> * Made pmuReservedSize conditional on Blackwell dGPU architectures.
>   Open RM only sets this field for Blackwell (Turing/Ampere/Ada/Hopper
>   all leave it zero). Added calc_pmu_reserved_size() helper and
>   FbLayout.pmu_reserved_size field to route the value through the
>   layout instead of using the constant unconditionally. Replaced
>   `as u32` cast with usize_into_u32 for PMU_RESERVED_SIZE. (Alexandre)
> 
> * Split the GFW boot wait HAL change into two patches: one that moves
>   the existing behavior into a GpuHal trait, and a second that adds the
>   Hopper/Blackwell skip.
> 
> * Removed the Spec::chipset() accessor (no longer needed after
>   restructuring). Updated the Copy/Clone commit message accordingly.
> 
> * Rebased onto drm-rust-next-staging, which includes
>   const_align_up(), "move firmware image parsing code to firmware.rs",
>   "factor out an elf_str() function", and "make WPR heap sizing
>   fallible" from the v9 series. Series is now 28 patches (was 31).
> 
> * Depends on the "rust: sizes: SizeConstants trait" series[N], which
>   adds typed SZ_* constants (u64::SZ_1M, u32::SZ_4K, etc.). The
>   nova-core conversion patch ("use SizeConstants trait for u64 size
>   constants") will be posted separately, but is already included in my
>   git branch. The Blackwell patches that introduce new SZ_* usage
>   (larger non-WPR heap, FSP Chain of Trust boot, larger WPR2 heap) use
>   the trait form from the start.
> 
> * Fixed the PCI config mirror commit message: corrected hex offsets to
>   match the code (older architectures use 0x088000, Hopper/Blackwell
>   use 0x092000).
> 
> * Dropped the never-used nvdm_type_raw() method from the MCTP/NVDM
>   introducing patch.
> 
> * Removed stale Co-developed-by tag from the FSP Chain of Trust boot
>   commit per Alex's request. Rewrote the commit message to remove
>   references to the no-longer-existent fmc_full field.
> 
> * Added missing #[expect(dead_code)] on GspFmcBootParams in the FSP
>   secure boot commit, removed when the struct becomes used in the
>   Chain of Trust boot commit.
> 
> Changes in v9:
> 
> * Rebased onto today's drm-rust-next.
> 
> * Split Architecture::Blackwell into BlackwellGB10x and BlackwellGB20x,
>   after Gary Guo and Sashiko pointed out that GB10x and GB20x are
>   distinct enough to warrant separate architecture variants. This
>   surfaced several bugs where all Blackwell chips were incorrectly
>   treated as a single group:
>   * Fixed the FSP boot completion register address for GB10x. GB10x
>     uses the same address as Hopper (0x000200bc), not the GB20x
>     address (0x00ad00bc).
>   * Made the FSP secure boot timeout architecture-dependent. GB20x
>     now gets 5000ms while Hopper and GB10x keep 4000ms.
>   * Removed chipset-level match arms that were working around the
>     single-variant design in fb/hal.rs, firmware/gsp.rs, and regs.rs.
> 
> * Simplified find_gsp_sigs_section() to return &'static str instead of
>   Option<&'static str>, since the Architecture enum is now exhaustive
>   and every variant has a known signature section name.
> 
> * Moved dma_set_mask_and_coherent from probe() into Gpu::new(), with
>   the unsafe block narrowed to just that call. Gpu::new() now takes
>   pci::Device<device::Core> instead of device::Bound to support this.
> 
> * Dropped the local `chipset` variable in Gpu::new() and accessed
>   spec.chipset() directly, since Spec is now Copy.
> 
> * Changed Spec::chipset() to take self instead of &self, since Spec is
>   Copy.
> 
> * Removed the unnecessary Tu102/Gh100 consts in gpu/hal.rs and used the
>   unit structs directly.
> 
> * FspFirmware now holds on to the Firmware object instead of copying
>   the FMC ELF into a KVec<u8>.
> 
> * Moved the dev_info formatting fix and the GFW_BOOT comment removal
>   out of the Copy/Clone patch and into the patches that actually touch
>   those lines.
> 
> * Added Reviewed-by tags from Gary Guo and Alice Ryhl.
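
The GB10x/GB20x split in the v9 notes above can be illustrated with a small standalone sketch. This is plain userspace Rust, not the nova-core code: the enum lists only three variants, and the method names fsp_boot_complete_reg() and fsp_secure_boot_timeout() are hypothetical. Only the register addresses and timeout values are taken from the changelog.

```rust
use std::time::Duration;

/// Illustrative subset of the architectures; the real enum has more variants.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Architecture {
    Hopper,
    BlackwellGB10x,
    BlackwellGB20x,
}

impl Architecture {
    /// FSP boot completion register address (hypothetical method name).
    /// GB10x shares the Hopper address; only GB20x differs.
    const fn fsp_boot_complete_reg(self) -> u32 {
        match self {
            Architecture::Hopper | Architecture::BlackwellGB10x => 0x0002_00bc,
            Architecture::BlackwellGB20x => 0x00ad_00bc,
        }
    }

    /// FSP secure boot timeout (hypothetical method name).
    const fn fsp_secure_boot_timeout(self) -> Duration {
        match self {
            Architecture::BlackwellGB20x => Duration::from_millis(5000),
            Architecture::Hopper | Architecture::BlackwellGB10x => {
                Duration::from_millis(4000)
            }
        }
    }
}
```

Because both matches are exhaustive, adding a new variant is a compile error until each per-architecture value is spelled out, which is the kind of grouping mistake the split brought to light.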
> 
> Changes in v8:
> 
> * Added Clone/Copy derives to Spec and Revision. Removed the
>   unnecessary pin_init_scope wrapping in Gpu::new() that the lack of
>   Copy had forced. Added a Spec::chipset() accessor.
> 
> * Removed an implementation-detail sentence from the
>   Architecture::dma_mask() doc comment.
> 
> * Simplified the GPU HAL to two variants (Tu102, Gh100) instead of
>   four. Renamed "Fsp" to "Gh100" to follow the HAL naming convention.
>   Removed the spurious GA100 special case. Moved the GFW_BOOT wait into
>   the HAL method itself instead of returning a bool.
> 
> * Increased the GFW_BOOT wait timeout from 4 seconds to 30 seconds,
>   after Joel found that a different Blackwell SKU required extra time.
> 
> * Removed stray Cc lines from each patch.
> 
> * Fixed rustfmt issues in gsp/fw.rs and gsp/boot.rs reported by the
>   kernel test robot against v7 patches 27 and 31.
> 
> Changes in v7:
> 
> * Rebased onto Alexandre Courbot's Rust register!() series in
>   drm-rust-next, including the related generic I/O accessor and
>   IoCapable changes.
> 
> * Rebased onto drm-rust-next (v7.0-rc4 based).
> 
> * Dropped the v6 patches that are already in drm-rust-next: the
>   aux-device fix, the pdev helper macro patch, and the one-item-per-line
>   use cleanup.
> 
> * Reworked the GPU init pieces per review. DMA mask setup now stays in
>   driver probe, with the mask width selected by GPU architecture, and
>   the GFW boot policy now lives in a dedicated GPU HAL.
> 
> * Reworked firmware image parsing per review around a single ElfFormat
>   trait with associated header types. Also added support for both ELF32
>   and ELF64 images, with automatic format detection.
> 
> * Reworked the MCTP/NVDM protocol code to use bitfield! and typed
>   accessors, removing the open-coded bit handling.
> 
> * Reworked the FSP messaging part of the series so that the message
>   structures are introduced in the first patches that use them, instead
>   of as a standalone dead-code-only patch. Also changed fmc_full to use
>   KVec<u8> from the start.
> 
> * Split the WPR heap overflow handling out into a separate prep patch.
>   That patch makes management_overhead() and wpr_heap_size() fallible,
>   uses checked arithmetic, and leaves the larger WPR2 heap patch with
>   only the Hopper and Blackwell sizing changes.
> 
> * Added a code comment documenting the Hopper and Blackwell PCI config
>   mirror base change.
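
The ELF32/ELF64 auto-detection mentioned in the v7 notes above can be sketched as follows. This is a minimal standalone example, not the series' ElfFormat trait: ElfClass and detect_elf_class() are hypothetical names. The only facts relied on are from the ELF format itself: the 4-byte magic, and the EI_CLASS byte at offset 4 (1 for 32-bit, 2 for 64-bit).

```rust
/// Word size of an ELF image (hypothetical type for this sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ElfClass {
    Elf32,
    Elf64,
}

/// ELF magic: 0x7f followed by "ELF".
const ELF_MAGIC: [u8; 4] = [0x7f, b'E', b'L', b'F'];
/// Offset of the class byte within e_ident.
const EI_CLASS: usize = 4;

/// Returns the ELF class of `image`, or `None` if the image is too short,
/// lacks the ELF magic, or carries an unknown class byte.
fn detect_elf_class(image: &[u8]) -> Option<ElfClass> {
    if !image.starts_with(&ELF_MAGIC) {
        return None;
    }
    match *image.get(EI_CLASS)? {
        1 => Some(ElfClass::Elf32),
        2 => Some(ElfClass::Elf64),
        _ => None,
    }
}
```

A caller would then dispatch to the 32-bit or 64-bit header layout based on the returned class, and treat None as a malformed image.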
> 
> Changes in v6:
> 
> * Rebased onto drm-rust-next (v7.0-rc1 based).
> 
> * Dropped the first two patches from v5 (aux device fix and pdev
>   macros), which have since been merged independently.
> 
> * const_align_up(): reworked per review from Gary Guo, Miguel Ojeda,
>   and Danilo Krummrich: now returns Option<usize> instead of panicking,
>   takes an Alignment argument instead of a const generic, and no longer
>   needs the inline_const feature addition in scripts/Makefile.build.
> 
> * The rust/sizes and SZ_*_U64 patches from v5 are no longer included.
>   I plan to post those as a separate series that depends on this one.
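
The reworked const_align_up() shape described in the v6 notes above (an alignment argument, and an Option<usize> result instead of a panic) might look roughly like this. It is a userspace sketch under stated assumptions: the Alignment newtype here is a hypothetical stand-in for the kernel's type, and the body is not copied from the patch.

```rust
/// Power-of-two alignment (hypothetical stand-in for the kernel type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Alignment(usize);

impl Alignment {
    /// Returns `None` unless `value` is a power of two.
    const fn new(value: usize) -> Option<Self> {
        if value.is_power_of_two() {
            Some(Self(value))
        } else {
            None
        }
    }

    const fn as_usize(self) -> usize {
        self.0
    }
}

/// Rounds `value` up to the next multiple of `align`.
/// Returns `None` on overflow instead of panicking.
const fn const_align_up(value: usize, align: Alignment) -> Option<usize> {
    // `align` is a non-zero power of two, so `mask` has only low bits set.
    let mask = align.as_usize() - 1;
    // `?` is not usable in const fn, hence the explicit match.
    match value.checked_add(mask) {
        Some(sum) => Some(sum & !mask),
        None => None,
    }
}
```

Taking the alignment as a value rather than a const generic keeps the function usable in const contexts without inline_const, and the Option return lets callers decide how to handle overflow.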
> 
> Changes in v5:
> 
> * Rebased onto linux.git master.
> 
> * Split MCTP protocol into its own module and file.
> 
> * Many Rust idiom improvements, especially more use of types and
>   wider use of Result and Option.
> 
> * Cleaned up comments, print output, and error handling.
> 
> * Added const_align_up() to rust/ and used it in nova-core. This
>   required enabling a Rust feature: inline_const, as recommended by
>   Miguel Ojeda.
> 
> * Refactored several things, such as making Gpu::new() own Spec
>   creation.
> 
> * Fixed three Delta::ZERO busy-polls (patches 21, 24, 31) to use
>   non-zero sleep intervals (after just realizing that it was a bad
>   choice to have zero in there).
> 
> * Reduced GH100/GB100 HAL duplication. Made FSP_PKEY_SIZE/FSP_SIG_SIZE
>   consistent across patches. Replaced fragile architecture checks with
>   chipset.arch(). Renamed LIBOS_BLACKWELL.
> 
> * Narrowed the scope of some of the #![expect(dead_code)] cases,
>   although that really only matters within the series, not once it is
>   fully applied.
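
The Delta::ZERO fix in the v5 notes above comes down to sleeping for a non-zero interval between polls instead of spinning. A userspace sketch of that pattern, assuming std's Duration and thread::sleep as stand-ins for the kernel's time types and poll helpers; poll_until() is a hypothetical name:

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Calls `read` until `done` accepts the value or `timeout` elapses,
/// sleeping `interval` between attempts. A zero `interval` would turn
/// this into a busy-poll, which is what the fix avoids.
fn poll_until<T>(
    mut read: impl FnMut() -> T,
    done: impl Fn(&T) -> bool,
    interval: Duration,
    timeout: Duration,
) -> Option<T> {
    let deadline = Instant::now() + timeout;
    loop {
        let value = read();
        if done(&value) {
            return Some(value);
        }
        if Instant::now() >= deadline {
            return None;
        }
        sleep(interval);
    }
}
```

A caller waiting on a status register would pass a small interval (for example one millisecond) and the overall timeout, and treat None as a timeout error.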
> 
> [1] https://github.com/Gnurou/linux/commits/drm-rust-next-staging/
> [2] https://lore.kernel.org/20260411024118.471294-1-jhubbard@nvidia.com
> 
> Alexandre Courbot (1):
>   gpu: nova-core: gsp: enable FSP boot path
> 
> Eliot Courtney (1):
>   gpu: nova-core: add non-sec2 unload path
> 
> John Hubbard (20):
>   gpu: nova-core: set DMA mask width based on GPU architecture
>   gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror
>   gpu: nova-core: Blackwell: compute PMU-reserved framebuffer size
>   gpu: nova-core: Hopper/Blackwell: larger non-WPR heap
>   gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap
>   gpu: nova-core: Blackwell: use correct sysmem flush registers
>   gpu: nova-core: don't assume 64-bit firmware images
>   gpu: nova-core: add support for 32-bit firmware images
>   gpu: nova-core: add auto-detection of 32-bit, 64-bit firmware images
>   gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub
>   gpu: nova-core: Hopper/Blackwell: add FMC firmware image
>   gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion
>     waiting
>   gpu: nova-core: Hopper/Blackwell: add FMC signature extraction
>   gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations
>   gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure
>   gpu: nova-core: add MCTP/NVDM protocol types for firmware
>     communication
>   gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging
>   gpu: nova-core: Hopper/Blackwell: add FspCotVersion type
>   gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot
>   gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling
> 
>  drivers/gpu/nova-core/driver.rs               |  15 -
>  drivers/gpu/nova-core/falcon.rs               |   1 +
>  drivers/gpu/nova-core/falcon/fsp.rs           | 202 +++++++++++
>  drivers/gpu/nova-core/fb.rs                   |   8 +-
>  drivers/gpu/nova-core/fb/hal.rs               |  28 +-
>  drivers/gpu/nova-core/fb/hal/ga100.rs         |   5 +
>  drivers/gpu/nova-core/fb/hal/ga102.rs         |   7 +-
>  drivers/gpu/nova-core/fb/hal/gb100.rs         | 102 ++++++
>  drivers/gpu/nova-core/fb/hal/gb202.rs         |  86 +++++
>  drivers/gpu/nova-core/fb/hal/gh100.rs         |  50 +++
>  drivers/gpu/nova-core/fb/hal/tu102.rs         |   9 +
>  drivers/gpu/nova-core/firmware.rs             | 176 +++++++--
>  drivers/gpu/nova-core/firmware/fsp.rs         | 129 +++++++
>  drivers/gpu/nova-core/firmware/gsp.rs         |   4 +-
>  drivers/gpu/nova-core/fsp.rs                  | 334 ++++++++++++++++++
>  drivers/gpu/nova-core/fsp/hal.rs              |  27 ++
>  drivers/gpu/nova-core/fsp/hal/gb202.rs        |  23 ++
>  drivers/gpu/nova-core/fsp/hal/gh100.rs        |  23 ++
>  drivers/gpu/nova-core/gpu.rs                  |  34 +-
>  drivers/gpu/nova-core/gpu/hal.rs              |  13 +-
>  drivers/gpu/nova-core/gpu/hal/gh100.rs        |  18 +-
>  drivers/gpu/nova-core/gpu/hal/tu102.rs        |  14 +
>  drivers/gpu/nova-core/gsp.rs                  |   1 +
>  drivers/gpu/nova-core/gsp/boot.rs             |   2 +-
>  drivers/gpu/nova-core/gsp/commands.rs         |   8 +-
>  drivers/gpu/nova-core/gsp/fw.rs               |  85 ++++-
>  drivers/gpu/nova-core/gsp/fw/commands.rs      |  15 +-
>  .../gpu/nova-core/gsp/fw/r570_144/bindings.rs |  83 +++++
>  drivers/gpu/nova-core/gsp/hal/gh100.rs        | 166 ++++++++-
>  drivers/gpu/nova-core/mctp.rs                 | 100 ++++++
>  drivers/gpu/nova-core/nova_core.rs            |   2 +
>  drivers/gpu/nova-core/regs.rs                 | 111 ++++++
>  32 files changed, 1800 insertions(+), 81 deletions(-)
>  create mode 100644 drivers/gpu/nova-core/falcon/fsp.rs
>  create mode 100644 drivers/gpu/nova-core/fb/hal/gb100.rs
>  create mode 100644 drivers/gpu/nova-core/fb/hal/gb202.rs
>  create mode 100644 drivers/gpu/nova-core/fb/hal/gh100.rs
>  create mode 100644 drivers/gpu/nova-core/firmware/fsp.rs
>  create mode 100644 drivers/gpu/nova-core/fsp.rs
>  create mode 100644 drivers/gpu/nova-core/fsp/hal.rs
>  create mode 100644 drivers/gpu/nova-core/fsp/hal/gb202.rs
>  create mode 100644 drivers/gpu/nova-core/fsp/hal/gh100.rs
>  create mode 100644 drivers/gpu/nova-core/mctp.rs
> 
> 
> base-commit: 2cfcf9dfb48e932d46c3fa9ae99f1607d1a80162



end of thread, other threads:[~2026-06-01 18:23 UTC | newest]

Thread overview: 56+ messages
2026-05-30  3:09 [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
2026-05-30  3:09 ` [PATCH v11 01/22] gpu: nova-core: set DMA mask width based on GPU architecture John Hubbard
2026-06-01  4:01   ` Eliot Courtney
2026-05-30  3:09 ` [PATCH v11 02/22] gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror John Hubbard
2026-06-01  4:04   ` Eliot Courtney
2026-05-30  3:09 ` [PATCH v11 03/22] gpu: nova-core: Blackwell: compute PMU-reserved framebuffer size John Hubbard
2026-06-01  2:07   ` Alexandre Courbot
2026-06-01  5:34     ` Alexandre Courbot
2026-06-01 18:01       ` John Hubbard
2026-06-01  4:41   ` Eliot Courtney
2026-05-30  3:09 ` [PATCH v11 04/22] gpu: nova-core: Hopper/Blackwell: larger non-WPR heap John Hubbard
2026-06-01  2:24   ` Alexandre Courbot
2026-06-01 18:03     ` John Hubbard
2026-06-01  5:01   ` Eliot Courtney
2026-05-30  3:09 ` [PATCH v11 05/22] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap John Hubbard
2026-06-01  5:21   ` Eliot Courtney
2026-05-30  3:09 ` [PATCH v11 06/22] gpu: nova-core: Blackwell: use correct sysmem flush registers John Hubbard
2026-06-01  7:01   ` Alexandre Courbot
2026-06-01 18:16     ` John Hubbard
2026-06-01  7:33   ` Eliot Courtney
2026-06-01 13:13     ` Alexandre Courbot
2026-06-01 18:09       ` John Hubbard
2026-05-30  3:09 ` [PATCH v11 07/22] gpu: nova-core: don't assume 64-bit firmware images John Hubbard
2026-06-01  6:36   ` Eliot Courtney
2026-05-30  3:09 ` [PATCH v11 08/22] gpu: nova-core: add support for 32-bit " John Hubbard
2026-06-01  6:37   ` Eliot Courtney
2026-05-30  3:09 ` [PATCH v11 09/22] gpu: nova-core: add auto-detection of 32-bit, 64-bit " John Hubbard
2026-06-01  6:49   ` Eliot Courtney
2026-05-30  3:09 ` [PATCH v11 10/22] gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub John Hubbard
2026-06-01  7:47   ` Eliot Courtney
2026-06-01 16:10   ` Timur Tabi
2026-06-01 18:17     ` John Hubbard
2026-05-30  3:09 ` [PATCH v11 11/22] gpu: nova-core: Hopper/Blackwell: add FMC firmware image John Hubbard
2026-06-01  8:38   ` Eliot Courtney
2026-05-30  3:09 ` [PATCH v11 12/22] gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting John Hubbard
2026-06-01  7:48   ` Alexandre Courbot
2026-06-01  8:32     ` Eliot Courtney
2026-06-01 13:07       ` Alexandre Courbot
2026-06-01 18:18         ` John Hubbard
2026-05-30  3:09 ` [PATCH v11 13/22] gpu: nova-core: Hopper/Blackwell: add FMC signature extraction John Hubbard
2026-06-01  8:55   ` Eliot Courtney
2026-06-01 14:45   ` Alexandre Courbot
2026-06-01 14:49     ` Alexandre Courbot
2026-06-01 18:21       ` John Hubbard
2026-05-30  3:09 ` [PATCH v11 14/22] gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations John Hubbard
2026-05-30  3:09 ` [PATCH v11 15/22] gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure John Hubbard
2026-05-30  3:09 ` [PATCH v11 16/22] gpu: nova-core: add MCTP/NVDM protocol types for firmware communication John Hubbard
2026-05-30  3:09 ` [PATCH v11 17/22] gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging John Hubbard
2026-05-30  3:09 ` [PATCH v11 18/22] gpu: nova-core: Hopper/Blackwell: add FspCotVersion type John Hubbard
2026-06-01 14:07   ` Alexandre Courbot
2026-06-01 18:23     ` John Hubbard
2026-05-30  3:09 ` [PATCH v11 19/22] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot John Hubbard
2026-05-30  3:09 ` [PATCH v11 20/22] gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling John Hubbard
2026-05-30  3:09 ` [PATCH v11 21/22] gpu: nova-core: add non-sec2 unload path John Hubbard
2026-05-30  3:09 ` [PATCH v11 22/22] gpu: nova-core: gsp: enable FSP boot path John Hubbard
2026-05-30  3:21 ` [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
