[PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support

From: John Hubbard <jhubbard@nvidia.com>
To: Danilo Krummrich <dakr@kernel.org>,
	Alexandre Courbot <acourbot@nvidia.com>
Cc: "Joel Fernandes" <joelagnelf@nvidia.com>,
	"Timur Tabi" <ttabi@nvidia.com>,
	"Alistair Popple" <apopple@nvidia.com>,
	"Eliot Courtney" <ecourtney@nvidia.com>,
	"Zhi Wang" <zhiw@nvidia.com>, "David Airlie" <airlied@gmail.com>,
	"Simona Vetter" <simona@ffwll.ch>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Miguel Ojeda" <ojeda@kernel.org>,
	"Alex Gaynor" <alex.gaynor@gmail.com>,
	"Boqun Feng" <boqun.feng@gmail.com>,
	"Gary Guo" <gary@garyguo.net>,
	"Björn Roy Baron" <bjorn3_gh@protonmail.com>,
	"Benno Lossin" <lossin@kernel.org>,
	"Andreas Hindborg" <a.hindborg@kernel.org>,
	"Alice Ryhl" <aliceryhl@google.com>,
	"Trevor Gross" <tmgross@umich.edu>,
	nouveau@lists.freedesktop.org, rust-for-linux@vger.kernel.org,
	LKML <linux-kernel@vger.kernel.org>,
	"John Hubbard" <jhubbard@nvidia.com>
Subject: [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support
Date: Mon,  9 Feb 2026 18:45:27 -0800
Message-ID: <20260210024601.593248-1-jhubbard@nvidia.com>

Hi,

This is based on the Feb 5, 2026 linux-next: commit 9845cf73f7db ("Add
linux-next specific files for 20260205"). That's new enough to have the
pdev.as_ref() changes (see below for details), but not so new as to
include the current merge window churn for Linux 7.0.

I've re-tested on Ampere (GA104) and Blackwell (GB202) RTX GPUs.

Data center GPUs remain as TODO items: GA100 needs some additional code,
Hopper/GH100 might work but is not yet tested, and I haven't even
thought about Blackwell data center GPUs.

So, even though many patches say Hopper/Blackwell, there may be some
test-and-fix work remaining there.

Changes in v4:

* Fixed the IOMMU page faults on address 0x0 that I was seeing on v3 and
  earlier, for the iommu enabled case. These were due to the sysmem
  flush buffer being in a different location for Blackwell, so I've
  HAL-ified that aspect.

* Added a patch (0001) to pass pdev directly to dev_* logging macros.
  Then converted the remaining patches to also use pdev directly,
  instead of pdev.as_ref(). This is only possible in branches that have
  commit a38cd1fea989 ("rust: device: support `dev_printk` on all
  devices"), which in turn is why this v4 is based on a linux-next
  commit.

* Changed FmcSignatures fields from [u32; N] to [u8; N] arrays because
  the data is not treated as 32-bit integers. This eliminates the need
  for .as_bytes_mut() in the FMC signature extraction patch and allows
  using named constants like [u8; FSP_HASH_SIZE]. (From Timur Tabi's
  review.)

* Changed .unwrap_or(u64::MAX) to .expect("...") for alignment overflow
  in client_alloc_size() and management_overhead(). A panic is warranted
  here since the values are compile-time constants and overflow is
  impossible. (From Timur Tabi's review.)

* Added a patch at the end that I actually expect will get merged
  earlier, separately. But for now, it avoids nova-drm aux bus
  registration failure on multi-GPU systems, which in turn keeps the
  driver alive, which in turn avoids a driver teardown missing feature
  (pre-existing), which in turn avoids IOMMU page faults at non-zero
  addresses. whew. :)
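
To illustrate the .unwrap_or(u64::MAX) vs. .expect("...") item above,
here is a minimal standalone sketch. It is not the code from the series:
HEAP_ALIGN, checked_align_up(), alloc_size_saturating() and
alloc_size_checked() are hypothetical names, and the kernel's own
alignment helpers are not used here.

```rust
/// Hypothetical alignment, for illustration only (not from the series).
const HEAP_ALIGN: u64 = 1 << 20; // 1 MiB

/// Rounds `size` up to the next multiple of `align` (which must be a
/// power of two). Returns `None` if the rounding would overflow a u64.
const fn checked_align_up(size: u64, align: u64) -> Option<u64> {
    match size.checked_add(align - 1) {
        Some(v) => Some(v & !(align - 1)),
        None => None,
    }
}

/// Old style: an overflow silently turns into a huge (but valid-looking)
/// size that the caller may go on to use.
fn alloc_size_saturating(size: u64) -> u64 {
    checked_align_up(size, HEAP_ALIGN).unwrap_or(u64::MAX)
}

/// New style: an overflow is treated as a programming error and panics
/// with a message, instead of producing a bogus size.
fn alloc_size_checked(size: u64) -> u64 {
    checked_align_up(size, HEAP_ALIGN).expect("aligned size overflows u64")
}
```

With inputs that are compile-time constants, the two variants return the
same value; the difference only shows up in the (impossible) overflow
case, where the second one fails loudly.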

Changes in v3:

* Rebased onto linux-next (20260205), which includes several
  rust-for-linux updates that affected nova-core.

* Removed redundant .as_ref() from dev_*!() macro call sites, since the
  dev_printk!() macro now calls .as_ref() internally (Gary Guo's
  "remove redundant .as_ref() for dev_* print" series).

* Added a `use kernel::io::Io` import in regs.rs, needed after the
  upstream separation of generic I/O helpers from the MMIO
  implementation.

Changes in v2:

v2 is here:
    https://lore.kernel.org/20260131005604.454172-1-jhubbard@nvidia.com

* GA100 (an Ampere chip whose firmware boot steps are closer to Turing
  than to other Amperes) returns ENOTSUPP for now because it is *known*
  to not work yet.

* FSP: use the new Chipset::fsp_cot_version() method instead of a
  hardcoded constant. This fixes a known wrongness on GH100.

* Changed to a HAL approach to handle the slightly different non-WPR
  heap sizes, for Hopper vs. Blackwell.

* Return Option instead of Result from get_gsp_sigs_section() since
  the failure case is simply "not found".

* Return DmaMask directly from dma_mask() instead of returning a bit
  count.

* Change fmc_full from DmaObject to KVec<u8> since it's only used for
  CPU-side signature extraction and is never submitted to hardware
  (only fmc_image is). This eliminates the need for unsafe code and
  the associated SAFETY comment entirely.

* Use as_bytes_mut() instead of unsafe core::slice::from_raw_parts_mut()
  for copying FMC signature data (hash, public_key, signature arrays).

* Refactor wait_for_gsp_lockdown_release() to use early return with ?
  instead of chained .inspect_err().map().and_then() pattern.

* Removed many dev_dbg! statements.

* Use IEC binary prefix "MiB" instead of "MB" for memory size output.
  Also improved display of small sizes (e.g., "24 KiB" instead of
  "0 MB") and fixed a typo ("suprising" -> "surprising").

* Reordered the "skip GFW boot waiting" commit to appear earlier in the
  series.

* Series has been reduced from 31 to 30 patches, because the "needs
  large reserved mem" patch was absorbed into the non-WPR heap size
  patch.
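
As a rough illustration of the "MiB"/"KiB" output item above, the sketch
below picks the largest IEC unit that does not round the value down to
zero. It is a userspace approximation, not the driver code: the Size
wrapper is a hypothetical name, and it uses std::fmt where kernel code
would use core::fmt.

```rust
use std::fmt;

/// Hypothetical wrapper around a size in bytes, for display purposes.
struct Size(u64);

impl fmt::Display for Size {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        const KIB: u64 = 1 << 10;
        const MIB: u64 = 1 << 20;

        // Choose the largest unit that is not bigger than the value, so
        // that small sizes are not printed as zero. Integer division
        // truncates any remainder.
        match self.0 {
            n if n >= MIB => write!(f, "{} MiB", n / MIB),
            n if n >= KIB => write!(f, "{} KiB", n / KIB),
            n => write!(f, "{} B", n),
        }
    }
}
```

For example, Size(24 * 1024) formats as "24 KiB", where a MB-only
formatter using integer division would have printed "0 MB".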

John Hubbard (33):
  gpu: nova-core: pass pdev directly to dev_* logging macros
  gpu: nova-core: print FB sizes, along with ranges
  gpu: nova-core: add FbRange.len() and use it in boot.rs
  gpu: nova-core: Hopper/Blackwell: basic GPU identification
  gpu: nova-core: factor .fwsignature* selection into a new
    get_gsp_sigs_section()
  gpu: nova-core: use GPU Architecture to simplify HAL selections
  gpu: nova-core: apply the one "use" item per line policy to
    commands.rs
  gpu: nova-core: set DMA mask width based on GPU architecture
  gpu: nova-core: Hopper/Blackwell: skip GFW boot waiting
  gpu: nova-core: move firmware image parsing code to firmware.rs
  gpu: nova-core: factor out a section_name_eq() function
  gpu: nova-core: don't assume 64-bit firmware images
  gpu: nova-core: add support for 32-bit firmware images
  gpu: nova-core: add auto-detection of 32-bit, 64-bit firmware images
  gpu: nova-core: Hopper/Blackwell: add FMC firmware image, in support
    of FSP
  gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub
  gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations
  gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure
  gpu: nova-core: Hopper/Blackwell: calculate reserved FB heap size
  gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion
    waiting
  gpu: nova-core: Hopper/Blackwell: add FSP message structures
  gpu: nova-core: Hopper/Blackwell: add FMC signature extraction
  gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging
  gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot
  gpu: nova-core: Hopper/Blackwell: larger non-WPR heap
  gpu: nova-core: Blackwell: use correct sysmem flush registers
  gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap
  gpu: nova-core: refactor SEC2 booter loading into run_booter() helper
  gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling
  gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot path
  gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror
  gpu: nova-core: clarify the GPU firmware boot steps
  gpu: nova-core: fix aux device registration for multi-GPU systems

 drivers/gpu/nova-core/driver.rs          |  48 +-
 drivers/gpu/nova-core/falcon.rs          |   1 +
 drivers/gpu/nova-core/falcon/fsp.rs      | 160 +++++++
 drivers/gpu/nova-core/falcon/hal.rs      |  20 +-
 drivers/gpu/nova-core/fb.rs              | 118 ++++-
 drivers/gpu/nova-core/fb/hal.rs          |  34 +-
 drivers/gpu/nova-core/fb/hal/ga102.rs    |   2 +-
 drivers/gpu/nova-core/fb/hal/gb100.rs    |  73 +++
 drivers/gpu/nova-core/fb/hal/gb202.rs    |  62 +++
 drivers/gpu/nova-core/fb/hal/gh100.rs    |  37 ++
 drivers/gpu/nova-core/firmware.rs        | 186 ++++++++
 drivers/gpu/nova-core/firmware/fsp.rs    |  47 ++
 drivers/gpu/nova-core/firmware/gsp.rs    | 140 ++----
 drivers/gpu/nova-core/fsp.rs             | 561 +++++++++++++++++++++++
 drivers/gpu/nova-core/gpu.rs             |  87 +++-
 drivers/gpu/nova-core/gsp/boot.rs        | 337 +++++++++++---
 drivers/gpu/nova-core/gsp/commands.rs    |   8 +-
 drivers/gpu/nova-core/gsp/fw.rs          |  63 ++-
 drivers/gpu/nova-core/gsp/fw/commands.rs |  32 +-
 drivers/gpu/nova-core/nova_core.rs       |   1 +
 drivers/gpu/nova-core/num.rs             |  10 +
 drivers/gpu/nova-core/regs.rs            |  95 ++++
 22 files changed, 1856 insertions(+), 266 deletions(-)
 create mode 100644 drivers/gpu/nova-core/falcon/fsp.rs
 create mode 100644 drivers/gpu/nova-core/fb/hal/gb100.rs
 create mode 100644 drivers/gpu/nova-core/fb/hal/gb202.rs
 create mode 100644 drivers/gpu/nova-core/fb/hal/gh100.rs
 create mode 100644 drivers/gpu/nova-core/firmware/fsp.rs
 create mode 100644 drivers/gpu/nova-core/fsp.rs

base-commit: 9845cf73f7db6094c0d8419d6adb848028f4a921
-- 
2.53.0

Thread overview: 66+ messages
2026-02-10  2:45 John Hubbard [this message]
2026-02-10  2:45 ` [PATCH v4 01/33] gpu: nova-core: pass pdev directly to dev_* logging macros John Hubbard
2026-02-11 10:06   ` Danilo Krummrich
2026-02-11 18:48     ` John Hubbard
2026-02-10  2:45 ` [PATCH v4 02/33] gpu: nova-core: print FB sizes, along with ranges John Hubbard
2026-02-10  2:45 ` [PATCH v4 03/33] gpu: nova-core: add FbRange.len() and use it in boot.rs John Hubbard
2026-02-10  2:45 ` [PATCH v4 04/33] gpu: nova-core: Hopper/Blackwell: basic GPU identification John Hubbard
2026-02-10  2:45 ` [PATCH v4 05/33] gpu: nova-core: factor .fwsignature* selection into a new get_gsp_sigs_section() John Hubbard
2026-02-11 10:16   ` Danilo Krummrich
2026-02-12  0:39     ` John Hubbard
2026-02-10  2:45 ` [PATCH v4 06/33] gpu: nova-core: use GPU Architecture to simplify HAL selections John Hubbard
2026-02-10  2:45 ` [PATCH v4 07/33] gpu: nova-core: apply the one "use" item per line policy to commands.rs John Hubbard
2026-02-10  2:45 ` [PATCH v4 08/33] gpu: nova-core: set DMA mask width based on GPU architecture John Hubbard
2026-02-11 10:28   ` Danilo Krummrich
2026-02-12  2:06     ` John Hubbard
2026-02-10  2:45 ` [PATCH v4 09/33] gpu: nova-core: Hopper/Blackwell: skip GFW boot waiting John Hubbard
2026-02-11 10:09   ` Danilo Krummrich
2026-02-12  1:49     ` John Hubbard
2026-02-10  2:45 ` [PATCH v4 10/33] gpu: nova-core: move firmware image parsing code to firmware.rs John Hubbard
2026-02-10  2:45 ` [PATCH v4 11/33] gpu: nova-core: factor out a section_name_eq() function John Hubbard
2026-02-10  2:45 ` [PATCH v4 12/33] gpu: nova-core: don't assume 64-bit firmware images John Hubbard
2026-02-10  2:45 ` [PATCH v4 13/33] gpu: nova-core: add support for 32-bit " John Hubbard
2026-02-10  2:45 ` [PATCH v4 14/33] gpu: nova-core: add auto-detection of 32-bit, 64-bit " John Hubbard
2026-02-10  2:45 ` [PATCH v4 15/33] gpu: nova-core: Hopper/Blackwell: add FMC firmware image, in support of FSP John Hubbard
2026-02-10  2:45 ` [PATCH v4 16/33] gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub John Hubbard
2026-02-10  2:45 ` [PATCH v4 17/33] gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations John Hubbard
2026-02-11 10:57   ` Danilo Krummrich
2026-02-12  2:09     ` John Hubbard
2026-02-17 15:43       ` Danilo Krummrich
2026-02-19  2:54         ` John Hubbard
2026-02-10  2:45 ` [PATCH v4 18/33] gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure John Hubbard
2026-02-17 16:28   ` Danilo Krummrich
2026-02-20 22:05     ` Tegra notes for Nova: " John Hubbard
2026-02-23  3:36       ` Alexandre Courbot
2026-02-10  2:45 ` [PATCH v4 19/33] gpu: nova-core: Hopper/Blackwell: calculate reserved FB heap size John Hubbard
2026-02-17 16:39   ` Danilo Krummrich
2026-02-19  3:01     ` John Hubbard
2026-02-19  9:01       ` Miguel Ojeda
2026-02-20 22:08         ` John Hubbard
2026-02-10  2:45 ` [PATCH v4 20/33] gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting John Hubbard
2026-02-17 17:13   ` Danilo Krummrich
2026-02-20 23:26     ` John Hubbard
2026-02-10  2:45 ` [PATCH v4 21/33] gpu: nova-core: Hopper/Blackwell: add FSP message structures John Hubbard
2026-02-10  2:45 ` [PATCH v4 22/33] gpu: nova-core: Hopper/Blackwell: add FMC signature extraction John Hubbard
2026-02-10  2:45 ` [PATCH v4 23/33] gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging John Hubbard
2026-02-10  2:45 ` [PATCH v4 24/33] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot John Hubbard
2026-02-17 18:16   ` Danilo Krummrich
2026-02-20 23:35     ` John Hubbard
2026-02-10  2:45 ` [PATCH v4 25/33] gpu: nova-core: Hopper/Blackwell: larger non-WPR heap John Hubbard
2026-02-17 20:04   ` Danilo Krummrich
2026-02-20 23:57     ` John Hubbard
2026-02-10  2:45 ` [PATCH v4 26/33] gpu: nova-core: Blackwell: use correct sysmem flush registers John Hubbard
2026-02-10  2:45 ` [PATCH v4 27/33] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap John Hubbard
2026-02-17 20:10   ` Danilo Krummrich
2026-02-21  1:01     ` John Hubbard
2026-02-10  2:45 ` [PATCH v4 28/33] gpu: nova-core: refactor SEC2 booter loading into run_booter() helper John Hubbard
2026-02-17 20:12   ` Danilo Krummrich
2026-02-21  1:03     ` John Hubbard
2026-02-10  2:45 ` [PATCH v4 29/33] gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling John Hubbard
2026-02-17 20:20   ` Danilo Krummrich
2026-02-21  1:06     ` John Hubbard
2026-02-10  2:45 ` [PATCH v4 30/33] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot path John Hubbard
2026-02-10  2:45 ` [PATCH v4 31/33] gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror John Hubbard
2026-02-10  2:45 ` [PATCH v4 32/33] gpu: nova-core: clarify the GPU firmware boot steps John Hubbard
2026-02-10  2:46 ` [PATCH v4 33/33] gpu: nova-core: fix aux device registration for multi-GPU systems John Hubbard
2026-02-10 22:27 ` [PATCH v4 00/33] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
