rust-for-linux.vger.kernel.org archive mirror
* [PATCH v3 00/19] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization
@ 2025-05-07 13:52 Alexandre Courbot
  2025-05-07 13:52 ` [PATCH v3 01/19] rust: dma: expose the count and size of CoherentAllocation Alexandre Courbot
                   ` (19 more replies)
  0 siblings, 20 replies; 51+ messages in thread
From: Alexandre Courbot @ 2025-05-07 13:52 UTC
  To: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet
  Cc: John Hubbard, Ben Skeggs, Joel Fernandes, Timur Tabi,
	Alistair Popple, linux-kernel, rust-for-linux, nouveau, dri-devel,
	Alexandre Courbot, Shirish Baskaran

Hi everyone,

I have respun this series due to the introduction of
`Devres::access()` in nova-next - this addition seriously simplifies the
way we access the BAR and justifies sending a new version, although
there are also more minor changes.

While this series is still far from bringing the GPU into a state where
it can do anything useful, it sets up the basic layout of the driver
upon which we can build in order to continue with the next steps of GSP
booting, as well as supporting more chipsets.

Upon successful probe, the driver will display the range of the WPR2
region constructed by FWSEC-FRTS:

  [   95.436000] NovaCore 0000:01:00.0: WPR2: 0xffc00000-0xffce0000
  [   95.436002] NovaCore 0000:01:00.0: GPU instance built

This code is based on nova-next.

Some bits of documentation are still missing; these are addressed by
Joel in his own documentation patch series [1]. I'll also double-check
and send follow-up patches if anything is still missing after that.

I have also tried to look at ways to split the patch adding falcon
support, but couldn't find any that would not be awkward. Starting
review from `falcon.rs` and going down to the HAL should be the logical
order for a smooth review.

[1] https://lore.kernel.org/rust-for-linux/20250503040802.1411285-1-joelagnelf@nvidia.com/

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
Changes in v3:
- Rebased on top of latest nova-next.
- Use the new Devres::access() and remove the now unneeded with_bar!()
  macro.
- Dropped `rust: devres: allow to borrow a reference to the resource's
  Device` as it is not needed anymore.
- Fixed more erroneous uses of `ERANGE` error.
- Optimized alignment computations of the FB layout a bit.
- Link to v2: https://lore.kernel.org/r/20250501-nova-frts-v2-0-b4a137175337@nvidia.com

Changes in v2:
- Rebased on latest nova-next.
- Fixed all clippy warnings.
- Added `count` and `size` methods to `CoherentAllocation`.
- Added method to obtain a reference to the `Device` from a `Devres`
  (this is super convenient).
- Split `DmaObject` into its own patch and added `Deref` implementation.
- Squashed field names from [3] into "extract FWSEC from BIOS".
- Fixed erroneous use of `ERANGE` error.
- Reworked `register!()` macro towards a more intuitive syntax, moved
  its helper macros into internal rules to avoid polluting the macro
  namespace.
- Renamed all registers to capital snake case to better match OpenRM.
- Removed declarations for registers that are not used yet.
- Added more documentation for items not covered by Joel's documentation
  patches.
- Removed timer device and replaced it with a helper function using
  `Ktime`. This also made [4] unneeded so it is dropped.
- Unregister the sysmem flush page upon device destruction.
- ... probably more that I forgot. >_<
- Link to v1: https://lore.kernel.org/r/20250420-nova-frts-v1-0-ecd1cca23963@nvidia.com

[3] https://lore.kernel.org/all/20250423225405.139613-6-joelagnelf@nvidia.com/
[4] https://lore.kernel.org/lkml/20250420-nova-frts-v1-1-ecd1cca23963@nvidia.com/

---
Alexandre Courbot (17):
      rust: dma: expose the count and size of CoherentAllocation
      gpu: nova-core: derive useful traits for Chipset
      gpu: nova-core: add missing GA100 definition
      gpu: nova-core: take bound device in Gpu::new
      gpu: nova-core: define registers layout using helper macro
      gpu: nova-core: fix layout of NV_PMC_BOOT_0
      gpu: nova-core: move Firmware to firmware module
      rust: make ETIMEDOUT error available
      gpu: nova-core: wait for GFW_BOOT completion
      gpu: nova-core: add DMA object struct
      gpu: nova-core: register sysmem flush page
      gpu: nova-core: add helper function to wait on condition
      gpu: nova-core: add falcon register definitions and base code
      gpu: nova-core: firmware: add ucode descriptor used by FWSEC-FRTS
      gpu: nova-core: compute layout of the FRTS region
      gpu: nova-core: extract FWSEC from BIOS and patch it to run FWSEC-FRTS
      gpu: nova-core: load and run FWSEC-FRTS

Joel Fernandes (2):
      rust: num: Add an upward alignment helper for usize
      nova-core: Add support for VBIOS ucode extraction for boot

 Documentation/gpu/nova/core/todo.rst      |    6 +
 drivers/gpu/nova-core/devinit.rs          |   38 +
 drivers/gpu/nova-core/dma.rs              |   57 ++
 drivers/gpu/nova-core/driver.rs           |    2 +-
 drivers/gpu/nova-core/falcon.rs           |  537 ++++++++++++++
 drivers/gpu/nova-core/falcon/gsp.rs       |   22 +
 drivers/gpu/nova-core/falcon/hal.rs       |   56 ++
 drivers/gpu/nova-core/falcon/hal/ga102.rs |  120 +++
 drivers/gpu/nova-core/falcon/sec2.rs      |    8 +
 drivers/gpu/nova-core/firmware.rs         |  105 ++-
 drivers/gpu/nova-core/firmware/fwsec.rs   |  359 +++++++++
 drivers/gpu/nova-core/gpu.rs              |  223 ++++--
 drivers/gpu/nova-core/gsp.rs              |    3 +
 drivers/gpu/nova-core/gsp/fb.rs           |  108 +++
 drivers/gpu/nova-core/nova_core.rs        |    5 +
 drivers/gpu/nova-core/regs.rs             |  259 +++++--
 drivers/gpu/nova-core/regs/macros.rs      |  380 ++++++++++
 drivers/gpu/nova-core/util.rs             |   29 +
 drivers/gpu/nova-core/vbios.rs            | 1144 +++++++++++++++++++++++++++++
 rust/kernel/dma.rs                        |   14 +
 rust/kernel/error.rs                      |    1 +
 rust/kernel/lib.rs                        |    1 +
 rust/kernel/num.rs                        |   44 ++
 23 files changed, 3427 insertions(+), 94 deletions(-)
---
base-commit: b75a99e1077b12c5631fef7ac36970a89f6021f7
change-id: 20250417-nova-frts-96ef299abe2c

Best regards,
-- 
Alexandre Courbot <acourbot@nvidia.com>



* [PATCH v3 01/19] rust: dma: expose the count and size of CoherentAllocation
  2025-05-07 13:52 [PATCH v3 00/19] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization Alexandre Courbot
@ 2025-05-07 13:52 ` Alexandre Courbot
  2025-05-13 12:15   ` Danilo Krummrich
  2025-05-07 13:52 ` [PATCH v3 02/19] gpu: nova-core: derive useful traits for Chipset Alexandre Courbot
                   ` (18 subsequent siblings)
  19 siblings, 1 reply; 51+ messages in thread
From: Alexandre Courbot @ 2025-05-07 13:52 UTC
  To: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet
  Cc: John Hubbard, Ben Skeggs, Joel Fernandes, Timur Tabi,
	Alistair Popple, linux-kernel, rust-for-linux, nouveau, dri-devel,
	Alexandre Courbot

These properties are very useful to have and should be accessible.
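
To illustrate the difference between the two accessors, here is a minimal
userspace sketch. `Allocation<T>` is a hypothetical stand-in that only stores
an element count; it is not the kernel's `CoherentAllocation` and performs no
real allocation. The overflow check in `new` is an assumption standing in for
the check that `alloc_attrs` is said to perform.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

/// Hypothetical stand-in for an allocation of `count` elements of type `T`.
struct Allocation<T> {
    count: usize,
    _marker: PhantomData<T>,
}

impl<T> Allocation<T> {
    /// Returns `None` if the size in bytes would overflow `usize`, so that
    /// `size()` can multiply without re-checking.
    fn new(count: usize) -> Option<Self> {
        count.checked_mul(size_of::<T>())?;
        Some(Self { count, _marker: PhantomData })
    }

    /// Number of elements `T` in the allocation.
    fn count(&self) -> usize {
        self.count
    }

    /// Size of the allocation in bytes.
    fn size(&self) -> usize {
        // Cannot overflow: the same product was validated in `new`.
        self.count * size_of::<T>()
    }
}

fn main() {
    let alloc = Allocation::<u32>::new(16).expect("size overflow");
    println!("count = {}, size = {} bytes", alloc.count(), alloc.size());
}
```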

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 rust/kernel/dma.rs | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
index 605e01e35715667f93297fd9ec49d8e7032e0910..18602d771054fceb80c29278b1945254312ed7c6 100644
--- a/rust/kernel/dma.rs
+++ b/rust/kernel/dma.rs
@@ -201,6 +201,20 @@ pub fn alloc_coherent(
         CoherentAllocation::alloc_attrs(dev, count, gfp_flags, Attrs(0))
     }
 
+    /// Returns the number of elements `T` in this allocation.
+    ///
+    /// Note that this is not the size of the allocation in bytes, which is provided by
+    /// [`Self::size`].
+    pub fn count(&self) -> usize {
+        self.count
+    }
+
+    /// Returns the size in bytes of this allocation.
+    pub fn size(&self) -> usize {
+        // This is guaranteed not to overflow as the same operation has been done in `alloc_attrs`.
+        self.count * core::mem::size_of::<T>()
+    }
+
     /// Returns the base address to the allocated region in the CPU's virtual address space.
     pub fn start_ptr(&self) -> *const T {
         self.cpu_addr

-- 
2.49.0



* [PATCH v3 02/19] gpu: nova-core: derive useful traits for Chipset
  2025-05-07 13:52 [PATCH v3 00/19] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization Alexandre Courbot
  2025-05-07 13:52 ` [PATCH v3 01/19] rust: dma: expose the count and size of CoherentAllocation Alexandre Courbot
@ 2025-05-07 13:52 ` Alexandre Courbot
  2025-05-07 13:52 ` [PATCH v3 03/19] gpu: nova-core: add missing GA100 definition Alexandre Courbot
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 51+ messages in thread
From: Alexandre Courbot @ 2025-05-07 13:52 UTC
  To: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet
  Cc: John Hubbard, Ben Skeggs, Joel Fernandes, Timur Tabi,
	Alistair Popple, linux-kernel, rust-for-linux, nouveau, dri-devel,
	Alexandre Courbot

We will commonly need to compare chipset versions, so derive the
ordering traits to make that possible. Also derive Copy and Clone since
passing Chipset by value will be more efficient than by reference.
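
As a rough sketch of what the derived traits allow, the standalone example
below uses a reduced `Chipset` enum (only a few of the variants, with the
discriminants from gpu.rs). The helper `is_ga100_or_later` is hypothetical
and only serves to show a comparison and a by-value argument.

```rust
/// Reduced, illustrative copy of the chipset enum; not the full list.
#[allow(dead_code)]
#[derive(Debug, Copy, Clone, PartialOrd, Ord, PartialEq, Eq)]
enum Chipset {
    TU117 = 0x167,
    TU116 = 0x168,
    GA100 = 0x170,
    GA102 = 0x172,
    GA103 = 0x173,
}

/// Hypothetical helper: derived ordering follows the discriminants, so a
/// range check on chipsets is a plain comparison.
fn is_ga100_or_later(chipset: Chipset) -> bool {
    chipset >= Chipset::GA100
}

fn main() {
    let chipset = Chipset::GA102;
    // `Chipset` is `Copy`, so it can be passed by value and still used after.
    let newer = is_ga100_or_later(chipset);
    println!("{:?}: GA100 or later = {}", chipset, newer);
}
```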

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/nova-core/gpu.rs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 17c9660da45034762edaa78e372d8821144cdeb7..4de67a2dc16302c00530026156d7264cbc7e5b32 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -13,7 +13,7 @@ macro_rules! define_chipset {
     ({ $($variant:ident = $value:expr),* $(,)* }) =>
     {
         /// Enum representation of the GPU chipset.
-        #[derive(fmt::Debug)]
+        #[derive(fmt::Debug, Copy, Clone, PartialOrd, Ord, PartialEq, Eq)]
         pub(crate) enum Chipset {
             $($variant = $value),*,
         }

-- 
2.49.0



* [PATCH v3 03/19] gpu: nova-core: add missing GA100 definition
  2025-05-07 13:52 [PATCH v3 00/19] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization Alexandre Courbot
  2025-05-07 13:52 ` [PATCH v3 01/19] rust: dma: expose the count and size of CoherentAllocation Alexandre Courbot
  2025-05-07 13:52 ` [PATCH v3 02/19] gpu: nova-core: derive useful traits for Chipset Alexandre Courbot
@ 2025-05-07 13:52 ` Alexandre Courbot
  2025-05-07 13:52 ` [PATCH v3 04/19] gpu: nova-core: take bound device in Gpu::new Alexandre Courbot
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 51+ messages in thread
From: Alexandre Courbot @ 2025-05-07 13:52 UTC
  To: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet
  Cc: John Hubbard, Ben Skeggs, Joel Fernandes, Timur Tabi,
	Alistair Popple, linux-kernel, rust-for-linux, nouveau, dri-devel,
	Alexandre Courbot

linux-firmware contains a directory for GA100, and it is a defined
chipset in Nouveau.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/nova-core/gpu.rs | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 4de67a2dc16302c00530026156d7264cbc7e5b32..9fe6aedaa9563799c2624d461d4e37ee9b094909 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -54,6 +54,7 @@ fn try_from(value: u32) -> Result<Self, Self::Error> {
     TU117 = 0x167,
     TU116 = 0x168,
     // Ampere
+    GA100 = 0x170,
     GA102 = 0x172,
     GA103 = 0x173,
     GA104 = 0x174,
@@ -73,7 +74,7 @@ pub(crate) fn arch(&self) -> Architecture {
             Self::TU102 | Self::TU104 | Self::TU106 | Self::TU117 | Self::TU116 => {
                 Architecture::Turing
             }
-            Self::GA102 | Self::GA103 | Self::GA104 | Self::GA106 | Self::GA107 => {
+            Self::GA100 | Self::GA102 | Self::GA103 | Self::GA104 | Self::GA106 | Self::GA107 => {
                 Architecture::Ampere
             }
             Self::AD102 | Self::AD103 | Self::AD104 | Self::AD106 | Self::AD107 => {

-- 
2.49.0



* [PATCH v3 04/19] gpu: nova-core: take bound device in Gpu::new
  2025-05-07 13:52 [PATCH v3 00/19] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization Alexandre Courbot
                   ` (2 preceding siblings ...)
  2025-05-07 13:52 ` [PATCH v3 03/19] gpu: nova-core: add missing GA100 definition Alexandre Courbot
@ 2025-05-07 13:52 ` Alexandre Courbot
  2025-05-07 13:52 ` [PATCH v3 05/19] gpu: nova-core: define registers layout using helper macro Alexandre Courbot
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 51+ messages in thread
From: Alexandre Courbot @ 2025-05-07 13:52 UTC
  To: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet
  Cc: John Hubbard, Ben Skeggs, Joel Fernandes, Timur Tabi,
	Alistair Popple, linux-kernel, rust-for-linux, nouveau, dri-devel,
	Alexandre Courbot

We will need to perform things like allocating DMA memory during device
creation, so make sure to take the device context that will allow us to
perform these actions. This also allows us to use Devres::access to
obtain the BAR without holding an RCU lock.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/nova-core/gpu.rs | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 9fe6aedaa9563799c2624d461d4e37ee9b094909..a64a306e0ec87f96e7aab47cfe00a5c280c95a64 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -134,9 +134,8 @@ pub(crate) struct Spec {
 }
 
 impl Spec {
-    fn new(bar: &Devres<Bar0>) -> Result<Spec> {
-        let bar = bar.try_access().ok_or(ENXIO)?;
-        let boot0 = regs::Boot0::read(&bar);
+    fn new(bar: &Bar0) -> Result<Spec> {
+        let boot0 = regs::Boot0::read(bar);
 
         Ok(Self {
             chipset: boot0.chipset().try_into()?,
@@ -183,8 +182,12 @@ pub(crate) struct Gpu {
 }
 
 impl Gpu {
-    pub(crate) fn new(pdev: &pci::Device, bar: Devres<Bar0>) -> Result<impl PinInit<Self>> {
-        let spec = Spec::new(&bar)?;
+    pub(crate) fn new(
+        pdev: &pci::Device<device::Bound>,
+        devres_bar: Devres<Bar0>,
+    ) -> Result<impl PinInit<Self>> {
+        let bar = devres_bar.access(pdev.as_ref())?;
+        let spec = Spec::new(bar)?;
         let fw = Firmware::new(pdev.as_ref(), &spec, "535.113.01")?;
 
         dev_info!(
@@ -195,6 +198,10 @@ pub(crate) fn new(pdev: &pci::Device, bar: Devres<Bar0>) -> Result<impl PinInit<
             spec.revision
         );
 
-        Ok(pin_init!(Self { spec, bar, fw }))
+        Ok(pin_init!(Self {
+            spec,
+            bar: devres_bar,
+            fw
+        }))
     }
 }

-- 
2.49.0



* [PATCH v3 05/19] gpu: nova-core: define registers layout using helper macro
  2025-05-07 13:52 [PATCH v3 00/19] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization Alexandre Courbot
                   ` (3 preceding siblings ...)
  2025-05-07 13:52 ` [PATCH v3 04/19] gpu: nova-core: take bound device in Gpu::new Alexandre Courbot
@ 2025-05-07 13:52 ` Alexandre Courbot
  2025-05-07 13:52 ` [PATCH v3 06/19] gpu: nova-core: fix layout of NV_PMC_BOOT_0 Alexandre Courbot
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 51+ messages in thread
From: Alexandre Courbot @ 2025-05-07 13:52 UTC
  To: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet
  Cc: John Hubbard, Ben Skeggs, Joel Fernandes, Timur Tabi,
	Alistair Popple, linux-kernel, rust-for-linux, nouveau, dri-devel,
	Alexandre Courbot

Add the register!() macro, which defines a given register's layout and
provides bit-field accessors with a way to convert them to a given type.
This macro will allow us to make clear definitions of the registers and
manipulate their fields safely.
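
For reference, the shift-and-mask arithmetic behind each generated accessor
can be sketched in plain Rust as below. The mask expression mirrors the one
used for the `_MASK` constants in macros.rs; the free functions `field_mask`,
`get_field` and `set_field` are hypothetical names used only for this
illustration and do not exist in the patch.

```rust
/// Mask covering bits `lo..=hi` of a 32-bit register (`lo <= hi <= 31`).
/// Written so that `hi == 31` does not overflow the shift.
const fn field_mask(hi: u32, lo: u32) -> u32 {
    ((((1u32 << hi) - 1) << 1) + 1) - ((1u32 << lo) - 1)
}

/// Extracts the field `hi:lo` from the raw register value `reg`.
const fn get_field(reg: u32, hi: u32, lo: u32) -> u32 {
    let mask = field_mask(hi, lo);
    (reg & mask) >> mask.trailing_zeros()
}

/// Returns `reg` with the field `hi:lo` replaced by `value`. Bits of `value`
/// that do not fit in the field are dropped.
const fn set_field(reg: u32, hi: u32, lo: u32, value: u32) -> u32 {
    let mask = field_mask(hi, lo);
    (reg & !mask) | ((value << mask.trailing_zeros()) & mask)
}

fn main() {
    // Arbitrary example value: chipset 0x172, major revision 0xa, minor 0x1.
    let reg = 0x1720_00a1;
    println!("chipset = {:#x}", get_field(reg, 28, 20));
    println!("updated = {:#x}", set_field(reg, 7, 4, 0x3));
}
```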

The long-term goal is to eventually move it to the kernel crate so it
can be used by other drivers as well, but it was agreed to first land it
into nova-core and make it mature there.

To illustrate its usage, use it to define the layout for the Boot0
(renamed to NV_PMC_BOOT_0 to match OpenRM's naming scheme) and take
advantage of its accessors.
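
A hand-written approximation of what the NV_PMC_BOOT_0 definition provides
is sketched below, to show how the `?=>` conversion surfaces invalid field
values as errors. This is an assumption-laden userspace model, not the
macro's actual expansion: `NvPmcBoot0` is a hypothetical name, the `Chipset`
enum is reduced to two variants, and the `u32` error type is chosen for
illustration only.

```rust
use std::convert::TryFrom;

/// Reduced, illustrative chipset list.
#[derive(Debug, Copy, Clone, PartialEq, Eq)]
enum Chipset {
    GA100 = 0x170,
    GA102 = 0x172,
}

impl TryFrom<u32> for Chipset {
    /// Illustrative error type: the raw value that was not recognized.
    type Error = u32;

    fn try_from(value: u32) -> Result<Self, Self::Error> {
        match value {
            0x170 => Ok(Chipset::GA100),
            0x172 => Ok(Chipset::GA102),
            other => Err(other),
        }
    }
}

/// Hypothetical wrapper around the raw 32-bit register value.
#[derive(Debug, Copy, Clone)]
struct NvPmcBoot0(u32);

impl NvPmcBoot0 {
    /// Bits 3:0.
    fn minor_revision(self) -> u8 {
        (self.0 & 0xf) as u8
    }

    /// Bits 7:4.
    fn major_revision(self) -> u8 {
        ((self.0 >> 4) & 0xf) as u8
    }

    /// Bits 28:20, converted with `TryFrom` so unknown values are errors.
    fn chipset(self) -> Result<Chipset, u32> {
        Chipset::try_from((self.0 >> 20) & 0x1ff)
    }
}

fn main() {
    let boot0 = NvPmcBoot0(0x1720_00a1);
    match boot0.chipset() {
        Ok(chipset) => println!(
            "{:?} rev {:x}.{:x}",
            chipset,
            boot0.major_revision(),
            boot0.minor_revision()
        ),
        Err(raw) => println!("unknown chipset {:#x}", raw),
    }
}
```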

Suggested-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 Documentation/gpu/nova/core/todo.rst |   6 +
 drivers/gpu/nova-core/gpu.rs         |  10 +-
 drivers/gpu/nova-core/regs.rs        |  61 ++----
 drivers/gpu/nova-core/regs/macros.rs | 380 +++++++++++++++++++++++++++++++++++
 4 files changed, 403 insertions(+), 54 deletions(-)

diff --git a/Documentation/gpu/nova/core/todo.rst b/Documentation/gpu/nova/core/todo.rst
index 234d753d3eacc709b928b1ccbfc9750ef36ec4ed..8a459fc088121f770bfcda5dfb4ef51c712793ce 100644
--- a/Documentation/gpu/nova/core/todo.rst
+++ b/Documentation/gpu/nova/core/todo.rst
@@ -102,7 +102,13 @@ Usage:
 	let boot0 = Boot0::read(&bar);
 	pr_info!("Revision: {}\n", boot0.revision());
 
+Note: a work-in-progress implementation currently resides in
+`drivers/gpu/nova-core/regs/macros.rs` and is used in nova-core. It would be
+nice to improve it (possibly using proc macros) and move it to the `kernel`
+crate so it can be used by other components as well.
+
 | Complexity: Advanced
+| Contact: Alexandre Courbot
 
 Delay / Sleep abstractions
 --------------------------
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index a64a306e0ec87f96e7aab47cfe00a5c280c95a64..43139b527fac35f5f33facbc0be50b9715044245 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -112,10 +112,10 @@ pub(crate) struct Revision {
 }
 
 impl Revision {
-    fn from_boot0(boot0: regs::Boot0) -> Self {
+    fn from_boot0(boot0: regs::NV_PMC_BOOT_0) -> Self {
         Self {
-            major: boot0.major_rev(),
-            minor: boot0.minor_rev(),
+            major: boot0.major_revision(),
+            minor: boot0.minor_revision(),
         }
     }
 }
@@ -135,10 +135,10 @@ pub(crate) struct Spec {
 
 impl Spec {
     fn new(bar: &Bar0) -> Result<Spec> {
-        let boot0 = regs::Boot0::read(bar);
+        let boot0 = regs::NV_PMC_BOOT_0::read(bar);
 
         Ok(Self {
-            chipset: boot0.chipset().try_into()?,
+            chipset: boot0.chipset()?,
             revision: Revision::from_boot0(boot0),
         })
     }
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index b1a25b86ef17a6710e6236d5e7f1f26cd4407ce3..498fefb52f33bf01518f19d32287962f1fdc3224 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -1,55 +1,18 @@
 // SPDX-License-Identifier: GPL-2.0
 
-use crate::driver::Bar0;
+// Required to retain the original register names used by OpenRM, which are all capital snake case
+// but are mapped to types.
+#![allow(non_camel_case_types)]
 
-// TODO
-//
-// Create register definitions via generic macros. See task "Generic register
-// abstraction" in Documentation/gpu/nova/core/todo.rst.
+#[macro_use]
+mod macros;
 
-const BOOT0_OFFSET: usize = 0x00000000;
+use crate::gpu::Chipset;
 
-// 3:0 - chipset minor revision
-const BOOT0_MINOR_REV_SHIFT: u8 = 0;
-const BOOT0_MINOR_REV_MASK: u32 = 0x0000000f;
+/* PMC */
 
-// 7:4 - chipset major revision
-const BOOT0_MAJOR_REV_SHIFT: u8 = 4;
-const BOOT0_MAJOR_REV_MASK: u32 = 0x000000f0;
-
-// 23:20 - chipset implementation Identifier (depends on architecture)
-const BOOT0_IMPL_SHIFT: u8 = 20;
-const BOOT0_IMPL_MASK: u32 = 0x00f00000;
-
-// 28:24 - chipset architecture identifier
-const BOOT0_ARCH_MASK: u32 = 0x1f000000;
-
-// 28:20 - chipset identifier (virtual register field combining BOOT0_IMPL and
-//         BOOT0_ARCH)
-const BOOT0_CHIPSET_SHIFT: u8 = BOOT0_IMPL_SHIFT;
-const BOOT0_CHIPSET_MASK: u32 = BOOT0_IMPL_MASK | BOOT0_ARCH_MASK;
-
-#[derive(Copy, Clone)]
-pub(crate) struct Boot0(u32);
-
-impl Boot0 {
-    #[inline]
-    pub(crate) fn read(bar: &Bar0) -> Self {
-        Self(bar.read32(BOOT0_OFFSET))
-    }
-
-    #[inline]
-    pub(crate) fn chipset(&self) -> u32 {
-        (self.0 & BOOT0_CHIPSET_MASK) >> BOOT0_CHIPSET_SHIFT
-    }
-
-    #[inline]
-    pub(crate) fn minor_rev(&self) -> u8 {
-        ((self.0 & BOOT0_MINOR_REV_MASK) >> BOOT0_MINOR_REV_SHIFT) as u8
-    }
-
-    #[inline]
-    pub(crate) fn major_rev(&self) -> u8 {
-        ((self.0 & BOOT0_MAJOR_REV_MASK) >> BOOT0_MAJOR_REV_SHIFT) as u8
-    }
-}
+register!(NV_PMC_BOOT_0 @ 0x00000000, "Basic revision information about the GPU" {
+    3:0     minor_revision as u8, "Minor revision of the chip";
+    7:4     major_revision as u8, "Major revision of the chip";
+    28:20   chipset as u32 ?=> Chipset, "Chipset model";
+});
diff --git a/drivers/gpu/nova-core/regs/macros.rs b/drivers/gpu/nova-core/regs/macros.rs
new file mode 100644
index 0000000000000000000000000000000000000000..7ecc70efb3cd723b673cd72915e72b8a4a009f06
--- /dev/null
+++ b/drivers/gpu/nova-core/regs/macros.rs
@@ -0,0 +1,380 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Macro to define register layout and accessors.
+//!
+//! A single register typically includes several fields, which are accessed through a combination
+//! of bit-shift and mask operations that introduce a class of potential mistakes, notably because
+//! not all possible field values are necessarily valid.
+//!
+//! The macro in this module allows defining, using an intuitive and readable syntax, a dedicated
+//! type for each register with its own field accessors that can return an error if a field's value
+//! is invalid.
+
+/// Defines a dedicated type for a register with an absolute offset, along with getter and
+/// setter methods for its fields and methods to read and write it from an `Io` region.
+///
+/// Example:
+///
+/// ```no_run
+/// register!(BOOT_0 @ 0x00000100, "Basic revision information about the GPU" {
+///    3:0     minor_revision as u8, "Minor revision of the chip";
+///    7:4     major_revision as u8, "Major revision of the chip";
+///    28:20   chipset as u32 ?=> Chipset, "Chipset model";
+/// });
+/// ```
+///
+/// This defines a `BOOT_0` type which can be read or written from offset `0x100` of an `Io`
+/// region. It is composed of 3 fields; for instance, `minor_revision` is made of the 4 least
+/// significant bits of the register. Each field can be accessed and modified using accessor
+/// methods:
+///
+/// ```no_run
+/// // Read from the register's defined offset (0x100).
+/// let boot0 = BOOT_0::read(&bar);
+/// pr_info!("chip revision: {}.{}", boot0.major_revision(), boot0.minor_revision());
+///
+/// // `Chipset::try_from` will be called with the value of the field and returns an error if the
+/// // value is invalid.
+/// let chipset = boot0.chipset()?;
+///
+/// // Update some fields and write the value back.
+/// boot0.set_major_revision(3).set_minor_revision(10).write(&bar);
+///
+/// // Or just read and update the register in a single step:
+/// BOOT_0::alter(&bar, |r| r.set_major_revision(3).set_minor_revision(10));
+/// ```
+///
+/// Fields can be defined as follows:
+///
+/// - `as <type>` simply returns the field value cast to the requested integer type, typically
+///   `u32`, `u16`, `u8` or `bool`. Note that `bool` fields must have a range of 1 bit.
+/// - `as <type> => <into_type>` calls `<into_type>`'s `From::<<type>>` implementation and returns
+///   the result.
+/// - `as <type> ?=> <try_into_type>` calls `<try_into_type>`'s `TryFrom::<<type>>` implementation
+///   and returns the result. This is useful on fields for which not all values are valid.
+///
+/// The documentation strings are optional. If present, they will be added to the type's
+/// definition, or the field getter and setter methods they are attached to.
+///
+/// Putting a `+` before the address of the register makes it relative to a base: the `read` and
+/// `write` methods take a `base` argument that is added to the specified address before access,
+/// and `try_read` and `try_write` methods are also created, allowing access with offsets unknown
+/// at compile-time:
+///
+/// ```no_run
+/// register!(CPU_CTL @ +0x0000010, "CPU core control" {
+///    0:0     start as bool, "Start the CPU core";
+/// });
+///
+/// // Flip the `start` switch for the CPU core whose base address is at `CPU_BASE`.
+/// let cpuctl = CPU_CTL::read(&bar, CPU_BASE);
+/// pr_info!("CPU CTL: {:?}", cpuctl);
+/// cpuctl.set_start(true).write(&bar, CPU_BASE);
+/// ```
+macro_rules! register {
+    // Creates a register at a fixed offset of the MMIO space.
+    (
+        $name:ident @ $offset:literal $(, $comment:literal)? {
+            $($fields:tt)*
+        }
+    ) => {
+        register!(@common $name $(, $comment)?);
+        register!(@field_accessors $name { $($fields)* });
+        register!(@io $name @ $offset);
+    };
+
+    // Creates a register at a relative offset from a base address.
+    (
+        $name:ident @ + $offset:literal $(, $comment:literal)? {
+            $($fields:tt)*
+        }
+    ) => {
+        register!(@common $name $(, $comment)?);
+        register!(@field_accessors $name { $($fields)* });
+        register!(@io $name @ + $offset);
+    };
+
+    // Defines the wrapper `$name` type, as well as its relevant implementations (`Debug`, `BitOr`,
+    // and conversion to regular `u32`).
+    (@common $name:ident $(, $comment:literal)?) => {
+        $(
+        #[doc=$comment]
+        )?
+        #[repr(transparent)]
+        #[derive(Clone, Copy, Default)]
+        pub(crate) struct $name(u32);
+
+        // TODO: display the raw hex value, then the value of all the fields. This requires
+        // matching the fields, which will complexify the syntax considerably...
+        impl ::core::fmt::Debug for $name {
+            fn fmt(&self, f: &mut ::core::fmt::Formatter<'_>) -> ::core::fmt::Result {
+                f.debug_tuple(stringify!($name))
+                    .field(&format_args!("0x{0:x}", &self.0))
+                    .finish()
+            }
+        }
+
+        impl core::ops::BitOr for $name {
+            type Output = Self;
+
+            fn bitor(self, rhs: Self) -> Self::Output {
+                Self(self.0 | rhs.0)
+            }
+        }
+
+        impl ::core::convert::From<$name> for u32 {
+            fn from(reg: $name) -> u32 {
+                reg.0
+            }
+        }
+    };
+
+    // Defines all the field getter/setter methods for `$name`.
+    (
+        @field_accessors $name:ident {
+        $($hi:tt:$lo:tt $field:ident as $type:tt
+            $(?=> $try_into_type:ty)?
+            $(=> $into_type:ty)?
+            $(, $comment:literal)?
+        ;
+        )*
+        }
+    ) => {
+        $(
+            register!(@check_field_bounds $hi:$lo $field as $type);
+        )*
+
+        #[allow(dead_code)]
+        impl $name {
+            $(
+            register!(@field_accessor $name $hi:$lo $field as $type
+                $(?=> $try_into_type)?
+                $(=> $into_type)?
+                $(, $comment)?
+                ;
+            );
+            )*
+        }
+    };
+
+    // Boolean fields must have `$hi == $lo`.
+    (@check_field_bounds $hi:tt:$lo:tt $field:ident as bool) => {
+        #[allow(clippy::eq_op)]
+        const _: () = {
+            kernel::build_assert!(
+                $hi == $lo,
+                concat!("boolean field `", stringify!($field), "` covers more than one bit")
+            );
+        };
+    };
+
+    // Non-boolean fields must have `$hi >= $lo`.
+    (@check_field_bounds $hi:tt:$lo:tt $field:ident as $type:tt) => {
+        #[allow(clippy::eq_op)]
+        const _: () = {
+            kernel::build_assert!(
+                $hi >= $lo,
+                concat!("field `", stringify!($field), "`'s MSB is smaller than its LSB")
+            );
+        };
+    };
+
+    // Catches fields defined as `bool` and converts them into a boolean value.
+    (
+        @field_accessor $name:ident $hi:tt:$lo:tt $field:ident as bool => $into_type:ty
+            $(, $comment:literal)?;
+    ) => {
+        register!(
+            @leaf_accessor $name $hi:$lo $field as bool
+            { |f| <$into_type>::from(if f != 0 { true } else { false }) }
+            $into_type => $into_type $(, $comment)?;
+        );
+    };
+
+    // Shortcut for fields defined as `bool` without the `=>` syntax.
+    (
+        @field_accessor $name:ident $hi:tt:$lo:tt $field:ident as bool $(, $comment:literal)?;
+    ) => {
+        register!(@field_accessor $name $hi:$lo $field as bool => bool $(, $comment)?;);
+    };
+
+    // Catches the `?=>` syntax for non-boolean fields.
+    (
+        @field_accessor $name:ident $hi:tt:$lo:tt $field:ident as $type:tt ?=> $try_into_type:ty
+            $(, $comment:literal)?;
+    ) => {
+        register!(@leaf_accessor $name $hi:$lo $field as $type
+            { |f| <$try_into_type>::try_from(f as $type) } $try_into_type =>
+            ::core::result::Result<
+                $try_into_type,
+                <$try_into_type as ::core::convert::TryFrom<$type>>::Error
+            >
+            $(, $comment)?;);
+    };
+
+    // Catches the `=>` syntax for non-boolean fields.
+    (
+        @field_accessor $name:ident $hi:tt:$lo:tt $field:ident as $type:tt => $into_type:ty
+            $(, $comment:literal)?;
+    ) => {
+        register!(@leaf_accessor $name $hi:$lo $field as $type
+            { |f| <$into_type>::from(f as $type) } $into_type => $into_type $(, $comment)?;);
+    };
+
+    // Shortcut for fields defined as non-`bool` without the `=>` or `?=>` syntax.
+    (
+        @field_accessor $name:ident $hi:tt:$lo:tt $field:ident as $type:tt
+            $(, $comment:literal)?;
+    ) => {
+        register!(@field_accessor $name $hi:$lo $field as $type => $type $(, $comment)?;);
+    };
+
+    // Generates the accessor methods for a single field.
+    (
+        @leaf_accessor $name:ident $hi:tt:$lo:tt $field:ident as $type:ty
+            { $process:expr } $to_type:ty => $res_type:ty $(, $comment:literal)?;
+    ) => {
+        kernel::macros::paste!(
+        const [<$field:upper>]: ::core::ops::RangeInclusive<u8> = $lo..=$hi;
+        const [<$field:upper _MASK>]: u32 = ((((1 << $hi) - 1) << 1) + 1) - ((1 << $lo) - 1);
+        const [<$field:upper _SHIFT>]: u32 = Self::[<$field:upper _MASK>].trailing_zeros();
+        );
+
+        $(
+        #[doc="Returns the value of this field:"]
+        #[doc=$comment]
+        )?
+        #[inline]
+        pub(crate) fn $field(self) -> $res_type {
+            kernel::macros::paste!(
+            const MASK: u32 = $name::[<$field:upper _MASK>];
+            const SHIFT: u32 = $name::[<$field:upper _SHIFT>];
+            );
+            let field = ((self.0 & MASK) >> SHIFT);
+
+            $process(field)
+        }
+
+        kernel::macros::paste!(
+        $(
+        #[doc="Sets the value of this field:"]
+        #[doc=$comment]
+        )?
+        #[inline]
+        pub(crate) fn [<set_ $field>](mut self, value: $to_type) -> Self {
+            const MASK: u32 = $name::[<$field:upper _MASK>];
+            const SHIFT: u32 = $name::[<$field:upper _SHIFT>];
+            let value = ((value as u32) << SHIFT) & MASK;
+            self.0 = (self.0 & !MASK) | value;
+
+            self
+        }
+        );
+    };
+
+    // Creates the IO accessors for a fixed offset register.
+    (@io $name:ident @ $offset:literal) => {
+        #[allow(dead_code)]
+        impl $name {
+            #[inline]
+            pub(crate) fn read<const SIZE: usize, T>(io: &T) -> Self where
+                T: ::core::ops::Deref<Target = ::kernel::io::Io<SIZE>>,
+            {
+                Self(io.read32($offset))
+            }
+
+            #[inline]
+            pub(crate) fn write<const SIZE: usize, T>(self, io: &T) where
+                T: ::core::ops::Deref<Target = ::kernel::io::Io<SIZE>>,
+            {
+                io.write32(self.0, $offset)
+            }
+
+            #[inline]
+            pub(crate) fn alter<const SIZE: usize, T, F>(
+                io: &T,
+                f: F,
+            ) where
+                T: ::core::ops::Deref<Target = ::kernel::io::Io<SIZE>>,
+                F: ::core::ops::FnOnce(Self) -> Self,
+            {
+                let reg = f(Self::read(io));
+                reg.write(io);
+            }
+        }
+    };
+
+    // Create the IO accessors for a relative offset register.
+    (@io $name:ident @ + $offset:literal) => {
+        #[allow(dead_code)]
+        impl $name {
+            #[inline]
+            pub(crate) fn read<const SIZE: usize, T>(
+                io: &T,
+                base: usize,
+            ) -> Self where
+                T: ::core::ops::Deref<Target = ::kernel::io::Io<SIZE>>,
+            {
+                Self(io.read32(base + $offset))
+            }
+
+            #[inline]
+            pub(crate) fn write<const SIZE: usize, T>(
+                self,
+                io: &T,
+                base: usize,
+            ) where
+                T: ::core::ops::Deref<Target = ::kernel::io::Io<SIZE>>,
+            {
+                io.write32(self.0, base + $offset)
+            }
+
+            #[inline]
+            pub(crate) fn alter<const SIZE: usize, T, F>(
+                io: &T,
+                base: usize,
+                f: F,
+            ) where
+                T: ::core::ops::Deref<Target = ::kernel::io::Io<SIZE>>,
+                F: ::core::ops::FnOnce(Self) -> Self,
+            {
+                let reg = f(Self::read(io, base));
+                reg.write(io, base);
+            }
+
+            #[inline]
+            pub(crate) fn try_read<const SIZE: usize, T>(
+                io: &T,
+                base: usize,
+            ) -> ::kernel::error::Result<Self> where
+                T: ::core::ops::Deref<Target = ::kernel::io::Io<SIZE>>,
+            {
+                io.try_read32(base + $offset).map(Self)
+            }
+
+            #[inline]
+            pub(crate) fn try_write<const SIZE: usize, T>(
+                self,
+                io: &T,
+                base: usize,
+            ) -> ::kernel::error::Result<()> where
+                T: ::core::ops::Deref<Target = ::kernel::io::Io<SIZE>>,
+            {
+                io.try_write32(self.0, base + $offset)
+            }
+
+            #[inline]
+            pub(crate) fn try_alter<const SIZE: usize, T, F>(
+                io: &T,
+                base: usize,
+                f: F,
+            ) -> ::kernel::error::Result<()> where
+                T: ::core::ops::Deref<Target = ::kernel::io::Io<SIZE>>,
+                F: ::core::ops::FnOnce(Self) -> Self,
+            {
+                let reg = f(Self::try_read(io, base)?);
+                reg.try_write(io, base)
+            }
+        }
+    };
+}

-- 
2.49.0


* [PATCH v3 06/19] gpu: nova-core: fix layout of NV_PMC_BOOT_0
  2025-05-07 13:52 [PATCH v3 00/19] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization Alexandre Courbot
                   ` (4 preceding siblings ...)
  2025-05-07 13:52 ` [PATCH v3 05/19] gpu: nova-core: define registers layout using helper macro Alexandre Courbot
@ 2025-05-07 13:52 ` Alexandre Courbot
  2025-05-07 13:52 ` [PATCH v3 07/19] gpu: nova-core: move Firmware to firmware module Alexandre Courbot
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 51+ messages in thread
From: Alexandre Courbot @ 2025-05-07 13:52 UTC (permalink / raw)
  To: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet
  Cc: John Hubbard, Ben Skeggs, Joel Fernandes, Timur Tabi,
	Alistair Popple, linux-kernel, rust-for-linux, nouveau, dri-devel,
	Alexandre Courbot

The layout of NV_PMC_BOOT_0 has two small issues:

- The "chipset" field, while useful to identify a chip, is actually an
  aggregate of two distinct fields named "architecture" and
  "implementation".
- The "architecture" field is split, with its MSB being at a different
  location than the rest of its bits.

Redefine the register layout to match its actual definition as provided
by OpenRM and expose the fully-constructed "architecture" field through
our own "Architecture" type. The "chipset" pseudo-field is also useful
to have, so keep providing it.
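
As an illustration of the split field, here is a minimal standalone sketch
(plain Rust, outside the `register!` macro) of how the two architecture
parts and the implementation can be recombined from a raw register value.
The bit positions are those of the layout in this patch; the free-function
names are hypothetical and only serve the example.

```rust
/// Bits 28:24 of NV_PMC_BOOT_0: lower five bits of the architecture.
fn architecture_0(boot0: u32) -> u8 {
    ((boot0 >> 24) & 0x1f) as u8
}

/// Bit 8 of NV_PMC_BOOT_0: most significant bit of the architecture.
fn architecture_1(boot0: u32) -> u8 {
    ((boot0 >> 8) & 0x1) as u8
}

/// Bits 23:20 of NV_PMC_BOOT_0: implementation within the architecture.
fn implementation(boot0: u32) -> u8 {
    ((boot0 >> 20) & 0xf) as u8
}

/// Full architecture: the MSB is placed above the five lower bits.
fn architecture(boot0: u32) -> u8 {
    architecture_0(boot0) | (architecture_1(boot0) << 5)
}

/// Chipset pseudo-field: architecture followed by the 4-bit implementation.
fn chipset(boot0: u32) -> u32 {
    (u32::from(architecture(boot0)) << 4) | u32::from(implementation(boot0))
}
```

With a raw value of 0x172000a1, this yields architecture 0x17 and chipset
0x172; the revision bits in the low byte do not affect either result.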

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/nova-core/gpu.rs  | 19 ++++++++++++++++---
 drivers/gpu/nova-core/regs.rs | 26 ++++++++++++++++++++++++--
 2 files changed, 40 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 43139b527fac35f5f33facbc0be50b9715044245..29db6726a1f5196bd2d1778117313c987ccf343b 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -101,9 +101,22 @@ fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
 /// Enum representation of the GPU generation.
 #[derive(fmt::Debug)]
 pub(crate) enum Architecture {
-    Turing,
-    Ampere,
-    Ada,
+    Turing = 0x16,
+    Ampere = 0x17,
+    Ada = 0x19,
+}
+
+impl TryFrom<u8> for Architecture {
+    type Error = Error;
+
+    fn try_from(value: u8) -> core::result::Result<Self, Self::Error> {
+        match value {
+            0x16 => Ok(Self::Turing),
+            0x17 => Ok(Self::Ampere),
+            0x19 => Ok(Self::Ada),
+            _ => Err(ENODEV),
+        }
+    }
 }
 
 pub(crate) struct Revision {
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index 498fefb52f33bf01518f19d32287962f1fdc3224..bfb9555b203ff880c0fc373bb22e5ce6048015d4 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -4,15 +4,37 @@
 // but are mapped to types.
 #![allow(non_camel_case_types)]
 
+use kernel::error::Error;
+
 #[macro_use]
 mod macros;
 
-use crate::gpu::Chipset;
+use crate::gpu::{Architecture, Chipset};
 
 /* PMC */
 
 register!(NV_PMC_BOOT_0 @ 0x00000000, "Basic revision information about the GPU" {
     3:0     minor_revision as u8, "Minor revision of the chip";
     7:4     major_revision as u8, "Major revision of the chip";
-    28:20   chipset as u32 ?=> Chipset, "Chipset model";
+    8:8     architecture_1 as u8, "MSB of the architecture";
+    23:20   implementation as u8, "Implementation version of the architecture";
+    28:24   architecture_0 as u8, "Lower bits of the architecture";
 });
+
+impl NV_PMC_BOOT_0 {
+    /// Combines `architecture_0` and `architecture_1` to obtain the architecture of the chip.
+    pub(crate) fn architecture(self) -> Result<Architecture, Error> {
+        Architecture::try_from(
+            self.architecture_0() | (self.architecture_1() << Self::ARCHITECTURE_0.len()),
+        )
+    }
+
+    /// Combines `architecture` and `implementation` to obtain a code unique to the chipset.
+    pub(crate) fn chipset(self) -> Result<Chipset, Error> {
+        self.architecture()
+            .map(|arch| {
+                ((arch as u32) << Self::IMPLEMENTATION.len()) | self.implementation() as u32
+            })
+            .and_then(Chipset::try_from)
+    }
+}

-- 
2.49.0


* [PATCH v3 07/19] gpu: nova-core: move Firmware to firmware module
  2025-05-07 13:52 [PATCH v3 00/19] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization Alexandre Courbot
                   ` (5 preceding siblings ...)
  2025-05-07 13:52 ` [PATCH v3 06/19] gpu: nova-core: fix layout of NV_PMC_BOOT_0 Alexandre Courbot
@ 2025-05-07 13:52 ` Alexandre Courbot
  2025-05-07 13:52 ` [PATCH v3 08/19] rust: make ETIMEDOUT error available Alexandre Courbot
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 51+ messages in thread
From: Alexandre Courbot @ 2025-05-07 13:52 UTC (permalink / raw)
  To: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet
  Cc: John Hubbard, Ben Skeggs, Joel Fernandes, Timur Tabi,
	Alistair Popple, linux-kernel, rust-for-linux, nouveau, dri-devel,
	Alexandre Courbot

We will extend the firmware methods, so move the Firmware structure to
its own module to keep gpu.rs focused.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs | 46 +++++++++++++++++++++++++++++++++++++--
 drivers/gpu/nova-core/gpu.rs      | 35 +++--------------------------
 2 files changed, 47 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 6e6361c59ca1ae9a52185e66e850ba1db93eb8ce..cb79d039948858e657c9a23a62ed27ff780ac169 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -1,12 +1,54 @@
 // SPDX-License-Identifier: GPL-2.0
 
-use crate::gpu;
+//! Contains structures and functions dedicated to the parsing, building and patching of firmwares
+//! to be loaded into a given execution unit.
+
+use kernel::device;
 use kernel::firmware;
+use kernel::prelude::*;
+use kernel::str::CString;
+
+use crate::gpu;
+use crate::gpu::Chipset;
+
+pub(crate) const FIRMWARE_VERSION: &str = "535.113.01";
+
+/// Structure encapsulating the firmware blobs required for the GPU to operate.
+#[expect(dead_code)]
+pub(crate) struct Firmware {
+    pub booter_load: firmware::Firmware,
+    pub booter_unload: firmware::Firmware,
+    pub bootloader: firmware::Firmware,
+    pub gsp: firmware::Firmware,
+}
+
+impl Firmware {
+    pub(crate) fn new(
+        dev: &device::Device<device::Bound>,
+        chipset: Chipset,
+        ver: &str,
+    ) -> Result<Firmware> {
+        let mut chip_name = CString::try_from_fmt(fmt!("{}", chipset))?;
+        chip_name.make_ascii_lowercase();
+
+        let request = |name_| {
+            CString::try_from_fmt(fmt!("nvidia/{}/gsp/{}-{}.bin", &*chip_name, name_, ver))
+                .and_then(|path| firmware::Firmware::request(&path, dev))
+        };
+
+        Ok(Firmware {
+            booter_load: request("booter_load")?,
+            booter_unload: request("booter_unload")?,
+            bootloader: request("bootloader")?,
+            gsp: request("gsp")?,
+        })
+    }
+}
 
 pub(crate) struct ModInfoBuilder<const N: usize>(firmware::ModInfoBuilder<N>);
 
 impl<const N: usize> ModInfoBuilder<N> {
-    const VERSION: &'static str = "535.113.01";
+    const VERSION: &'static str = FIRMWARE_VERSION;
 
     const fn make_entry_file(self, chipset: &str, fw: &str) -> Self {
         ModInfoBuilder(
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 29db6726a1f5196bd2d1778117313c987ccf343b..9c6a9270ffa374bd386af352f6ad9b857f9c5f8d 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -1,10 +1,9 @@
 // SPDX-License-Identifier: GPL-2.0
 
-use kernel::{
-    device, devres::Devres, error::code::*, firmware, fmt, pci, prelude::*, str::CString,
-};
+use kernel::{device, devres::Devres, error::code::*, pci, prelude::*};
 
 use crate::driver::Bar0;
+use crate::firmware::Firmware;
 use crate::regs;
 use crate::util;
 use core::fmt;
@@ -157,34 +156,6 @@ fn new(bar: &Bar0) -> Result<Spec> {
     }
 }
 
-/// Structure encapsulating the firmware blobs required for the GPU to operate.
-#[expect(dead_code)]
-pub(crate) struct Firmware {
-    booter_load: firmware::Firmware,
-    booter_unload: firmware::Firmware,
-    bootloader: firmware::Firmware,
-    gsp: firmware::Firmware,
-}
-
-impl Firmware {
-    fn new(dev: &device::Device, spec: &Spec, ver: &str) -> Result<Firmware> {
-        let mut chip_name = CString::try_from_fmt(fmt!("{}", spec.chipset))?;
-        chip_name.make_ascii_lowercase();
-
-        let request = |name_| {
-            CString::try_from_fmt(fmt!("nvidia/{}/gsp/{}-{}.bin", &*chip_name, name_, ver))
-                .and_then(|path| firmware::Firmware::request(&path, dev))
-        };
-
-        Ok(Firmware {
-            booter_load: request("booter_load")?,
-            booter_unload: request("booter_unload")?,
-            bootloader: request("bootloader")?,
-            gsp: request("gsp")?,
-        })
-    }
-}
-
 /// Structure holding the resources required to operate the GPU.
 #[pin_data]
 pub(crate) struct Gpu {
@@ -201,7 +172,7 @@ pub(crate) fn new(
     ) -> Result<impl PinInit<Self>> {
         let bar = devres_bar.access(pdev.as_ref())?;
         let spec = Spec::new(bar)?;
-        let fw = Firmware::new(pdev.as_ref(), &spec, "535.113.01")?;
+        let fw = Firmware::new(pdev.as_ref(), spec.chipset, "535.113.01")?;
 
         dev_info!(
             pdev.as_ref(),

-- 
2.49.0


* [PATCH v3 08/19] rust: make ETIMEDOUT error available
  2025-05-07 13:52 [PATCH v3 00/19] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization Alexandre Courbot
                   ` (6 preceding siblings ...)
  2025-05-07 13:52 ` [PATCH v3 07/19] gpu: nova-core: move Firmware to firmware module Alexandre Courbot
@ 2025-05-07 13:52 ` Alexandre Courbot
  2025-05-07 13:52 ` [PATCH v3 09/19] gpu: nova-core: wait for GFW_BOOT completion Alexandre Courbot
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 51+ messages in thread
From: Alexandre Courbot @ 2025-05-07 13:52 UTC (permalink / raw)
  To: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet
  Cc: John Hubbard, Ben Skeggs, Joel Fernandes, Timur Tabi,
	Alistair Popple, linux-kernel, rust-for-linux, nouveau, dri-devel,
	Alexandre Courbot

We will use this error in the nova-core driver.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 rust/kernel/error.rs | 1 +
 1 file changed, 1 insertion(+)

diff --git a/rust/kernel/error.rs b/rust/kernel/error.rs
index 3dee3139fcd4379b94748c0ba1965f4e1865b633..083c7b068cf4e185100de96e520c54437898ee72 100644
--- a/rust/kernel/error.rs
+++ b/rust/kernel/error.rs
@@ -65,6 +65,7 @@ macro_rules! declare_err {
     declare_err!(EDOM, "Math argument out of domain of func.");
     declare_err!(ERANGE, "Math result not representable.");
     declare_err!(EOVERFLOW, "Value too large for defined data type.");
+    declare_err!(ETIMEDOUT, "Connection timed out.");
     declare_err!(ERESTARTSYS, "Restart the system call.");
     declare_err!(ERESTARTNOINTR, "System call was interrupted by a signal and will be restarted.");
     declare_err!(ERESTARTNOHAND, "Restart if no handler.");

-- 
2.49.0


* [PATCH v3 09/19] gpu: nova-core: wait for GFW_BOOT completion
  2025-05-07 13:52 [PATCH v3 00/19] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization Alexandre Courbot
                   ` (7 preceding siblings ...)
  2025-05-07 13:52 ` [PATCH v3 08/19] rust: make ETIMEDOUT error available Alexandre Courbot
@ 2025-05-07 13:52 ` Alexandre Courbot
  2025-05-13 14:07   ` Danilo Krummrich
  2025-05-07 13:52 ` [PATCH v3 10/19] gpu: nova-core: add DMA object struct Alexandre Courbot
                   ` (10 subsequent siblings)
  19 siblings, 1 reply; 51+ messages in thread
From: Alexandre Courbot @ 2025-05-07 13:52 UTC (permalink / raw)
  To: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet
  Cc: John Hubbard, Ben Skeggs, Joel Fernandes, Timur Tabi,
	Alistair Popple, linux-kernel, rust-for-linux, nouveau, dri-devel,
	Alexandre Courbot

Upon reset, the GPU executes the GFW_BOOT firmware in order to
initialize its base parameters such as clocks. The driver must ensure
that this step is completed before using the hardware.
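
The completion test used by this patch can be summarized by the following
standalone sketch. It takes the raw values of the two registers read in
`wait_gfw_boot_completion` and applies the same condition; the function
name and the raw `u32` parameters are assumptions made for the example,
the driver itself goes through the `register!` accessors.

```rust
/// Returns `true` if the two raw register values indicate that GFW_BOOT
/// has completed.
///
/// `priv_level_mask` stands for the raw value of
/// NV_PGC6_AON_SECURE_SCRATCH_GROUP_05_PRIV_LEVEL_MASK and `scratch` for
/// the raw value of NV_PGC6_AON_SECURE_SCRATCH_GROUP_05.
fn gfw_boot_completed(priv_level_mask: u32, scratch: u32) -> bool {
    // Bit 0 of the mask register is the `read_protection_level0` field.
    let read_protection_level0 = priv_level_mask & 0x1 != 0;

    // The low byte of the scratch register must read 0xff.
    read_protection_level0 && (scratch & 0xff) == 0xff
}
```

Both conditions must hold: a scratch value of 0xff alone is not treated as
completion when the read protection bit is clear.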

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/nova-core/devinit.rs   | 38 ++++++++++++++++++++++++++++++++++++++
 drivers/gpu/nova-core/driver.rs    |  2 +-
 drivers/gpu/nova-core/gpu.rs       |  5 +++++
 drivers/gpu/nova-core/nova_core.rs |  1 +
 drivers/gpu/nova-core/regs.rs      | 11 +++++++++++
 5 files changed, 56 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/nova-core/devinit.rs b/drivers/gpu/nova-core/devinit.rs
new file mode 100644
index 0000000000000000000000000000000000000000..5be2e0344fb651e5e53c9223aefeb5b2d95b8de1
--- /dev/null
+++ b/drivers/gpu/nova-core/devinit.rs
@@ -0,0 +1,38 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Methods for device initialization.
+
+use kernel::bindings;
+use kernel::prelude::*;
+
+use crate::driver::Bar0;
+use crate::regs;
+
+/// Wait for devinit FW completion.
+///
+/// Upon reset, the GPU runs some firmware code to set up its core parameters. Most of the GPU is
+/// considered unusable until this step is completed, so it must be waited on very early during
+/// driver initialization.
+pub(crate) fn wait_gfw_boot_completion(bar: &Bar0) -> Result<()> {
+    let mut timeout = 2000;
+
+    loop {
+        let gfw_booted = regs::NV_PGC6_AON_SECURE_SCRATCH_GROUP_05_PRIV_LEVEL_MASK::read(bar)
+            .read_protection_level0()
+            && (regs::NV_PGC6_AON_SECURE_SCRATCH_GROUP_05::read(bar).value() & 0xff) == 0xff;
+
+        if gfw_booted {
+            return Ok(());
+        }
+
+        if timeout == 0 {
+            return Err(ETIMEDOUT);
+        }
+        timeout -= 1;
+
+        // TODO: use `read_poll_timeout` once it is available.
+        // (https://lore.kernel.org/lkml/20250220070611.214262-8-fujita.tomonori@gmail.com/)
+        // SAFETY: msleep should be safe to call with any parameter.
+        unsafe { bindings::msleep(2) };
+    }
+}
diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs
index a08fb6599267a960f0e07b6efd0e3b6cdc296aa4..752ba4b0fcfe8d835d366570bb2f807840a196da 100644
--- a/drivers/gpu/nova-core/driver.rs
+++ b/drivers/gpu/nova-core/driver.rs
@@ -10,7 +10,7 @@ pub(crate) struct NovaCore {
     pub(crate) gpu: Gpu,
 }
 
-const BAR0_SIZE: usize = 8;
+const BAR0_SIZE: usize = 0x1000000;
 pub(crate) type Bar0 = pci::Bar<BAR0_SIZE>;
 
 kernel::pci_device_table!(
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 9c6a9270ffa374bd386af352f6ad9b857f9c5f8d..a9fcf74717791dc7e23678869bf84f61e51873e2 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -2,6 +2,7 @@
 
 use kernel::{device, devres::Devres, error::code::*, pci, prelude::*};
 
+use crate::devinit;
 use crate::driver::Bar0;
 use crate::firmware::Firmware;
 use crate::regs;
@@ -182,6 +183,10 @@ pub(crate) fn new(
             spec.revision
         );
 
+        // We must wait for GFW_BOOT completion before doing any significant setup on the GPU.
+        devinit::wait_gfw_boot_completion(bar)
+            .inspect_err(|_| dev_err!(pdev.as_ref(), "GFW boot did not complete"))?;
+
         Ok(pin_init!(Self {
             spec,
             bar: devres_bar,
diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs
index a91cd924054b49966937a8db6aab9cd0614f10de..3d75e7acc6e0ddcfa142b1fb134cb5c966aaf472 100644
--- a/drivers/gpu/nova-core/nova_core.rs
+++ b/drivers/gpu/nova-core/nova_core.rs
@@ -2,6 +2,7 @@
 
 //! Nova Core GPU Driver
 
+mod devinit;
 mod driver;
 mod firmware;
 mod gpu;
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index bfb9555b203ff880c0fc373bb22e5ce6048015d4..401d885539cee03cbe732102f5e2233785a7b284 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -38,3 +38,14 @@ pub(crate) fn chipset(self) -> Result<Chipset, Error> {
             .and_then(Chipset::try_from)
     }
 }
+
+/* PGC6 */
+
+register!(NV_PGC6_AON_SECURE_SCRATCH_GROUP_05_PRIV_LEVEL_MASK @ 0x00118128 {
+    0:0     read_protection_level0 as bool;
+});
+
+// TODO: This is an array of registers.
+register!(NV_PGC6_AON_SECURE_SCRATCH_GROUP_05 @ 0x00118234 {
+    31:0    value as u32;
+});

-- 
2.49.0


* [PATCH v3 10/19] gpu: nova-core: add DMA object struct
  2025-05-07 13:52 [PATCH v3 00/19] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization Alexandre Courbot
                   ` (8 preceding siblings ...)
  2025-05-07 13:52 ` [PATCH v3 09/19] gpu: nova-core: wait for GFW_BOOT completion Alexandre Courbot
@ 2025-05-07 13:52 ` Alexandre Courbot
  2025-05-13 14:25   ` Danilo Krummrich
  2025-05-07 13:52 ` [PATCH v3 11/19] gpu: nova-core: register sysmem flush page Alexandre Courbot
                   ` (9 subsequent siblings)
  19 siblings, 1 reply; 51+ messages in thread
From: Alexandre Courbot @ 2025-05-07 13:52 UTC (permalink / raw)
  To: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet
  Cc: John Hubbard, Ben Skeggs, Joel Fernandes, Timur Tabi,
	Alistair Popple, linux-kernel, rust-for-linux, nouveau, dri-devel,
	Alexandre Courbot

Since we will need to allocate lots of distinct memory chunks to be
shared between GPU and CPU, introduce a type dedicated to that. It is a
light wrapper around CoherentAllocation.
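
`DmaObject::new` rounds the requested length up to a whole number of pages
before allocating. The sketch below reproduces only that size computation
in userspace Rust; the 4096-byte `PAGE_SIZE` and the `page_aligned_len`
name are assumptions for the example, the kernel code uses
`kernel::page::PAGE_SIZE`.

```rust
use std::alloc::Layout;

/// Assumed page size for this sketch only.
const PAGE_SIZE: usize = 4096;

/// Rounds `len` up to the next multiple of `PAGE_SIZE`.
///
/// Returns `None` if the rounded-up size cannot be represented by a
/// `Layout`, which is the case the patch maps to `EINVAL`.
fn page_aligned_len(len: usize) -> Option<usize> {
    Layout::from_size_align(len, PAGE_SIZE)
        .ok()
        .map(|layout| layout.pad_to_align().size())
}
```

A request for 1 byte therefore occupies one full page, and a request for
4097 bytes occupies two.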

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/nova-core/dma.rs       | 60 ++++++++++++++++++++++++++++++++++++++
 drivers/gpu/nova-core/nova_core.rs |  1 +
 2 files changed, 61 insertions(+)

diff --git a/drivers/gpu/nova-core/dma.rs b/drivers/gpu/nova-core/dma.rs
new file mode 100644
index 0000000000000000000000000000000000000000..9d90ae01d0044eaab4ddbc3eba216741d7a623ef
--- /dev/null
+++ b/drivers/gpu/nova-core/dma.rs
@@ -0,0 +1,60 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Simple DMA object wrapper.
+
+// To be removed when all code is used.
+#![expect(dead_code)]
+
+use core::ops::{Deref, DerefMut};
+
+use kernel::device;
+use kernel::dma::CoherentAllocation;
+use kernel::page::PAGE_SIZE;
+use kernel::prelude::*;
+
+pub(crate) struct DmaObject {
+    dma: CoherentAllocation<u8>,
+}
+
+impl DmaObject {
+    pub(crate) fn new(dev: &device::Device<device::Bound>, len: usize) -> Result<Self> {
+        let len = core::alloc::Layout::from_size_align(len, PAGE_SIZE)
+            .map_err(|_| EINVAL)?
+            .pad_to_align()
+            .size();
+        let dma = CoherentAllocation::alloc_coherent(dev, len, GFP_KERNEL | __GFP_ZERO)?;
+
+        Ok(Self { dma })
+    }
+
+    pub(crate) fn from_data(dev: &device::Device<device::Bound>, data: &[u8]) -> Result<Self> {
+        Self::new(dev, data.len()).map(|mut dma_obj| {
+            // SAFETY:
+            // - The copied data fits within the size of the allocated object.
+            // - We have just created this object and there is no other user at this stage.
+            unsafe {
+                core::ptr::copy_nonoverlapping(
+                    data.as_ptr(),
+                    dma_obj.dma.start_ptr_mut(),
+                    data.len(),
+                );
+            }
+
+            dma_obj
+        })
+    }
+}
+
+impl Deref for DmaObject {
+    type Target = CoherentAllocation<u8>;
+
+    fn deref(&self) -> &Self::Target {
+        &self.dma
+    }
+}
+
+impl DerefMut for DmaObject {
+    fn deref_mut(&mut self) -> &mut Self::Target {
+        &mut self.dma
+    }
+}
diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs
index 3d75e7acc6e0ddcfa142b1fb134cb5c966aaf472..1c7333e9fabe357f3ecbc6944ca98b66fa17c9a5 100644
--- a/drivers/gpu/nova-core/nova_core.rs
+++ b/drivers/gpu/nova-core/nova_core.rs
@@ -3,6 +3,7 @@
 //! Nova Core GPU Driver
 
 mod devinit;
+mod dma;
 mod driver;
 mod firmware;
 mod gpu;

-- 
2.49.0


* [PATCH v3 11/19] gpu: nova-core: register sysmem flush page
  2025-05-07 13:52 [PATCH v3 00/19] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization Alexandre Courbot
                   ` (9 preceding siblings ...)
  2025-05-07 13:52 ` [PATCH v3 10/19] gpu: nova-core: add DMA object struct Alexandre Courbot
@ 2025-05-07 13:52 ` Alexandre Courbot
  2025-05-13 14:47   ` Danilo Krummrich
  2025-05-07 13:52 ` [PATCH v3 12/19] gpu: nova-core: add helper function to wait on condition Alexandre Courbot
                   ` (8 subsequent siblings)
  19 siblings, 1 reply; 51+ messages in thread
From: Alexandre Courbot @ 2025-05-07 13:52 UTC (permalink / raw)
  To: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet
  Cc: John Hubbard, Ben Skeggs, Joel Fernandes, Timur Tabi,
	Alistair Popple, linux-kernel, rust-for-linux, nouveau, dri-devel,
	Alexandre Courbot

A page of system memory is reserved so sysmembar can perform a read on
it if a system write occurred since the last flush. Do this early as it
can be required to e.g. reset the GPU falcons.
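
The flush page address is programmed in two parts, as the field names
`adr_39_08` and `adr_63_40` suggest. Here is a minimal sketch of that
split on a 64-bit DMA handle, using the same shifts as the patch; the
helper names are hypothetical, and `rebuild` only exists to show that
nothing is lost for an address with its low 8 bits clear.

```rust
/// Bits 39:8 of the DMA handle, as written to NV_PFB_NISO_FLUSH_SYSMEM_ADDR.
fn flush_addr_lo(handle: u64) -> u32 {
    // The cast keeps the low 32 bits of the shifted value, i.e. bits 39:8.
    (handle >> 8) as u32
}

/// Bits 63:40 of the DMA handle, as written to
/// NV_PFB_NISO_FLUSH_SYSMEM_ADDR_HI (a 24-bit field).
fn flush_addr_hi(handle: u64) -> u32 {
    ((handle >> 40) & 0x00ff_ffff) as u32
}

/// Recombines both parts into the original address, assuming its low
/// 8 bits are zero (which holds for a page-aligned allocation).
fn rebuild(lo: u32, hi: u32) -> u64 {
    (u64::from(hi) << 40) | (u64::from(lo) << 8)
}
```

For a handle of 0x0000_0123_4567_8000, the low part is 0x2345_6780 and
the high part is 0x1.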

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/nova-core/gpu.rs  | 45 +++++++++++++++++++++++++++++++++++++++++--
 drivers/gpu/nova-core/regs.rs | 10 ++++++++++
 2 files changed, 53 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index a9fcf74717791dc7e23678869bf84f61e51873e2..c338da69ecbc2200f1ef3061a4d62971b021e3eb 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -3,6 +3,7 @@
 use kernel::{device, devres::Devres, error::code::*, pci, prelude::*};
 
 use crate::devinit;
+use crate::dma::DmaObject;
 use crate::driver::Bar0;
 use crate::firmware::Firmware;
 use crate::regs;
@@ -158,12 +159,32 @@ fn new(bar: &Bar0) -> Result<Spec> {
 }
 
 /// Structure holding the resources required to operate the GPU.
-#[pin_data]
+#[pin_data(PinnedDrop)]
 pub(crate) struct Gpu {
     spec: Spec,
     /// MMIO mapping of PCI BAR 0
     bar: Devres<Bar0>,
     fw: Firmware,
+    // System memory page required for flushing all pending GPU-side memory writes done through
+    // PCIE into system memory.
+    sysmem_flush: DmaObject,
+}
+
+#[pinned_drop]
+impl PinnedDrop for Gpu {
+    fn drop(self: Pin<&mut Self>) {
+        // Unregister the sysmem flush page before we release it.
+        let _ = self.bar.try_access_with(|b| {
+            regs::NV_PFB_NISO_FLUSH_SYSMEM_ADDR::default()
+                .set_adr_39_08(0)
+                .write(b);
+            if self.spec.chipset >= Chipset::GA102 {
+                regs::NV_PFB_NISO_FLUSH_SYSMEM_ADDR_HI::default()
+                    .set_adr_63_40(0)
+                    .write(b);
+            }
+        });
+    }
 }
 
 impl Gpu {
@@ -187,10 +208,30 @@ pub(crate) fn new(
         devinit::wait_gfw_boot_completion(bar)
             .inspect_err(|_| dev_err!(pdev.as_ref(), "GFW boot did not complete"))?;
 
+        // System memory page required for sysmembar to properly flush into system memory.
+        let sysmem_flush = {
+            let page = DmaObject::new(pdev.as_ref(), kernel::bindings::PAGE_SIZE)?;
+
+            // Register the sysmem flush page.
+            let handle = page.dma_handle();
+
+            regs::NV_PFB_NISO_FLUSH_SYSMEM_ADDR::default()
+                .set_adr_39_08((handle >> 8) as u32)
+                .write(bar);
+            if spec.chipset >= Chipset::GA102 {
+                regs::NV_PFB_NISO_FLUSH_SYSMEM_ADDR_HI::default()
+                    .set_adr_63_40((handle >> 40) as u32)
+                    .write(bar);
+            }
+
+            page
+        };
+
         Ok(pin_init!(Self {
             spec,
             bar: devres_bar,
-            fw
+            fw,
+            sysmem_flush,
         }))
     }
 }
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index 401d885539cee03cbe732102f5e2233785a7b284..218cb6441eb0e5c6e5b52eabba006163eec0c8b4 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -39,6 +39,16 @@ pub(crate) fn chipset(self) -> Result<Chipset, Error> {
     }
 }
 
+/* PFB */
+
+register!(NV_PFB_NISO_FLUSH_SYSMEM_ADDR @ 0x00100c10 {
+    31:0    adr_39_08 as u32;
+});
+
+register!(NV_PFB_NISO_FLUSH_SYSMEM_ADDR_HI @ 0x00100c40 {
+    23:0    adr_63_40 as u32;
+});
+
 /* PGC6 */
 
 register!(NV_PGC6_AON_SECURE_SCRATCH_GROUP_05_PRIV_LEVEL_MASK @ 0x00118128 {

-- 
2.49.0


* [PATCH v3 12/19] gpu: nova-core: add helper function to wait on condition
  2025-05-07 13:52 [PATCH v3 00/19] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization Alexandre Courbot
                   ` (10 preceding siblings ...)
  2025-05-07 13:52 ` [PATCH v3 11/19] gpu: nova-core: register sysmem flush page Alexandre Courbot
@ 2025-05-07 13:52 ` Alexandre Courbot
  2025-05-13 14:50   ` Danilo Krummrich
  2025-05-07 13:52 ` [PATCH v3 13/19] gpu: nova-core: add falcon register definitions and base code Alexandre Courbot
                   ` (7 subsequent siblings)
  19 siblings, 1 reply; 51+ messages in thread
From: Alexandre Courbot @ 2025-05-07 13:52 UTC (permalink / raw)
  To: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet
  Cc: John Hubbard, Ben Skeggs, Joel Fernandes, Timur Tabi,
	Alistair Popple, linux-kernel, rust-for-linux, nouveau, dri-devel,
	Alexandre Courbot

While programming the hardware, we frequently need to busy-wait until
a condition (such as a given register bit changing value) is met.

Add a basic `wait_on` helper function to wait on such conditions
expressed as a closure, with a timeout argument.

This is temporary as we will switch to `read_poll_timeout` [1] once it
is available.

[1] https://lore.kernel.org/lkml/20250220070611.214262-8-fujita.tomonori@gmail.com/

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
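For readers who want to try the polling pattern outside the kernel, here
is a minimal userspace sketch of the same idea. It is an illustration
only, not the code added by this patch: `std::time::Instant` stands in
for `Ktime`, the `TimedOut` type is a hypothetical stand-in for
`ETIMEDOUT`, and the `spin_loop()` hint is an addition that the helper
in this patch does not have.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the kernel's `ETIMEDOUT` error.
#[derive(Debug, PartialEq, Eq)]
pub struct TimedOut;

/// Busy-waits until `cond` returns `Some`, or fails once `timeout` has elapsed.
pub fn wait_on<R, F: Fn() -> Option<R>>(timeout: Duration, cond: F) -> Result<R, TimedOut> {
    let start = Instant::now();

    loop {
        // The condition is evaluated at least once, even with a zero timeout.
        if let Some(ret) = cond() {
            return Ok(ret);
        }

        if start.elapsed() > timeout {
            return Err(TimedOut);
        }

        // Assumption of this sketch: tell the CPU that we are spinning.
        std::hint::spin_loop();
    }
}

/// Polls a simulated register until its "ready" bit (bit 1) is set.
pub fn wait_for_ready(read_reg: impl Fn() -> u32) -> Result<u32, TimedOut> {
    wait_on(Duration::from_millis(100), || {
        let v = read_reg();
        if v & 0b10 != 0 {
            Some(v)
        } else {
            None
        }
    })
}
```

The closure returns `Option<R>` rather than `bool` so that the value
read during the successful poll can be handed back to the caller without
reading the register a second time.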
 drivers/gpu/nova-core/util.rs | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/drivers/gpu/nova-core/util.rs b/drivers/gpu/nova-core/util.rs
index 332a64cfc6a9d7d787fbdc228887c0be53a97160..afb525228431a2645afe7bb34988e9537757b1d7 100644
--- a/drivers/gpu/nova-core/util.rs
+++ b/drivers/gpu/nova-core/util.rs
@@ -1,5 +1,10 @@
 // SPDX-License-Identifier: GPL-2.0
 
+use core::time::Duration;
+
+use kernel::prelude::*;
+use kernel::time::Ktime;
+
 pub(crate) const fn to_lowercase_bytes<const N: usize>(s: &str) -> [u8; N] {
     let src = s.as_bytes();
     let mut dst = [0; N];
@@ -19,3 +24,28 @@ pub(crate) const fn const_bytes_to_str(bytes: &[u8]) -> &str {
         Err(_) => kernel::build_error!("Bytes are not valid UTF-8."),
     }
 }
+
+/// Wait until `cond` returns `Some` or `timeout` has elapsed.
+///
+/// When `cond` evaluates to `Some`, its return value is returned.
+///
+/// `Err(ETIMEDOUT)` is returned if `timeout` has been reached without `cond` evaluating to
+/// `Some`.
+///
+/// TODO: replace with `read_poll_timeout` once it is available.
+/// (https://lore.kernel.org/lkml/20250220070611.214262-8-fujita.tomonori@gmail.com/)
+#[expect(dead_code)]
+pub(crate) fn wait_on<R, F: Fn() -> Option<R>>(timeout: Duration, cond: F) -> Result<R> {
+    let start_time = Ktime::ktime_get();
+
+    loop {
+        if let Some(ret) = cond() {
+            return Ok(ret);
+        }
+
+        let cur_time = Ktime::ktime_get();
+        if (cur_time - start_time).to_ns() > timeout.as_nanos() as i64 {
+            return Err(ETIMEDOUT);
+        }
+    }
+}

-- 
2.49.0



* [PATCH v3 13/19] gpu: nova-core: add falcon register definitions and base code
  2025-05-07 13:52 [PATCH v3 00/19] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization Alexandre Courbot
                   ` (11 preceding siblings ...)
  2025-05-07 13:52 ` [PATCH v3 12/19] gpu: nova-core: add helper function to wait on condition Alexandre Courbot
@ 2025-05-07 13:52 ` Alexandre Courbot
  2025-05-13 16:19   ` Danilo Krummrich
  2025-05-07 13:52 ` [PATCH v3 14/19] gpu: nova-core: firmware: add ucode descriptor used by FWSEC-FRTS Alexandre Courbot
                   ` (6 subsequent siblings)
  19 siblings, 1 reply; 51+ messages in thread
From: Alexandre Courbot @ 2025-05-07 13:52 UTC (permalink / raw)
  To: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet
  Cc: John Hubbard, Ben Skeggs, Joel Fernandes, Timur Tabi,
	Alistair Popple, linux-kernel, rust-for-linux, nouveau, dri-devel,
	Alexandre Courbot

Add the common Falcon code and HAL for Ampere GPUs, and instantiate the
GSP and SEC2 Falcons that will be required to boot the GSP.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
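The way `dma_wr` in this patch splits a firmware section into 256-byte
transfers can be sketched in plain Rust. This is a hedged illustration
rather than the driver code: `DmaPlan` and `plan_dma` are hypothetical
names, `u64` stands in for `dma_addr_t`, errors are plain strings
instead of `EINVAL`, and no register is touched. It only computes the
base address and the per-transfer offsets.

```rust
/// Size in bytes of a single DMA transfer, as `DMA_LEN` in the patch.
const DMA_LEN: u32 = 256;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FalconMem {
    Imem,
    Dmem,
}

#[derive(Debug, Clone, Copy)]
struct FalconLoadTarget {
    src_start: u32,
    dst_start: u32,
    len: u32,
}

/// Hypothetical description of the transfers needed for one section.
#[derive(Debug, PartialEq, Eq)]
struct DmaPlan {
    /// Base DMA address, programmed once before the transfers start.
    dma_base: u64,
    /// One `(falcon memory offset, source offset)` pair per transfer.
    chunks: Vec<(u32, u32)>,
}

fn plan_dma(
    dma_handle: u64,
    mem: FalconMem,
    target: FalconLoadTarget,
) -> Result<DmaPlan, &'static str> {
    // IMEM keeps the source offset per transfer, DMEM folds it into the base.
    let (src_start, dma_base) = match mem {
        FalconMem::Imem => (target.src_start, dma_handle),
        FalconMem::Dmem => (0, dma_handle + u64::from(target.src_start)),
    };

    if dma_base % u64::from(DMA_LEN) != 0 {
        return Err("DMA start address must be a multiple of 256");
    }
    if target.len % DMA_LEN != 0 {
        return Err("DMA length must be a multiple of 256");
    }

    let chunks = (0..target.len)
        .step_by(DMA_LEN as usize)
        .map(|pos| (target.dst_start + pos, src_start + pos))
        .collect();

    Ok(DmaPlan { dma_base, chunks })
}
```

For a 512-byte section this yields two transfers. With `FalconMem::Dmem`
the source offsets start at zero because `src_start` has already been
added to the base address.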
 drivers/gpu/nova-core/falcon.rs           | 540 ++++++++++++++++++++++++++++++
 drivers/gpu/nova-core/falcon/gsp.rs       |  22 ++
 drivers/gpu/nova-core/falcon/hal.rs       |  56 ++++
 drivers/gpu/nova-core/falcon/hal/ga102.rs | 120 +++++++
 drivers/gpu/nova-core/falcon/sec2.rs      |   8 +
 drivers/gpu/nova-core/gpu.rs              |  11 +
 drivers/gpu/nova-core/nova_core.rs        |   1 +
 drivers/gpu/nova-core/regs.rs             | 125 +++++++
 drivers/gpu/nova-core/util.rs             |   1 -
 9 files changed, 883 insertions(+), 1 deletion(-)
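
The GA102 HAL in this patch locates a fuse version register from
`engine_id_mask` and `ucode_id`, then reduces the raw register value to
the position of its highest set bit. The sketch below reproduces only
that arithmetic so it can be checked in isolation. The function names
are hypothetical, `Option` replaces the `EINVAL` error path, and the
register read itself is left out. The base addresses are the ones
hardcoded in the patch.

```rust
// Addresses of the first (`UCODE1`) version register of each engine,
// copied from the patch.
const SEC2_UCODE1_VERSION: usize = 0x824140;
const NVDEC_UCODE1_VERSION: usize = 0x824100;
const GSP_UCODE1_VERSION: usize = 0x8241c0;

/// Returns the address of the fuse version register for the given engine
/// mask and ucode id, or `None` if either argument is not recognized.
fn fuse_version_reg(engine_id_mask: u16, ucode_id: u8) -> Option<usize> {
    // Each engine has 16 ucode version registers, numbered from 1 to 16.
    if ucode_id == 0 || ucode_id > 16 {
        return None;
    }

    let base = if engine_id_mask & 0x0001 != 0 {
        SEC2_UCODE1_VERSION
    } else if engine_id_mask & 0x0004 != 0 {
        NVDEC_UCODE1_VERSION
    } else if engine_id_mask & 0x0400 != 0 {
        GSP_UCODE1_VERSION
    } else {
        return None;
    };

    // Registers are 32 bits wide and laid out contiguously.
    Some(base + (usize::from(ucode_id) - 1) * core::mem::size_of::<u32>())
}

/// "Find last set": 1-based index of the highest set bit, 0 if none is set.
fn fuse_version(raw: u32) -> u32 {
    u32::BITS - raw.leading_zeros()
}
```

As in the patch, the first matching bit of `engine_id_mask` wins, so a
mask with several bits set resolves to SEC2 before NVDEC, and NVDEC
before GSP.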

diff --git a/drivers/gpu/nova-core/falcon.rs b/drivers/gpu/nova-core/falcon.rs
new file mode 100644
index 0000000000000000000000000000000000000000..b6552496ad9c5f8eeda4beaa757c7393f495072b
--- /dev/null
+++ b/drivers/gpu/nova-core/falcon.rs
@@ -0,0 +1,540 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Falcon microprocessor base support
+
+// To be removed when all code is used.
+#![expect(dead_code)]
+
+use core::time::Duration;
+use hal::FalconHal;
+use kernel::bindings;
+use kernel::device;
+use kernel::prelude::*;
+use kernel::sync::Arc;
+use kernel::types::ARef;
+
+use crate::driver::Bar0;
+use crate::gpu::Chipset;
+use crate::regs;
+use crate::util;
+
+pub(crate) mod gsp;
+mod hal;
+pub(crate) mod sec2;
+
+/// Revision number of a falcon core, used in the [`crate::regs::NV_PFALCON_FALCON_HWCFG1`]
+/// register.
+#[repr(u8)]
+#[derive(Debug, Default, Copy, Clone, PartialEq, Eq, PartialOrd, Ord)]
+pub(crate) enum FalconCoreRev {
+    #[default]
+    Rev1 = 1,
+    Rev2 = 2,
+    Rev3 = 3,
+    Rev4 = 4,
+    Rev5 = 5,
+    Rev6 = 6,
+    Rev7 = 7,
+}
+
+impl TryFrom<u8> for FalconCoreRev {
+    type Error = Error;
+
+    fn try_from(value: u8) -> core::result::Result<Self, Self::Error> {
+        use FalconCoreRev::*;
+
+        let rev = match value {
+            1 => Rev1,
+            2 => Rev2,
+            3 => Rev3,
+            4 => Rev4,
+            5 => Rev5,
+            6 => Rev6,
+            7 => Rev7,
+            _ => return Err(EINVAL),
+        };
+
+        Ok(rev)
+    }
+}
+
+/// Revision subversion number of a falcon core, used in the
+/// [`crate::regs::NV_PFALCON_FALCON_HWCFG1`] register.
+#[repr(u8)]
+#[derive(Debug, Default, Copy, Clone, PartialEq, Eq, PartialOrd, Ord)]
+pub(crate) enum FalconCoreRevSubversion {
+    #[default]
+    Subversion0 = 0,
+    Subversion1 = 1,
+    Subversion2 = 2,
+    Subversion3 = 3,
+}
+
+impl TryFrom<u8> for FalconCoreRevSubversion {
+    type Error = Error;
+
+    fn try_from(value: u8) -> Result<Self> {
+        use FalconCoreRevSubversion::*;
+
+        let sub_version = match value & 0b11 {
+            0 => Subversion0,
+            1 => Subversion1,
+            2 => Subversion2,
+            3 => Subversion3,
+            _ => return Err(EINVAL),
+        };
+
+        Ok(sub_version)
+    }
+}
+
+/// Security model of a falcon core, used in the [`crate::regs::NV_PFALCON_FALCON_HWCFG1`]
+/// register.
+#[repr(u8)]
+#[derive(Debug, Default, Copy, Clone)]
+pub(crate) enum FalconSecurityModel {
+    /// Non-Secure: runs unsigned code without privileges.
+    #[default]
+    None = 0,
+    /// Low-secure: runs unsigned code with some privileges. Can only be entered from `Heavy` mode.
+    Light = 2,
+    /// High-Secure: runs signed code with full privileges.
+    Heavy = 3,
+}
+
+impl TryFrom<u8> for FalconSecurityModel {
+    type Error = Error;
+
+    fn try_from(value: u8) -> core::result::Result<Self, Self::Error> {
+        use FalconSecurityModel::*;
+
+        let sec_model = match value {
+            0 => None,
+            2 => Light,
+            3 => Heavy,
+            _ => return Err(EINVAL),
+        };
+
+        Ok(sec_model)
+    }
+}
+
+/// Signing algorithm for a given firmware, used in the [`crate::regs::NV_PFALCON2_FALCON_MOD_SEL`]
+/// register.
+#[repr(u8)]
+#[derive(Debug, Default, Copy, Clone, PartialEq, Eq)]
+pub(crate) enum FalconModSelAlgo {
+    /// RSA3K.
+    #[default]
+    Rsa3k = 1,
+}
+
+impl TryFrom<u8> for FalconModSelAlgo {
+    type Error = Error;
+
+    fn try_from(value: u8) -> core::result::Result<Self, Self::Error> {
+        match value {
+            1 => Ok(FalconModSelAlgo::Rsa3k),
+            _ => Err(EINVAL),
+        }
+    }
+}
+
+/// Valid values for the `size` field of the [`crate::regs::NV_PFALCON_FALCON_DMATRFCMD`] register.
+#[repr(u8)]
+#[derive(Debug, Default, Copy, Clone, PartialEq, Eq)]
+pub(crate) enum DmaTrfCmdSize {
+    /// 256 bytes transfer.
+    #[default]
+    Size256B = 0x6,
+}
+
+impl TryFrom<u8> for DmaTrfCmdSize {
+    type Error = Error;
+
+    fn try_from(value: u8) -> Result<Self> {
+        match value {
+            0x6 => Ok(Self::Size256B),
+            _ => Err(EINVAL),
+        }
+    }
+}
+
+/// Currently active core on a dual falcon/riscv (Peregrine) controller.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub(crate) enum PeregrineCoreSelect {
+    /// Falcon core is active.
+    Falcon = 0,
+    /// RISC-V core is active.
+    Riscv = 1,
+}
+
+impl From<bool> for PeregrineCoreSelect {
+    fn from(value: bool) -> Self {
+        match value {
+            false => PeregrineCoreSelect::Falcon,
+            true => PeregrineCoreSelect::Riscv,
+        }
+    }
+}
+
+/// Different types of memory present in a falcon core.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub(crate) enum FalconMem {
+    /// Instruction Memory.
+    Imem,
+    /// Data Memory.
+    Dmem,
+}
+
+/// Target/source of a DMA transfer to/from falcon memory.
+#[derive(Debug, Clone, Default)]
+pub(crate) enum FalconFbifTarget {
+    /// VRAM.
+    #[default]
+    LocalFb = 0,
+    /// Coherent system memory.
+    CoherentSysmem = 1,
+    /// Non-coherent system memory.
+    NoncoherentSysmem = 2,
+}
+
+impl TryFrom<u8> for FalconFbifTarget {
+    type Error = Error;
+
+    fn try_from(value: u8) -> core::result::Result<Self, Self::Error> {
+        let res = match value {
+            0 => Self::LocalFb,
+            1 => Self::CoherentSysmem,
+            2 => Self::NoncoherentSysmem,
+            _ => return Err(EINVAL),
+        };
+
+        Ok(res)
+    }
+}
+
+/// Type of memory addresses to use.
+#[derive(Debug, Clone, Default)]
+pub(crate) enum FalconFbifMemType {
+    /// Virtual memory addresses.
+    #[default]
+    Virtual = 0,
+    /// Physical memory addresses.
+    Physical = 1,
+}
+
+impl From<bool> for FalconFbifMemType {
+    fn from(value: bool) -> Self {
+        match value {
+            false => Self::Virtual,
+            true => Self::Physical,
+        }
+    }
+}
+
+/// Trait defining the parameters of a given Falcon instance.
+pub(crate) trait FalconEngine: Sync {
+    /// Base I/O address for the falcon, relative to which its registers are accessed.
+    const BASE: usize;
+}
+
+/// Represents a portion of the firmware to be loaded into a particular memory (e.g. IMEM or DMEM).
+#[derive(Debug)]
+pub(crate) struct FalconLoadTarget {
+    /// Offset from the start of the source object to copy from.
+    pub(crate) src_start: u32,
+    /// Offset from the start of the destination memory to copy into.
+    pub(crate) dst_start: u32,
+    /// Number of bytes to copy.
+    pub(crate) len: u32,
+}
+
+/// Parameters for the falcon boot ROM.
+#[derive(Debug)]
+pub(crate) struct FalconBromParams {
+    /// Offset in `DMEM` of the firmware's signature.
+    pub(crate) pkc_data_offset: u32,
+    /// Mask of engines valid for this firmware.
+    pub(crate) engine_id_mask: u16,
+    /// ID of the ucode used to infer a fuse register to validate the signature.
+    pub(crate) ucode_id: u8,
+}
+
+/// Trait for a falcon firmware.
+pub(crate) trait FalconFirmware {
+    /// Engine on which this firmware is to be loaded.
+    type Target: FalconEngine;
+
+    /// Returns the DMA handle of the object containing the firmware.
+    fn dma_handle(&self) -> bindings::dma_addr_t;
+
+    /// Returns the load parameters for `IMEM`.
+    fn imem_load(&self) -> FalconLoadTarget;
+
+    /// Returns the load parameters for `DMEM`.
+    fn dmem_load(&self) -> FalconLoadTarget;
+
+    /// Returns the parameters to write into the BROM registers.
+    fn brom_params(&self) -> FalconBromParams;
+
+    /// Returns the start address of the firmware.
+    fn boot_addr(&self) -> u32;
+}
+
+/// Contains the base parameters common to all Falcon instances.
+pub(crate) struct Falcon<E: FalconEngine> {
+    hal: Arc<dyn FalconHal<E>>,
+    dev: ARef<device::Device>,
+}
+
+impl<E: FalconEngine + 'static> Falcon<E> {
+    /// Create a new falcon instance.
+    ///
+    /// `need_riscv` is set to `true` if the caller expects the falcon to be a dual falcon/riscv
+    /// controller.
+    pub(crate) fn new(
+        dev: &device::Device,
+        chipset: Chipset,
+        bar: &Bar0,
+        need_riscv: bool,
+    ) -> Result<Self> {
+        let hwcfg1 = regs::NV_PFALCON_FALCON_HWCFG1::read(bar, E::BASE);
+        // Ensure that the revision and security model contain valid values.
+        let _rev = hwcfg1.core_rev()?;
+        let _sec_model = hwcfg1.security_model()?;
+
+        if need_riscv {
+            let hwcfg2 = regs::NV_PFALCON_FALCON_HWCFG2::read(bar, E::BASE);
+            if !hwcfg2.riscv() {
+                dev_err!(
+                    dev,
+                    "riscv support requested on a controller that does not support it\n"
+                );
+                return Err(EINVAL);
+            }
+        }
+
+        Ok(Self {
+            hal: hal::create_falcon_hal(chipset)?,
+            dev: dev.into(),
+        })
+    }
+
+    /// Wait for memory scrubbing to complete.
+    fn reset_wait_mem_scrubbing(&self, bar: &Bar0) -> Result<()> {
+        util::wait_on(Duration::from_millis(20), || {
+            let r = regs::NV_PFALCON_FALCON_HWCFG2::read(bar, E::BASE);
+            if !r.mem_scrubbing() {
+                Some(())
+            } else {
+                None
+            }
+        })
+    }
+
+    /// Reset the falcon engine.
+    fn reset_eng(&self, bar: &Bar0) -> Result<()> {
+        let _ = regs::NV_PFALCON_FALCON_HWCFG2::read(bar, E::BASE);
+
+        // According to OpenRM's `kflcnPreResetWait_GA102` documentation, HW sometimes does not set
+        // RESET_READY so a non-failing timeout is used.
+        let _ = util::wait_on(Duration::from_micros(150), || {
+            let r = regs::NV_PFALCON_FALCON_HWCFG2::read(bar, E::BASE);
+            if r.reset_ready() {
+                Some(())
+            } else {
+                None
+            }
+        });
+
+        regs::NV_PFALCON_FALCON_ENGINE::alter(bar, E::BASE, |v| v.set_reset(true));
+
+        let _: Result<()> = util::wait_on(Duration::from_micros(10), || None);
+
+        regs::NV_PFALCON_FALCON_ENGINE::alter(bar, E::BASE, |v| v.set_reset(false));
+
+        self.reset_wait_mem_scrubbing(bar)?;
+
+        Ok(())
+    }
+
+    /// Reset the controller, select the falcon core, and wait for memory scrubbing to complete.
+    pub(crate) fn reset(&self, bar: &Bar0) -> Result<()> {
+        self.reset_eng(bar)?;
+        self.hal.select_core(self, bar)?;
+        self.reset_wait_mem_scrubbing(bar)?;
+
+        regs::NV_PFALCON_FALCON_RM::default()
+            .set_value(regs::NV_PMC_BOOT_0::read(bar).into())
+            .write(bar, E::BASE);
+
+        Ok(())
+    }
+
+    /// Perform a DMA write according to `load_offsets` from `dma_handle` into the falcon's
+    /// `target_mem`.
+    ///
+    /// `sec` is set if the loaded firmware is expected to run in secure mode.
+    fn dma_wr(
+        &self,
+        bar: &Bar0,
+        dma_handle: bindings::dma_addr_t,
+        target_mem: FalconMem,
+        load_offsets: FalconLoadTarget,
+        sec: bool,
+    ) -> Result<()> {
+        const DMA_LEN: u32 = 256;
+
+        // For IMEM, we want to use the start offset as a virtual address tag for each page, since
+        // code addresses in the firmware (and the boot vector) are virtual.
+        //
+        // For DMEM we can fold the start offset into the DMA handle.
+        let (src_start, dma_start) = match target_mem {
+            FalconMem::Imem => (load_offsets.src_start, dma_handle),
+            FalconMem::Dmem => (
+                0,
+                dma_handle + load_offsets.src_start as bindings::dma_addr_t,
+            ),
+        };
+        if dma_start % DMA_LEN as bindings::dma_addr_t > 0 {
+            dev_err!(
+                self.dev,
+                "DMA transfer start addresses must be a multiple of {}\n",
+                DMA_LEN
+            );
+            return Err(EINVAL);
+        }
+        if load_offsets.len % DMA_LEN > 0 {
+            dev_err!(
+                self.dev,
+                "DMA transfer length must be a multiple of {}\n",
+                DMA_LEN
+            );
+            return Err(EINVAL);
+        }
+
+        // Set up the base source DMA address.
+
+        regs::NV_PFALCON_FALCON_DMATRFBASE::default()
+            .set_base((dma_start >> 8) as u32)
+            .write(bar, E::BASE);
+        regs::NV_PFALCON_FALCON_DMATRFBASE1::default()
+            .set_base((dma_start >> 40) as u16)
+            .write(bar, E::BASE);
+
+        let cmd = regs::NV_PFALCON_FALCON_DMATRFCMD::default()
+            .set_size(DmaTrfCmdSize::Size256B)
+            .set_imem(target_mem == FalconMem::Imem)
+            .set_sec(if sec { 1 } else { 0 });
+
+        for pos in (0..load_offsets.len).step_by(DMA_LEN as usize) {
+            // Perform a transfer of size `DMA_LEN`.
+            regs::NV_PFALCON_FALCON_DMATRFMOFFS::default()
+                .set_offs(load_offsets.dst_start + pos)
+                .write(bar, E::BASE);
+            regs::NV_PFALCON_FALCON_DMATRFFBOFFS::default()
+                .set_offs(src_start + pos)
+                .write(bar, E::BASE);
+            cmd.write(bar, E::BASE);
+
+            // Wait for the transfer to complete.
+            util::wait_on(Duration::from_millis(2000), || {
+                let r = regs::NV_PFALCON_FALCON_DMATRFCMD::read(bar, E::BASE);
+                if r.idle() {
+                    Some(())
+                } else {
+                    None
+                }
+            })?;
+        }
+
+        Ok(())
+    }
+
+    /// Perform a DMA load into `IMEM` and `DMEM` of `fw`, and prepare the falcon to run it.
+    pub(crate) fn dma_load<F: FalconFirmware<Target = E>>(&self, bar: &Bar0, fw: &F) -> Result<()> {
+        let dma_handle = fw.dma_handle();
+
+        regs::NV_PFALCON_FBIF_CTL::alter(bar, E::BASE, |v| v.set_allow_phys_no_ctx(true));
+        regs::NV_PFALCON_FALCON_DMACTL::default().write(bar, E::BASE);
+        regs::NV_PFALCON_FBIF_TRANSCFG::alter(bar, E::BASE, |v| {
+            v.set_target(FalconFbifTarget::CoherentSysmem)
+                .set_mem_type(FalconFbifMemType::Physical)
+        });
+
+        self.dma_wr(bar, dma_handle, FalconMem::Imem, fw.imem_load(), true)?;
+        self.dma_wr(bar, dma_handle, FalconMem::Dmem, fw.dmem_load(), true)?;
+
+        self.hal.program_brom(self, bar, &fw.brom_params())?;
+
+        // Set `BootVec` to start of non-secure code.
+        regs::NV_PFALCON_FALCON_BOOTVEC::default()
+            .set_value(fw.boot_addr())
+            .write(bar, E::BASE);
+
+        Ok(())
+    }
+
+    /// Start running the loaded firmware.
+    ///
+    /// `mbox0` and `mbox1` are optional parameters to write into the `MBOX0` and `MBOX1` registers
+    /// prior to running.
+    ///
+    /// Returns `MBOX0` and `MBOX1` after the firmware has stopped running.
+    pub(crate) fn boot(
+        &self,
+        bar: &Bar0,
+        mbox0: Option<u32>,
+        mbox1: Option<u32>,
+    ) -> Result<(u32, u32)> {
+        if let Some(mbox0) = mbox0 {
+            regs::NV_PFALCON_FALCON_MAILBOX0::default()
+                .set_value(mbox0)
+                .write(bar, E::BASE);
+        }
+
+        if let Some(mbox1) = mbox1 {
+            regs::NV_PFALCON_FALCON_MAILBOX1::default()
+                .set_value(mbox1)
+                .write(bar, E::BASE);
+        }
+
+        match regs::NV_PFALCON_FALCON_CPUCTL::read(bar, E::BASE).alias_en() {
+            true => regs::NV_PFALCON_FALCON_CPUCTL_ALIAS::default()
+                .set_startcpu(true)
+                .write(bar, E::BASE),
+            false => regs::NV_PFALCON_FALCON_CPUCTL::default()
+                .set_startcpu(true)
+                .write(bar, E::BASE),
+        }
+
+        util::wait_on(Duration::from_secs(2), || {
+            let r = regs::NV_PFALCON_FALCON_CPUCTL::read(bar, E::BASE);
+            if r.halted() {
+                Some(())
+            } else {
+                None
+            }
+        })?;
+
+        let (mbox0, mbox1) = (
+            regs::NV_PFALCON_FALCON_MAILBOX0::read(bar, E::BASE).value(),
+            regs::NV_PFALCON_FALCON_MAILBOX1::read(bar, E::BASE).value(),
+        );
+
+        Ok((mbox0, mbox1))
+    }
+
+    /// Returns the fused version of the signature to use in order to run a HS firmware on this
+    /// falcon instance. `engine_id_mask` and `ucode_id` are obtained from the firmware header.
+    pub(crate) fn get_signature_reg_fuse_version(
+        &self,
+        bar: &Bar0,
+        engine_id_mask: u16,
+        ucode_id: u8,
+    ) -> Result<u32> {
+        self.hal
+            .get_signature_reg_fuse_version(self, bar, engine_id_mask, ucode_id)
+    }
+}
diff --git a/drivers/gpu/nova-core/falcon/gsp.rs b/drivers/gpu/nova-core/falcon/gsp.rs
new file mode 100644
index 0000000000000000000000000000000000000000..f74aeadaee9ae96bb1961d3c55b2cf1999943377
--- /dev/null
+++ b/drivers/gpu/nova-core/falcon/gsp.rs
@@ -0,0 +1,22 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use crate::{
+    driver::Bar0,
+    falcon::{Falcon, FalconEngine},
+    regs,
+};
+
+pub(crate) struct Gsp;
+impl FalconEngine for Gsp {
+    const BASE: usize = 0x00110000;
+}
+
+impl Falcon<Gsp> {
+    /// Clears the SWGEN0 bit in the Falcon's IRQ status clear register to
+    /// allow the GSP to signal the CPU that new messages are in the message queue.
+    pub(crate) fn clear_swgen0_intr(&self, bar: &Bar0) {
+        regs::NV_PFALCON_FALCON_IRQSCLR::default()
+            .set_swgen0(true)
+            .write(bar, Gsp::BASE);
+    }
+}
diff --git a/drivers/gpu/nova-core/falcon/hal.rs b/drivers/gpu/nova-core/falcon/hal.rs
new file mode 100644
index 0000000000000000000000000000000000000000..0b0ab8174caafeec3a2a2ba8211a740b7feadb75
--- /dev/null
+++ b/drivers/gpu/nova-core/falcon/hal.rs
@@ -0,0 +1,56 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use kernel::prelude::*;
+use kernel::sync::Arc;
+
+use crate::driver::Bar0;
+use crate::falcon::{Falcon, FalconBromParams, FalconEngine};
+use crate::gpu::Chipset;
+
+mod ga102;
+
+/// Hardware Abstraction Layer for Falcon cores.
+///
+/// Implements chipset-specific low-level operations. The trait is generic against [`FalconEngine`]
+/// so its `BASE` parameter can be used in order to avoid runtime bound checks when accessing
+/// registers.
+pub(crate) trait FalconHal<E: FalconEngine>: Sync {
+    /// Activates the Falcon core if the engine is a riscv/falcon dual engine.
+    fn select_core(&self, _falcon: &Falcon<E>, _bar: &Bar0) -> Result<()> {
+        Ok(())
+    }
+
+    /// Returns the fused version of the signature to use in order to run a HS firmware on this
+    /// falcon instance. `engine_id_mask` and `ucode_id` are obtained from the firmware header.
+    fn get_signature_reg_fuse_version(
+        &self,
+        falcon: &Falcon<E>,
+        bar: &Bar0,
+        engine_id_mask: u16,
+        ucode_id: u8,
+    ) -> Result<u32>;
+
+    /// Program the boot ROM registers prior to starting a secure firmware.
+    fn program_brom(&self, falcon: &Falcon<E>, bar: &Bar0, params: &FalconBromParams)
+        -> Result<()>;
+}
+
+/// Returns a boxed falcon HAL adequate for the passed `chipset`.
+///
+/// We use this function and a heap-allocated trait object instead of statically defined trait
+/// objects because of the two-dimensional (Chipset, Engine) lookup required to return the
+/// requested HAL.
+///
+/// TODO: replace the return type with `KBox` once it gains the ability to host trait objects.
+pub(crate) fn create_falcon_hal<E: FalconEngine + 'static>(
+    chipset: Chipset,
+) -> Result<Arc<dyn FalconHal<E>>> {
+    let hal = match chipset {
+        Chipset::GA102 | Chipset::GA103 | Chipset::GA104 | Chipset::GA106 | Chipset::GA107 => {
+            Arc::new(ga102::Ga102::<E>::new(), GFP_KERNEL)? as Arc<dyn FalconHal<E>>
+        }
+        _ => return Err(ENOTSUPP),
+    };
+
+    Ok(hal)
+}
diff --git a/drivers/gpu/nova-core/falcon/hal/ga102.rs b/drivers/gpu/nova-core/falcon/hal/ga102.rs
new file mode 100644
index 0000000000000000000000000000000000000000..68d807c192a16a908f40b65f9b03d107f4042b63
--- /dev/null
+++ b/drivers/gpu/nova-core/falcon/hal/ga102.rs
@@ -0,0 +1,120 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use core::marker::PhantomData;
+use core::time::Duration;
+
+use kernel::{device, prelude::*};
+
+use crate::driver::Bar0;
+use crate::falcon::{
+    Falcon, FalconBromParams, FalconEngine, FalconModSelAlgo, PeregrineCoreSelect,
+};
+use crate::regs;
+use crate::util;
+
+use super::FalconHal;
+
+fn select_core_ga102<E: FalconEngine>(bar: &Bar0) -> Result<()> {
+    let bcr_ctrl = regs::NV_PRISCV_RISCV_BCR_CTRL::read(bar, E::BASE);
+    if bcr_ctrl.core_select() != PeregrineCoreSelect::Falcon {
+        regs::NV_PRISCV_RISCV_BCR_CTRL::default()
+            .set_core_select(PeregrineCoreSelect::Falcon)
+            .write(bar, E::BASE);
+
+        util::wait_on(Duration::from_millis(10), || {
+            let r = regs::NV_PRISCV_RISCV_BCR_CTRL::read(bar, E::BASE);
+            if r.valid() {
+                Some(())
+            } else {
+                None
+            }
+        })?;
+    }
+
+    Ok(())
+}
+
+fn get_signature_reg_fuse_version_ga102(
+    dev: &device::Device,
+    bar: &Bar0,
+    engine_id_mask: u16,
+    ucode_id: u8,
+) -> Result<u32> {
+    // The ucode fuse versions are contained in the FUSE_OPT_FPF_<ENGINE>_UCODE<X>_VERSION
+    // registers, which are an array. Our register definition macros do not allow us to manage them
+    // properly, so we need to hardcode their addresses for now.
+
+    // Each engine has 16 ucode version registers numbered from 1 to 16.
+    if ucode_id == 0 || ucode_id > 16 {
+        dev_warn!(dev, "invalid ucode id {:#x}\n", ucode_id);
+        return Err(EINVAL);
+    }
+    let reg_fuse = if engine_id_mask & 0x0001 != 0 {
+        // NV_FUSE_OPT_FPF_SEC2_UCODE1_VERSION
+        0x824140
+    } else if engine_id_mask & 0x0004 != 0 {
+        // NV_FUSE_OPT_FPF_NVDEC_UCODE1_VERSION
+        0x824100
+    } else if engine_id_mask & 0x0400 != 0 {
+        // NV_FUSE_OPT_FPF_GSP_UCODE1_VERSION
+        0x8241c0
+    } else {
+        dev_warn!(dev, "unexpected engine_id_mask {:#x}\n", engine_id_mask);
+        return Err(EINVAL);
+    } + ((ucode_id - 1) as usize * core::mem::size_of::<u32>());
+
+    let reg_fuse_version = bar.read32(reg_fuse);
+
+    // Equivalent of Find Last Set bit.
+    Ok(u32::BITS - reg_fuse_version.leading_zeros())
+}
+
+fn program_brom_ga102<E: FalconEngine>(bar: &Bar0, params: &FalconBromParams) -> Result<()> {
+    regs::NV_PFALCON2_FALCON_BROM_PARAADDR::default()
+        .set_value(params.pkc_data_offset)
+        .write(bar, E::BASE);
+    regs::NV_PFALCON2_FALCON_BROM_ENGIDMASK::default()
+        .set_value(params.engine_id_mask as u32)
+        .write(bar, E::BASE);
+    regs::NV_PFALCON2_FALCON_BROM_CURR_UCODE_ID::default()
+        .set_ucode_id(params.ucode_id)
+        .write(bar, E::BASE);
+    regs::NV_PFALCON2_FALCON_MOD_SEL::default()
+        .set_algo(FalconModSelAlgo::Rsa3k)
+        .write(bar, E::BASE);
+
+    Ok(())
+}
+
+pub(super) struct Ga102<E: FalconEngine>(PhantomData<E>);
+
+impl<E: FalconEngine> Ga102<E> {
+    pub(super) fn new() -> Self {
+        Self(PhantomData)
+    }
+}
+
+impl<E: FalconEngine> FalconHal<E> for Ga102<E> {
+    fn select_core(&self, _falcon: &Falcon<E>, bar: &Bar0) -> Result<()> {
+        select_core_ga102::<E>(bar)
+    }
+
+    fn get_signature_reg_fuse_version(
+        &self,
+        falcon: &Falcon<E>,
+        bar: &Bar0,
+        engine_id_mask: u16,
+        ucode_id: u8,
+    ) -> Result<u32> {
+        get_signature_reg_fuse_version_ga102(&falcon.dev, bar, engine_id_mask, ucode_id)
+    }
+
+    fn program_brom(
+        &self,
+        _falcon: &Falcon<E>,
+        bar: &Bar0,
+        params: &FalconBromParams,
+    ) -> Result<()> {
+        program_brom_ga102::<E>(bar, params)
+    }
+}
diff --git a/drivers/gpu/nova-core/falcon/sec2.rs b/drivers/gpu/nova-core/falcon/sec2.rs
new file mode 100644
index 0000000000000000000000000000000000000000..c1efdaa7c4e1b8c04c4e041aae3b61a8b65f656b
--- /dev/null
+++ b/drivers/gpu/nova-core/falcon/sec2.rs
@@ -0,0 +1,8 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use crate::falcon::FalconEngine;
+
+pub(crate) struct Sec2;
+impl FalconEngine for Sec2 {
+    const BASE: usize = 0x00840000;
+}
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index c338da69ecbc2200f1ef3061a4d62971b021e3eb..ece13594fba687f3f714e255b5436e72d80dece3 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -5,6 +5,7 @@
 use crate::devinit;
 use crate::dma::DmaObject;
 use crate::driver::Bar0;
+use crate::falcon::{gsp::Gsp, sec2::Sec2, Falcon};
 use crate::firmware::Firmware;
 use crate::regs;
 use crate::util;
@@ -227,6 +228,16 @@ pub(crate) fn new(
             page
         };
 
+        let gsp_falcon = Falcon::<Gsp>::new(
+            pdev.as_ref(),
+            spec.chipset,
+            bar,
+            spec.chipset > Chipset::GA100,
+        )?;
+        gsp_falcon.clear_swgen0_intr(bar);
+
+        let _sec2_falcon = Falcon::<Sec2>::new(pdev.as_ref(), spec.chipset, bar, true)?;
+
         Ok(pin_init!(Self {
             spec,
             bar: devres_bar,
diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs
index 1c7333e9fabe357f3ecbc6944ca98b66fa17c9a5..8342482a1aa16da2e69f7d99143c1549a82c969e 100644
--- a/drivers/gpu/nova-core/nova_core.rs
+++ b/drivers/gpu/nova-core/nova_core.rs
@@ -5,6 +5,7 @@
 mod devinit;
 mod dma;
 mod driver;
+mod falcon;
 mod firmware;
 mod gpu;
 mod regs;
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index 218cb6441eb0e5c6e5b52eabba006163eec0c8b4..b5c6eeb6ed873a06b4aefcb375f4944eb0b20597 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -9,6 +9,10 @@
 #[macro_use]
 mod macros;
 
+use crate::falcon::{
+    DmaTrfCmdSize, FalconCoreRev, FalconCoreRevSubversion, FalconFbifMemType, FalconFbifTarget,
+    FalconModSelAlgo, FalconSecurityModel, PeregrineCoreSelect,
+};
 use crate::gpu::{Architecture, Chipset};
 
 /* PMC */
@@ -59,3 +63,124 @@ pub(crate) fn chipset(self) -> Result<Chipset, Error> {
 register!(NV_PGC6_AON_SECURE_SCRATCH_GROUP_05 @ 0x00118234 {
     31:0    value as u32;
 });
+
+/* PFALCON */
+
+register!(NV_PFALCON_FALCON_IRQSCLR @ +0x00000004 {
+    4:4     halt as bool;
+    6:6     swgen0 as bool;
+});
+
+register!(NV_PFALCON_FALCON_MAILBOX0 @ +0x00000040 {
+    31:0    value as u32;
+});
+
+register!(NV_PFALCON_FALCON_MAILBOX1 @ +0x00000044 {
+    31:0    value as u32;
+});
+
+register!(NV_PFALCON_FALCON_RM @ +0x00000084 {
+    31:0    value as u32;
+});
+
+register!(NV_PFALCON_FALCON_HWCFG2 @ +0x000000f4 {
+    10:10   riscv as bool;
+    12:12   mem_scrubbing as bool;
+    31:31   reset_ready as bool, "Signal indicating that reset is completed (GA102+)";
+});
+
+register!(NV_PFALCON_FALCON_CPUCTL @ +0x00000100 {
+    1:1     startcpu as bool;
+    4:4     halted as bool;
+    6:6     alias_en as bool;
+});
+
+register!(NV_PFALCON_FALCON_BOOTVEC @ +0x00000104 {
+    31:0    value as u32;
+});
+
+register!(NV_PFALCON_FALCON_DMACTL @ +0x0000010c {
+    0:0     require_ctx as bool;
+    1:1     dmem_scrubbing as bool;
+    2:2     imem_scrubbing as bool;
+    6:3     dmaq_num as u8;
+    7:7     secure_stat as bool;
+});
+
+register!(NV_PFALCON_FALCON_DMATRFBASE @ +0x00000110 {
+    31:0    base as u32;
+});
+
+register!(NV_PFALCON_FALCON_DMATRFMOFFS @ +0x00000114 {
+    23:0    offs as u32;
+});
+
+register!(NV_PFALCON_FALCON_DMATRFCMD @ +0x00000118 {
+    0:0     full as bool;
+    1:1     idle as bool;
+    3:2     sec as u8;
+    4:4     imem as bool;
+    5:5     is_write as bool;
+    10:8    size as u8 ?=> DmaTrfCmdSize;
+    14:12   ctxdma as u8;
+    16:16   set_dmtag as u8;
+});
+
+register!(NV_PFALCON_FALCON_DMATRFFBOFFS @ +0x0000011c {
+    31:0    offs as u32;
+});
+
+register!(NV_PFALCON_FALCON_DMATRFBASE1 @ +0x00000128 {
+    8:0     base as u16;
+});
+
+register!(NV_PFALCON_FALCON_HWCFG1 @ +0x0000012c {
+    3:0     core_rev as u8 ?=> FalconCoreRev, "Core revision";
+    5:4     security_model as u8 ?=> FalconSecurityModel, "Security model";
+    7:6     core_rev_subversion as u8 ?=> FalconCoreRevSubversion, "Core revision subversion";
+});
+
+register!(NV_PFALCON_FALCON_CPUCTL_ALIAS @ +0x00000130 {
+    1:1     startcpu as bool;
+});
+
+// Actually known as `NV_PSEC_FALCON_ENGINE` and `NV_PGSP_FALCON_ENGINE` depending on the falcon
+// instance.
+register!(NV_PFALCON_FALCON_ENGINE @ +0x000003c0 {
+    0:0     reset as bool;
+});
+
+// TODO: this is an array of registers.
+register!(NV_PFALCON_FBIF_TRANSCFG @ +0x00000600 {
+    1:0     target as u8 ?=> FalconFbifTarget;
+    2:2     mem_type as bool => FalconFbifMemType;
+});
+
+register!(NV_PFALCON_FBIF_CTL @ +0x00000624 {
+    7:7     allow_phys_no_ctx as bool;
+});
+
+register!(NV_PFALCON2_FALCON_MOD_SEL @ +0x00001180 {
+    7:0     algo as u8 ?=> FalconModSelAlgo;
+});
+
+register!(NV_PFALCON2_FALCON_BROM_CURR_UCODE_ID @ +0x00001198 {
+    7:0     ucode_id as u8;
+});
+
+register!(NV_PFALCON2_FALCON_BROM_ENGIDMASK @ +0x0000119c {
+    31:0    value as u32;
+});
+
+// TODO: this is an array of registers.
+register!(NV_PFALCON2_FALCON_BROM_PARAADDR @ +0x00001210 {
+    31:0    value as u32;
+});
+
+/* PRISCV */
+
+register!(NV_PRISCV_RISCV_BCR_CTRL @ +0x00001668 {
+    0:0     valid as bool;
+    4:4     core_select as bool => PeregrineCoreSelect;
+    8:8     br_fetch as bool;
+});
diff --git a/drivers/gpu/nova-core/util.rs b/drivers/gpu/nova-core/util.rs
index afb525228431a2645afe7bb34988e9537757b1d7..81fcfff1f6f437d2f6a2130ce2249fbf4c1501be 100644
--- a/drivers/gpu/nova-core/util.rs
+++ b/drivers/gpu/nova-core/util.rs
@@ -34,7 +34,6 @@ pub(crate) const fn const_bytes_to_str(bytes: &[u8]) -> &str {
 ///
 /// TODO: replace with `read_poll_timeout` once it is available.
 /// (https://lore.kernel.org/lkml/20250220070611.214262-8-fujita.tomonori@gmail.com/)
-#[expect(dead_code)]
 pub(crate) fn wait_on<R, F: Fn() -> Option<R>>(timeout: Duration, cond: F) -> Result<R> {
     let start_time = Ktime::ktime_get();
 

-- 
2.49.0


^ permalink raw reply related	[flat|nested] 51+ messages in thread
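
Editor's note on the register definitions in the patch above: each field line such as `10:8    size as u8` names an inclusive `hi:lo` bit range of a 32-bit register. The following is a minimal userspace sketch of what extracting such a range means. It is not the implementation of the `register!` macro, and `extract_field` is a hypothetical helper introduced here for illustration only.

```rust
/// Extracts the inclusive bit field `hi:lo` (same notation as the field
/// lines in the register definitions) from a 32-bit register value.
///
/// Hypothetical helper for illustration; it is not part of nova-core.
fn extract_field(value: u32, hi: u32, lo: u32) -> u32 {
    assert!(lo <= hi && hi < 32, "invalid bit range");
    let width = hi - lo + 1;
    // A full-width field is special-cased: `1u32 << 32` would overflow.
    let mask = if width == 32 {
        u32::MAX
    } else {
        (1u32 << width) - 1
    };
    // Shift the field down to bit 0, then drop everything above it.
    (value >> lo) & mask
}
```

For instance, with a raw value of `0x0000_0600`, the `10:8` range holds the value 6, and a `31:0` range returns the register unchanged.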

* [PATCH v3 14/19] gpu: nova-core: firmware: add ucode descriptor used by FWSEC-FRTS
  2025-05-07 13:52 [PATCH v3 00/19] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization Alexandre Courbot
                   ` (12 preceding siblings ...)
  2025-05-07 13:52 ` [PATCH v3 13/19] gpu: nova-core: add falcon register definitions and base code Alexandre Courbot
@ 2025-05-07 13:52 ` Alexandre Courbot
  2025-05-07 13:52 ` [PATCH v3 15/19] rust: num: Add an upward alignment helper for usize Alexandre Courbot
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 51+ messages in thread
From: Alexandre Courbot @ 2025-05-07 13:52 UTC (permalink / raw)
  To: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet
  Cc: John Hubbard, Ben Skeggs, Joel Fernandes, Timur Tabi,
	Alistair Popple, linux-kernel, rust-for-linux, nouveau, dri-devel,
	Alexandre Courbot

FWSEC-FRTS is the first firmware we need to run on the GSP falcon in
order to initiate the GSP boot process. Introduce the structure that
describes it.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs | 43 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index cb79d039948858e657c9a23a62ed27ff780ac169..1eb216307cd01d975b3d5beda1dc516f34b4b3f2 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -45,6 +45,49 @@ pub(crate) fn new(
     }
 }
 
+/// Structure used to describe some firmware images, notably FWSEC-FRTS.
+#[repr(C)]
+#[derive(Debug, Clone)]
+pub(crate) struct FalconUCodeDescV3 {
+    /// Header defined by `NV_BIT_FALCON_UCODE_DESC_HEADER_VDESC*` in OpenRM.
+    ///
+    /// Bits `31:16` contain the size of the header, after which the actual ucode data starts.
+    hdr: u32,
+    /// Stored size of the ucode after the header.
+    stored_size: u32,
+    /// Offset in `DMEM` at which the signature is expected to be found.
+    pub(crate) pkc_data_offset: u32,
+    /// Offset after the code segment at which the app headers are located.
+    pub(crate) interface_offset: u32,
+    /// Base address at which to load the code segment into `IMEM`.
+    pub(crate) imem_phys_base: u32,
+    /// Size in bytes of the code to copy into `IMEM`.
+    pub(crate) imem_load_size: u32,
+    /// Virtual `IMEM` address (i.e. `tag`) at which the code should start.
+    pub(crate) imem_virt_base: u32,
+    /// Base address at which to load the data segment into `DMEM`.
+    pub(crate) dmem_phys_base: u32,
+    /// Size in bytes of the data to copy into `DMEM`.
+    pub(crate) dmem_load_size: u32,
+    /// Mask of the falcon engines on which this firmware can run.
+    pub(crate) engine_id_mask: u16,
+    /// ID of the ucode used to infer a fuse register to validate the signature.
+    pub(crate) ucode_id: u8,
+    /// Number of signatures in this firmware.
+    pub(crate) signature_count: u8,
+    /// Versions of the signatures, used to infer a valid signature to use.
+    pub(crate) signature_versions: u16,
+    _reserved: u16,
+}
+
+// To be removed once that code is used.
+#[expect(dead_code)]
+impl FalconUCodeDescV3 {
+    pub(crate) fn size(&self) -> usize {
+        ((self.hdr & 0xffff0000) >> 16) as usize
+    }
+}
+
 pub(crate) struct ModInfoBuilder<const N: usize>(firmware::ModInfoBuilder<N>);
 
 impl<const N: usize> ModInfoBuilder<N> {

-- 
2.49.0


^ permalink raw reply related	[flat|nested] 51+ messages in thread
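
Editor's note on the descriptor in the patch above: the doc comment states that bits `31:16` of `hdr` carry the header size, and `size()` extracts exactly that. The sketch below mirrors the field layout in a standalone program to show the arithmetic. `DescV3Mirror`, `desc_size` and `size_from_hdr` are names assumed for illustration, and the `hdr` value used in the usage note is hypothetical, not taken from a real VBIOS.

```rust
/// Standalone mirror of the `FalconUCodeDescV3` field layout from the patch,
/// used only to illustrate its size. Assumed name; not driver code.
#[allow(dead_code)]
#[repr(C)]
struct DescV3Mirror {
    hdr: u32,
    stored_size: u32,
    pkc_data_offset: u32,
    interface_offset: u32,
    imem_phys_base: u32,
    imem_load_size: u32,
    imem_virt_base: u32,
    dmem_phys_base: u32,
    dmem_load_size: u32,
    engine_id_mask: u16,
    ucode_id: u8,
    signature_count: u8,
    signature_versions: u16,
    _reserved: u16,
}

/// Size in bytes of the mirrored structure: nine `u32` fields (36 bytes)
/// followed by 8 bytes of smaller fields, with no padding under `repr(C)`.
fn desc_size() -> usize {
    core::mem::size_of::<DescV3Mirror>()
}

/// Same computation as `FalconUCodeDescV3::size()`: keep bits 31:16 of the
/// header word and shift them down to obtain the header size in bytes.
fn size_from_hdr(hdr: u32) -> usize {
    ((hdr & 0xffff_0000) >> 16) as usize
}
```

With a hypothetical header word of `0x002c_0001`, `size_from_hdr` yields `0x2c` (44), which happens to equal `desc_size()` for this layout. Whether real firmware reports that exact value is not something this sketch establishes.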

* [PATCH v3 15/19] rust: num: Add an upward alignment helper for usize
  2025-05-07 13:52 [PATCH v3 00/19] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization Alexandre Courbot
                   ` (13 preceding siblings ...)
  2025-05-07 13:52 ` [PATCH v3 14/19] gpu: nova-core: firmware: add ucode descriptor used by FWSEC-FRTS Alexandre Courbot
@ 2025-05-07 13:52 ` Alexandre Courbot
  2025-05-07 13:52 ` [PATCH v3 16/19] nova-core: Add support for VBIOS ucode extraction for boot Alexandre Courbot
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 51+ messages in thread
From: Alexandre Courbot @ 2025-05-07 13:52 UTC (permalink / raw)
  To: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet
  Cc: John Hubbard, Ben Skeggs, Joel Fernandes, Timur Tabi,
	Alistair Popple, linux-kernel, rust-for-linux, nouveau, dri-devel,
	Alexandre Courbot

From: Joel Fernandes <joelagnelf@nvidia.com>

This will be used in the nova-core driver where we need to upward-align
the image size to get to the next image in the VBIOS ROM.

[acourbot@nvidia.com: handled conflicts due to removal of patch creating
num.rs]

Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 rust/kernel/lib.rs |  1 +
 rust/kernel/num.rs | 44 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)

diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index ab0286857061d2de1be0279cbd2cd3490e5a48c3..be75b196aa7a29cf3eed7c902ed8fb98689bbb50 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -67,6 +67,7 @@
 pub mod miscdevice;
 #[cfg(CONFIG_NET)]
 pub mod net;
+pub mod num;
 pub mod of;
 pub mod page;
 #[cfg(CONFIG_PCI)]
diff --git a/rust/kernel/num.rs b/rust/kernel/num.rs
new file mode 100644
index 0000000000000000000000000000000000000000..7d69cbd957546ab75fcf677b851f102c3a5e5c39
--- /dev/null
+++ b/rust/kernel/num.rs
@@ -0,0 +1,44 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Numerical and binary utilities for primitive types.
+
+use core::ops::{Add, BitAnd, BitOr, Not, Sub};
+
+/// Marker trait for primitive unsigned integers, bundling the operations
+/// required by the alignment helpers in this module.
+pub trait Unsigned:
+    Copy
+    + BitAnd<Output = Self>
+    + BitOr<Output = Self>
+    + Not<Output = Self>
+    + Add<Output = Self>
+    + Sub<Output = Self>
+    + From<u8>
+{
+}
+
+macro_rules! unsigned_trait_impl {
+    ($($t:ty),+) => {
+        $(
+            impl Unsigned for $t {}
+        )+
+    };
+}
+unsigned_trait_impl!(usize, u8, u16, u32, u64, u128);
+
+/// Trait for unsigned integer alignment
+pub trait NumAlign {
+    /// Aligns `self` up to the next multiple of `alignment`, which must be a power of two.
+    fn align_up(self, alignment: Self) -> Self;
+}
+
+impl<T> NumAlign for T
+where
+    T: Unsigned,
+{
+    #[inline]
+    fn align_up(self, alignment: Self) -> Self {
+        let one = T::from(1u8);
+        (self + alignment - one) & !(alignment - one)
+    }
+}

-- 
2.49.0


^ permalink raw reply related	[flat|nested] 51+ messages in thread
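
Editor's note on the helper in the patch above: `align_up` relies on the usual mask trick, which is only valid when `alignment` is a non-zero power of two, and the addition can overflow for values close to the type's maximum. The sketch below reproduces the formula for `usize` in a standalone program and adds a checked variant. `checked_align_up` is an assumption made here for illustration; it is not part of the patch.

```rust
/// Same formula as the patch's `align_up`, specialised to `usize`.
/// Only meaningful when `alignment` is a non-zero power of two.
fn align_up(value: usize, alignment: usize) -> usize {
    (value + alignment - 1) & !(alignment - 1)
}

/// Hypothetical checked variant: returns `None` when `alignment` is not a
/// power of two (including zero) or when rounding up would overflow.
fn checked_align_up(value: usize, alignment: usize) -> Option<usize> {
    if !alignment.is_power_of_two() {
        return None;
    }
    let mask = alignment - 1;
    // `checked_add` catches the overflow that the plain formula would hit.
    value.checked_add(mask).map(|v| v & !mask)
}
```

An already aligned value such as `0xfe00` stays unchanged for an alignment of 512, while `0x201` rounds up to `0x400`.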

* [PATCH v3 16/19] nova-core: Add support for VBIOS ucode extraction for boot
  2025-05-07 13:52 [PATCH v3 00/19] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization Alexandre Courbot
                   ` (14 preceding siblings ...)
  2025-05-07 13:52 ` [PATCH v3 15/19] rust: num: Add an upward alignment helper for usize Alexandre Courbot
@ 2025-05-07 13:52 ` Alexandre Courbot
  2025-05-13 17:19   ` Danilo Krummrich
                     ` (2 more replies)
  2025-05-07 13:52 ` [PATCH v3 17/19] gpu: nova-core: compute layout of the FRTS region Alexandre Courbot
                   ` (3 subsequent siblings)
  19 siblings, 3 replies; 51+ messages in thread
From: Alexandre Courbot @ 2025-05-07 13:52 UTC (permalink / raw)
  To: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet
  Cc: John Hubbard, Ben Skeggs, Joel Fernandes, Timur Tabi,
	Alistair Popple, linux-kernel, rust-for-linux, nouveau, dri-devel,
	Alexandre Courbot, Shirish Baskaran

From: Joel Fernandes <joelagnelf@nvidia.com>

Add support for navigating and setting up the VBIOS ucode data required
for the GSP to boot. The main data extracted from the VBIOS is the
FWSEC-FRTS firmware, which runs on the GSP processor. This firmware runs
in high secure mode and sets up the WPR2 (write-protected region) before
the Booter runs on the SEC2 processor.

Also add log messages to show the BIOS images.

[102141.013287] NovaCore: Found BIOS image at offset 0x0, size: 0xfe00, type: PciAt
[102141.080692] NovaCore: Found BIOS image at offset 0xfe00, size: 0x14800, type: Efi
[102141.098443] NovaCore: Found BIOS image at offset 0x24600, size: 0x5600, type: FwSec
[102141.415095] NovaCore: Found BIOS image at offset 0x29c00, size: 0x60800, type: FwSec

Tested on my Ampere GA102 and boot is successful.
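
The offsets in the log above follow from the image sizes: each image starts where the previous one ends, rounded up to the next 512-byte boundary. A minimal userspace sketch of that walk is shown below. It is not the driver code; `image_offsets` and `BLOCK_SIZE` are names assumed for illustration, and the sizes in the usage note are simply the ones printed in the log.

```rust
/// Images are sized and aligned in 512-byte blocks.
const BLOCK_SIZE: usize = 512;

/// Given the size in bytes of each image, in ROM order, returns the offset
/// at which each image starts. Hypothetical helper for illustration.
fn image_offsets(sizes: &[usize]) -> Vec<usize> {
    let mut offsets = Vec::with_capacity(sizes.len());
    let mut current = 0usize;
    for &size in sizes {
        offsets.push(current);
        // Advance past this image, then round up to the next block boundary.
        current = (current + size + BLOCK_SIZE - 1) & !(BLOCK_SIZE - 1);
    }
    offsets
}
```

Feeding it the sizes `0xfe00`, `0x14800`, `0x5600` and `0x60800` gives the offsets `0x0`, `0xfe00`, `0x24600` and `0x29c00`, matching the four log lines.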

[applied changes by Alex Courbot for fwsec signatures]
[applied feedback from Alex Courbot and Timur Tabi]
[applied changes related to code reorg, prints etc from Danilo Krummrich]
[acourbot@nvidia.com: fix clippy warnings]
[acourbot@nvidia.com: remove now-unneeded Devres acquisition]
[acourbot@nvidia.com: fix read_more to read `len` bytes, not u32s]

Cc: Alexandre Courbot <acourbot@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Shirish Baskaran <sbaskaran@nvidia.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Timur Tabi <ttabi@nvidia.com>
Cc: Ben Skeggs <bskeggs@nvidia.com>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/nova-core/firmware.rs  |    2 -
 drivers/gpu/nova-core/gpu.rs       |    3 +
 drivers/gpu/nova-core/nova_core.rs |    1 +
 drivers/gpu/nova-core/vbios.rs     | 1147 ++++++++++++++++++++++++++++++++++++
 4 files changed, 1151 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 1eb216307cd01d975b3d5beda1dc516f34b4b3f2..960982174d834c7c66a47ecfb3a15bf47116b2c5 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -80,8 +80,6 @@ pub(crate) struct FalconUCodeDescV3 {
     _reserved: u16,
 }
 
-// To be removed once that code is used.
-#[expect(dead_code)]
 impl FalconUCodeDescV3 {
     pub(crate) fn size(&self) -> usize {
         ((self.hdr & 0xffff0000) >> 16) as usize
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index ece13594fba687f3f714e255b5436e72d80dece3..4bf7f72247e5320935a517270b5a0e1ec2becfec 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -9,6 +9,7 @@
 use crate::firmware::Firmware;
 use crate::regs;
 use crate::util;
+use crate::vbios::Vbios;
 use core::fmt;
 
 macro_rules! define_chipset {
@@ -238,6 +239,8 @@ pub(crate) fn new(
 
         let _sec2_falcon = Falcon::<Sec2>::new(pdev.as_ref(), spec.chipset, bar, true)?;
 
+        let _bios = Vbios::new(pdev, bar)?;
+
         Ok(pin_init!(Self {
             spec,
             bar: devres_bar,
diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs
index 8342482a1aa16da2e69f7d99143c1549a82c969e..ff6d0b40c18f36af4c7e2d5c839fdf77dba23321 100644
--- a/drivers/gpu/nova-core/nova_core.rs
+++ b/drivers/gpu/nova-core/nova_core.rs
@@ -10,6 +10,7 @@
 mod gpu;
 mod regs;
 mod util;
+mod vbios;
 
 kernel::module_pci_driver! {
     type: driver::NovaCore,
diff --git a/drivers/gpu/nova-core/vbios.rs b/drivers/gpu/nova-core/vbios.rs
new file mode 100644
index 0000000000000000000000000000000000000000..cd55d8dbf8e12d532f776d7544c7e5f2a865d6f8
--- /dev/null
+++ b/drivers/gpu/nova-core/vbios.rs
@@ -0,0 +1,1147 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! VBIOS extraction and parsing.
+
+// To be removed when all code is used.
+#![expect(dead_code)]
+
+use crate::driver::Bar0;
+use crate::firmware::FalconUCodeDescV3;
+use core::convert::TryFrom;
+use kernel::device;
+use kernel::error::Result;
+use kernel::num::NumAlign;
+use kernel::pci;
+use kernel::prelude::*;
+
+/// The offset of the VBIOS ROM in the BAR0 space.
+const ROM_OFFSET: usize = 0x300000;
+/// The maximum length of the VBIOS ROM to scan into.
+const BIOS_MAX_SCAN_LEN: usize = 0x100000;
+/// The size to read ahead when parsing initial BIOS image headers.
+const BIOS_READ_AHEAD_SIZE: usize = 1024;
+/// The bit in the last image indicator byte for the PCI Data Structure that
+/// indicates the last image. Bits 0-6 are reserved; bit 7 is the last-image bit.
+const LAST_IMAGE_BIT_MASK: u8 = 0x80;
+
+// PMU lookup table entry types. Used to locate PMU table entries
+// in the Fwsec image, corresponding to falcon ucodes.
+#[expect(dead_code)]
+const FALCON_UCODE_ENTRY_APPID_FIRMWARE_SEC_LIC: u8 = 0x05;
+#[expect(dead_code)]
+const FALCON_UCODE_ENTRY_APPID_FWSEC_DBG: u8 = 0x45;
+const FALCON_UCODE_ENTRY_APPID_FWSEC_PROD: u8 = 0x85;
+
+/// Vbios Reader for constructing the VBIOS data
+struct VbiosIterator<'a> {
+    pdev: &'a pci::Device,
+    bar0: &'a Bar0,
+    // VBIOS data vector: as BIOS images are scanned, their bytes are appended to
+    // this vector for reference or for copying into other data structures. It
+    // holds the entire contents of the VBIOS scanned so far and grows as the
+    // scan progresses. Its length tracks how much of the ROM has already been
+    // read, so that existing contents are never re-read and only the gap up to
+    // a newly requested offset is fetched.
+    data: KVec<u8>,
+    current_offset: usize, // Current offset for iterator
+    last_found: bool,      // Whether the last image has been found
+}
+
+impl<'a> VbiosIterator<'a> {
+    fn new(pdev: &'a pci::Device, bar0: &'a Bar0) -> Result<Self> {
+        Ok(Self {
+            pdev,
+            bar0,
+            data: KVec::new(),
+            current_offset: 0,
+            last_found: false,
+        })
+    }
+
+    /// Read bytes from the ROM at the current end of the data vector
+    fn read_more(&mut self, len: usize) -> Result {
+        let current_len = self.data.len();
+        let start = ROM_OFFSET + current_len;
+
+        // Ensure length is a multiple of 4 for 32-bit reads
+        if len % core::mem::size_of::<u32>() != 0 {
+            dev_err!(
+                self.pdev.as_ref(),
+                "VBIOS read length {} is not a multiple of 4\n",
+                len
+            );
+            return Err(EINVAL);
+        }
+
+        self.data.reserve(len, GFP_KERNEL)?;
+        // Read ROM data bytes and push directly to vector
+        for i in (0..len).step_by(core::mem::size_of::<u32>()) {
+            // Read 32-bit word from the VBIOS ROM
+            let word = self.bar0.try_read32(start + i)?;
+
+            // Convert the u32 to a 4 byte array and push each byte
+            word.to_ne_bytes()
+                .iter()
+                .try_for_each(|&b| self.data.push(b, GFP_KERNEL))?;
+        }
+
+        Ok(())
+    }
+
+    /// Read bytes at a specific offset, filling any gap
+    fn read_more_at_offset(&mut self, offset: usize, len: usize) -> Result {
+        if offset > BIOS_MAX_SCAN_LEN {
+            dev_err!(self.pdev.as_ref(), "Error: exceeded BIOS scan limit.\n");
+            return Err(EINVAL);
+        }
+
+        // If offset is beyond current data size, fill the gap first
+        let current_len = self.data.len();
+        let gap_bytes = offset.saturating_sub(current_len);
+
+        // Now read the requested bytes at the offset
+        self.read_more(gap_bytes + len)
+    }
+
+    /// Read a BIOS image at a specific offset and create a `BiosImage` from it.
+    /// `self.data` is extended as needed and a new `BiosImage` is returned.
+    /// `context` is a string describing the operation, used for error reporting.
+    fn read_bios_image_at_offset(
+        &mut self,
+        offset: usize,
+        len: usize,
+        context: &str,
+    ) -> Result<BiosImage> {
+        let data_len = self.data.len();
+        if offset + len > data_len {
+            self.read_more_at_offset(offset, len).inspect_err(|e| {
+                dev_err!(
+                    self.pdev.as_ref(),
+                    "Failed to read more at offset {:#x}: {:?}\n",
+                    offset,
+                    e
+                )
+            })?;
+        }
+
+        BiosImage::new(self.pdev, &self.data[offset..offset + len]).inspect_err(|err| {
+            dev_err!(
+                self.pdev.as_ref(),
+                "Failed to {} at offset {:#x}: {:?}\n",
+                context,
+                offset,
+                err
+            )
+        })
+    }
+}
+
+impl<'a> Iterator for VbiosIterator<'a> {
+    type Item = Result<BiosImage>;
+
+    /// Iterate over all VBIOS images until the last image is detected or offset
+    /// exceeds scan limit.
+    fn next(&mut self) -> Option<Self::Item> {
+        if self.last_found {
+            return None;
+        }
+
+        if self.current_offset > BIOS_MAX_SCAN_LEN {
+            dev_err!(
+                self.pdev.as_ref(),
+                "Error: exceeded BIOS scan limit, stopping scan\n"
+            );
+            return None;
+        }
+
+        // Parse image headers first to get image size
+        let image_size = match self
+            .read_bios_image_at_offset(
+                self.current_offset,
+                BIOS_READ_AHEAD_SIZE,
+                "parse initial BIOS image headers",
+            )
+            .and_then(|image| image.image_size_bytes())
+        {
+            Ok(size) => size,
+            Err(e) => return Some(Err(e)),
+        };
+
+        // Now create a new BiosImage with the full image data
+        let full_image = match self.read_bios_image_at_offset(
+            self.current_offset,
+            image_size,
+            "parse full BIOS image",
+        ) {
+            Ok(image) => image,
+            Err(e) => return Some(Err(e)),
+        };
+
+        self.last_found = full_image.is_last();
+
+        // Advance to next image (aligned to 512 bytes)
+        self.current_offset += image_size;
+        self.current_offset = self.current_offset.align_up(512);
+
+        Some(Ok(full_image))
+    }
+}
+
+pub(crate) struct Vbios {
+    pub fwsec_image: Option<FwSecBiosImage>,
+}
+
+impl Vbios {
+    /// Probe for VBIOS extraction
+    /// Once the VBIOS object is built, bar0 is not read for vbios purposes anymore.
+    pub(crate) fn new(pdev: &pci::Device, bar0: &Bar0) -> Result<Vbios> {
+        // Images to extract from iteration
+        let mut pci_at_image: Option<PciAtBiosImage> = None;
+        let mut first_fwsec_image: Option<FwSecBiosImage> = None;
+        let mut second_fwsec_image: Option<FwSecBiosImage> = None;
+
+        // Parse all VBIOS images in the ROM
+        for image_result in VbiosIterator::new(pdev, bar0)? {
+            let full_image = image_result?;
+
+            dev_info!(
+                pdev.as_ref(),
+                "Found BIOS image: size: {:#x}, type: {}, last: {}\n",
+                full_image.image_size_bytes()?,
+                full_image.image_type_str(),
+                full_image.is_last()
+            );
+
+            // Get references to images we will need after the loop, in order to
+            // set up the falcon data offset.
+            match full_image {
+                BiosImage::PciAt(image) => {
+                    pci_at_image = Some(image);
+                }
+                BiosImage::FwSec(image) => {
+                    if first_fwsec_image.is_none() {
+                        first_fwsec_image = Some(image);
+                    } else {
+                        second_fwsec_image = Some(image);
+                    }
+                }
+                // For now we don't need to handle these
+                BiosImage::Efi(_image) => {}
+                BiosImage::Nbsi(_image) => {}
+            }
+        }
+
+        // Using all the images, set up the falcon data pointer in Fwsec.
+        // We need mutable access here, so we handle the Option manually.
+        let final_fwsec_image = {
+            let mut second = second_fwsec_image; // Take ownership of the option
+
+            if let (Some(second), Some(first), Some(pci_at)) =
+                (second.as_mut(), first_fwsec_image, pci_at_image)
+            {
+                second
+                    .setup_falcon_data(pdev, &pci_at, &first)
+                    .inspect_err(|e| {
+                        dev_err!(pdev.as_ref(), "Falcon data setup failed: {:?}\n", e)
+                    })?;
+            } else {
+                dev_err!(
+                    pdev.as_ref(),
+                    "Missing required images for falcon data setup\n"
+                );
+                return Err(EINVAL);
+            }
+            second
+        };
+
+        Ok(Vbios {
+            fwsec_image: final_fwsec_image,
+        })
+    }
+
+    pub(crate) fn fwsec_header(&self, pdev: &device::Device) -> Result<&FalconUCodeDescV3> {
+        let image = self.fwsec_image.as_ref().ok_or(EINVAL)?;
+        image.fwsec_header(pdev)
+    }
+
+    pub(crate) fn fwsec_ucode(&self, pdev: &device::Device) -> Result<&[u8]> {
+        let image = self.fwsec_image.as_ref().ok_or(EINVAL)?;
+        image.fwsec_ucode(pdev, image.fwsec_header(pdev)?)
+    }
+
+    pub(crate) fn fwsec_sigs(&self, pdev: &device::Device) -> Result<&[u8]> {
+        let image = self.fwsec_image.as_ref().ok_or(EINVAL)?;
+        image.fwsec_sigs(pdev, image.fwsec_header(pdev)?)
+    }
+}
+
+/// PCI Data Structure as defined in PCI Firmware Specification
+#[derive(Debug, Clone)]
+#[repr(C)]
+struct PcirStruct {
+    /// PCI Data Structure signature ("PCIR" or "NPDS")
+    pub signature: [u8; 4],
+    /// PCI Vendor ID (e.g., 0x10DE for NVIDIA)
+    pub vendor_id: u16,
+    /// PCI Device ID
+    pub device_id: u16,
+    /// Device List Pointer
+    pub device_list_ptr: u16,
+    /// PCI Data Structure Length
+    pub pci_data_struct_len: u16,
+    /// PCI Data Structure Revision
+    pub pci_data_struct_rev: u8,
+    /// Class code (3 bytes, 0x03 for display controller)
+    pub class_code: [u8; 3],
+    /// Size of this image in 512-byte blocks
+    pub image_len: u16,
+    /// Revision Level of the Vendor's ROM
+    pub vendor_rom_rev: u16,
+    /// ROM image type (0x00 = PC-AT compatible, 0x03 = EFI, 0x70 = NBSI)
+    pub code_type: u8,
+    /// Last image indicator (0x00 = Not last image, 0x80 = Last image)
+    pub last_image: u8,
+    /// Maximum Run-time Image Length (units of 512 bytes)
+    pub max_runtime_image_len: u16,
+}
+
+impl PcirStruct {
+    fn new(pdev: &pci::Device, data: &[u8]) -> Result<Self> {
+        if data.len() < core::mem::size_of::<PcirStruct>() {
+            dev_err!(pdev.as_ref(), "Not enough data for PcirStruct\n");
+            return Err(EINVAL);
+        }
+
+        let mut signature = [0u8; 4];
+        signature.copy_from_slice(&data[0..4]);
+
+        // Signature should be "PCIR" (0x52494350) or "NPDS" (0x5344504e)
+        if &signature != b"PCIR" && &signature != b"NPDS" {
+            dev_err!(
+                pdev.as_ref(),
+                "Invalid signature for PcirStruct: {:?}\n",
+                signature
+            );
+            return Err(EINVAL);
+        }
+
+        let mut class_code = [0u8; 3];
+        class_code.copy_from_slice(&data[13..16]);
+
+        Ok(PcirStruct {
+            signature,
+            vendor_id: u16::from_le_bytes([data[4], data[5]]),
+            device_id: u16::from_le_bytes([data[6], data[7]]),
+            device_list_ptr: u16::from_le_bytes([data[8], data[9]]),
+            pci_data_struct_len: u16::from_le_bytes([data[10], data[11]]),
+            pci_data_struct_rev: data[12],
+            class_code,
+            image_len: u16::from_le_bytes([data[16], data[17]]),
+            vendor_rom_rev: u16::from_le_bytes([data[18], data[19]]),
+            code_type: data[20],
+            last_image: data[21],
+            max_runtime_image_len: u16::from_le_bytes([data[22], data[23]]),
+        })
+    }
+
+    /// Check if this is the last image in the ROM
+    fn is_last(&self) -> bool {
+        self.last_image & LAST_IMAGE_BIT_MASK != 0
+    }
+
+    /// Calculate image size in bytes
+    fn image_size_bytes(&self) -> Result<usize> {
+        if self.image_len > 0 {
+            // Image size is in 512-byte blocks
+            Ok(self.image_len as usize * 512)
+        } else {
+            Err(EINVAL)
+        }
+    }
+}
+
+/// BIOS Information Table (BIT) Header
+/// This is the head of the BIT table, that is used to locate the Falcon data.
+/// The BIT table (with its header) is in the PciAtBiosImage and the falcon data
+/// it is pointing to is in the FwSecBiosImage.
+#[derive(Debug, Clone, Copy)]
+#[expect(dead_code)]
+struct BitHeader {
+    /// 0h: BIT Header Identifier (BMP=0x7FFF/BIT=0xB8FF)
+    pub id: u16,
+    /// 2h: BIT Header Signature ("BIT\0")
+    pub signature: [u8; 4],
+    /// 6h: Binary Coded Decimal Version, ex: 0x0100 is 1.00.
+    pub bcd_version: u16,
+    /// 8h: Size of BIT Header (in bytes)
+    pub header_size: u8,
+    /// 9h: Size of BIT Tokens (in bytes)
+    pub token_size: u8,
+    /// Ah: Number of token entries that follow
+    pub token_entries: u8,
+    /// Bh: BIT Header Checksum
+    pub checksum: u8,
+}
+
+impl BitHeader {
+    fn new(data: &[u8]) -> Result<Self> {
+        if data.len() < 12 {
+            return Err(EINVAL);
+        }
+
+        let mut signature = [0u8; 4];
+        signature.copy_from_slice(&data[2..6]);
+
+        // Check header ID and signature
+        let id = u16::from_le_bytes([data[0], data[1]]);
+        if id != 0xB8FF || &signature != b"BIT\0" {
+            return Err(EINVAL);
+        }
+
+        Ok(BitHeader {
+            id,
+            signature,
+            bcd_version: u16::from_le_bytes([data[6], data[7]]),
+            header_size: data[8],
+            token_size: data[9],
+            token_entries: data[10],
+            checksum: data[11],
+        })
+    }
+}
+
+/// BIT Token Entry: Records in the BIT table that follow the BIT header
+#[derive(Debug, Clone, Copy)]
+#[expect(dead_code)]
+struct BitToken {
+    /// 00h: Token identifier
+    pub id: u8,
+    /// 01h: Version of the token data
+    pub data_version: u8,
+    /// 02h: Size of token data in bytes
+    pub data_size: u16,
+    /// 04h: Offset to the token data
+    pub data_offset: u16,
+}
+
+// Define the token ID for the Falcon data
+pub(in crate::vbios) const BIT_TOKEN_ID_FALCON_DATA: u8 = 0x70;
+
+impl BitToken {
+    /// Find a BIT token entry by BIT ID in a PciAtBiosImage
+    pub(in crate::vbios) fn from_id(image: &PciAtBiosImage, token_id: u8) -> Result<Self> {
+        let header = image.bit_header.as_ref().ok_or(EINVAL)?;
+
+        // Offset to the first token entry
+        let tokens_start = image.bit_offset.ok_or(EINVAL)? + header.header_size as usize;
+
+        for i in 0..header.token_entries as usize {
+            let entry_offset = tokens_start + (i * header.token_size as usize);
+
+            // Make sure we don't go out of bounds
+            if entry_offset + header.token_size as usize > image.base.data.len() {
+                return Err(EINVAL);
+            }
+
+            // Check if this token has the requested ID
+            if image.base.data[entry_offset] == token_id {
+                return Ok(BitToken {
+                    id: image.base.data[entry_offset],
+                    data_version: image.base.data[entry_offset + 1],
+                    data_size: u16::from_le_bytes([
+                        image.base.data[entry_offset + 2],
+                        image.base.data[entry_offset + 3],
+                    ]),
+                    data_offset: u16::from_le_bytes([
+                        image.base.data[entry_offset + 4],
+                        image.base.data[entry_offset + 5],
+                    ]),
+                });
+            }
+        }
+
+        // Token not found
+        Err(ENOENT)
+    }
+}
+
+/// PCI ROM Expansion Header as defined in PCI Firmware Specification.
+/// This header is at the beginning of every image in the set of
+/// images in the ROM. It contains a pointer to the PCI Data Structure
+/// which describes the image.
+/// For "NBSI" images (NoteBook System Information), the ROM
+/// header deviates from the standard and contains an offset to the
+/// NBSI image; however, we do not yet parse that in this module and keep
+/// it for future reference.
+#[derive(Debug, Clone, Copy)]
+#[expect(dead_code)]
+struct PciRomHeader {
+    /// 00h: Signature (0xAA55)
+    pub signature: u16,
+    /// 02h: Reserved bytes for processor architecture unique data (20 bytes)
+    pub reserved: [u8; 20],
+    /// 16h: NBSI Data Offset (NBSI-specific, offset from header to NBSI image)
+    pub nbsi_data_offset: Option<u16>,
+    /// 18h: Pointer to PCI Data Structure (offset from start of ROM image)
+    pub pci_data_struct_offset: u16,
+    /// 1Ah: Size of block (this is NBSI-specific)
+    pub size_of_block: Option<u32>,
+}
+
+impl PciRomHeader {
+    fn new(pdev: &pci::Device, data: &[u8]) -> Result<Self> {
+        if data.len() < 26 {
+            // Need at least 26 bytes to read pciDataStrucPtr (sizeOfBlock is optional)
+            return Err(EINVAL);
+        }
+
+        let signature = u16::from_le_bytes([data[0], data[1]]);
+
+        // Check for valid ROM signatures
+        match signature {
+            0xAA55 | 0xBB77 | 0x4E56 => {}
+            _ => {
+                dev_err!(pdev.as_ref(), "ROM signature unknown {:#x}\n", signature);
+                return Err(EINVAL);
+            }
+        }
+
+        // Read the pointer to the PCI Data Structure at offset 0x18
+        let pci_data_struct_ptr = u16::from_le_bytes([data[24], data[25]]);
+
+        // Try to read optional fields if enough data
+        let mut size_of_block = None;
+        let mut nbsi_data_offset = None;
+
+        if data.len() >= 30 {
+            // Read size_of_block at offset 0x1A
+            size_of_block = Some(
+                (data[29] as u32) << 24
+                    | (data[28] as u32) << 16
+                    | (data[27] as u32) << 8
+                    | (data[26] as u32),
+            );
+        }
+
+        // For NBSI images, try to read the nbsiDataOffset at offset 0x16
+        if data.len() >= 24 {
+            nbsi_data_offset = Some(u16::from_le_bytes([data[22], data[23]]));
+        }
+
+        Ok(PciRomHeader {
+            signature,
+            reserved: [0u8; 20],
+            pci_data_struct_offset: pci_data_struct_ptr,
+            size_of_block,
+            nbsi_data_offset,
+        })
+    }
+}
+
+/// NVIDIA PCI Data Extension Structure. This is similar to the
+/// PCI Data Structure, but is Nvidia-specific and is placed right after
+/// the PCI Data Structure. It contains some fields that are redundant
+/// with the PCI Data Structure, but are needed for traversing the
+/// BIOS images. It is expected to be present in all BIOS images except
+/// for NBSI images.
+#[derive(Debug, Clone)]
+#[expect(dead_code)]
+struct NpdeStruct {
+    /// 00h: Signature ("NPDE")
+    pub signature: [u8; 4],
+    /// 04h: NVIDIA PCI Data Extension Revision
+    pub npci_data_ext_rev: u16,
+    /// 06h: NVIDIA PCI Data Extension Length
+    pub npci_data_ext_len: u16,
+    /// 08h: Sub-image Length (in 512-byte units)
+    pub subimage_len: u16,
+    /// 0Ah: Last image indicator flag
+    pub last_image: u8,
+}
+
+impl NpdeStruct {
+    fn new(pdev: &pci::Device, data: &[u8]) -> Result<Self> {
+        if data.len() < 11 {
+            dev_err!(pdev.as_ref(), "Not enough data for NpdeStruct\n");
+            return Err(EINVAL);
+        }
+
+        let mut signature = [0u8; 4];
+        signature.copy_from_slice(&data[0..4]);
+
+        // Signature should be "NPDE" (0x4544504E)
+        if &signature != b"NPDE" {
+            dev_err!(
+                pdev.as_ref(),
+                "Invalid signature for NpdeStruct: {:?}\n",
+                signature
+            );
+            return Err(EINVAL);
+        }
+
+        Ok(NpdeStruct {
+            signature,
+            npci_data_ext_rev: u16::from_le_bytes([data[4], data[5]]),
+            npci_data_ext_len: u16::from_le_bytes([data[6], data[7]]),
+            subimage_len: u16::from_le_bytes([data[8], data[9]]),
+            last_image: data[10],
+        })
+    }
+
+    /// Check if this is the last image in the ROM
+    fn is_last(&self) -> bool {
+        self.last_image & LAST_IMAGE_BIT_MASK != 0
+    }
+
+    /// Calculate image size in bytes
+    fn image_size_bytes(&self) -> Result<usize> {
+        if self.subimage_len > 0 {
+            // Image size is in 512-byte blocks
+            Ok(self.subimage_len as usize * 512)
+        } else {
+            Err(EINVAL)
+        }
+    }
+
+    /// Try to find the NPDE in the data. The NPDE is located right after the PCIR.
+    fn find_in_data(
+        pdev: &pci::Device,
+        data: &[u8],
+        rom_header: &PciRomHeader,
+        pcir: &PcirStruct,
+    ) -> Option<Self> {
+        // Calculate the offset where NPDE might be located
+        // NPDE should be right after the PCIR structure, aligned to 16 bytes
+        let pcir_offset = rom_header.pci_data_struct_offset as usize;
+        let npde_start = (pcir_offset + pcir.pci_data_struct_len as usize + 0x0F) & !0x0F;
+
+        // Check if we have enough data
+        if npde_start + 11 > data.len() {
+            dev_err!(pdev.as_ref(), "Not enough data for NPDE\n");
+            return None;
+        }
+
+        // Try to create NPDE from the data
+        NpdeStruct::new(pdev, &data[npde_start..])
+            .inspect_err(|e| {
+                dev_err!(pdev.as_ref(), "Error creating NpdeStruct: {:?}\n", e);
+            })
+            .ok()
+    }
+}
+
+// Use a macro to implement BiosImage enum and methods. This avoids having to
+// repeat each enum type when implementing functions like base() in BiosImage.
+macro_rules! bios_image {
+    (
+        $($variant:ident $class:ident),* $(,)?
+    ) => {
+        // BiosImage enum with variants for each image type
+        enum BiosImage {
+            $($variant($class)),*
+        }
+
+        impl BiosImage {
+            /// Get a reference to the common BIOS image data regardless of type
+            fn base(&self) -> &BiosImageBase {
+                match self {
+                    $(Self::$variant(img) => &img.base),*
+                }
+            }
+
+            /// Returns a string representing the type of BIOS image
+            fn image_type_str(&self) -> &'static str {
+                match self {
+                    $(Self::$variant(_) => stringify!($variant)),*
+                }
+            }
+        }
+    }
+}
+
+impl BiosImage {
+    /// Check if this is the last image
+    fn is_last(&self) -> bool {
+        let base = self.base();
+
+        // For NBSI images (type == 0x70), return true as they're
+        // considered the last image
+        if matches!(self, Self::Nbsi(_)) {
+            return true;
+        }
+
+        // For other image types, check the NPDE first if available
+        if let Some(ref npde) = base.npde {
+            return npde.is_last();
+        }
+
+        // Otherwise, fall back to checking the PCIR last_image flag
+        base.pcir.is_last()
+    }
+
+    /// Get the image size in bytes
+    fn image_size_bytes(&self) -> Result<usize> {
+        let base = self.base();
+
+        // Prefer NPDE image size if available
+        if let Some(ref npde) = base.npde {
+            return npde.image_size_bytes();
+        }
+
+        // Otherwise, fall back to the PCIR image size
+        base.pcir.image_size_bytes()
+    }
+
+    /// Create a BiosImageBase from a byte slice and convert it to a BiosImage
+    /// which triggers the constructor of the specific BiosImage enum variant.
+    fn new(pdev: &pci::Device, data: &[u8]) -> Result<Self> {
+        let base = BiosImageBase::new(pdev, data)?;
+        let image = base.into_image().inspect_err(|e| {
+            dev_err!(pdev.as_ref(), "Failed to create BiosImage: {:?}\n", e);
+        })?;
+
+        image.image_size_bytes().inspect_err(|_| {
+            dev_err!(
+                pdev.as_ref(),
+                "Invalid image size computed during BiosImage creation\n"
+            )
+        })?;
+
+        Ok(image)
+    }
+}
+
+bios_image! {
+    PciAt PciAtBiosImage,   // PCI-AT compatible BIOS image
+    Efi EfiBiosImage,       // EFI (Extensible Firmware Interface)
+    Nbsi NbsiBiosImage,     // NBSI (NoteBook System Information)
+    FwSec FwSecBiosImage    // FWSEC (Firmware Security)
+}
+
+struct PciAtBiosImage {
+    base: BiosImageBase,
+    bit_header: Option<BitHeader>,
+    bit_offset: Option<usize>,
+}
+
+struct EfiBiosImage {
+    base: BiosImageBase,
+    // EFI-specific fields can be added here in the future.
+}
+
+struct NbsiBiosImage {
+    base: BiosImageBase,
+    // NBSI-specific fields can be added here in the future.
+}
+
+pub(crate) struct FwSecBiosImage {
+    base: BiosImageBase,
+    // FWSEC-specific fields
+    // The offset of the Falcon data from the start of Fwsec image
+    falcon_data_offset: Option<usize>,
+    // The PmuLookupTable starts at the offset of the falcon data pointer
+    pmu_lookup_table: Option<PmuLookupTable>,
+    // The offset of the Falcon ucode
+    falcon_ucode_offset: Option<usize>,
+}
+
+// Convert from BiosImageBase to BiosImage
+impl TryFrom<BiosImageBase> for BiosImage {
+    type Error = Error;
+
+    fn try_from(base: BiosImageBase) -> Result<Self> {
+        match base.pcir.code_type {
+            0x00 => Ok(BiosImage::PciAt(base.try_into()?)),
+            0x03 => Ok(BiosImage::Efi(EfiBiosImage { base })),
+            0x70 => Ok(BiosImage::Nbsi(NbsiBiosImage { base })),
+            0xE0 => Ok(BiosImage::FwSec(FwSecBiosImage {
+                base,
+                falcon_data_offset: None,
+                pmu_lookup_table: None,
+                falcon_ucode_offset: None,
+            })),
+            _ => Err(EINVAL),
+        }
+    }
+}
+
+/// BIOS Image structure containing the headers and fields that are
+/// common to all BIOS images. Each BiosImage type has a
+/// BiosImageBase type along with other image-specific fields.
+/// Note that Rust favors composition of types over inheritance.
+#[derive(Debug)]
+#[expect(dead_code)]
+struct BiosImageBase {
+    /// PCI ROM Expansion Header
+    rom_header: PciRomHeader,
+    /// PCI Data Structure
+    pcir: PcirStruct,
+    /// NVIDIA PCI Data Extension (optional)
+    npde: Option<NpdeStruct>,
+    /// Image data (includes ROM header and PCIR)
+    data: KVec<u8>,
+}
+
+impl BiosImageBase {
+    fn into_image(self) -> Result<BiosImage> {
+        BiosImage::try_from(self)
+    }
+
+    /// Creates a new BiosImageBase from raw byte data.
+    fn new(pdev: &pci::Device, data: &[u8]) -> Result<Self> {
+        // Ensure we have enough data for the ROM header
+        if data.len() < 26 {
+            dev_err!(pdev.as_ref(), "Not enough data for ROM header\n");
+            return Err(EINVAL);
+        }
+
+        // Parse the ROM header
+        let rom_header = PciRomHeader::new(pdev, &data[0..26])
+            .inspect_err(|e| dev_err!(pdev.as_ref(), "Failed to create PciRomHeader: {:?}\n", e))?;
+
+        // Get the PCI Data Structure using the pointer from the ROM header
+        let pcir_offset = rom_header.pci_data_struct_offset as usize;
+        let pcir_data = data
+            .get(pcir_offset..pcir_offset + core::mem::size_of::<PcirStruct>())
+            .ok_or(EINVAL)
+            .inspect_err(|_| {
+                dev_err!(
+                    pdev.as_ref(),
+                    "PCIR offset {:#x} out of bounds (data length: {})\n",
+                    pcir_offset,
+                    data.len()
+                );
+                dev_err!(
+                    pdev.as_ref(),
+                    "Consider reading more data for construction of BiosImage\n"
+                );
+            })?;
+
+        let pcir = PcirStruct::new(pdev, pcir_data)
+            .inspect_err(|e| dev_err!(pdev.as_ref(), "Failed to create PcirStruct: {:?}\n", e))?;
+
+        // Look for the NPDE structure (not expected in NBSI images, type 0x70)
+        let npde = NpdeStruct::find_in_data(pdev, data, &rom_header, &pcir);
+
+        // Create a copy of the data
+        let mut data_copy = KVec::new();
+        data_copy.extend_with(data.len(), 0, GFP_KERNEL)?;
+        data_copy.copy_from_slice(data);
+
+        Ok(BiosImageBase {
+            rom_header,
+            pcir,
+            npde,
+            data: data_copy,
+        })
+    }
+}
+
+/// The PciAt BIOS image is typically the first BIOS image type found in the
+/// BIOS image chain. It contains the BIT header and the BIT tokens.
+impl PciAtBiosImage {
+    /// Find a byte pattern in a slice
+    fn find_byte_pattern(haystack: &[u8], needle: &[u8]) -> Option<usize> {
+        haystack
+            .windows(needle.len())
+            .position(|window| window == needle)
+    }
+
+    /// Find the BIT header in the PciAtBiosImage
+    fn find_bit_header(data: &[u8]) -> Result<(BitHeader, usize)> {
+        let bit_pattern = [0xff, 0xb8, b'B', b'I', b'T', 0x00];
+        let bit_offset = Self::find_byte_pattern(data, &bit_pattern);
+        if bit_offset.is_none() {
+            return Err(EINVAL);
+        }
+
+        let bit_header = BitHeader::new(&data[bit_offset.ok_or(EINVAL)?..])?;
+        Ok((bit_header, bit_offset.ok_or(EINVAL)?))
+    }
+
+    /// Get a BIT token entry from the BIT table in the PciAtBiosImage
+    fn get_bit_token(&self, token_id: u8) -> Result<BitToken> {
+        BitToken::from_id(self, token_id)
+    }
+
+    /// Find the Falcon data pointer structure in the PciAtBiosImage
+    /// This is just a 4 byte structure that contains a pointer to the
+    /// Falcon data in the FWSEC image.
+    fn falcon_data_ptr(&self, pdev: &pci::Device) -> Result<u32> {
+        let token = self.get_bit_token(BIT_TOKEN_ID_FALCON_DATA)?;
+
+        // Make sure we don't go out of bounds
+        if token.data_offset as usize + 4 > self.base.data.len() {
+            return Err(EINVAL);
+        }
+
+        // read the 4 bytes at the offset specified in the token
+        let offset = token.data_offset as usize;
+        let bytes: [u8; 4] = self.base.data[offset..offset + 4].try_into().map_err(|_| {
+            dev_err!(pdev.as_ref(), "Failed to convert data slice to array\n");
+            EINVAL
+        })?;
+
+        let data_ptr = u32::from_le_bytes(bytes);
+
+        if (data_ptr as usize) < self.base.data.len() {
+            dev_err!(pdev.as_ref(), "Falcon data pointer out of bounds\n");
+            return Err(EINVAL);
+        }
+
+        Ok(data_ptr)
+    }
+}
+
+impl TryFrom<BiosImageBase> for PciAtBiosImage {
+    type Error = Error;
+
+    fn try_from(base: BiosImageBase) -> Result<Self> {
+        let data_slice = &base.data;
+        let (bit_header, bit_offset) = PciAtBiosImage::find_bit_header(data_slice)?;
+
+        Ok(PciAtBiosImage {
+            base,
+            bit_header: Some(bit_header),
+            bit_offset: Some(bit_offset),
+        })
+    }
+}
+
+/// The PmuLookupTableEntry structure is a single entry in the PmuLookupTable.
+/// See the PmuLookupTable description for more information.
+#[expect(dead_code)]
+struct PmuLookupTableEntry {
+    application_id: u8,
+    target_id: u8,
+    data: u32,
+}
+
+impl PmuLookupTableEntry {
+    fn new(data: &[u8]) -> Result<Self> {
+        if data.len() < 6 {
+            return Err(EINVAL);
+        }
+
+        Ok(PmuLookupTableEntry {
+            application_id: data[0],
+            target_id: data[1],
+            data: u32::from_le_bytes(data[2..6].try_into().map_err(|_| EINVAL)?),
+        })
+    }
+}
+
+/// The PmuLookupTable structure is used to find the PmuLookupTableEntry
+/// for a given application ID. The table of entries is pointed to by the falcon
+/// data pointer in the BIT table, and is used to locate the Falcon Ucode.
+#[expect(dead_code)]
+struct PmuLookupTable {
+    version: u8,
+    header_len: u8,
+    entry_len: u8,
+    entry_count: u8,
+    table_data: KVec<u8>,
+}
+
+impl PmuLookupTable {
+    fn new(data: &[u8]) -> Result<Self> {
+        if data.len() < 4 {
+            return Err(EINVAL);
+        }
+
+        let header_len = data[1] as usize;
+        let entry_len = data[2] as usize;
+        let entry_count = data[3] as usize;
+
+        let required_bytes = header_len + (entry_count * entry_len);
+
+        if data.len() < required_bytes {
+            return Err(EINVAL);
+        }
+
+        // Create a copy of only the table data
+        let mut table_data = KVec::new();
+
+        // "last_entry_bytes" is a debugging aid.
+        let mut last_entry_bytes: Option<KVec<u8>> = if cfg!(debug_assertions) {
+            Some(KVec::new())
+        } else {
+            None
+        };
+
+        for &byte in &data[header_len..required_bytes] {
+            table_data.push(byte, GFP_KERNEL)?;
+
+            if cfg!(debug_assertions) {
+                // Debugging (dumps the table data to dmesg):
+                if let Some(ref mut last_entry_bytes) = last_entry_bytes {
+                    last_entry_bytes.push(byte, GFP_KERNEL)?;
+
+                    let last_entry_bytes_len = last_entry_bytes.len();
+                    if last_entry_bytes_len == entry_len {
+                        pr_info!("Last entry bytes: {:02x?}\n", &last_entry_bytes[..]);
+                        *last_entry_bytes = KVec::new();
+                    }
+                }
+            }
+        }
+
+        Ok(PmuLookupTable {
+            version: data[0],
+            header_len: header_len as u8,
+            entry_len: entry_len as u8,
+            entry_count: entry_count as u8,
+            table_data,
+        })
+    }
+
+    fn lookup_index(&self, idx: u8) -> Result<PmuLookupTableEntry> {
+        if idx >= self.entry_count {
+            return Err(EINVAL);
+        }
+
+        let index = (idx as usize) * self.entry_len as usize;
+        PmuLookupTableEntry::new(&self.table_data[index..])
+    }
+
+    // find entry by type value
+    fn find_entry_by_type(&self, entry_type: u8) -> Result<PmuLookupTableEntry> {
+        for i in 0..self.entry_count {
+            let entry = self.lookup_index(i)?;
+            if entry.application_id == entry_type {
+                return Ok(entry);
+            }
+        }
+
+        Err(EINVAL)
+    }
+}
+
+/// The FwSecBiosImage structure contains the PMU table and the Falcon Ucode.
+/// The PMU table contains voltage/frequency tables as well as a pointer to the
+/// Falcon Ucode.
+impl FwSecBiosImage {
+    fn setup_falcon_data(
+        &mut self,
+        pdev: &pci::Device,
+        pci_at_image: &PciAtBiosImage,
+        first_fwsec_image: &FwSecBiosImage,
+    ) -> Result<()> {
+        let mut offset = pci_at_image.falcon_data_ptr(pdev)? as usize;
+
+        // The falcon data pointer assumes that the PciAt and FWSEC images
+        // are contiguous in memory. However, testing shows the EFI image sits in
+        // between them. So treat the pointer as an offset from the end of the
+        // PciAt image rather than from its start, and compensate for that here.
+        offset -= pci_at_image.base.data.len();
+
+        // The offset is now from the start of the first Fwsec image, however
+        // the offset points to a location in the second Fwsec image. Since
+        // the fwsec images are contiguous, subtract the length of the first Fwsec
+        // image from the offset to get the offset to the start of the second
+        // Fwsec image.
+        offset -= first_fwsec_image.base.data.len();
+
+        self.falcon_data_offset = Some(offset);
+
+        // The PmuLookupTable starts at the offset of the falcon data pointer
+        self.pmu_lookup_table = Some(PmuLookupTable::new(&self.base.data[offset..])?);
+
+        match self
+            .pmu_lookup_table
+            .as_ref()
+            .ok_or(EINVAL)?
+            .find_entry_by_type(FALCON_UCODE_ENTRY_APPID_FWSEC_PROD)
+        {
+            Ok(entry) => {
+                let mut ucode_offset = entry.data as usize;
+                ucode_offset -= pci_at_image.base.data.len();
+                ucode_offset -= first_fwsec_image.base.data.len();
+                self.falcon_ucode_offset = Some(ucode_offset);
+                if cfg!(debug_assertions) {
+                    // Print the v3_desc header for debugging
+                    let v3_desc = self.fwsec_header(pdev.as_ref())?;
+                    pr_info!("PmuLookupTableEntry v3_desc: {:#?}\n", v3_desc);
+                }
+            }
+            Err(e) => {
+                dev_err!(
+                    pdev.as_ref(),
+                    "PmuLookupTableEntry not found, error: {:?}\n",
+                    e
+                );
+            }
+        }
+        Ok(())
+    }
+
+    /// TODO: These were borrowed from the old code for integrating this module
+    /// with the outside world. They should be cleaned up and integrated properly.
+    ///
+    /// Get the FwSec header (FalconUCodeDescV3)
+    fn fwsec_header(&self, dev: &device::Device) -> Result<&FalconUCodeDescV3> {
+        // Get the falcon ucode offset that was found in setup_falcon_data
+        let falcon_ucode_offset = self.falcon_ucode_offset.ok_or(EINVAL)?;
+
+        // Make sure the offset is within the data bounds
+        if falcon_ucode_offset + core::mem::size_of::<FalconUCodeDescV3>() > self.base.data.len() {
+            dev_err!(dev, "fwsec-frts header not contained within BIOS bounds\n");
+            return Err(ERANGE);
+        }
+
+        // Read the first 4 bytes to get the version
+        let hdr_bytes: [u8; 4] = self.base.data[falcon_ucode_offset..falcon_ucode_offset + 4]
+            .try_into()
+            .map_err(|_| EINVAL)?;
+        let hdr = u32::from_le_bytes(hdr_bytes);
+        let ver = (hdr & 0xff00) >> 8;
+
+        if ver != 3 {
+            dev_err!(dev, "invalid fwsec firmware version\n");
+            return Err(EINVAL);
+        }
+
+        // Return a reference to the FalconUCodeDescV3 structure. SAFETY: we have checked that
+        // `falcon_ucode_offset + size_of::<FalconUCodeDescV3>()` is within the bounds of `data`.
+        Ok(unsafe {
+            &*(self.base.data.as_ptr().add(falcon_ucode_offset) as *const FalconUCodeDescV3)
+        })
+    }
+    /// Get the ucode data as a byte slice
+    fn fwsec_ucode(&self, dev: &device::Device, v3_desc: &FalconUCodeDescV3) -> Result<&[u8]> {
+        let falcon_ucode_offset = self.falcon_ucode_offset.ok_or(EINVAL)?;
+
+        // The ucode data follows the descriptor
+        let ucode_data_offset = falcon_ucode_offset + v3_desc.size();
+        let size = (v3_desc.imem_load_size + v3_desc.dmem_load_size) as usize;
+
+        // Get the data slice, checking bounds in a single operation
+        self.base
+            .data
+            .get(ucode_data_offset..ucode_data_offset + size)
+            .ok_or(ERANGE)
+            .inspect_err(|_| dev_err!(dev, "fwsec ucode data not contained within BIOS bounds\n"))
+    }
+
+    /// Get the signatures as a byte slice
+    fn fwsec_sigs(&self, dev: &device::Device, v3_desc: &FalconUCodeDescV3) -> Result<&[u8]> {
+        const SIG_SIZE: usize = 96 * 4;
+
+        let falcon_ucode_offset = self.falcon_ucode_offset.ok_or(EINVAL)?;
+
+        // The signatures data follows the descriptor
+        let sigs_data_offset = falcon_ucode_offset + core::mem::size_of::<FalconUCodeDescV3>();
+        let size = v3_desc.signature_count as usize * SIG_SIZE;
+
+        // Make sure the data is within bounds
+        if sigs_data_offset + size > self.base.data.len() {
+            dev_err!(
+                dev,
+                "fwsec signatures data not contained within BIOS bounds\n"
+            );
+            return Err(ERANGE);
+        }
+
+        Ok(&self.base.data[sigs_data_offset..sigs_data_offset + size])
+    }
+}
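As a reading aid for the PMU lookup table handling above, here is a minimal userspace sketch of the same walk: read the 4-byte table header, then scan fixed-size entries for a given application ID. It is not the driver code. `Entry`, `parse_entry` and `find_entry` are hypothetical names, the table is a plain byte slice instead of a `KVec`, and errors are collapsed into `Option`. It assumes 6-byte entries (application ID, target ID, little-endian 32-bit data), as read by `PmuLookupTableEntry::new`.

```rust
/// One lookup table entry: application ID, target ID, 32-bit data pointer.
#[derive(Debug, PartialEq)]
struct Entry {
    application_id: u8,
    target_id: u8,
    data: u32,
}

/// Parse one entry from the start of `bytes`; needs at least 6 bytes.
fn parse_entry(bytes: &[u8]) -> Option<Entry> {
    let b = bytes.get(..6)?;
    Some(Entry {
        application_id: b[0],
        target_id: b[1],
        data: u32::from_le_bytes([b[2], b[3], b[4], b[5]]),
    })
}

/// Walk the table (header: version, header_len, entry_len, entry_count)
/// and return the first entry whose application ID matches `app_id`.
fn find_entry(table: &[u8], app_id: u8) -> Option<Entry> {
    let header_len = usize::from(*table.get(1)?);
    let entry_len = usize::from(*table.get(2)?);
    let entry_count = usize::from(*table.get(3)?);

    for i in 0..entry_count {
        // A truncated table makes either lookup fail and ends the search.
        let entry = parse_entry(table.get(header_len + i * entry_len..)?)?;
        if entry.application_id == app_id {
            return Some(entry);
        }
    }
    None
}
```

Using checked slice accessors (`get`) for every read means a malformed table can only produce `None`, never an out-of-bounds panic.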

-- 
2.49.0



* [PATCH v3 17/19] gpu: nova-core: compute layout of the FRTS region
  2025-05-07 13:52 [PATCH v3 00/19] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization Alexandre Courbot
                   ` (15 preceding siblings ...)
  2025-05-07 13:52 ` [PATCH v3 16/19] nova-core: Add support for VBIOS ucode extraction for boot Alexandre Courbot
@ 2025-05-07 13:52 ` Alexandre Courbot
  2025-05-13 16:41   ` Danilo Krummrich
  2025-05-07 13:52 ` [PATCH v3 18/19] gpu: nova-core: extract FWSEC from BIOS and patch it to run FWSEC-FRTS Alexandre Courbot
                   ` (2 subsequent siblings)
  19 siblings, 1 reply; 51+ messages in thread
From: Alexandre Courbot @ 2025-05-07 13:52 UTC (permalink / raw)
  To: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet
  Cc: John Hubbard, Ben Skeggs, Joel Fernandes, Timur Tabi,
	Alistair Popple, linux-kernel, rust-for-linux, nouveau, dri-devel,
	Alexandre Courbot

FWSEC-FRTS is run with the desired address of the FRTS region as a
parameter, which we need to compute from some hardware parameters.

Do this in a `FbLayout` structure, which will later be extended to
describe more memory regions used to boot the GSP.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
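For readers who want to check the arithmetic, here is a small standalone sketch of the FRTS placement described above: the region is 1 MiB long and ends at the start of the VGA workspace rounded down to a 128 KiB boundary. The constants mirror the ones in this patch; `frts_range` is a hypothetical helper name used only for illustration and takes the VGA workspace start directly instead of reading registers.

```rust
use std::ops::Range;

/// Round `value` down to a multiple of `align`, which must be a power of two.
fn align_down(value: u64, align: u64) -> u64 {
    value & !(align - 1)
}

/// Compute the FRTS range located just below the VGA workspace.
fn frts_range(vga_workspace_start: u64) -> Range<u64> {
    const FRTS_DOWN_ALIGN: u64 = 0x20000;
    const FRTS_SIZE: u64 = 0x100000;

    // Align the workspace start down first, then step back by the FRTS size.
    let frts_base = align_down(vga_workspace_start, FRTS_DOWN_ALIGN) - FRTS_SIZE;

    frts_base..frts_base + FRTS_SIZE
}
```

With an assumed 8 GiB framebuffer whose VGA workspace starts at 0x1_fff0_0000, `frts_range` yields 0x1_ffe0_0000..0x1_fff0_0000.
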
 drivers/gpu/nova-core/gpu.rs       |   4 ++
 drivers/gpu/nova-core/gsp.rs       |   3 ++
 drivers/gpu/nova-core/gsp/fb.rs    | 108 +++++++++++++++++++++++++++++++++++++
 drivers/gpu/nova-core/nova_core.rs |   1 +
 drivers/gpu/nova-core/regs.rs      |  27 ++++++++++
 5 files changed, 143 insertions(+)

diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 4bf7f72247e5320935a517270b5a0e1ec2becfec..a3d96639706e808305cce66416778d2bf6e7e683 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -7,6 +7,7 @@
 use crate::driver::Bar0;
 use crate::falcon::{gsp::Gsp, sec2::Sec2, Falcon};
 use crate::firmware::Firmware;
+use crate::gsp::fb::FbLayout;
 use crate::regs;
 use crate::util;
 use crate::vbios::Vbios;
@@ -239,6 +240,9 @@ pub(crate) fn new(
 
         let _sec2_falcon = Falcon::<Sec2>::new(pdev.as_ref(), spec.chipset, bar, true)?;
 
+        let fb_layout = FbLayout::new(spec.chipset, bar)?;
+        dev_dbg!(pdev.as_ref(), "{:#x?}\n", fb_layout);
+
         let _bios = Vbios::new(pdev, bar)?;
 
         Ok(pin_init!(Self {
diff --git a/drivers/gpu/nova-core/gsp.rs b/drivers/gpu/nova-core/gsp.rs
new file mode 100644
index 0000000000000000000000000000000000000000..27616a9d2b7069b18661fc97811fa1cac285b8f8
--- /dev/null
+++ b/drivers/gpu/nova-core/gsp.rs
@@ -0,0 +1,3 @@
+// SPDX-License-Identifier: GPL-2.0
+
+pub(crate) mod fb;
diff --git a/drivers/gpu/nova-core/gsp/fb.rs b/drivers/gpu/nova-core/gsp/fb.rs
new file mode 100644
index 0000000000000000000000000000000000000000..f28ded59469d52daf39e5d19c09efd7bf08fee92
--- /dev/null
+++ b/drivers/gpu/nova-core/gsp/fb.rs
@@ -0,0 +1,108 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use core::ops::Range;
+
+use kernel::prelude::*;
+
+use crate::driver::Bar0;
+use crate::gpu::Chipset;
+use crate::regs;
+
+fn align_down(value: u64, align: u64) -> u64 {
+    value & !(align - 1)
+}
+
+/// Layout of the GPU framebuffer memory.
+///
+/// Contains ranges of GPU memory reserved for a given purpose during the GSP bootup process.
+#[derive(Debug)]
+#[expect(dead_code)]
+pub(crate) struct FbLayout {
+    pub fb: Range<u64>,
+
+    pub vga_workspace: Range<u64>,
+    pub bios: Range<u64>,
+
+    pub frts: Range<u64>,
+}
+
+impl FbLayout {
+    pub(crate) fn new(chipset: Chipset, bar: &Bar0) -> Result<Self> {
+        let fb = {
+            let fb_size = vidmem_size(bar, chipset);
+
+            0..fb_size
+        };
+        let fb_len = fb.end - fb.start;
+
+        let vga_workspace = {
+            let vga_base = vga_workspace_addr(bar, fb_len, chipset);
+
+            vga_base..fb.end
+        };
+
+        let bios = vga_workspace.clone();
+
+        let frts = {
+            const FRTS_DOWN_ALIGN: u64 = 0x20000;
+            const FRTS_SIZE: u64 = 0x100000;
+            let frts_base = align_down(vga_workspace.start, FRTS_DOWN_ALIGN) - FRTS_SIZE;
+
+            frts_base..frts_base + FRTS_SIZE
+        };
+
+        Ok(Self {
+            fb,
+            vga_workspace,
+            bios,
+            frts,
+        })
+    }
+}
+
+/// Returns `true` if the display is disabled.
+fn display_disabled(bar: &Bar0, chipset: Chipset) -> bool {
+    if chipset >= Chipset::GA100 {
+        regs::NV_FUSE_STATUS_OPT_DISPLAY_AMPERE::read(bar).display_disabled()
+    } else {
+        regs::NV_FUSE_STATUS_OPT_DISPLAY_MAXWELL::read(bar).display_disabled()
+    }
+}
+
+/// Returns the video memory size in bytes.
+fn vidmem_size(bar: &Bar0, chipset: Chipset) -> u64 {
+    if chipset >= Chipset::GA102 {
+        (regs::NV_PGC6_AON_SECURE_SCRATCH_GROUP_42::read(bar).value() as u64) << 20
+    } else {
+        let local_mem_range = regs::NV_PFB_PRI_MMU_LOCAL_MEMORY_RANGE::read(bar);
+        let size =
+            (local_mem_range.lower_mag() as u64) << ((local_mem_range.lower_scale() as u64) + 20);
+
+        if local_mem_range.ecc_mode_enabled() {
+            size / 16 * 15
+        } else {
+            size
+        }
+    }
+}
+
+/// Returns the vga workspace address.
+fn vga_workspace_addr(bar: &Bar0, fb_size: u64, chipset: Chipset) -> u64 {
+    let base = fb_size - 0x100000;
+    let vga_workspace_base = if !display_disabled(bar, chipset) {
+        regs::NV_PDISP_VGA_WORKSPACE_BASE::read(bar)
+    } else {
+        return base;
+    };
+
+    if !vga_workspace_base.status_valid() {
+        return base;
+    }
+
+    let addr = (vga_workspace_base.addr() as u64) << 16;
+    if addr < base {
+        fb_size - 0x20000
+    } else {
+        addr
+    }
+}
diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs
index ff6d0b40c18f36af4c7e2d5c839fdf77dba23321..202e978e56f024de3ae8b178e65b63c2cea244e1 100644
--- a/drivers/gpu/nova-core/nova_core.rs
+++ b/drivers/gpu/nova-core/nova_core.rs
@@ -8,6 +8,7 @@
 mod falcon;
 mod firmware;
 mod gpu;
+mod gsp;
 mod regs;
 mod util;
 mod vbios;
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index b5c6eeb6ed873a06b4aefcb375f4944eb0b20597..15ec9b7e69694ff198b5353d562fc1aff5eefd3f 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -53,6 +53,12 @@ pub(crate) fn chipset(self) -> Result<Chipset, Error> {
     23:0    adr_63_40 as u32;
 });
 
+register!(NV_PFB_PRI_MMU_LOCAL_MEMORY_RANGE @ 0x00100ce0 {
+    3:0     lower_scale as u8;
+    9:4     lower_mag as u8;
+    30:30   ecc_mode_enabled as bool;
+});
+
 /* PGC6 */
 
 register!(NV_PGC6_AON_SECURE_SCRATCH_GROUP_05_PRIV_LEVEL_MASK @ 0x00118128 {
@@ -64,6 +70,27 @@ pub(crate) fn chipset(self) -> Result<Chipset, Error> {
     31:0    value as u32;
 });
 
+register!(NV_PGC6_AON_SECURE_SCRATCH_GROUP_42 @ 0x001183a4 {
+    31:0    value as u32;
+});
+
+/* PDISP */
+
+register!(NV_PDISP_VGA_WORKSPACE_BASE @ 0x00625f04 {
+    3:3     status_valid as bool;
+    31:8    addr as u32;
+});
+
+/* FUSE */
+
+register!(NV_FUSE_STATUS_OPT_DISPLAY_MAXWELL @ 0x00021c04 {
+    0:0     display_disabled as bool;
+});
+
+register!(NV_FUSE_STATUS_OPT_DISPLAY_AMPERE @ 0x00820c04 {
+    0:0     display_disabled as bool;
+});
+
 /* PFALCON */
 
 register!(NV_PFALCON_FALCON_IRQSCLR @ +0x00000004 {

-- 
2.49.0



* [PATCH v3 18/19] gpu: nova-core: extract FWSEC from BIOS and patch it to run FWSEC-FRTS
  2025-05-07 13:52 [PATCH v3 00/19] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization Alexandre Courbot
                   ` (16 preceding siblings ...)
  2025-05-07 13:52 ` [PATCH v3 17/19] gpu: nova-core: compute layout of the FRTS region Alexandre Courbot
@ 2025-05-07 13:52 ` Alexandre Courbot
  2025-05-14 16:38   ` Danilo Krummrich
  2025-05-07 13:52 ` [PATCH v3 19/19] gpu: nova-core: load and " Alexandre Courbot
  2025-05-13 13:10 ` [PATCH v3 00/19] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization Danilo Krummrich
  19 siblings, 1 reply; 51+ messages in thread
From: Alexandre Courbot @ 2025-05-07 13:52 UTC (permalink / raw)
  To: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet
  Cc: John Hubbard, Ben Skeggs, Joel Fernandes, Timur Tabi,
	Alistair Popple, linux-kernel, rust-for-linux, nouveau, dri-devel,
	Alexandre Courbot

The FWSEC firmware needs to be extracted from the VBIOS and patched with
the desired command, as well as the right signature. Do this so we are
ready to load this firmware into the GSP falcon and run it to create the
FRTS region.

[joelagnelf@nvidia.com: give better names to FalconAppifHdrV1's fields]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
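To illustrate the signature patching step mentioned above, here is a minimal sketch of the same operation on an ordinary byte slice. It is not the driver code: the real `patch_signature()` writes into a `DmaObject` through a raw pointer, whereas this version assumes the image is available as `&mut [u8]` and uses a hypothetical `OutOfBounds` error type in place of `EINVAL`.

```rust
/// Error returned when the signature would not fit inside the image.
#[derive(Debug, PartialEq)]
struct OutOfBounds;

/// Copy `signature` into `ucode` at byte offset `sig_base_img`.
fn patch_signature(
    ucode: &mut [u8],
    signature: &[u8],
    sig_base_img: usize,
) -> Result<(), OutOfBounds> {
    // Reject offsets whose end position overflows `usize`.
    let end = sig_base_img
        .checked_add(signature.len())
        .ok_or(OutOfBounds)?;

    // `get_mut` performs the bounds check and hands back the destination.
    let dst = ucode.get_mut(sig_base_img..end).ok_or(OutOfBounds)?;
    dst.copy_from_slice(signature);

    Ok(())
}
```

The slice-based version needs no `unsafe`, since the bounds check and the copy both go through checked slice APIs.
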
 drivers/gpu/nova-core/dma.rs            |   3 -
 drivers/gpu/nova-core/firmware.rs       |  18 ++
 drivers/gpu/nova-core/firmware/fwsec.rs | 359 ++++++++++++++++++++++++++++++++
 drivers/gpu/nova-core/gpu.rs            |  20 +-
 drivers/gpu/nova-core/vbios.rs          |   3 -
 5 files changed, 395 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/nova-core/dma.rs b/drivers/gpu/nova-core/dma.rs
index 9d90ae01d0044eaab4ddbc3eba216741d7a623ef..a12d0dff574aa38fb5eb8f4d759611af2f8ba3ec 100644
--- a/drivers/gpu/nova-core/dma.rs
+++ b/drivers/gpu/nova-core/dma.rs
@@ -2,9 +2,6 @@
 
 //! Simple DMA object wrapper.
 
-// To be removed when all code is used.
-#![expect(dead_code)]
-
 use core::ops::{Deref, DerefMut};
 
 use kernel::device;
diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 960982174d834c7c66a47ecfb3a15bf47116b2c5..3945fd18499555ddd6fb2e0ea69535b40fcc4b08 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -8,9 +8,12 @@
 use kernel::prelude::*;
 use kernel::str::CString;
 
+use crate::dma::DmaObject;
 use crate::gpu;
 use crate::gpu::Chipset;
 
+pub(crate) mod fwsec;
+
 pub(crate) const FIRMWARE_VERSION: &str = "535.113.01";
 
 /// Structure encapsulating the firmware blobs required for the GPU to operate.
@@ -86,6 +89,21 @@ pub(crate) fn size(&self) -> usize {
     }
 }
 
+/// Patch the `ucode_dma` firmware at offset `sig_base_img` with `signature`.
+fn patch_signature(ucode_dma: &mut DmaObject, signature: &[u8], sig_base_img: usize) -> Result<()> {
+    if sig_base_img + signature.len() > ucode_dma.size() {
+        return Err(EINVAL);
+    }
+
+    // SAFETY: we are the only user of this object, so there cannot be any race.
+    let dst = unsafe { ucode_dma.start_ptr_mut().add(sig_base_img) };
+
+    // SAFETY: `signature` and `dst` are valid, properly aligned, and do not overlap.
+    unsafe { core::ptr::copy_nonoverlapping(signature.as_ptr(), dst, signature.len()) };
+
+    Ok(())
+}
+
 pub(crate) struct ModInfoBuilder<const N: usize>(firmware::ModInfoBuilder<N>);
 
 impl<const N: usize> ModInfoBuilder<N> {
diff --git a/drivers/gpu/nova-core/firmware/fwsec.rs b/drivers/gpu/nova-core/firmware/fwsec.rs
new file mode 100644
index 0000000000000000000000000000000000000000..d7e3418b49ab4ad478e73c0d447a832ed68ecd0a
--- /dev/null
+++ b/drivers/gpu/nova-core/firmware/fwsec.rs
@@ -0,0 +1,359 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! FWSEC is a High Secure firmware that is extracted from the BIOS and performs the first step of
+//! the GSP startup by creating the WPR2 memory region and copying critical areas of the VBIOS into
+//! it after authenticating them, ensuring they haven't been tampered with. It runs on the GSP
+//! falcon.
+//!
+//! Before being run, it needs to be patched in two areas:
+//!
+//! - The command to be run, as this firmware can perform several tasks;
+//! - The ucode signature, so the GSP falcon can run FWSEC in HS mode.
+
+use core::alloc::Layout;
+
+use kernel::bindings;
+use kernel::device::{self, Device};
+use kernel::prelude::*;
+use kernel::transmute::FromBytes;
+
+use crate::dma::DmaObject;
+use crate::driver::Bar0;
+use crate::falcon::gsp::Gsp;
+use crate::falcon::{Falcon, FalconBromParams, FalconFirmware, FalconLoadTarget};
+use crate::firmware::FalconUCodeDescV3;
+use crate::vbios::Vbios;
+
+const NVFW_FALCON_APPIF_ID_DMEMMAPPER: u32 = 0x4;
+
+#[repr(C)]
+#[derive(Debug)]
+struct FalconAppifHdrV1 {
+    version: u8,
+    header_size: u8,
+    entry_size: u8,
+    entry_count: u8,
+}
+// SAFETY: any byte sequence is valid for this struct.
+unsafe impl FromBytes for FalconAppifHdrV1 {}
+
+#[repr(C, packed)]
+#[derive(Debug)]
+struct FalconAppifV1 {
+    id: u32,
+    dmem_base: u32,
+}
+// SAFETY: any byte sequence is valid for this struct.
+unsafe impl FromBytes for FalconAppifV1 {}
+
+#[derive(Debug)]
+#[repr(C, packed)]
+struct FalconAppifDmemmapperV3 {
+    signature: u32,
+    version: u16,
+    size: u16,
+    cmd_in_buffer_offset: u32,
+    cmd_in_buffer_size: u32,
+    cmd_out_buffer_offset: u32,
+    cmd_out_buffer_size: u32,
+    nvf_img_data_buffer_offset: u32,
+    nvf_img_data_buffer_size: u32,
+    printf_buffer_hdr: u32,
+    ucode_build_time_stamp: u32,
+    ucode_signature: u32,
+    init_cmd: u32,
+    ucode_feature: u32,
+    ucode_cmd_mask0: u32,
+    ucode_cmd_mask1: u32,
+    multi_tgt_tbl: u32,
+}
+// SAFETY: any byte sequence is valid for this struct.
+unsafe impl FromBytes for FalconAppifDmemmapperV3 {}
+
+#[derive(Debug)]
+#[repr(C, packed)]
+struct ReadVbios {
+    ver: u32,
+    hdr: u32,
+    addr: u64,
+    size: u32,
+    flags: u32,
+}
+// SAFETY: any byte sequence is valid for this struct.
+unsafe impl FromBytes for ReadVbios {}
+
+#[derive(Debug)]
+#[repr(C, packed)]
+struct FrtsRegion {
+    ver: u32,
+    hdr: u32,
+    addr: u32,
+    size: u32,
+    ftype: u32,
+}
+// SAFETY: any byte sequence is valid for this struct.
+unsafe impl FromBytes for FrtsRegion {}
+
+const NVFW_FRTS_CMD_REGION_TYPE_FB: u32 = 2;
+
+#[repr(C, packed)]
+struct FrtsCmd {
+    read_vbios: ReadVbios,
+    frts_region: FrtsRegion,
+}
+// SAFETY: any byte sequence is valid for this struct.
+unsafe impl FromBytes for FrtsCmd {}
+
+const NVFW_FALCON_APPIF_DMEMMAPPER_CMD_FRTS: u32 = 0x15;
+const NVFW_FALCON_APPIF_DMEMMAPPER_CMD_SB: u32 = 0x19;
+
+/// Command for the [`FwsecFirmware`] to execute.
+pub(crate) enum FwsecCommand {
+    /// Asks [`FwsecFirmware`] to carve out the WPR2 area and place a verified copy of the VBIOS
+    /// image into it.
+    Frts { frts_addr: u64, frts_size: u64 },
+    /// Asks [`FwsecFirmware`] to load pre-OS apps on the PMU.
+    #[expect(dead_code)]
+    Sb,
+}
+
+/// Reinterpret the area starting from `offset` in `fw` as an instance of `T` (which must implement
+/// [`FromBytes`]) and return a reference to it.
+///
+/// # Safety
+///
+/// Callers must ensure that the region of memory returned is not written for as long as the
+/// returned reference is alive.
+///
+/// TODO: Remove this and `transmute_mut` once we have a way to transmute objects implementing
+/// FromBytes, e.g.:
+/// https://lore.kernel.org/lkml/20250330234039.29814-1-christiansantoslima21@gmail.com/
+unsafe fn transmute<'a, 'b, T: Sized + FromBytes>(
+    fw: &'a DmaObject,
+    offset: usize,
+) -> Result<&'b T> {
+    if offset + core::mem::size_of::<T>() > fw.size() {
+        return Err(EINVAL);
+    }
+    if (fw.start_ptr() as usize + offset) % core::mem::align_of::<T>() != 0 {
+        return Err(EINVAL);
+    }
+
+    // SAFETY: we have checked that the pointer is properly aligned and that its pointed memory
+    // is large enough to contain an instance of `T`, which implements `FromBytes`.
+    Ok(unsafe { &*(fw.start_ptr().add(offset) as *const T) })
+}
+
+/// Reinterpret the area starting from `offset` in `fw` as a mutable instance of `T` (which must
+/// implement [`FromBytes`]) and return a reference to it.
+///
+/// # Safety
+///
+/// Callers must ensure that the region of memory returned is not read or written for as long as
+/// the returned reference is alive.
+unsafe fn transmute_mut<'a, 'b, T: Sized + FromBytes>(
+    fw: &'a mut DmaObject,
+    offset: usize,
+) -> Result<&'b mut T> {
+    if offset + core::mem::size_of::<T>() > fw.size() {
+        return Err(EINVAL);
+    }
+    if (fw.start_ptr_mut() as usize + offset) % core::mem::align_of::<T>() != 0 {
+        return Err(EINVAL);
+    }
+
+    // SAFETY: we have checked that the pointer is properly aligned and that its pointed memory
+    // is large enough to contain an instance of `T`, which implements `FromBytes`.
+    Ok(unsafe { &mut *(fw.start_ptr_mut().add(offset) as *mut T) })
+}
+
+/// Patch the Fwsec firmware image in `fw` to run the command `cmd`.
+fn patch_command(fw: &mut DmaObject, v3_desc: &FalconUCodeDescV3, cmd: FwsecCommand) -> Result<()> {
+    let hdr_offset = (v3_desc.imem_load_size + v3_desc.interface_offset) as usize;
+    // SAFETY: we have an exclusive reference to `fw`, and no caller should have shared `fw` with
+    // the hardware yet.
+    let hdr: &FalconAppifHdrV1 = unsafe { transmute(fw, hdr_offset) }?;
+
+    if hdr.version != 1 {
+        return Err(EINVAL);
+    }
+
+    // Find the DMEM mapper section in the firmware.
+    for i in 0..hdr.entry_count as usize {
+        let app: &FalconAppifV1 =
+            // SAFETY: we have an exclusive reference to `fw`, and no caller should have shared
+            // `fw` with the hardware yet.
+            unsafe {
+                transmute(
+                    fw,
+                    hdr_offset + hdr.header_size as usize + i * hdr.entry_size as usize
+                )
+            }?;
+
+        if app.id != NVFW_FALCON_APPIF_ID_DMEMMAPPER {
+            continue;
+        }
+
+        let dmem_mapper: &mut FalconAppifDmemmapperV3 =
+            // SAFETY: we have an exclusive reference to `fw`, and no caller should have shared
+            // `fw` with the hardware yet.
+            unsafe { transmute_mut(fw, (v3_desc.imem_load_size + app.dmem_base) as usize) }?;
+
+        // SAFETY: we have an exclusive reference to `fw`, and no caller should have shared `fw`
+        // with the hardware yet.
+        let frts_cmd: &mut FrtsCmd = unsafe {
+            transmute_mut(
+                fw,
+                (v3_desc.imem_load_size + dmem_mapper.cmd_in_buffer_offset) as usize,
+            )
+        }?;
+
+        frts_cmd.read_vbios = ReadVbios {
+            ver: 1,
+            hdr: core::mem::size_of::<ReadVbios>() as u32,
+            addr: 0,
+            size: 0,
+            flags: 2,
+        };
+
+        dmem_mapper.init_cmd = match cmd {
+            FwsecCommand::Frts {
+                frts_addr,
+                frts_size,
+            } => {
+                frts_cmd.frts_region = FrtsRegion {
+                    ver: 1,
+                    hdr: core::mem::size_of::<FrtsRegion>() as u32,
+                    addr: (frts_addr >> 12) as u32,
+                    size: (frts_size >> 12) as u32,
+                    ftype: NVFW_FRTS_CMD_REGION_TYPE_FB,
+                };
+
+                NVFW_FALCON_APPIF_DMEMMAPPER_CMD_FRTS
+            }
+            FwsecCommand::Sb => NVFW_FALCON_APPIF_DMEMMAPPER_CMD_SB,
+        };
+
+        // Return early as we found and patched the DMEMMAPPER region.
+        return Ok(());
+    }
+
+    Err(ENOTSUPP)
+}
+
+/// Firmware extracted from the VBIOS and responsible for e.g. carving out the WPR2 region as the
+/// first step of the GSP bootflow.
+pub(crate) struct FwsecFirmware {
+    desc: FalconUCodeDescV3,
+    ucode: DmaObject,
+}
+
+impl FalconFirmware for FwsecFirmware {
+    type Target = Gsp;
+
+    fn dma_handle(&self) -> bindings::dma_addr_t {
+        self.ucode.dma_handle()
+    }
+
+    fn imem_load(&self) -> FalconLoadTarget {
+        FalconLoadTarget {
+            src_start: 0,
+            dst_start: self.desc.imem_phys_base,
+            len: self.desc.imem_load_size,
+        }
+    }
+
+    fn dmem_load(&self) -> FalconLoadTarget {
+        FalconLoadTarget {
+            src_start: self.desc.imem_load_size,
+            dst_start: self.desc.dmem_phys_base,
+            len: Layout::from_size_align(self.desc.dmem_load_size as usize, 256)
+                // Cannot panic, as 256 is non-zero and a power of 2.
+                .unwrap()
+                .pad_to_align()
+                .size() as u32,
+        }
+    }
+
+    fn brom_params(&self) -> FalconBromParams {
+        FalconBromParams {
+            pkc_data_offset: self.desc.pkc_data_offset,
+            engine_id_mask: self.desc.engine_id_mask,
+            ucode_id: self.desc.ucode_id,
+        }
+    }
+
+    fn boot_addr(&self) -> u32 {
+        0
+    }
+}
+
+impl FwsecFirmware {
+    /// Extract the Fwsec firmware from `bios` and patch it to run with the `cmd` command.
+    pub(crate) fn new(
+        falcon: &Falcon<Gsp>,
+        dev: &Device<device::Bound>,
+        bar: &Bar0,
+        bios: &Vbios,
+        cmd: FwsecCommand,
+    ) -> Result<Self> {
+        let v3_desc = bios.fwsec_header(dev)?;
+        let ucode = bios.fwsec_ucode(dev)?;
+
+        let mut ucode_dma = DmaObject::from_data(dev, ucode)?;
+        patch_command(&mut ucode_dma, v3_desc, cmd)?;
+
+        const SIG_SIZE: usize = 96 * 4;
+        let signatures = bios.fwsec_sigs(dev)?;
+        let sig_base_img = (v3_desc.imem_load_size + v3_desc.pkc_data_offset) as usize;
+
+        if v3_desc.signature_count != 0 {
+            // Patch signature.
+            let desc_sig_versions = v3_desc.signature_versions as u32;
+            let reg_fuse_version = falcon.get_signature_reg_fuse_version(
+                bar,
+                v3_desc.engine_id_mask,
+                v3_desc.ucode_id,
+            )?;
+            dev_dbg!(
+                dev,
+                "desc_sig_versions: {:#x}, reg_fuse_version: {}\n",
+                desc_sig_versions,
+                reg_fuse_version
+            );
+            let signature_idx = {
+                let reg_fuse_version_bit = 1 << reg_fuse_version;
+
+                // Check if the fuse version is supported by the firmware.
+                if desc_sig_versions & reg_fuse_version_bit == 0 {
+                    dev_warn!(
+                        dev,
+                        "no matching signature: {:#x} {:#x}\n",
+                        reg_fuse_version_bit,
+                        v3_desc.signature_versions
+                    );
+                    return Err(EINVAL);
+                }
+
+                // `desc_sig_versions` has one bit set per included signature. Thus, the index of
+                // the signature to patch is the number of bits in `desc_sig_versions` set to `1`
+                // before `reg_fuse_version_bit`.
+
+                // Mask of the bits of `desc_sig_versions` to preserve.
+                let reg_fuse_version_mask = reg_fuse_version_bit.wrapping_sub(1);
+
+                (desc_sig_versions & reg_fuse_version_mask).count_ones()
+            };
+
+            dev_dbg!(dev, "patching signature with index {}\n", signature_idx);
+            let signature_start = signature_idx as usize * SIG_SIZE;
+            let signature = &signatures[signature_start..signature_start + SIG_SIZE];
+            super::patch_signature(&mut ucode_dma, signature, sig_base_img)?;
+        }
+
+        Ok(FwsecFirmware {
+            desc: v3_desc.clone(),
+            ucode: ucode_dma,
+        })
+    }
+}
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index a3d96639706e808305cce66416778d2bf6e7e683..ad71402dbfced2aaa84800a32dbac276b9706803 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -6,6 +6,7 @@
 use crate::dma::DmaObject;
 use crate::driver::Bar0;
 use crate::falcon::{gsp::Gsp, sec2::Sec2, Falcon};
+use crate::firmware::fwsec::{FwsecCommand, FwsecFirmware};
 use crate::firmware::Firmware;
 use crate::gsp::fb::FbLayout;
 use crate::regs;
@@ -197,7 +198,11 @@ pub(crate) fn new(
     ) -> Result<impl PinInit<Self>> {
         let bar = devres_bar.access(pdev.as_ref())?;
         let spec = Spec::new(bar)?;
-        let fw = Firmware::new(pdev.as_ref(), spec.chipset, "535.113.01")?;
+        let fw = Firmware::new(
+            pdev.as_ref(),
+            spec.chipset,
+            crate::firmware::FIRMWARE_VERSION,
+        )?;
 
         dev_info!(
             pdev.as_ref(),
@@ -243,7 +248,18 @@ pub(crate) fn new(
         let fb_layout = FbLayout::new(spec.chipset, bar)?;
         dev_dbg!(pdev.as_ref(), "{:#x?}\n", fb_layout);
 
-        let _bios = Vbios::new(pdev, bar)?;
+        let bios = Vbios::new(pdev, bar)?;
+
+        let _fwsec_frts = FwsecFirmware::new(
+            &gsp_falcon,
+            pdev.as_ref(),
+            bar,
+            &bios,
+            FwsecCommand::Frts {
+                frts_addr: fb_layout.frts.start,
+                frts_size: fb_layout.frts.end - fb_layout.frts.start,
+            },
+        )?;
 
         Ok(pin_init!(Self {
             spec,
diff --git a/drivers/gpu/nova-core/vbios.rs b/drivers/gpu/nova-core/vbios.rs
index cd55d8dbf8e12d532f776d7544c7e5f2a865d6f8..71846dbbd512b6996d9f3a3c76aecde9d62460c0 100644
--- a/drivers/gpu/nova-core/vbios.rs
+++ b/drivers/gpu/nova-core/vbios.rs
@@ -2,9 +2,6 @@
 
 //! VBIOS extraction and parsing.
 
-// To be removed when all code is used.
-#![expect(dead_code)]
-
 use crate::driver::Bar0;
 use crate::firmware::FalconUCodeDescV3;
 use core::convert::TryFrom;

-- 
2.49.0
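
For illustration only, the signature selection done in `FwsecFirmware::new()`
can be exercised outside the kernel. The following is a standalone sketch, not
the driver code: `signature_index` and `signature_slice` are hypothetical
helper names, and the use of `checked_shl` and `get` (instead of a plain shift
and slice indexing) is an assumption about how one might avoid a panic on
unexpected inputs.

```rust
/// Size in bytes of one signature, as in the patch (96 32-bit words).
const SIG_SIZE: usize = 96 * 4;

/// Returns the index of the signature matching `fuse_version`, or `None` if
/// `sig_versions` has no bit set for that version (or the version is >= 32).
fn signature_index(sig_versions: u32, fuse_version: u32) -> Option<u32> {
    // `checked_shl` rejects shifts of 32 or more instead of overflowing.
    let bit = 1u32.checked_shl(fuse_version)?;
    if sig_versions & bit == 0 {
        return None;
    }
    // One bit is set per included signature, so the index is the number of
    // set bits below `bit`. `bit` is at least 1, so `bit - 1` cannot underflow.
    Some((sig_versions & (bit - 1)).count_ones())
}

/// Returns the `SIG_SIZE`-byte signature at index `idx`, or `None` if the
/// signature table is too short to contain it.
fn signature_slice(signatures: &[u8], idx: u32) -> Option<&[u8]> {
    let start = (idx as usize).checked_mul(SIG_SIZE)?;
    signatures.get(start..start.checked_add(SIG_SIZE)?)
}
```

With a version mask of `0b1101` (versions 0, 2 and 3 present), fuse version 3
selects index 2, while fuse version 1 has no matching signature.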


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v3 19/19] gpu: nova-core: load and run FWSEC-FRTS
  2025-05-07 13:52 [PATCH v3 00/19] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization Alexandre Courbot
                   ` (17 preceding siblings ...)
  2025-05-07 13:52 ` [PATCH v3 18/19] gpu: nova-core: extract FWSEC from BIOS and patch it to run FWSEC-FRTS Alexandre Courbot
@ 2025-05-07 13:52 ` Alexandre Courbot
  2025-05-14 16:42   ` Danilo Krummrich
  2025-05-13 13:10 ` [PATCH v3 00/19] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization Danilo Krummrich
  19 siblings, 1 reply; 51+ messages in thread
From: Alexandre Courbot @ 2025-05-07 13:52 UTC (permalink / raw)
  To: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet
  Cc: John Hubbard, Ben Skeggs, Joel Fernandes, Timur Tabi,
	Alistair Popple, linux-kernel, rust-for-linux, nouveau, dri-devel,
	Alexandre Courbot

With all the required pieces in place, load FWSEC-FRTS onto the GSP
falcon, run it, and check that it successfully carved the WPR2
region out of framebuffer memory.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/nova-core/falcon.rs |  3 ---
 drivers/gpu/nova-core/gpu.rs    | 57 ++++++++++++++++++++++++++++++++++++++++-
 drivers/gpu/nova-core/regs.rs   | 15 +++++++++++
 3 files changed, 71 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/nova-core/falcon.rs b/drivers/gpu/nova-core/falcon.rs
index b6552496ad9c5f8eeda4beaa757c7393f495072b..810a7b4ef0d55fb7bd4bfb7833d5da7df091203c 100644
--- a/drivers/gpu/nova-core/falcon.rs
+++ b/drivers/gpu/nova-core/falcon.rs
@@ -2,9 +2,6 @@
 
 //! Falcon microprocessor base support
 
-// To be removed when all code is used.
-#![expect(dead_code)]
-
 use core::time::Duration;
 use hal::FalconHal;
 use kernel::bindings;
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index ad71402dbfced2aaa84800a32dbac276b9706803..896986333b76d1192c07093289eddab98fe3f26d 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -250,7 +250,7 @@ pub(crate) fn new(
 
         let bios = Vbios::new(pdev, bar)?;
 
-        let _fwsec_frts = FwsecFirmware::new(
+        let fwsec_frts = FwsecFirmware::new(
             &gsp_falcon,
             pdev.as_ref(),
             bar,
@@ -261,6 +261,61 @@ pub(crate) fn new(
             },
         )?;
 
+        // Check that the WPR2 region does not already exist - if it does, the GPU needs to be
+        // reset.
+        if regs::NV_PFB_PRI_MMU_WPR2_ADDR_HI::read(bar).hi_val() != 0 {
+            dev_err!(
+                pdev.as_ref(),
+                "WPR2 region already exists - GPU needs to be reset to proceed\n"
+            );
+            return Err(EBUSY);
+        }
+
+        // Reset falcon, load FWSEC-FRTS, and run it.
+        gsp_falcon.reset(bar)?;
+        gsp_falcon.dma_load(bar, &fwsec_frts)?;
+        let (mbox0, _) = gsp_falcon.boot(bar, Some(0), None)?;
+        if mbox0 != 0 {
+            dev_err!(pdev.as_ref(), "FWSEC firmware returned error {}\n", mbox0);
+            return Err(EINVAL);
+        }
+
+        // SCRATCH_0E contains FWSEC-FRTS' error code, if any.
+        let frts_status = regs::NV_PBUS_SW_SCRATCH_0E::read(bar).frts_err_code();
+        if frts_status != 0 {
+            dev_err!(
+                pdev.as_ref(),
+                "FWSEC-FRTS returned with error code {:#x}\n",
+                frts_status
+            );
+            return Err(EINVAL);
+        }
+
+        // Check the WPR2 has been created as we requested.
+        let (wpr2_lo, wpr2_hi) = (
+            (regs::NV_PFB_PRI_MMU_WPR2_ADDR_LO::read(bar).lo_val() as u64) << 12,
+            (regs::NV_PFB_PRI_MMU_WPR2_ADDR_HI::read(bar).hi_val() as u64) << 12,
+        );
+        if wpr2_hi == 0 {
+            dev_err!(
+                pdev.as_ref(),
+                "WPR2 region not created after running FWSEC-FRTS\n"
+            );
+
+            return Err(ENOTTY);
+        } else if wpr2_lo != fb_layout.frts.start {
+            dev_err!(
+                pdev.as_ref(),
+                "WPR2 region created at unexpected address {:#x} ; expected {:#x}\n",
+                wpr2_lo,
+                fb_layout.frts.start,
+            );
+            return Err(EINVAL);
+        }
+
+        dev_info!(pdev.as_ref(), "WPR2: {:#x}-{:#x}\n", wpr2_lo, wpr2_hi);
+        dev_info!(pdev.as_ref(), "GPU instance built\n");
+
         Ok(pin_init!(Self {
             spec,
             bar: devres_bar,
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index 15ec9b7e69694ff198b5353d562fc1aff5eefd3f..3acec36f2d5701af4a752808e86d71d5f200359b 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -43,6 +43,13 @@ pub(crate) fn chipset(self) -> Result<Chipset, Error> {
     }
 }
 
+/* PBUS */
+
+// TODO: this is an array of registers.
+register!(NV_PBUS_SW_SCRATCH_0E@0x00001438  {
+    31:16   frts_err_code as u16;
+});
+
 /* PFB */
 
 register!(NV_PFB_NISO_FLUSH_SYSMEM_ADDR @ 0x00100c10 {
@@ -59,6 +66,14 @@ pub(crate) fn chipset(self) -> Result<Chipset, Error> {
     30:30   ecc_mode_enabled as bool;
 });
 
+register!(NV_PFB_PRI_MMU_WPR2_ADDR_LO@0x001fa824  {
+    31:4    lo_val as u32;
+});
+
+register!(NV_PFB_PRI_MMU_WPR2_ADDR_HI@0x001fa828  {
+    31:4    hi_val as u32;
+});
+
 /* PGC6 */
 
 register!(NV_PGC6_AON_SECURE_SCRATCH_GROUP_05_PRIV_LEVEL_MASK @ 0x00118128 {

-- 
2.49.0
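
As a standalone illustration of the WPR2 check in `Gpu::new()`, the sketch
below decodes the two register values and validates them. It is not the driver
code: `wpr2_addr`, `check_wpr2` and `Wpr2Error` are hypothetical names, and it
assumes that a `31:4` field is extracted by shifting the raw register value
right by 4 bits before the 4KiB-granularity shift by 12 is applied.

```rust
/// Errors mirroring the two failure cases checked after running FWSEC-FRTS.
#[derive(Debug, PartialEq)]
enum Wpr2Error {
    /// The upper bound is still zero: no WPR2 region exists.
    NotCreated,
    /// The region starts somewhere other than the requested FRTS address.
    UnexpectedStart { found: u64, expected: u64 },
}

/// Converts a raw WPR2_ADDR_LO/HI register value into a byte address.
fn wpr2_addr(raw: u32) -> u64 {
    // Bits 31:4 hold the address in units of 4KiB (assumed field extraction).
    let field = (raw >> 4) as u64;
    field << 12
}

/// Returns the `(start, end)` byte addresses of WPR2 if it exists and starts
/// at `expected_start`.
fn check_wpr2(raw_lo: u32, raw_hi: u32, expected_start: u64) -> Result<(u64, u64), Wpr2Error> {
    let (lo, hi) = (wpr2_addr(raw_lo), wpr2_addr(raw_hi));
    if hi == 0 {
        return Err(Wpr2Error::NotCreated);
    }
    if lo != expected_start {
        return Err(Wpr2Error::UnexpectedStart { found: lo, expected: expected_start });
    }
    Ok((lo, hi))
}
```

Under these assumptions, raw values of `0x00ffc000` and `0x00ffce00` decode to
the `0xffc00000-0xffce0000` range shown in the cover letter.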


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH v3 01/19] rust: dma: expose the count and size of CoherentAllocation
  2025-05-07 13:52 ` [PATCH v3 01/19] rust: dma: expose the count and size of CoherentAllocation Alexandre Courbot
@ 2025-05-13 12:15   ` Danilo Krummrich
  0 siblings, 0 replies; 51+ messages in thread
From: Danilo Krummrich @ 2025-05-13 12:15 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, David Airlie, Simona Vetter, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, Jonathan Corbet, John Hubbard,
	Ben Skeggs, Joel Fernandes, Timur Tabi, Alistair Popple,
	linux-kernel, rust-for-linux, nouveau, dri-devel

On Wed, May 07, 2025 at 10:52:28PM +0900, Alexandre Courbot wrote:
> These properties are very useful to have and should be accessible.
> 
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> ---
>  rust/kernel/dma.rs | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
> index 605e01e35715667f93297fd9ec49d8e7032e0910..18602d771054fceb80c29278b1945254312ed7c6 100644
> --- a/rust/kernel/dma.rs
> +++ b/rust/kernel/dma.rs
> @@ -201,6 +201,20 @@ pub fn alloc_coherent(
>          CoherentAllocation::alloc_attrs(dev, count, gfp_flags, Attrs(0))
>      }
>  
> +    /// Returns the number of elements `T` in this allocation.
> +    ///
> +    /// Note that this is not the size of the allocation in bytes, which is provided by
> +    /// [`Self::size`].
> +    pub fn count(&self) -> usize {
> +        self.count
> +    }
> +
> +    /// Returns the size in bytes of this allocation.
> +    pub fn size(&self) -> usize {
> +        // This is guaranteed not to overflow as the same operation has been done in `alloc_attrs`.

Please make this an invariant of CoherentAllocation and refer to that instead.
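
A rough sketch of what such an invariant could look like, using a simplified
stand-in type. `Allocation` below is hypothetical and is not the kernel's
`CoherentAllocation`; it only shows the overflow check being done once at
construction and then referred to from `size()`.

```rust
use core::marker::PhantomData;
use core::mem::size_of;

/// Hypothetical stand-in for an allocation of `count` elements of type `T`.
///
/// # Invariants
///
/// `count * size_of::<T>()` does not overflow `usize`.
struct Allocation<T> {
    count: usize,
    _marker: PhantomData<T>,
}

impl<T> Allocation<T> {
    /// Creates the object, establishing the type invariant.
    fn new(count: usize) -> Option<Self> {
        // Fail early if the byte size cannot be represented.
        count.checked_mul(size_of::<T>())?;
        // INVARIANT: the multiplication above did not overflow.
        Some(Self { count, _marker: PhantomData })
    }

    /// Returns the number of elements `T` in this allocation.
    fn count(&self) -> usize {
        self.count
    }

    /// Returns the size in bytes of this allocation.
    fn size(&self) -> usize {
        // Cannot overflow, by the type invariant.
        self.count * size_of::<T>()
    }
}
```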

> +        self.count * core::mem::size_of::<T>()
> +    }
> +
>      /// Returns the base address to the allocated region in the CPU's virtual address space.
>      pub fn start_ptr(&self) -> *const T {
>          self.cpu_addr
> 
> -- 
> 2.49.0
> 

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3 00/19] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization
  2025-05-07 13:52 [PATCH v3 00/19] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization Alexandre Courbot
                   ` (18 preceding siblings ...)
  2025-05-07 13:52 ` [PATCH v3 19/19] gpu: nova-core: load and " Alexandre Courbot
@ 2025-05-13 13:10 ` Danilo Krummrich
  19 siblings, 0 replies; 51+ messages in thread
From: Danilo Krummrich @ 2025-05-13 13:10 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, David Airlie, Simona Vetter, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, Jonathan Corbet, John Hubbard,
	Ben Skeggs, Joel Fernandes, Timur Tabi, Alistair Popple,
	linux-kernel, rust-for-linux, nouveau, dri-devel,
	Shirish Baskaran

On Wed, May 07, 2025 at 10:52:27PM +0900, Alexandre Courbot wrote:
>       gpu: nova-core: derive useful traits for Chipset
>       gpu: nova-core: add missing GA100 definition
>       gpu: nova-core: take bound device in Gpu::new
>       gpu: nova-core: define registers layout using helper macro

    [ Fix typo in commit message. - Danilo ]

>       gpu: nova-core: fix layout of NV_PMC_BOOT_0

    [ Use Result from kernel::prelude. - Danilo ]

>       gpu: nova-core: move Firmware to firmware module

    [ Don't require a bound device, remove pub visibility from Firmware
      fields, use FIRMWARE_VERSION consistently. - Danilo ]

Applied the patches above to nova-next, thanks!

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3 09/19] gpu: nova-core: wait for GFW_BOOT completion
  2025-05-07 13:52 ` [PATCH v3 09/19] gpu: nova-core: wait for GFW_BOOT completion Alexandre Courbot
@ 2025-05-13 14:07   ` Danilo Krummrich
  2025-05-16 12:16     ` Alexandre Courbot
  0 siblings, 1 reply; 51+ messages in thread
From: Danilo Krummrich @ 2025-05-13 14:07 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, David Airlie, Simona Vetter, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, Jonathan Corbet, John Hubbard,
	Ben Skeggs, Joel Fernandes, Timur Tabi, Alistair Popple,
	linux-kernel, rust-for-linux, nouveau, dri-devel

On Wed, May 07, 2025 at 10:52:36PM +0900, Alexandre Courbot wrote:
> Upon reset, the GPU executes the GFW_BOOT firmware in order to
> initialize its base parameters such as clocks. The driver must ensure
> that this step is completed before using the hardware.
> 
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> ---
>  drivers/gpu/nova-core/devinit.rs   | 38 ++++++++++++++++++++++++++++++++++++++
>  drivers/gpu/nova-core/driver.rs    |  2 +-
>  drivers/gpu/nova-core/gpu.rs       |  5 +++++
>  drivers/gpu/nova-core/nova_core.rs |  1 +
>  drivers/gpu/nova-core/regs.rs      | 11 +++++++++++
>  5 files changed, 56 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/nova-core/devinit.rs b/drivers/gpu/nova-core/devinit.rs
> new file mode 100644
> index 0000000000000000000000000000000000000000..5be2e0344fb651e5e53c9223aefeb5b2d95b8de1
> --- /dev/null
> +++ b/drivers/gpu/nova-core/devinit.rs
> @@ -0,0 +1,38 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! Methods for device initialization.
> +
> +use kernel::bindings;
> +use kernel::prelude::*;
> +
> +use crate::driver::Bar0;
> +use crate::regs;
> +
> +/// Wait for devinit FW completion.
> +///
> +/// Upon reset, the GPU runs some firmware code to setup its core parameters. Most of the GPU is
> +/// considered unusable until this step is completed, so it must be waited on very early during
> +/// driver initialization.
> +pub(crate) fn wait_gfw_boot_completion(bar: &Bar0) -> Result<()> {
> +    let mut timeout = 2000;
> +
> +    loop {
> +        let gfw_booted = regs::NV_PGC6_AON_SECURE_SCRATCH_GROUP_05_PRIV_LEVEL_MASK::read(bar)
> +            .read_protection_level0()
> +            && (regs::NV_PGC6_AON_SECURE_SCRATCH_GROUP_05::read(bar).value() & 0xff) == 0xff;

What does it mean when the first 8 bits are set? Why don't we have a mask and
value to compare for that?

> +
> +        if gfw_booted {
> +            return Ok(());
> +        }
> +
> +        if timeout == 0 {
> +            return Err(ETIMEDOUT);
> +        }
> +        timeout -= 1;

NIT: This means the timeout is ~4s; can we start with timeout == 4000 and decrement
with the number of ms passed to msleep()?

Anyway, this should go away with read_poll_timeout().
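
To make the suggestion concrete, here is a minimal standalone sketch of a
polling loop whose budget is expressed in milliseconds and decremented by the
amount actually slept. It is not kernel code: `poll_with_budget` is a
hypothetical helper, `ready` stands in for the register reads, and `sleep_ms`
stands in for `msleep()`.

```rust
/// Polls `ready` until it returns `true`, sleeping `step_ms` milliseconds
/// between attempts, and gives up once `budget_ms` milliseconds were slept.
fn poll_with_budget(
    mut ready: impl FnMut() -> bool,
    mut sleep_ms: impl FnMut(u32),
    mut budget_ms: u32,
    step_ms: u32,
) -> Result<(), &'static str> {
    loop {
        if ready() {
            return Ok(());
        }
        if budget_ms == 0 {
            return Err("timed out");
        }
        // Sleep at least 1ms so the loop always makes progress, and never
        // more than what is left of the budget.
        let step = step_ms.max(1).min(budget_ms);
        sleep_ms(step);
        budget_ms -= step;
    }
}
```

Called with a budget of 4000 and a step of 2, this keeps the roughly 4s total
of the loop in the patch while making the unit of the timeout explicit.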

> +
> +        // TODO: use `read_poll_timeout` once it is available.
> +        // (https://lore.kernel.org/lkml/20250220070611.214262-8-fujita.tomonori@gmail.com/)
> +        // SAFETY: msleep should be safe to call with any parameter.
> +        unsafe { bindings::msleep(2) };
> +    }
> +}
> diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs
> index a08fb6599267a960f0e07b6efd0e3b6cdc296aa4..752ba4b0fcfe8d835d366570bb2f807840a196da 100644
> --- a/drivers/gpu/nova-core/driver.rs
> +++ b/drivers/gpu/nova-core/driver.rs
> @@ -10,7 +10,7 @@ pub(crate) struct NovaCore {
>      pub(crate) gpu: Gpu,
>  }
>  
> -const BAR0_SIZE: usize = 8;
> +const BAR0_SIZE: usize = 0x1000000;

This means that we'll fail probing the card if BAR0 is not at least 16MiB.
AFAIK, that should be fine. However, can you make this a separate patch please?

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3 10/19] gpu: nova-core: add DMA object struct
  2025-05-07 13:52 ` [PATCH v3 10/19] gpu: nova-core: add DMA object struct Alexandre Courbot
@ 2025-05-13 14:25   ` Danilo Krummrich
  0 siblings, 0 replies; 51+ messages in thread
From: Danilo Krummrich @ 2025-05-13 14:25 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, David Airlie, Simona Vetter, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, Jonathan Corbet, John Hubbard,
	Ben Skeggs, Joel Fernandes, Timur Tabi, Alistair Popple,
	linux-kernel, rust-for-linux, nouveau, dri-devel

On Wed, May 07, 2025 at 10:52:37PM +0900, Alexandre Courbot wrote:
> Since we will need to allocate lots of distinct memory chunks to be
> shared between GPU and CPU, introduce a type dedicated to that. It is a
> light wrapper around CoherentAllocation.
> 
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> ---
>  drivers/gpu/nova-core/dma.rs       | 60 ++++++++++++++++++++++++++++++++++++++
>  drivers/gpu/nova-core/nova_core.rs |  1 +
>  2 files changed, 61 insertions(+)
> 
> diff --git a/drivers/gpu/nova-core/dma.rs b/drivers/gpu/nova-core/dma.rs
> new file mode 100644
> index 0000000000000000000000000000000000000000..9d90ae01d0044eaab4ddbc3eba216741d7a623ef
> --- /dev/null
> +++ b/drivers/gpu/nova-core/dma.rs
> @@ -0,0 +1,60 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! Simple DMA object wrapper.
> +
> +// To be removed when all code is used.
> +#![expect(dead_code)]
> +
> +use core::ops::{Deref, DerefMut};
> +
> +use kernel::device;
> +use kernel::dma::CoherentAllocation;
> +use kernel::page::PAGE_SIZE;
> +use kernel::prelude::*;
> +
> +pub(crate) struct DmaObject {
> +    dma: CoherentAllocation<u8>,
> +}
> +
> +impl DmaObject {
> +    pub(crate) fn new(dev: &device::Device<device::Bound>, len: usize) -> Result<Self> {
> +        let len = core::alloc::Layout::from_size_align(len, PAGE_SIZE)
> +            .map_err(|_| EINVAL)?
> +            .pad_to_align()
> +            .size();
> +        let dma = CoherentAllocation::alloc_coherent(dev, len, GFP_KERNEL | __GFP_ZERO)?;
> +
> +        Ok(Self { dma })
> +    }
> +
> +    pub(crate) fn from_data(dev: &device::Device<device::Bound>, data: &[u8]) -> Result<Self> {
> +        Self::new(dev, data.len()).map(|mut dma_obj| {
> +            // SAFETY:
> +            // - The copied data fits within the size of the allocated object.
> +            // - We have just created this object and there is no other user at this stage.
> +            unsafe {
> +                core::ptr::copy_nonoverlapping(
> +                    data.as_ptr(),
> +                    dma_obj.dma.start_ptr_mut(),
> +                    data.len(),
> +                );
> +            }

This will be replaced with CoherentAllocation::write() I suppose? Please add a
corresponding TODO.
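
For reference, the first SAFETY bullet holds because new() rounds the
requested length up to a whole number of pages. A rough userspace sketch of
that rounding, assuming a 4 KiB page (the helper name `page_padded_len` is
made up for illustration and is not part of the patch):

```rust
use std::alloc::Layout;

/// Assumed page size for this sketch; the kernel's `PAGE_SIZE` depends on the architecture.
const PAGE_SIZE: usize = 4096;

/// Round `len` up to the next multiple of `PAGE_SIZE`, the same way `DmaObject::new()` does.
/// Returns `None` where the patch would return `EINVAL`.
fn page_padded_len(len: usize) -> Option<usize> {
    Layout::from_size_align(len, PAGE_SIZE)
        .ok()
        .map(|layout| layout.pad_to_align().size())
}
```

Under these assumptions the padded length is never smaller than the requested
one, and a length too large to round up is rejected instead of wrapping.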

* Re: [PATCH v3 11/19] gpu: nova-core: register sysmem flush page
  2025-05-07 13:52 ` [PATCH v3 11/19] gpu: nova-core: register sysmem flush page Alexandre Courbot
@ 2025-05-13 14:47   ` Danilo Krummrich
  0 siblings, 0 replies; 51+ messages in thread
From: Danilo Krummrich @ 2025-05-13 14:47 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, David Airlie, Simona Vetter, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, Jonathan Corbet, John Hubbard,
	Ben Skeggs, Joel Fernandes, Timur Tabi, Alistair Popple,
	linux-kernel, rust-for-linux, nouveau, dri-devel

On Wed, May 07, 2025 at 10:52:38PM +0900, Alexandre Courbot wrote:
> A page of system memory is reserved so sysmembar can perform a read on

NIT: Use imperative mood.

> it if a system write occurred since the last flush. Do this early as it
> can be required to e.g. reset the GPU falcons.
> 
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> ---
>  drivers/gpu/nova-core/gpu.rs  | 45 +++++++++++++++++++++++++++++++++++++++++--
>  drivers/gpu/nova-core/regs.rs | 10 ++++++++++
>  2 files changed, 53 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
> index a9fcf74717791dc7e23678869bf84f61e51873e2..c338da69ecbc2200f1ef3061a4d62971b021e3eb 100644
> --- a/drivers/gpu/nova-core/gpu.rs
> +++ b/drivers/gpu/nova-core/gpu.rs
> @@ -3,6 +3,7 @@
>  use kernel::{device, devres::Devres, error::code::*, pci, prelude::*};
>  
>  use crate::devinit;
> +use crate::dma::DmaObject;
>  use crate::driver::Bar0;
>  use crate::firmware::Firmware;
>  use crate::regs;
> @@ -158,12 +159,32 @@ fn new(bar: &Bar0) -> Result<Spec> {
>  }
>  
>  /// Structure holding the resources required to operate the GPU.
> -#[pin_data]
> +#[pin_data(PinnedDrop)]
>  pub(crate) struct Gpu {
>      spec: Spec,
>      /// MMIO mapping of PCI BAR 0
>      bar: Devres<Bar0>,
>      fw: Firmware,
> +    // System memory page required for flushing all pending GPU-side memory writes done through
> +    // PCIE into system memory.

Please make this a doc-comment.

* Re: [PATCH v3 12/19] gpu: nova-core: add helper function to wait on condition
  2025-05-07 13:52 ` [PATCH v3 12/19] gpu: nova-core: add helper function to wait on condition Alexandre Courbot
@ 2025-05-13 14:50   ` Danilo Krummrich
  0 siblings, 0 replies; 51+ messages in thread
From: Danilo Krummrich @ 2025-05-13 14:50 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, David Airlie, Simona Vetter, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, Jonathan Corbet, John Hubbard,
	Ben Skeggs, Joel Fernandes, Timur Tabi, Alistair Popple,
	linux-kernel, rust-for-linux, nouveau, dri-devel

On Wed, May 07, 2025 at 10:52:39PM +0900, Alexandre Courbot wrote:
> While programming the hardware, we frequently need to busy-wait until
> a condition (like a given bit of a register to switch value) happens.
> 
> Add a basic `wait_on` helper function to wait on such conditions
> expressed as a closure, with a timeout argument.
> 
> This is temporary as we will switch to `read_poll_timeout` [1] once it
> is available.
> 
> [1] https://lore.kernel.org/lkml/20250220070611.214262-8-fujita.tomonori@gmail.com/
> 
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> ---
>  drivers/gpu/nova-core/util.rs | 30 ++++++++++++++++++++++++++++++
>  1 file changed, 30 insertions(+)
> 
> diff --git a/drivers/gpu/nova-core/util.rs b/drivers/gpu/nova-core/util.rs
> index 332a64cfc6a9d7d787fbdc228887c0be53a97160..afb525228431a2645afe7bb34988e9537757b1d7 100644
> --- a/drivers/gpu/nova-core/util.rs
> +++ b/drivers/gpu/nova-core/util.rs
> @@ -1,5 +1,10 @@
>  // SPDX-License-Identifier: GPL-2.0
>  
> +use core::time::Duration;
> +
> +use kernel::prelude::*;
> +use kernel::time::Ktime;
> +
>  pub(crate) const fn to_lowercase_bytes<const N: usize>(s: &str) -> [u8; N] {
>      let src = s.as_bytes();
>      let mut dst = [0; N];
> @@ -19,3 +24,28 @@ pub(crate) const fn const_bytes_to_str(bytes: &[u8]) -> &str {
>          Err(_) => kernel::build_error!("Bytes are not valid UTF-8."),
>      }
>  }
> +
> +/// Wait until `cond` is true or `timeout` elapsed.
> +///
> +/// When `cond` evaluates to `Some`, its return value is returned.
> +///
> +/// `Err(ETIMEDOUT)` is returned if `timeout` has been reached without `cond` evaluating to
> +/// `Some`.
> +///
> +/// TODO: replace with `read_poll_timeout` once it is available.
> +/// (https://lore.kernel.org/lkml/20250220070611.214262-8-fujita.tomonori@gmail.com/)
> +#[expect(dead_code)]
> +pub(crate) fn wait_on<R, F: Fn() -> Option<R>>(timeout: Duration, cond: F) -> Result<R> {
> +    let start_time = Ktime::ktime_get();
> +
> +    loop {
> +        if let Some(ret) = cond() {
> +            return Ok(ret);
> +        }
> +
> +        let cur_time = Ktime::ktime_get();
> +        if (cur_time - start_time).to_ns() > timeout.as_nanos() as i64 {
> +            return Err(ETIMEDOUT);
> +        }
> +    }
> +}

NIT: Can't you use this for wait_gfw_boot_completion() too?
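
To make the suggestion concrete, here is a userspace sketch of the same
polling pattern with a readiness check plugged in. `TimedOut`,
`boot_completed()` and the bit it tests are assumptions for illustration, and
`std::time::Instant` stands in for `Ktime`; the condition actually polled by
wait_gfw_boot_completion() is not shown in this thread.

```rust
use std::time::{Duration, Instant};

/// Error type standing in for the kernel's `ETIMEDOUT` in this sketch.
#[derive(Debug, PartialEq, Eq)]
struct TimedOut;

/// Busy-wait until `cond` returns `Some`, or until `timeout` has elapsed.
fn wait_on<R, F: Fn() -> Option<R>>(timeout: Duration, cond: F) -> Result<R, TimedOut> {
    let start = Instant::now();

    loop {
        if let Some(ret) = cond() {
            return Ok(ret);
        }

        if start.elapsed() > timeout {
            return Err(TimedOut);
        }
    }
}

/// Hypothetical stand-in for a "boot complete" register check; bit 0 is an assumption.
fn boot_completed(reg: u32) -> Option<()> {
    (reg & 0x1 != 0).then_some(())
}
```

A caller would pass a closure that reads the register each time it is invoked,
e.g. `wait_on(timeout, || boot_completed(read_reg()))`, where `read_reg()` is
again hypothetical.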

* Re: [PATCH v3 13/19] gpu: nova-core: add falcon register definitions and base code
  2025-05-07 13:52 ` [PATCH v3 13/19] gpu: nova-core: add falcon register definitions and base code Alexandre Courbot
@ 2025-05-13 16:19   ` Danilo Krummrich
  2025-05-16 12:19     ` Alexandre Courbot
  0 siblings, 1 reply; 51+ messages in thread
From: Danilo Krummrich @ 2025-05-13 16:19 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, David Airlie, Simona Vetter, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, Jonathan Corbet, John Hubbard,
	Ben Skeggs, Joel Fernandes, Timur Tabi, Alistair Popple,
	linux-kernel, rust-for-linux, nouveau, dri-devel

On Wed, May 07, 2025 at 10:52:40PM +0900, Alexandre Courbot wrote:
> Add the common Falcon code and HAL for Ampere GPUs, and instantiate the
> GSP and SEC2 Falcons that will be required to boot the GSP.
> 
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> ---
>  drivers/gpu/nova-core/falcon.rs           | 540 ++++++++++++++++++++++++++++++
>  drivers/gpu/nova-core/falcon/gsp.rs       |  22 ++
>  drivers/gpu/nova-core/falcon/hal.rs       |  56 ++++
>  drivers/gpu/nova-core/falcon/hal/ga102.rs | 120 +++++++
>  drivers/gpu/nova-core/falcon/sec2.rs      |   8 +
>  drivers/gpu/nova-core/gpu.rs              |  11 +
>  drivers/gpu/nova-core/nova_core.rs        |   1 +
>  drivers/gpu/nova-core/regs.rs             | 125 +++++++
>  drivers/gpu/nova-core/util.rs             |   1 -
>  9 files changed, 883 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/nova-core/falcon.rs b/drivers/gpu/nova-core/falcon.rs
> new file mode 100644
> index 0000000000000000000000000000000000000000..b6552496ad9c5f8eeda4beaa757c7393f495072b
> --- /dev/null
> +++ b/drivers/gpu/nova-core/falcon.rs
> @@ -0,0 +1,540 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! Falcon microprocessor base support
> +
> +// To be removed when all code is used.
> +#![expect(dead_code)]
> +
> +use core::time::Duration;
> +use hal::FalconHal;
> +use kernel::bindings;
> +use kernel::device;
> +use kernel::prelude::*;
> +use kernel::sync::Arc;
> +use kernel::types::ARef;
> +
> +use crate::driver::Bar0;
> +use crate::gpu::Chipset;
> +use crate::regs;
> +use crate::util;
> +
> +pub(crate) mod gsp;
> +mod hal;
> +pub(crate) mod sec2;
> +
> +/// Revision number of a falcon core, used in the [`crate::regs::NV_PFALCON_FALCON_HWCFG1`]
> +/// register.
> +#[repr(u8)]
> +#[derive(Debug, Default, Copy, Clone, PartialEq, Eq, PartialOrd, Ord)]
> +pub(crate) enum FalconCoreRev {
> +    #[default]
> +    Rev1 = 1,
> +    Rev2 = 2,
> +    Rev3 = 3,
> +    Rev4 = 4,
> +    Rev5 = 5,
> +    Rev6 = 6,
> +    Rev7 = 7,
> +}
> +
> +impl TryFrom<u8> for FalconCoreRev {
> +    type Error = Error;
> +
> +    fn try_from(value: u8) -> core::result::Result<Self, Self::Error> {

Here and below, please use Result<T>, which comes from kernel::prelude.
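
As a sketch of why the shorter spelling is equivalent: the prelude's `Result`
alias fixes the error type and defaults the success type to `()`. The
userspace stand-ins below (`Error`, `EINVAL`, `CoreRev`, `check()`) are made
up for illustration; only the shape of the alias follows the kernel's.

```rust
use std::convert::TryFrom;

/// Stand-in for the kernel's `Error` type in this sketch.
#[derive(Debug, PartialEq, Eq)]
struct Error;

/// Stand-in for `EINVAL`.
const EINVAL: Error = Error;

/// Alias shaped like the kernel's: the error type defaults to `Error` and the success type
/// defaults to `()`.
type Result<T = (), E = Error> = core::result::Result<T, E>;

#[derive(Debug, PartialEq, Eq)]
enum CoreRev {
    Rev1 = 1,
    Rev2 = 2,
}

impl TryFrom<u8> for CoreRev {
    type Error = Error;

    // `Result<Self>` names the same type as `core::result::Result<Self, Self::Error>` here.
    fn try_from(value: u8) -> Result<Self> {
        match value {
            1 => Ok(Self::Rev1),
            2 => Ok(Self::Rev2),
            _ => Err(EINVAL),
        }
    }
}

/// A bare `Result` is the same type as `Result<()>` thanks to the default type parameter.
fn check(value: u8) -> Result {
    CoreRev::try_from(value).map(|_| ())
}
```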

> +        use FalconCoreRev::*;
> +
> +        let rev = match value {
> +            1 => Rev1,
> +            2 => Rev2,
> +            3 => Rev3,
> +            4 => Rev4,
> +            5 => Rev5,
> +            6 => Rev6,
> +            7 => Rev7,
> +            _ => return Err(EINVAL),
> +        };
> +
> +        Ok(rev)
> +    }
> +}
> +
> +/// Revision subversion number of a falcon core, used in the
> +/// [`crate::regs::NV_PFALCON_FALCON_HWCFG1`] register.
> +#[repr(u8)]
> +#[derive(Debug, Default, Copy, Clone, PartialEq, Eq, PartialOrd, Ord)]
> +pub(crate) enum FalconCoreRevSubversion {
> +    #[default]
> +    Subversion0 = 0,
> +    Subversion1 = 1,
> +    Subversion2 = 2,
> +    Subversion3 = 3,
> +}
> +
> +impl TryFrom<u8> for FalconCoreRevSubversion {
> +    type Error = Error;
> +
> +    fn try_from(value: u8) -> Result<Self> {
> +        use FalconCoreRevSubversion::*;
> +
> +        let sub_version = match value & 0b11 {
> +            0 => Subversion0,
> +            1 => Subversion1,
> +            2 => Subversion2,
> +            3 => Subversion3,
> +            _ => return Err(EINVAL),
> +        };
> +
> +        Ok(sub_version)
> +    }
> +}
> +
> +/// Security model of a falcon core, used in the [`crate::regs::NV_PFALCON_FALCON_HWCFG1`]
> +/// register.
> +#[repr(u8)]
> +#[derive(Debug, Default, Copy, Clone)]
> +pub(crate) enum FalconSecurityModel {
> +    /// Non-Secure: runs unsigned code without privileges.
> +    #[default]
> +    None = 0,
> +    /// Low-secure: runs unsigned code with some privileges. Can only be entered from `Heavy` mode.
> +    Light = 2,
> +    /// High-Secure: runs signed code with full privileges.
> +    Heavy = 3,
> +}
> +
> +impl TryFrom<u8> for FalconSecurityModel {
> +    type Error = Error;
> +
> +    fn try_from(value: u8) -> core::result::Result<Self, Self::Error> {
> +        use FalconSecurityModel::*;
> +
> +        let sec_model = match value {
> +            0 => None,
> +            2 => Light,
> +            3 => Heavy,
> +            _ => return Err(EINVAL),
> +        };
> +
> +        Ok(sec_model)
> +    }
> +}
> +
> +/// Signing algorithm for a given firmware, used in the [`crate::regs::NV_PFALCON2_FALCON_MOD_SEL`]
> +/// register.
> +#[repr(u8)]
> +#[derive(Debug, Default, Copy, Clone, PartialEq, Eq)]
> +pub(crate) enum FalconModSelAlgo {
> +    /// RSA3K.
> +    #[default]
> +    Rsa3k = 1,
> +}
> +
> +impl TryFrom<u8> for FalconModSelAlgo {
> +    type Error = Error;
> +
> +    fn try_from(value: u8) -> core::result::Result<Self, Self::Error> {
> +        match value {
> +            1 => Ok(FalconModSelAlgo::Rsa3k),
> +            _ => Err(EINVAL),
> +        }
> +    }
> +}
> +
> +/// Valid values for the `size` field of the [`crate::regs::NV_PFALCON_FALCON_DMATRFCMD`] register.
> +#[repr(u8)]
> +#[derive(Debug, Default, Copy, Clone, PartialEq, Eq)]
> +pub(crate) enum DmaTrfCmdSize {
> +    /// 256 bytes transfer.
> +    #[default]
> +    Size256B = 0x6,
> +}
> +
> +impl TryFrom<u8> for DmaTrfCmdSize {
> +    type Error = Error;
> +
> +    fn try_from(value: u8) -> Result<Self> {
> +        match value {
> +            0x6 => Ok(Self::Size256B),
> +            _ => Err(EINVAL),
> +        }
> +    }
> +}
> +
> +/// Currently active core on a dual falcon/riscv (Peregrine) controller.
> +#[derive(Debug, Clone, Copy, PartialEq, Eq)]
> +pub(crate) enum PeregrineCoreSelect {
> +    /// Falcon core is active.
> +    Falcon = 0,
> +    /// RISC-V core is active.
> +    Riscv = 1,
> +}
> +
> +impl From<bool> for PeregrineCoreSelect {
> +    fn from(value: bool) -> Self {
> +        match value {
> +            false => PeregrineCoreSelect::Falcon,
> +            true => PeregrineCoreSelect::Riscv,
> +        }
> +    }
> +}
> +
> +/// Different types of memory present in a falcon core.
> +#[derive(Debug, Clone, Copy, PartialEq, Eq)]
> +pub(crate) enum FalconMem {
> +    /// Instruction Memory.
> +    Imem,
> +    /// Data Memory.
> +    Dmem,
> +}
> +
> +/// Target/source of a DMA transfer to/from falcon memory.
> +#[derive(Debug, Clone, Default)]
> +pub(crate) enum FalconFbifTarget {
> +    /// VRAM.
> +    #[default]
> +    LocalFb = 0,
> +    /// Coherent system memory.
> +    CoherentSysmem = 1,
> +    /// Non-coherent system memory.
> +    NoncoherentSysmem = 2,
> +}
> +
> +impl TryFrom<u8> for FalconFbifTarget {
> +    type Error = Error;
> +
> +    fn try_from(value: u8) -> core::result::Result<Self, Self::Error> {
> +        let res = match value {
> +            0 => Self::LocalFb,
> +            1 => Self::CoherentSysmem,
> +            2 => Self::NoncoherentSysmem,
> +            _ => return Err(EINVAL),
> +        };
> +
> +        Ok(res)
> +    }
> +}
> +
> +/// Type of memory addresses to use.
> +#[derive(Debug, Clone, Default)]
> +pub(crate) enum FalconFbifMemType {
> +    /// Physical memory addresses.
> +    #[default]
> +    Virtual = 0,
> +    /// Virtual memory addresses.
> +    Physical = 1,

NIT: Looks like the doc-comments are the wrong way around.

> +}
> +
> +impl From<bool> for FalconFbifMemType {
> +    fn from(value: bool) -> Self {
> +        match value {
> +            false => Self::Virtual,
> +            true => Self::Physical,
> +        }
> +    }
> +}

This conversion seems a bit odd; how is false a canonical value to convert to
FalconFbifMemType::Virtual and true for FalconFbifMemType::Physical?

Oh, I see. It comes from a register field of a single bit. Maybe it's worth
adding a brief comment.
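
A possible shape for such a comment, together with the doc-comments in the
corrected order, sketched as standalone Rust. `MemType`, `MEM_TYPE_BIT` and
`mem_type()` are hypothetical names, and the bit position is an assumption
that is not taken from the patch:

```rust
/// Type of memory addresses to use.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum MemType {
    /// Virtual memory addresses.
    #[default]
    Virtual = 0,
    /// Physical memory addresses.
    Physical = 1,
}

// The register field is a single bit, so `bool` is its natural raw representation:
// a cleared bit selects virtual addressing, a set bit selects physical addressing.
impl From<bool> for MemType {
    fn from(value: bool) -> Self {
        match value {
            false => Self::Virtual,
            true => Self::Physical,
        }
    }
}

/// Hypothetical bit position of the field inside its register.
const MEM_TYPE_BIT: u32 = 2;

/// Decode the single-bit field from a raw register value.
fn mem_type(reg: u32) -> MemType {
    MemType::from((reg & (1 << MEM_TYPE_BIT)) != 0)
}
```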

> +
> +/// Trait defining the parameters of a given Falcon instance.
> +pub(crate) trait FalconEngine: Sync {
> +    /// Base I/O address for the falcon, relative from which its registers are accessed.
> +    const BASE: usize;
> +}
> +
> +/// Represents a portion of the firmware to be loaded into a particular memory (e.g. IMEM or DMEM).
> +#[derive(Debug)]
> +pub(crate) struct FalconLoadTarget {
> +    /// Offset from the start of the source object to copy from.
> +    pub(crate) src_start: u32,
> +    /// Offset from the start of the destination memory to copy into.
> +    pub(crate) dst_start: u32,
> +    /// Number of bytes to copy.
> +    pub(crate) len: u32,
> +}
> +
> +/// Parameters for the falcon boot ROM.
> +#[derive(Debug)]
> +pub(crate) struct FalconBromParams {
> +    /// Offset in `DMEM`` of the firmware's signature.
> +    pub(crate) pkc_data_offset: u32,
> +    /// Mask of engines valid for this firmware.
> +    pub(crate) engine_id_mask: u16,
> +    /// ID of the ucode used to infer a fuse register to validate the signature.
> +    pub(crate) ucode_id: u8,
> +}
> +
> +/// Trait for a falcon firmware.
> +pub(crate) trait FalconFirmware {
> +    /// Engine on which this firmware is to be loaded.
> +    type Target: FalconEngine;
> +
> +    /// Returns the DMA handle of the object containing the firmware.
> +    fn dma_handle(&self) -> bindings::dma_addr_t;
> +
> +    /// Returns the load parameters for `IMEM`.
> +    fn imem_load(&self) -> FalconLoadTarget;
> +
> +    /// Returns the load parameters for `DMEM`.
> +    fn dmem_load(&self) -> FalconLoadTarget;
> +
> +    /// Returns the parameters to write into the BROM registers.
> +    fn brom_params(&self) -> FalconBromParams;
> +
> +    /// Returns the start address of the firmware.
> +    fn boot_addr(&self) -> u32;
> +}
> +
> +/// Contains the base parameters common to all Falcon instances.
> +pub(crate) struct Falcon<E: FalconEngine> {
> +    hal: Arc<dyn FalconHal<E>>,

TODO: Convert to Box, once Box supports type coercion?

Never mind, you have the comment in create_falcon_hal().

> +    dev: ARef<device::Device>,
> +}
> +
> +impl<E: FalconEngine + 'static> Falcon<E> {
> +    /// Create a new falcon instance.
> +    ///
> +    /// `need_riscv` is set to `true` if the caller expects the falcon to be a dual falcon/riscv
> +    /// controller.
> +    pub(crate) fn new(
> +        dev: &device::Device,
> +        chipset: Chipset,
> +        bar: &Bar0,
> +        need_riscv: bool,
> +    ) -> Result<Self> {
> +        let hwcfg1 = regs::NV_PFALCON_FALCON_HWCFG1::read(bar, E::BASE);
> +        // Ensure that the revision and security model contain valid values.
> +        let _rev = hwcfg1.core_rev()?;
> +        let _sec_model = hwcfg1.security_model()?;

Why not just `_`?
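
For illustration, a plain `_` pattern keeps the validation, since `?` still
propagates the error and only the decoded value is dropped. A userspace sketch
with a made-up accessor (`core_rev()` and `validate()` are hypothetical
stand-ins for the register field accessors):

```rust
/// Stand-in for a register field accessor that fails on values it cannot decode.
fn core_rev(raw: u8) -> Result<u8, &'static str> {
    match raw {
        1..=7 => Ok(raw),
        _ => Err("invalid core revision"),
    }
}

/// Validate the field without keeping the decoded value.
fn validate(raw: u8) -> Result<(), &'static str> {
    // `?` still propagates the error; `_` only discards the `Ok` payload.
    let _ = core_rev(raw)?;

    Ok(())
}
```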

> +
> +        if need_riscv {
> +            let hwcfg2 = regs::NV_PFALCON_FALCON_HWCFG2::read(bar, E::BASE);
> +            if !hwcfg2.riscv() {
> +                dev_err!(
> +                    dev,
> +                    "riscv support requested on a controller that does not support it\n"
> +                );
> +                return Err(EINVAL);
> +            }
> +        }
> +
> +        Ok(Self {
> +            hal: hal::create_falcon_hal(chipset)?,
> +            dev: dev.into(),
> +        })
> +    }
> +
> +    /// Wait for memory scrubbing to complete.
> +    fn reset_wait_mem_scrubbing(&self, bar: &Bar0) -> Result<()> {

Here and below, please use `Result` instead of `Result<()>`.

> +        util::wait_on(Duration::from_millis(20), || {
> +            let r = regs::NV_PFALCON_FALCON_HWCFG2::read(bar, E::BASE);
> +            if r.mem_scrubbing() {
> +                Some(())
> +            } else {
> +                None
> +            }
> +        })
> +    }
> +
> +    /// Reset the falcon engine.
> +    fn reset_eng(&self, bar: &Bar0) -> Result<()> {
> +        let _ = regs::NV_PFALCON_FALCON_HWCFG2::read(bar, E::BASE);
> +
> +        // According to OpenRM's `kflcnPreResetWait_GA102` documentation, HW sometimes does not set
> +        // RESET_READY so a non-failing timeout is used.

Should we still warn about it?

> +        let _ = util::wait_on(Duration::from_micros(150), || {

Do we know for sure that if RESET_READY is not set after 150us, it won't ever be
set? If the answer to that is yes, and we also do not want to warn about
RESET_READY not being set, why even bother trying to read it in the first place?
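
If a warning is wanted, the non-failing wait could look roughly like this. It
is a userspace sketch only: `poll_ready()` and `pre_reset_wait()` are
hypothetical names, and `eprintln!` stands in for `dev_warn!()`.

```rust
use std::time::{Duration, Instant};

/// Poll `ready` until it returns `true` or `timeout` elapses. Returns whether it became ready.
fn poll_ready(timeout: Duration, ready: impl Fn() -> bool) -> bool {
    let start = Instant::now();

    loop {
        if ready() {
            return true;
        }
        if start.elapsed() > timeout {
            return false;
        }
    }
}

/// Non-failing pre-reset wait: a timeout is reported but does not abort the reset.
/// Returns `true` if a warning was emitted.
fn pre_reset_wait(reset_ready: impl Fn() -> bool) -> bool {
    let timed_out = !poll_ready(Duration::from_micros(150), reset_ready);
    if timed_out {
        // Kernel code would use `dev_warn!()` here.
        eprintln!("RESET_READY not set after 150us, resetting anyway");
    }

    timed_out
}
```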

> +            let r = regs::NV_PFALCON_FALCON_HWCFG2::read(bar, E::BASE);
> +            if r.reset_ready() {
> +                Some(())
> +            } else {
> +                None
> +            }
> +        });
> +
> +        regs::NV_PFALCON_FALCON_ENGINE::alter(bar, E::BASE, |v| v.set_reset(true));
> +
> +        let _: Result<()> = util::wait_on(Duration::from_micros(10), || None);

Can we please get an abstraction for udelay() for this?
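
A dedicated delay helper would make the intent obvious at the call site. As a
userspace sketch only (`delay_us()` is a hypothetical name, not an existing
kernel API, and it spins on `std::time::Instant` rather than wrapping the C
udelay()):

```rust
use std::time::{Duration, Instant};

/// Hypothetical busy-wait delay, standing in for a kernel `udelay()` abstraction.
fn delay_us(us: u64) {
    let delay = Duration::from_micros(us);
    let start = Instant::now();

    // Spin until at least `delay` has elapsed since `start`.
    while start.elapsed() < delay {
        std::hint::spin_loop();
    }
}
```

The call in reset_eng() would then read `delay_us(10)` instead of a wait_on()
whose condition can never succeed.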

> +
> +        regs::NV_PFALCON_FALCON_ENGINE::alter(bar, E::BASE, |v| v.set_reset(false));
> +
> +        self.reset_wait_mem_scrubbing(bar)?;
> +
> +        Ok(())
> +    }
> +
> +    /// Reset the controller, select the falcon core, and wait for memory scrubbing to complete.
> +    pub(crate) fn reset(&self, bar: &Bar0) -> Result<()> {
> +        self.reset_eng(bar)?;
> +        self.hal.select_core(self, bar)?;
> +        self.reset_wait_mem_scrubbing(bar)?;
> +
> +        regs::NV_PFALCON_FALCON_RM::default()
> +            .set_value(regs::NV_PMC_BOOT_0::read(bar).into())
> +            .write(bar, E::BASE);
> +
> +        Ok(())
> +    }
> +
> +    /// Perform a DMA write according to `load_offsets` from `dma_handle` into the falcon's
> +    /// `target_mem`.
> +    ///
> +    /// `sec` is set if the loaded firmware is expected to run in secure mode.
> +    fn dma_wr(
> +        &self,
> +        bar: &Bar0,
> +        dma_handle: bindings::dma_addr_t,
> +        target_mem: FalconMem,
> +        load_offsets: FalconLoadTarget,
> +        sec: bool,
> +    ) -> Result<()> {
> +        const DMA_LEN: u32 = 256;
> +
> +        // For IMEM, we want to use the start offset as a virtual address tag for each page, since
> +        // code addresses in the firmware (and the boot vector) are virtual.
> +        //
> +        // For DMEM we can fold the start offset into the DMA handle.
> +        let (src_start, dma_start) = match target_mem {
> +            FalconMem::Imem => (load_offsets.src_start, dma_handle),
> +            FalconMem::Dmem => (
> +                0,
> +                dma_handle + load_offsets.src_start as bindings::dma_addr_t,
> +            ),
> +        };
> +        if dma_start % DMA_LEN as bindings::dma_addr_t > 0 {
> +            dev_err!(
> +                self.dev,
> +                "DMA transfer start addresses must be a multiple of {}",
> +                DMA_LEN
> +            );
> +            return Err(EINVAL);
> +        }
> +        if load_offsets.len % DMA_LEN > 0 {
> +            dev_err!(
> +                self.dev,
> +                "DMA transfer length must be a multiple of {}",
> +                DMA_LEN
> +            );
> +            return Err(EINVAL);
> +        }
> +
> +        // Set up the base source DMA address.
> +
> +        regs::NV_PFALCON_FALCON_DMATRFBASE::default()
> +            .set_base((dma_start >> 8) as u32)
> +            .write(bar, E::BASE);
> +        regs::NV_PFALCON_FALCON_DMATRFBASE1::default()
> +            .set_base((dma_start >> 40) as u16)
> +            .write(bar, E::BASE);
> +
> +        let cmd = regs::NV_PFALCON_FALCON_DMATRFCMD::default()
> +            .set_size(DmaTrfCmdSize::Size256B)
> +            .set_imem(target_mem == FalconMem::Imem)
> +            .set_sec(if sec { 1 } else { 0 });
> +
> +        for pos in (0..load_offsets.len).step_by(DMA_LEN as usize) {
> +            // Perform a transfer of size `DMA_LEN`.
> +            regs::NV_PFALCON_FALCON_DMATRFMOFFS::default()
> +                .set_offs(load_offsets.dst_start + pos)
> +                .write(bar, E::BASE);
> +            regs::NV_PFALCON_FALCON_DMATRFFBOFFS::default()
> +                .set_offs(src_start + pos)
> +                .write(bar, E::BASE);
> +            cmd.write(bar, E::BASE);
> +
> +            // Wait for the transfer to complete.
> +            util::wait_on(Duration::from_millis(2000), || {
> +                let r = regs::NV_PFALCON_FALCON_DMATRFCMD::read(bar, E::BASE);
> +                if r.idle() {
> +                    Some(())
> +                } else {
> +                    None
> +                }
> +            })?;
> +        }
> +
> +        Ok(())
> +    }
> +
> +    /// Perform a DMA load into `IMEM` and `DMEM` of `fw`, and prepare the falcon to run it.
> +    pub(crate) fn dma_load<F: FalconFirmware<Target = E>>(&self, bar: &Bar0, fw: &F) -> Result<()> {
> +        let dma_handle = fw.dma_handle();
> +
> +        regs::NV_PFALCON_FBIF_CTL::alter(bar, E::BASE, |v| v.set_allow_phys_no_ctx(true));
> +        regs::NV_PFALCON_FALCON_DMACTL::default().write(bar, E::BASE);
> +        regs::NV_PFALCON_FBIF_TRANSCFG::alter(bar, E::BASE, |v| {
> +            v.set_target(FalconFbifTarget::CoherentSysmem)
> +                .set_mem_type(FalconFbifMemType::Physical)
> +        });
> +
> +        self.dma_wr(bar, dma_handle, FalconMem::Imem, fw.imem_load(), true)?;
> +        self.dma_wr(bar, dma_handle, FalconMem::Dmem, fw.dmem_load(), true)?;
> +
> +        self.hal.program_brom(self, bar, &fw.brom_params())?;
> +
> +        // Set `BootVec` to start of non-secure code.
> +        regs::NV_PFALCON_FALCON_BOOTVEC::default()
> +            .set_value(fw.boot_addr())
> +            .write(bar, E::BASE);
> +
> +        Ok(())
> +    }
> +
> +    /// Start running the loaded firmware.
> +    ///
> +    /// `mbox0` and `mbox1` are optional parameters to write into the `MBOX0` and `MBOX1` registers
> +    /// prior to running.
> +    ///
> +    /// Returns `MBOX0` and `MBOX1` after the firmware has stopped running.
> +    pub(crate) fn boot(
> +        &self,
> +        bar: &Bar0,
> +        mbox0: Option<u32>,
> +        mbox1: Option<u32>,
> +    ) -> Result<(u32, u32)> {
> +        if let Some(mbox0) = mbox0 {
> +            regs::NV_PFALCON_FALCON_MAILBOX0::default()
> +                .set_value(mbox0)
> +                .write(bar, E::BASE);
> +        }
> +
> +        if let Some(mbox1) = mbox1 {
> +            regs::NV_PFALCON_FALCON_MAILBOX1::default()
> +                .set_value(mbox1)
> +                .write(bar, E::BASE);
> +        }
> +
> +        match regs::NV_PFALCON_FALCON_CPUCTL::read(bar, E::BASE).alias_en() {
> +            true => regs::NV_PFALCON_FALCON_CPUCTL_ALIAS::default()
> +                .set_startcpu(true)
> +                .write(bar, E::BASE),
> +            false => regs::NV_PFALCON_FALCON_CPUCTL::default()
> +                .set_startcpu(true)
> +                .write(bar, E::BASE),
> +        }
> +
> +        util::wait_on(Duration::from_secs(2), || {
> +            let r = regs::NV_PFALCON_FALCON_CPUCTL::read(bar, E::BASE);
> +            if r.halted() {
> +                Some(())
> +            } else {
> +                None
> +            }
> +        })?;
> +
> +        let (mbox0, mbox1) = (
> +            regs::NV_PFALCON_FALCON_MAILBOX0::read(bar, E::BASE).value(),
> +            regs::NV_PFALCON_FALCON_MAILBOX1::read(bar, E::BASE).value(),
> +        );
> +
> +        Ok((mbox0, mbox1))
> +    }
> +
> +    /// Returns the fused version of the signature to use in order to run a HS firmware on this
> +    /// falcon instance. `engine_id_mask` and `ucode_id` are obtained from the firmware header.
> +    pub(crate) fn get_signature_reg_fuse_version(
> +        &self,
> +        bar: &Bar0,
> +        engine_id_mask: u16,
> +        ucode_id: u8,
> +    ) -> Result<u32> {
> +        self.hal
> +            .get_signature_reg_fuse_version(self, bar, engine_id_mask, ucode_id)
> +    }
> +}
> diff --git a/drivers/gpu/nova-core/falcon/gsp.rs b/drivers/gpu/nova-core/falcon/gsp.rs
> new file mode 100644
> index 0000000000000000000000000000000000000000..f74aeadaee9ae96bb1961d3c55b2cf1999943377
> --- /dev/null
> +++ b/drivers/gpu/nova-core/falcon/gsp.rs
> @@ -0,0 +1,22 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +use crate::{
> +    driver::Bar0,
> +    falcon::{Falcon, FalconEngine},
> +    regs,
> +};
> +
> +pub(crate) struct Gsp;
> +impl FalconEngine for Gsp {
> +    const BASE: usize = 0x00110000;
> +}
> +
> +impl Falcon<Gsp> {
> +    /// Clears the SWGEN0 bit in the Falcon's IRQ status clear register to
> +    /// allow GSP to signal CPU for processing new messages in message queue.
> +    pub(crate) fn clear_swgen0_intr(&self, bar: &Bar0) {
> +        regs::NV_PFALCON_FALCON_IRQSCLR::default()
> +            .set_swgen0(true)
> +            .write(bar, Gsp::BASE);
> +    }
> +}
> diff --git a/drivers/gpu/nova-core/falcon/hal.rs b/drivers/gpu/nova-core/falcon/hal.rs
> new file mode 100644
> index 0000000000000000000000000000000000000000..0b0ab8174caafeec3a2a2ba8211a740b7feadb75
> --- /dev/null
> +++ b/drivers/gpu/nova-core/falcon/hal.rs
> @@ -0,0 +1,56 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +use kernel::prelude::*;
> +use kernel::sync::Arc;
> +
> +use crate::driver::Bar0;
> +use crate::falcon::{Falcon, FalconBromParams, FalconEngine};
> +use crate::gpu::Chipset;
> +
> +mod ga102;
> +
> +/// Hardware Abstraction Layer for Falcon cores.
> +///
> +/// Implements chipset-specific low-level operations. The trait is generic against [`FalconEngine`]
> +/// so its `BASE` parameter can be used in order to avoid runtime bound checks when accessing
> +/// registers.
> +pub(crate) trait FalconHal<E: FalconEngine>: Sync {
> +    // Activates the Falcon core if the engine is a risvc/falcon dual engine.
> +    fn select_core(&self, _falcon: &Falcon<E>, _bar: &Bar0) -> Result<()> {
> +        Ok(())
> +    }
> +
> +    /// Returns the fused version of the signature to use in order to run a HS firmware on this
> +    /// falcon instance. `engine_id_mask` and `ucode_id` are obtained from the firmware header.
> +    fn get_signature_reg_fuse_version(
> +        &self,
> +        falcon: &Falcon<E>,
> +        bar: &Bar0,
> +        engine_id_mask: u16,
> +        ucode_id: u8,
> +    ) -> Result<u32>;
> +
> +    // Program the boot ROM registers prior to starting a secure firmware.
> +    fn program_brom(&self, falcon: &Falcon<E>, bar: &Bar0, params: &FalconBromParams)
> +        -> Result<()>;
> +}
> +
> +/// Returns a boxed falcon HAL adequate for the passed `chipset`.
> +///
> +/// We use this function and a heap-allocated trait object instead of statically defined trait
> +/// objects because of the two-dimensional (Chipset, Engine) lookup required to return the
> +/// requested HAL.
> +///
> +/// TODO: replace the return type with `KBox` once it gains the ability to host trait objects.
> +pub(crate) fn create_falcon_hal<E: FalconEngine + 'static>(
> +    chipset: Chipset,
> +) -> Result<Arc<dyn FalconHal<E>>> {
> +    let hal = match chipset {
> +        Chipset::GA102 | Chipset::GA103 | Chipset::GA104 | Chipset::GA106 | Chipset::GA107 => {
> +            Arc::new(ga102::Ga102::<E>::new(), GFP_KERNEL)? as Arc<dyn FalconHal<E>>
> +        }
> +        _ => return Err(ENOTSUPP),
> +    };
> +
> +    Ok(hal)
> +}
> diff --git a/drivers/gpu/nova-core/falcon/hal/ga102.rs b/drivers/gpu/nova-core/falcon/hal/ga102.rs
> new file mode 100644
> index 0000000000000000000000000000000000000000..68d807c192a16a908f40b65f9b03d107f4042b63
> --- /dev/null
> +++ b/drivers/gpu/nova-core/falcon/hal/ga102.rs
> @@ -0,0 +1,120 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +use core::marker::PhantomData;
> +use core::time::Duration;
> +
> +use kernel::{device, prelude::*};
> +
> +use crate::driver::Bar0;
> +use crate::falcon::{
> +    Falcon, FalconBromParams, FalconEngine, FalconModSelAlgo, PeregrineCoreSelect,
> +};
> +use crate::regs;
> +use crate::util;
> +
> +use super::FalconHal;
> +
> +fn select_core_ga102<E: FalconEngine>(bar: &Bar0) -> Result<()> {
> +    let bcr_ctrl = regs::NV_PRISCV_RISCV_BCR_CTRL::read(bar, E::BASE);
> +    if bcr_ctrl.core_select() != PeregrineCoreSelect::Falcon {
> +        regs::NV_PRISCV_RISCV_BCR_CTRL::default()
> +            .set_core_select(PeregrineCoreSelect::Falcon)
> +            .write(bar, E::BASE);
> +
> +        util::wait_on(Duration::from_millis(10), || {
> +            let r = regs::NV_PRISCV_RISCV_BCR_CTRL::read(bar, E::BASE);
> +            if r.valid() {
> +                Some(())
> +            } else {
> +                None
> +            }
> +        })?;
> +    }
> +
> +    Ok(())
> +}
> +
> +fn get_signature_reg_fuse_version_ga102(
> +    dev: &device::Device,
> +    bar: &Bar0,
> +    engine_id_mask: u16,
> +    ucode_id: u8,
> +) -> Result<u32> {
> +    // The ucode fuse versions are contained in the FUSE_OPT_FPF_<ENGINE>_UCODE<X>_VERSION
> +    // registers, which are an array. Our register definition macros do not allow us to manage them
> +    // properly, so we need to hardcode their addresses for now.
> +
> +    // Each engine has 16 ucode version registers numbered from 1 to 16.
> +    if ucode_id == 0 || ucode_id > 16 {
> +        dev_warn!(dev, "invalid ucode id {:#x}", ucode_id);

Given that this is an error condition, this should be dev_err!() I suppose.

> +        return Err(EINVAL);
> +    }
> +    let reg_fuse = if engine_id_mask & 0x0001 != 0 {
> +        // NV_FUSE_OPT_FPF_SEC2_UCODE1_VERSION
> +        0x824140
> +    } else if engine_id_mask & 0x0004 != 0 {
> +        // NV_FUSE_OPT_FPF_NVDEC_UCODE1_VERSION
> +        0x824100
> +    } else if engine_id_mask & 0x0400 != 0 {
> +        // NV_FUSE_OPT_FPF_GSP_UCODE1_VERSION
> +        0x8241c0
> +    } else {
> +        dev_warn!(dev, "unexpected engine_id_mask {:#x}", engine_id_mask);

s/dev_warn/dev_err/

> +        return Err(EINVAL);
> +    } + ((ucode_id - 1) as usize * core::mem::size_of::<u32>());

I feel like this calculation deserves a comment.

> +
> +    let reg_fuse_version = bar.read32(reg_fuse);

I feel like the calculation of reg_fuse should be abstracted with a dedicated
type in regs.rs that takes the magic number derived from the engine_id_mask
(which I assume is chip specific) and the ucode_id.
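
To illustrate the idea, here is a rough standalone sketch. The names
`FuseEngine`, `from_engine_id_mask()`, `ucode1_version_base()` and
`ucode_version_reg()` are hypothetical and not part of the patch; only the mask
bits and base addresses are taken from the code above. It also spells out the
offset calculation (one 32-bit register per ucode id, numbered from 1):

```rust
/// Hypothetical engine selector; name and shape are illustrative only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum FuseEngine {
    Sec2,
    Nvdec,
    Gsp,
}

impl FuseEngine {
    /// Decodes the engine from `engine_id_mask`, testing the same bits as the patch.
    fn from_engine_id_mask(mask: u16) -> Option<Self> {
        if mask & 0x0001 != 0 {
            Some(Self::Sec2)
        } else if mask & 0x0004 != 0 {
            Some(Self::Nvdec)
        } else if mask & 0x0400 != 0 {
            Some(Self::Gsp)
        } else {
            None
        }
    }

    /// Address of the engine's `UCODE1_VERSION` fuse register (values from the patch).
    const fn ucode1_version_base(self) -> usize {
        match self {
            Self::Sec2 => 0x824140,
            Self::Nvdec => 0x824100,
            Self::Gsp => 0x8241c0,
        }
    }
}

/// Returns the address of `UCODE<ucode_id>_VERSION` for `engine`.
///
/// The registers form an array of 32-bit words and ids start at 1, so the
/// byte offset from `UCODE1_VERSION` is `(ucode_id - 1) * 4`.
fn ucode_version_reg(engine: FuseEngine, ucode_id: u8) -> Option<usize> {
    if ucode_id == 0 || ucode_id > 16 {
        return None;
    }

    let offset = (usize::from(ucode_id) - 1) * core::mem::size_of::<u32>();
    Some(engine.ucode1_version_base() + offset)
}
```

In the driver, the `None` cases would map to the `EINVAL` paths that already
exist in get_signature_reg_fuse_version_ga102().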

> +
> +    // Equivalent of Find Last Set bit.
> +    Ok(u32::BITS - reg_fuse_version.leading_zeros())

Maybe we should create a generic helper for that?
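
Something along these lines, as a minimal sketch. The name `fls_u32()` is only
a placeholder, and where such a helper should live is an open question:

```rust
/// Find Last Set: returns the 1-based index of the most significant set bit
/// of `v`, or 0 if no bit is set.
const fn fls_u32(v: u32) -> u32 {
    u32::BITS - v.leading_zeros()
}
```

This matches the expression used in the patch, so the call site would become
`Ok(fls_u32(reg_fuse_version))`.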

> +}
> +
> +fn program_brom_ga102<E: FalconEngine>(bar: &Bar0, params: &FalconBromParams) -> Result<()> {
> +    regs::NV_PFALCON2_FALCON_BROM_PARAADDR::default()
> +        .set_value(params.pkc_data_offset)
> +        .write(bar, E::BASE);
> +    regs::NV_PFALCON2_FALCON_BROM_ENGIDMASK::default()
> +        .set_value(params.engine_id_mask as u32)
> +        .write(bar, E::BASE);
> +    regs::NV_PFALCON2_FALCON_BROM_CURR_UCODE_ID::default()
> +        .set_ucode_id(params.ucode_id)
> +        .write(bar, E::BASE);
> +    regs::NV_PFALCON2_FALCON_MOD_SEL::default()
> +        .set_algo(FalconModSelAlgo::Rsa3k)
> +        .write(bar, E::BASE);
> +
> +    Ok(())
> +}
> +
> +pub(super) struct Ga102<E: FalconEngine>(PhantomData<E>);
> +
> +impl<E: FalconEngine> Ga102<E> {
> +    pub(super) fn new() -> Self {
> +        Self(PhantomData)
> +    }
> +}
> +
> +impl<E: FalconEngine> FalconHal<E> for Ga102<E> {
> +    fn select_core(&self, _falcon: &Falcon<E>, bar: &Bar0) -> Result<()> {
> +        select_core_ga102::<E>(bar)
> +    }
> +
> +    fn get_signature_reg_fuse_version(
> +        &self,
> +        falcon: &Falcon<E>,
> +        bar: &Bar0,
> +        engine_id_mask: u16,
> +        ucode_id: u8,
> +    ) -> Result<u32> {
> +        get_signature_reg_fuse_version_ga102(&falcon.dev, bar, engine_id_mask, ucode_id)
> +    }
> +
> +    fn program_brom(
> +        &self,
> +        _falcon: &Falcon<E>,
> +        bar: &Bar0,
> +        params: &FalconBromParams,
> +    ) -> Result<()> {
> +        program_brom_ga102::<E>(bar, params)
> +    }
> +}
> diff --git a/drivers/gpu/nova-core/falcon/sec2.rs b/drivers/gpu/nova-core/falcon/sec2.rs
> new file mode 100644
> index 0000000000000000000000000000000000000000..c1efdaa7c4e1b8c04c4e041aae3b61a8b65f656b
> --- /dev/null
> +++ b/drivers/gpu/nova-core/falcon/sec2.rs
> @@ -0,0 +1,8 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +use crate::falcon::FalconEngine;
> +
> +pub(crate) struct Sec2;
> +impl FalconEngine for Sec2 {
> +    const BASE: usize = 0x00840000;
> +}
> diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
> index c338da69ecbc2200f1ef3061a4d62971b021e3eb..ece13594fba687f3f714e255b5436e72d80dece3 100644
> --- a/drivers/gpu/nova-core/gpu.rs
> +++ b/drivers/gpu/nova-core/gpu.rs
> @@ -5,6 +5,7 @@
>  use crate::devinit;
>  use crate::dma::DmaObject;
>  use crate::driver::Bar0;
> +use crate::falcon::{gsp::Gsp, sec2::Sec2, Falcon};
>  use crate::firmware::Firmware;
>  use crate::regs;
>  use crate::util;
> @@ -227,6 +228,16 @@ pub(crate) fn new(
>              page
>          };
>  
> +        let gsp_falcon = Falcon::<Gsp>::new(
> +            pdev.as_ref(),
> +            spec.chipset,
> +            bar,
> +            spec.chipset > Chipset::GA100,
> +        )?;
> +        gsp_falcon.clear_swgen0_intr(bar);
> +
> +        let _sec2_falcon = Falcon::<Sec2>::new(pdev.as_ref(), spec.chipset, bar, true)?;

Just `_` instead? Also, please add a comment why it is important to create this
instance even though it's never used.
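
Note that the two spellings differ in when the value is dropped: `let _ = ...`
drops it at the end of the statement, while `let _name = ...` keeps it alive
until the end of the scope. A standalone illustration, unrelated to the driver
types (`Tracer` and `drop_order()` are made up for this example):

```rust
use std::cell::RefCell;

/// Records its label into a shared log when dropped.
struct Tracer<'a> {
    label: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Tracer<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.label);
    }
}

/// Returns the order in which the events below are observed.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        // Bound to a name: lives until the end of the enclosing block.
        let _named = Tracer { label: "named", log: &log };
        // Wildcard pattern: nothing is bound, the value is dropped right away.
        let _ = Tracer { label: "wildcard", log: &log };
        log.borrow_mut().push("end of scope");
    }
    log.into_inner()
}
```

So the choice matters if constructing the instance is supposed to keep
something alive for the rest of the function.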

> +
>          Ok(pin_init!(Self {
>              spec,
>              bar: devres_bar,
> diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs
> index 1c7333e9fabe357f3ecbc6944ca98b66fa17c9a5..8342482a1aa16da2e69f7d99143c1549a82c969e 100644
> --- a/drivers/gpu/nova-core/nova_core.rs
> +++ b/drivers/gpu/nova-core/nova_core.rs
> @@ -5,6 +5,7 @@
>  mod devinit;
>  mod dma;
>  mod driver;
> +mod falcon;
>  mod firmware;
>  mod gpu;
>  mod regs;
> diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
> index 218cb6441eb0e5c6e5b52eabba006163eec0c8b4..b5c6eeb6ed873a06b4aefcb375f4944eb0b20597 100644
> --- a/drivers/gpu/nova-core/regs.rs
> +++ b/drivers/gpu/nova-core/regs.rs
> @@ -9,6 +9,10 @@
>  #[macro_use]
>  mod macros;
>  
> +use crate::falcon::{
> +    DmaTrfCmdSize, FalconCoreRev, FalconCoreRevSubversion, FalconFbifMemType, FalconFbifTarget,
> +    FalconModSelAlgo, FalconSecurityModel, PeregrineCoreSelect,
> +};
>  use crate::gpu::{Architecture, Chipset};
>  
>  /* PMC */
> @@ -59,3 +63,124 @@ pub(crate) fn chipset(self) -> Result<Chipset, Error> {
>  register!(NV_PGC6_AON_SECURE_SCRATCH_GROUP_05 @ 0x00118234 {
>      31:0    value as u32;
>  });
> +
> +/* PFALCON */
> +
> +register!(NV_PFALCON_FALCON_IRQSCLR @ +0x00000004 {
> +    4:4     halt as bool;
> +    6:6     swgen0 as bool;
> +});
> +
> +register!(NV_PFALCON_FALCON_MAILBOX0 @ +0x00000040 {
> +    31:0    value as u32;
> +});
> +
> +register!(NV_PFALCON_FALCON_MAILBOX1 @ +0x00000044 {
> +    31:0    value as u32;
> +});
> +
> +register!(NV_PFALCON_FALCON_RM @ +0x00000084 {
> +    31:0    value as u32;
> +});
> +
> +register!(NV_PFALCON_FALCON_HWCFG2 @ +0x000000f4 {
> +    10:10   riscv as bool;
> +    12:12   mem_scrubbing as bool;
> +    31:31   reset_ready as bool, "Signal indicating that reset is completed (GA102+)";
> +});
> +
> +register!(NV_PFALCON_FALCON_CPUCTL @ +0x00000100 {
> +    1:1     startcpu as bool;
> +    4:4     halted as bool;
> +    6:6     alias_en as bool;
> +});
> +
> +register!(NV_PFALCON_FALCON_BOOTVEC @ +0x00000104 {
> +    31:0    value as u32;
> +});
> +
> +register!(NV_PFALCON_FALCON_DMACTL @ +0x0000010c {
> +    0:0     require_ctx as bool;
> +    1:1     dmem_scrubbing as bool;
> +    2:2     imem_scrubbing as bool;
> +    6:3     dmaq_num as u8;
> +    7:7     secure_stat as bool;
> +});
> +
> +register!(NV_PFALCON_FALCON_DMATRFBASE @ +0x00000110 {
> +    31:0    base as u32;
> +});
> +
> +register!(NV_PFALCON_FALCON_DMATRFMOFFS @ +0x00000114 {
> +    23:0    offs as u32;
> +});
> +
> +register!(NV_PFALCON_FALCON_DMATRFCMD @ +0x00000118 {
> +    0:0     full as bool;
> +    1:1     idle as bool;
> +    3:2     sec as u8;
> +    4:4     imem as bool;
> +    5:5     is_write as bool;
> +    10:8    size as u8 ?=> DmaTrfCmdSize;
> +    14:12   ctxdma as u8;
> +    16:16   set_dmtag as u8;
> +});
> +
> +register!(NV_PFALCON_FALCON_DMATRFFBOFFS @ +0x0000011c {
> +    31:0    offs as u32;
> +});
> +
> +register!(NV_PFALCON_FALCON_DMATRFBASE1 @ +0x00000128 {
> +    8:0     base as u16;
> +});
> +
> +register!(NV_PFALCON_FALCON_HWCFG1 @ +0x0000012c {
> +    3:0     core_rev as u8 ?=> FalconCoreRev, "Core revision";
> +    5:4     security_model as u8 ?=> FalconSecurityModel, "Security model";
> +    7:6     core_rev_subversion as u8 ?=> FalconCoreRevSubversion, "Core revision subversion";
> +});
> +
> +register!(NV_PFALCON_FALCON_CPUCTL_ALIAS @ +0x00000130 {
> +    1:1     startcpu as bool;
> +});
> +
> +// Actually known as `NV_PSEC_FALCON_ENGINE` and `NV_PGSP_FALCON_ENGINE` depending on the falcon
> +// instance.
> +register!(NV_PFALCON_FALCON_ENGINE @ +0x000003c0 {
> +    0:0     reset as bool;
> +});
> +
> +// TODO: this is an array of registers.
> +register!(NV_PFALCON_FBIF_TRANSCFG @ +0x00000600 {
> +    1:0     target as u8 ?=> FalconFbifTarget;
> +    2:2     mem_type as bool => FalconFbifMemType;
> +});
> +
> +register!(NV_PFALCON_FBIF_CTL @ +0x00000624 {
> +    7:7     allow_phys_no_ctx as bool;
> +});
> +
> +register!(NV_PFALCON2_FALCON_MOD_SEL @ +0x00001180 {
> +    7:0     algo as u8 ?=> FalconModSelAlgo;
> +});
> +
> +register!(NV_PFALCON2_FALCON_BROM_CURR_UCODE_ID @ +0x00001198 {
> +    7:0    ucode_id as u8;
> +});
> +
> +register!(NV_PFALCON2_FALCON_BROM_ENGIDMASK @ +0x0000119c {
> +    31:0    value as u32;
> +});
> +
> +// TODO: this is an array of registers.
> +register!(NV_PFALCON2_FALCON_BROM_PARAADDR @ +0x00001210 {
> +    31:0    value as u32;
> +});
> +
> +/* PRISCV */
> +
> +register!(NV_PRISCV_RISCV_BCR_CTRL @ +0x00001668 {
> +    0:0     valid as bool;
> +    4:4     core_select as bool => PeregrineCoreSelect;
> +    8:8     br_fetch as bool;
> +});
> diff --git a/drivers/gpu/nova-core/util.rs b/drivers/gpu/nova-core/util.rs
> index afb525228431a2645afe7bb34988e9537757b1d7..81fcfff1f6f437d2f6a2130ce2249fbf4c1501be 100644
> --- a/drivers/gpu/nova-core/util.rs
> +++ b/drivers/gpu/nova-core/util.rs
> @@ -34,7 +34,6 @@ pub(crate) const fn const_bytes_to_str(bytes: &[u8]) -> &str {
>  ///
>  /// TODO: replace with `read_poll_timeout` once it is available.
>  /// (https://lore.kernel.org/lkml/20250220070611.214262-8-fujita.tomonori@gmail.com/)
> -#[expect(dead_code)]
>  pub(crate) fn wait_on<R, F: Fn() -> Option<R>>(timeout: Duration, cond: F) -> Result<R> {
>      let start_time = Ktime::ktime_get();
>  
> 
> -- 
> 2.49.0
> 


* Re: [PATCH v3 17/19] gpu: nova-core: compute layout of the FRTS region
  2025-05-07 13:52 ` [PATCH v3 17/19] gpu: nova-core: compute layout of the FRTS region Alexandre Courbot
@ 2025-05-13 16:41   ` Danilo Krummrich
  2025-05-17 13:42     ` Alexandre Courbot
  0 siblings, 1 reply; 51+ messages in thread
From: Danilo Krummrich @ 2025-05-13 16:41 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, David Airlie, Simona Vetter, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, Jonathan Corbet, John Hubbard,
	Ben Skeggs, Joel Fernandes, Timur Tabi, Alistair Popple,
	linux-kernel, rust-for-linux, nouveau, dri-devel

On Wed, May 07, 2025 at 10:52:44PM +0900, Alexandre Courbot wrote:
> FWSEC-FRTS is run with the desired address of the FRTS region as
> parameter, which we need to compute depending on some hardware
> parameters.
> 
> Do this in a `FbLayout` structure, that will be later extended to
> describe more memory regions used to boot the GSP.
> 
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> ---
>  drivers/gpu/nova-core/gpu.rs       |   4 ++
>  drivers/gpu/nova-core/gsp.rs       |   3 ++
>  drivers/gpu/nova-core/gsp/fb.rs    | 108 +++++++++++++++++++++++++++++++++++++
>  drivers/gpu/nova-core/nova_core.rs |   1 +
>  drivers/gpu/nova-core/regs.rs      |  27 ++++++++++
>  5 files changed, 143 insertions(+)
> 
> diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
> index 4bf7f72247e5320935a517270b5a0e1ec2becfec..a3d96639706e808305cce66416778d2bf6e7e683 100644
> --- a/drivers/gpu/nova-core/gpu.rs
> +++ b/drivers/gpu/nova-core/gpu.rs
> @@ -7,6 +7,7 @@
>  use crate::driver::Bar0;
>  use crate::falcon::{gsp::Gsp, sec2::Sec2, Falcon};
>  use crate::firmware::Firmware;
> +use crate::gsp::fb::FbLayout;
>  use crate::regs;
>  use crate::util;
>  use crate::vbios::Vbios;
> @@ -239,6 +240,9 @@ pub(crate) fn new(
>  
>          let _sec2_falcon = Falcon::<Sec2>::new(pdev.as_ref(), spec.chipset, bar, true)?;
>  
> +        let fb_layout = FbLayout::new(spec.chipset, bar)?;
> +        dev_dbg!(pdev.as_ref(), "{:#x?}\n", fb_layout);
> +
>          let _bios = Vbios::new(pdev, bar)?;
>  
>          Ok(pin_init!(Self {
> diff --git a/drivers/gpu/nova-core/gsp.rs b/drivers/gpu/nova-core/gsp.rs
> new file mode 100644
> index 0000000000000000000000000000000000000000..27616a9d2b7069b18661fc97811fa1cac285b8f8
> --- /dev/null
> +++ b/drivers/gpu/nova-core/gsp.rs
> @@ -0,0 +1,3 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +pub(crate) mod fb;
> diff --git a/drivers/gpu/nova-core/gsp/fb.rs b/drivers/gpu/nova-core/gsp/fb.rs
> new file mode 100644
> index 0000000000000000000000000000000000000000..f28ded59469d52daf39e5d19c09efd7bf08fee92
> --- /dev/null
> +++ b/drivers/gpu/nova-core/gsp/fb.rs
> @@ -0,0 +1,108 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +use core::ops::Range;
> +
> +use kernel::prelude::*;
> +
> +use crate::driver::Bar0;
> +use crate::gpu::Chipset;
> +use crate::regs;
> +
> +fn align_down(value: u64, align: u64) -> u64 {
> +    value & !(align - 1)
> +}

Can this go in the previous patch, i.e. "rust: num: Add an upward alignment
helper for usize"?
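
For reference, a sketch of how the pair could look side by side. This is a
standalone illustration with plain functions on `u64`, not the API of that
patch, and returning `Option` for a non-power-of-two alignment is an assumption
on my side:

```rust
/// Rounds `value` down to a multiple of `align`.
///
/// Returns `None` if `align` is not a power of two.
const fn align_down(value: u64, align: u64) -> Option<u64> {
    if align.is_power_of_two() {
        Some(value & !(align - 1))
    } else {
        None
    }
}

/// Rounds `value` up to a multiple of `align`.
///
/// Returns `None` if `align` is not a power of two or if rounding up overflows.
const fn align_up(value: u64, align: u64) -> Option<u64> {
    if !align.is_power_of_two() {
        return None;
    }

    match value.checked_add(align - 1) {
        Some(v) => Some(v & !(align - 1)),
        None => None,
    }
}
```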

> +
> +/// Layout of the GPU framebuffer memory.
> +///
> +/// Contains ranges of GPU memory reserved for a given purpose during the GSP bootup process.
> +#[derive(Debug)]
> +#[expect(dead_code)]
> +pub(crate) struct FbLayout {
> +    pub fb: Range<u64>,
> +
> +    pub vga_workspace: Range<u64>,
> +    pub bios: Range<u64>,
> +
> +    pub frts: Range<u64>,

Please remove the empty lines.

> +}
> +
> +impl FbLayout {
> +    pub(crate) fn new(chipset: Chipset, bar: &Bar0) -> Result<Self> {
> +        let fb = {
> +            let fb_size = vidmem_size(bar, chipset);
> +
> +            0..fb_size
> +        };
> +        let fb_len = fb.end - fb.start;

Isn't this the same as fb_size? Why not just write it as

	let fb_size = vidmem_size(bar, chipset);
	let fb = 0..fb_size;

> +
> +        let vga_workspace = {
> +            let vga_base = vga_workspace_addr(bar, fb_len, chipset);
> +
> +            vga_base..fb.end
> +        };
> +
> +        let bios = vga_workspace.clone();

Why? And why store the same thing twice in FbLayout? If it's really needed,
clone it in the constructor below and add a comment why it's the same.

> +
> +        let frts = {
> +            const FRTS_DOWN_ALIGN: u64 = 0x20000;
> +            const FRTS_SIZE: u64 = 0x100000;
> +            let frts_base = align_down(vga_workspace.start, FRTS_DOWN_ALIGN) - FRTS_SIZE;
> +
> +            frts_base..frts_base + FRTS_SIZE
> +        };
> +
> +        Ok(Self {
> +            fb,
> +            vga_workspace,
> +            bios,
> +            frts,
> +        })
> +    }
> +}

I'd probably wrap those helpers below in

	mod hal { ... }

or even a new file fb/hal.rs to make their purpose obvious.

> +/// Returns `true` if the display is disabled.
> +fn display_disabled(bar: &Bar0, chipset: Chipset) -> bool {
> +    if chipset >= Chipset::GA100 {
> +        regs::NV_FUSE_STATUS_OPT_DISPLAY_MAXWELL::read(bar).display_disabled()
> +    } else {
> +        regs::NV_FUSE_STATUS_OPT_DISPLAY_AMPERE::read(bar).display_disabled()
> +    }
> +}
> +
> +/// Returns the video memory size in bytes.
> +fn vidmem_size(bar: &Bar0, chipset: Chipset) -> u64 {
> +    if chipset >= Chipset::GA102 {

Is GA102 intentional or should this also be GA100?

> +        (regs::NV_PGC6_AON_SECURE_SCRATCH_GROUP_42::read(bar).value() as u64) << 20

Why isn't this shift part of the register abstraction?

> +    } else {
> +        let local_mem_range = regs::NV_PFB_PRI_MMU_LOCAL_MEMORY_RANGE::read(bar);
> +        let size =
> +            (local_mem_range.lower_mag() as u64) << ((local_mem_range.lower_scale() as u64) + 20);

Same here. Please make this part of the register abstraction as it is done in
previous patches.
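
A sketch of what I mean, using a stand-in for the type generated by
`register!`. `LocalMemoryRange` and `usable_fb_size()` are hypothetical names;
the field layout and the arithmetic are the ones from this patch:

```rust
/// Stand-in for `NV_PFB_PRI_MMU_LOCAL_MEMORY_RANGE`; only the raw value is modelled.
#[derive(Clone, Copy)]
struct LocalMemoryRange(u32);

impl LocalMemoryRange {
    /// Bits 3:0.
    fn lower_scale(self) -> u8 {
        (self.0 & 0xf) as u8
    }

    /// Bits 9:4.
    fn lower_mag(self) -> u8 {
        ((self.0 >> 4) & 0x3f) as u8
    }

    /// Bit 30.
    fn ecc_mode_enabled(self) -> bool {
        (self.0 >> 30) & 1 != 0
    }

    /// Usable framebuffer size in bytes, same arithmetic as `vidmem_size()`.
    fn usable_fb_size(self) -> u64 {
        // The magnitude is expressed in units of `1 << (scale + 20)` bytes.
        let size = u64::from(self.lower_mag()) << (u32::from(self.lower_scale()) + 20);

        // With ECC enabled, 1/16th of the memory is not available.
        if self.ecc_mode_enabled() {
            size / 16 * 15
        } else {
            size
        }
    }
}
```

The caller then only needs `read(bar).usable_fb_size()` and does not have to
know about the encoding.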

> +
> +        if local_mem_range.ecc_mode_enabled() {
> +            size / 16 * 15
> +        } else {
> +            size
> +        }
> +    }
> +}
> +
> +/// Returns the vga workspace address.
> +fn vga_workspace_addr(bar: &Bar0, fb_size: u64, chipset: Chipset) -> u64 {
> +    let base = fb_size - 0x100000;

What's this offset? How do you guarantee that this never underflows?
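
One way to make this explicit would be a named constant plus checked
arithmetic. In this sketch the constant name is made up, since the patch does
not say what the value represents:

```rust
/// Hypothetical name for the 0x100000 used in the patch; its meaning is not stated there.
const VGA_WORKSPACE_DEFAULT_OFFSET: u64 = 0x100000;

/// Default VGA workspace base, or `None` if the framebuffer is smaller than the offset.
fn default_vga_workspace_base(fb_size: u64) -> Option<u64> {
    fb_size.checked_sub(VGA_WORKSPACE_DEFAULT_OFFSET)
}
```

The caller could then turn `None` into an error, e.g. with `.ok_or(EINVAL)?`.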

> +    let vga_workspace_base = if display_disabled(bar, chipset) {
> +        regs::NV_PDISP_VGA_WORKSPACE_BASE::read(bar)
> +    } else {
> +        return base;
> +    };
> +
> +    if !vga_workspace_base.status_valid() {
> +        return base;
> +    }
> +
> +    let addr = (vga_workspace_base.addr() as u64) << 16;

Where does this shift come from? Shouldn't this be part of the register
abstraction?

> +    if addr < base {
> +        fb_size - 0x20000

What's this offset? Can this ever underflow?

> +    } else {
> +        addr
> +    }
> +}
> diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs
> index ff6d0b40c18f36af4c7e2d5c839fdf77dba23321..202e978e56f024de3ae8b178e65b63c2cea244e1 100644
> --- a/drivers/gpu/nova-core/nova_core.rs
> +++ b/drivers/gpu/nova-core/nova_core.rs
> @@ -8,6 +8,7 @@
>  mod falcon;
>  mod firmware;
>  mod gpu;
> +mod gsp;
>  mod regs;
>  mod util;
>  mod vbios;
> diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
> index b5c6eeb6ed873a06b4aefcb375f4944eb0b20597..15ec9b7e69694ff198b5353d562fc1aff5eefd3f 100644
> --- a/drivers/gpu/nova-core/regs.rs
> +++ b/drivers/gpu/nova-core/regs.rs
> @@ -53,6 +53,12 @@ pub(crate) fn chipset(self) -> Result<Chipset, Error> {
>      23:0    adr_63_40 as u32;
>  });
>  
> +register!(NV_PFB_PRI_MMU_LOCAL_MEMORY_RANGE @ 0x00100ce0 {
> +    3:0     lower_scale as u8;
> +    9:4     lower_mag as u8;
> +    30:30   ecc_mode_enabled as bool;
> +});
> +
>  /* PGC6 */
>  
>  register!(NV_PGC6_AON_SECURE_SCRATCH_GROUP_05_PRIV_LEVEL_MASK @ 0x00118128 {
> @@ -64,6 +70,27 @@ pub(crate) fn chipset(self) -> Result<Chipset, Error> {
>      31:0    value as u32;
>  });
>  
> +register!(NV_PGC6_AON_SECURE_SCRATCH_GROUP_42 @ 0x001183a4 {
> +    31:0    value as u32;
> +});
> +
> +/* PDISP */
> +
> +register!(NV_PDISP_VGA_WORKSPACE_BASE @ 0x00625f04 {
> +    3:3     status_valid as bool;
> +    31:8    addr as u32;
> +});
> +
> +/* FUSE */
> +
> +register!(NV_FUSE_STATUS_OPT_DISPLAY_MAXWELL @ 0x00021c04 {
> +    0:0     display_disabled as bool;
> +});
> +
> +register!(NV_FUSE_STATUS_OPT_DISPLAY_AMPERE @ 0x00820c04 {
> +    0:0     display_disabled as bool;
> +});
> +
>  /* PFALCON */
>  
>  register!(NV_PFALCON_FALCON_IRQSCLR @ +0x00000004 {
> 
> -- 
> 2.49.0
> 


* Re: [PATCH v3 16/19] nova-core: Add support for VBIOS ucode extraction for boot
  2025-05-07 13:52 ` [PATCH v3 16/19] nova-core: Add support for VBIOS ucode extraction for boot Alexandre Courbot
@ 2025-05-13 17:19   ` Danilo Krummrich
  2025-05-20  7:55     ` Joel Fernandes
  2025-05-14 16:23   ` Danilo Krummrich
  2025-05-16 20:38   ` Timur Tabi
  2 siblings, 1 reply; 51+ messages in thread
From: Danilo Krummrich @ 2025-05-13 17:19 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, David Airlie, Simona Vetter, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, Jonathan Corbet, John Hubbard,
	Ben Skeggs, Joel Fernandes, Timur Tabi, Alistair Popple,
	linux-kernel, rust-for-linux, nouveau, dri-devel,
	Shirish Baskaran

On Wed, May 07, 2025 at 10:52:43PM +0900, Alexandre Courbot wrote:
> From: Joel Fernandes <joelagnelf@nvidia.com>
> 
> Add support for navigating and setting up vBIOS ucode data required for
> GSP to boot. The main data extracted from the vBIOS is the FWSEC-FRTS
> firmware which runs on the GSP processor. This firmware runs in high
> secure mode, and sets up the WPR2 (Write protected region) before the
> Booter runs on the SEC2 processor.
> 
> Also add log messages to show the BIOS images.
> 
> [102141.013287] NovaCore: Found BIOS image at offset 0x0, size: 0xfe00, type: PciAt
> [102141.080692] NovaCore: Found BIOS image at offset 0xfe00, size: 0x14800, type: Efi
> [102141.098443] NovaCore: Found BIOS image at offset 0x24600, size: 0x5600, type: FwSec
> [102141.415095] NovaCore: Found BIOS image at offset 0x29c00, size: 0x60800, type: FwSec
> 
> Tested on my Ampere GA102 and boot is successful.
> 
> [applied changes by Alex Courbot for fwsec signatures]
> [applied feedback from Alex Courbot and Timur Tabi]
> [applied changes related to code reorg, prints etc from Danilo Krummrich]
> [acourbot@nvidia.com: fix clippy warnings]
> [acourbot@nvidia.com: remove now-unneeded Devres acquisition]
> [acourbot@nvidia.com: fix read_more to read `len` bytes, not u32s]
> 
> Cc: Alexandre Courbot <acourbot@nvidia.com>
> Cc: John Hubbard <jhubbard@nvidia.com>
> Cc: Shirish Baskaran <sbaskaran@nvidia.com>
> Cc: Alistair Popple <apopple@nvidia.com>
> Cc: Timur Tabi <ttabi@nvidia.com>
> Cc: Ben Skeggs <bskeggs@nvidia.com>
> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> ---
>  drivers/gpu/nova-core/firmware.rs  |    2 -
>  drivers/gpu/nova-core/gpu.rs       |    3 +
>  drivers/gpu/nova-core/nova_core.rs |    1 +
>  drivers/gpu/nova-core/vbios.rs     | 1147 ++++++++++++++++++++++++++++++++++++
>  4 files changed, 1151 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
> index 1eb216307cd01d975b3d5beda1dc516f34b4b3f2..960982174d834c7c66a47ecfb3a15bf47116b2c5 100644
> --- a/drivers/gpu/nova-core/firmware.rs
> +++ b/drivers/gpu/nova-core/firmware.rs
> @@ -80,8 +80,6 @@ pub(crate) struct FalconUCodeDescV3 {
>      _reserved: u16,
>  }
>  
> -// To be removed once that code is used.
> -#[expect(dead_code)]
>  impl FalconUCodeDescV3 {
>      pub(crate) fn size(&self) -> usize {
>          ((self.hdr & 0xffff0000) >> 16) as usize
> diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
> index ece13594fba687f3f714e255b5436e72d80dece3..4bf7f72247e5320935a517270b5a0e1ec2becfec 100644
> --- a/drivers/gpu/nova-core/gpu.rs
> +++ b/drivers/gpu/nova-core/gpu.rs
> @@ -9,6 +9,7 @@
>  use crate::firmware::Firmware;
>  use crate::regs;
>  use crate::util;
> +use crate::vbios::Vbios;
>  use core::fmt;
>  
>  macro_rules! define_chipset {
> @@ -238,6 +239,8 @@ pub(crate) fn new(
>  
>          let _sec2_falcon = Falcon::<Sec2>::new(pdev.as_ref(), spec.chipset, bar, true)?;
>  
> +        let _bios = Vbios::new(pdev, bar)?;

Please add a comment why, even though unused, it is important to create this
instance.

Also, please use `_` if it's not intended to ever be used.

> +
>          Ok(pin_init!(Self {
>              spec,
>              bar: devres_bar,
> diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs
> index 8342482a1aa16da2e69f7d99143c1549a82c969e..ff6d0b40c18f36af4c7e2d5c839fdf77dba23321 100644
> --- a/drivers/gpu/nova-core/nova_core.rs
> +++ b/drivers/gpu/nova-core/nova_core.rs
> @@ -10,6 +10,7 @@
>  mod gpu;
>  mod regs;
>  mod util;
> +mod vbios;
>  
>  kernel::module_pci_driver! {
>      type: driver::NovaCore,
> diff --git a/drivers/gpu/nova-core/vbios.rs b/drivers/gpu/nova-core/vbios.rs
> new file mode 100644
> index 0000000000000000000000000000000000000000..cd55d8dbf8e12d532f776d7544c7e5f2a865d6f8
> --- /dev/null
> +++ b/drivers/gpu/nova-core/vbios.rs
> @@ -0,0 +1,1147 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! VBIOS extraction and parsing.
> +
> +// To be removed when all code is used.
> +#![expect(dead_code)]
> +
> +use crate::driver::Bar0;
> +use crate::firmware::FalconUCodeDescV3;
> +use core::convert::TryFrom;
> +use kernel::device;
> +use kernel::error::Result;
> +use kernel::num::NumAlign;
> +use kernel::pci;
> +use kernel::prelude::*;
> +
> +/// The offset of the VBIOS ROM in the BAR0 space.
> +const ROM_OFFSET: usize = 0x300000;
> +/// The maximum length of the VBIOS ROM to scan into.
> +const BIOS_MAX_SCAN_LEN: usize = 0x100000;
> +/// The size to read ahead when parsing initial BIOS image headers.
> +const BIOS_READ_AHEAD_SIZE: usize = 1024;
> +/// The bit in the last image indicator byte for the PCI Data Structure that
> +/// indicates the last image. Bit 0-6 are reserved, bit 7 is last image bit.
> +const LAST_IMAGE_BIT_MASK: u8 = 0x80;
> +
> +// PMU lookup table entry types. Used to locate PMU table entries
> +// in the Fwsec image, corresponding to falcon ucodes.
> +#[expect(dead_code)]
> +const FALCON_UCODE_ENTRY_APPID_FIRMWARE_SEC_LIC: u8 = 0x05;
> +#[expect(dead_code)]
> +const FALCON_UCODE_ENTRY_APPID_FWSEC_DBG: u8 = 0x45;
> +const FALCON_UCODE_ENTRY_APPID_FWSEC_PROD: u8 = 0x85;
> +
> +/// Vbios Reader for constructing the VBIOS data
> +struct VbiosIterator<'a> {
> +    pdev: &'a pci::Device,
> +    bar0: &'a Bar0,
> +    // VBIOS data vector: As BIOS images are scanned, they are added to this vector
> +    // for reference or copying into other data structures. It is the entire
> +    // scanned contents of the VBIOS which progressively extends. It is used
> +    // so that we do not re-read any contents that are already read as we use
> +    // the cumulative length read so far, and re-read any gaps as we extend
> +    // the length.
> +    data: KVec<u8>,
> +    current_offset: usize, // Current offset for iterator
> +    last_found: bool,      // Whether the last image has been found
> +}
> +
> +impl<'a> VbiosIterator<'a> {
> +    fn new(pdev: &'a pci::Device, bar0: &'a Bar0) -> Result<Self> {
> +        Ok(Self {
> +            pdev,
> +            bar0,
> +            data: KVec::new(),
> +            current_offset: 0,
> +            last_found: false,
> +        })
> +    }
> +
> +    /// Read bytes from the ROM at the current end of the data vector
> +    fn read_more(&mut self, len: usize) -> Result {
> +        let current_len = self.data.len();
> +        let start = ROM_OFFSET + current_len;
> +
> +        // Ensure length is a multiple of 4 for 32-bit reads
> +        if len % core::mem::size_of::<u32>() != 0 {
> +            dev_err!(
> +                self.pdev.as_ref(),
> +                "VBIOS read length {} is not a multiple of 4\n",
> +                len
> +            );
> +            return Err(EINVAL);
> +        }
> +
> +        self.data.reserve(len, GFP_KERNEL)?;
> +        // Read ROM data bytes and push directly to vector
> +        for i in (0..len).step_by(core::mem::size_of::<u32>()) {
> +            // Read 32-bit word from the VBIOS ROM
> +            let word = self.bar0.try_read32(start + i)?;
> +
> +            // Convert the u32 to a 4 byte array and push each byte
> +            word.to_ne_bytes()
> +                .iter()
> +                .try_for_each(|&b| self.data.push(b, GFP_KERNEL))?;
> +        }
> +
> +        Ok(())
> +    }
> +
> +    /// Read bytes at a specific offset, filling any gap
> +    fn read_more_at_offset(&mut self, offset: usize, len: usize) -> Result {
> +        if offset > BIOS_MAX_SCAN_LEN {
> +            dev_err!(self.pdev.as_ref(), "Error: exceeded BIOS scan limit.\n");
> +            return Err(EINVAL);
> +        }
> +
> +        // If offset is beyond current data size, fill the gap first
> +        let current_len = self.data.len();
> +        let gap_bytes = offset.saturating_sub(current_len);
> +
> +        // Now read the requested bytes at the offset
> +        self.read_more(gap_bytes + len)
> +    }
> +
> +    /// Read a BIOS image at a specific offset and create a BiosImage from it.
> +    /// self.data is extended as needed and a new BiosImage is returned.
> +    /// @context is a string describing the operation for error reporting
> +    fn read_bios_image_at_offset(
> +        &mut self,
> +        offset: usize,
> +        len: usize,
> +        context: &str,
> +    ) -> Result<BiosImage> {
> +        let data_len = self.data.len();
> +        if offset + len > data_len {
> +            self.read_more_at_offset(offset, len).inspect_err(|e| {
> +                dev_err!(
> +                    self.pdev.as_ref(),
> +                    "Failed to read more at offset {:#x}: {:?}\n",
> +                    offset,
> +                    e
> +                )
> +            })?;
> +        }
> +
> +        BiosImage::new(self.pdev, &self.data[offset..offset + len]).inspect_err(|err| {
> +            dev_err!(
> +                self.pdev.as_ref(),
> +                "Failed to {} at offset {:#x}: {:?}\n",
> +                context,
> +                offset,
> +                err
> +            )
> +        })
> +    }
> +}
> +
> +impl<'a> Iterator for VbiosIterator<'a> {
> +    type Item = Result<BiosImage>;
> +
> +    /// Iterate over all VBIOS images until the last image is detected or offset
> +    /// exceeds scan limit.
> +    fn next(&mut self) -> Option<Self::Item> {
> +        if self.last_found {
> +            return None;
> +        }
> +
> +        if self.current_offset > BIOS_MAX_SCAN_LEN {
> +            dev_err!(
> +                self.pdev.as_ref(),
> +                "Error: exceeded BIOS scan limit, stopping scan\n"
> +            );
> +            return None;
> +        }
> +
> +        // Parse image headers first to get image size
> +        let image_size = match self
> +            .read_bios_image_at_offset(
> +                self.current_offset,
> +                BIOS_READ_AHEAD_SIZE,
> +                "parse initial BIOS image headers",
> +            )
> +            .and_then(|image| image.image_size_bytes())
> +        {
> +            Ok(size) => size,
> +            Err(e) => return Some(Err(e)),
> +        };
> +
> +        // Now create a new BiosImage with the full image data
> +        let full_image = match self.read_bios_image_at_offset(
> +            self.current_offset,
> +            image_size,
> +            "parse full BIOS image",
> +        ) {
> +            Ok(image) => image,
> +            Err(e) => return Some(Err(e)),
> +        };
> +
> +        self.last_found = full_image.is_last();
> +
> +        // Advance to next image (aligned to 512 bytes)
> +        self.current_offset += image_size;
> +        self.current_offset = self.current_offset.align_up(512);
> +
> +        Some(Ok(full_image))
> +    }
> +}
> +
> +pub(crate) struct Vbios {
> +    pub fwsec_image: Option<FwSecBiosImage>,

Please use pub(crate) instead or provide an accessor.

Also, this shouldn't be an Option, see below comment in Vbios::new().

> +}
> +
> +impl Vbios {
> +    /// Probe for VBIOS extraction
> +    /// Once the VBIOS object is built, bar0 is not read for vbios purposes anymore.
> +    pub(crate) fn new(pdev: &pci::Device, bar0: &Bar0) -> Result<Vbios> {
> +        // Images to extract from iteration
> +        let mut pci_at_image: Option<PciAtBiosImage> = None;
> +        let mut first_fwsec_image: Option<FwSecBiosImage> = None;
> +        let mut second_fwsec_image: Option<FwSecBiosImage> = None;
> +
> +        // Parse all VBIOS images in the ROM
> +        for image_result in VbiosIterator::new(pdev, bar0)? {
> +            let full_image = image_result?;
> +
> +            dev_info!(

Let's use dev_dbg!() instead.

> +                pdev.as_ref(),
> +                "Found BIOS image: size: {:#x}, type: {}, last: {}\n",
> +                full_image.image_size_bytes()?,
> +                full_image.image_type_str(),
> +                full_image.is_last()
> +            );
> +
> +            // Get references to images we will need after the loop, in order to
> +            // setup the falcon data offset.
> +            match full_image {
> +                BiosImage::PciAt(image) => {
> +                    pci_at_image = Some(image);
> +                }
> +                BiosImage::FwSec(image) => {
> +                    if first_fwsec_image.is_none() {
> +                        first_fwsec_image = Some(image);
> +                    } else {
> +                        second_fwsec_image = Some(image);
> +                    }
> +                }
> +                // For now we don't need to handle these
> +                BiosImage::Efi(_image) => {}
> +                BiosImage::Nbsi(_image) => {}
> +            }
> +        }
> +
> +        // Using all the images, setup the falcon data pointer in Fwsec.
> +        // We need mutable access here, so we handle the Option manually.
> +        let final_fwsec_image = {
> +            let mut second = second_fwsec_image; // Take ownership of the option
> +
> +            if let (Some(second), Some(first), Some(pci_at)) =
> +                (second.as_mut(), first_fwsec_image, pci_at_image)
> +            {
> +                second
> +                    .setup_falcon_data(pdev, &pci_at, &first)
> +                    .inspect_err(|e| {
> +                        dev_err!(pdev.as_ref(), "Falcon data setup failed: {:?}\n", e)
> +                    })?;
> +            } else {
> +                dev_err!(
> +                    pdev.as_ref(),
> +                    "Missing required images for falcon data setup, skipping\n"
> +                );
> +                return Err(EINVAL);

This means that if second == None we fail, which makes sense, so why store an
Option in Vbios? All methods of Vbios fail if fwsec_image == None.

> +            }
> +            second
> +        };

I think this should be:

	let mut second = second_fwsec_image;
	
	if let (Some(second), Some(first), Some(pci_at)) =
	    (second.as_mut(), first_fwsec_image, pci_at_image)
	{
	    second
	        .setup_falcon_data(pdev, &pci_at, &first)
	        .inspect_err(|e| {
	            dev_err!(pdev.as_ref(), "Falcon data setup failed: {:?}\n", e)
	        })?;
	
	    Ok(Vbios(second))
	} else {
	    dev_err!(
	        pdev.as_ref(),
	        "Missing required images for falcon data setup, skipping\n"
	    );
	
	    Err(EINVAL)
	}

where Vbios can just be

	pub(crate) struct Vbios(FwSecBiosImage);

> +
> +        Ok(Vbios {
> +            fwsec_image: final_fwsec_image,
> +        })
> +    }
> +
> +    pub(crate) fn fwsec_header(&self, pdev: &device::Device) -> Result<&FalconUCodeDescV3> {
> +        let image = self.fwsec_image.as_ref().ok_or(EINVAL)?;
> +        image.fwsec_header(pdev)
> +    }
> +
> +    pub(crate) fn fwsec_ucode(&self, pdev: &device::Device) -> Result<&[u8]> {
> +        let image = self.fwsec_image.as_ref().ok_or(EINVAL)?;
> +        image.fwsec_ucode(pdev, image.fwsec_header(pdev)?)
> +    }
> +
> +    pub(crate) fn fwsec_sigs(&self, pdev: &device::Device) -> Result<&[u8]> {
> +        let image = self.fwsec_image.as_ref().ok_or(EINVAL)?;
> +        image.fwsec_sigs(pdev, image.fwsec_header(pdev)?)
> +    }

Those then become infallible, e.g.

	pub(crate) fn fwsec_sigs(&self, pdev: &device::Device) -> &[u8] {
	    self.0.fwsec_sigs(pdev, self.fwsec_header(pdev))
	}
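
To illustrate the idea outside the kernel, here is a minimal userspace sketch
of "validate once at construction, then keep accessors infallible". Everything
in it is an assumption made for illustration: `FwSecImage` is a hypothetical
stand-in for `FwSecBiosImage`, and the string error replaces the kernel error
type.

```rust
/// Hypothetical stand-in for `FwSecBiosImage`; it only holds a signature blob.
#[derive(Debug)]
struct FwSecImage {
    sigs: Vec<u8>,
}

impl FwSecImage {
    fn sigs(&self) -> &[u8] {
        &self.sigs
    }
}

/// Newtype that can only exist once the required image has been found.
#[derive(Debug)]
struct Vbios(FwSecImage);

impl Vbios {
    /// Fails once, at construction time, if the required image is missing.
    fn new(found: Option<FwSecImage>) -> Result<Self, &'static str> {
        found.map(Vbios).ok_or("missing FWSEC image")
    }

    /// Infallible: the invariant was already established by `new()`.
    fn fwsec_sigs(&self) -> &[u8] {
        self.0.sigs()
    }
}
```

With this shape the `Option` only lives inside `new()`, so callers never have
to handle a "no image" case that cannot occur.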

> +}

<snip>

I have to continue with the rest of this patch later on.

- Danilo

* Re: [PATCH v3 16/19] nova-core: Add support for VBIOS ucode extraction for boot
  2025-05-07 13:52 ` [PATCH v3 16/19] nova-core: Add support for VBIOS ucode extraction for boot Alexandre Courbot
  2025-05-13 17:19   ` Danilo Krummrich
@ 2025-05-14 16:23   ` Danilo Krummrich
  2025-05-19 22:59     ` Joel Fernandes
  2025-05-20  7:18     ` Joel Fernandes
  2025-05-16 20:38   ` Timur Tabi
  2 siblings, 2 replies; 51+ messages in thread
From: Danilo Krummrich @ 2025-05-14 16:23 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, David Airlie, Simona Vetter, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, Jonathan Corbet, John Hubbard,
	Ben Skeggs, Joel Fernandes, Timur Tabi, Alistair Popple,
	linux-kernel, rust-for-linux, nouveau, dri-devel,
	Shirish Baskaran

On Wed, May 07, 2025 at 10:52:43PM +0900, Alexandre Courbot wrote:
> +/// PCI Data Structure as defined in PCI Firmware Specification
> +#[derive(Debug, Clone)]
> +#[repr(C)]
> +struct PcirStruct {
> +    /// PCI Data Structure signature ("PCIR" or "NPDS")
> +    pub signature: [u8; 4],
> +    /// PCI Vendor ID (e.g., 0x10DE for NVIDIA)
> +    pub vendor_id: u16,
> +    /// PCI Device ID
> +    pub device_id: u16,
> +    /// Device List Pointer
> +    pub device_list_ptr: u16,
> +    /// PCI Data Structure Length
> +    pub pci_data_struct_len: u16,
> +    /// PCI Data Structure Revision
> +    pub pci_data_struct_rev: u8,
> +    /// Class code (3 bytes, 0x03 for display controller)
> +    pub class_code: [u8; 3],
> +    /// Size of this image in 512-byte blocks
> +    pub image_len: u16,
> +    /// Revision Level of the Vendor's ROM
> +    pub vendor_rom_rev: u16,
> +    /// ROM image type (0x00 = PC-AT compatible, 0x03 = EFI, 0x70 = NBSI)
> +    pub code_type: u8,
> +    /// Last image indicator (0x00 = Not last image, 0x80 = Last image)
> +    pub last_image: u8,
> +    /// Maximum Run-time Image Length (units of 512 bytes)
> +    pub max_runtime_image_len: u16,
> +}

Here and in a couple more cases below, please don't use pub for fields of
private structures.

> +
> +impl PcirStruct {
> +    fn new(pdev: &pci::Device, data: &[u8]) -> Result<Self> {
> +        if data.len() < core::mem::size_of::<PcirStruct>() {
> +            dev_err!(pdev.as_ref(), "Not enough data for PcirStruct\n");
> +            return Err(EINVAL);
> +        }
> +
> +        let mut signature = [0u8; 4];
> +        signature.copy_from_slice(&data[0..4]);
> +
> +        // Signature should be "PCIR" (0x52494350) or "NPDS" (0x5344504e)
> +        if &signature != b"PCIR" && &signature != b"NPDS" {
> +            dev_err!(
> +                pdev.as_ref(),
> +                "Invalid signature for PcirStruct: {:?}\n",
> +                signature
> +            );
> +            return Err(EINVAL);
> +        }
> +
> +        let mut class_code = [0u8; 3];
> +        class_code.copy_from_slice(&data[13..16]);
> +
> +        Ok(PcirStruct {
> +            signature,
> +            vendor_id: u16::from_le_bytes([data[4], data[5]]),
> +            device_id: u16::from_le_bytes([data[6], data[7]]),
> +            device_list_ptr: u16::from_le_bytes([data[8], data[9]]),
> +            pci_data_struct_len: u16::from_le_bytes([data[10], data[11]]),
> +            pci_data_struct_rev: data[12],
> +            class_code,
> +            image_len: u16::from_le_bytes([data[16], data[17]]),
> +            vendor_rom_rev: u16::from_le_bytes([data[18], data[19]]),
> +            code_type: data[20],
> +            last_image: data[21],
> +            max_runtime_image_len: u16::from_le_bytes([data[22], data[23]]),
> +        })

Quite a few of those fields seem unused; do we still want to keep them? The
same goes for other structures below.
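
As a rough userspace sketch of what a trimmed-down parser could look like:
`PcirSummary` and `parse_pcir()` are hypothetical names, and the choice of
which fields to keep (the two read by the quoted helpers, plus `code_type`) is
only a guess. The 0x80 mask value is taken from the field's doc comment quoted
above.

```rust
/// Hypothetical reduced view of the PCI Data Structure. Keeps only the fields
/// read by the quoted `is_last()` / `image_size_bytes()` helpers, plus the
/// image type.
#[derive(Debug, PartialEq)]
struct PcirSummary {
    image_len: u16,
    code_type: u8,
    last_image: u8,
}

/// Assumed value, per the quoted doc comment (0x80 = last image).
const LAST_IMAGE_BIT_MASK: u8 = 0x80;

fn parse_pcir(data: &[u8]) -> Option<PcirSummary> {
    // The quoted layout reads bytes 0..24; bail out if fewer are available.
    let data = data.get(..24)?;
    if &data[0..4] != b"PCIR" && &data[0..4] != b"NPDS" {
        return None;
    }
    Some(PcirSummary {
        image_len: u16::from_le_bytes([data[16], data[17]]),
        code_type: data[20],
        last_image: data[21],
    })
}

impl PcirSummary {
    fn is_last(&self) -> bool {
        self.last_image & LAST_IMAGE_BIT_MASK != 0
    }

    /// Image size in bytes; `image_len` counts 512-byte blocks.
    fn image_size_bytes(&self) -> Option<usize> {
        if self.image_len > 0 {
            Some(self.image_len as usize * 512)
        } else {
            None
        }
    }
}
```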

> +    }
> +
> +    /// Check if this is the last image in the ROM
> +    fn is_last(&self) -> bool {
> +        self.last_image & LAST_IMAGE_BIT_MASK != 0
> +    }
> +
> +    /// Calculate image size in bytes
> +    fn image_size_bytes(&self) -> Result<usize> {
> +        if self.image_len > 0 {
> +            // Image size is in 512-byte blocks
> +            Ok(self.image_len as usize * 512)
> +        } else {
> +            Err(EINVAL)
> +        }
> +    }
> +}
> +
> +/// BIOS Information Table (BIT) Header
> +/// This is the head of the BIT table, that is used to locate the Falcon data.
> +/// The BIT table (with its header) is in the PciAtBiosImage and the falcon data
> +/// it is pointing to is in the FwSecBiosImage.
> +#[derive(Debug, Clone, Copy)]
> +#[expect(dead_code)]
> +struct BitHeader {
> +    /// 0h: BIT Header Identifier (BMP=0x7FFF/BIT=0xB8FF)
> +    pub id: u16,
> +    /// 2h: BIT Header Signature ("BIT\0")
> +    pub signature: [u8; 4],
> +    /// 6h: Binary Coded Decimal Version, ex: 0x0100 is 1.00.
> +    pub bcd_version: u16,
> +    /// 8h: Size of BIT Header (in bytes)
> +    pub header_size: u8,
> +    /// 9h: Size of BIT Tokens (in bytes)
> +    pub token_size: u8,
> +    /// 10h: Number of token entries that follow
> +    pub token_entries: u8,
> +    /// 11h: BIT Header Checksum
> +    pub checksum: u8,
> +}
> +
> +impl BitHeader {
> +    fn new(data: &[u8]) -> Result<Self> {
> +        if data.len() < 12 {
> +            return Err(EINVAL);
> +        }
> +
> +        let mut signature = [0u8; 4];
> +        signature.copy_from_slice(&data[2..6]);
> +
> +        // Check header ID and signature
> +        let id = u16::from_le_bytes([data[0], data[1]]);
> +        if id != 0xB8FF || &signature != b"BIT\0" {
> +            return Err(EINVAL);
> +        }
> +
> +        Ok(BitHeader {
> +            id,
> +            signature,
> +            bcd_version: u16::from_le_bytes([data[6], data[7]]),
> +            header_size: data[8],
> +            token_size: data[9],
> +            token_entries: data[10],
> +            checksum: data[11],
> +        })
> +    }
> +}
> +
> +/// BIT Token Entry: Records in the BIT table followed by the BIT header
> +#[derive(Debug, Clone, Copy)]
> +#[expect(dead_code)]
> +struct BitToken {
> +    /// 00h: Token identifier
> +    pub id: u8,
> +    /// 01h: Version of the token data
> +    pub data_version: u8,
> +    /// 02h: Size of token data in bytes
> +    pub data_size: u16,
> +    /// 04h: Offset to the token data
> +    pub data_offset: u16,
> +}
> +
> +// Define the token ID for the Falcon data
> +pub(in crate::vbios) const BIT_TOKEN_ID_FALCON_DATA: u8 = 0x70;

This can just be private.

> +
> +impl BitToken {
> +    /// Find a BIT token entry by BIT ID in a PciAtBiosImage
> +    pub(in crate::vbios) fn from_id(image: &PciAtBiosImage, token_id: u8) -> Result<Self> {

Same here.

<snip>

> +struct PciAtBiosImage {
> +    base: BiosImageBase,
> +    bit_header: Option<BitHeader>,
> +    bit_offset: Option<usize>,

Why are those Options? AFAICS, this structure is only ever created from

	impl TryFrom<BiosImageBase> for PciAtBiosImage

and there you fail if you can't find the bit header anyways.

Also BitToken::from_id fails if bit_header == None, and it doesn't seem to be
used anywhere else.

I think we should remove the Option wrapper for both.
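
A minimal userspace sketch of that shape, assuming the conversion is the only
constructor. The types are simplified stand-ins, and the way the BIT header is
located here (a byte-pattern search for the 0xB8FF id followed by "BIT\0") is
an assumption for illustration, not necessarily what the patch does.

```rust
use std::convert::TryFrom;

/// Simplified stand-in for `BiosImageBase`.
struct BiosImageBase {
    data: Vec<u8>,
}

/// Stand-in for `PciAtBiosImage`: the offset is a plain `usize`, not an `Option`.
struct PciAtImage {
    base: BiosImageBase,
    bit_offset: usize,
}

impl TryFrom<BiosImageBase> for PciAtImage {
    type Error = &'static str;

    fn try_from(base: BiosImageBase) -> Result<Self, Self::Error> {
        // Little-endian 0xB8FF id followed by the "BIT\0" signature.
        const PATTERN: [u8; 6] = [0xff, 0xb8, b'B', b'I', b'T', 0];

        // The conversion fails here, so a constructed value always has a header.
        let bit_offset = base
            .data
            .windows(PATTERN.len())
            .position(|w| w == &PATTERN[..])
            .ok_or("BIT header not found")?;

        Ok(PciAtImage { base, bit_offset })
    }
}
```

Since the conversion already fails when the header is missing, later users of
`bit_offset` have nothing left to check.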

<snip>

> +/// The PmuLookupTableEntry structure is used to find the PmuLookupTableEntry
> +/// for a given application ID. The table of entries is pointed to by the falcon
> +/// data pointer in the BIT table, and is used to locate the Falcon Ucode.
> +#[expect(dead_code)]
> +struct PmuLookupTable {
> +    version: u8,
> +    header_len: u8,
> +    entry_len: u8,
> +    entry_count: u8,
> +    table_data: KVec<u8>,
> +}
> +
> +impl PmuLookupTable {
> +    fn new(data: &[u8]) -> Result<Self> {
> +        if data.len() < 4 {
> +            return Err(EINVAL);
> +        }
> +
> +        let header_len = data[1] as usize;
> +        let entry_len = data[2] as usize;
> +        let entry_count = data[3] as usize;
> +
> +        let required_bytes = header_len + (entry_count * entry_len);
> +
> +        if data.len() < required_bytes {
> +            return Err(EINVAL);
> +        }
> +
> +        // Create a copy of only the table data
> +        let mut table_data = KVec::new();
> +
> +        // "last_entry_bytes" is a debugging aid.
> +        let mut last_entry_bytes: Option<KVec<u8>> = if cfg!(debug_assertions) {
> +            Some(KVec::new())
> +        } else {
> +            None
> +        };
> +
> +        for &byte in &data[header_len..required_bytes] {
> +            table_data.push(byte, GFP_KERNEL)?;

This should just be

	table_data.extend_from_slice(&data[header_len..required_bytes], GFP_KERNEL)?;

so you don't need the loop and potentially lots of re-allocations.

Subsequently you can implement the debugging stuff as

	if cfg!(debug_assertions) {
	    let mut last_entry_bytes = KVec::new();
	
	    for &byte in &data[header_len..required_bytes] {
	        // Debugging (dumps the table data to dmesg):
	        last_entry_bytes.push(byte, GFP_KERNEL)?;
	
	        let last_entry_bytes_len = last_entry_bytes.len();
	        if last_entry_bytes_len == entry_len {
	            pr_info!("Last entry bytes: {:02x?}\n", &last_entry_bytes[..]);
	            last_entry_bytes = KVec::new();
	        }
	    }
	}

In general, I feel like this patch utilizes the Option type way too much and
often without actual need. Can you please also double check?
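
For comparison, the same split can be sketched in userspace Rust with a plain
`Vec` (so no GFP flags; the kernel `KVec` call takes them as shown above).
`copy_table()` is a hypothetical helper, and the debug dump uses
`chunks_exact()` in place of the manual per-byte accumulation, which is a
swapped-in technique rather than what the patch does.

```rust
/// Copy `entry_count` entries of `entry_len` bytes that follow a header of
/// `header_len` bytes. Returns `None` if `data` is too short.
fn copy_table(
    data: &[u8],
    header_len: usize,
    entry_len: usize,
    entry_count: usize,
) -> Option<Vec<u8>> {
    // Checked arithmetic so a bogus header cannot overflow the bounds math.
    let required = header_len.checked_add(entry_count.checked_mul(entry_len)?)?;
    let entries = data.get(header_len..required)?;

    // Single bulk copy instead of one push per byte.
    let mut table = Vec::new();
    table.extend_from_slice(entries);

    // Debug-only dump, kept separate from the copy.
    if cfg!(debug_assertions) && entry_len != 0 {
        for entry in entries.chunks_exact(entry_len) {
            eprintln!("Entry bytes: {:02x?}", entry);
        }
    }

    Some(table)
}
```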

> +
> +            if cfg!(debug_assertions) {
> +                // Debugging (dumps the table data to dmesg):
> +                if let Some(ref mut last_entry_bytes) = last_entry_bytes {
> +                    last_entry_bytes.push(byte, GFP_KERNEL)?;
> +
> +                    let last_entry_bytes_len = last_entry_bytes.len();
> +                    if last_entry_bytes_len == entry_len {
> +                        pr_info!("Last entry bytes: {:02x?}\n", &last_entry_bytes[..]);

Please use dev_dbg!().

> +                        *last_entry_bytes = KVec::new();
> +                    }
> +                }
> +            }
> +        }
> +
> +        Ok(PmuLookupTable {
> +            version: data[0],
> +            header_len: header_len as u8,
> +            entry_len: entry_len as u8,
> +            entry_count: entry_count as u8,
> +            table_data,
> +        })
> +    }
> +
> +    fn lookup_index(&self, idx: u8) -> Result<PmuLookupTableEntry> {
> +        if idx >= self.entry_count {
> +            return Err(EINVAL);
> +        }
> +
> +        let index = (idx as usize) * self.entry_len as usize;
> +        PmuLookupTableEntry::new(&self.table_data[index..])
> +    }
> +
> +    // find entry by type value
> +    fn find_entry_by_type(&self, entry_type: u8) -> Result<PmuLookupTableEntry> {
> +        for i in 0..self.entry_count {
> +            let entry = self.lookup_index(i)?;
> +            if entry.application_id == entry_type {
> +                return Ok(entry);
> +            }
> +        }
> +
> +        Err(EINVAL)
> +    }
> +}
> +
> +/// The FwSecBiosImage structure contains the PMU table and the Falcon Ucode.
> +/// The PMU table contains voltage/frequency tables as well as a pointer to the
> +/// Falcon Ucode.
> +impl FwSecBiosImage {
> +    fn setup_falcon_data(
> +        &mut self,
> +        pdev: &pci::Device,
> +        pci_at_image: &PciAtBiosImage,
> +        first_fwsec_image: &FwSecBiosImage,
> +    ) -> Result<()> {

Just Result will do.
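
This works because the alias has a defaulted success type. A small userspace
sketch of the same mechanism, where `Errno`, `EINVAL` and the local `Result`
alias are stand-ins defined only for this example:

```rust
/// Stand-in error type for this example.
#[derive(Debug, PartialEq)]
struct Errno(i32);

const EINVAL: Errno = Errno(22);

/// Alias with a defaulted success type: `Result` alone means `Result<()>`.
type Result<T = ()> = core::result::Result<T, Errno>;

/// Return type spelled with the default.
fn setup(offset: usize, len: usize) -> Result {
    if offset >= len {
        return Err(EINVAL);
    }
    Ok(())
}

/// Same type, spelled out explicitly.
fn setup_explicit(offset: usize, len: usize) -> Result<()> {
    setup(offset, len)
}
```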

> +        let mut offset = pci_at_image.falcon_data_ptr(pdev)? as usize;
> +
> +        // The falcon data pointer assumes that the PciAt and FWSEC images
> +        // are contiguous in memory. However, testing shows the EFI image sits in
> +        // between them. So calculate the offset from the end of the PciAt image
> +        // rather than the start of it. Compensate.
> +        offset -= pci_at_image.base.data.len();
> +
> +        // The offset is now from the start of the first Fwsec image, however
> +        // the offset points to a location in the second Fwsec image. Since
> +        // the fwsec images are contiguous, subtract the length of the first Fwsec
> +        // image from the offset to get the offset to the start of the second
> +        // Fwsec image.
> +        offset -= first_fwsec_image.base.data.len();
> +
> +        self.falcon_data_offset = Some(offset);
> +
> +        // The PmuLookupTable starts at the offset of the falcon data pointer
> +        self.pmu_lookup_table = Some(PmuLookupTable::new(&self.base.data[offset..])?);
> +
> +        match self
> +            .pmu_lookup_table
> +            .as_ref()
> +            .ok_or(EINVAL)?
> +            .find_entry_by_type(FALCON_UCODE_ENTRY_APPID_FWSEC_PROD)
> +        {
> +            Ok(entry) => {
> +                let mut ucode_offset = entry.data as usize;
> +                ucode_offset -= pci_at_image.base.data.len();
> +                ucode_offset -= first_fwsec_image.base.data.len();
> +                self.falcon_ucode_offset = Some(ucode_offset);
> +                if cfg!(debug_assertions) {
> +                    // Print the v3_desc header for debugging
> +                    let v3_desc = self.fwsec_header(pdev.as_ref())?;
> +                    pr_info!("PmuLookupTableEntry v3_desc: {:#?}\n", v3_desc);
> +                }
> +            }
> +            Err(e) => {
> +                dev_err!(
> +                    pdev.as_ref(),
> +                    "PmuLookupTableEntry not found, error: {:?}\n",
> +                    e
> +                );
> +            }
> +        }
> +        Ok(())
> +    }
> +
> +    /// TODO: These were borrowed from the old code for integrating this module
> +    /// with the outside world. They should be cleaned up and integrated properly.

Okay, won't review for now then. :)

* Re: [PATCH v3 18/19] gpu: nova-core: extract FWSEC from BIOS and patch it to run FWSEC-FRTS
  2025-05-07 13:52 ` [PATCH v3 18/19] gpu: nova-core: extract FWSEC from BIOS and patch it to run FWSEC-FRTS Alexandre Courbot
@ 2025-05-14 16:38   ` Danilo Krummrich
  2025-05-19 14:24     ` Alexandre Courbot
  0 siblings, 1 reply; 51+ messages in thread
From: Danilo Krummrich @ 2025-05-14 16:38 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, David Airlie, Simona Vetter, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, Jonathan Corbet, John Hubbard,
	Ben Skeggs, Joel Fernandes, Timur Tabi, Alistair Popple,
	linux-kernel, rust-for-linux, nouveau, dri-devel

On Wed, May 07, 2025 at 10:52:45PM +0900, Alexandre Courbot wrote:
> The FWSEC firmware needs to be extracted from the VBIOS and patched with
> the desired command, as well as the right signature. Do this so we are
> ready to load and run this firmware into the GSP falcon and create the
> FRTS region.
> 
> [joelagnelf@nvidia.com: give better names to FalconAppifHdrV1's fields]
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> ---
>  drivers/gpu/nova-core/dma.rs            |   3 -
>  drivers/gpu/nova-core/firmware.rs       |  18 ++
>  drivers/gpu/nova-core/firmware/fwsec.rs | 359 ++++++++++++++++++++++++++++++++
>  drivers/gpu/nova-core/gpu.rs            |  20 +-
>  drivers/gpu/nova-core/vbios.rs          |   3 -
>  5 files changed, 395 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/nova-core/dma.rs b/drivers/gpu/nova-core/dma.rs
> index 9d90ae01d0044eaab4ddbc3eba216741d7a623ef..a12d0dff574aa38fb5eb8f4d759611af2f8ba3ec 100644
> --- a/drivers/gpu/nova-core/dma.rs
> +++ b/drivers/gpu/nova-core/dma.rs
> @@ -2,9 +2,6 @@
>  
>  //! Simple DMA object wrapper.
>  
> -// To be removed when all code is used.
> -#![expect(dead_code)]
> -
>  use core::ops::{Deref, DerefMut};
>  
>  use kernel::device;
> diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
> index 960982174d834c7c66a47ecfb3a15bf47116b2c5..3945fd18499555ddd6fb2e0ea69535b40fcc4b08 100644
> --- a/drivers/gpu/nova-core/firmware.rs
> +++ b/drivers/gpu/nova-core/firmware.rs
> @@ -8,9 +8,12 @@
>  use kernel::prelude::*;
>  use kernel::str::CString;
>  
> +use crate::dma::DmaObject;
>  use crate::gpu;
>  use crate::gpu::Chipset;
>  
> +pub(crate) mod fwsec;
> +
>  pub(crate) const FIRMWARE_VERSION: &str = "535.113.01";
>  
>  /// Structure encapsulating the firmware blobs required for the GPU to operate.
> @@ -86,6 +89,21 @@ pub(crate) fn size(&self) -> usize {
>      }
>  }
>  
> +/// Patch the `ucode_dma` firmware at offset `sig_base_img` with `signature`.
> +fn patch_signature(ucode_dma: &mut DmaObject, signature: &[u8], sig_base_img: usize) -> Result<()> {
> +    if sig_base_img + signature.len() > ucode_dma.size() {
> +        return Err(EINVAL);
> +    }
> +
> +    // SAFETY: we are the only user of this object, so there cannot be any race.
> +    let dst = unsafe { ucode_dma.start_ptr_mut().add(sig_base_img) };
> +
> +    // SAFETY: `signature` and `dst` are valid, properly aligned, and do not overlap.
> +    unsafe { core::ptr::copy_nonoverlapping(signature.as_ptr(), dst, signature.len()) };
> +
> +    Ok(())
> +}

Why is this not in firmware/fwsec.rs, like patch_command()?

Also, please wrap the ucode DmaObject in its own type, i.e.
`struct UcodeDma(DmaObject)` and make the patch_*() functions methods of this
type. They're only applicable for the ucode DmaObject.
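
Roughly like the following userspace sketch. Here `DmaObject` is a
hypothetical byte-buffer stand-in, not the driver's type, and the safe slice
copy replaces the raw-pointer copy of the quoted code, which is only possible
because the stand-in is ordinary memory.

```rust
/// Hypothetical stand-in for the driver's `DmaObject`: a plain byte buffer.
struct DmaObject {
    bytes: Vec<u8>,
}

/// Newtype marking a buffer as holding falcon ucode; the patch helpers live here.
struct UcodeDma(DmaObject);

impl UcodeDma {
    /// Patch the ucode at offset `sig_base_img` with `signature`.
    fn patch_signature(
        &mut self,
        signature: &[u8],
        sig_base_img: usize,
    ) -> Result<(), &'static str> {
        // `checked_add` avoids wrapping on a bogus offset.
        let end = sig_base_img
            .checked_add(signature.len())
            .ok_or("offset overflow")?;
        let dst = self
            .0
            .bytes
            .get_mut(sig_base_img..end)
            .ok_or("signature out of bounds")?;
        dst.copy_from_slice(signature);
        Ok(())
    }
}
```

The newtype makes it a compile-time error to call `patch_signature()` on an
arbitrary DMA buffer.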

> +
>  pub(crate) struct ModInfoBuilder<const N: usize>(firmware::ModInfoBuilder<N>);
>  
>  impl<const N: usize> ModInfoBuilder<N> {
> diff --git a/drivers/gpu/nova-core/firmware/fwsec.rs b/drivers/gpu/nova-core/firmware/fwsec.rs
> new file mode 100644
> index 0000000000000000000000000000000000000000..d7e3418b49ab4ad478e73c0d447a832ed68ecd0a
> --- /dev/null
> +++ b/drivers/gpu/nova-core/firmware/fwsec.rs
> @@ -0,0 +1,359 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! FWSEC is a High Secure firmware that is extracted from the BIOS and performs the first step of
> +//! the GSP startup by creating the WPR2 memory region and copying critical areas of the VBIOS into
> +//! it after authenticating them, ensuring they haven't been tampered with. It runs on the GSP
> +//! falcon.
> +//!
> +//! Before being run, it needs to be patched in two areas:
> +//!
> +//! - The command to be run, as this firmware can perform several tasks ;
> +//! - The ucode signature, so the GSP falcon can run FWSEC in HS mode.
> +
> +use core::alloc::Layout;
> +
> +use kernel::bindings;
> +use kernel::device::{self, Device};
> +use kernel::prelude::*;
> +use kernel::transmute::FromBytes;
> +
> +use crate::dma::DmaObject;
> +use crate::driver::Bar0;
> +use crate::falcon::gsp::Gsp;
> +use crate::falcon::{Falcon, FalconBromParams, FalconFirmware, FalconLoadTarget};
> +use crate::firmware::FalconUCodeDescV3;
> +use crate::vbios::Vbios;
> +
> +const NVFW_FALCON_APPIF_ID_DMEMMAPPER: u32 = 0x4;
> +
> +#[repr(C)]
> +#[derive(Debug)]
> +struct FalconAppifHdrV1 {
> +    version: u8,
> +    header_size: u8,
> +    entry_size: u8,
> +    entry_count: u8,
> +}
> +// SAFETY: any byte sequence is valid for this struct.
> +unsafe impl FromBytes for FalconAppifHdrV1 {}
> +
> +#[repr(C, packed)]
> +#[derive(Debug)]
> +struct FalconAppifV1 {
> +    id: u32,
> +    dmem_base: u32,
> +}
> +// SAFETY: any byte sequence is valid for this struct.
> +unsafe impl FromBytes for FalconAppifV1 {}
> +
> +#[derive(Debug)]
> +#[repr(C, packed)]
> +struct FalconAppifDmemmapperV3 {
> +    signature: u32,
> +    version: u16,
> +    size: u16,
> +    cmd_in_buffer_offset: u32,
> +    cmd_in_buffer_size: u32,
> +    cmd_out_buffer_offset: u32,
> +    cmd_out_buffer_size: u32,
> +    nvf_img_data_buffer_offset: u32,
> +    nvf_img_data_buffer_size: u32,
> +    printf_buffer_hdr: u32,
> +    ucode_build_time_stamp: u32,
> +    ucode_signature: u32,
> +    init_cmd: u32,
> +    ucode_feature: u32,
> +    ucode_cmd_mask0: u32,
> +    ucode_cmd_mask1: u32,
> +    multi_tgt_tbl: u32,
> +}
> +// SAFETY: any byte sequence is valid for this struct.
> +unsafe impl FromBytes for FalconAppifDmemmapperV3 {}
> +
> +#[derive(Debug)]
> +#[repr(C, packed)]
> +struct ReadVbios {
> +    ver: u32,
> +    hdr: u32,
> +    addr: u64,
> +    size: u32,
> +    flags: u32,
> +}
> +// SAFETY: any byte sequence is valid for this struct.
> +unsafe impl FromBytes for ReadVbios {}
> +
> +#[derive(Debug)]
> +#[repr(C, packed)]
> +struct FrtsRegion {
> +    ver: u32,
> +    hdr: u32,
> +    addr: u32,
> +    size: u32,
> +    ftype: u32,
> +}
> +// SAFETY: any byte sequence is valid for this struct.
> +unsafe impl FromBytes for FrtsRegion {}
> +
> +const NVFW_FRTS_CMD_REGION_TYPE_FB: u32 = 2;
> +
> +#[repr(C, packed)]
> +struct FrtsCmd {
> +    read_vbios: ReadVbios,
> +    frts_region: FrtsRegion,
> +}
> +// SAFETY: any byte sequence is valid for this struct.
> +unsafe impl FromBytes for FrtsCmd {}
> +
> +const NVFW_FALCON_APPIF_DMEMMAPPER_CMD_FRTS: u32 = 0x15;
> +const NVFW_FALCON_APPIF_DMEMMAPPER_CMD_SB: u32 = 0x19;
> +
> +/// Command for the [`FwsecFirmware`] to execute.
> +pub(crate) enum FwsecCommand {
> +    /// Asks [`FwsecFirmware`] to carve out the WPR2 area and place a verified copy of the VBIOS
> +    /// image into it.
> +    Frts { frts_addr: u64, frts_size: u64 },
> +    /// Asks [`FwsecFirmware`] to load pre-OS apps on the PMU.
> +    #[expect(dead_code)]
> +    Sb,
> +}
> +
> +/// Reinterpret the area starting from `offset` in `fw` as an instance of `T` (which must implement
> +/// [`FromBytes`]) and return a reference to it.
> +///
> +/// # Safety
> +///
> +/// Callers must ensure that the region of memory returned is not written for as long as the
> +/// returned reference is alive.
> +///
> +/// TODO: Remove this and `transmute_mut` once we have a way to transmute objects implementing
> +/// FromBytes, e.g.:
> +/// https://lore.kernel.org/lkml/20250330234039.29814-1-christiansantoslima21@gmail.com/
> +unsafe fn transmute<'a, 'b, T: Sized + FromBytes>(
> +    fw: &'a DmaObject,
> +    offset: usize,
> +) -> Result<&'b T> {
> +    if offset + core::mem::size_of::<T>() > fw.size() {
> +        return Err(EINVAL);
> +    }
> +    if (fw.start_ptr() as usize + offset) % core::mem::align_of::<T>() != 0 {
> +        return Err(EINVAL);
> +    }
> +
> +    // SAFETY: we have checked that the pointer is properly aligned that its pointed memory is
> +    // large enough the contains an instance of `T`, which implements `FromBytes`.
> +    Ok(unsafe { &*(fw.start_ptr().add(offset) as *const T) })
> +}
> +
> +/// Reinterpret the area starting from `offset` in `fw` as a mutable instance of `T` (which must
> +/// implement [`FromBytes`]) and return a reference to it.
> +///
> +/// # Safety
> +///
> +/// Callers must ensure that the region of memory returned is not read or written for as long as
> +/// the returned reference is alive.
> +unsafe fn transmute_mut<'a, 'b, T: Sized + FromBytes>(
> +    fw: &'a mut DmaObject,
> +    offset: usize,
> +) -> Result<&'b mut T> {
> +    if offset + core::mem::size_of::<T>() > fw.size() {
> +        return Err(EINVAL);
> +    }
> +    if (fw.start_ptr_mut() as usize + offset) % core::mem::align_of::<T>() != 0 {
> +        return Err(EINVAL);
> +    }
> +
> +    // SAFETY: we have checked that the pointer is properly aligned that its pointed memory is
> +    // large enough the contains an instance of `T`, which implements `FromBytes`.
> +    Ok(unsafe { &mut *(fw.start_ptr_mut().add(offset) as *mut T) })
> +}
> +
> +/// Patch the Fwsec firmware image in `fw` to run the command `cmd`.
> +fn patch_command(fw: &mut DmaObject, v3_desc: &FalconUCodeDescV3, cmd: FwsecCommand) -> Result<()> {
> +    let hdr_offset = (v3_desc.imem_load_size + v3_desc.interface_offset) as usize;
> +    // SAFETY: we have an exclusive reference to `fw`, and no caller should have shared `fw` with
> +    // the hardware yet.
> +    let hdr: &FalconAppifHdrV1 = unsafe { transmute(fw, hdr_offset) }?;
> +
> +    if hdr.version != 1 {
> +        return Err(EINVAL);
> +    }
> +
> +    // Find the DMEM mapper section in the firmware.
> +    for i in 0..hdr.entry_count as usize {
> +        let app: &FalconAppifV1 =
> +            // SAFETY: we have an exclusive reference to `fw`, and no caller should have shared
> +            // `fw` with the hardware yet.
> +            unsafe {
> +                transmute(
> +                    fw,
> +                    hdr_offset + hdr.header_size as usize + i * hdr.entry_size as usize
> +                )
> +            }?;
> +
> +        if app.id != NVFW_FALCON_APPIF_ID_DMEMMAPPER {
> +            continue;
> +        }
> +
> +        let dmem_mapper: &mut FalconAppifDmemmapperV3 =
> +            // SAFETY: we have an exclusive reference to `fw`, and no caller should have shared
> +            // `fw` with the hardware yet.
> +            unsafe { transmute_mut(fw, (v3_desc.imem_load_size + app.dmem_base) as usize) }?;
> +
> +        // SAFETY: we have an exclusive reference to `fw`, and no caller should have shared `fw`
> +        // with the hardware yet.
> +        let frts_cmd: &mut FrtsCmd = unsafe {
> +            transmute_mut(
> +                fw,
> +                (v3_desc.imem_load_size + dmem_mapper.cmd_in_buffer_offset) as usize,
> +            )
> +        }?;
> +
> +        frts_cmd.read_vbios = ReadVbios {
> +            ver: 1,
> +            hdr: core::mem::size_of::<ReadVbios>() as u32,
> +            addr: 0,
> +            size: 0,
> +            flags: 2,
> +        };
> +
> +        dmem_mapper.init_cmd = match cmd {
> +            FwsecCommand::Frts {
> +                frts_addr,
> +                frts_size,
> +            } => {
> +                frts_cmd.frts_region = FrtsRegion {
> +                    ver: 1,
> +                    hdr: core::mem::size_of::<FrtsRegion>() as u32,
> +                    addr: (frts_addr >> 12) as u32,
> +                    size: (frts_size >> 12) as u32,
> +                    ftype: NVFW_FRTS_CMD_REGION_TYPE_FB,
> +                };
> +
> +                NVFW_FALCON_APPIF_DMEMMAPPER_CMD_FRTS
> +            }
> +            FwsecCommand::Sb => NVFW_FALCON_APPIF_DMEMMAPPER_CMD_SB,
> +        };
> +
> +        // Return early as we found and patched the DMEMMAPPER region.
> +        return Ok(());
> +    }
> +
> +    Err(ENOTSUPP)
> +}
> +
> +/// Firmware extracted from the VBIOS and responsible for e.g. carving out the WPR2 region as the
> +/// first step of the GSP bootflow.
> +pub(crate) struct FwsecFirmware {
> +    desc: FalconUCodeDescV3,
> +    ucode: DmaObject,
> +}
> +
> +impl FalconFirmware for FwsecFirmware {
> +    type Target = Gsp;
> +
> +    fn dma_handle(&self) -> bindings::dma_addr_t {
> +        self.ucode.dma_handle()
> +    }
> +
> +    fn imem_load(&self) -> FalconLoadTarget {
> +        FalconLoadTarget {
> +            src_start: 0,
> +            dst_start: self.desc.imem_phys_base,
> +            len: self.desc.imem_load_size,
> +        }
> +    }
> +
> +    fn dmem_load(&self) -> FalconLoadTarget {
> +        FalconLoadTarget {
> +            src_start: self.desc.imem_load_size,
> +            dst_start: self.desc.dmem_phys_base,
> +            len: Layout::from_size_align(self.desc.dmem_load_size as usize, 256)
> +                // Cannot panic, as 256 is non-zero and a power of 2.
> +                .unwrap()
> +                .pad_to_align()
> +                .size() as u32,
> +        }
> +    }
> +
> +    fn brom_params(&self) -> FalconBromParams {
> +        FalconBromParams {
> +            pkc_data_offset: self.desc.pkc_data_offset,
> +            engine_id_mask: self.desc.engine_id_mask,
> +            ucode_id: self.desc.ucode_id,
> +        }
> +    }
> +
> +    fn boot_addr(&self) -> u32 {
> +        0
> +    }
> +}
> +
> +impl FwsecFirmware {
> +    /// Extract the Fwsec firmware from `bios` and patch it to run with the `cmd` command.
> +    pub(crate) fn new(
> +        falcon: &Falcon<Gsp>,
> +        dev: &Device<device::Bound>,
> +        bar: &Bar0,
> +        bios: &Vbios,
> +        cmd: FwsecCommand,
> +    ) -> Result<Self> {
> +        let v3_desc = bios.fwsec_header(dev)?;
> +        let ucode = bios.fwsec_ucode(dev)?;
> +
> +        let mut ucode_dma = DmaObject::from_data(dev, ucode)?;
> +        patch_command(&mut ucode_dma, v3_desc, cmd)?;
> +
> +        const SIG_SIZE: usize = 96 * 4;

96 * 4? :-)

> +        let signatures = bios.fwsec_sigs(dev)?;
> +        let sig_base_img = (v3_desc.imem_load_size + v3_desc.pkc_data_offset) as usize;
> +
> +        if v3_desc.signature_count != 0 {
> +            // Patch signature.
> +            let desc_sig_versions = v3_desc.signature_versions as u32;
> +            let reg_fuse_version = falcon.get_signature_reg_fuse_version(
> +                bar,
> +                v3_desc.engine_id_mask,
> +                v3_desc.ucode_id,
> +            )?;
> +            dev_dbg!(
> +                dev,
> +                "desc_sig_versions: {:#x}, reg_fuse_version: {}\n",
> +                desc_sig_versions,
> +                reg_fuse_version
> +            );
> +            let signature_idx = {
> +                let reg_fuse_version_bit = 1 << reg_fuse_version;
> +
> +                // Check if the fuse version is supported by the firmware.
> +                if desc_sig_versions & reg_fuse_version_bit == 0 {
> +                    dev_warn!(
> +                        dev,
> +                        "no matching signature: {:#x} {:#x}\n",
> +                        reg_fuse_version_bit,
> +                        v3_desc.signature_versions
> +                    );

Looks like this should be dev_err!().

> +                    return Err(EINVAL);
> +                }
> +
> +                // `desc_sig_versions` has one bit set per included signature. Thus, the index of
> +                // the signature to patch is the number of bits in `desc_sig_versions` set to `1`
> +                // before `reg_fuse_version_bit`.
> +
> +                // Mask of the bits of `desc_sig_versions` to preserve.
> +                let reg_fuse_version_mask = reg_fuse_version_bit.wrapping_sub(1);
> +
> +                (desc_sig_versions & reg_fuse_version_mask).count_ones()
> +            };
> +
> +            dev_dbg!(dev, "patching signature with index {}\n", signature_idx);
> +            let signature_start = signature_idx as usize * SIG_SIZE;
> +            let signature = &signatures[signature_start..signature_start + SIG_SIZE];
> +            super::patch_signature(&mut ucode_dma, signature, sig_base_img)?;
> +        }
> +
> +        Ok(FwsecFirmware {
> +            desc: v3_desc.clone(),
> +            ucode: ucode_dma,
> +        })
> +    }
> +}
> diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
> index a3d96639706e808305cce66416778d2bf6e7e683..ad71402dbfced2aaa84800a32dbac276b9706803 100644
> --- a/drivers/gpu/nova-core/gpu.rs
> +++ b/drivers/gpu/nova-core/gpu.rs
> @@ -6,6 +6,7 @@
>  use crate::dma::DmaObject;
>  use crate::driver::Bar0;
>  use crate::falcon::{gsp::Gsp, sec2::Sec2, Falcon};
> +use crate::firmware::fwsec::{FwsecCommand, FwsecFirmware};
>  use crate::firmware::Firmware;
>  use crate::gsp::fb::FbLayout;
>  use crate::regs;
> @@ -197,7 +198,11 @@ pub(crate) fn new(
>      ) -> Result<impl PinInit<Self>> {
>          let bar = devres_bar.access(pdev.as_ref())?;
>          let spec = Spec::new(bar)?;
> -        let fw = Firmware::new(pdev.as_ref(), spec.chipset, "535.113.01")?;
> +        let fw = Firmware::new(
> +            pdev.as_ref(),
> +            spec.chipset,
> +            crate::firmware::FIRMWARE_VERSION,
> +        )?;
>  
>          dev_info!(
>              pdev.as_ref(),
> @@ -243,7 +248,18 @@ pub(crate) fn new(
>          let fb_layout = FbLayout::new(spec.chipset, bar)?;
>          dev_dbg!(pdev.as_ref(), "{:#x?}\n", fb_layout);
>  
> -        let _bios = Vbios::new(pdev, bar)?;
> +        let bios = Vbios::new(pdev, bar)?;
> +
> +        let _fwsec_frts = FwsecFirmware::new(
> +            &gsp_falcon,
> +            pdev.as_ref(),
> +            bar,
> +            &bios,
> +            FwsecCommand::Frts {
> +                frts_addr: fb_layout.frts.start,
> +                frts_size: fb_layout.frts.end - fb_layout.frts.start,
> +            },
> +        )?;
>  
>          Ok(pin_init!(Self {
>              spec,
> diff --git a/drivers/gpu/nova-core/vbios.rs b/drivers/gpu/nova-core/vbios.rs
> index cd55d8dbf8e12d532f776d7544c7e5f2a865d6f8..71846dbbd512b6996d9f3a3c76aecde9d62460c0 100644
> --- a/drivers/gpu/nova-core/vbios.rs
> +++ b/drivers/gpu/nova-core/vbios.rs
> @@ -2,9 +2,6 @@
>  
>  //! VBIOS extraction and parsing.
>  
> -// To be removed when all code is used.
> -#![expect(dead_code)]
> -
>  use crate::driver::Bar0;
>  use crate::firmware::FalconUCodeDescV3;
>  use core::convert::TryFrom;
> 
> -- 
> 2.49.0
> 
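
For reference, the signature selection in the hunk above boils down to a
popcount over the bits below the fuse version bit. A minimal standalone
sketch, assuming plain integers in place of the descriptor and register
types (`signature_index` is a hypothetical name, not a function from the
patch):

```rust
/// Returns the index of the signature matching `fuse_version`, or `None` if
/// the firmware carries no signature for that version.
///
/// `sig_versions` has one bit set per signature included in the firmware.
/// Hypothetical helper, for illustration only.
pub fn signature_index(sig_versions: u32, fuse_version: u32) -> Option<u32> {
    // `checked_shl` rejects shift amounts >= 32 instead of overflowing,
    // which a bare `1 << fuse_version` would not.
    let fuse_bit = 1u32.checked_shl(fuse_version)?;

    // The fuse version must be one of the versions the firmware supports.
    if sig_versions & fuse_bit == 0 {
        return None;
    }

    // Count the signatures stored before the one we want.
    Some((sig_versions & (fuse_bit - 1)).count_ones())
}
```

With `sig_versions == 0b1011`, fuse version 3 selects the third signature
(index 2), while fuse version 2 has no matching signature.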


* Re: [PATCH v3 19/19] gpu: nova-core: load and run FWSEC-FRTS
  2025-05-07 13:52 ` [PATCH v3 19/19] gpu: nova-core: load and " Alexandre Courbot
@ 2025-05-14 16:42   ` Danilo Krummrich
  0 siblings, 0 replies; 51+ messages in thread
From: Danilo Krummrich @ 2025-05-14 16:42 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, David Airlie, Simona Vetter, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, Jonathan Corbet, John Hubbard,
	Ben Skeggs, Joel Fernandes, Timur Tabi, Alistair Popple,
	linux-kernel, rust-for-linux, nouveau, dri-devel

On Wed, May 07, 2025 at 10:52:46PM +0900, Alexandre Courbot wrote:
> +
> +        dev_info!(pdev.as_ref(), "WPR2: {:#x}-{:#x}\n", wpr2_lo, wpr2_hi);
> +        dev_info!(pdev.as_ref(), "GPU instance built\n");

Please use dev_dbg!().


* Re: [PATCH v3 09/19] gpu: nova-core: wait for GFW_BOOT completion
  2025-05-13 14:07   ` Danilo Krummrich
@ 2025-05-16 12:16     ` Alexandre Courbot
  0 siblings, 0 replies; 51+ messages in thread
From: Alexandre Courbot @ 2025-05-16 12:16 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, David Airlie, Simona Vetter, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, Jonathan Corbet, John Hubbard,
	Ben Skeggs, Joel Fernandes, Timur Tabi, Alistair Popple,
	linux-kernel, rust-for-linux, nouveau, dri-devel

On Tue May 13, 2025 at 11:07 PM JST, Danilo Krummrich wrote:
> On Wed, May 07, 2025 at 10:52:36PM +0900, Alexandre Courbot wrote:
>> Upon reset, the GPU executes the GFW_BOOT firmware in order to
>> initialize its base parameters such as clocks. The driver must ensure
>> that this step is completed before using the hardware.
>> 
>> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
>> ---
>>  drivers/gpu/nova-core/devinit.rs   | 38 ++++++++++++++++++++++++++++++++++++++
>>  drivers/gpu/nova-core/driver.rs    |  2 +-
>>  drivers/gpu/nova-core/gpu.rs       |  5 +++++
>>  drivers/gpu/nova-core/nova_core.rs |  1 +
>>  drivers/gpu/nova-core/regs.rs      | 11 +++++++++++
>>  5 files changed, 56 insertions(+), 1 deletion(-)
>> 
>> diff --git a/drivers/gpu/nova-core/devinit.rs b/drivers/gpu/nova-core/devinit.rs
>> new file mode 100644
>> index 0000000000000000000000000000000000000000..5be2e0344fb651e5e53c9223aefeb5b2d95b8de1
>> --- /dev/null
>> +++ b/drivers/gpu/nova-core/devinit.rs
>> @@ -0,0 +1,38 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +
>> +//! Methods for device initialization.
>> +
>> +use kernel::bindings;
>> +use kernel::prelude::*;
>> +
>> +use crate::driver::Bar0;
>> +use crate::regs;
>> +
>> +/// Wait for devinit FW completion.
>> +///
>> +/// Upon reset, the GPU runs some firmware code to setup its core parameters. Most of the GPU is
>> +/// considered unusable until this step is completed, so it must be waited on very early during
>> +/// driver initialization.
>> +pub(crate) fn wait_gfw_boot_completion(bar: &Bar0) -> Result<()> {
>> +    let mut timeout = 2000;
>> +
>> +    loop {
>> +        let gfw_booted = regs::NV_PGC6_AON_SECURE_SCRATCH_GROUP_05_PRIV_LEVEL_MASK::read(bar)
>> +            .read_protection_level0()
>> +            && (regs::NV_PGC6_AON_SECURE_SCRATCH_GROUP_05::read(bar).value() & 0xff) == 0xff;
>
> What does it mean when the first 8 bits are set? Why don't we have a mask and
> value to compare for that?

Since this is a scratch register, the particular behavior of the bits
was documented in a separate OpenRM file - I finally found the
explanation after looking it up.

This appears to be a GFW boot progress counter, with 0xff meaning that
boot is completed. I have added a (documented) specialization of this
register as well as a helper function to make this clear.
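
Roughly, the specialization could look like the sketch below. This is not
the actual register macro output: `GfwBootScratch`, `from_raw` and the
constant name are assumptions made for illustration, and only the 0xff
"completed" value and the low 8-bit field come from the discussion above.

```rust
/// Value of the progress field once GFW_BOOT has completed.
pub const GFW_BOOT_PROGRESS_COMPLETED: u8 = 0xff;

/// Hypothetical typed view of the scratch register holding the GFW boot
/// progress counter in its low 8 bits.
#[derive(Clone, Copy, Debug)]
pub struct GfwBootScratch(u32);

impl GfwBootScratch {
    /// Wraps a raw register value (stand-in for a real MMIO read).
    pub fn from_raw(raw: u32) -> Self {
        Self(raw)
    }

    /// Returns the boot progress counter.
    pub fn progress(self) -> u8 {
        (self.0 & 0xff) as u8
    }

    /// Returns `true` once the counter reports completion.
    pub fn completed(self) -> bool {
        self.progress() == GFW_BOOT_PROGRESS_COMPLETED
    }
}
```

The caller then tests `completed()` instead of open-coding the mask and
compare.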

>
>> +
>> +        if gfw_booted {
>> +            return Ok(());
>> +        }
>> +
>> +        if timeout == 0 {
>> +            return Err(ETIMEDOUT);
>> +        }
>> +        timeout -= 1;
>
> NIT: This means the timeout is ~4s; can we start with timeout == 4000 and decrement
> with the number of ms passed to msleep()?
>
> Anyway, this should go away with read_poll_timeout().

Yes, the timeout code was a bit weird. Actually as you pointed out, we
can also use the `wait_on` helper introduced later in this series.
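
To illustrate why a duration-based helper avoids the mismatch between the
iteration count and the real timeout (2000 iterations of msleep(2) is about
4 s, as noted above), here is a userspace sketch with the same shape as the
`wait_on` calls quoted elsewhere in this series. It uses `std` in place of
the kernel time APIs, and the `TimedOut` type and 1 ms poll interval are
assumptions:

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Error returned when the condition did not become true in time.
#[derive(Debug, PartialEq)]
pub struct TimedOut;

/// Polls `cond` until it returns `Some`, or until `timeout` has elapsed.
///
/// The deadline is expressed as a duration, so it does not depend on how
/// long each poll iteration sleeps.
pub fn wait_on<R>(
    timeout: Duration,
    mut cond: impl FnMut() -> Option<R>,
) -> Result<R, TimedOut> {
    let start = Instant::now();

    loop {
        if let Some(value) = cond() {
            return Ok(value);
        }

        if start.elapsed() >= timeout {
            return Err(TimedOut);
        }

        // Poll interval chosen arbitrarily for this sketch.
        sleep(Duration::from_millis(1));
    }
}
```

The GFW_BOOT wait then becomes a single call with a 4 s timeout and a
closure that returns `Some(())` once the boot is reported as completed.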

>
>> +
>> +        // TODO: use `read_poll_timeout` once it is available.
>> +        // (https://lore.kernel.org/lkml/20250220070611.214262-8-fujita.tomonori@gmail.com/)
>> +        // SAFETY: msleep should be safe to call with any parameter.
>> +        unsafe { bindings::msleep(2) };
>> +    }
>> +}
>> diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs
>> index a08fb6599267a960f0e07b6efd0e3b6cdc296aa4..752ba4b0fcfe8d835d366570bb2f807840a196da 100644
>> --- a/drivers/gpu/nova-core/driver.rs
>> +++ b/drivers/gpu/nova-core/driver.rs
>> @@ -10,7 +10,7 @@ pub(crate) struct NovaCore {
>>      pub(crate) gpu: Gpu,
>>  }
>>  
>> -const BAR0_SIZE: usize = 8;
>> +const BAR0_SIZE: usize = 0x1000000;
>
> This means that we'll fail probing the card if BAR0 is not at least 16MiB.
> AFAIK, that should be fine. However, can you make this a separate patch please?

Sure!


* Re: [PATCH v3 13/19] gpu: nova-core: add falcon register definitions and base code
  2025-05-13 16:19   ` Danilo Krummrich
@ 2025-05-16 12:19     ` Alexandre Courbot
  2025-05-16 12:26       ` Danilo Krummrich
  0 siblings, 1 reply; 51+ messages in thread
From: Alexandre Courbot @ 2025-05-16 12:19 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, David Airlie, Simona Vetter, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, Jonathan Corbet, John Hubbard,
	Ben Skeggs, Joel Fernandes, Timur Tabi, Alistair Popple,
	linux-kernel, rust-for-linux, nouveau, dri-devel

On Wed May 14, 2025 at 1:19 AM JST, Danilo Krummrich wrote:
<snip>
>> +        util::wait_on(Duration::from_millis(20), || {
>> +            let r = regs::NV_PFALCON_FALCON_HWCFG2::read(bar, E::BASE);
>> +            if r.mem_scrubbing() {
>> +                Some(())
>> +            } else {
>> +                None
>> +            }
>> +        })
>> +    }
>> +
>> +    /// Reset the falcon engine.
>> +    fn reset_eng(&self, bar: &Bar0) -> Result<()> {
>> +        let _ = regs::NV_PFALCON_FALCON_HWCFG2::read(bar, E::BASE);
>> +
>> +        // According to OpenRM's `kflcnPreResetWait_GA102` documentation, HW sometimes does not set
>> +        // RESET_READY so a non-failing timeout is used.
>
> Should we still warn about it?

OpenRM does not (as this is apparently a workaround to a HW bug?) so I
don't think we need to.

>
>> +        let _ = util::wait_on(Duration::from_micros(150), || {
>
> Do we know for sure that if RESET_READY is not set after 150us, it won't ever be
> set? If the answer to that is yes, and we also do not want to warn about
> RESET_READY not being set, why even bother trying to read it in the first place?

My guess is because this would be the expected behavior if the bug wasn't
there. My GPU (Ampere) does wait until the timeout, but we can expect
newer GPUs to not have this problem and return earlier.

>
>> +            let r = regs::NV_PFALCON_FALCON_HWCFG2::read(bar, E::BASE);
>> +            if r.reset_ready() {
>> +                Some(())
>> +            } else {
>> +                None
>> +            }
>> +        });
>> +
>> +        regs::NV_PFALCON_FALCON_ENGINE::alter(bar, E::BASE, |v| v.set_reset(true));
>> +
>> +        let _: Result<()> = util::wait_on(Duration::from_micros(10), || None);
>
> Can we please get an abstraction for udelay() for this?

Should it be local to nova-core, or be generally available? I refrained
from doing this because there is work going on regarding timer and I
thought it would cover things like udelay() as well. I'll add a TODO
item for now but please let me know if you have something different in
mind.

<snip>
>> +fn get_signature_reg_fuse_version_ga102(
>> +    dev: &device::Device,
>> +    bar: &Bar0,
>> +    engine_id_mask: u16,
>> +    ucode_id: u8,
>> +) -> Result<u32> {
>> +    // The ucode fuse versions are contained in the FUSE_OPT_FPF_<ENGINE>_UCODE<X>_VERSION
>> +    // registers, which are an array. Our register definition macros do not allow us to manage them
>> +    // properly, so we need to hardcode their addresses for now.
>> +
>> +    // Each engine has 16 ucode version registers numbered from 1 to 16.
>> +    if ucode_id == 0 || ucode_id > 16 {
>> +        dev_warn!(dev, "invalid ucode id {:#x}", ucode_id);
>
> Given that this is an error condition, this should be dev_err!() I suppose.
>
>> +        return Err(EINVAL);
>> +    }
>> +    let reg_fuse = if engine_id_mask & 0x0001 != 0 {
>> +        // NV_FUSE_OPT_FPF_SEC2_UCODE1_VERSION
>> +        0x824140
>> +    } else if engine_id_mask & 0x0004 != 0 {
>> +        // NV_FUSE_OPT_FPF_NVDEC_UCODE1_VERSION
>> +        0x824100
>> +    } else if engine_id_mask & 0x0400 != 0 {
>> +        // NV_FUSE_OPT_FPF_GSP_UCODE1_VERSION
>> +        0x8241c0
>> +    } else {
>> +        dev_warn!(dev, "unexpected engine_id_mask {:#x}", engine_id_mask);
>
> s/dev_warn/dev_err/
>
>> +        return Err(EINVAL);
>> +    } + ((ucode_id - 1) as usize * core::mem::size_of::<u32>());
>
> I feel like this calculation deserves a comment.

Agreed, added one. Also reorganized the code to make the calculation a
bit more obvious.
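
Something along these lines, as a standalone sketch: the three base
addresses and the 1..=16 range are taken from the quoted code, while the
constant and function names are placeholders and the error handling is
reduced to `Option`:

```rust
/// Addresses of the first (UCODE1) fuse version register of each engine,
/// as hardcoded in the quoted patch.
pub const SEC2_UCODE1_VERSION: usize = 0x824140;
pub const NVDEC_UCODE1_VERSION: usize = 0x824100;
pub const GSP_UCODE1_VERSION: usize = 0x8241c0;

/// Returns the address of the fuse version register for `ucode_id` on the
/// engine selected by `engine_id_mask`, or `None` on invalid input.
pub fn fuse_version_reg(engine_id_mask: u16, ucode_id: u8) -> Option<usize> {
    // Each engine has 16 ucode version registers numbered from 1 to 16.
    if ucode_id == 0 || ucode_id > 16 {
        return None;
    }

    let base = if engine_id_mask & 0x0001 != 0 {
        SEC2_UCODE1_VERSION
    } else if engine_id_mask & 0x0004 != 0 {
        NVDEC_UCODE1_VERSION
    } else if engine_id_mask & 0x0400 != 0 {
        GSP_UCODE1_VERSION
    } else {
        return None;
    };

    // The registers form an array of 32-bit values and `ucode_id` is
    // 1-based, so UCODE<n> sits (n - 1) registers after UCODE1.
    Some(base + (usize::from(ucode_id) - 1) * core::mem::size_of::<u32>())
}
```

For instance, GSP ucode 2 resolves to 0x8241c4, one register past
NV_FUSE_OPT_FPF_GSP_UCODE1_VERSION.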

>
>> +
>> +    let reg_fuse_version = bar.read32(reg_fuse);
>
> I feel like the calculation of reg_fuse should be abstracted with a dedicated
> type in regs.rs. that takes the magic number derived from the engine_id_mask
> (which I assume is chip specific) and the ucode_id.

We would need proper support for register arrays to manage the ucode_id
offset, so I'm afraid this one will be hard to get rid of. What kind of
type did you have in mind?

One thing we can do though, is expose the offset of each register as a
register type constant, and use that instead of the hardcoded values
currently in this code - that part at least will be cleaner.

>
>> +
>> +    // Equivalent of Find Last Set bit.
>> +    Ok(u32::BITS - reg_fuse_version.leading_zeros())
>
> Maybe we should create a generic helper for that?

Good idea.
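
A possible shape for such a helper, shown for `u32` only; the name
`fls_u32` is a placeholder and where it would live is left open:

```rust
/// Find Last Set: returns the 1-based position of the most significant set
/// bit of `value`, or 0 if no bit is set.
pub const fn fls_u32(value: u32) -> u32 {
    u32::BITS - value.leading_zeros()
}
```

This matches the expression used in the patch: `fls_u32(0)` is 0 and
`fls_u32(0b1000)` is 4.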

>
>> +}
>> +
>> +fn program_brom_ga102<E: FalconEngine>(bar: &Bar0, params: &FalconBromParams) -> Result<()> {
>> +    regs::NV_PFALCON2_FALCON_BROM_PARAADDR::default()
>> +        .set_value(params.pkc_data_offset)
>> +        .write(bar, E::BASE);
>> +    regs::NV_PFALCON2_FALCON_BROM_ENGIDMASK::default()
>> +        .set_value(params.engine_id_mask as u32)
>> +        .write(bar, E::BASE);
>> +    regs::NV_PFALCON2_FALCON_BROM_CURR_UCODE_ID::default()
>> +        .set_ucode_id(params.ucode_id)
>> +        .write(bar, E::BASE);
>> +    regs::NV_PFALCON2_FALCON_MOD_SEL::default()
>> +        .set_algo(FalconModSelAlgo::Rsa3k)
>> +        .write(bar, E::BASE);
>> +
>> +    Ok(())
>> +}
>> +
>> +pub(super) struct Ga102<E: FalconEngine>(PhantomData<E>);
>> +
>> +impl<E: FalconEngine> Ga102<E> {
>> +    pub(super) fn new() -> Self {
>> +        Self(PhantomData)
>> +    }
>> +}
>> +
>> +impl<E: FalconEngine> FalconHal<E> for Ga102<E> {
>> +    fn select_core(&self, _falcon: &Falcon<E>, bar: &Bar0) -> Result<()> {
>> +        select_core_ga102::<E>(bar)
>> +    }
>> +
>> +    fn get_signature_reg_fuse_version(
>> +        &self,
>> +        falcon: &Falcon<E>,
>> +        bar: &Bar0,
>> +        engine_id_mask: u16,
>> +        ucode_id: u8,
>> +    ) -> Result<u32> {
>> +        get_signature_reg_fuse_version_ga102(&falcon.dev, bar, engine_id_mask, ucode_id)
>> +    }
>> +
>> +    fn program_brom(
>> +        &self,
>> +        _falcon: &Falcon<E>,
>> +        bar: &Bar0,
>> +        params: &FalconBromParams,
>> +    ) -> Result<()> {
>> +        program_brom_ga102::<E>(bar, params)
>> +    }
>> +}
>> diff --git a/drivers/gpu/nova-core/falcon/sec2.rs b/drivers/gpu/nova-core/falcon/sec2.rs
>> new file mode 100644
>> index 0000000000000000000000000000000000000000..c1efdaa7c4e1b8c04c4e041aae3b61a8b65f656b
>> --- /dev/null
>> +++ b/drivers/gpu/nova-core/falcon/sec2.rs
>> @@ -0,0 +1,8 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +
>> +use crate::falcon::FalconEngine;
>> +
>> +pub(crate) struct Sec2;
>> +impl FalconEngine for Sec2 {
>> +    const BASE: usize = 0x00840000;
>> +}
>> diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
>> index c338da69ecbc2200f1ef3061a4d62971b021e3eb..ece13594fba687f3f714e255b5436e72d80dece3 100644
>> --- a/drivers/gpu/nova-core/gpu.rs
>> +++ b/drivers/gpu/nova-core/gpu.rs
>> @@ -5,6 +5,7 @@
>>  use crate::devinit;
>>  use crate::dma::DmaObject;
>>  use crate::driver::Bar0;
>> +use crate::falcon::{gsp::Gsp, sec2::Sec2, Falcon};
>>  use crate::firmware::Firmware;
>>  use crate::regs;
>>  use crate::util;
>> @@ -227,6 +228,16 @@ pub(crate) fn new(
>>              page
>>          };
>>  
>> +        let gsp_falcon = Falcon::<Gsp>::new(
>> +            pdev.as_ref(),
>> +            spec.chipset,
>> +            bar,
>> +            spec.chipset > Chipset::GA100,
>> +        )?;
>> +        gsp_falcon.clear_swgen0_intr(bar);
>> +
>> +        let _sec2_falcon = Falcon::<Sec2>::new(pdev.as_ref(), spec.chipset, bar, true)?;
>
> Just `_` instead? Also, please add a comment why it is important to create this
> instance even though it's never used.

It is not really important now, more a way to exercise the code until
we need to run Booter. The variable will be renamed to `sec2_falcon`
eventually, so I'd like to keep that name in the placeholder.



* Re: [PATCH v3 13/19] gpu: nova-core: add falcon register definitions and base code
  2025-05-16 12:19     ` Alexandre Courbot
@ 2025-05-16 12:26       ` Danilo Krummrich
  0 siblings, 0 replies; 51+ messages in thread
From: Danilo Krummrich @ 2025-05-16 12:26 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, David Airlie, Simona Vetter, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, Jonathan Corbet, John Hubbard,
	Ben Skeggs, Joel Fernandes, Timur Tabi, Alistair Popple,
	linux-kernel, rust-for-linux, nouveau, dri-devel

On Fri, May 16, 2025 at 09:19:45PM +0900, Alexandre Courbot wrote:
> On Wed May 14, 2025 at 1:19 AM JST, Danilo Krummrich wrote:
> <snip>
> >> +        util::wait_on(Duration::from_millis(20), || {
> >> +            let r = regs::NV_PFALCON_FALCON_HWCFG2::read(bar, E::BASE);
> >> +            if r.mem_scrubbing() {
> >> +                Some(())
> >> +            } else {
> >> +                None
> >> +            }
> >> +        })
> >> +    }
> >> +
> >> +    /// Reset the falcon engine.
> >> +    fn reset_eng(&self, bar: &Bar0) -> Result<()> {
> >> +        let _ = regs::NV_PFALCON_FALCON_HWCFG2::read(bar, E::BASE);
> >> +
> >> +        // According to OpenRM's `kflcnPreResetWait_GA102` documentation, HW sometimes does not set
> >> +        // RESET_READY so a non-failing timeout is used.
> >
> > Should we still warn about it?
> 
> OpenRM does not (as this is apparently a workaround to a HW bug?) so I
> don't think we need to.
> 
> >
> >> +        let _ = util::wait_on(Duration::from_micros(150), || {
> >
> > Do we know for sure that if RESET_READY is not set after 150us, it won't ever be
> > set? If the answer to that is yes, and we also do not want to warn about
> > RESET_READY not being set, why even bother trying to read it in the first place?
> 
> My guess is because this would be the expected behavior if the bug wasn't
> there. My GPU (Ampere) does wait until the timeout, but we can expect
> newer GPUs to not have this problem and return earlier.

Ok, let's keep it then.

> >
> >> +            let r = regs::NV_PFALCON_FALCON_HWCFG2::read(bar, E::BASE);
> >> +            if r.reset_ready() {
> >> +                Some(())
> >> +            } else {
> >> +                None
> >> +            }
> >> +        });
> >> +
> >> +        regs::NV_PFALCON_FALCON_ENGINE::alter(bar, E::BASE, |v| v.set_reset(true));
> >> +
> >> +        let _: Result<()> = util::wait_on(Duration::from_micros(10), || None);
> >
> > Can we please get an abstraction for udelay() for this?
> 
> Should it be local to nova-core, or be generally available? I refrained
> from doing this because there is work going on regarding timer and I
> thought it would cover things like udelay() as well. I'll add a TODO
> item for now but please let me know if you have something different in
> mind.

Not local to nova-core, but in the generic abstraction. I don't think the
generic abstraction posted on the mailing list contains udelay(). Should be
trivial to add it with a subsequent patch though.

A TODO should be fine for now.
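
To make the intent of such a call site explicit, a dedicated delay function
reads better than a `wait_on` whose condition can never succeed. The sketch
below is a userspace stand-in built on `std`, not the kernel abstraction;
`busy_delay` is a hypothetical name:

```rust
use std::time::{Duration, Instant};

/// Spins until at least `duration` has elapsed.
///
/// Userspace stand-in for a udelay()-style busy wait; a kernel abstraction
/// would wrap the C helper instead of polling a clock.
pub fn busy_delay(duration: Duration) {
    let start = Instant::now();

    while start.elapsed() < duration {
        // Hint to the CPU that we are in a spin loop.
        std::hint::spin_loop();
    }
}
```

The reset path would then call `busy_delay(Duration::from_micros(10))` in
place of the `wait_on(..., || None)` construct.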

> >> +    let reg_fuse_version = bar.read32(reg_fuse);
> >
> > I feel like the calculation of reg_fuse should be abstracted with a dedicated
> > type in regs.rs. that takes the magic number derived from the engine_id_mask
> > (which I assume is chip specific) and the ucode_id.
> 
> We would need proper support for register arrays to manage the ucode_id
> offset, so I'm afraid this one will be hard to get rid of. What kind of
> type did you have in mind?
> 
> One thing we can do though, is expose the offset of each register as a
> register type constant, and use that instead of the hardcoded values
> currently in this code - that part at least will be cleaner.

Let's do that then for now.

> >> +        let _sec2_falcon = Falcon::<Sec2>::new(pdev.as_ref(), spec.chipset, bar, true)?;
> >
> > Just `_` instead? Also, please add a comment why it is important to create this
> > instance even though it's never used.
> 
> It is not really important now, more a way to exercise the code until
> we need to run Booter. The variable will be renamed to `sec2_falcon`
> eventually, so I'd like to keep that name in the placeholder.

Ok, seems reasonable.


* Re: [PATCH v3 16/19] nova-core: Add support for VBIOS ucode extraction for boot
  2025-05-07 13:52 ` [PATCH v3 16/19] nova-core: Add support for VBIOS ucode extraction for boot Alexandre Courbot
  2025-05-13 17:19   ` Danilo Krummrich
  2025-05-14 16:23   ` Danilo Krummrich
@ 2025-05-16 20:38   ` Timur Tabi
  2025-05-20  6:35     ` Joel Fernandes
  2 siblings, 1 reply; 51+ messages in thread
From: Timur Tabi @ 2025-05-16 20:38 UTC (permalink / raw)
  To: dakr@kernel.org, a.hindborg@kernel.org, ojeda@kernel.org,
	boqun.feng@gmail.com, simona@ffwll.ch, tmgross@umich.edu,
	alex.gaynor@gmail.com, tzimmermann@suse.de, corbet@lwn.net,
	mripard@kernel.org, maarten.lankhorst@linux.intel.com,
	benno.lossin@proton.me, bjorn3_gh@protonmail.com,
	airlied@gmail.com, aliceryhl@google.com, Alexandre Courbot,
	gary@garyguo.net
  Cc: Shirish Baskaran, Alistair Popple, John Hubbard,
	rust-for-linux@vger.kernel.org, dri-devel@lists.freedesktop.org,
	nouveau@lists.freedesktop.org, linux-kernel@vger.kernel.org,
	Joel Fernandes, Ben Skeggs

On Wed, 2025-05-07 at 22:52 +0900, Alexandre Courbot wrote:
> +impl FwSecBiosImage {
> +    fn setup_falcon_data(
> +        &mut self,
> +        pdev: &pci::Device,
> +        pci_at_image: &PciAtBiosImage,
> +        first_fwsec_image: &FwSecBiosImage,
> +    ) -> Result<()> {
> +        let mut offset = pci_at_image.falcon_data_ptr(pdev)? as usize;
> +
> +        // The falcon data pointer assumes that the PciAt and FWSEC images
> +        // are contiguous in memory. However, testing shows the EFI image sits in
> +        // between them. So calculate the offset from the end of the PciAt image
> +        // rather than the start of it. Compensate.
> +        offset -= pci_at_image.base.data.len();
> +
> +        // The offset is now from the start of the first Fwsec image, however
> +        // the offset points to a location in the second Fwsec image. Since
> +        // the fwsec images are contiguous, subtract the length of the first Fwsec
> +        // image from the offset to get the offset to the start of the second
> +        // Fwsec image.
> +        offset -= first_fwsec_image.base.data.len();
> +
> +        self.falcon_data_offset = Some(offset);
> +
> +        // The PmuLookupTable starts at the offset of the falcon data pointer
> +        self.pmu_lookup_table = Some(PmuLookupTable::new(&self.base.data[offset..])?);
> +
> +        match self
> +            .pmu_lookup_table
> +            .as_ref()
> +            .ok_or(EINVAL)?
> +            .find_entry_by_type(FALCON_UCODE_ENTRY_APPID_FWSEC_PROD)
> +        {
> +            Ok(entry) => {
> +                let mut ucode_offset = entry.data as usize;
> +                ucode_offset -= pci_at_image.base.data.len();
> +                ucode_offset -= first_fwsec_image.base.data.len();
> +                self.falcon_ucode_offset = Some(ucode_offset);
> +                if cfg!(debug_assertions) {
> +                    // Print the v3_desc header for debugging
> +                    let v3_desc = self.fwsec_header(pdev.as_ref())?;
> +                    pr_info!("PmuLookupTableEntry v3_desc: {:#?}\n", v3_desc);
> +                }
> +            }
> +            Err(e) => {
> +                dev_err!(
> +                    pdev.as_ref(),
> +                    "PmuLookupTableEntry not found, error: {:?}\n",
> +                    e
> +                );

Shouldn't you return an error here?

If not, then this should probably be dev_warn!().



* Re: [PATCH v3 17/19] gpu: nova-core: compute layout of the FRTS region
  2025-05-13 16:41   ` Danilo Krummrich
@ 2025-05-17 13:42     ` Alexandre Courbot
  0 siblings, 0 replies; 51+ messages in thread
From: Alexandre Courbot @ 2025-05-17 13:42 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, David Airlie, Simona Vetter, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, Jonathan Corbet, John Hubbard,
	Ben Skeggs, Joel Fernandes, Timur Tabi, Alistair Popple,
	linux-kernel, rust-for-linux, nouveau, dri-devel

On Wed May 14, 2025 at 1:41 AM JST, Danilo Krummrich wrote:
<snip>
>> diff --git a/drivers/gpu/nova-core/gsp/fb.rs b/drivers/gpu/nova-core/gsp/fb.rs
>> new file mode 100644
>> index 0000000000000000000000000000000000000000..f28ded59469d52daf39e5d19c09efd7bf08fee92
>> --- /dev/null
>> +++ b/drivers/gpu/nova-core/gsp/fb.rs
>> @@ -0,0 +1,108 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +
>> +use core::ops::Range;
>> +
>> +use kernel::prelude::*;
>> +
>> +use crate::driver::Bar0;
>> +use crate::gpu::Chipset;
>> +use crate::regs;
>> +
>> +fn align_down(value: u64, align: u64) -> u64 {
>> +    value & !(align - 1)
>> +}
>
> Can this go in the previous patch, i.e. "rust: num: Add an upward alignment
> helper for usize"?

Yes, let me try to consolidate things around the `num` module. Not sure
why it didn't occur to me to add that one there.
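
As a sketch of what the consolidated helper could look like, the version
below keeps the mask trick from the patch but rejects alignments that are
not a power of two, for which the mask would give a wrong result. The
`Option` return type is an assumption, not the API of the `num` patch:

```rust
/// Rounds `value` down to the nearest multiple of `align`.
///
/// Returns `None` if `align` is not a power of two, since the mask below is
/// only valid in that case.
pub fn align_down(value: u64, align: u64) -> Option<u64> {
    if !align.is_power_of_two() {
        return None;
    }

    // For a power of two, `align - 1` has exactly the low bits set that
    // must be cleared.
    Some(value & !(align - 1))
}
```

For example, `align_down(0xffff0000, 0x20000)` yields `Some(0xfffe0000)`.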

>
>> +
>> +/// Layout of the GPU framebuffer memory.
>> +///
>> +/// Contains ranges of GPU memory reserved for a given purpose during the GSP bootup process.
>> +#[derive(Debug)]
>> +#[expect(dead_code)]
>> +pub(crate) struct FbLayout {
>> +    pub fb: Range<u64>,
>> +
>> +    pub vga_workspace: Range<u64>,
>> +    pub bios: Range<u64>,
>> +
>> +    pub frts: Range<u64>,
>
> Please remove the empty lines.
>
>> +}
>> +
>> +impl FbLayout {
>> +    pub(crate) fn new(chipset: Chipset, bar: &Bar0) -> Result<Self> {
>> +        let fb = {
>> +            let fb_size = vidmem_size(bar, chipset);
>> +
>> +            0..fb_size
>> +        };
>> +        let fb_len = fb.end - fb.start;
>
> Isn't this the same as fb_size? Why not just write it as
>
> 	let fb_size = vidmem_size(bar, chipset);
> 	let fb = 0..fb_size;

It is the same indeed, and fb_size and fb_len are semantically the same
thing so no need to have both.

>
>> +
>> +        let vga_workspace = {
>> +            let vga_base = vga_workspace_addr(bar, fb_len, chipset);
>> +
>> +            vga_base..fb.end
>> +        };
>> +
>> +        let bios = vga_workspace.clone();
>
> Why? And why store the same thing twice in FbLayout? If it's really needed,
> clone it in the constructor below and add a comment why it's the same.

The bios field does not seem to be used at the moment anyway, so let me
remove it.

>
>> +
>> +        let frts = {
>> +            const FRTS_DOWN_ALIGN: u64 = 0x20000;
>> +            const FRTS_SIZE: u64 = 0x100000;
>> +            let frts_base = align_down(vga_workspace.start, FRTS_DOWN_ALIGN) - FRTS_SIZE;
>> +
>> +            frts_base..frts_base + FRTS_SIZE
>> +        };
>> +
>> +        Ok(Self {
>> +            fb,
>> +            vga_workspace,
>> +            bios,
>> +            frts,
>> +        })
>> +    }
>> +}
>
> I'd probably wrap those helpers below in
>
> 	mod hal { ... }
>
> or even a new file fb/hal.rs to make their purpose obvious.

Do we want a module here? I'm fine with it, but these methods are
already private anyway and putting them under a module would require
them to have `pub(super)` visibility.

... or maybe we should have an actual HAL here with dynamic dispatch,
similar to what we have in the falcon module. That's what OpenRM does
actually. Let me look into that.

>
>> +/// Returns `true` if the display is disabled.
>> +fn display_disabled(bar: &Bar0, chipset: Chipset) -> bool {
>> +    if chipset >= Chipset::GA100 {
>> +        regs::NV_FUSE_STATUS_OPT_DISPLAY_MAXWELL::read(bar).display_disabled()
>> +    } else {
>> +        regs::NV_FUSE_STATUS_OPT_DISPLAY_AMPERE::read(bar).display_disabled()
>> +    }
>> +}
>> +
>> +/// Returns the video memory size in bytes.
>> +fn vidmem_size(bar: &Bar0, chipset: Chipset) -> u64 {
>> +    if chipset >= Chipset::GA102 {
>
> Is GA102 intentional or should this also be GA100?

After double-checking with OpenRM, GA102 is indeed correct.

>
>> +        (regs::NV_PGC6_AON_SECURE_SCRATCH_GROUP_42::read(bar).value() as u64) << 20
>
> Why isn't this shift part of the register abstraction?

This value came from a scratch register, whose interpretation depends on
context so I did not abstract the values. But OpenRM does, so let me add
a way to create alias registers so we can get adequate definitions here
as well.

>
>> +    } else {
>> +        let local_mem_range = regs::NV_PFB_PRI_MMU_LOCAL_MEMORY_RANGE::read(bar);
>> +        let size =
>> +            (local_mem_range.lower_mag() as u64) << ((local_mem_range.lower_scale() as u64) + 20);
>
> Same here. Please make this part of the register abstraction as it is done in
> previous patches.

Ack.
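
To give an idea of the direction, the two size computations could move
behind accessors like the ones below. This is a sketch on plain values: the
field layout of the real registers is not modelled, and the type and method
names are hypothetical. Only the arithmetic comes from the quoted code.

```rust
/// Converts the scratch register value, expressed in MiB, to bytes.
pub fn vidmem_size_from_scratch(size_mib: u32) -> u64 {
    u64::from(size_mib) << 20
}

/// Already-decoded fields of the local memory range register.
pub struct LocalMemoryRange {
    pub lower_mag: u8,
    pub lower_scale: u8,
    pub ecc_mode_enabled: bool,
}

impl LocalMemoryRange {
    /// Returns the usable framebuffer size in bytes, or `None` if the
    /// scale would shift past the width of a `u64`.
    pub fn usable_fb_size(&self) -> Option<u64> {
        let shift = u32::from(self.lower_scale) + 20;
        let size = u64::from(self.lower_mag).checked_shl(shift)?;

        // Same adjustment as in the patch: 15/16 of the range when ECC
        // mode is enabled.
        Some(if self.ecc_mode_enabled {
            size / 16 * 15
        } else {
            size
        })
    }
}
```

`vidmem_size()` then only selects which register to read, and the shifts
live next to the register definitions.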

>
>> +
>> +        if local_mem_range.ecc_mode_enabled() {
>> +            size / 16 * 15
>> +        } else {
>> +            size
>> +        }
>> +    }
>> +}
>> +
>> +/// Returns the vga workspace address.
>> +fn vga_workspace_addr(bar: &Bar0, fb_size: u64, chipset: Chipset) -> u64 {
>> +    let base = fb_size - 0x100000;
>
> What's this offset? How do you guarantee that this never underflows?

Looked it up in OpenRM, it is the size of PRAMIN. I'll add a constant.

>
>> +    let vga_workspace_base = if display_disabled(bar, chipset) {
>> +        regs::NV_PDISP_VGA_WORKSPACE_BASE::read(bar)
>> +    } else {
>> +        return base;
>> +    };
>> +
>> +    if !vga_workspace_base.status_valid() {
>> +        return base;
>> +    }
>> +
>> +    let addr = (vga_workspace_base.addr() as u64) << 16;
>
> Where does this shift come from? Shouldn't this be part of the register
> abstraction?

Yes. Also added documentation in the field to explain that the field's
lowest 16 bits are truncated.

>
>> +    if addr < base {
>> +        fb_size - 0x20000
>
> What's this offset? Can this ever underflow?

This one is also defined as a constant in OpenRM. Let me replicate it
here.
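
On the underflow question, one option is to pair the named constants with a
checked subtraction, as sketched below. `PRAMIN_SIZE` follows the
explanation above; the name of the second constant and the `FbTooSmall`
error are placeholders, since the thread does not name them:

```rust
/// Size of PRAMIN.
pub const PRAMIN_SIZE: u64 = 0x100000;
/// Second offset used by the quoted code (placeholder name).
pub const VGA_WORKSPACE_FALLBACK_OFFSET: u64 = 0x20000;

/// Error returned when the framebuffer is smaller than the requested offset.
#[derive(Debug, PartialEq)]
pub struct FbTooSmall;

/// Returns the address located `offset` bytes below the end of a
/// framebuffer of `fb_size` bytes, failing instead of wrapping around.
pub fn below_fb_end(fb_size: u64, offset: u64) -> Result<u64, FbTooSmall> {
    fb_size.checked_sub(offset).ok_or(FbTooSmall)
}
```

A framebuffer smaller than the offset then produces an error at probe time
rather than a wrapped address.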



* Re: [PATCH v3 18/19] gpu: nova-core: extract FWSEC from BIOS and patch it to run FWSEC-FRTS
  2025-05-14 16:38   ` Danilo Krummrich
@ 2025-05-19 14:24     ` Alexandre Courbot
  0 siblings, 0 replies; 51+ messages in thread
From: Alexandre Courbot @ 2025-05-19 14:24 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, David Airlie, Simona Vetter, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, Jonathan Corbet, John Hubbard,
	Ben Skeggs, Joel Fernandes, Timur Tabi, Alistair Popple,
	linux-kernel, rust-for-linux, nouveau, dri-devel

On Thu May 15, 2025 at 1:38 AM JST, Danilo Krummrich wrote:
> On Wed, May 07, 2025 at 10:52:45PM +0900, Alexandre Courbot wrote:
>> The FWSEC firmware needs to be extracted from the VBIOS and patched with
>> the desired command, as well as the right signature. Do this so we are
>> ready to load and run this firmware into the GSP falcon and create the
>> FRTS region.
>> 
>> [joelagnelf@nvidia.com: give better names to FalconAppifHdrV1's fields]
>> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
>> ---
>>  drivers/gpu/nova-core/dma.rs            |   3 -
>>  drivers/gpu/nova-core/firmware.rs       |  18 ++
>>  drivers/gpu/nova-core/firmware/fwsec.rs | 359 ++++++++++++++++++++++++++++++++
>>  drivers/gpu/nova-core/gpu.rs            |  20 +-
>>  drivers/gpu/nova-core/vbios.rs          |   3 -
>>  5 files changed, 395 insertions(+), 8 deletions(-)
>> 
>> diff --git a/drivers/gpu/nova-core/dma.rs b/drivers/gpu/nova-core/dma.rs
>> index 9d90ae01d0044eaab4ddbc3eba216741d7a623ef..a12d0dff574aa38fb5eb8f4d759611af2f8ba3ec 100644
>> --- a/drivers/gpu/nova-core/dma.rs
>> +++ b/drivers/gpu/nova-core/dma.rs
>> @@ -2,9 +2,6 @@
>>  
>>  //! Simple DMA object wrapper.
>>  
>> -// To be removed when all code is used.
>> -#![expect(dead_code)]
>> -
>>  use core::ops::{Deref, DerefMut};
>>  
>>  use kernel::device;
>> diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
>> index 960982174d834c7c66a47ecfb3a15bf47116b2c5..3945fd18499555ddd6fb2e0ea69535b40fcc4b08 100644
>> --- a/drivers/gpu/nova-core/firmware.rs
>> +++ b/drivers/gpu/nova-core/firmware.rs
>> @@ -8,9 +8,12 @@
>>  use kernel::prelude::*;
>>  use kernel::str::CString;
>>  
>> +use crate::dma::DmaObject;
>>  use crate::gpu;
>>  use crate::gpu::Chipset;
>>  
>> +pub(crate) mod fwsec;
>> +
>>  pub(crate) const FIRMWARE_VERSION: &str = "535.113.01";
>>  
>>  /// Structure encapsulating the firmware blobs required for the GPU to operate.
>> @@ -86,6 +89,21 @@ pub(crate) fn size(&self) -> usize {
>>      }
>>  }
>>  
>> +/// Patch the `ucode_dma` firmware at offset `sig_base_img` with `signature`.
>> +fn patch_signature(ucode_dma: &mut DmaObject, signature: &[u8], sig_base_img: usize) -> Result<()> {
>> +    if sig_base_img + signature.len() > ucode_dma.size() {
>> +        return Err(EINVAL);
>> +    }
>> +
>> +    // SAFETY: we are the only user of this object, so there cannot be any race.
>> +    let dst = unsafe { ucode_dma.start_ptr_mut().add(sig_base_img) };
>> +
>> +    // SAFETY: `signature` and `dst` are valid, properly aligned, and do not overlap.
>> +    unsafe { core::ptr::copy_nonoverlapping(signature.as_ptr(), dst, signature.len()) };
>> +
>> +    Ok(())
>> +}
>
> Why is this not in firmware/fwsec.rs, like patch_command()?

Ah, there is no way to know it now, but this function will also be used
to patch the booter firmware that runs on sec2, so having it here makes
it available to both sub-modules. I'm fine with moving it into the fwsec
module temporarily if you prefer though.

>
> Also, please wrap the ucode DmaObject in its own type, i.e.
> `struct UcodeDma(DmaObject)` and make the patch_*() functions methods of this
> type. They're only applicable for the ucode DmaObject.

Indeed, good idea. We will even want to specialize that type against
the kind of firmware as not all patching methods may be applicable
depending on the firmware.
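
A rough standalone sketch of the suggested wrapper, to make the shape
concrete. A `Vec<u8>` stands in for `DmaObject`, and `OutOfBounds`,
`new()` and `as_bytes()` are hypothetical names; this is not the code
that will land, only an illustration of turning `patch_signature()` into
a bounds-checked method without `unsafe`.

```rust
/// Stand-in for a ucode image; a Vec<u8> replaces the DMA object so the
/// sketch builds outside the kernel.
pub struct UcodeDma(Vec<u8>);

/// Hypothetical error type; the driver would return EINVAL instead.
#[derive(Debug, PartialEq, Eq)]
pub struct OutOfBounds;

impl UcodeDma {
    pub fn new(data: Vec<u8>) -> Self {
        Self(data)
    }

    /// Patch the image at offset `sig_base_img` with `signature`.
    pub fn patch_signature(
        &mut self,
        signature: &[u8],
        sig_base_img: usize,
    ) -> Result<(), OutOfBounds> {
        // checked_add() also rejects an offset that would overflow usize.
        let end = sig_base_img
            .checked_add(signature.len())
            .ok_or(OutOfBounds)?;
        // get_mut() fails if the range extends past the end of the image.
        let dst = self.0.get_mut(sig_base_img..end).ok_or(OutOfBounds)?;

        dst.copy_from_slice(signature);
        Ok(())
    }

    pub fn as_bytes(&self) -> &[u8] {
        &self.0
    }
}
```

Specializing per firmware kind could then be done with a marker type
parameter, so that only the applicable patch methods exist for each kind.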

<snip>
>> +impl FwsecFirmware {
>> +    /// Extract the Fwsec firmware from `bios` and patch it to run with the `cmd` command.
>> +    pub(crate) fn new(
>> +        falcon: &Falcon<Gsp>,
>> +        dev: &Device<device::Bound>,
>> +        bar: &Bar0,
>> +        bios: &Vbios,
>> +        cmd: FwsecCommand,
>> +    ) -> Result<Self> {
>> +        let v3_desc = bios.fwsec_header(dev)?;
>> +        let ucode = bios.fwsec_ucode(dev)?;
>> +
>> +        let mut ucode_dma = DmaObject::from_data(dev, ucode)?;
>> +        patch_command(&mut ucode_dma, v3_desc, cmd)?;
>> +
>> +        const SIG_SIZE: usize = 96 * 4;
>
> 96 * 4? :-)

Mmmm let me look that up. ^_^; But I think it means that a signature is
made of 96 32-bit integers.

>
>> +        let signatures = bios.fwsec_sigs(dev)?;
>> +        let sig_base_img = (v3_desc.imem_load_size + v3_desc.pkc_data_offset) as usize;
>> +
>> +        if v3_desc.signature_count != 0 {
>> +            // Patch signature.
>> +            let desc_sig_versions = v3_desc.signature_versions as u32;
>> +            let reg_fuse_version = falcon.get_signature_reg_fuse_version(
>> +                bar,
>> +                v3_desc.engine_id_mask,
>> +                v3_desc.ucode_id,
>> +            )?;
>> +            dev_dbg!(
>> +                dev,
>> +                "desc_sig_versions: {:#x}, reg_fuse_version: {}\n",
>> +                desc_sig_versions,
>> +                reg_fuse_version
>> +            );
>> +            let signature_idx = {
>> +                let reg_fuse_version_bit = 1 << reg_fuse_version;
>> +
>> +                // Check if the fuse version is supported by the firmware.
>> +                if desc_sig_versions & reg_fuse_version_bit == 0 {
>> +                    dev_warn!(
>> +                        dev,
>> +                        "no matching signature: {:#x} {:#x}\n",
>> +                        reg_fuse_version_bit,
>> +                        v3_desc.signature_versions
>> +                    );
>
> Looks like this should be dev_err!().

Indeed, fixed.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3 16/19] nova-core: Add support for VBIOS ucode extraction for boot
  2025-05-14 16:23   ` Danilo Krummrich
@ 2025-05-19 22:59     ` Joel Fernandes
  2025-05-20  7:18     ` Joel Fernandes
  1 sibling, 0 replies; 51+ messages in thread
From: Joel Fernandes @ 2025-05-19 22:59 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, David Airlie, Simona Vetter, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, Jonathan Corbet, John Hubbard,
	Ben Skeggs, Timur Tabi, Alistair Popple, linux-kernel,
	rust-for-linux, nouveau, dri-devel, Shirish Baskaran

Hi Danilo,

On 5/14/2025 12:23 PM, Danilo Krummrich wrote:
> On Wed, May 07, 2025 at 10:52:43PM +0900, Alexandre Courbot wrote:
>> +/// PCI Data Structure as defined in PCI Firmware Specification
>> +#[derive(Debug, Clone)]
>> +#[repr(C)]
>> +struct PcirStruct {
>> +    /// PCI Data Structure signature ("PCIR" or "NPDS")
>> +    pub signature: [u8; 4],
>> +    /// PCI Vendor ID (e.g., 0x10DE for NVIDIA)
>> +    pub vendor_id: u16,
>> +    /// PCI Device ID
>> +    pub device_id: u16,
>> +    /// Device List Pointer
>> +    pub device_list_ptr: u16,
>> +    /// PCI Data Structure Length
>> +    pub pci_data_struct_len: u16,
>> +    /// PCI Data Structure Revision
>> +    pub pci_data_struct_rev: u8,
>> +    /// Class code (3 bytes, 0x03 for display controller)
>> +    pub class_code: [u8; 3],
>> +    /// Size of this image in 512-byte blocks
>> +    pub image_len: u16,
>> +    /// Revision Level of the Vendor's ROM
>> +    pub vendor_rom_rev: u16,
>> +    /// ROM image type (0x00 = PC-AT compatible, 0x03 = EFI, 0x70 = NBSI)
>> +    pub code_type: u8,
>> +    /// Last image indicator (0x00 = Not last image, 0x80 = Last image)
>> +    pub last_image: u8,
>> +    /// Maximum Run-time Image Length (units of 512 bytes)
>> +    pub max_runtime_image_len: u16,
>> +}
> 
> Here and in a couple more cases below, please don't use pub for fields of
> private structures.

Fixed thanks.

>> +
>> +impl PcirStruct {
>> +    fn new(pdev: &pci::Device, data: &[u8]) -> Result<Self> {
>> +        if data.len() < core::mem::size_of::<PcirStruct>() {
>> +            dev_err!(pdev.as_ref(), "Not enough data for PcirStruct\n");
>> +            return Err(EINVAL);
>> +        }
>> +
>> +        let mut signature = [0u8; 4];
>> +        signature.copy_from_slice(&data[0..4]);
>> +
>> +        // Signature should be "PCIR" (0x52494350) or "NPDS" (0x5344504e)
>> +        if &signature != b"PCIR" && &signature != b"NPDS" {
>> +            dev_err!(
>> +                pdev.as_ref(),
>> +                "Invalid signature for PcirStruct: {:?}\n",
>> +                signature
>> +            );
>> +            return Err(EINVAL);
>> +        }
>> +
>> +        let mut class_code = [0u8; 3];
>> +        class_code.copy_from_slice(&data[13..16]);
>> +
>> +        Ok(PcirStruct {
>> +            signature,
>> +            vendor_id: u16::from_le_bytes([data[4], data[5]]),
>> +            device_id: u16::from_le_bytes([data[6], data[7]]),
>> +            device_list_ptr: u16::from_le_bytes([data[8], data[9]]),
>> +            pci_data_struct_len: u16::from_le_bytes([data[10], data[11]]),
>> +            pci_data_struct_rev: data[12],
>> +            class_code,
>> +            image_len: u16::from_le_bytes([data[16], data[17]]),
>> +            vendor_rom_rev: u16::from_le_bytes([data[18], data[19]]),
>> +            code_type: data[20],
>> +            last_image: data[21],
>> +            max_runtime_image_len: u16::from_le_bytes([data[22], data[23]]),
>> +        })
> 
> Quite some of those fields seem unused, do we still want to have them? Same for
> other structures below.

I think we discussed this in the previous posting as well. I am not keen
on removing unused fields from structures that are part of 'standard'
specifications, since I only see drawbacks of doing so:
1. Obfuscation
2. Replacement of the fields with some kind of padding so that size_of() works.
3. Goes in the opposite direction of documentation and transparency in how the
structures work.
4. Partially filling structures.


>> +
>> +    /// Check if this is the last image in the ROM
>> +    fn is_last(&self) -> bool {
>> +        self.last_image & LAST_IMAGE_BIT_MASK != 0
>> +    }
>> +
>> +    /// Calculate image size in bytes
>> +    fn image_size_bytes(&self) -> Result<usize> {
>> +        if self.image_len > 0 {
>> +            // Image size is in 512-byte blocks
>> +            Ok(self.image_len as usize * 512)
>> +        } else {
>> +            Err(EINVAL)
>> +        }
>> +    }
>> +}
>> +
>> +/// BIOS Information Table (BIT) Header
>> +/// This is the head of the BIT table, that is used to locate the Falcon data.
>> +/// The BIT table (with its header) is in the PciAtBiosImage and the falcon data
>> +/// it is pointing to is in the FwSecBiosImage.
>> +#[derive(Debug, Clone, Copy)]
>> +#[expect(dead_code)]
>> +struct BitHeader {
>> +    /// 0h: BIT Header Identifier (BMP=0x7FFF/BIT=0xB8FF)
>> +    pub id: u16,
>> +    /// 2h: BIT Header Signature ("BIT\0")
>> +    pub signature: [u8; 4],
>> +    /// 6h: Binary Coded Decimal Version, ex: 0x0100 is 1.00.
>> +    pub bcd_version: u16,
>> +    /// 8h: Size of BIT Header (in bytes)
>> +    pub header_size: u8,
>> +    /// 9h: Size of BIT Tokens (in bytes)
>> +    pub token_size: u8,
>> +    /// 10h: Number of token entries that follow
>> +    pub token_entries: u8,
>> +    /// 11h: BIT Header Checksum
>> +    pub checksum: u8,
>> +}
>> +
>> +impl BitHeader {
>> +    fn new(data: &[u8]) -> Result<Self> {
>> +        if data.len() < 12 {
>> +            return Err(EINVAL);
>> +        }
>> +
>> +        let mut signature = [0u8; 4];
>> +        signature.copy_from_slice(&data[2..6]);
>> +
>> +        // Check header ID and signature
>> +        let id = u16::from_le_bytes([data[0], data[1]]);
>> +        if id != 0xB8FF || &signature != b"BIT\0" {
>> +            return Err(EINVAL);
>> +        }
>> +
>> +        Ok(BitHeader {
>> +            id,
>> +            signature,
>> +            bcd_version: u16::from_le_bytes([data[6], data[7]]),
>> +            header_size: data[8],
>> +            token_size: data[9],
>> +            token_entries: data[10],
>> +            checksum: data[11],
>> +        })
>> +    }
>> +}
>> +
>> +/// BIT Token Entry: Records in the BIT table followed by the BIT header
>> +#[derive(Debug, Clone, Copy)]
>> +#[expect(dead_code)]
>> +struct BitToken {
>> +    /// 00h: Token identifier
>> +    pub id: u8,
>> +    /// 01h: Version of the token data
>> +    pub data_version: u8,
>> +    /// 02h: Size of token data in bytes
>> +    pub data_size: u16,
>> +    /// 04h: Offset to the token data
>> +    pub data_offset: u16,
>> +}
>> +
>> +// Define the token ID for the Falcon data
>> +pub(in crate::vbios) const BIT_TOKEN_ID_FALCON_DATA: u8 = 0x70;
> 
> This can just be private.

Yep, fixed.

>> +
>> +impl BitToken {
>> +    /// Find a BIT token entry by BIT ID in a PciAtBiosImage
>> +    pub(in crate::vbios) fn from_id(image: &PciAtBiosImage, token_id: u8) -> Result<Self> {
> 
> Same here.

Fixed.

> <snip>
> 
>> +struct PciAtBiosImage {
>> +    base: BiosImageBase,
>> +    bit_header: Option<BitHeader>,
>> +    bit_offset: Option<usize>,
> 
> Why are those Options? AFAICS, this structure is only ever created from
> 
> 	impl TryFrom<BiosImageBase> for PciAtBiosImage
> 
> and there you fail if you can't find the bit header anyways.
> 
> Also BitToken::from_id fails if bit_header == None, and it doesn't seem to be
> used anywhere else.
> 
> I think we should remove the Option wrapper for both.

Yes, thanks. That does simplify the code, I made the change and it works.
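
For reference, a minimal standalone sketch of the shape this takes once
the Option wrappers are gone. The types are simplified stand-ins (a
`Vec<u8>` for the image data, a hypothetical `BitHeaderNotFound` error),
and the search pattern is derived from the BIT header ID and signature
quoted above; the actual lookup code in the patch may differ.

```rust
use std::convert::TryFrom;

/// Simplified stand-in for the image base; only the raw bytes are kept.
pub struct BiosImageBase {
    pub data: Vec<u8>,
}

/// No Option here: a PciAtBiosImage cannot exist without a BIT header.
pub struct PciAtBiosImage {
    pub base: BiosImageBase,
    pub bit_offset: usize,
}

/// Hypothetical error type; the driver would return a kernel error code.
#[derive(Debug, PartialEq, Eq)]
pub struct BitHeaderNotFound;

impl TryFrom<BiosImageBase> for PciAtBiosImage {
    type Error = BitHeaderNotFound;

    fn try_from(base: BiosImageBase) -> Result<Self, Self::Error> {
        // Header ID 0xB8FF in little-endian, followed by "BIT\0".
        const PATTERN: [u8; 6] = [0xff, 0xb8, b'B', b'I', b'T', 0];

        // Fail the conversion if the header is missing, so that later
        // code never has to handle an absent offset.
        let bit_offset = base
            .data
            .windows(PATTERN.len())
            .position(|w| w == &PATTERN[..])
            .ok_or(BitHeaderNotFound)?;

        Ok(Self { base, bit_offset })
    }
}
```

Since construction is the only fallible step, users such as
`BitToken::from_id()` can read the offset directly.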


> 
>> +/// The PmuLookupTableEntry structure is used to find the PmuLookupTableEntry
>> +/// for a given application ID. The table of entries is pointed to by the falcon
>> +/// data pointer in the BIT table, and is used to locate the Falcon Ucode.
>> +#[expect(dead_code)]
>> +struct PmuLookupTable {
>> +    version: u8,
>> +    header_len: u8,
>> +    entry_len: u8,
>> +    entry_count: u8,
>> +    table_data: KVec<u8>,
>> +}
>> +
>> +impl PmuLookupTable {
>> +    fn new(data: &[u8]) -> Result<Self> {
>> +        if data.len() < 4 {
>> +            return Err(EINVAL);
>> +        }
>> +
>> +        let header_len = data[1] as usize;
>> +        let entry_len = data[2] as usize;
>> +        let entry_count = data[3] as usize;
>> +
>> +        let required_bytes = header_len + (entry_count * entry_len);
>> +
>> +        if data.len() < required_bytes {
>> +            return Err(EINVAL);
>> +        }
>> +
>> +        // Create a copy of only the table data
>> +        let mut table_data = KVec::new();
>> +
>> +        // "last_entry_bytes" is a debugging aid.
>> +        let mut last_entry_bytes: Option<KVec<u8>> = if cfg!(debug_assertions) {
>> +            Some(KVec::new())
>> +        } else {
>> +            None
>> +        };
>> +
>> +        for &byte in &data[header_len..required_bytes] {
>> +            table_data.push(byte, GFP_KERNEL)?;
> 
> This should just be
> 
> 	table_data.extend_from_slice(&data[header_len..required_bytes], GFP_KERNEL)?;
> 
> so you don't need the loop and potentially lots of re-allocations.
> 
> Subsequently you can implement the debugging stuff as
> 
> 	if cfg!(debug_assertions) {
> 	    let mut last_entry_bytes = KVec::new();
> 	
> 	    for &byte in &data[header_len..required_bytes] {
> 	        // Debugging (dumps the table data to dmesg):
> 	        last_entry_bytes.push(byte, GFP_KERNEL)?;
> 	
> 	        let last_entry_bytes_len = last_entry_bytes.len();
> 	        if last_entry_bytes_len == entry_len {
> 	            pr_info!("Last entry bytes: {:02x?}\n", &last_entry_bytes[..]);
> 	            last_entry_bytes = KVec::new();
> 	        }
> 	    }
> 	}

Ok, that's better, I took the opportunity to replace this code with:

(Sorry for wrapping)

        // Create a copy of only the table data
        let data_entries = &data[header_len..required_bytes];
        let table_data = {
            let mut ret = KVec::new();
            ret.extend_from_slice(&data_entries, GFP_KERNEL)?;
            ret
        };

        // Debug logging of entries (dumps the table data to dmesg)
        if cfg!(debug_assertions) {
            for i in 0..entry_count {
                pr_info!(
                    "PMU entry: {:02x?}\n",
                    &data_entries[i * entry_len..(i + 1) * entry_len]
                );
            }
        }
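
The same idea as a standalone sketch that builds outside the kernel, in
case it helps review. `Vec` replaces `KVec`, `eprintln!` replaces the
kernel print macros, `Option` replaces the kernel `Result`, and
`parse_pmu_lookup_table()` is a hypothetical free function; the
per-entry iteration uses `chunks_exact()` instead of manual indexing.

```rust
pub struct PmuLookupTable {
    pub version: u8,
    pub entry_len: usize,
    pub table_data: Vec<u8>,
}

/// Parse the 4-byte header (version, header_len, entry_len, entry_count)
/// and copy only the entries that follow it.
pub fn parse_pmu_lookup_table(data: &[u8]) -> Option<PmuLookupTable> {
    if data.len() < 4 {
        return None;
    }

    let header_len = usize::from(data[1]);
    let entry_len = usize::from(data[2]);
    let entry_count = usize::from(data[3]);
    let required_bytes = header_len + entry_count * entry_len;

    // get() returns None if the table claims more bytes than provided.
    let entries = data.get(header_len..required_bytes)?;

    // Debug dump, one line per entry. chunks_exact() panics on a chunk
    // size of zero, hence the extra guard.
    if cfg!(debug_assertions) && entry_len != 0 {
        for entry in entries.chunks_exact(entry_len) {
            eprintln!("PMU entry: {:02x?}", entry);
        }
    }

    Some(PmuLookupTable {
        version: data[0],
        entry_len,
        // Single allocation and copy, no per-byte push.
        table_data: entries.to_vec(),
    })
}
```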


> In general, I feel like this patch utilizes the Option type way too much and
> often without actual need. Can you please also double check?

Yeah, sorry, I'm somewhat new to Rust. :-D. I am going through all my Options now.

I will continue addressing the rest of the comments and those in the other email
and will reply soon. Thanks!

 - Joel


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3 16/19] nova-core: Add support for VBIOS ucode extraction for boot
  2025-05-16 20:38   ` Timur Tabi
@ 2025-05-20  6:35     ` Joel Fernandes
  0 siblings, 0 replies; 51+ messages in thread
From: Joel Fernandes @ 2025-05-20  6:35 UTC (permalink / raw)
  To: Timur Tabi, dakr@kernel.org, a.hindborg@kernel.org,
	ojeda@kernel.org, boqun.feng@gmail.com, simona@ffwll.ch,
	tmgross@umich.edu, alex.gaynor@gmail.com, tzimmermann@suse.de,
	corbet@lwn.net, mripard@kernel.org,
	maarten.lankhorst@linux.intel.com, benno.lossin@proton.me,
	bjorn3_gh@protonmail.com, airlied@gmail.com, aliceryhl@google.com,
	Alexandre Courbot, gary@garyguo.net
  Cc: Shirish Baskaran, Alistair Popple, John Hubbard,
	rust-for-linux@vger.kernel.org, dri-devel@lists.freedesktop.org,
	nouveau@lists.freedesktop.org, linux-kernel@vger.kernel.org,
	Ben Skeggs



On 5/16/2025 4:38 PM, Timur Tabi wrote:
> On Wed, 2025-05-07 at 22:52 +0900, Alexandre Courbot wrote:
>> +impl FwSecBiosImage {
>> +    fn setup_falcon_data(
>> +        &mut self,
>> +        pdev: &pci::Device,
>> +        pci_at_image: &PciAtBiosImage,
>> +        first_fwsec_image: &FwSecBiosImage,
>> +    ) -> Result<()> {
>> +        let mut offset = pci_at_image.falcon_data_ptr(pdev)? as usize;
>> +
>> +        // The falcon data pointer assumes that the PciAt and FWSEC images
>> +        // are contiguous in memory. However, testing shows the EFI image sits in
>> +        // between them. So calculate the offset from the end of the PciAt image
>> +        // rather than the start of it. Compensate.
>> +        offset -= pci_at_image.base.data.len();
>> +
>> +        // The offset is now from the start of the first Fwsec image, however
>> +        // the offset points to a location in the second Fwsec image. Since
>> +        // the fwsec images are contiguous, subtract the length of the first Fwsec
>> +        // image from the offset to get the offset to the start of the second
>> +        // Fwsec image.
>> +        offset -= first_fwsec_image.base.data.len();
>> +
>> +        self.falcon_data_offset = Some(offset);
>> +
>> +        // The PmuLookupTable starts at the offset of the falcon data pointer
>> +        self.pmu_lookup_table = Some(PmuLookupTable::new(&self.base.data[offset..])?);
>> +
>> +        match self
>> +            .pmu_lookup_table
>> +            .as_ref()
>> +            .ok_or(EINVAL)?
>> +            .find_entry_by_type(FALCON_UCODE_ENTRY_APPID_FWSEC_PROD)
>> +        {
>> +            Ok(entry) => {
>> +                let mut ucode_offset = entry.data as usize;
>> +                ucode_offset -= pci_at_image.base.data.len();
>> +                ucode_offset -= first_fwsec_image.base.data.len();
>> +                self.falcon_ucode_offset = Some(ucode_offset);
>> +                if cfg!(debug_assertions) {
>> +                    // Print the v3_desc header for debugging
>> +                    let v3_desc = self.fwsec_header(pdev.as_ref())?;
>> +                    pr_info!("PmuLookupTableEntry v3_desc: {:#?}\n", v3_desc);
>> +                }
>> +            }
>> +            Err(e) => {
>> +                dev_err!(
>> +                    pdev.as_ref(),
>> +                    "PmuLookupTableEntry not found, error: {:?}\n",
>> +                    e
>> +                );
> Shouldn't you return an error here?
> 
> If not, then maybe this should probably be dev_warn.

Good catch, fixed! Thanks,

 - Joel
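
A standalone sketch of the fix, assuming the error should simply
propagate after being logged. `Entry`, `LookupError` and
`falcon_ucode_offset()` are hypothetical simplifications of the code in
the patch, and `eprintln!` stands in for `dev_err!()`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LookupError {
    /// No entry with the requested application ID.
    NotFound,
    /// The entry points before the start of the image being parsed.
    BadOffset,
}

pub struct Entry {
    pub application_id: u8,
    pub data: u32,
}

pub fn find_entry_by_type(entries: &[Entry], app_id: u8) -> Result<&Entry, LookupError> {
    entries
        .iter()
        .find(|e| e.application_id == app_id)
        .ok_or(LookupError::NotFound)
}

/// Offset of the ucode within the current image. `preceding_len` is the
/// combined length of the images located before it.
pub fn falcon_ucode_offset(
    entries: &[Entry],
    app_id: u8,
    preceding_len: usize,
) -> Result<usize, LookupError> {
    // Log and propagate: a missing entry is no longer swallowed.
    let entry = find_entry_by_type(entries, app_id).map_err(|e| {
        eprintln!("PmuLookupTableEntry not found, error: {:?}", e);
        e
    })?;

    // Checked subtraction so a bogus pointer cannot underflow.
    (entry.data as usize)
        .checked_sub(preceding_len)
        .ok_or(LookupError::BadOffset)
}
```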


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3 16/19] nova-core: Add support for VBIOS ucode extraction for boot
  2025-05-14 16:23   ` Danilo Krummrich
  2025-05-19 22:59     ` Joel Fernandes
@ 2025-05-20  7:18     ` Joel Fernandes
  1 sibling, 0 replies; 51+ messages in thread
From: Joel Fernandes @ 2025-05-20  7:18 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, David Airlie, Simona Vetter, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, Jonathan Corbet, John Hubbard,
	Ben Skeggs, Timur Tabi, Alistair Popple, linux-kernel,
	rust-for-linux, nouveau, dri-devel, Shirish Baskaran

Hi Danilo,

On 5/14/2025 12:23 PM, Danilo Krummrich wrote:
>  I feel like this patch utilizes the Option type way too much and
> often without actual need. Can you please also double check?
> 

I found one other instance (vbios.fwsec_image). Other than that, all others are
required AFAICS.

>> +
>> +            if cfg!(debug_assertions) {
>> +                // Debugging (dumps the table data to dmesg):
>> +                if let Some(ref mut last_entry_bytes) = last_entry_bytes {
>> +                    last_entry_bytes.push(byte, GFP_KERNEL)?;
>> +
>> +                    let last_entry_bytes_len = last_entry_bytes.len();
>> +                    if last_entry_bytes_len == entry_len {
>> +                        pr_info!("Last entry bytes: {:02x?}\n", &last_entry_bytes[..]);
>
> Please use dev_dbg!().
>

This required passing down the pdev here, but did that, thanks.

>> +                        *last_entry_bytes = KVec::new();
>> +                    }
>> +                }
>> +            }
>> +        }
>> +
>> +        Ok(PmuLookupTable {
>> +            version: data[0],
>> +            header_len: header_len as u8,
>> +            entry_len: entry_len as u8,
>> +            entry_count: entry_count as u8,
>> +            table_data,
>> +        })
>> +    }
>> +
>> +    fn lookup_index(&self, idx: u8) -> Result<PmuLookupTableEntry> {
>> +        if idx >= self.entry_count {
>> +            return Err(EINVAL);
>> +        }
>> +
>> +        let index = (idx as usize) * self.entry_len as usize;
>> +        PmuLookupTableEntry::new(&self.table_data[index..])
>> +    }
>> +
>> +    // find entry by type value
>> +    fn find_entry_by_type(&self, entry_type: u8) -> Result<PmuLookupTableEntry> {
>> +        for i in 0..self.entry_count {
>> +            let entry = self.lookup_index(i)?;
>> +            if entry.application_id == entry_type {
>> +                return Ok(entry);
>> +            }
>> +        }
>> +
>> +        Err(EINVAL)
>> +    }
>> +}
>> +
>> +/// The FwSecBiosImage structure contains the PMU table and the Falcon Ucode.
>> +/// The PMU table contains voltage/frequency tables as well as a pointer to the
>> +/// Falcon Ucode.
>> +impl FwSecBiosImage {
>> +    fn setup_falcon_data(
>> +        &mut self,
>> +        pdev: &pci::Device,
>> +        pci_at_image: &PciAtBiosImage,
>> +        first_fwsec_image: &FwSecBiosImage,
>> +    ) -> Result<()> {
>
> Just Result will do.
>

Fixed.
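
For anyone following along who wonders why a bare `Result` compiles: the
kernel's `Result` alias has default type parameters. Below is a
standalone sketch of the mechanism with a local alias; the `error` and
`demo` modules, the `Error` unit struct and both functions are made up
for the example.

```rust
pub mod error {
    /// Stand-in for the kernel error type.
    #[derive(Debug, PartialEq, Eq)]
    pub struct Error;

    /// Both parameters have defaults, so `Result` alone means
    /// `Result<(), Error>`.
    pub type Result<T = (), E = Error> = ::std::result::Result<T, E>;
}

pub mod demo {
    use super::error::{Error, Result};

    /// Bare `Result`: success carries no value.
    pub fn setup(offset: usize, image_len: usize) -> Result {
        if offset >= image_len {
            return Err(Error);
        }
        Ok(())
    }

    /// `Result<usize>` only overrides the success type.
    pub fn checked_offset(offset: usize, image_len: usize) -> Result<usize> {
        setup(offset, image_len)?;
        Ok(offset)
    }
}
```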

>> +        let mut offset = pci_at_image.falcon_data_ptr(pdev)? as usize;
>> +
>> +        // The falcon data pointer assumes that the PciAt and FWSEC images
>> +        // are contiguous in memory. However, testing shows the EFI image sits in
>> +        // between them. So calculate the offset from the end of the PciAt image
>> +        // rather than the start of it. Compensate.
>> +        offset -= pci_at_image.base.data.len();
>> +
>> +        // The offset is now from the start of the first Fwsec image, however
>> +        // the offset points to a location in the second Fwsec image. Since
>> +        // the fwsec images are contiguous, subtract the length of the first Fwsec
>> +        // image from the offset to get the offset to the start of the second
>> +        // Fwsec image.
>> +        offset -= first_fwsec_image.base.data.len();
>> +
>> +        self.falcon_data_offset = Some(offset);
>> +
>> +        // The PmuLookupTable starts at the offset of the falcon data pointer
>> +        self.pmu_lookup_table = Some(PmuLookupTable::new(&self.base.data[offset..])?);
>> +
>> +        match self
>> +            .pmu_lookup_table
>> +            .as_ref()
>> +            .ok_or(EINVAL)?
>> +            .find_entry_by_type(FALCON_UCODE_ENTRY_APPID_FWSEC_PROD)
>> +        {
>> +            Ok(entry) => {
>> +                let mut ucode_offset = entry.data as usize;
>> +                ucode_offset -= pci_at_image.base.data.len();
>> +                ucode_offset -= first_fwsec_image.base.data.len();
>> +                self.falcon_ucode_offset = Some(ucode_offset);
>> +                if cfg!(debug_assertions) {
>> +                    // Print the v3_desc header for debugging
>> +                    let v3_desc = self.fwsec_header(pdev.as_ref())?;
>> +                    pr_info!("PmuLookupTableEntry v3_desc: {:#?}\n", v3_desc);
>> +                }
>> +            }
>> +            Err(e) => {
>> +                dev_err!(
>> +                    pdev.as_ref(),
>> +                    "PmuLookupTableEntry not found, error: {:?}\n",
>> +                    e
>> +                );
>> +            }
>> +        }
>> +        Ok(())
>> +    }
>> +
>> +    /// TODO: These were borrowed from the old code for integrating this module
>> +    /// with the outside world. They should be cleaned up and integrated properly.
>
> Okay, won't review for now then. 🙂

In the next revision, we are removing this TODO and can continue review. :)

thanks,

 - Joel



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3 16/19] nova-core: Add support for VBIOS ucode extraction for boot
  2025-05-13 17:19   ` Danilo Krummrich
@ 2025-05-20  7:55     ` Joel Fernandes
  2025-05-20  9:30       ` Danilo Krummrich
  0 siblings, 1 reply; 51+ messages in thread
From: Joel Fernandes @ 2025-05-20  7:55 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot
  Cc: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, David Airlie, Simona Vetter, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, Jonathan Corbet, John Hubbard,
	Ben Skeggs, Timur Tabi, Alistair Popple, linux-kernel,
	rust-for-linux, nouveau, dri-devel, Shirish Baskaran

Hi Danilo,

On 5/13/2025 1:19 PM, Danilo Krummrich wrote:
> On Wed, May 07, 2025 at 10:52:43PM +0900, Alexandre Courbot wrote:
>> From: Joel Fernandes <joelagnelf@nvidia.com>
>>
>> Add support for navigating and setting up vBIOS ucode data required for
>> GSP to boot. The main data extracted from the vBIOS is the FWSEC-FRTS
>> firmware which runs on the GSP processor. This firmware runs in high
>> secure mode, and sets up the WPR2 (Write protected region) before the
>> Booter runs on the SEC2 processor.
>>
>> Also add log messages to show the BIOS images.
>>
>> [102141.013287] NovaCore: Found BIOS image at offset 0x0, size: 0xfe00, type: PciAt
>> [102141.080692] NovaCore: Found BIOS image at offset 0xfe00, size: 0x14800, type: Efi
>> [102141.098443] NovaCore: Found BIOS image at offset 0x24600, size: 0x5600, type: FwSec
>> [102141.415095] NovaCore: Found BIOS image at offset 0x29c00, size: 0x60800, type: FwSec
>>
>> Tested on my Ampere GA102 and boot is successful.
>>
>> [applied changes by Alex Courbot for fwsec signatures]
>> [applied feedback from Alex Courbot and Timur Tabi]
>> [applied changes related to code reorg, prints etc from Danilo Krummrich]
>> [acourbot@nvidia.com: fix clippy warnings]
>> [acourbot@nvidia.com: remove now-unneeded Devres acquisition]
>> [acourbot@nvidia.com: fix read_more to read `len` bytes, not u32s]
>>
>> Cc: Alexandre Courbot <acourbot@nvidia.com>
>> Cc: John Hubbard <jhubbard@nvidia.com>
>> Cc: Shirish Baskaran <sbaskaran@nvidia.com>
>> Cc: Alistair Popple <apopple@nvidia.com>
>> Cc: Timur Tabi <ttabi@nvidia.com>
>> Cc: Ben Skeggs <bskeggs@nvidia.com>
>> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
>> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
>> ---
>>  drivers/gpu/nova-core/firmware.rs  |    2 -
>>  drivers/gpu/nova-core/gpu.rs       |    3 +
>>  drivers/gpu/nova-core/nova_core.rs |    1 +
>>  drivers/gpu/nova-core/vbios.rs     | 1147 ++++++++++++++++++++++++++++++++++++
>>  4 files changed, 1151 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
>> index 1eb216307cd01d975b3d5beda1dc516f34b4b3f2..960982174d834c7c66a47ecfb3a15bf47116b2c5 100644
>> --- a/drivers/gpu/nova-core/firmware.rs
>> +++ b/drivers/gpu/nova-core/firmware.rs
>> @@ -80,8 +80,6 @@ pub(crate) struct FalconUCodeDescV3 {
>>      _reserved: u16,
>>  }
>>  
>> -// To be removed once that code is used.
>> -#[expect(dead_code)]
>>  impl FalconUCodeDescV3 {
>>      pub(crate) fn size(&self) -> usize {
>>          ((self.hdr & 0xffff0000) >> 16) as usize
>> diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
>> index ece13594fba687f3f714e255b5436e72d80dece3..4bf7f72247e5320935a517270b5a0e1ec2becfec 100644
>> --- a/drivers/gpu/nova-core/gpu.rs
>> +++ b/drivers/gpu/nova-core/gpu.rs
>> @@ -9,6 +9,7 @@
>>  use crate::firmware::Firmware;
>>  use crate::regs;
>>  use crate::util;
>> +use crate::vbios::Vbios;
>>  use core::fmt;
>>  
>>  macro_rules! define_chipset {
>> @@ -238,6 +239,8 @@ pub(crate) fn new(
>>  
>>          let _sec2_falcon = Falcon::<Sec2>::new(pdev.as_ref(), spec.chipset, bar, true)?;
>>  
>> +        let _bios = Vbios::new(pdev, bar)?;
> 
> Please add a comment why, even though unused, it is important to create this
> instance.
> 
> Also, please use `_` if it's not intended to ever be used.

If I add a comment, it will simply be removed by the next patch. I can
still add it, though, to make this clearer.
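
One Rust detail that matters when choosing between `_` and `_bios`: a
value bound to `_` is dropped at the end of the statement, whereas a
named binding such as `_bios` lives until the end of its scope. The
standalone sketch below shows the ordering; `Tracked` and `drop_order()`
are hypothetical helpers that only record when each value is dropped.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Records its own drop in a shared log.
pub struct Tracked {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Returns the events in the order they happened.
pub fn drop_order() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));

    {
        // `_` does not bind: the value is dropped right after this line.
        let _ = Tracked { name: "underscore", log: Rc::clone(&log) };
        // `_named` is a real binding: dropped when the block ends.
        let _named = Tracked { name: "named", log: Rc::clone(&log) };

        log.borrow_mut().push(String::from("end of scope"));
    }

    let events = log.borrow().clone();
    events
}
```

So if the object has to stay alive for the rest of the function, a named
binding is needed; if it is only built for its side effects, `_` is
enough.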

[...]
>> +impl<'a> Iterator for VbiosIterator<'a> {
>> +    type Item = Result<BiosImage>;
>> +
>> +    /// Iterate over all VBIOS images until the last image is detected or offset
>> +    /// exceeds scan limit.
>> +    fn next(&mut self) -> Option<Self::Item> {
>> +        if self.last_found {
>> +            return None;
>> +        }
>> +
>> +        if self.current_offset > BIOS_MAX_SCAN_LEN {
>> +            dev_err!(
>> +                self.pdev.as_ref(),
>> +                "Error: exceeded BIOS scan limit, stopping scan\n"
>> +            );
>> +            return None;
>> +        }
>> +
>> +        // Parse image headers first to get image size
>> +        let image_size = match self
>> +            .read_bios_image_at_offset(
>> +                self.current_offset,
>> +                BIOS_READ_AHEAD_SIZE,
>> +                "parse initial BIOS image headers",
>> +            )
>> +            .and_then(|image| image.image_size_bytes())
>> +        {
>> +            Ok(size) => size,
>> +            Err(e) => return Some(Err(e)),
>> +        };
>> +
>> +        // Now create a new BiosImage with the full image data
>> +        let full_image = match self.read_bios_image_at_offset(
>> +            self.current_offset,
>> +            image_size,
>> +            "parse full BIOS image",
>> +        ) {
>> +            Ok(image) => image,
>> +            Err(e) => return Some(Err(e)),
>> +        };
>> +
>> +        self.last_found = full_image.is_last();
>> +
>> +        // Advance to next image (aligned to 512 bytes)
>> +        self.current_offset += image_size;
>> +        self.current_offset = self.current_offset.align_up(512);
>> +
>> +        Some(Ok(full_image))
>> +    }
>> +}
>> +
>> +pub(crate) struct Vbios {
>> +    pub fwsec_image: Option<FwSecBiosImage>,
> 
> Please use pub(crate) instead or provide an accessor.
> 
> Also, this shouldn't be an Option, see below comment in Vbios::new().

Ok, I just removed pub altogether, since the users are all within this module.

>> +}
>> +
>> +impl Vbios {
>> +    /// Probe for VBIOS extraction
>> +    /// Once the VBIOS object is built, bar0 is not read for vbios purposes anymore.
>> +    pub(crate) fn new(pdev: &pci::Device, bar0: &Bar0) -> Result<Vbios> {
>> +        // Images to extract from iteration
>> +        let mut pci_at_image: Option<PciAtBiosImage> = None;
>> +        let mut first_fwsec_image: Option<FwSecBiosImage> = None;
>> +        let mut second_fwsec_image: Option<FwSecBiosImage> = None;
>> +
>> +        // Parse all VBIOS images in the ROM
>> +        for image_result in VbiosIterator::new(pdev, bar0)? {
>> +            let full_image = image_result?;
>> +
>> +            dev_info!(
> 
> Let's use dev_dbg!() instead.

Done.

> 
>> +                pdev.as_ref(),
>> +                "Found BIOS image: size: {:#x}, type: {}, last: {}\n",
>> +                full_image.image_size_bytes()?,
>> +                full_image.image_type_str(),
>> +                full_image.is_last()
>> +            );
>> +
>> +            // Get references to images we will need after the loop, in order to
>> +            // setup the falcon data offset.
>> +            match full_image {
>> +                BiosImage::PciAt(image) => {
>> +                    pci_at_image = Some(image);
>> +                }
>> +                BiosImage::FwSec(image) => {
>> +                    if first_fwsec_image.is_none() {
>> +                        first_fwsec_image = Some(image);
>> +                    } else {
>> +                        second_fwsec_image = Some(image);
>> +                    }
>> +                }
>> +                // For now we don't need to handle these
>> +                BiosImage::Efi(_image) => {}
>> +                BiosImage::Nbsi(_image) => {}
>> +            }
>> +        }
>> +
>> +        // Using all the images, setup the falcon data pointer in Fwsec.
>> +        // We need mutable access here, so we handle the Option manually.
>> +        let final_fwsec_image = {
>> +            let mut second = second_fwsec_image; // Take ownership of the option
>> +
>> +            if let (Some(second), Some(first), Some(pci_at)) =
>> +                (second.as_mut(), first_fwsec_image, pci_at_image)
>> +            {
>> +                second
>> +                    .setup_falcon_data(pdev, &pci_at, &first)
>> +                    .inspect_err(|e| {
>> +                        dev_err!(pdev.as_ref(), "Falcon data setup failed: {:?}\n", e)
>> +                    })?;
>> +            } else {
>> +                dev_err!(
>> +                    pdev.as_ref(),
>> +                    "Missing required images for falcon data setup, skipping\n"
>> +                );
>> +                return Err(EINVAL);
> 
> This means that if second == None we fail, which makes sense, so why store an
> Option in Vbios? All methods of Vbios fail if fwsec_image == None.
> 

Well, if first and pci_at are None, we will fail as well. Not just second. But
we don't know until we finish parsing all the images in the prior loop, if we
found all the images. So we store it as Option during the prior loop, and check
it later. Right?

>> +            }
>> +            second
>> +        };
> 
> I think this should be:
> 
> 	let mut second = second_fwsec_image;
> 	
> 	if let (Some(second), Some(first), Some(pci_at)) =
> 	    (second.as_mut(), first_fwsec_image, pci_at_image)
> 	{
> 	    second
> 	        .setup_falcon_data(pdev, &pci_at, &first)
> 	        .inspect_err(|e| {
> 	            dev_err!(pdev.as_ref(), "Falcon data setup failed: {:?}\n", e)
> 	        })?;
> 	
> 	    Ok(Vbios(second)
> 	} else {
> 	    dev_err!(
> 	        pdev.as_ref(),
> 	        "Missing required images for falcon data setup, skipping\n"
> 	    );
> 	
> 	    Err(EINVAL)
> 	}
> 
> where Vbios can just be
> 
> 	pub(crate) struct Vbios(FwSecBiosImage);

But your suggestion here still considers second as an Option? That's why you
wrote 'Some(second)' ?

> 
>> +
>> +        Ok(Vbios {
>> +            fwsec_image: final_fwsec_image,
>> +        })
>> +    }
>> +
>> +    pub(crate) fn fwsec_header(&self, pdev: &device::Device) -> Result<&FalconUCodeDescV3> {
>> +        let image = self.fwsec_image.as_ref().ok_or(EINVAL)?;
>> +        image.fwsec_header(pdev)
>> +    }
>> +
>> +    pub(crate) fn fwsec_ucode(&self, pdev: &device::Device) -> Result<&[u8]> {
>> +        let image = self.fwsec_image.as_ref().ok_or(EINVAL)?;
>> +        image.fwsec_ucode(pdev, image.fwsec_header(pdev)?)
>> +    }
>> +
>> +    pub(crate) fn fwsec_sigs(&self, pdev: &device::Device) -> Result<&[u8]> {
>> +        let image = self.fwsec_image.as_ref().ok_or(EINVAL)?;
>> +        image.fwsec_sigs(pdev, image.fwsec_header(pdev)?)
>> +    }
> 
> Those then become infallible, e.g.
> 
> 	pub(crate) fn fwsec_sigs(&self, pdev: &device::Device) -> &[u8] {
> 	    self.0.fwsec_sigs(pdev, self.fwsec_header(pdev))
> 	}
> 

Nope, I think you are wrong there. fwsec_sigs() of the underlying .0 returns a
Result.

Also in Vbios::new(), I extract the Option when returning:

        Ok(Vbios {
            fwsec_image: final_fwsec_image.ok_or(EINVAL)?,
        })

But fwsec_header() still has to return a Result:
    pub(crate) fn fwsec_header(&self, pdev: &device::Device) ->
              Result<FalconUCodeDescV3> {
        self.fwsec_image.fwsec_header(pdev)
    }

thanks,

 - Joel


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3 16/19] nova-core: Add support for VBIOS ucode extraction for boot
  2025-05-20  7:55     ` Joel Fernandes
@ 2025-05-20  9:30       ` Danilo Krummrich
  2025-05-20 13:43         ` Joel Fernandes
  0 siblings, 1 reply; 51+ messages in thread
From: Danilo Krummrich @ 2025-05-20  9:30 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: Alexandre Courbot, Miguel Ojeda, Alex Gaynor, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Alice Ryhl, Trevor Gross, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet, John Hubbard, Ben Skeggs, Timur Tabi,
	Alistair Popple, linux-kernel, rust-for-linux, nouveau, dri-devel,
	Shirish Baskaran

On Tue, May 20, 2025 at 03:55:06AM -0400, Joel Fernandes wrote:
> On 5/13/2025 1:19 PM, Danilo Krummrich wrote:
> > On Wed, May 07, 2025 at 10:52:43PM +0900, Alexandre Courbot wrote:
> >> @@ -238,6 +239,8 @@ pub(crate) fn new(
> >>  
> >>          let _sec2_falcon = Falcon::<Sec2>::new(pdev.as_ref(), spec.chipset, bar, true)?;
> >>  
> >> +        let _bios = Vbios::new(pdev, bar)?;
> > 
> > Please add a comment why, even though unused, it is important to create this
> > instance.
> > 
> > Also, please use `_` if it's not intended to ever be used.
> 
> If I add a comment, it will simply be removed by the next patch. I can add that
> though so it makes it more clear.

I recommend adding such comments, because then reviewers don't stumble over it.
:-)

> >> +impl Vbios {
> >> +    /// Probe for VBIOS extraction
> >> +    /// Once the VBIOS object is built, bar0 is not read for vbios purposes anymore.
> >> +    pub(crate) fn new(pdev: &pci::Device, bar0: &Bar0) -> Result<Vbios> {
> >> +        // Images to extract from iteration
> >> +        let mut pci_at_image: Option<PciAtBiosImage> = None;
> >> +        let mut first_fwsec_image: Option<FwSecBiosImage> = None;
> >> +        let mut second_fwsec_image: Option<FwSecBiosImage> = None;
> >> +
> >> +        // Parse all VBIOS images in the ROM
> >> +        for image_result in VbiosIterator::new(pdev, bar0)? {
> >> +            let full_image = image_result?;
> >> +
> >> +            dev_info!(
> > 
> > Let's use dev_dbg!() instead.
> 
> Done.
> 
> > 
> >> +                pdev.as_ref(),
> >> +                "Found BIOS image: size: {:#x}, type: {}, last: {}\n",
> >> +                full_image.image_size_bytes()?,
> >> +                full_image.image_type_str(),
> >> +                full_image.is_last()
> >> +            );
> >> +
> >> +            // Get references to images we will need after the loop, in order to
> >> +            // setup the falcon data offset.
> >> +            match full_image {
> >> +                BiosImage::PciAt(image) => {
> >> +                    pci_at_image = Some(image);
> >> +                }
> >> +                BiosImage::FwSec(image) => {
> >> +                    if first_fwsec_image.is_none() {
> >> +                        first_fwsec_image = Some(image);
> >> +                    } else {
> >> +                        second_fwsec_image = Some(image);
> >> +                    }
> >> +                }
> >> +                // For now we don't need to handle these
> >> +                BiosImage::Efi(_image) => {}
> >> +                BiosImage::Nbsi(_image) => {}
> >> +            }
> >> +        }
> >> +
> >> +        // Using all the images, setup the falcon data pointer in Fwsec.
> >> +        // We need mutable access here, so we handle the Option manually.
> >> +        let final_fwsec_image = {
> >> +            let mut second = second_fwsec_image; // Take ownership of the option
> >> +
> >> +            if let (Some(second), Some(first), Some(pci_at)) =
> >> +                (second.as_mut(), first_fwsec_image, pci_at_image)
> >> +            {
> >> +                second
> >> +                    .setup_falcon_data(pdev, &pci_at, &first)
> >> +                    .inspect_err(|e| {
> >> +                        dev_err!(pdev.as_ref(), "Falcon data setup failed: {:?}\n", e)
> >> +                    })?;
> >> +            } else {
> >> +                dev_err!(
> >> +                    pdev.as_ref(),
> >> +                    "Missing required images for falcon data setup, skipping\n"
> >> +                );
> >> +                return Err(EINVAL);
> > 
> > This means that if second == None we fail, which makes sense, so why store an
> > Option in Vbios? All methods of Vbios fail if fwsec_image == None.
> > 
> 
> Well, if first and pci_at are None, we will fail as well. Not just second. But
> we don't know until we finish parsing all the images in the prior loop, if we
> found all the images. So we store it as Option during the prior loop, and check
> it later. Right?

My point is not that second is an option within this function -- that's fine. I
don't want the Vbios type to store an Option, because that doesn't make sense.
I.e. it should be

	struct Vbios {
	   fwsec_image: FwSecBiosImage,
	}

or just

	struct Vbios(FwSecBiosImage);

which is the same, rather than

	struct Vbios {
	   fwsec_image: Option<FwSecBiosImage>,
	}

because Vbios::new() fails anyway if any of the images is None, i.e.
vbios.fwsec_image can't ever be None.

The code below does that for you, i.e. it returns an instance of Vbios without
the inner Option.

> >> +            }
> >> +            second
> >> +        };
> > 
> > I think this should be:
> > 
> > 	let mut second = second_fwsec_image;
> > 	
> > 	if let (Some(second), Some(first), Some(pci_at)) =
> > 	    (second.as_mut(), first_fwsec_image, pci_at_image)
> > 	{
> > 	    second
> > 	        .setup_falcon_data(pdev, &pci_at, &first)
> > 	        .inspect_err(|e| {
> > 	            dev_err!(pdev.as_ref(), "Falcon data setup failed: {:?}\n", e)
> > 	        })?;
> > 	
> > 	    Ok(Vbios(second)
> > 	} else {
> > 	    dev_err!(
> > 	        pdev.as_ref(),
> > 	        "Missing required images for falcon data setup, skipping\n"
> > 	    );
> > 	
> > 	    Err(EINVAL)
> > 	}
> > 
> > where Vbios can just be
> > 
> > 	pub(crate) struct Vbios(FwSecBiosImage);
> 
> But your suggestion here still considers second as an Option? That's why you
> wrote 'Some(second)' ?

Yes, that's fine, see above. The difference is that the code returns you an
instance of

	struct Vbios(FwSecBiosImage);

rather than

	struct Vbios {
	   fwsec_image: Option<FwSecBiosImage>,
	}

which is unnecessary.
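
To illustrate the difference outside of the driver, here is a minimal sketch
in plain std Rust. `FwSecImage`, `Error` and this simplified `Vbios::new()`
are hypothetical stand-ins rather than the nova-core types; the point is only
that the Options stay local to the constructor and never reach the returned
type.

```rust
// Hypothetical stand-in for FwSecBiosImage; not the nova-core type.
#[derive(Debug, PartialEq)]
struct FwSecImage {
    id: u32,
}

// Hypothetical error type standing in for the kernel's EINVAL.
#[derive(Debug, PartialEq)]
enum Error {
    MissingImage,
}

// The final type stores the image directly: once a `Vbios` exists,
// the image is guaranteed to be present.
#[derive(Debug, PartialEq)]
struct Vbios(FwSecImage);

impl Vbios {
    fn new(found: impl IntoIterator<Item = FwSecImage>) -> Result<Self, Error> {
        // `Option` is only a local detail of the scan.
        let mut first: Option<FwSecImage> = None;
        let mut second: Option<FwSecImage> = None;

        for image in found {
            if first.is_none() {
                first = Some(image);
            } else {
                second = Some(image);
            }
        }

        // Either every required image was found, or construction fails.
        match (first, second) {
            (Some(_first), Some(second)) => Ok(Vbios(second)),
            _ => Err(Error::MissingImage),
        }
    }
}
```

Callers of such a type never need an `ok_or(EINVAL)` on the field, because a
missing image is already rejected in the constructor.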

> 
> > 
> >> +
> >> +        Ok(Vbios {
> >> +            fwsec_image: final_fwsec_image,
> >> +        })
> >> +    }
> >> +
> >> +    pub(crate) fn fwsec_header(&self, pdev: &device::Device) -> Result<&FalconUCodeDescV3> {
> >> +        let image = self.fwsec_image.as_ref().ok_or(EINVAL)?;
> >> +        image.fwsec_header(pdev)
> >> +    }
> >> +
> >> +    pub(crate) fn fwsec_ucode(&self, pdev: &device::Device) -> Result<&[u8]> {
> >> +        let image = self.fwsec_image.as_ref().ok_or(EINVAL)?;
> >> +        image.fwsec_ucode(pdev, image.fwsec_header(pdev)?)
> >> +    }
> >> +
> >> +    pub(crate) fn fwsec_sigs(&self, pdev: &device::Device) -> Result<&[u8]> {
> >> +        let image = self.fwsec_image.as_ref().ok_or(EINVAL)?;
> >> +        image.fwsec_sigs(pdev, image.fwsec_header(pdev)?)
> >> +    }
> > 
> > Those then become infallible, e.g.
> > 
> > 	pub(crate) fn fwsec_sigs(&self, pdev: &device::Device) -> &[u8] {
> > 	    self.0.fwsec_sigs(pdev, self.fwsec_header(pdev))
> > 	}
> > 
> 
> Nope, I think you are wrong there. fwsec_sigs() of the underlying .0 returns a
> Result.

That's true, I confused self.fwsec_sigs() with self.0.fwsec_sigs(). It seems
that you may want to implement Deref for Vbios.

Also, can you please double check the Options in FwSecBiosImage (in case we
didn't talk about them yet)? They look quite suspicious too.

In general, I feel like a lot of those Option come from a programming pattern
that is very common in C, i.e. allocate a structure (stack or heap) and then
initialize its fields.

In Rust you should aim to initialize all the fields of a structure when you
create the instance. Option as a return type of a function is common, but it's
always a bit suspicious when there is an Option field in a struct.

I understand that there are cases where we can't omit it, and for obvious
reasons the Vbios code is probably a perfect example for that.

However, I recommend looking at this from top to bottom: Do the "final"
structures that we expose to the driver from the Vbios module have fields that
are *really* optional? Or is the Option type just a result from the parsing
process?

If it's the latter, we should get rid of it and work with a different type
during the parsing process and then create the final instance that is exposed to
the driver at the end.

For instance FwSecBiosImage is defined as:

	pub(crate) struct FwSecBiosImage {
	    base: BiosImageBase,
	    falcon_data_offset: Option<usize>,
	    pmu_lookup_table: Option<PmuLookupTable>,
	    falcon_ucode_offset: Option<usize>,
	}

Do only *some* FwSecBiosImage instances have a falcon_ucode_offset?

If the answer is 'no' then it shouldn't be an Option. If the answer is 'yes',
then this indicates that FwSecBiosImage is probably too generic and should be
split into more specific types of a FwSecBiosImage which instead share a common
trait in order to treat the different types generically.
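
As a rough sketch of what such a split could look like, assuming the answer
were 'yes': all names below (`FwSecImage`, `FwSecWithoutUcode`,
`FwSecWithUcode`, `total_size()`) are made up for illustration and are not
taken from the patch. The common trait covers what every image has, and only
the type that really carries a ucode offset has the field, without an Option.

```rust
// Hypothetical stand-in for BiosImageBase.
struct BiosImageBase {
    data: Vec<u8>,
}

// Operations every FWSEC image supports, whatever its concrete type.
trait FwSecImage {
    fn base(&self) -> &BiosImageBase;

    fn size(&self) -> usize {
        self.base().data.len()
    }
}

// An image without falcon ucode: there is no offset field at all.
struct FwSecWithoutUcode {
    base: BiosImageBase,
}

// An image with falcon ucode: the offset is always present.
struct FwSecWithUcode {
    base: BiosImageBase,
    falcon_ucode_offset: usize,
}

impl FwSecImage for FwSecWithoutUcode {
    fn base(&self) -> &BiosImageBase {
        &self.base
    }
}

impl FwSecImage for FwSecWithUcode {
    fn base(&self) -> &BiosImageBase {
        &self.base
    }
}

impl FwSecWithUcode {
    // Only this type can hand out ucode, so no `Option` is involved.
    // Assumes `falcon_ucode_offset <= data.len()`; real code would validate it.
    fn ucode(&self) -> &[u8] {
        &self.base.data[self.falcon_ucode_offset..]
    }
}

// Generic code goes through the trait and does not care about the offset.
fn total_size(images: &[&dyn FwSecImage]) -> usize {
    images.iter().map(|image| image.size()).sum()
}
```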

> Also in Vbios::new(), I extract the Option when returning:
> 
>         Ok(Vbios {
>             fwsec_image: final_fwsec_image.ok_or(EINVAL)?,
>         })

Maybe you do so in your tree? v3 of the patch series has:

	pub(crate) struct Vbios {
	   pub fwsec_image: Option<FwSecBiosImage>,
	}

and

	Ok(Vbios {
	   fwsec_image: final_fwsec_image,
	})

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3 16/19] nova-core: Add support for VBIOS ucode extraction for boot
  2025-05-20  9:30       ` Danilo Krummrich
@ 2025-05-20 13:43         ` Joel Fernandes
  2025-05-20 15:01           ` Danilo Krummrich
  0 siblings, 1 reply; 51+ messages in thread
From: Joel Fernandes @ 2025-05-20 13:43 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Alexandre Courbot, Miguel Ojeda, Alex Gaynor, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Alice Ryhl, Trevor Gross, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet, John Hubbard, Ben Skeggs, Timur Tabi,
	Alistair Popple, linux-kernel, rust-for-linux, nouveau, dri-devel,
	Shirish Baskaran



On 5/20/2025 5:30 AM, Danilo Krummrich wrote:
> On Tue, May 20, 2025 at 03:55:06AM -0400, Joel Fernandes wrote:
>> On 5/13/2025 1:19 PM, Danilo Krummrich wrote:
>>> On Wed, May 07, 2025 at 10:52:43PM +0900, Alexandre Courbot wrote:
>>>> @@ -238,6 +239,8 @@ pub(crate) fn new(
>>>>  
>>>>          let _sec2_falcon = Falcon::<Sec2>::new(pdev.as_ref(), spec.chipset, bar, true)?;
>>>>  
>>>> +        let _bios = Vbios::new(pdev, bar)?;
>>>
>>> Please add a comment why, even though unused, it is important to create this
>>> instance.
>>>
>>> Also, please use `_` if it's not intended to ever be used.
>>
>> If I add a comment, it will simply be removed by the next patch. I can add that
>> though so it makes it more clear.
> 
> I recommend adding such comments, because then reviewers don't stumble over it.
> :-)

Point taken and fixed! ;-)

>>>
>>>> +                pdev.as_ref(),
>>>> +                "Found BIOS image: size: {:#x}, type: {}, last: {}\n",
>>>> +                full_image.image_size_bytes()?,
>>>> +                full_image.image_type_str(),
>>>> +                full_image.is_last()
>>>> +            );
>>>> +
>>>> +            // Get references to images we will need after the loop, in order to
>>>> +            // setup the falcon data offset.
>>>> +            match full_image {
>>>> +                BiosImage::PciAt(image) => {
>>>> +                    pci_at_image = Some(image);
>>>> +                }
>>>> +                BiosImage::FwSec(image) => {
>>>> +                    if first_fwsec_image.is_none() {
>>>> +                        first_fwsec_image = Some(image);
>>>> +                    } else {
>>>> +                        second_fwsec_image = Some(image);
>>>> +                    }
>>>> +                }
>>>> +                // For now we don't need to handle these
>>>> +                BiosImage::Efi(_image) => {}
>>>> +                BiosImage::Nbsi(_image) => {}
>>>> +            }
>>>> +        }
>>>> +
>>>> +        // Using all the images, setup the falcon data pointer in Fwsec.
>>>> +        // We need mutable access here, so we handle the Option manually.
>>>> +        let final_fwsec_image = {
>>>> +            let mut second = second_fwsec_image; // Take ownership of the option
>>>> +
>>>> +            if let (Some(second), Some(first), Some(pci_at)) =
>>>> +                (second.as_mut(), first_fwsec_image, pci_at_image)
>>>> +            {
>>>> +                second
>>>> +                    .setup_falcon_data(pdev, &pci_at, &first)
>>>> +                    .inspect_err(|e| {
>>>> +                        dev_err!(pdev.as_ref(), "Falcon data setup failed: {:?}\n", e)
>>>> +                    })?;
>>>> +            } else {
>>>> +                dev_err!(
>>>> +                    pdev.as_ref(),
>>>> +                    "Missing required images for falcon data setup, skipping\n"
>>>> +                );
>>>> +                return Err(EINVAL);
>>>
>>> This means that if second == None we fail, which makes sense, so why store an
>>> Option in Vbios? All methods of Vbios fail if fwsec_image == None.
>>>
>>
>> Well, if first and pci_at are None, we will fail as well. Not just second. But
>> we don't know until we finish parsing all the images in the prior loop, if we
>> found all the images. So we store it as Option during the prior loop, and check
>> it later. Right?
> 
> My point is not that second is an option within this function -- that's fine. I
> don't want the Vbios type to store an Option, because that doesn't make sense.
> I.e. it should be
> 
> 	struct Vbios {
> 	   fwsec_image: FwSecBiosImage,
> 	}
> 
> or just
> 
> 	struct Vbios(FwSecBiosImage);
> 
> which is the same, rather than
> 
> 	struct Vbios {
> 	   fwsec_image: Option<FwSecBiosImage>,
> 	}
> 
> because Vbios::new() fails anyway if any of the images is None, i.e.
> vbios.fwsec_image can't ever be None.
> 
> The code below does that for you, i.e. it returns an instance of Vbios without
> the inner Option.

But your code below does Vbios(second) where second is an Option.

> 
>>>> +            }
>>>> +            second
>>>> +        };
>>>
>>> I think this should be:
>>>
>>> 	let mut second = second_fwsec_image;
>>> 	
>>> 	if let (Some(second), Some(first), Some(pci_at)) =
>>> 	    (second.as_mut(), first_fwsec_image, pci_at_image)
>>> 	{
>>> 	    second
>>> 	        .setup_falcon_data(pdev, &pci_at, &first)
>>> 	        .inspect_err(|e| {
>>> 	            dev_err!(pdev.as_ref(), "Falcon data setup failed: {:?}\n", e)
>>> 	        })?;
>>> 	
>>> 	    Ok(Vbios(second)

I can't do that because second is a mutable reference in the above snippet.

But this works:
             Ok(Vbios { fwsec_image: second.take().ok_or(EINVAL)? })

(This did require changing 'Some(second)' to 'Some(second_ref)', see below.)

>>> 	} else {
>>> 	    dev_err!(
>>> 	        pdev.as_ref(),
>>> 	        "Missing required images for falcon data setup, skipping\n"
>>> 	    );
>>> 	
>>> 	    Err(EINVAL)
>>> 	}
>>>
>>> where Vbios can just be
>>>
>>> 	pub(crate) struct Vbios(FwSecBiosImage);
>>
>> But your suggestion here still considers second as an Option? That's why you
>> wrote 'Some(second)' ?
> 
> Yes, that's fine, see above. The difference is that the code returns you an
> instance of
> 
> 	struct Vbios(FwSecBiosImage);
> 
> rather than
> 
> 	struct Vbios {
> 	   fwsec_image: Option<FwSecBiosImage>,
> 	}
> 
> which is unnecessary.

Sure, ok, yeah I made this change in another thread we are discussing so we are
good.

So the code here now looks like the below, definitely better, thanks! :

            if let (Some(second_ref), Some(first), Some(pci_at)) =
                (second.as_mut(), first_fwsec_image, pci_at_image)
            {
                second_ref
                    .setup_falcon_data(pdev, &pci_at, &first)
                    .inspect_err(|e| {
                        dev_err!(..)
                    })?;
                Ok(Vbios { fwsec_image: second.take().ok_or(EINVAL)? })
            } else {
                dev_err!(
                    pdev.as_ref(),
                    "Missing required images for falcon data setup, skipping\n"
                );
                Err(EINVAL)
            }

>>>> +
>>>> +        Ok(Vbios {
>>>> +            fwsec_image: final_fwsec_image,
>>>> +        })
>>>> +    }
>>>> +
>>>> +    pub(crate) fn fwsec_header(&self, pdev: &device::Device) -> Result<&FalconUCodeDescV3> {
>>>> +        let image = self.fwsec_image.as_ref().ok_or(EINVAL)?;
>>>> +        image.fwsec_header(pdev)
>>>> +    }
>>>> +
>>>> +    pub(crate) fn fwsec_ucode(&self, pdev: &device::Device) -> Result<&[u8]> {
>>>> +        let image = self.fwsec_image.as_ref().ok_or(EINVAL)?;
>>>> +        image.fwsec_ucode(pdev, image.fwsec_header(pdev)?)
>>>> +    }
>>>> +
>>>> +    pub(crate) fn fwsec_sigs(&self, pdev: &device::Device) -> Result<&[u8]> {
>>>> +        let image = self.fwsec_image.as_ref().ok_or(EINVAL)?;
>>>> +        image.fwsec_sigs(pdev, image.fwsec_header(pdev)?)
>>>> +    }
>>>
>>> Those then become infallible, e.g.
>>>
>>> 	pub(crate) fn fwsec_sigs(&self, pdev: &device::Device) -> &[u8] {
>>> 	    self.0.fwsec_sigs(pdev, self.fwsec_header(pdev))
>>> 	}
>>>
>>
>> Nope, I think you are wrong there. fwsec_sigs() of the underlying .0 returns a
>> Result.
> 
> That's true, I confused self.fwsec_sigs() with self.0.fwsec_sigs(). It seems
> that you may want to implement Deref for Vbios.
> 
> Also, can you please double check the Options in FwSecBiosImage (in case we
> didn't talk about them yet)? They look quite suspicious too.


> In general, I feel like a lot of those Option come from a programming pattern
> that is very common in C, i.e. allocate a structure (stack or heap) and then
> initialize its fields.
> 
> In Rust you should aim to initialize all the fields of a structure when you
> create the instance. Option as a return type of a function is common, but it's
> always a bit suspicious when there is an Option field in a struct.

I looked into it, I could not get rid of those ones because we need to
initialize in the "impl TryFrom<BiosImageBase> for BiosImage {"

            0xE0 => Ok(BiosImage::FwSec(FwSecBiosImage {
                base,
                falcon_data_offset: None,
                pmu_lookup_table: None,
                falcon_ucode_offset: None,
            })),

And these fields will not be determined until much later, because as is the case
with the earlier example, these fields cannot be determined until all the images
are parsed.

> I understand that there are cases where we can't omit it, and for obvious
> reasons the Vbios code is probably a perfect example for that.
> 
> However, I recommend looking at this from top to bottom: Do the "final"
> structures that we expose to the driver from the Vbios module have fields that
> are *really* optional? Or is the Option type just a result from the parsing
> process?
> 
> If it's the latter, we should get rid of it and work with a different type
> during the parsing process and then create the final instance that is exposed to
> the driver at the end.
> 
> For instance FwSecBiosImage is defined as:
> 
> 	pub(crate) struct FwSecBiosImage {
> 	    base: BiosImageBase,
> 	    falcon_data_offset: Option<usize>,
> 	    pmu_lookup_table: Option<PmuLookupTable>,
> 	    falcon_ucode_offset: Option<usize>,
> 	}
> 
> Do only *some* FwSecBiosImage instances have a falcon_ucode_offset?
> 
> If the answer is 'no' then it shouldn't be an Option. If the answer is 'yes',
> then this indicates that FwSecBiosImage is probably too generic and should be
> split into more specific types of a FwSecBiosImage which instead share a common
> trait in order to treat the different types generically.

Understood, thanks.

>> Also in Vbios::new(), I extract the Option when returning:
>>
>>         Ok(Vbios {
>>             fwsec_image: final_fwsec_image.ok_or(EINVAL)?,
>>         })
> 
> Maybe you do so in your tree? v3 of the patch series has:
> 
> 	pub(crate) struct Vbios {
> 	   pub fwsec_image: Option<FwSecBiosImage>,
> 	}
> 
> and
> 
> 	Ok(Vbios {
> 	   fwsec_image: final_fwsec_image,
> 	})

Yes, I made the change during our review on the other thread and it will be
included in the next posting. Sorry for any confusion.

thanks,

 - Joel





^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v3 16/19] nova-core: Add support for VBIOS ucode extraction for boot
  2025-05-20 13:43         ` Joel Fernandes
@ 2025-05-20 15:01           ` Danilo Krummrich
  2025-05-20 15:11             ` Joel Fernandes
  0 siblings, 1 reply; 51+ messages in thread
From: Danilo Krummrich @ 2025-05-20 15:01 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: Alexandre Courbot, Miguel Ojeda, Alex Gaynor, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Alice Ryhl, Trevor Gross, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet, John Hubbard, Ben Skeggs, Timur Tabi,
	Alistair Popple, linux-kernel, rust-for-linux, nouveau, dri-devel,
	Shirish Baskaran

On Tue, May 20, 2025 at 09:43:42AM -0400, Joel Fernandes wrote:
> On 5/20/2025 5:30 AM, Danilo Krummrich wrote:
> > On Tue, May 20, 2025 at 03:55:06AM -0400, Joel Fernandes wrote:
> >> On 5/13/2025 1:19 PM, Danilo Krummrich wrote:
> >>> On Wed, May 07, 2025 at 10:52:43PM +0900, Alexandre Courbot wrote:
> 
> So the code here now looks like the below, definitely better, thanks! :
> 
>             if let (Some(second_ref), Some(first), Some(pci_at)) =
>                 (second.as_mut(), first_fwsec_image, pci_at_image)
>             {
>                 second_ref
>                     .setup_falcon_data(pdev, &pci_at, &first)
>                     .inspect_err(|e| {
>                         dev_err!(..)
>                     })?;
>                 Ok(Vbios { fwsec_image: second.take().ok_or(EINVAL)? })
>             } else {
>                 dev_err!(
>                     pdev.as_ref(),
>                     "Missing required images for falcon data setup, skipping\n"
>                 );
>                 Err(EINVAL)
>             }

Sorry, my code snippet was incorrect indeed. Let me paste what I actually
intended (and this time properly compile checked) and should be even better:

	if let (Some(mut second), Some(first), Some(pci_at)) =
	    (second_fwsec_image, first_fwsec_image, pci_at_image)
	{
	    second
	        .setup_falcon_data(pdev, &pci_at, &first)
	        .inspect_err(|e| {
	            dev_err!(pdev.as_ref(), "Falcon data setup failed: {:?}\n", e)
	        })?;
	
	    Ok(Vbios(second))
	} else {
	    dev_err!(
	        pdev.as_ref(),
	        "Missing required images for falcon data setup, skipping\n"
	    );
	
	    Err(EINVAL)
	}

So, with this second is the actual value and not just a reference. :)
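
The same ownership point can be shown in a small standalone sketch. `Image`,
`Wrapper`, `setup()` and `finish()` are hypothetical names used only for this
illustration. Matching on the tuple of Options by value moves the images out,
and the `mut` binding allows the mutating call without `as_mut()` or a later
`take()`.

```rust
// Hypothetical stand-in for FwSecBiosImage.
#[derive(Debug, PartialEq)]
struct Image {
    falcon_data_offset: usize,
}

impl Image {
    // Hypothetical stand-in for `setup_falcon_data()`: mutates `self`
    // based on another image and may fail.
    fn setup(&mut self, base: &Image) -> Result<(), &'static str> {
        self.falcon_data_offset += base.falcon_data_offset;
        Ok(())
    }
}

// Hypothetical stand-in for `Vbios(FwSecBiosImage)`.
struct Wrapper(Image);

fn finish(first: Option<Image>, second: Option<Image>) -> Result<Wrapper, &'static str> {
    // Both Options are moved into the tuple, so `second` is an owned,
    // mutable `Image` here rather than a `&mut Image`.
    if let (Some(first), Some(mut second)) = (first, second) {
        second.setup(&first)?;
        Ok(Wrapper(second))
    } else {
        Err("missing image")
    }
}
```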

And the methods can become:

	pub(crate) fn fwsec_header(&self, pdev: &device::Device) -> Result<&FalconUCodeDescV3> {
	    self.0.fwsec_header(pdev)
	}
	
	pub(crate) fn fwsec_ucode(&self, pdev: &device::Device) -> Result<&[u8]> {
	    self.0.fwsec_ucode(pdev, self.fwsec_header(pdev)?)
	}
	
	pub(crate) fn fwsec_sigs(&self, pdev: &device::Device) -> Result<&[u8]> {
	    self.0.fwsec_sigs(pdev, self.fwsec_header(pdev)?)
	}

However, I don't understand why they're not just implemented for FwSecBiosImage
itself this way. You can just implement Deref for Vbios then.
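
A minimal sketch of that idea, with simplified stand-in types (the `ucode`
field and the `fwsec_ucode()` signature here are assumptions for the example,
not the real ones): the accessor lives on the inner type, and Deref makes it
callable on the wrapper without a forwarding method.

```rust
use core::ops::Deref;

// Hypothetical, simplified stand-in for FwSecBiosImage.
struct FwSecImage {
    ucode: Vec<u8>,
}

impl FwSecImage {
    // The accessor is implemented on the image type itself.
    fn fwsec_ucode(&self) -> &[u8] {
        &self.ucode
    }
}

struct Vbios(FwSecImage);

// With Deref, `vbios.fwsec_ucode()` resolves to the method above.
impl Deref for Vbios {
    type Target = FwSecImage;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}
```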

> > In general, I feel like a lot of those Option come from a programming pattern
> > that is very common in C, i.e. allocate a structure (stack or heap) and then
> > initialize its fields.
> > 
> > In Rust you should aim to initialize all the fields of a structure when you
> > create the instance. Option as a return type of a function is common, but it's
> > always a bit suspicious when there is an Option field in a struct.
> 
> I looked into it, I could not get rid of those ones because we need to
> initialize in the "impl TryFrom<BiosImageBase> for BiosImage {"
> 
>             0xE0 => Ok(BiosImage::FwSec(FwSecBiosImage {
>                 base,
>                 falcon_data_offset: None,
>                 pmu_lookup_table: None,
>                 falcon_ucode_offset: None,
>             })),
> 
> And these fields will not be determined until much later, because as is the case
> with the earlier example, these fields cannot be determined until all the images
> are parsed.

You should not use TryFrom, but instead use a normal constructor, such as

	BiosImage::new(base_bios_image)

and do the parsing within this constructor.

If you want a helper type with Options while parsing that's totally fine, but
the final result can clearly be without Options. For instance:

	struct Data {
	   image: KVec<u8>,
	}

	impl Data {
	   fn new() -> Result<Self> {
	      let parser = DataParser::new();

	      Ok(Self { image: parser.parse()? })
	   }

	   fn load_image(&self) {
	      ...
	   }
	}

	struct DataParser {
	   // Only some images have a checksum.
	   checksum: Option<u64>,
	   // Some images have an extra offset.
	   offset: Option<u64>,
	   // Some images need to be patched.
	   patch: Option<KVec<u8>>,
	   image: KVec<u8>,
	}

	impl DataParser {
	   fn new() -> Self {
	      Self {
	         checksum: None,
	         offset: None,
	         patch: None,
	         image: KVec::new(),
	      }
	   }

	   fn parse(mut self) -> Result<KVec<u8>> {
	      // Fetch all the required data.
	      self.fetch_checksum()?;
	      self.fetch_offset()?;
	      self.fetch_patch()?;
	      self.fetch_bytes()?;

	      // Doesn't do anything if `checksum == None`.
	      self.validate_checksum()?;

	      // Doesn't do anything if `offset == None`.
	      self.apply_offset()?;

	      // Doesn't do anything if `patch == None`.
	      self.apply_patch()?;

	      // Return the final image.
	      self.image
	   }
	}

I think the pattern here is the same, but in this example you keep working with
the DataParser, instead of a new instance of Data.
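
To make the pattern above concrete, here is a minimal, self-contained sketch
that uses only the standard library (`Vec` instead of `KVec`, a local error
enum instead of the kernel `Result`). All names in it (`Image`, `ImageParser`,
`ParseError`) and the trailing-checksum layout are made up for illustration;
they are not nova-core APIs:

```rust
// Hypothetical, std-only sketch; nothing here is taken from the driver.

/// Final type: every field is known once construction has succeeded.
#[derive(Debug, PartialEq)]
struct Image {
    payload: Vec<u8>,
    checksum: u8,
}

#[derive(Debug, PartialEq)]
enum ParseError {
    Truncated,
    BadChecksum,
}

/// Parse-time helper: fields start out as `None` and are filled in step by step.
#[derive(Default)]
struct ImageParser {
    payload: Option<Vec<u8>>,
    checksum: Option<u8>,
}

impl ImageParser {
    /// Assumed layout: the last byte is the wrapping sum of all preceding bytes.
    fn feed(&mut self, raw: &[u8]) -> Result<(), ParseError> {
        let (sum, body) = raw.split_last().ok_or(ParseError::Truncated)?;
        self.payload = Some(body.to_vec());
        self.checksum = Some(*sum);
        Ok(())
    }

    /// Consumes the parser; the `Option`s are resolved here and nowhere else.
    fn finish(self) -> Result<Image, ParseError> {
        let payload = self.payload.ok_or(ParseError::Truncated)?;
        let checksum = self.checksum.ok_or(ParseError::Truncated)?;
        let actual = payload.iter().fold(0u8, |acc, b| acc.wrapping_add(*b));
        if actual != checksum {
            return Err(ParseError::BadChecksum);
        }
        Ok(Image { payload, checksum })
    }
}

impl Image {
    /// The only way to obtain an `Image`; callers never see the parser.
    fn new(raw: &[u8]) -> Result<Self, ParseError> {
        let mut parser = ImageParser::default();
        parser.feed(raw)?;
        parser.finish()
    }
}
```

The point of the split is that code holding an `Image` never has to ask
whether a field has been filled in yet.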

> > I understand that there are cases where we can't omit it, and for obvious
> > reasons the Vbios code is probably a perfect example for that.
> > 
> > However, I recommend looking at this from top to bottom: Do the "final"
> > structures that we expose to the driver from the Vbios module have fields that
> > are *really* optional? Or is the Option type just a result from the parsing
> > process?
> > 
> > If it's the latter, we should get rid of it and work with a different type
> > during the parsing process and then create the final instance that is exposed to
> > the driver at the end.
> > 
> > For instance FwSecBiosImage is defined as:
> > 
> > 	pub(crate) struct FwSecBiosImage {
> > 	    base: BiosImageBase,
> > 	    falcon_data_offset: Option<usize>,
> > 	    pmu_lookup_table: Option<PmuLookupTable>,
> > 	    falcon_ucode_offset: Option<usize>,
> > 	}
> > 
> > Do only *some* FwSecBiosImage instances have a falcon_ucode_offset?
> > 
> > If the answer is 'no' then it shouldn't be an Option. If the answer is 'yes',
> > then this indicates that FwSecBiosImage is probably too generic and should be
> > split into more specific types of a FwSecBiosImage which instead share a common
> > trait in order to treat the different types generically.
> 
> Understood, thanks.
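
For the 'yes' branch of the question above, a rough sketch of what "more
specific types sharing a common trait" could look like. This is purely
hypothetical: the trait, the two image flavours and their fields are invented
for illustration, and nothing here claims that real FWSEC images differ this
way:

```rust
// Hypothetical sketch: type names, fields and the trait are illustrative only.

/// Behaviour shared by every (imagined) FWSEC image flavour.
trait FwSecImage {
    /// Every flavour has a ucode offset, so this is not an `Option`.
    fn falcon_ucode_offset(&self) -> usize;

    /// Flavours without a PMU lookup table simply report zero entries.
    fn pmu_entry_count(&self) -> usize {
        0
    }
}

/// Flavour that carries a PMU lookup table.
struct FwSecWithPmuTable {
    falcon_ucode_offset: usize,
    pmu_entries: Vec<u8>,
}

/// Flavour that has no PMU lookup table at all.
struct FwSecPlain {
    falcon_ucode_offset: usize,
}

impl FwSecImage for FwSecWithPmuTable {
    fn falcon_ucode_offset(&self) -> usize {
        self.falcon_ucode_offset
    }

    fn pmu_entry_count(&self) -> usize {
        self.pmu_entries.len()
    }
}

impl FwSecImage for FwSecPlain {
    fn falcon_ucode_offset(&self) -> usize {
        self.falcon_ucode_offset
    }
}

/// Generic code works on the trait and never inspects an `Option`.
fn describe(image: &dyn FwSecImage) -> String {
    format!(
        "ucode at {:#x}, {} PMU entries",
        image.falcon_ucode_offset(),
        image.pmu_entry_count()
    )
}
```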


* Re: [PATCH v3 16/19] nova-core: Add support for VBIOS ucode extraction for boot
  2025-05-20 15:01           ` Danilo Krummrich
@ 2025-05-20 15:11             ` Joel Fernandes
  2025-05-20 15:36               ` Danilo Krummrich
  0 siblings, 1 reply; 51+ messages in thread
From: Joel Fernandes @ 2025-05-20 15:11 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Alexandre Courbot, Miguel Ojeda, Alex Gaynor, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Alice Ryhl, Trevor Gross, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet, John Hubbard, Ben Skeggs, Timur Tabi,
	Alistair Popple, linux-kernel, rust-for-linux, nouveau, dri-devel,
	Shirish Baskaran



On 5/20/2025 11:01 AM, Danilo Krummrich wrote:
> On Tue, May 20, 2025 at 09:43:42AM -0400, Joel Fernandes wrote:
>> On 5/20/2025 5:30 AM, Danilo Krummrich wrote:
>>> On Tue, May 20, 2025 at 03:55:06AM -0400, Joel Fernandes wrote:
>>>> On 5/13/2025 1:19 PM, Danilo Krummrich wrote:
>>>>> On Wed, May 07, 2025 at 10:52:43PM +0900, Alexandre Courbot wrote:
>>
>> So the code here now looks like the below, definitely better, thanks! :
>>
>>             if let (Some(second_ref), Some(first), Some(pci_at)) =
>>                 (second.as_mut(), first_fwsec_image, pci_at_image)
>>             {
>>                 second_ref
>>                     .setup_falcon_data(pdev, &pci_at, &first)
>>                     .inspect_err(|e| {
>>                         dev_err!(..)
>>                     })?;
>>                 Ok(Vbios { fwsec_image: second.take().ok_or(EINVAL)? })
>>             } else {
>>                 dev_err!(
>>                     pdev.as_ref(),
>>                     "Missing required images for falcon data setup, skipping\n"
>>                 );
>>                 Err(EINVAL)
>>             }
> 
> Sorry, my code-snipped was incorrect indeed. Let me paste what I actually
> intended (and this time properly compile checked) and should be even better:
> 
> 	if let (Some(mut second), Some(first), Some(pci_at)) =
> 	    (second_fwsec_image, first_fwsec_image, pci_at_image)
> 	{
> 	    second
> 	        .setup_falcon_data(pdev, &pci_at, &first)
> 	        .inspect_err(|e| {
> 	            dev_err!(pdev.as_ref(), "Falcon data setup failed: {:?}\n", e)
> 	        })?;
> 	
> 	    Ok(Vbios(second))
> 	} else {
> 	    dev_err!(
> 	        pdev.as_ref(),
> 	        "Missing required images for falcon data setup, skipping\n"
> 	    );
> 	
> 	    Err(EINVAL)
> 	}
> 
> So, with this second is the actual value and not just a reference. :)
> 
> And the methods can become:
> 
> 	pub(crate) fn fwsec_header(&self, pdev: &device::Device) -> Result<&FalconUCodeDescV3> {
> 	    self.0.fwsec_header(pdev)
> 	}
> 	
> 	pub(crate) fn fwsec_ucode(&self, pdev: &device::Device) -> Result<&[u8]> {
> 	    self.0.fwsec_ucode(pdev, self.fwsec_header(pdev)?)
> 	}
> 	
> 	pub(crate) fn fwsec_sigs(&self, pdev: &device::Device) -> Result<&[u8]> {
> 	    self.0.fwsec_sigs(pdev, self.fwsec_header(pdev)?)
> 	}

I made this change and it LGTM. Thanks! I did not do the '.0' though since I
want to keep the readability; let's see in the next revision if that looks good.
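
For reference, the by-value tuple match discussed above can be tried outside
the kernel with a small stand-alone sketch. `FwSec`, `Einval`, `build()` and
the offset arithmetic in `setup_falcon_data()` are stand-ins invented for this
example; only the shape of the `if let` mirrors the snippet in the thread:

```rust
// Hypothetical stand-ins; the real driver types and error codes differ.

#[derive(Debug, PartialEq)]
struct Einval;

#[derive(Debug, PartialEq)]
struct FwSec {
    base: usize,
    falcon_data_offset: usize,
}

impl FwSec {
    /// Made-up rule: the data offset is relative to the first image's base.
    fn setup_falcon_data(&mut self, pci_at_offset: usize, first: &FwSec) -> Result<(), Einval> {
        self.falcon_data_offset = pci_at_offset.checked_sub(first.base).ok_or(Einval)?;
        Ok(())
    }
}

/// Tuple struct: the inner image is reachable as `.0`.
#[derive(Debug, PartialEq)]
struct Vbios(FwSec);

fn build(
    second: Option<FwSec>,
    first: Option<FwSec>,
    pci_at_offset: Option<usize>,
) -> Result<Vbios, Einval> {
    // Matching on a tuple of owned `Option`s moves the values out, so
    // `second` is the image itself and can be stored without `take()`.
    if let (Some(mut second), Some(first), Some(pci_at_offset)) = (second, first, pci_at_offset) {
        second.setup_falcon_data(pci_at_offset, &first)?;
        Ok(Vbios(second))
    } else {
        Err(Einval)
    }
}
```

Because the `Option`s are consumed by the match, there is no second lookup
that could fail after the setup step has already succeeded.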

>>> In general, I feel like a lot of those Option come from a programming pattern
>>> that is very common in C, i.e. allocate a structure (stack or heap) and then
>>> initialize its fields.
>>>
>>> In Rust you should aim to initialize all the fields of a structure when you
>>> create the instance. Option as a return type of a function is common, but it's
>>> always a bit suspicious when there is an Option field in a struct.
>>
>> I looked into it, I could not git rid of those ones because we need to
>> initialize in the "impl TryFrom<BiosImageBase> for BiosImage {"
>>
>>             0xE0 => Ok(BiosImage::FwSec(FwSecBiosImage {
>>                 base,
>>                 falcon_data_offset: None,
>>                 pmu_lookup_table: None,
>>                 falcon_ucode_offset: None,
>>             })),
>>
>> And these fields will not be determined until much later, because as is the case
>> with the earlier example, these fields cannot be determined until all the images
>> are parsed.
> 
> You should not use TryFrom, but instead use a normal constructor, such as
> 
> 	BiosImage::new(base_bios_image)
> 
> and do the parsing within this constructor.
> 
> If you want a helper type with Options while parsing that's totally fine, but
> the final result can clearly be without Options. For instance:
> 
> 	struct Data {
> 	   image: KVec<u8>,
> 	}
> 
> 	impl Data {
> 	   fn new() -> Result<Self> {
> 	      let parser = DataParser::new();
> 
> 	      Self { image: parser.parse()? }
> 	   }
> 
> 	   fn load_image(&self) {
> 	      ...
> 	   }
> 	}
> 
> 	struct DataParser {
> 	   // Only some images have a checksum.
> 	   checksum: Option<u64>,
> 	   // Some images have an extra offset.
> 	   offset: Option<u64>,
> 	   // Some images need to be patched.
> 	   patch: Option<KVec<u8>>,
> 	   image: KVec<u8>,
> 	}
> 
> 	impl DataParser {
> 	   fn new() -> Self {
> 	      Self {
> 	         checksum: None,
> 	         offset: None,
> 	         patch: None,
> 	         bytes: KVec::new(),
> 	      }
> 	   }
> 
> 	   fn parse(self) -> Result<KVec<u8>> {
> 	      // Fetch all the required data.
> 	      self.fetch_checksum()?;
> 	      self.fetch_offset()?;
> 	      self.fetch_patch()?;
> 	      self.fetch_byes()?;
> 
> 	      // Doesn't do anything if `checksum == None`.
> 	      self.validate_checksum()?;
> 
> 	      // Doesn't do anything if `offset == None`.
> 	      self.apply_offset()?;
> 
> 	      // Doesn't do anything if `patch == None`.
> 	      self.apply_patch()?;
> 
> 	      // Return the final image.
> 	      self.image
> 	   }
> 	}
> 
> I think the pattern here is the same, but in this example you keep working with
> the DataParser, instead of a new instance of Data.

I think this would be a fundamental rewrite of the patch. I am Ok with looking
into it as a future item, but right now I am not sure if it justifies not using
Option for these few. There's a lot of immediate work we have to do for boot,
let's please not block the patch on just this if that's Ok with you. If you want,
I could add a TODO here.

thanks,

 - Joel



* Re: [PATCH v3 16/19] nova-core: Add support for VBIOS ucode extraction for boot
  2025-05-20 15:11             ` Joel Fernandes
@ 2025-05-20 15:36               ` Danilo Krummrich
  2025-05-20 16:02                 ` Joel Fernandes
  2025-05-20 18:13                 ` Joel Fernandes
  0 siblings, 2 replies; 51+ messages in thread
From: Danilo Krummrich @ 2025-05-20 15:36 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: Alexandre Courbot, Miguel Ojeda, Alex Gaynor, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Alice Ryhl, Trevor Gross, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet, John Hubbard, Ben Skeggs, Timur Tabi,
	Alistair Popple, linux-kernel, rust-for-linux, nouveau, dri-devel,
	Shirish Baskaran

On Tue, May 20, 2025 at 11:11:12AM -0400, Joel Fernandes wrote:
> On 5/20/2025 11:01 AM, Danilo Krummrich wrote:
> 
> I made this change and it LGTM. Thanks! I did not do the '.0' though since I
> want to keep the readability, lets see in the next revision if that looks good.

I think readability is just as good with `.0`, but I'm fine with either.

> >>> In general, I feel like a lot of those Option come from a programming pattern
> >>> that is very common in C, i.e. allocate a structure (stack or heap) and then
> >>> initialize its fields.
> >>>
> >>> In Rust you should aim to initialize all the fields of a structure when you
> >>> create the instance. Option as a return type of a function is common, but it's
> >>> always a bit suspicious when there is an Option field in a struct.
> >>
> >> I looked into it, I could not git rid of those ones because we need to
> >> initialize in the "impl TryFrom<BiosImageBase> for BiosImage {"
> >>
> >>             0xE0 => Ok(BiosImage::FwSec(FwSecBiosImage {
> >>                 base,
> >>                 falcon_data_offset: None,
> >>                 pmu_lookup_table: None,
> >>                 falcon_ucode_offset: None,
> >>             })),
> >>
> >> And these fields will not be determined until much later, because as is the case
> >> with the earlier example, these fields cannot be determined until all the images
> >> are parsed.
> > 
> > You should not use TryFrom, but instead use a normal constructor, such as
> > 
> > 	BiosImage::new(base_bios_image)
> > 
> > and do the parsing within this constructor.
> > 
> > If you want a helper type with Options while parsing that's totally fine, but
> > the final result can clearly be without Options. For instance:
> > 
> > 	struct Data {
> > 	   image: KVec<u8>,
> > 	}
> > 
> > 	impl Data {
> > 	   fn new() -> Result<Self> {
> > 	      let parser = DataParser::new();
> > 
> > 	      Self { image: parser.parse()? }
> > 	   }
> > 
> > 	   fn load_image(&self) {
> > 	      ...
> > 	   }
> > 	}
> > 
> > 	struct DataParser {
> > 	   // Only some images have a checksum.
> > 	   checksum: Option<u64>,
> > 	   // Some images have an extra offset.
> > 	   offset: Option<u64>,
> > 	   // Some images need to be patched.
> > 	   patch: Option<KVec<u8>>,
> > 	   image: KVec<u8>,
> > 	}
> > 
> > 	impl DataParser {
> > 	   fn new() -> Self {
> > 	      Self {
> > 	         checksum: None,
> > 	         offset: None,
> > 	         patch: None,
> > 	         bytes: KVec::new(),
> > 	      }
> > 	   }
> > 
> > 	   fn parse(self) -> Result<KVec<u8>> {
> > 	      // Fetch all the required data.
> > 	      self.fetch_checksum()?;
> > 	      self.fetch_offset()?;
> > 	      self.fetch_patch()?;
> > 	      self.fetch_byes()?;
> > 
> > 	      // Doesn't do anything if `checksum == None`.
> > 	      self.validate_checksum()?;
> > 
> > 	      // Doesn't do anything if `offset == None`.
> > 	      self.apply_offset()?;
> > 
> > 	      // Doesn't do anything if `patch == None`.
> > 	      self.apply_patch()?;
> > 
> > 	      // Return the final image.
> > 	      self.image
> > 	   }
> > 	}
> > 
> > I think the pattern here is the same, but in this example you keep working with
> > the DataParser, instead of a new instance of Data.
> 
> I think this would be a fundamental rewrite of the patch. I am Ok with looking
> into it as a future item, but right now I am not sure if it justifies not using
> Option for these few. There's a lot of immediate work we have to do for boot,
> lets please not block the patch on just this if that's Ok with you. If you want,
> I could add a TODO here.

Honestly, I don't think it'd be too bad to fix this up. It's "just" a bit of
juggling fields and moving code around. The actual code should not change much.

Having Option<T> where the corresponding value T isn't actually optional is
extremely confusing. It makes the code harder to understand for everyone, but
especially for new contributors, and can easily trick people into making wrong
assumptions.

Making the code reasonably accessible for (new) contributors is one of the
objectives of nova and one of the lessons learned from nouveau.

Hence, let's get this right from the get-go please.
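
A small stand-alone sketch of the kind of confusion meant here. The two
structs, the `Einval` type and the `ucode()` accessors are hypothetical and
only loosely modelled on the fields quoted earlier in the thread:

```rust
// Hypothetical sketch; not the driver's actual types.

#[derive(Debug, PartialEq)]
struct Einval;

/// Offset stored as `Option` although it is always set once parsing is done.
struct LooseImage {
    ucode_offset: Option<usize>,
}

impl LooseImage {
    fn ucode<'a>(&self, data: &'a [u8]) -> Result<&'a [u8], Einval> {
        // Every accessor needs this extra error path, and a reader cannot
        // tell whether `None` is a legitimate state or a bug.
        let offset = self.ucode_offset.ok_or(Einval)?;
        data.get(offset..).ok_or(Einval)
    }
}

/// Offset stored directly: the type itself says it is always present.
struct StrictImage {
    ucode_offset: usize,
}

impl StrictImage {
    fn ucode<'a>(&self, data: &'a [u8]) -> Result<&'a [u8], Einval> {
        // Only the real failure (offset beyond the data) remains.
        data.get(self.ucode_offset..).ok_or(Einval)
    }
}
```

With the second form, the only error a caller can see is one that can
actually happen.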


* Re: [PATCH v3 16/19] nova-core: Add support for VBIOS ucode extraction for boot
  2025-05-20 15:36               ` Danilo Krummrich
@ 2025-05-20 16:02                 ` Joel Fernandes
  2025-05-20 18:13                 ` Joel Fernandes
  1 sibling, 0 replies; 51+ messages in thread
From: Joel Fernandes @ 2025-05-20 16:02 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Alexandre Courbot, Miguel Ojeda, Alex Gaynor, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Alice Ryhl, Trevor Gross, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet, John Hubbard, Ben Skeggs, Timur Tabi,
	Alistair Popple, linux-kernel@vger.kernel.org,
	rust-for-linux@vger.kernel.org, nouveau@lists.freedesktop.org,
	dri-devel@lists.freedesktop.org, Shirish Baskaran



> On May 20, 2025, at 11:37 AM, Danilo Krummrich <dakr@kernel.org> wrote:
> 
> On Tue, May 20, 2025 at 11:11:12AM -0400, Joel Fernandes wrote:
>> On 5/20/2025 11:01 AM, Danilo Krummrich wrote:
>> 
>> I made this change and it LGTM. Thanks! I did not do the '.0' though since I
>> want to keep the readability, lets see in the next revision if that looks good.
> 
> I think readability, is just as good with `.0`, but I'm fine with either.

Cool.

> 
>>>>> In general, I feel like a lot of those Option come from a programming pattern
>>>>> that is very common in C, i.e. allocate a structure (stack or heap) and then
>>>>> initialize its fields.
>>>>> 
>>>>> In Rust you should aim to initialize all the fields of a structure when you
>>>>> create the instance. Option as a return type of a function is common, but it's
>>>>> always a bit suspicious when there is an Option field in a struct.
>>>> 
>>>> I looked into it, I could not git rid of those ones because we need to
>>>> initialize in the "impl TryFrom<BiosImageBase> for BiosImage {"
>>>> 
>>>>            0xE0 => Ok(BiosImage::FwSec(FwSecBiosImage {
>>>>                base,
>>>>                falcon_data_offset: None,
>>>>                pmu_lookup_table: None,
>>>>                falcon_ucode_offset: None,
>>>>            })),
>>>> 
>>>> And these fields will not be determined until much later, because as is the case
>>>> with the earlier example, these fields cannot be determined until all the images
>>>> are parsed.
>>> 
>>> You should not use TryFrom, but instead use a normal constructor, such as
>>> 
>>>    BiosImage::new(base_bios_image)
>>> 
>>> and do the parsing within this constructor.
>>> 
>>> If you want a helper type with Options while parsing that's totally fine, but
>>> the final result can clearly be without Options. For instance:
>>> 
>>>    struct Data {
>>>       image: KVec<u8>,
>>>    }
>>> 
>>>    impl Data {
>>>       fn new() -> Result<Self> {
>>>          let parser = DataParser::new();
>>> 
>>>          Self { image: parser.parse()? }
>>>       }
>>> 
>>>       fn load_image(&self) {
>>>          ...
>>>       }
>>>    }
>>> 
>>>    struct DataParser {
>>>       // Only some images have a checksum.
>>>       checksum: Option<u64>,
>>>       // Some images have an extra offset.
>>>       offset: Option<u64>,
>>>       // Some images need to be patched.
>>>       patch: Option<KVec<u8>>,
>>>       image: KVec<u8>,
>>>    }
>>> 
>>>    impl DataParser {
>>>       fn new() -> Self {
>>>          Self {
>>>             checksum: None,
>>>             offset: None,
>>>             patch: None,
>>>             bytes: KVec::new(),
>>>          }
>>>       }
>>> 
>>>       fn parse(self) -> Result<KVec<u8>> {
>>>          // Fetch all the required data.
>>>          self.fetch_checksum()?;
>>>          self.fetch_offset()?;
>>>          self.fetch_patch()?;
>>>          self.fetch_byes()?;
>>> 
>>>          // Doesn't do anything if `checksum == None`.
>>>          self.validate_checksum()?;
>>> 
>>>          // Doesn't do anything if `offset == None`.
>>>          self.apply_offset()?;
>>> 
>>>          // Doesn't do anything if `patch == None`.
>>>          self.apply_patch()?;
>>> 
>>>          // Return the final image.
>>>          self.image
>>>       }
>>>    }
>>> 
>>> I think the pattern here is the same, but in this example you keep working with
>>> the DataParser, instead of a new instance of Data.
>> 
>> I think this would be a fundamental rewrite of the patch. I am Ok with looking
>> into it as a future item, but right now I am not sure if it justifies not using
>> Option for these few. There's a lot of immediate work we have to do for boot,
>> lets please not block the patch on just this if that's Ok with you. If you want,
>> I could add a TODO here.
> 
> Honestly, I don't think it'd be too bad to fix this up. It's "just" a bit of
> juggling fields and moving code around. The actual code should not change much.
> 
> Having Option<T> where the corresponding value T isn't actually optional is
> extremely confusing and makes it hard for everyone, but especially new
> contributors, to understand the code and can easily trick people into taking
> wrong assumptions.
> 
> Making the code reasonably accessible for (new) contributors is one of the
> objectives of nova and one of the learnings from nouveau.
> 
> Hence, let's get this right from the get-go please.

Ok, I will look into making this change. :-)

thanks,

 - Joel




* Re: [PATCH v3 16/19] nova-core: Add support for VBIOS ucode extraction for boot
  2025-05-20 15:36               ` Danilo Krummrich
  2025-05-20 16:02                 ` Joel Fernandes
@ 2025-05-20 18:13                 ` Joel Fernandes
  2025-05-20 21:32                   ` Dave Airlie
  1 sibling, 1 reply; 51+ messages in thread
From: Joel Fernandes @ 2025-05-20 18:13 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Alexandre Courbot, Miguel Ojeda, Alex Gaynor, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Alice Ryhl, Trevor Gross, David Airlie, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet, John Hubbard, Ben Skeggs, Timur Tabi,
	Alistair Popple, linux-kernel, rust-for-linux, nouveau, dri-devel,
	Shirish Baskaran



On 5/20/2025 11:36 AM, Danilo Krummrich wrote:
>>> If you want a helper type with Options while parsing that's totally fine, but
>>> the final result can clearly be without Options. For instance:
>>>
>>> 	struct Data {
>>> 	   image: KVec<u8>,
>>> 	}
>>>
>>> 	impl Data {
>>> 	   fn new() -> Result<Self> {
>>> 	      let parser = DataParser::new();
>>>
>>> 	      Self { image: parser.parse()? }
>>> 	   }
>>>
>>> 	   fn load_image(&self) {
>>> 	      ...
>>> 	   }
>>> 	}
>>>
>>> 	struct DataParser {
>>> 	   // Only some images have a checksum.
>>> 	   checksum: Option<u64>,
>>> 	   // Some images have an extra offset.
>>> 	   offset: Option<u64>,
>>> 	   // Some images need to be patched.
>>> 	   patch: Option<KVec<u8>>,
>>> 	   image: KVec<u8>,
>>> 	}
>>>
>>> 	impl DataParser {
>>> 	   fn new() -> Self {
>>> 	      Self {
>>> 	         checksum: None,
>>> 	         offset: None,
>>> 	         patch: None,
>>> 	         bytes: KVec::new(),
>>> 	      }
>>> 	   }
>>>
>>> 	   fn parse(self) -> Result<KVec<u8>> {
>>> 	      // Fetch all the required data.
>>> 	      self.fetch_checksum()?;
>>> 	      self.fetch_offset()?;
>>> 	      self.fetch_patch()?;
>>> 	      self.fetch_byes()?;
>>>
>>> 	      // Doesn't do anything if `checksum == None`.
>>> 	      self.validate_checksum()?;
>>>
>>> 	      // Doesn't do anything if `offset == None`.
>>> 	      self.apply_offset()?;
>>>
>>> 	      // Doesn't do anything if `patch == None`.
>>> 	      self.apply_patch()?;
>>>
>>> 	      // Return the final image.
>>> 	      self.image
>>> 	   }
>>> 	}
>>>
>>> I think the pattern here is the same, but in this example you keep working with
>>> the DataParser, instead of a new instance of Data.
>> I think this would be a fundamental rewrite of the patch. I am Ok with looking
>> into it as a future item, but right now I am not sure if it justifies not using
>> Option for these few. There's a lot of immediate work we have to do for boot,
>> lets please not block the patch on just this if that's Ok with you. If you want,
>> I could add a TODO here.
>
> Honestly, I don't think it'd be too bad to fix this up. It's "just" a bit of
> juggling fields and moving code around. The actual code should not change much.
> 
> Having Option<T> where the corresponding value T isn't actually optional is
> extremely confusing and makes it hard for everyone, but especially new
> contributors, to understand the code and can easily trick people into taking
> wrong assumptions.
> 
> Making the code reasonably accessible for (new) contributors is one of the
> objectives of nova and one of the learnings from nouveau.


I implemented the Data parsing pattern in the commit linked below; the final
FwSecBiosImage no longer has optional fields, as you suggested. It also gets rid
of 2 additional fields that are not needed after vbios parsing completes.

https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git/commit/?h=nova/vbios&id=8cc852fe5573890596a91a2a935b3af24dcb9f04

Hope that looks Ok now! I am open to renaming FwSecBiosPartial to FwSecBiosData
if that's better.

The full file after the change:
https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git/tree/drivers/gpu/nova-core/vbios.rs?h=nova/vbios&id=8cc852fe5573890596a91a2a935b3af24dcb9f04

thanks,

 - Joel




* Re: [PATCH v3 16/19] nova-core: Add support for VBIOS ucode extraction for boot
  2025-05-20 18:13                 ` Joel Fernandes
@ 2025-05-20 21:32                   ` Dave Airlie
  2025-05-21  3:17                     ` Joel Fernandes
  0 siblings, 1 reply; 51+ messages in thread
From: Dave Airlie @ 2025-05-20 21:32 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: Danilo Krummrich, Alexandre Courbot, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet, John Hubbard, Ben Skeggs, Timur Tabi,
	Alistair Popple, linux-kernel, rust-for-linux, nouveau, dri-devel,
	Shirish Baskaran

On Wed, 21 May 2025 at 04:13, Joel Fernandes <joelagnelf@nvidia.com> wrote:
>
>
>
> On 5/20/2025 11:36 AM, Danilo Krummrich wrote:
> >>> If you want a helper type with Options while parsing that's totally fine, but
> >>> the final result can clearly be without Options. For instance:
> >>>
> >>>     struct Data {
> >>>        image: KVec<u8>,
> >>>     }
> >>>
> >>>     impl Data {
> >>>        fn new() -> Result<Self> {
> >>>           let parser = DataParser::new();
> >>>
> >>>           Self { image: parser.parse()? }
> >>>        }
> >>>
> >>>        fn load_image(&self) {
> >>>           ...
> >>>        }
> >>>     }
> >>>
> >>>     struct DataParser {
> >>>        // Only some images have a checksum.
> >>>        checksum: Option<u64>,
> >>>        // Some images have an extra offset.
> >>>        offset: Option<u64>,
> >>>        // Some images need to be patched.
> >>>        patch: Option<KVec<u8>>,
> >>>        image: KVec<u8>,
> >>>     }
> >>>
> >>>     impl DataParser {
> >>>        fn new() -> Self {
> >>>           Self {
> >>>              checksum: None,
> >>>              offset: None,
> >>>              patch: None,
> >>>              bytes: KVec::new(),
> >>>           }
> >>>        }
> >>>
> >>>        fn parse(self) -> Result<KVec<u8>> {
> >>>           // Fetch all the required data.
> >>>           self.fetch_checksum()?;
> >>>           self.fetch_offset()?;
> >>>           self.fetch_patch()?;
> >>>           self.fetch_byes()?;
> >>>
> >>>           // Doesn't do anything if `checksum == None`.
> >>>           self.validate_checksum()?;
> >>>
> >>>           // Doesn't do anything if `offset == None`.
> >>>           self.apply_offset()?;
> >>>
> >>>           // Doesn't do anything if `patch == None`.
> >>>           self.apply_patch()?;
> >>>
> >>>           // Return the final image.
> >>>           self.image
> >>>        }
> >>>     }
> >>>
> >>> I think the pattern here is the same, but in this example you keep working with
> >>> the DataParser, instead of a new instance of Data.
> >> I think this would be a fundamental rewrite of the patch. I am Ok with looking
> >> into it as a future item, but right now I am not sure if it justifies not using
> >> Option for these few. There's a lot of immediate work we have to do for boot,
> >> lets please not block the patch on just this if that's Ok with you. If you want,
> >> I could add a TODO here.
> >
> > Honestly, I don't think it'd be too bad to fix this up. It's "just" a bit of
> > juggling fields and moving code around. The actual code should not change much.
> >
> > Having Option<T> where the corresponding value T isn't actually optional is
> > extremely confusing and makes it hard for everyone, but especially new
> > contributors, to understand the code and can easily trick people into taking
> > wrong assumptions.
> >
> > Making the code reasonably accessible for (new) contributors is one of the
> > objectives of nova and one of the learnings from nouveau.

I just want to back Danilo up on this concept as well.

When I did the experiments code, I faced the not-fully-constructed object
problem a lot, and I tried to resist the pattern of Option<>-ing all the
things. It's a very C-based habit, where we create an object and then
initialise it as we go, and it's not a great pattern to have in Rust code.

I'm not a huge fan of constructor/builder objects either; if they can be
avoided, please do. I also tried to avoid proliferating them, but I think
for most things we can build the pieces and then the final object as we
go. It just requires doing so from the start, and not giving in to the
Option<> pattern.
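
A minimal sketch of "build the pieces, then the final object", without a
separate builder type. `Header`, `Device` and the one-byte-version layout are
invented for the example:

```rust
// Hypothetical sketch; the types and the byte layout are made up.

#[derive(Debug, PartialEq)]
struct Header {
    version: u8,
}

#[derive(Debug, PartialEq)]
struct Device {
    header: Header,
    payload: Vec<u8>,
}

/// Assumed layout: the first byte is the version.
fn parse_header(raw: &[u8]) -> Option<Header> {
    raw.first().map(|&version| Header { version })
}

/// Everything after the first byte is the payload; it must not be empty.
fn parse_payload(raw: &[u8]) -> Option<Vec<u8>> {
    raw.get(1..).filter(|p| !p.is_empty()).map(|p| p.to_vec())
}

impl Device {
    fn new(raw: &[u8]) -> Option<Self> {
        // Each piece is a plain local; `?` bails out before any `Device`
        // exists, so a half-initialised `Device` is never observable.
        let header = parse_header(raw)?;
        let payload = parse_payload(raw)?;

        Some(Device { header, payload })
    }
}
```

`Option` still shows up, but only as a return type while the pieces are
being produced, not as a field of the finished object.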

Dave.


* Re: [PATCH v3 16/19] nova-core: Add support for VBIOS ucode extraction for boot
  2025-05-20 21:32                   ` Dave Airlie
@ 2025-05-21  3:17                     ` Joel Fernandes
  0 siblings, 0 replies; 51+ messages in thread
From: Joel Fernandes @ 2025-05-21  3:17 UTC (permalink / raw)
  To: Dave Airlie
  Cc: Danilo Krummrich, Alexandre Courbot, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, Simona Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet, John Hubbard, Ben Skeggs, Timur Tabi,
	Alistair Popple, linux-kernel, rust-for-linux, nouveau, dri-devel,
	Shirish Baskaran



On 5/20/2025 5:32 PM, Dave Airlie wrote:
> On Wed, 21 May 2025 at 04:13, Joel Fernandes <joelagnelf@nvidia.com> wrote:
>>
>>
>>
>> On 5/20/2025 11:36 AM, Danilo Krummrich wrote:
>>>>> If you want a helper type with Options while parsing that's totally fine, but
>>>>> the final result can clearly be without Options. For instance:
>>>>>
>>>>>     struct Data {
>>>>>        image: KVec<u8>,
>>>>>     }
>>>>>
>>>>>     impl Data {
>>>>>        fn new() -> Result<Self> {
>>>>>           let parser = DataParser::new();
>>>>>
>>>>>           Self { image: parser.parse()? }
>>>>>        }
>>>>>
>>>>>        fn load_image(&self) {
>>>>>           ...
>>>>>        }
>>>>>     }
>>>>>
>>>>>     struct DataParser {
>>>>>        // Only some images have a checksum.
>>>>>        checksum: Option<u64>,
>>>>>        // Some images have an extra offset.
>>>>>        offset: Option<u64>,
>>>>>        // Some images need to be patched.
>>>>>        patch: Option<KVec<u8>>,
>>>>>        image: KVec<u8>,
>>>>>     }
>>>>>
>>>>>     impl DataParser {
>>>>>        fn new() -> Self {
>>>>>           Self {
>>>>>              checksum: None,
>>>>>              offset: None,
>>>>>              patch: None,
>>>>>              bytes: KVec::new(),
>>>>>           }
>>>>>        }
>>>>>
>>>>>        fn parse(self) -> Result<KVec<u8>> {
>>>>>           // Fetch all the required data.
>>>>>           self.fetch_checksum()?;
>>>>>           self.fetch_offset()?;
>>>>>           self.fetch_patch()?;
>>>>>           self.fetch_byes()?;
>>>>>
>>>>>           // Doesn't do anything if `checksum == None`.
>>>>>           self.validate_checksum()?;
>>>>>
>>>>>           // Doesn't do anything if `offset == None`.
>>>>>           self.apply_offset()?;
>>>>>
>>>>>           // Doesn't do anything if `patch == None`.
>>>>>           self.apply_patch()?;
>>>>>
>>>>>           // Return the final image.
>>>>>           self.image
>>>>>        }
>>>>>     }
>>>>>
>>>>> I think the pattern here is the same, but in this example you keep working with
>>>>> the DataParser, instead of a new instance of Data.
>>>> I think this would be a fundamental rewrite of the patch. I am Ok with looking
>>>> into it as a future item, but right now I am not sure if it justifies not using
>>>> Option for these few. There's a lot of immediate work we have to do for boot,
>>>> lets please not block the patch on just this if that's Ok with you. If you want,
>>>> I could add a TODO here.
>>>
>>> Honestly, I don't think it'd be too bad to fix this up. It's "just" a bit of
>>> juggling fields and moving code around. The actual code should not change much.
>>>
>>> Having Option<T> where the corresponding value T isn't actually optional is
>>> extremely confusing and makes it hard for everyone, but especially new
>>> contributors, to understand the code and can easily trick people into taking
>>> wrong assumptions.
>>>
>>> Making the code reasonably accessible for (new) contributors is one of the
>>> objectives of nova and one of the learnings from nouveau.
> 
> I just want to back Danilo up on this concept as well.
> 
> When I did the experiments code, I faced the not fully constructed
> object problem a lot, and I tried to resist the C pattern of Option<>
> all the things, it's a very C based thing where we create an object
> then initialise it as we go, and it's not a great pattern to have for
> rust code.
> 
> I'm not a huge fan of constructor/builder objects either if they can
> be avoided, please do, and I tried to also avoid proliferating them,
> but I think for most things we can build the pieces and then the final
> object as we go, it just requires doing so from the start, and not
> giving into the Option<> pattern.

Sure, I am on the same page there as well. For the next revision of this patch,
struct Vbios will contain a struct FwSecBiosImage without any Option in either
struct Vbios or struct FwSecBiosImage. This is only logical, because there is
nothing optional about it (in what Vbios::new() returns).

thanks,

 - Joel



end of thread, other threads:[~2025-05-21  3:17 UTC | newest]

Thread overview: 51+ messages
2025-05-07 13:52 [PATCH v3 00/19] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization Alexandre Courbot
2025-05-07 13:52 ` [PATCH v3 01/19] rust: dma: expose the count and size of CoherentAllocation Alexandre Courbot
2025-05-13 12:15   ` Danilo Krummrich
2025-05-07 13:52 ` [PATCH v3 02/19] gpu: nova-core: derive useful traits for Chipset Alexandre Courbot
2025-05-07 13:52 ` [PATCH v3 03/19] gpu: nova-core: add missing GA100 definition Alexandre Courbot
2025-05-07 13:52 ` [PATCH v3 04/19] gpu: nova-core: take bound device in Gpu::new Alexandre Courbot
2025-05-07 13:52 ` [PATCH v3 05/19] gpu: nova-core: define registers layout using helper macro Alexandre Courbot
2025-05-07 13:52 ` [PATCH v3 06/19] gpu: nova-core: fix layout of NV_PMC_BOOT_0 Alexandre Courbot
2025-05-07 13:52 ` [PATCH v3 07/19] gpu: nova-core: move Firmware to firmware module Alexandre Courbot
2025-05-07 13:52 ` [PATCH v3 08/19] rust: make ETIMEDOUT error available Alexandre Courbot
2025-05-07 13:52 ` [PATCH v3 09/19] gpu: nova-core: wait for GFW_BOOT completion Alexandre Courbot
2025-05-13 14:07   ` Danilo Krummrich
2025-05-16 12:16     ` Alexandre Courbot
2025-05-07 13:52 ` [PATCH v3 10/19] gpu: nova-core: add DMA object struct Alexandre Courbot
2025-05-13 14:25   ` Danilo Krummrich
2025-05-07 13:52 ` [PATCH v3 11/19] gpu: nova-core: register sysmem flush page Alexandre Courbot
2025-05-13 14:47   ` Danilo Krummrich
2025-05-07 13:52 ` [PATCH v3 12/19] gpu: nova-core: add helper function to wait on condition Alexandre Courbot
2025-05-13 14:50   ` Danilo Krummrich
2025-05-07 13:52 ` [PATCH v3 13/19] gpu: nova-core: add falcon register definitions and base code Alexandre Courbot
2025-05-13 16:19   ` Danilo Krummrich
2025-05-16 12:19     ` Alexandre Courbot
2025-05-16 12:26       ` Danilo Krummrich
2025-05-07 13:52 ` [PATCH v3 14/19] gpu: nova-core: firmware: add ucode descriptor used by FWSEC-FRTS Alexandre Courbot
2025-05-07 13:52 ` [PATCH v3 15/19] rust: num: Add an upward alignment helper for usize Alexandre Courbot
2025-05-07 13:52 ` [PATCH v3 16/19] nova-core: Add support for VBIOS ucode extraction for boot Alexandre Courbot
2025-05-13 17:19   ` Danilo Krummrich
2025-05-20  7:55     ` Joel Fernandes
2025-05-20  9:30       ` Danilo Krummrich
2025-05-20 13:43         ` Joel Fernandes
2025-05-20 15:01           ` Danilo Krummrich
2025-05-20 15:11             ` Joel Fernandes
2025-05-20 15:36               ` Danilo Krummrich
2025-05-20 16:02                 ` Joel Fernandes
2025-05-20 18:13                 ` Joel Fernandes
2025-05-20 21:32                   ` Dave Airlie
2025-05-21  3:17                     ` Joel Fernandes
2025-05-14 16:23   ` Danilo Krummrich
2025-05-19 22:59     ` Joel Fernandes
2025-05-20  7:18     ` Joel Fernandes
2025-05-16 20:38   ` Timur Tabi
2025-05-20  6:35     ` Joel Fernandes
2025-05-07 13:52 ` [PATCH v3 17/19] gpu: nova-core: compute layout of the FRTS region Alexandre Courbot
2025-05-13 16:41   ` Danilo Krummrich
2025-05-17 13:42     ` Alexandre Courbot
2025-05-07 13:52 ` [PATCH v3 18/19] gpu: nova-core: extract FWSEC from BIOS and patch it to run FWSEC-FRTS Alexandre Courbot
2025-05-14 16:38   ` Danilo Krummrich
2025-05-19 14:24     ` Alexandre Courbot
2025-05-07 13:52 ` [PATCH v3 19/19] gpu: nova-core: load and " Alexandre Courbot
2025-05-14 16:42   ` Danilo Krummrich
2025-05-13 13:10 ` [PATCH v3 00/19] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization Danilo Krummrich
