From: Alistair Popple <apopple@nvidia.com>
To: Alexandre Courbot <acourbot@nvidia.com>
Cc: dri-devel@lists.freedesktop.org, dakr@kernel.org,
	"Miguel Ojeda" <ojeda@kernel.org>,
	"Alex Gaynor" <alex.gaynor@gmail.com>,
	"Boqun Feng" <boqun.feng@gmail.com>,
	"Gary Guo" <gary@garyguo.net>,
	"Björn Roy Baron" <bjorn3_gh@protonmail.com>,
	"Benno Lossin" <lossin@kernel.org>,
	"Andreas Hindborg" <a.hindborg@kernel.org>,
	"Alice Ryhl" <aliceryhl@google.com>,
	"Trevor Gross" <tmgross@umich.edu>,
	"David Airlie" <airlied@gmail.com>,
	"Simona Vetter" <simona@ffwll.ch>,
	"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
	"Maxime Ripard" <mripard@kernel.org>,
	"Thomas Zimmermann" <tzimmermann@suse.de>,
	"John Hubbard" <jhubbard@nvidia.com>,
	"Joel Fernandes" <joelagnelf@nvidia.com>,
	"Timur Tabi" <ttabi@nvidia.com>,
	linux-kernel@vger.kernel.org, nouveau@lists.freedesktop.org,
	Nouveau <nouveau-bounces@lists.freedesktop.org>
Subject: Re: [PATCH 03/10] gpu: nova-core: gsp: Create wpr metadata
Date: Wed, 3 Sep 2025 18:57:28 +1000
Message-ID: <iyjecyybwyilem2ituw6esmufid72cximthc5qo2fgdpzz4fko@cb6n2vcrptb5>
In-Reply-To: <DCHAPJRPKSSA.37QLQGAVCERCZ@nvidia.com>

On 2025-09-01 at 17:46 +1000, Alexandre Courbot <acourbot@nvidia.com> wrote...
> Hi Alistair,
> 
> On Wed Aug 27, 2025 at 5:20 PM JST, Alistair Popple wrote:
> <snip>
> > index 161c057350622..1f51e354b9569 100644
> > --- a/drivers/gpu/nova-core/gsp.rs
> > +++ b/drivers/gpu/nova-core/gsp.rs
> > @@ -6,12 +6,17 @@
> >  use kernel::dma_write;
> >  use kernel::pci;
> >  use kernel::prelude::*;
> > -use kernel::ptr::Alignment;
> > +use kernel::ptr::{Alignable, Alignment};
> > +use kernel::sizes::SZ_128K;
> >  use kernel::transmute::{AsBytes, FromBytes};
> >  
> > +use crate::fb::FbLayout;
> > +use crate::firmware::Firmware;
> >  use crate::nvfw::{
> > -    LibosMemoryRegionInitArgument, LibosMemoryRegionKind_LIBOS_MEMORY_REGION_CONTIGUOUS,
> > -    LibosMemoryRegionLoc_LIBOS_MEMORY_REGION_LOC_SYSMEM,
> > +    GspFwWprMeta, GspFwWprMetaBootInfo, GspFwWprMetaBootResumeInfo, LibosMemoryRegionInitArgument,
> > +    LibosMemoryRegionKind_LIBOS_MEMORY_REGION_CONTIGUOUS,
> > +    LibosMemoryRegionLoc_LIBOS_MEMORY_REGION_LOC_SYSMEM, GSP_FW_WPR_META_MAGIC,
> > +    GSP_FW_WPR_META_REVISION,
> >  };
> >  
> >  pub(crate) const GSP_PAGE_SHIFT: usize = 12;
> > @@ -25,12 +30,69 @@ unsafe impl AsBytes for LibosMemoryRegionInitArgument {}
> >  // are valid.
> >  unsafe impl FromBytes for LibosMemoryRegionInitArgument {}
> >  
> > +// SAFETY: Padding is explicit and will not contain uninitialized data.
> > +unsafe impl AsBytes for GspFwWprMeta {}
> > +
> > +// SAFETY: This struct only contains integer types for which all bit patterns
> > +// are valid.
> > +unsafe impl FromBytes for GspFwWprMeta {}
> > +
> >  #[allow(unused)]
> >  pub(crate) struct GspMemObjects {
> >      libos: CoherentAllocation<LibosMemoryRegionInitArgument>,
> >      pub loginit: CoherentAllocation<u8>,
> >      pub logintr: CoherentAllocation<u8>,
> >      pub logrm: CoherentAllocation<u8>,
> > +    pub wpr_meta: CoherentAllocation<GspFwWprMeta>,
> > +}
> 
> I think `wpr_meta` is a bit out-of-place in this structure. There are
> several reasons for this:
> 
> - All the other members of this structure (including `cmdq` which is
>   added later) are referenced by `libos` and constitute the GSP runtime:
>   they are used as long as the GSP is active. `wpr_meta`, OTOH, does not
>   reference any of the other objects, nor is it referenced by them.
> - `wpr_meta` is never used by the GSP, but needed as a parameter of
>   Booter on SEC2 to load the GSP firmware. It can actually be discarded
>   once this step is completed. This is very different from the rest of
>   this structure, which is used by the GSP.

Yes, I had noticed that too and tried to remove it previously. As you mention
below, that was a little tricky, but if you fix it for v3 I think this all makes
perfect sense.

> So I think it doesn't really belong here, and would probably fit better
> in `Firmware`. Now the fault lies in my own series, which doesn't let
> you build `wpr_meta` easily from there. I'll try to fix that in the v3.
>
> And with the removal of `wpr_meta`, this structure ends up strictly
> containing the GSP runtime, including the command queue... Maybe it can
> simply be named `Gsp` then? It is even already in the right module! :)

Agreed - I noticed this right after I renamed this struct last time so wanted
to let things settle down a bit before doing another rename. But I think `Gsp`
makes a whole lot more sense, especially if we remove the wpr_meta data.

> Loosely related, but looking at this series made me realize there is a
> very logical split of our firmware into two "bundles":
> 
> - The GSP bundle includes the GSP runtime data, which is this
>   `GspMemObjects` structure minus `wpr_meta`. We pass it as an input
>   parameter to the GSP firmware using the GSP's falcon mbox registers.
>   It must live as long as the GSP is running.
> - The SEC2 bundle includes Booter, `wpr_meta`, the GSP firmware binary,
>   bootloader and its signatures (which are all referenced by
>   `wpr_meta`). All this data is consumed by SEC2, and crucially can be
>   dropped once the GSP is booted.
> 
> This separation is important as currently we are stuffing anything
> firmware-related into the `Firmware` struct and keep it there forever,
> consuming dozens of megabytes of host memory that we could free. Booting
> the GSP is typically a one-time operation in the life of the GPU device,
> and even if we ever need to do it again, we can very well build the SEC2
> bundle from scratch again.
> 
> I will try to reflect the separation better in the v3 of my patchset -
> then we can just build `wpr_meta` as a local variable of the method that
> runs `Booter`, and drop it (alongside the rest of the SEC2 bundle) upon
> return.
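
As an illustration of the split described above, here is a minimal userspace
sketch, not nova-core code. `Sec2Bundle`, `Gsp::boot()` and the log are
hypothetical stand-ins; the only point is that the SEC2 bundle is a local of
the boot method and is freed on return, while the `Gsp` runtime lives on.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for everything SEC2 consumes (Booter, `wpr_meta`,
/// GSP firmware image, bootloader, signatures).
struct Sec2Bundle {
    firmware_image: Vec<u8>,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Sec2Bundle {
    fn drop(&mut self) {
        // Record the moment the bundle's memory is released.
        self.log.borrow_mut().push("sec2 bundle freed");
    }
}

/// Hypothetical stand-in for the GSP runtime data, which must outlive boot.
struct Gsp {
    booted: bool,
}

impl Gsp {
    /// Builds the SEC2 bundle locally, "boots", and returns the number of
    /// firmware bytes consumed. The bundle is dropped when this returns.
    fn boot(&mut self, log: Rc<RefCell<Vec<&'static str>>>) -> usize {
        let bundle = Sec2Bundle {
            firmware_image: vec![0u8; 4096],
            log: log.clone(),
        };
        let consumed = bundle.firmware_image.len();
        self.booted = true;
        log.borrow_mut().push("gsp booted");
        consumed
        // `bundle` goes out of scope here, so its memory is freed.
    }
}
```

If a second boot were ever needed, calling `boot()` again would simply rebuild
the bundle from scratch, which matches the trade-off described in the quoted
text.
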
> 
> > +
> > +pub(crate) fn build_wpr_meta(
> > +    dev: &device::Device<device::Bound>,
> > +    fw: &Firmware,
> > +    fb_layout: &FbLayout,
> > +) -> Result<CoherentAllocation<GspFwWprMeta>> {
> > +    let wpr_meta =
> > +        CoherentAllocation::<GspFwWprMeta>::alloc_coherent(dev, 1, GFP_KERNEL | __GFP_ZERO)?;
> > +    dma_write!(
> > +        wpr_meta[0] = GspFwWprMeta {
> > +            magic: GSP_FW_WPR_META_MAGIC as u64,
> > +            revision: u64::from(GSP_FW_WPR_META_REVISION),
> > +            sysmemAddrOfRadix3Elf: fw.gsp.lvl0_dma_handle(),
> > +            sizeOfRadix3Elf: fw.gsp.size as u64,
> > +            sysmemAddrOfBootloader: fw.gsp_bootloader.ucode.dma_handle(),
> > +            sizeOfBootloader: fw.gsp_bootloader.ucode.size() as u64,
> > +            bootloaderCodeOffset: u64::from(fw.gsp_bootloader.code_offset),
> > +            bootloaderDataOffset: u64::from(fw.gsp_bootloader.data_offset),
> > +            bootloaderManifestOffset: u64::from(fw.gsp_bootloader.manifest_offset),
> > +            __bindgen_anon_1: GspFwWprMetaBootResumeInfo {
> > +                __bindgen_anon_1: GspFwWprMetaBootInfo {
> > +                    sysmemAddrOfSignature: fw.gsp_sigs.dma_handle(),
> > +                    sizeOfSignature: fw.gsp_sigs.size() as u64,
> > +                }
> > +            },
> > +            gspFwRsvdStart: fb_layout.heap.start,
> > +            nonWprHeapOffset: fb_layout.heap.start,
> > +            nonWprHeapSize: fb_layout.heap.end - fb_layout.heap.start,
> > +            gspFwWprStart: fb_layout.wpr2.start,
> > +            gspFwHeapOffset: fb_layout.wpr2_heap.start,
> > +            gspFwHeapSize: fb_layout.wpr2_heap.end - fb_layout.wpr2_heap.start,
> > +            gspFwOffset: fb_layout.elf.start,
> > +            bootBinOffset: fb_layout.boot.start,
> > +            frtsOffset: fb_layout.frts.start,
> > +            frtsSize: fb_layout.frts.end - fb_layout.frts.start,
> > +            gspFwWprEnd: fb_layout
> > +                .vga_workspace
> > +                .start
> > +                .align_down(Alignment::new(SZ_128K)),
> > +            gspFwHeapVfPartitionCount: fb_layout.vf_partition_count,
> > +            fbSize: fb_layout.fb.end - fb_layout.fb.start,
> > +            vgaWorkspaceOffset: fb_layout.vga_workspace.start,
> > +            vgaWorkspaceSize: fb_layout.vga_workspace.end - fb_layout.vga_workspace.start,
> > +            ..Default::default()
> > +        }
> > +    )?;
> > +
> > +    Ok(wpr_meta)
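
For reference, the size and alignment arithmetic used in the quoted hunk can
be sketched in plain userspace Rust. This is only an illustration: `Region`
and the free function `align_down()` are hypothetical stand-ins for the
`FbLayout` ranges and the kernel's `Alignable`/`Alignment` helpers, and ranges
are assumed to be half-open.

```rust
/// 128 KiB, mirroring the value of the kernel's `SZ_128K` constant.
const SZ_128K: u64 = 128 * 1024;

/// Hypothetical stand-in for one framebuffer region (`start..end`, half-open).
struct Region {
    start: u64,
    end: u64,
}

impl Region {
    /// Size in bytes, computed the same way as `end - start` in the hunk.
    fn size(&self) -> u64 {
        self.end - self.start
    }
}

/// Rounds `addr` down to a multiple of `align`.
/// `align` must be a power of two for the mask trick to be valid.
fn align_down(addr: u64, align: u64) -> u64 {
    assert!(align.is_power_of_two());
    addr & !(align - 1)
}
```

With these definitions, a `vga_workspace.start` of `0x1_0002_3456` would give
a `gspFwWprEnd` of `0x1_0002_0000`.
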
> 
> I've discussed the bindings abstractions with Danilo last week. We
> agreed that no layout information should ever escape the `nvfw` module.
> I.e. the fields of `GspFwWprMeta` should not even be visible here.
> 
> Instead, `GspFwWprMeta` should be wrapped privately into another
> structure inside `nvfw`:
> 
>   /// Structure passed to the GSP bootloader, containing the framebuffer layout as well as the DMA
>   /// addresses of the GSP bootloader and firmware.
>   #[repr(transparent)]
>   pub(crate) struct GspFwWprMeta(r570_144::GspFwWprMeta);

I'm a little unsure what the advantage of this is. Admittedly I'm not sure I've
seen the discussion from last week, so I may have missed something, but it's
not obvious to me how creating another layer of abstraction is better. How would
it help contain layout changes to nvfw? Supporting new struct fields, for
example, would almost certainly still require code changes outside nvfw.
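
To make the proposal under discussion concrete, here is a minimal userspace
sketch of the wrapping pattern. It is an assumption-laden illustration, not
the nova-core code: the inner `bindings` module, its fields, the `MAGIC`
value and the `new()` parameters are all hypothetical.

```rust
mod nvfw {
    /// Stand-in for the bindgen-generated layout; private to `nvfw`.
    mod bindings {
        #[repr(C)]
        #[derive(Default)]
        pub struct GspFwWprMeta {
            pub magic: u64,
            pub revision: u64,
            pub fb_size: u64,
            pub reserved: u64,
        }
    }

    /// Placeholder value, not the real magic number.
    const MAGIC: u64 = 0x1234;

    /// Same layout as the binding, but the inner field is private, so code
    /// outside `nvfw` cannot name or touch individual fields.
    #[repr(transparent)]
    pub struct GspFwWprMeta(bindings::GspFwWprMeta);

    impl GspFwWprMeta {
        /// The constructor is the only place that knows the field names.
        pub fn new(fb_start: u64, fb_end: u64) -> Self {
            Self(bindings::GspFwWprMeta {
                magic: MAGIC,
                revision: 1,
                fb_size: fb_end - fb_start,
                ..Default::default()
            })
        }

        /// Accessor exposing a value without exposing the layout.
        pub fn fb_size(&self) -> u64 {
            self.0.fb_size
        }
    }
}
```

Note that a new input still means a new `new()` parameter, so callers outside
the module change either way; what the wrapper confines is field naming and
ordering.
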

My thinking here was that the bindings (at least for GSP) probably want to live
in the Gsp crate/module, and the rest of the driver would be isolated from Gsp
changes by the public API provided by the Gsp crate/module rather than trying to
do that at the binding level. For example the get_gsp_info() command implemented
in [1] provides a separate public struct representing what the rest of the
driver needs, thus ensuring the implementation specific details of Gsp (such as
struct layout) do not leak into the wider nova-core driver.
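
A rough userspace sketch of that approach, under stated assumptions:
`RawGspInfo`, its fields and the simulated reply are invented for
illustration and are not the structures from [1]; only the shape (private
wire struct, public driver-facing struct) is the point.

```rust
mod gsp {
    /// Hypothetical wire-format reply; its layout may change between
    /// firmware versions and is never visible outside this module.
    #[repr(C)]
    struct RawGspInfo {
        gpu_name: [u8; 16],
        fb_size_mb: u32,
        _pad: u32,
    }

    /// What the rest of the driver sees: no layout details, no raw units.
    pub struct GspInfo {
        pub gpu_name: String,
        pub fb_size_bytes: u64,
    }

    /// Converts the wire format into the public representation.
    fn decode(raw: &RawGspInfo) -> GspInfo {
        // The name is NUL-terminated within the fixed-size buffer.
        let len = raw
            .gpu_name
            .iter()
            .position(|&b| b == 0)
            .unwrap_or(raw.gpu_name.len());
        GspInfo {
            gpu_name: String::from_utf8_lossy(&raw.gpu_name[..len]).into_owned(),
            fb_size_bytes: u64::from(raw.fb_size_mb) * 1024 * 1024,
        }
    }

    /// Public entry point. A real implementation would send a command to the
    /// GSP; here the reply is simulated.
    pub fn get_gsp_info() -> GspInfo {
        let mut gpu_name = [0u8; 16];
        gpu_name[..7].copy_from_slice(b"TestGPU");
        let raw = RawGspInfo { gpu_name, fb_size_mb: 8192, _pad: 0 };
        decode(&raw)
    }
}
```

A firmware revision that reorders or resizes `RawGspInfo` would then only
require changes to `decode()`.
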

> All its implementations should also be there:
> 
>   // SAFETY: Padding is explicit and will not contain uninitialized data.
>   unsafe impl AsBytes for GspFwWprMeta {}
> 
>   // SAFETY: This struct only contains integer types for which all bit patterns
>   // are valid.
>   unsafe impl FromBytes for GspFwWprMeta {}

Makes sense.
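
For readers unfamiliar with why the SAFETY comments insist on explicit
padding, here is a small userspace sketch. The `AsBytes` trait below is a
simplified stand-in written for this example, not the kernel's
`kernel::transmute::AsBytes`, and `Meta` is a hypothetical struct.

```rust
use core::mem::size_of;

/// Simplified stand-in for a marker trait like `AsBytes`.
///
/// # Safety
///
/// Implementers must guarantee that every byte of `Self` is initialized,
/// which rules out implicit (compiler-inserted) padding.
unsafe trait AsBytes: Sized {
    fn as_bytes(&self) -> &[u8] {
        // SAFETY: the implementer guarantees all `size_of::<Self>()` bytes
        // are initialized, and the slice borrows `self` for its lifetime.
        unsafe {
            core::slice::from_raw_parts(self as *const Self as *const u8, size_of::<Self>())
        }
    }
}

/// Hypothetical firmware-style struct. Without `_pad`, `repr(C)` would insert
/// four bytes of implicit padding before `magic`, and reading them as `u8`
/// would be undefined behaviour.
#[repr(C)]
struct Meta {
    revision: u32,
    _pad: u32,
    magic: u64,
}

// SAFETY: `Meta` is `repr(C)`, its padding is an explicit field, and all
// fields are integers, so every byte is initialized.
unsafe impl AsBytes for Meta {}
```
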

> And lastly, this `new` method can also be moved into `nvfw`, as an impl
> block for the wrapping `GspFwWprMeta` type. That way no layout detail
> escapes that module, and it will be easier to adapt the code to
> potential layout changes with new firmware versions.
> 
> (note that my series is the one carelessly re-exporting `GspFwWprMeta`
> as-is - I'll fix that too in v3)
> 
> The same applies to `LibosMemoryRegionInitArgument` of the previous
> patch, and other types introduced in subsequent patches. Usually there
> is little more work to do than moving the implementations into `nvfw` as
> everything is already abstracted correctly - just not where we
> eventually want it.

This is where I get a little bit uncomfortable - this doesn't feel right to me.
It seems to me moving all these implementations to the bindings would just end
up with a significant amount of Gsp code in nvfw.rs rather than in the places
that actually use it, making nvfw.rs large and unwieldy and the code more
distributed and harder to follow.

And it's all tightly coupled anyway - for example the Gsp boot arguments
require some command queue offsets which are all pretty specific to the Gsp
implementation. I.e. we can't define some nice public API in the Gsp crate for
"getting arguments required for booting Gsp" without that just being "here is a
struct containing all the fields that must be packed into the Gsp arguments for
this version", which at that point may as well just be the actual struct itself,
right?

 - Alistair

[1] - https://lore.kernel.org/rust-for-linux/20250829173254.2068763-18-joelagnelf@nvidia.com/

