From: sashiko-bot@kernel.org
To: "Danilo Krummrich" <dakr@kernel.org>,
"Alexandre Courbot" <acourbot@nvidia.com>,
"John Hubbard" <jhubbard@nvidia.com>
Cc: "Danilo Krummrich" <dakr@kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
"Eliot Courtney" <ecourtney@nvidia.com>,
"Zhi Wang" <zhiw@nvidia.com>,
"Björn Roy Baron" <bjorn3_gh@protonmail.com>,
"Gary Guo" <gary@garyguo.net>, "David Airlie" <airlied@gmail.com>,
"Alex Gaynor" <alex.gaynor@gmail.com>,
"Boqun Feng" <boqun.feng@gmail.com>,
"Miguel Ojeda" <ojeda@kernel.org>,
"Alistair Popple" <apopple@nvidia.com>,
nova-gpu@lists.linux.dev,
"Shashank Sharma" <shashanks@nvidia.com>,
"Simona Vetter" <simona@ffwll.ch>,
"Bjorn Helgaas" <bhelgaas@google.com>,
"John Hubbard" <jhubbard@nvidia.com>,
"Alexandre Courbot" <acourbot@nvidia.com>,
"Timur Tabi" <ttabi@nvidia.com>,
"Alice Ryhl" <aliceryhl@google.com>,
"Trevor Gross" <tmgross@umich.edu>,
"Benno Lossin" <lossin@kernel.org>,
"Andreas Hindborg" <a.hindborg@kernel.org>
Subject: Re: [PATCH v12 19/22] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot
Date: Tue, 02 Jun 2026 03:40:42 +0000
Message-ID: <20260602034042.7C26E1F00893@smtp.kernel.org>
In-Reply-To: <20260602032111.224790-20-jhubbard@nvidia.com>
Thank you for your contribution! Sashiko AI review found 3 potential issues to consider:
- [High] DMA Use-After-Free of FMC boot parameters during initialization
- [High] Missing little-endian conversions in hardware payloads break big-endian hosts
- [Medium] Kernel stack overflow risk due to large struct passed by value
--
commit 42c5f0e39c2bb0bc87a3f8df25554d04f00a3e69
Author: John Hubbard <jhubbard@nvidia.com>
gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot
Build and send the Chain of Trust message to FSP, bundling the
DMA-coherent boot parameters that FSP reads at boot time.
> diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
> index 67cf83aba83c6..352ef7683cf2a 100644
> --- a/drivers/gpu/nova-core/fsp.rs
> +++ b/drivers/gpu/nova-core/fsp.rs
[ ... ]
> @@ -113,8 +190,65 @@ pub(crate) fn wait_secure_boot(
>          Ok(Fsp { falcon, fsp_fw })
>      }
>
> +    /// Boots GSP FMC via FSP Chain of Trust.
> +    ///
> +    /// Builds the CoT message from the pre-configured [`FmcBootArgs`], sends it
> +    /// to FSP, and waits for the response.
> +    pub(crate) fn boot_fmc(
> +        &mut self,
> +        dev: &device::Device<device::Bound>,
> +        bar: &Bar0,
> +        fb_layout: &FbLayout,
> +        args: &FmcBootArgs,
> +    ) -> Result {
> +        dev_dbg!(dev, "Starting FSP boot sequence for {}\n", args.chipset);
> +
> +        let fmc_addr = self.fsp_fw.fmc_image.dma_handle();
> +        let fmc_boot_params_addr = args.fmc_boot_params.dma_handle();
> +
> +        // frts_offset is relative to FB end: FRTS_location = FB_END - frts_offset
> +        let frts_offset = if !args.resume {
> +            let frts_reserved_size = fb_layout.heap.len() + u64::from(fb_layout.pmu_reserved_size);
> +
> +            frts_reserved_size
> +                .align_up(Alignment::new::<SZ_2M>())
> +                .ok_or(EINVAL)?
> +        } else {
> +            0
> +        };
> +        let frts_size: u32 = if !args.resume {
> +            fb_layout.frts.len().try_into()?
> +        } else {
> +            0
> +        };
> +
> +        let msg = KBox::new(
> +            FspMessage {
> +                mctp_header: MctpHeader::single_packet(),
> +                nvdm_header: NvdmHeader::new(NvdmType::Cot),
> +                cot: NvdmPayloadCot {
[Severity: Medium]
Is it intentional to construct this ~868-byte FspMessage structure on the
stack before boxing it? KBox::new() takes its argument by value, so the whole
struct may be materialized on the stack before it is moved to the heap.
The initializer also dereferences and copies the 816-byte FmcSignatures
structure (sigs: *self.fsp_fw.fmc_sigs). Doesn't that bypass the optimization
in FspFirmware::extract_fmc_signatures(), which avoided building FmcSignatures
on the stack?
Could this contribute to kernel stack overflows? Might it be safer to
initialize the message directly on the heap using KBox::init() or
KBox::new_uninit()?
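For illustration only, here is a minimal userspace sketch of the
allocate-then-fill idea, using std's Box::new_uninit() as a stand-in for the
kernel allocator API. `Sigs`, `Message`, and `build_on_heap` are hypothetical
names with a simplified layout, not the driver's types, and the in-kernel
equivalent would presumably go through KBox and the pin-init machinery instead:

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

/// Hypothetical stand-in for the 816-byte signature blob.
#[repr(C)]
struct Sigs {
    bytes: [u8; 816],
}

/// Hypothetical, simplified message; not the driver's real layout.
#[repr(C)]
struct Message {
    version: u16,
    size: u16,
    sigs: Sigs,
}

/// Allocates first, then writes each field through a raw pointer, so the code
/// never names a complete `Message` value that would need a stack temporary.
fn build_on_heap(version: u16, sigs: &Sigs) -> Box<Message> {
    let mut slot: Box<MaybeUninit<Message>> = Box::<Message>::new_uninit();
    let p = slot.as_mut_ptr();
    // SAFETY: `p` points to a live, aligned allocation sized for `Message`.
    // `sigs` cannot overlap it because the allocation was created just above.
    // Every field is written before `assume_init()` is called.
    unsafe {
        addr_of_mut!((*p).version).write(version);
        addr_of_mut!((*p).size).write(std::mem::size_of::<Message>() as u16);
        // Copy the large blob straight from its source into the heap slot.
        std::ptr::copy_nonoverlapping(sigs, addr_of_mut!((*p).sigs), 1);
        slot.assume_init()
    }
}
```

The large signature blob is copied from its existing location directly into
the allocation, which is the property the question above is asking about.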
> +                    version: hal::fsp_hal(args.chipset).ok_or(ENOTSUPP)?.cot_version(),
> +                    size: u16::try_from(core::mem::size_of::<NvdmPayloadCot>())
> +                        .map_err(|_| EINVAL)?,
> +                    gsp_fmc_sysmem_offset: fmc_addr,
> +                    frts_sysmem_offset: 0,
> +                    frts_sysmem_size: 0,
> +                    frts_vidmem_offset: frts_offset,
> +                    frts_vidmem_size: frts_size,
[Severity: High]
Will these native integer types be correctly interpreted by the little-endian
RISC-V FSP/GSP processors if the host architecture is big-endian?
On a big-endian host the struct fields are stored in big-endian byte order, so
won't the little-endian hardware read them with the bytes reversed, resulting
in incorrect sizes, versions, and DMA addresses?
Should explicit little-endian types or byte-swapping (like .to_le()) be used
here to avoid breaking driver initialization on big-endian hosts?
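A minimal sketch of the .to_le() option, written as plain userspace Rust
rather than kernel code. `Payload`, `new_le`, and `wire_bytes` are hypothetical
names, and the field set is a reduced illustration rather than the real
NvdmPayloadCot layout:

```rust
/// Hypothetical, reduced payload; the real structure has more fields.
#[repr(C)]
struct Payload {
    version: u16,
    size: u16,
    frts_vidmem_size: u32,
    gsp_fmc_sysmem_offset: u64,
}

impl Payload {
    /// Stores every field in little-endian byte order on any host.
    /// `to_le()` is a no-op on little-endian hosts and a byte swap on
    /// big-endian ones.
    fn new_le(version: u16, frts_vidmem_size: u32, fmc_addr: u64) -> Self {
        Payload {
            version: version.to_le(),
            size: (core::mem::size_of::<Payload>() as u16).to_le(),
            frts_vidmem_size: frts_vidmem_size.to_le(),
            gsp_fmc_sysmem_offset: fmc_addr.to_le(),
        }
    }

    /// Returns the bytes in memory order, i.e. what a device reading this
    /// structure byte by byte would observe.
    fn wire_bytes(&self) -> [u8; 16] {
        let mut out = [0u8; 16];
        out[0..2].copy_from_slice(&self.version.to_ne_bytes());
        out[2..4].copy_from_slice(&self.size.to_ne_bytes());
        out[4..8].copy_from_slice(&self.frts_vidmem_size.to_ne_bytes());
        out[8..16].copy_from_slice(&self.gsp_fmc_sysmem_offset.to_ne_bytes());
        out
    }
}
```

With this shape the in-memory bytes are the same on both kinds of host, at the
cost that the fields no longer hold native values and must be read back with
from_le().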
> +                    sigs: *self.fsp_fw.fmc_sigs,
> +                    gsp_boot_args_sysmem_offset: fmc_boot_params_addr,
> +                },
> +            },
> +            GFP_KERNEL,
> +        )?;
> +
> +        self.send_sync_fsp(dev, bar, &*msg)?;
> +
> +        dev_dbg!(dev, "FSP Chain of Trust completed successfully\n");
> +        Ok(())
> +    }
[ ... ]
> diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
> index 0c54e8bf4bb3a..558b37863f00a 100644
> --- a/drivers/gpu/nova-core/gsp/fw.rs
> +++ b/drivers/gpu/nova-core/gsp/fw.rs
> @@ -934,3 +934,67 @@ fn new(cmdq: &Cmdq) -> impl Init<Self> + '_ {
>          })
>      }
>  }
> +
> +#[repr(u32)]
> +pub(crate) enum GspDmaTarget {
> +    #[expect(dead_code)]
> +    LocalFb = bindings::GSP_DMA_TARGET_GSP_DMA_TARGET_LOCAL_FB,
> +    CoherentSystem = bindings::GSP_DMA_TARGET_GSP_DMA_TARGET_COHERENT_SYSTEM,
> +    NoncoherentSystem = bindings::GSP_DMA_TARGET_GSP_DMA_TARGET_NONCOHERENT_SYSTEM,
> +}
> +
> +type GspAcrBootGspRmParams = bindings::GSP_ACR_BOOT_GSP_RM_PARAMS;
> +
> +impl GspAcrBootGspRmParams {
> +    fn new(target: GspDmaTarget, wpr_meta_addr: u64) -> impl Init<Self> {
> +        #[allow(non_snake_case)]
> +        let params = init!(Self {
> +            target: target as u32,
> +            gspRmDescSize: num::usize_into_u32::<{ size_of::<GspFwWprMeta>() }>(),
> +            gspRmDescOffset: wpr_meta_addr,
> +            bIsGspRmBoot: 1,
[Severity: High]
Similarly to NvdmPayloadCot above, will these native integer types be
correctly interpreted by little-endian hardware on big-endian hosts?
Does this struct require explicit little-endian types or byte-swapping to
avoid accessing invalid physical memory on big-endian architectures?
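A sketch of the "explicit little-endian types" option, again as userspace Rust
under stated assumptions. `Le32`, `Le64`, `BootParams`, and `example_params`
are hypothetical names introduced here; they are not kernel or bindgen types,
and the field names only loosely mirror the ones in the patch:

```rust
/// A `u32` that is always stored in little-endian byte order.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(transparent)]
struct Le32(u32);

impl Le32 {
    fn new(value: u32) -> Self {
        Le32(value.to_le())
    }
    /// Converts back to a native-endian value for use on the host.
    fn get(self) -> u32 {
        u32::from_le(self.0)
    }
    /// Bytes in memory order, as a device would read them.
    fn raw_bytes(self) -> [u8; 4] {
        self.0.to_ne_bytes()
    }
}

/// A `u64` that is always stored in little-endian byte order.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(transparent)]
struct Le64(u64);

impl Le64 {
    fn new(value: u64) -> Self {
        Le64(value.to_le())
    }
    fn get(self) -> u64 {
        u64::from_le(self.0)
    }
    fn raw_bytes(self) -> [u8; 8] {
        self.0.to_ne_bytes()
    }
}

/// Hypothetical, reduced parameter block whose field types make it
/// impossible to store a native-endian integer by accident.
#[repr(C)]
struct BootParams {
    target: Le32,
    desc_size: Le32,
    desc_offset: Le64,
}

fn example_params(wpr_meta_addr: u64) -> BootParams {
    BootParams {
        target: Le32::new(1),
        desc_size: Le32::new(256),
        desc_offset: Le64::new(wpr_meta_addr),
    }
}
```

Compared with calling .to_le() at each assignment, wrapper types move the
conversion into the type system, so a forgotten conversion becomes a compile
error rather than a runtime fault on big-endian hosts.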
> +            wprCarveoutOffset: 0,
> +            wprCarveoutSize: 0,
> +            __bindgen_padding_0: Default::default(),
> +        });
> +
> +        params
> +    }
> +}
[ ... ]
> diff --git a/drivers/gpu/nova-core/gsp/hal/gh100.rs b/drivers/gpu/nova-core/gsp/hal/gh100.rs
> index b25970dd4561d..f41f3fea15ff2 100644
> --- a/drivers/gpu/nova-core/gsp/hal/gh100.rs
> +++ b/drivers/gpu/nova-core/gsp/hal/gh100.rs
[ ... ]
> @@ -39,17 +42,27 @@ impl GspHal for Gh100 {
>      /// the GSP boot internally - no manual GSP reset/boot is needed.
>      fn boot<'a>(
>          &self,
> -        _gsp: &'a Gsp,
> +        gsp: &'a Gsp,
>          dev: &'a device::Device<device::Bound>,
>          bar: &'a Bar0,
>          chipset: Chipset,
> -        _fb_layout: &FbLayout,
> -        _wpr_meta: &Coherent<GspFwWprMeta>,
> +        fb_layout: &FbLayout,
> +        wpr_meta: &Coherent<GspFwWprMeta>,
>          _gsp_falcon: &'a Falcon<GspEngine>,
>          _sec2_falcon: &'a Falcon<Sec2>,
>      ) -> Result<BootUnloadGuard<'a>> {
>          let fsp_fw = FspFirmware::new(dev, chipset, FIRMWARE_VERSION)?;
> -        let _fsp = Fsp::wait_secure_boot(dev, bar, chipset, fsp_fw)?;
> +        let mut fsp = Fsp::wait_secure_boot(dev, bar, chipset, fsp_fw)?;
> +
> +        let args = FmcBootArgs::new(
> +            dev,
> +            chipset,
> +            wpr_meta.dma_handle(),
> +            gsp.libos.dma_handle(),
> +            false,
> +        )?;
> +
> +        fsp.boot_fmc(dev, bar, fb_layout, &args)?;
>
>          Err(ENOTSUPP)
[Severity: High]
Does returning an error here immediately drop the args variable, thereby
freeing its DMA-coherent memory before the FSP finishes reading it?
Since the FSP is actively booting asynchronously, wouldn't this cause a
hardware use-after-free of the boot parameters, leading to IOMMU faults or
undefined behavior?
Note: I see this is fixed in a later commit 'gpu: nova-core: gsp: enable FSP
boot path' which changes the return to Ok(unload_guard) and properly handles
the required hardware delays.
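The drop-on-return behavior in question can be modeled in a few lines of
userspace Rust. This is only an illustration of the ownership pattern:
`BootArgs`, `BootGuard`, `boot_then_fail`, and `boot_with_guard` are
hypothetical names, the "DMA buffer" is a flag, and whether the later commit
keeps the arguments alive in exactly this way is not shown in this patch:

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical stand-in for a DMA-coherent allocation. Dropping it "frees"
/// the buffer, which is recorded in a shared flag so callers can observe it.
struct BootArgs {
    freed: Rc<Cell<bool>>,
}

impl Drop for BootArgs {
    fn drop(&mut self) {
        self.freed.set(true);
    }
}

/// Hypothetical guard that owns the boot arguments on behalf of the caller.
struct BootGuard {
    _args: BootArgs,
}

/// Mirrors the flagged shape: the arguments are a local, so they are dropped
/// as soon as the function returns, even if a device still reads the buffer.
fn boot_then_fail(freed: Rc<Cell<bool>>) -> Result<BootGuard, &'static str> {
    let _args = BootArgs { freed };
    // ... the device would be handed the buffer address here ...
    Err("not supported")
}

/// Moves the arguments into the returned guard, so the buffer stays
/// allocated until the caller drops the guard.
fn boot_with_guard(freed: Rc<Cell<bool>>) -> Result<BootGuard, &'static str> {
    let args = BootArgs { freed };
    // ... the device would be handed the buffer address here ...
    Ok(BootGuard { _args: args })
}
```

In the first function the buffer is released on the error return; in the
second its lifetime is tied to the guard, which the caller can hold until the
device is known to be done with it.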
>      }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260602032111.224790-1-jhubbard@nvidia.com?part=19