Re: [PATCH v12 15/22] gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure

NVIDIA GPU driver infrastructure
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "Danilo Krummrich" <dakr@kernel.org>,
	"Alexandre Courbot" <acourbot@nvidia.com>,
	"John Hubbard" <jhubbard@nvidia.com>
Cc: "Simona Vetter" <simona@ffwll.ch>,
	"Benno Lossin" <lossin@kernel.org>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Trevor Gross" <tmgross@umich.edu>,
	"Eliot Courtney" <ecourtney@nvidia.com>,
	"Björn Roy Baron" <bjorn3_gh@protonmail.com>,
	"Timur Tabi" <ttabi@nvidia.com>,
	"Shashank Sharma" <shashanks@nvidia.com>,
	"Zhi Wang" <zhiw@nvidia.com>, "Boqun Feng" <boqun.feng@gmail.com>,
	"Alistair Popple" <apopple@nvidia.com>,
	"Gary Guo" <gary@garyguo.net>,
	"Andreas Hindborg" <a.hindborg@kernel.org>,
	"Alex Gaynor" <alex.gaynor@gmail.com>,
	"Alice Ryhl" <aliceryhl@google.com>,
	"Alexandre Courbot" <acourbot@nvidia.com>,
	"Miguel Ojeda" <ojeda@kernel.org>,
	"David Airlie" <airlied@gmail.com>,
	nova-gpu@lists.linux.dev, LKML <linux-kernel@vger.kernel.org>,
	"John Hubbard" <jhubbard@nvidia.com>,
	"Danilo Krummrich" <dakr@kernel.org>
Subject: Re: [PATCH v12 15/22] gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure
Date: Tue, 02 Jun 2026 03:33:57 +0000	[thread overview]
Message-ID: <20260602033358.5F2B71F00893@smtp.kernel.org> (raw)
In-Reply-To: <20260602032111.224790-16-jhubbard@nvidia.com>

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
- [High] Race condition due to incorrect order of writing queue pointers in `send_msg`.
- [High] Silent failure when sending a 4-byte message.
- [High] Unchecked integer addition in `poll_msgq` can cause a kernel panic on untrusted MMIO reads.
--

commit 8a59b49984011320aa631b20a05550242e2baef2
Author: John Hubbard <jhubbard@nvidia.com>

gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure

This commit adds the foundation security processor (FSP) message
infrastructure for the nova-core GPU driver. It implements queue
registers and low-level helpers to communicate with the FSP falcon's EMEM.

> diff --git a/drivers/gpu/nova-core/falcon/fsp.rs b/drivers/gpu/nova-core/falcon/fsp.rs
> index 6b057d958115a..0ec1c55213bc8 100644
> --- a/drivers/gpu/nova-core/falcon/fsp.rs
> +++ b/drivers/gpu/nova-core/falcon/fsp.rs
[ ... ]
> @@ -145,4 +143,63 @@ fn read_emem(&mut self, bar: &Bar0, offset: u32, data: &mut [u8]) -> Result {
>  
>          Ok(())
>      }
> +
> +    /// Poll FSP for incoming data.
> +    ///
> +    /// Returns the size of available data in bytes, or 0 if no data is available.
> +    ///
> +    /// The FSP message queue is not circular. Pointers are reset to 0 after each
> +    /// message exchange, so `tail >= head` is always true when data is present.
> +    #[expect(dead_code)]
> +    pub(crate) fn poll_msgq(&self, bar: &Bar0) -> u32 {
> +        let head = bar.read(regs::NV_PFSP_MSGQ_HEAD).address();
> +        let tail = bar.read(regs::NV_PFSP_MSGQ_TAIL).address();
> +
> +        if head == tail {
> +            return 0;
> +        }
> +
> +        // TAIL points at last DWORD written, so add 4 to get total size
> +        tail.saturating_sub(head) + 4
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[Severity: High]
Can this unchecked addition panic if we encounter a PCIe link failure?

If `head` reads normally as 0 but a surprise removal occurs before reading
`tail`, `tail` could read as 0xFFFFFFFF. The saturating subtraction would
leave 0xFFFFFFFF, and the subsequent `+ 4` will overflow, causing a panic in
debug builds where CONFIG_RUST_OVERFLOW_CHECKS=y is set.

Should this use `saturating_add(4)` or another form of checked math to handle
untrusted hardware inputs safely?

> +    }
> +
> +    /// Writes `packet` to FSP EMEM and updates the queue pointers to notify FSP.
> +    ///
> +    /// Returns `EINVAL` if `packet` is empty or its length is not 4-byte aligned.
> +    #[expect(dead_code)]
> +    pub(crate) fn send_msg(&mut self, bar: &Bar0, packet: &[u8]) -> Result {
> +        if packet.is_empty() {
> +            return Err(EINVAL);
> +        }
> +
> +        // Write message to EMEM at offset 0 (validates 4-byte alignment)
> +        self.write_emem(bar, 0, packet)?;
> +
> +        // Update queue pointers. TAIL points at the last DWORD written.
> +        let tail_offset = u32::try_from(packet.len() - 4).map_err(|_| EINVAL)?;
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[Severity: High]
Does this create a silent failure if a 4-byte message is sent?

If `packet.len()` is 4, `tail_offset` is calculated as 0. Both NV_PFSP_QUEUE_HEAD
and NV_PFSP_QUEUE_TAIL will then be written as 0. Because the FSP protocol treats
`HEAD == TAIL` as an empty queue, the FSP will ignore the message while the
function still returns `Ok(())`.

Should this enforce a minimum packet length of 8 bytes or return an error
for 4-byte messages?

> +        bar.write_reg(regs::NV_PFSP_QUEUE_TAIL::zeroed().with_address(tail_offset));
> +        bar.write_reg(regs::NV_PFSP_QUEUE_HEAD::zeroed().with_address(0));
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[Severity: High]
Is there a race condition with the FSP hardware due to the order of these
writes?

By writing NV_PFSP_QUEUE_TAIL before NV_PFSP_QUEUE_HEAD, if QUEUE_HEAD was
non-zero from a prior transaction, we momentarily create a state where
`TAIL != HEAD`. The FSP may wake up immediately and read from the old, incorrect
QUEUE_HEAD offset before the driver can write 0 to QUEUE_HEAD.

Should NV_PFSP_QUEUE_HEAD be initialized to 0 before writing the
NV_PFSP_QUEUE_TAIL doorbell?

> +
> +        Ok(())
> +    }

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260602032111.224790-1-jhubbard@nvidia.com?part=15

next prev parent reply	other threads:[~2026-06-02  3:33 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-02  3:20 [PATCH v12 00/22] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
2026-06-02  3:20 ` [PATCH v12 01/22] gpu: nova-core: set DMA mask width based on GPU architecture John Hubbard
2026-06-02  6:40   ` Eliot Courtney
2026-06-02  3:20 ` [PATCH v12 02/22] gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror John Hubbard
2026-06-02  3:20 ` [PATCH v12 03/22] gpu: nova-core: Blackwell: compute PMU-reserved framebuffer size John Hubbard
2026-06-02  3:20 ` [PATCH v12 04/22] gpu: nova-core: Hopper/Blackwell: larger non-WPR heap John Hubbard
2026-06-02  3:20 ` [PATCH v12 05/22] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap John Hubbard
2026-06-02  3:20 ` [PATCH v12 06/22] gpu: nova-core: Blackwell: use correct sysmem flush registers John Hubbard
2026-06-02  3:30   ` sashiko-bot
2026-06-02  8:00     ` Alexandre Courbot
2026-06-02  7:12   ` Eliot Courtney
2026-06-02  8:26     ` Alexandre Courbot
2026-06-02  3:20 ` [PATCH v12 07/22] gpu: nova-core: don't assume 64-bit firmware images John Hubbard
2026-06-02  3:20 ` [PATCH v12 08/22] gpu: nova-core: add support for 32-bit " John Hubbard
2026-06-02  3:20 ` [PATCH v12 09/22] gpu: nova-core: add auto-detection of 32-bit, 64-bit " John Hubbard
2026-06-02  3:20 ` [PATCH v12 10/22] gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub John Hubbard
2026-06-02  6:50   ` Eliot Courtney
2026-06-02  3:20 ` [PATCH v12 11/22] gpu: nova-core: Hopper/Blackwell: add FMC firmware image John Hubbard
2026-06-02  7:18   ` Eliot Courtney
2026-06-02  3:21 ` [PATCH v12 12/22] gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting John Hubbard
2026-06-02  7:56   ` Eliot Courtney
2026-06-02  8:22     ` Alexandre Courbot
2026-06-02  3:21 ` [PATCH v12 13/22] gpu: nova-core: Hopper/Blackwell: add FMC signature extraction John Hubbard
2026-06-02  3:32   ` sashiko-bot
2026-06-02  7:56     ` Alexandre Courbot
2026-06-02  8:11   ` Eliot Courtney
2026-06-02  8:28     ` Alexandre Courbot
2026-06-03  0:04   ` Timur Tabi
2026-06-03  0:20     ` Alexandre Courbot
2026-06-03  3:09       ` Timur Tabi
2026-06-03  3:53         ` John Hubbard
2026-06-02  3:21 ` [PATCH v12 14/22] gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations John Hubbard
2026-06-02 11:42   ` Eliot Courtney
2026-06-02 14:55     ` Alexandre Courbot
2026-06-02 15:02   ` Alexandre Courbot
2026-06-02  3:21 ` [PATCH v12 15/22] gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure John Hubbard
2026-06-02  3:33   ` sashiko-bot [this message]
2026-06-03  1:14     ` Alexandre Courbot
2026-06-03  1:41       ` Eliot Courtney
2026-06-02 12:21   ` Eliot Courtney
2026-06-03  1:34     ` Alexandre Courbot
2026-06-03  4:49       ` Eliot Courtney
2026-06-03  5:00         ` Alexandre Courbot
2026-06-03  1:00   ` Alexandre Courbot
2026-06-02  3:21 ` [PATCH v12 16/22] gpu: nova-core: add MCTP/NVDM protocol types for firmware communication John Hubbard
2026-06-02  5:36   ` sashiko-bot
2026-06-03  2:41     ` Alexandre Courbot
2026-06-02 12:53   ` Eliot Courtney
2026-06-02  3:21 ` [PATCH v12 17/22] gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging John Hubbard
2026-06-02  3:35   ` sashiko-bot
2026-06-02  3:21 ` [PATCH v12 18/22] gpu: nova-core: Hopper/Blackwell: select FSP Chain of Trust version John Hubbard
2026-06-02 12:55   ` Eliot Courtney
2026-06-02  3:21 ` [PATCH v12 19/22] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot John Hubbard
2026-06-02  3:40   ` sashiko-bot
2026-06-03  5:23     ` Alexandre Courbot
2026-06-03  5:19   ` Alexandre Courbot
2026-06-02  3:21 ` [PATCH v12 20/22] gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling John Hubbard
2026-06-02  3:38   ` sashiko-bot
2026-06-03  5:45   ` Alexandre Courbot
2026-06-02  3:21 ` [PATCH v12 21/22] gpu: nova-core: add non-sec2 unload path John Hubbard
2026-06-02  3:21 ` [PATCH v12 22/22] gpu: nova-core: gsp: enable FSP boot path John Hubbard
2026-06-02  3:38   ` sashiko-bot
2026-06-02 12:38 ` [PATCH v12 00/22] gpu: nova-core: firmware: Hopper/Blackwell support Danilo Krummrich
2026-06-02 13:37 ` Alexandre Courbot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260602033358.5F2B71F00893@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=a.hindborg@kernel.org \
    --cc=acourbot@nvidia.com \
    --cc=airlied@gmail.com \
    --cc=alex.gaynor@gmail.com \
    --cc=aliceryhl@google.com \
    --cc=apopple@nvidia.com \
    --cc=bhelgaas@google.com \
    --cc=bjorn3_gh@protonmail.com \
    --cc=boqun.feng@gmail.com \
    --cc=dakr@kernel.org \
    --cc=ecourtney@nvidia.com \
    --cc=gary@garyguo.net \
    --cc=jhubbard@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lossin@kernel.org \
    --cc=nova-gpu@lists.linux.dev \
    --cc=ojeda@kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    --cc=shashanks@nvidia.com \
    --cc=simona@ffwll.ch \
    --cc=tmgross@umich.edu \
    --cc=ttabi@nvidia.com \
    --cc=zhiw@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox