Re: [PATCH v12 15/22] gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure

All of lore.kernel.org
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "Danilo Krummrich" <dakr@kernel.org>,
	"Alexandre Courbot" <acourbot@nvidia.com>,
	"John Hubbard" <jhubbard@nvidia.com>
Cc: "Simona Vetter" <simona@ffwll.ch>,
	"Benno Lossin" <lossin@kernel.org>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Trevor Gross" <tmgross@umich.edu>,
	"Eliot Courtney" <ecourtney@nvidia.com>,
	"Björn Roy Baron" <bjorn3_gh@protonmail.com>,
	"Timur Tabi" <ttabi@nvidia.com>,
	"Shashank Sharma" <shashanks@nvidia.com>,
	"Zhi Wang" <zhiw@nvidia.com>, "Boqun Feng" <boqun.feng@gmail.com>,
	"Alistair Popple" <apopple@nvidia.com>,
	"Gary Guo" <gary@garyguo.net>,
	"Andreas Hindborg" <a.hindborg@kernel.org>,
	"Alex Gaynor" <alex.gaynor@gmail.com>,
	"Alice Ryhl" <aliceryhl@google.com>,
	"Alexandre Courbot" <acourbot@nvidia.com>,
	"Miguel Ojeda" <ojeda@kernel.org>,
	"David Airlie" <airlied@gmail.com>,
	nova-gpu@lists.linux.dev, LKML <linux-kernel@vger.kernel.org>,
	"John Hubbard" <jhubbard@nvidia.com>,
	"Danilo Krummrich" <dakr@kernel.org>
Subject: Re: [PATCH v12 15/22] gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure
Date: Tue, 02 Jun 2026 03:33:57 +0000	[thread overview]
Message-ID: <20260602033358.5F2B71F00893@smtp.kernel.org> (raw)
In-Reply-To: <20260602032111.224790-16-jhubbard@nvidia.com>

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
- [High] Race condition due to incorrect order of writing queue pointers in `send_msg`.
- [High] Silent failure when sending a 4-byte message.
- [High] Unchecked integer addition in `poll_msgq` can cause a kernel panic on untrusted MMIO reads.
--

commit 8a59b49984011320aa631b20a05550242e2baef2
Author: John Hubbard <jhubbard@nvidia.com>

gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure

This commit adds the foundation security processor (FSP) message
infrastructure for the nova-core GPU driver. It implements queue
registers and low-level helpers to communicate with the FSP falcon's EMEM.

> diff --git a/drivers/gpu/nova-core/falcon/fsp.rs b/drivers/gpu/nova-core/falcon/fsp.rs
> index 6b057d958115a..0ec1c55213bc8 100644
> --- a/drivers/gpu/nova-core/falcon/fsp.rs
> +++ b/drivers/gpu/nova-core/falcon/fsp.rs
[ ... ]
> @@ -145,4 +143,63 @@ fn read_emem(&mut self, bar: &Bar0, offset: u32, data: &mut [u8]) -> Result {
>  
>          Ok(())
>      }
> +
> +    /// Poll FSP for incoming data.
> +    ///
> +    /// Returns the size of available data in bytes, or 0 if no data is available.
> +    ///
> +    /// The FSP message queue is not circular. Pointers are reset to 0 after each
> +    /// message exchange, so `tail >= head` is always true when data is present.
> +    #[expect(dead_code)]
> +    pub(crate) fn poll_msgq(&self, bar: &Bar0) -> u32 {
> +        let head = bar.read(regs::NV_PFSP_MSGQ_HEAD).address();
> +        let tail = bar.read(regs::NV_PFSP_MSGQ_TAIL).address();
> +
> +        if head == tail {
> +            return 0;
> +        }
> +
> +        // TAIL points at last DWORD written, so add 4 to get total size
> +        tail.saturating_sub(head) + 4
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[Severity: High]
Can this unchecked addition panic if we encounter a PCIe link failure?

If `head` reads normally as 0 but a surprise removal occurs before reading
`tail`, `tail` could read as 0xFFFFFFFF. The saturating subtraction would
leave 0xFFFFFFFF, and the subsequent `+ 4` will overflow, causing a panic in
debug builds where CONFIG_RUST_OVERFLOW_CHECKS=y is set.

Should this use `saturating_add(4)` or another form of checked math to handle
untrusted hardware inputs safely?

> +    }
> +
> +    /// Writes `packet` to FSP EMEM and updates the queue pointers to notify FSP.
> +    ///
> +    /// Returns `EINVAL` if `packet` is empty or its length is not 4-byte aligned.
> +    #[expect(dead_code)]
> +    pub(crate) fn send_msg(&mut self, bar: &Bar0, packet: &[u8]) -> Result {
> +        if packet.is_empty() {
> +            return Err(EINVAL);
> +        }
> +
> +        // Write message to EMEM at offset 0 (validates 4-byte alignment)
> +        self.write_emem(bar, 0, packet)?;
> +
> +        // Update queue pointers. TAIL points at the last DWORD written.
> +        let tail_offset = u32::try_from(packet.len() - 4).map_err(|_| EINVAL)?;
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[Severity: High]
Does this create a silent failure if a 4-byte message is sent?

If `packet.len()` is 4, `tail_offset` is calculated as 0. Both NV_PFSP_QUEUE_HEAD
and NV_PFSP_QUEUE_TAIL will then be written as 0. Because the FSP protocol treats
`HEAD == TAIL` as an empty queue, the FSP will ignore the message while the
function still returns `Ok(())`.

Should this enforce a minimum packet length of 8 bytes or return an error
for 4-byte messages?

> +        bar.write_reg(regs::NV_PFSP_QUEUE_TAIL::zeroed().with_address(tail_offset));
> +        bar.write_reg(regs::NV_PFSP_QUEUE_HEAD::zeroed().with_address(0));
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[Severity: High]
Is there a race condition with the FSP hardware due to the order of these
writes?

By writing NV_PFSP_QUEUE_TAIL before NV_PFSP_QUEUE_HEAD, if QUEUE_HEAD was
non-zero from a prior transaction, we momentarily create a state where
`TAIL != HEAD`. The FSP may wake up immediately and read from the old, incorrect
QUEUE_HEAD offset before the driver can write 0 to QUEUE_HEAD.

Should NV_PFSP_QUEUE_HEAD be initialized to 0 before writing the
NV_PFSP_QUEUE_TAIL doorbell?

> +
> +        Ok(())
> +    }

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260602032111.224790-1-jhubbard@nvidia.com?part=15

next prev parent reply	other threads:[~2026-06-02  3:33 UTC|newest]

Thread overview: 65+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-02  3:20 [PATCH v12 00/22] gpu: nova-core: firmware: Hopper/Blackwell support John Hubbard
2026-06-02  3:20 ` [PATCH v12 01/22] gpu: nova-core: set DMA mask width based on GPU architecture John Hubbard
2026-06-02  6:40   ` Eliot Courtney
2026-06-02  3:20 ` [PATCH v12 02/22] gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror John Hubbard
2026-06-02  3:20 ` [PATCH v12 03/22] gpu: nova-core: Blackwell: compute PMU-reserved framebuffer size John Hubbard
2026-06-02  3:20 ` [PATCH v12 04/22] gpu: nova-core: Hopper/Blackwell: larger non-WPR heap John Hubbard
2026-06-02  3:20 ` [PATCH v12 05/22] gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap John Hubbard
2026-06-02  3:20 ` [PATCH v12 06/22] gpu: nova-core: Blackwell: use correct sysmem flush registers John Hubbard
2026-06-02  3:30   ` sashiko-bot
2026-06-02  8:00     ` Alexandre Courbot
2026-06-02  7:12   ` Eliot Courtney
2026-06-02  8:26     ` Alexandre Courbot
2026-06-02  3:20 ` [PATCH v12 07/22] gpu: nova-core: don't assume 64-bit firmware images John Hubbard
2026-06-02  3:20 ` [PATCH v12 08/22] gpu: nova-core: add support for 32-bit " John Hubbard
2026-06-02  3:20 ` [PATCH v12 09/22] gpu: nova-core: add auto-detection of 32-bit, 64-bit " John Hubbard
2026-06-02  3:20 ` [PATCH v12 10/22] gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub John Hubbard
2026-06-02  6:50   ` Eliot Courtney
2026-06-02  3:20 ` [PATCH v12 11/22] gpu: nova-core: Hopper/Blackwell: add FMC firmware image John Hubbard
2026-06-02  7:18   ` Eliot Courtney
2026-06-02  3:21 ` [PATCH v12 12/22] gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting John Hubbard
2026-06-02  7:56   ` Eliot Courtney
2026-06-02  8:22     ` Alexandre Courbot
2026-06-02  3:21 ` [PATCH v12 13/22] gpu: nova-core: Hopper/Blackwell: add FMC signature extraction John Hubbard
2026-06-02  3:32   ` sashiko-bot
2026-06-02  7:56     ` Alexandre Courbot
2026-06-02  8:11   ` Eliot Courtney
2026-06-02  8:28     ` Alexandre Courbot
2026-06-03  0:04   ` Timur Tabi
2026-06-03  0:20     ` Alexandre Courbot
2026-06-03  3:09       ` Timur Tabi
2026-06-03  3:53         ` John Hubbard
2026-06-03 16:52           ` Timur Tabi
2026-06-02  3:21 ` [PATCH v12 14/22] gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations John Hubbard
2026-06-02 11:42   ` Eliot Courtney
2026-06-02 14:55     ` Alexandre Courbot
2026-06-02 15:02   ` Alexandre Courbot
2026-06-02  3:21 ` [PATCH v12 15/22] gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure John Hubbard
2026-06-02  3:33   ` sashiko-bot [this message]
2026-06-03  1:14     ` Alexandre Courbot
2026-06-03  1:41       ` Eliot Courtney
2026-06-02 12:21   ` Eliot Courtney
2026-06-03  1:34     ` Alexandre Courbot
2026-06-03  4:49       ` Eliot Courtney
2026-06-03  5:00         ` Alexandre Courbot
2026-06-03  1:00   ` Alexandre Courbot
2026-06-02  3:21 ` [PATCH v12 16/22] gpu: nova-core: add MCTP/NVDM protocol types for firmware communication John Hubbard
2026-06-02  5:36   ` sashiko-bot
2026-06-03  2:41     ` Alexandre Courbot
2026-06-02 12:53   ` Eliot Courtney
2026-06-02  3:21 ` [PATCH v12 17/22] gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging John Hubbard
2026-06-02  3:35   ` sashiko-bot
2026-06-02  3:21 ` [PATCH v12 18/22] gpu: nova-core: Hopper/Blackwell: select FSP Chain of Trust version John Hubbard
2026-06-02 12:55   ` Eliot Courtney
2026-06-02  3:21 ` [PATCH v12 19/22] gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot John Hubbard
2026-06-02  3:40   ` sashiko-bot
2026-06-03  5:23     ` Alexandre Courbot
2026-06-03  5:19   ` Alexandre Courbot
2026-06-02  3:21 ` [PATCH v12 20/22] gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling John Hubbard
2026-06-02  3:38   ` sashiko-bot
2026-06-03  5:45   ` Alexandre Courbot
2026-06-02  3:21 ` [PATCH v12 21/22] gpu: nova-core: add non-sec2 unload path John Hubbard
2026-06-02  3:21 ` [PATCH v12 22/22] gpu: nova-core: gsp: enable FSP boot path John Hubbard
2026-06-02  3:38   ` sashiko-bot
2026-06-02 12:38 ` [PATCH v12 00/22] gpu: nova-core: firmware: Hopper/Blackwell support Danilo Krummrich
2026-06-02 13:37 ` Alexandre Courbot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260602033358.5F2B71F00893@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=a.hindborg@kernel.org \
    --cc=acourbot@nvidia.com \
    --cc=airlied@gmail.com \
    --cc=alex.gaynor@gmail.com \
    --cc=aliceryhl@google.com \
    --cc=apopple@nvidia.com \
    --cc=bhelgaas@google.com \
    --cc=bjorn3_gh@protonmail.com \
    --cc=boqun.feng@gmail.com \
    --cc=dakr@kernel.org \
    --cc=ecourtney@nvidia.com \
    --cc=gary@garyguo.net \
    --cc=jhubbard@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lossin@kernel.org \
    --cc=nova-gpu@lists.linux.dev \
    --cc=ojeda@kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    --cc=shashanks@nvidia.com \
    --cc=simona@ffwll.ch \
    --cc=tmgross@umich.edu \
    --cc=ttabi@nvidia.com \
    --cc=zhiw@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.