From: "Eliot Courtney" <ecourtney@nvidia.com>
To: "Alexandre Courbot" <acourbot@nvidia.com>,
"Danilo Krummrich" <dakr@kernel.org>,
"Alice Ryhl" <aliceryhl@google.com>,
"David Airlie" <airlied@gmail.com>,
"Simona Vetter" <simona@ffwll.ch>,
"Alistair Popple" <apopple@nvidia.com>
Cc: "John Hubbard" <jhubbard@nvidia.com>,
"Joel Fernandes" <joelagnelf@nvidia.com>,
"Timur Tabi" <ttabi@nvidia.com>, "Zhi Wang" <zhiw@nvidia.com>,
"Eliot Courtney" <ecourtney@nvidia.com>,
<rust-for-linux@vger.kernel.org>,
<dri-devel@lists.freedesktop.org>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] gpu: nova-core: gsp: fix undefined behavior in command queue code
Date: Thu, 19 Mar 2026 15:41:26 +0900 [thread overview]
Message-ID: <DH6JY86QZEYL.PKQ2W4ZA9UUE@nvidia.com> (raw)
In-Reply-To: <20260319-cmdq-ub-fix-v1-1-0f9f6e8f3ce3@nvidia.com>
On Thu Mar 19, 2026 at 2:36 PM JST, Alexandre Courbot wrote:
> `driver_read_area` and `driver_write_area` are internal methods that
> return slices containing the area of the command queue buffer that the
> driver has exclusive read of write access, respectively.
>
> While their returned value is correct and safe to use, internally they
> temporarily create a reference to the whole command-buffer slice,
> including GSP-owned regions. These regions can change without notice,
> and thus creating a slice to them is undefined behavior.
>
> Fix this by replacing the slice logic with pointer arithmetic and
> creating slices to valid regions only. It relies on unsafe code, but
> should be mostly replaced by `IoView` and `IoSlice` once they land.
>
> Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
> Suggested-by: Danilo Krummrich <dakr@kernel.org>
> Link: https://lore.kernel.org/all/DH47AVPEKN06.3BERUSJIB4M1R@kernel.org/
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> ---
> drivers/gpu/nova-core/gsp/cmdq.rs | 135 ++++++++++++++++++++++++++++----------
> 1 file changed, 100 insertions(+), 35 deletions(-)
>
> diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
> index d36a62ba1c60..4200e7986774 100644
> --- a/drivers/gpu/nova-core/gsp/cmdq.rs
> +++ b/drivers/gpu/nova-core/gsp/cmdq.rs
> @@ -251,38 +251,77 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
> /// As the message queue is a circular buffer, the region may be discontiguous in memory. In
> /// that case the second slice will have a non-zero length.
> fn driver_write_area(&mut self) -> (&mut [[u8; GSP_PAGE_SIZE]], &mut [[u8; GSP_PAGE_SIZE]]) {
> - let tx = self.cpu_write_ptr() as usize;
> - let rx = self.gsp_read_ptr() as usize;
> + let tx = num::u32_as_usize(self.cpu_write_ptr());
> + let rx = num::u32_as_usize(self.gsp_read_ptr());
> + // Number of pages between `tx` and the end of the command queue.
> + // PANIC: Per the invariant of `cpu_write_ptr`, `tx < MSGQ_NUM_PAGES`.
> + let after_tx_len = num::u32_as_usize(MSGQ_NUM_PAGES) - tx;
>
> + // Pointer to the start of the CPU message queue.
> + //
> // SAFETY:
> - // - The `CoherentAllocation` contains exactly one object.
> - // - We will only access the driver-owned part of the shared memory.
> - // - Per the safety statement of the function, no concurrent access will be performed.
> - let gsp_mem = &mut unsafe { self.0.as_slice_mut(0, 1) }.unwrap()[0];
> - // PANIC: per the invariant of `cpu_write_ptr`, `tx` is `< MSGQ_NUM_PAGES`.
> - let (before_tx, after_tx) = gsp_mem.cpuq.msgq.data.split_at_mut(tx);
> + // - `self.0` contains exactly one element.
> + // - `cpuq.msgq.data[0]` is within the bounds of that element.
> + let data = unsafe { &raw mut (*self.0.start_ptr_mut()).cpuq.msgq.data[0] };
>
> - // The area starting at `tx` and ending at `rx - 2` modulo MSGQ_NUM_PAGES, inclusive,
> - // belongs to the driver for writing.
> + // Safety/Panic comments to be referenced by the code below.
> + //
> + // SAFETY[1]:
> + // - `data` points to an array of `MSGQ_NUM_PAGES` elements.
> + // - The area starting at `tx` and ending at `rx - 2` modulo `MSGQ_NUM_PAGES`,
> + // inclusive, belongs to the driver for writing and is not accessed concurrently by
> + // the GSP.
> + // - `tx + after_tx_len` == `MSGQ_NUM_PAGES`.
> + //
> + // PANIC[1]:
> + // - Per the invariant of `cpu_write_ptr`, `tx < MSGQ_NUM_PAGES`.
> + // - Per the invariant of `gsp_read_ptr`, `rx < MSGQ_NUM_PAGES`.
>
> if rx == 0 {
> - // Since `rx` is zero, leave an empty slot at end of the buffer.
> - let last = after_tx.len() - 1;
> - (&mut after_tx[..last], &mut [])
> + (
> + // SAFETY: See SAFETY[1].
> + unsafe {
> + core::slice::from_raw_parts_mut(
> + data.add(tx),
> + // Since `rx` is zero, leave an empty slot at end of the buffer.
> + // PANIC: See PANIC[1].
> + after_tx_len - 1,
> + )
> + },
> + &mut [],
> + )
> } else if rx <= tx {
> // The area is discontiguous and we leave an empty slot before `rx`.
> - // PANIC:
> - // - The index `rx - 1` is non-negative because `rx != 0` in this branch.
> - // - The index does not exceed `before_tx.len()` (which equals `tx`) because
> - // `rx <= tx` in this branch.
> - (after_tx, &mut before_tx[..(rx - 1)])
> + (
> + // SAFETY: See SAFETY[1].
> + unsafe { core::slice::from_raw_parts_mut(data.add(tx), after_tx_len) },
> + // SAFETY: See SAFETY[1].
> + unsafe {
> + core::slice::from_raw_parts_mut(
> + data,
> + // Leave one empty slot before `rx`.
> + // PANIC:
> + // - See PANIC[1].
> + // - `rx - 1` is non-negative because `rx != 0` in this branch.
> + rx - 1,
> + )
> + },
> + )
> } else {
> // The area is contiguous and we leave an empty slot before `rx`.
> - // PANIC:
> - // - The index `rx - tx - 1` is non-negative because `rx > tx` in this branch.
> - // - The index does not exceed `after_tx.len()` (which is `MSGQ_NUM_PAGES - tx`)
> - // because `rx < MSGQ_NUM_PAGES` by the `gsp_read_ptr` invariant.
> - (&mut after_tx[..(rx - tx - 1)], &mut [])
> + (
> + // SAFETY: See SAFETY[1].
> + unsafe {
> + core::slice::from_raw_parts_mut(
> + data.add(tx),
> + // PANIC:
> + // - See PANIC[1].
> + // - `rx - tx - 1` is non-negative because `rx > tx` in this branch.
> + rx - tx - 1,
> + )
> + },
> + &mut [],
> + )
> }
> }
>
> @@ -308,24 +347,50 @@ fn driver_write_area_size(&self) -> usize {
> let tx = self.gsp_write_ptr() as usize;
> let rx = self.cpu_read_ptr() as usize;
Should we use u32_as_usize here too?
Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
next prev parent reply other threads:[~2026-03-19 6:41 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-19 5:36 [PATCH] gpu: nova-core: gsp: fix undefined behavior in command queue code Alexandre Courbot
2026-03-19 6:41 ` Eliot Courtney [this message]
2026-03-19 8:24 ` Alexandre Courbot
2026-03-20 12:54 ` Danilo Krummrich
2026-03-23 2:58 ` Alexandre Courbot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=DH6JY86QZEYL.PKQ2W4ZA9UUE@nvidia.com \
--to=ecourtney@nvidia.com \
--cc=acourbot@nvidia.com \
--cc=airlied@gmail.com \
--cc=aliceryhl@google.com \
--cc=apopple@nvidia.com \
--cc=dakr@kernel.org \
--cc=dri-devel@lists.freedesktop.org \
--cc=jhubbard@nvidia.com \
--cc=joelagnelf@nvidia.com \
--cc=linux-kernel@vger.kernel.org \
--cc=rust-for-linux@vger.kernel.org \
--cc=simona@ffwll.ch \
--cc=ttabi@nvidia.com \
--cc=zhiw@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.