* [PATCH v4 0/2] gpu: nova-core: gsp: fix undefined behavior in command queue code
@ 2026-04-01 14:29 Alexandre Courbot
2026-04-01 14:29 ` [PATCH v4 1/2] gpu: nova-core: gsp: inline methods providing queue range invariants Alexandre Courbot
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Alexandre Courbot @ 2026-04-01 14:29 UTC (permalink / raw)
To: Danilo Krummrich, Gary Guo, Alice Ryhl, David Airlie,
Simona Vetter, Alistair Popple
Cc: John Hubbard, Joel Fernandes, Timur Tabi, Zhi Wang,
Eliot Courtney, rust-for-linux, dri-devel, linux-kernel,
Alexandre Courbot
This revision is mostly to fix the testbot build failures of v3. They
were due to the inability of the compiler optimizer to preserve the
invariant that `rx == 0` in `driver_write_area`. To preserve it, a
nested branch is used, and the methods providing the range invariants
are made inline defensively.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
Changes in v4:
- Make some methods providing the `ptr_project!` invariants inline.
- Use code paths that preserve the invariants `ptr_project!` depends on
more obviously to fix these testbot build failures:
- https://lore.kernel.org/all/202603280326.ucDKVaf2-lkp@intel.com/
- https://lore.kernel.org/all/202603281331.1ESuqgfz-lkp@intel.com/
- Improve safety comment when creating the mutable slices (thanks Danilo!).
- Link to v3: https://patch.msgid.link/20260326-cmdq-ub-fix-v3-1-96af2148ca5c@nvidia.com
Changes in v3:
- Rebase on top of latest `drm-rust-next` (with `Coherent` patches).
- Use pointer projections. (thanks Gary!)
- Link to v2: https://patch.msgid.link/20260323-cmdq-ub-fix-v2-1-77d1213c3f7f@nvidia.com
Changes in v2:
- Use `u32_as_usize` consistently.
- Reduce the number of `unsafe` blocks by computing the end offset of
the returned slices and creating them at the end, in one step.
- Take advantage of the fact that both slices have the same start index
regardless of the branch chosen.
- Improve safety comments.
- Link to v1: https://patch.msgid.link/20260319-cmdq-ub-fix-v1-1-0f9f6e8f3ce3@nvidia.com
---
Alexandre Courbot (2):
gpu: nova-core: gsp: inline methods providing queue range invariants
gpu: nova-core: gsp: fix undefined behavior in command queue code
drivers/gpu/nova-core/gsp/cmdq.rs | 118 ++++++++++++++++++++++----------------
drivers/gpu/nova-core/gsp/fw.rs | 4 ++
2 files changed, 73 insertions(+), 49 deletions(-)
---
base-commit: 7c50d748b4a635bc39802ea3f6b120e66b1b9067
change-id: 20260319-cmdq-ub-fix-d57b09a745b9
Best regards,
--
Alexandre Courbot <acourbot@nvidia.com>
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v4 1/2] gpu: nova-core: gsp: inline methods providing queue range invariants
2026-04-01 14:29 [PATCH v4 0/2] gpu: nova-core: gsp: fix undefined behavior in command queue code Alexandre Courbot
@ 2026-04-01 14:29 ` Alexandre Courbot
2026-04-01 15:28 ` Gary Guo
2026-04-01 14:29 ` [PATCH v4 2/2] gpu: nova-core: gsp: fix undefined behavior in command queue code Alexandre Courbot
2026-04-03 17:37 ` [PATCH v4 0/2] " Danilo Krummrich
2 siblings, 1 reply; 9+ messages in thread
From: Alexandre Courbot @ 2026-04-01 14:29 UTC (permalink / raw)
To: Danilo Krummrich, Gary Guo, Alice Ryhl, David Airlie,
Simona Vetter, Alistair Popple
Cc: John Hubbard, Joel Fernandes, Timur Tabi, Zhi Wang,
Eliot Courtney, rust-for-linux, dri-devel, linux-kernel,
Alexandre Courbot
These methods hold a range invariant for the read/write pointers
(namely, that the pointers are always in the `0..MSGQ_NUM_PAGES` range).
The calling code depends on these invariants to build properly, so make
them `#[inline(always)]` to ensure that they are propagated where they
need to be.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
drivers/gpu/nova-core/gsp/cmdq.rs | 4 ++++
drivers/gpu/nova-core/gsp/fw.rs | 4 ++++
2 files changed, 8 insertions(+)
diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index 2224896ccc89..72e9b79619eb 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -384,6 +384,7 @@ fn allocate_command(&mut self, size: usize, timeout: Delta) -> Result<GspCommand
// # Invariants
//
// - The returned value is within `0..MSGQ_NUM_PAGES`.
+ #[inline(always)]
fn gsp_write_ptr(&self) -> u32 {
super::fw::gsp_mem::gsp_write_ptr(&self.0)
}
@@ -393,6 +394,7 @@ fn gsp_write_ptr(&self) -> u32 {
// # Invariants
//
// - The returned value is within `0..MSGQ_NUM_PAGES`.
+ #[inline(always)]
fn gsp_read_ptr(&self) -> u32 {
super::fw::gsp_mem::gsp_read_ptr(&self.0)
}
@@ -402,6 +404,7 @@ fn gsp_read_ptr(&self) -> u32 {
// # Invariants
//
// - The returned value is within `0..MSGQ_NUM_PAGES`.
+ #[inline(always)]
fn cpu_read_ptr(&self) -> u32 {
super::fw::gsp_mem::cpu_read_ptr(&self.0)
}
@@ -416,6 +419,7 @@ fn advance_cpu_read_ptr(&mut self, elem_count: u32) {
// # Invariants
//
// - The returned value is within `0..MSGQ_NUM_PAGES`.
+ #[inline(always)]
fn cpu_write_ptr(&self) -> u32 {
super::fw::gsp_mem::cpu_write_ptr(&self.0)
}
diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index 0c8a74f0e8ac..e2c0e9bf310e 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -58,14 +58,17 @@ pub(super) mod gsp_mem {
MSGQ_NUM_PAGES, //
};
+ #[inline(always)]
pub(in crate::gsp) fn gsp_write_ptr(qs: &Coherent<GspMem>) -> u32 {
dma_read!(qs, .gspq.tx.0.writePtr) % MSGQ_NUM_PAGES
}
+ #[inline(always)]
pub(in crate::gsp) fn gsp_read_ptr(qs: &Coherent<GspMem>) -> u32 {
dma_read!(qs, .gspq.rx.0.readPtr) % MSGQ_NUM_PAGES
}
+ #[inline(always)]
pub(in crate::gsp) fn cpu_read_ptr(qs: &Coherent<GspMem>) -> u32 {
dma_read!(qs, .cpuq.rx.0.readPtr) % MSGQ_NUM_PAGES
}
@@ -79,6 +82,7 @@ pub(in crate::gsp) fn advance_cpu_read_ptr(qs: &Coherent<GspMem>, count: u32) {
dma_write!(qs, .cpuq.rx.0.readPtr, rptr);
}
+ #[inline(always)]
pub(in crate::gsp) fn cpu_write_ptr(qs: &Coherent<GspMem>) -> u32 {
dma_read!(qs, .cpuq.tx.0.writePtr) % MSGQ_NUM_PAGES
}
--
2.53.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v4 2/2] gpu: nova-core: gsp: fix undefined behavior in command queue code
2026-04-01 14:29 [PATCH v4 0/2] gpu: nova-core: gsp: fix undefined behavior in command queue code Alexandre Courbot
2026-04-01 14:29 ` [PATCH v4 1/2] gpu: nova-core: gsp: inline methods providing queue range invariants Alexandre Courbot
@ 2026-04-01 14:29 ` Alexandre Courbot
2026-04-03 23:47 ` John Hubbard
2026-04-03 17:37 ` [PATCH v4 0/2] " Danilo Krummrich
2 siblings, 1 reply; 9+ messages in thread
From: Alexandre Courbot @ 2026-04-01 14:29 UTC (permalink / raw)
To: Danilo Krummrich, Gary Guo, Alice Ryhl, David Airlie,
Simona Vetter, Alistair Popple
Cc: John Hubbard, Joel Fernandes, Timur Tabi, Zhi Wang,
Eliot Courtney, rust-for-linux, dri-devel, linux-kernel,
Alexandre Courbot
`driver_read_area` and `driver_write_area` are internal methods that
return slices containing the area of the command queue buffer that the
driver has exclusive read or write access, respectively.
While their returned value is correct and safe to use, internally they
temporarily create a reference to the whole command-buffer slice,
including GSP-owned regions. These regions can change without notice,
and thus creating a slice to them, even if never accessed, is undefined
behavior.
Fix this by rewriting these methods to use pointer projections in order
to create slices to valid regions only. It should eventually be replaced
by `IoView` and `IoSlice` once they land.
Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
Reported-by: Danilo Krummrich <dakr@kernel.org>
Closes: https://lore.kernel.org/all/DH47AVPEKN06.3BERUSJIB4M1R@kernel.org/
Reviewed-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
drivers/gpu/nova-core/gsp/cmdq.rs | 114 ++++++++++++++++++++++----------------
1 file changed, 65 insertions(+), 49 deletions(-)
diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index 72e9b79619eb..485d0c0f2a4b 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -17,6 +17,7 @@
},
new_mutex,
prelude::*,
+ ptr,
sync::{
aref::ARef,
Mutex, //
@@ -255,38 +256,52 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
/// As the message queue is a circular buffer, the region may be discontiguous in memory. In
/// that case the second slice will have a non-zero length.
fn driver_write_area(&mut self) -> (&mut [[u8; GSP_PAGE_SIZE]], &mut [[u8; GSP_PAGE_SIZE]]) {
- let tx = self.cpu_write_ptr() as usize;
- let rx = self.gsp_read_ptr() as usize;
+ let tx = num::u32_as_usize(self.cpu_write_ptr());
+ let rx = num::u32_as_usize(self.gsp_read_ptr());
+
+ // Command queue data.
+ let data = ptr::project!(mut self.0.as_mut_ptr(), .cpuq.msgq.data);
+
+ let (tail_slice, wrap_slice) = if rx == 0 {
+ // The write area is non-wrapping, and stops at the second-to-last entry of the command
+ // queue (to leave the last one empty).
+ (
+ ptr::project!(mut data, [tx..num::u32_as_usize(MSGQ_NUM_PAGES) - 1]),
+ ptr::project!(mut data, [..0]),
+ )
+ } else {
+ // Leave an empty slot before `rx`.
+ let end = rx - 1;
+
+ if rx <= tx {
+ // The write area wraps and continues until `end`.
+ (
+ ptr::project!(mut data, [tx..]),
+ ptr::project!(mut data, [..end]),
+ )
+ } else {
+ // The write area doesn't wrap and stops at `end`.
+ (
+ ptr::project!(mut data, [tx..end]),
+ ptr::project!(mut data, [..0]),
+ )
+ }
+ };
// SAFETY:
- // - We will only access the driver-owned part of the shared memory.
- // - Per the safety statement of the function, no concurrent access will be performed.
- let gsp_mem = unsafe { &mut *self.0.as_mut() };
- // PANIC: per the invariant of `cpu_write_ptr`, `tx` is `< MSGQ_NUM_PAGES`.
- let (before_tx, after_tx) = gsp_mem.cpuq.msgq.data.split_at_mut(tx);
-
- // The area starting at `tx` and ending at `rx - 2` modulo MSGQ_NUM_PAGES, inclusive,
- // belongs to the driver for writing.
-
- if rx == 0 {
- // Since `rx` is zero, leave an empty slot at end of the buffer.
- let last = after_tx.len() - 1;
- (&mut after_tx[..last], &mut [])
- } else if rx <= tx {
- // The area is discontiguous and we leave an empty slot before `rx`.
- // PANIC:
- // - The index `rx - 1` is non-negative because `rx != 0` in this branch.
- // - The index does not exceed `before_tx.len()` (which equals `tx`) because
- // `rx <= tx` in this branch.
- (after_tx, &mut before_tx[..(rx - 1)])
- } else {
- // The area is contiguous and we leave an empty slot before `rx`.
- // PANIC:
- // - The index `rx - tx - 1` is non-negative because `rx > tx` in this branch.
- // - The index does not exceed `after_tx.len()` (which is `MSGQ_NUM_PAGES - tx`)
- // because `rx < MSGQ_NUM_PAGES` by the `gsp_read_ptr` invariant.
- (&mut after_tx[..(rx - tx - 1)], &mut [])
- }
+ // - Since `data` was created from a valid pointer, both `tail_slice` and `wrap_slice` are
+ // pointers to valid arrays.
+ // - The area starting at `tx` and ending at `rx - 2` modulo `MSGQ_NUM_PAGES`,
+ // inclusive, belongs to the driver for writing and is not accessed concurrently by
+ // the GSP.
+ // - The caller holds a reference to `self` for as long as the returned slices are live,
+ // meaning the CPU write pointer cannot be advanced and thus that the returned area
+ // remains exclusive to the CPU for the duration of the slices.
+ // - `tail_slice` and `wrap_slice` point to non-overlapping sub-ranges of `data` in all
+ // branches (in the `rx <= tx` case, `wrap_slice` ends at `rx - 1` which is strictly less
+ // than `tx` where `tail_slice` starts; in the other cases `wrap_slice` is empty), so
+ // creating two `&mut` references from them does not violate aliasing rules.
+ (unsafe { &mut *tail_slice }, unsafe { &mut *wrap_slice })
}
/// Returns the size of the region of the CPU message queue that the driver is currently allowed
@@ -308,27 +323,28 @@ fn driver_write_area_size(&self) -> usize {
/// As the message queue is a circular buffer, the region may be discontiguous in memory. In
/// that case the second slice will have a non-zero length.
fn driver_read_area(&self) -> (&[[u8; GSP_PAGE_SIZE]], &[[u8; GSP_PAGE_SIZE]]) {
- let tx = self.gsp_write_ptr() as usize;
- let rx = self.cpu_read_ptr() as usize;
+ let tx = num::u32_as_usize(self.gsp_write_ptr());
+ let rx = num::u32_as_usize(self.cpu_read_ptr());
+
+ // Message queue data.
+ let data = ptr::project!(self.0.as_ptr(), .gspq.msgq.data);
+
+ let (tail_slice, wrap_slice) = if rx <= tx {
+ (ptr::project!(data, [rx..tx]), ptr::project!(data, [..0]))
+ } else {
+ (ptr::project!(data, [rx..]), ptr::project!(data, [..tx]))
+ };
// SAFETY:
- // - We will only access the driver-owned part of the shared memory.
- // - Per the safety statement of the function, no concurrent access will be performed.
- let gsp_mem = unsafe { &*self.0.as_ptr() };
- let data = &gsp_mem.gspq.msgq.data;
-
- // The area starting at `rx` and ending at `tx - 1` modulo MSGQ_NUM_PAGES, inclusive,
- // belongs to the driver for reading.
- // PANIC:
- // - per the invariant of `cpu_read_ptr`, `rx < MSGQ_NUM_PAGES`
- // - per the invariant of `gsp_write_ptr`, `tx < MSGQ_NUM_PAGES`
- if rx <= tx {
- // The area is contiguous.
- (&data[rx..tx], &[])
- } else {
- // The area is discontiguous.
- (&data[rx..], &data[..tx])
- }
+ // - Since `data` was created from a valid pointer, both `tail_slice` and `wrap_slice` are
+ // pointers to valid arrays.
+ // - The area starting at `rx` and ending at `tx - 1` modulo `MSGQ_NUM_PAGES`,
+ // inclusive, belongs to the driver for reading and is not accessed concurrently by
+ // the GSP.
+ // - The caller holds a reference to `self` for as long as the returned slices are live,
+ // meaning the CPU read pointer cannot be advanced and thus that the returned area
+ // remains exclusive to the CPU for the duration of the slices.
+ (unsafe { &*tail_slice }, unsafe { &*wrap_slice })
}
/// Allocates a region on the command queue that is large enough to send a command of `size`
--
2.53.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v4 1/2] gpu: nova-core: gsp: inline methods providing queue range invariants
2026-04-01 14:29 ` [PATCH v4 1/2] gpu: nova-core: gsp: inline methods providing queue range invariants Alexandre Courbot
@ 2026-04-01 15:28 ` Gary Guo
0 siblings, 0 replies; 9+ messages in thread
From: Gary Guo @ 2026-04-01 15:28 UTC (permalink / raw)
To: Alexandre Courbot, Danilo Krummrich, Gary Guo, Alice Ryhl,
David Airlie, Simona Vetter, Alistair Popple
Cc: John Hubbard, Joel Fernandes, Timur Tabi, Zhi Wang,
Eliot Courtney, rust-for-linux, dri-devel, linux-kernel
On Wed Apr 1, 2026 at 3:29 PM BST, Alexandre Courbot wrote:
> These methods hold a range invariant for the read/write pointers
> (namely, that the pointers are always in the `0..MSGQ_NUM_PAGES` range).
>
> The calling code depends on these invariants to build properly, so make
> them `#[inline(always)]` to ensure that they are propagated where they
> need to be.
>
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
> ---
> drivers/gpu/nova-core/gsp/cmdq.rs | 4 ++++
> drivers/gpu/nova-core/gsp/fw.rs | 4 ++++
> 2 files changed, 8 insertions(+)
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v4 0/2] gpu: nova-core: gsp: fix undefined behavior in command queue code
2026-04-01 14:29 [PATCH v4 0/2] gpu: nova-core: gsp: fix undefined behavior in command queue code Alexandre Courbot
2026-04-01 14:29 ` [PATCH v4 1/2] gpu: nova-core: gsp: inline methods providing queue range invariants Alexandre Courbot
2026-04-01 14:29 ` [PATCH v4 2/2] gpu: nova-core: gsp: fix undefined behavior in command queue code Alexandre Courbot
@ 2026-04-03 17:37 ` Danilo Krummrich
2 siblings, 0 replies; 9+ messages in thread
From: Danilo Krummrich @ 2026-04-03 17:37 UTC (permalink / raw)
To: Alexandre Courbot
Cc: Gary Guo, Alice Ryhl, David Airlie, Simona Vetter,
Alistair Popple, John Hubbard, Joel Fernandes, Timur Tabi,
Zhi Wang, Eliot Courtney, rust-for-linux, dri-devel, linux-kernel
On Wed Apr 1, 2026 at 4:29 PM CEST, Alexandre Courbot wrote:
> Alexandre Courbot (2):
> gpu: nova-core: gsp: inline methods providing queue range invariants
> gpu: nova-core: gsp: fix undefined behavior in command queue code
Applied to drm-rust-next, thanks!
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v4 2/2] gpu: nova-core: gsp: fix undefined behavior in command queue code
2026-04-01 14:29 ` [PATCH v4 2/2] gpu: nova-core: gsp: fix undefined behavior in command queue code Alexandre Courbot
@ 2026-04-03 23:47 ` John Hubbard
2026-04-04 0:05 ` Danilo Krummrich
0 siblings, 1 reply; 9+ messages in thread
From: John Hubbard @ 2026-04-03 23:47 UTC (permalink / raw)
To: Alexandre Courbot, Danilo Krummrich, Gary Guo, Alice Ryhl,
David Airlie, Simona Vetter, Alistair Popple
Cc: Joel Fernandes, Timur Tabi, Zhi Wang, Eliot Courtney,
rust-for-linux, dri-devel, linux-kernel
On 4/1/26 7:29 AM, Alexandre Courbot wrote:
> `driver_read_area` and `driver_write_area` are internal methods that
> return slices containing the area of the command queue buffer that the
> driver has exclusive read or write access, respectively.
>
> While their returned value is correct and safe to use, internally they
> temporarily create a reference to the whole command-buffer slice,
> including GSP-owned regions. These regions can change without notice,
> and thus creating a slice to them, even if never accessed, is undefined
> behavior.
>
> Fix this by rewriting these methods to use pointer projections in order
> to create slices to valid regions only. It should eventually be replaced
> by `IoView` and `IoSlice` once they land.
>
> Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
> Reported-by: Danilo Krummrich <dakr@kernel.org>
> Closes: https://lore.kernel.org/all/DH47AVPEKN06.3BERUSJIB4M1R@kernel.org/
> Reviewed-by: Gary Guo <gary@garyguo.net>
> Reviewed-by: Danilo Krummrich <dakr@kernel.org>
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> ---
> drivers/gpu/nova-core/gsp/cmdq.rs | 114 ++++++++++++++++++++++----------------
> 1 file changed, 65 insertions(+), 49 deletions(-)
This is causing a build_assert failure in the latest drm-rust-next, with
rustc 1.85.0, and also with 1.78.0.
rustc 1.93.0 does not show the problem.
I suggest that we revert this commit: we're very late in the cycle and
it appears to be relying on fragile compiler behavior (my best guess so
far--I don't yet understand the root cause).
thanks,
--
John Hubbard
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v4 2/2] gpu: nova-core: gsp: fix undefined behavior in command queue code
2026-04-03 23:47 ` John Hubbard
@ 2026-04-04 0:05 ` Danilo Krummrich
2026-04-04 4:37 ` Alexandre Courbot
0 siblings, 1 reply; 9+ messages in thread
From: Danilo Krummrich @ 2026-04-04 0:05 UTC (permalink / raw)
To: John Hubbard
Cc: Alexandre Courbot, Gary Guo, Alice Ryhl, David Airlie,
Simona Vetter, Alistair Popple, Joel Fernandes, Timur Tabi,
Zhi Wang, Eliot Courtney, rust-for-linux, dri-devel, linux-kernel
On Sat Apr 4, 2026 at 1:47 AM CEST, John Hubbard wrote:
> On 4/1/26 7:29 AM, Alexandre Courbot wrote:
>> `driver_read_area` and `driver_write_area` are internal methods that
>> return slices containing the area of the command queue buffer that the
>> driver has exclusive read or write access, respectively.
>>
>> While their returned value is correct and safe to use, internally they
>> temporarily create a reference to the whole command-buffer slice,
>> including GSP-owned regions. These regions can change without notice,
>> and thus creating a slice to them, even if never accessed, is undefined
>> behavior.
>>
>> Fix this by rewriting these methods to use pointer projections in order
>> to create slices to valid regions only. It should eventually be replaced
>> by `IoView` and `IoSlice` once they land.
>>
>> Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
>> Reported-by: Danilo Krummrich <dakr@kernel.org>
>> Closes: https://lore.kernel.org/all/DH47AVPEKN06.3BERUSJIB4M1R@kernel.org/
>> Reviewed-by: Gary Guo <gary@garyguo.net>
>> Reviewed-by: Danilo Krummrich <dakr@kernel.org>
>> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
>> ---
>> drivers/gpu/nova-core/gsp/cmdq.rs | 114 ++++++++++++++++++++++----------------
>> 1 file changed, 65 insertions(+), 49 deletions(-)
>
> This is causing a build_assert failure in the latest drm-rust-next, with
> rustc 1.85.0, and also with 1.78.0.
>
> rustc 1.93.0 does not show the problem.
Odd, it did pass all the testing at my end. Anyways, it only proves once again
that this is pretty fragile.
> I suggest that we revert this commit: we're very late in the cycle and
> it appears to be relying on fragile compiler behavior (my best guess so
> far--I don't yet understand the root cause).
As an exception, as it made more sense in this case, I dropped it from the
queue.
Thanks,
Danilo
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v4 2/2] gpu: nova-core: gsp: fix undefined behavior in command queue code
2026-04-04 0:05 ` Danilo Krummrich
@ 2026-04-04 4:37 ` Alexandre Courbot
2026-04-04 12:33 ` Gary Guo
0 siblings, 1 reply; 9+ messages in thread
From: Alexandre Courbot @ 2026-04-04 4:37 UTC (permalink / raw)
To: Danilo Krummrich, Gary Guo
Cc: John Hubbard, Gary Guo, Alice Ryhl, David Airlie, Simona Vetter,
Alistair Popple, Joel Fernandes, Timur Tabi, Zhi Wang,
Eliot Courtney, rust-for-linux, dri-devel, linux-kernel
On Sat Apr 4, 2026 at 9:05 AM JST, Danilo Krummrich wrote:
> On Sat Apr 4, 2026 at 1:47 AM CEST, John Hubbard wrote:
>> On 4/1/26 7:29 AM, Alexandre Courbot wrote:
>>> `driver_read_area` and `driver_write_area` are internal methods that
>>> return slices containing the area of the command queue buffer that the
>>> driver has exclusive read or write access, respectively.
>>>
>>> While their returned value is correct and safe to use, internally they
>>> temporarily create a reference to the whole command-buffer slice,
>>> including GSP-owned regions. These regions can change without notice,
>>> and thus creating a slice to them, even if never accessed, is undefined
>>> behavior.
>>>
>>> Fix this by rewriting these methods to use pointer projections in order
>>> to create slices to valid regions only. It should eventually be replaced
>>> by `IoView` and `IoSlice` once they land.
>>>
>>> Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
>>> Reported-by: Danilo Krummrich <dakr@kernel.org>
>>> Closes: https://lore.kernel.org/all/DH47AVPEKN06.3BERUSJIB4M1R@kernel.org/
>>> Reviewed-by: Gary Guo <gary@garyguo.net>
>>> Reviewed-by: Danilo Krummrich <dakr@kernel.org>
>>> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
>>> ---
>>> drivers/gpu/nova-core/gsp/cmdq.rs | 114 ++++++++++++++++++++++----------------
>>> 1 file changed, 65 insertions(+), 49 deletions(-)
>>
>> This is causing a build_assert failure in the latest drm-rust-next, with
>> rustc 1.85.0, and also with 1.78.0.
>>
>> rustc 1.93.0 does not show the problem.
>
> Odd, it did pass all the testing at my end. Anyways, it only proves once again
> that this is pretty fragile.
Same here, although I did not test with all the supported rustc versions
(but pretty sure I did test 1.78). This seems to be heavily
configuration dependent.
Gary, do you know of a way to restrict the `project_pointer` macro to
only take constant values as indices? That problem is pretty sneaky and
likely to happen again, so we should probably harden that part a bit.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v4 2/2] gpu: nova-core: gsp: fix undefined behavior in command queue code
2026-04-04 4:37 ` Alexandre Courbot
@ 2026-04-04 12:33 ` Gary Guo
0 siblings, 0 replies; 9+ messages in thread
From: Gary Guo @ 2026-04-04 12:33 UTC (permalink / raw)
To: Alexandre Courbot, Danilo Krummrich, Gary Guo
Cc: John Hubbard, Alice Ryhl, David Airlie, Simona Vetter,
Alistair Popple, Joel Fernandes, Timur Tabi, Zhi Wang,
Eliot Courtney, rust-for-linux, dri-devel, linux-kernel
On Sat Apr 4, 2026 at 5:37 AM BST, Alexandre Courbot wrote:
> On Sat Apr 4, 2026 at 9:05 AM JST, Danilo Krummrich wrote:
>> On Sat Apr 4, 2026 at 1:47 AM CEST, John Hubbard wrote:
>>> On 4/1/26 7:29 AM, Alexandre Courbot wrote:
>>>> `driver_read_area` and `driver_write_area` are internal methods that
>>>> return slices containing the area of the command queue buffer that the
>>>> driver has exclusive read or write access, respectively.
>>>>
>>>> While their returned value is correct and safe to use, internally they
>>>> temporarily create a reference to the whole command-buffer slice,
>>>> including GSP-owned regions. These regions can change without notice,
>>>> and thus creating a slice to them, even if never accessed, is undefined
>>>> behavior.
>>>>
>>>> Fix this by rewriting these methods to use pointer projections in order
>>>> to create slices to valid regions only. It should eventually be replaced
>>>> by `IoView` and `IoSlice` once they land.
>>>>
>>>> Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
>>>> Reported-by: Danilo Krummrich <dakr@kernel.org>
>>>> Closes: https://lore.kernel.org/all/DH47AVPEKN06.3BERUSJIB4M1R@kernel.org/
>>>> Reviewed-by: Gary Guo <gary@garyguo.net>
>>>> Reviewed-by: Danilo Krummrich <dakr@kernel.org>
>>>> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
>>>> ---
>>>> drivers/gpu/nova-core/gsp/cmdq.rs | 114 ++++++++++++++++++++++----------------
>>>> 1 file changed, 65 insertions(+), 49 deletions(-)
>>>
>>> This is causing a build_assert failure in the latest drm-rust-next, with
>>> rustc 1.85.0, and also with 1.78.0.
>>>
>>> rustc 1.93.0 does not show the problem.
>>
>> Odd, it did pass all the testing at my end. Anyways, it only proves once again
>> that this is pretty fragile.
>
> Same here, although I did not test with all the supported rustc versions
> (but pretty sure I did test 1.78). This seems to be heavily
> configuration dependent.
>
> Gary, do you know of a way to restrict the `project_pointer` macro to
> only take constant values as indices? That problem is pretty sneaky and
> likely to happen again, so we should probably harden that part a bit.
It doesn't necessary need to be const, for plain indexing (not slicing), if you
do modular arithmetic or bound-checking before accessing it's pretty reliably
optimized out. Slicing is more tricky, as there're more checking involved here.
For the macro, it doesn't really know whether you're slicing or indexing, so
this restriction is not trivial. We discussed this at length in last RfL
meeting, and we've come up with keyworded projection variants like this:
project!(p, [try foo]) // fallible
project!(p, [build foo]) // build_assert! variant, for indexing or constants (indexing/slicing)
project!(p, [panic foo]) // panicking variant
I'll be working on this next week.
Best,
Gary
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2026-04-04 12:33 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-01 14:29 [PATCH v4 0/2] gpu: nova-core: gsp: fix undefined behavior in command queue code Alexandre Courbot
2026-04-01 14:29 ` [PATCH v4 1/2] gpu: nova-core: gsp: inline methods providing queue range invariants Alexandre Courbot
2026-04-01 15:28 ` Gary Guo
2026-04-01 14:29 ` [PATCH v4 2/2] gpu: nova-core: gsp: fix undefined behavior in command queue code Alexandre Courbot
2026-04-03 23:47 ` John Hubbard
2026-04-04 0:05 ` Danilo Krummrich
2026-04-04 4:37 ` Alexandre Courbot
2026-04-04 12:33 ` Gary Guo
2026-04-03 17:37 ` [PATCH v4 0/2] " Danilo Krummrich
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox