public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/4] gpu: nova-core: gsp: fix command queue ring buffer bugs
@ 2026-01-23 12:12 Eliot Courtney
  2026-01-23 12:12 ` [PATCH v2 1/4] gpu: nova-core: gsp: fix incorrect advancing of write pointer Eliot Courtney
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Eliot Courtney @ 2026-01-23 12:12 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot, Alice Ryhl, David Airlie,
	Simona Vetter, Alistair Popple
  Cc: nouveau, rust-for-linux, dri-devel, linux-kernel, Eliot Courtney

This series fixes a few bugs in the GSP command queue ring buffer
implementation in nova-core and also clarifies some of the comments.

The ring buffer uses read and write pointers (rx/tx) to track which areas
are available for the CPU vs the GSP to read/write into.

In the ring buffers there were some indexing issues which could end up
causing panics, so I fixed those and added more rigorous proofs of
correctness in the panic comments.

Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
Changes in v2:
- Modified PANIC comments to match existing style.
- Link to v1: https://lore.kernel.org/r/20260122-nova-core-cmdq1-v1-0-7f8fe4683f11@nvidia.com

---
Eliot Courtney (4):
      gpu: nova-core: gsp: fix incorrect advancing of write pointer
      gpu: nova-core: gsp: clarify comments about invariants and pointer roles
      gpu: nova-core: gsp: fix improper handling of empty slot in cmdq
      gpu: nova-core: gsp: fix improper indexing in driver_read_area

 drivers/gpu/nova-core/gsp/cmdq.rs | 72 ++++++++++++++++++++++++---------------
 1 file changed, 44 insertions(+), 28 deletions(-)
---
base-commit: 6ea52b6d8f33ae627f4dcf43b12b6e713a8b9331
change-id: 20260121-nova-core-cmdq1-6aaa369824c4

Best regards,
-- 
Eliot Courtney <ecourtney@nvidia.com>


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v2 1/4] gpu: nova-core: gsp: fix incorrect advancing of write pointer
  2026-01-23 12:12 [PATCH v2 0/4] gpu: nova-core: gsp: fix command queue ring buffer bugs Eliot Courtney
@ 2026-01-23 12:12 ` Eliot Courtney
  2026-01-23 12:12 ` [PATCH v2 2/4] gpu: nova-core: gsp: clarify comments about invariants and pointer roles Eliot Courtney
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 15+ messages in thread
From: Eliot Courtney @ 2026-01-23 12:12 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot, Alice Ryhl, David Airlie,
	Simona Vetter, Alistair Popple
  Cc: nouveau, rust-for-linux, dri-devel, linux-kernel, Eliot Courtney

We should modulo not bitwise-and here. The current code could, for
example, set wptr to MSGQ_NUM_PAGES which is not valid.

Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
 drivers/gpu/nova-core/gsp/cmdq.rs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index 46819a82a51a..f139aad7af3f 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -384,7 +384,7 @@ fn cpu_write_ptr(&self) -> u32 {
 
     // Informs the GSP that it can process `elem_count` new pages from the command queue.
     fn advance_cpu_write_ptr(&mut self, elem_count: u32) {
-        let wptr = self.cpu_write_ptr().wrapping_add(elem_count) & MSGQ_NUM_PAGES;
+        let wptr = self.cpu_write_ptr().wrapping_add(elem_count) % MSGQ_NUM_PAGES;
         let gsp_mem = self.0.start_ptr_mut();
 
         // SAFETY:

-- 
2.52.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 2/4] gpu: nova-core: gsp: clarify comments about invariants and pointer roles
  2026-01-23 12:12 [PATCH v2 0/4] gpu: nova-core: gsp: fix command queue ring buffer bugs Eliot Courtney
  2026-01-23 12:12 ` [PATCH v2 1/4] gpu: nova-core: gsp: fix incorrect advancing of write pointer Eliot Courtney
@ 2026-01-23 12:12 ` Eliot Courtney
  2026-01-26 18:04   ` Gary Guo
  2026-01-23 12:12 ` [PATCH v2 3/4] gpu: nova-core: gsp: fix improper handling of empty slot in cmdq Eliot Courtney
  2026-01-23 12:12 ` [PATCH v2 4/4] gpu: nova-core: gsp: fix improper indexing in driver_read_area Eliot Courtney
  3 siblings, 1 reply; 15+ messages in thread
From: Eliot Courtney @ 2026-01-23 12:12 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot, Alice Ryhl, David Airlie,
	Simona Vetter, Alistair Popple
  Cc: nouveau, rust-for-linux, dri-devel, linux-kernel, Eliot Courtney

Disambiguate a few things in comments in cmdq.rs.

Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
 drivers/gpu/nova-core/gsp/cmdq.rs | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index f139aad7af3f..09c28eeb6f12 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -161,12 +161,14 @@ struct GspMem {
     /// Self-mapping page table entries.
     ptes: PteArray<{ GSP_PAGE_SIZE / size_of::<u64>() }>,
     /// CPU queue: the driver writes commands here, and the GSP reads them. It also contains the
-    /// write and read pointers that the CPU updates.
+    /// write and read pointers that the CPU updates. This means that the read pointer here is an
+    /// index into the GSP queue.
     ///
     /// This member is read-only for the GSP.
     cpuq: Msgq,
     /// GSP queue: the GSP writes messages here, and the driver reads them. It also contains the
-    /// write and read pointers that the GSP updates.
+    /// write and read pointers that the GSP updates. This means that the read pointer here is an
+    /// index into the CPU queue.
     ///
     /// This member is read-only for the driver.
     gspq: Msgq,
@@ -222,7 +224,7 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
         // - We will only access the driver-owned part of the shared memory.
         // - Per the safety statement of the function, no concurrent access will be performed.
         let gsp_mem = &mut unsafe { self.0.as_slice_mut(0, 1) }.unwrap()[0];
-        // PANIC: per the invariant of `cpu_write_ptr`, `tx` is `<= MSGQ_NUM_PAGES`.
+        // PANIC: per the invariant of `cpu_write_ptr`, `tx` is `< MSGQ_NUM_PAGES`.
         let (before_tx, after_tx) = gsp_mem.cpuq.msgq.data.split_at_mut(tx);
 
         if rx <= tx {
@@ -257,7 +259,7 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
         // - We will only access the driver-owned part of the shared memory.
         // - Per the safety statement of the function, no concurrent access will be performed.
         let gsp_mem = &unsafe { self.0.as_slice(0, 1) }.unwrap()[0];
-        // PANIC: per the invariant of `cpu_read_ptr`, `xx` is `<= MSGQ_NUM_PAGES`.
+        // PANIC: per the invariant of `cpu_read_ptr`, `rx` is `< MSGQ_NUM_PAGES`.
         let (before_rx, after_rx) = gsp_mem.gspq.msgq.data.split_at(rx);
 
         match tx.cmp(&rx) {
@@ -315,7 +317,7 @@ fn allocate_command(&mut self, size: usize) -> Result<GspCommand<'_>> {
     //
     // # Invariants
     //
-    // - The returned value is between `0` and `MSGQ_NUM_PAGES`.
+    // - The returned value is between `0` and `MSGQ_NUM_PAGES - 1`, inclusive.
     fn gsp_write_ptr(&self) -> u32 {
         let gsp_mem = self.0.start_ptr();
 
@@ -329,7 +331,7 @@ fn gsp_write_ptr(&self) -> u32 {
     //
     // # Invariants
     //
-    // - The returned value is between `0` and `MSGQ_NUM_PAGES`.
+    // - The returned value is between `0` and `MSGQ_NUM_PAGES - 1`, inclusive.
     fn gsp_read_ptr(&self) -> u32 {
         let gsp_mem = self.0.start_ptr();
 
@@ -343,7 +345,7 @@ fn gsp_read_ptr(&self) -> u32 {
     //
     // # Invariants
     //
-    // - The returned value is between `0` and `MSGQ_NUM_PAGES`.
+    // - The returned value is between `0` and `MSGQ_NUM_PAGES - 1`, inclusive.
     fn cpu_read_ptr(&self) -> u32 {
         let gsp_mem = self.0.start_ptr();
 
@@ -372,7 +374,7 @@ fn advance_cpu_read_ptr(&mut self, elem_count: u32) {
     //
     // # Invariants
     //
-    // - The returned value is between `0` and `MSGQ_NUM_PAGES`.
+    // - The returned value is between `0` and `MSGQ_NUM_PAGES - 1`, inclusive.
     fn cpu_write_ptr(&self) -> u32 {
         let gsp_mem = self.0.start_ptr();
 

-- 
2.52.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 3/4] gpu: nova-core: gsp: fix improper handling of empty slot in cmdq
  2026-01-23 12:12 [PATCH v2 0/4] gpu: nova-core: gsp: fix command queue ring buffer bugs Eliot Courtney
  2026-01-23 12:12 ` [PATCH v2 1/4] gpu: nova-core: gsp: fix incorrect advancing of write pointer Eliot Courtney
  2026-01-23 12:12 ` [PATCH v2 2/4] gpu: nova-core: gsp: clarify comments about invariants and pointer roles Eliot Courtney
@ 2026-01-23 12:12 ` Eliot Courtney
  2026-01-26 18:26   ` Gary Guo
  2026-01-28 11:39   ` Alexandre Courbot
  2026-01-23 12:12 ` [PATCH v2 4/4] gpu: nova-core: gsp: fix improper indexing in driver_read_area Eliot Courtney
  3 siblings, 2 replies; 15+ messages in thread
From: Eliot Courtney @ 2026-01-23 12:12 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot, Alice Ryhl, David Airlie,
	Simona Vetter, Alistair Popple
  Cc: nouveau, rust-for-linux, dri-devel, linux-kernel, Eliot Courtney

The current code hands out buffers that go all the way up to and
including `rx - 1`, but we need to maintain an empty slot to prevent the
ring buffer from wrapping around into having 'tx == rx', which means
empty.

Also add more rigorous no-panic proofs.

Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
 drivers/gpu/nova-core/gsp/cmdq.rs | 33 +++++++++++++++++++--------------
 1 file changed, 19 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index 09c28eeb6f12..aa8758fc7723 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -227,21 +227,26 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
         // PANIC: per the invariant of `cpu_write_ptr`, `tx` is `< MSGQ_NUM_PAGES`.
         let (before_tx, after_tx) = gsp_mem.cpuq.msgq.data.split_at_mut(tx);
 
-        if rx <= tx {
-            // The area from `tx` up to the end of the ring, and from the beginning of the ring up
-            // to `rx`, minus one unit, belongs to the driver.
-            if rx == 0 {
-                let last = after_tx.len() - 1;
-                (&mut after_tx[..last], &mut before_tx[0..0])
-            } else {
-                (after_tx, &mut before_tx[..rx])
-            }
+        // The area starting at `tx` and ending at `rx - 2` modulo MSGQ_NUM_PAGES, inclusive,
+        // belongs to the driver for writing.
+        if rx == 0 {
+            // Since `rx` is zero, leave an empty slot at end of the buffer.
+            let last = after_tx.len() - 1;
+            (&mut after_tx[..last], &mut before_tx[0..0])
+        } else if rx > tx {
+            // The area is contiguous and we leave an empty slot before `rx`.
+            // PANIC:
+            // - The index `rx - tx - 1` is non-negative because `rx > tx` in this branch.
+            // - The index does not exceed `after_tx.len()` (which is `MSGQ_NUM_PAGES - tx`)
+            //   because `rx < MSGQ_NUM_PAGES` by the `gsp_read_ptr` invariant.
+            (&mut after_tx[..(rx - tx - 1)], &mut before_tx[0..0])
         } else {
-            // The area from `tx` to `rx`, minus one unit, belongs to the driver.
-            //
-            // PANIC: per the invariants of `cpu_write_ptr` and `gsp_read_ptr`, `rx` and `tx` are
-            // `<= MSGQ_NUM_PAGES`, and the test above ensured that `rx > tx`.
-            (after_tx.split_at_mut(rx - tx).0, &mut before_tx[0..0])
+            // The area is discontiguous and we leave an empty slot before `rx`.
+            // PANIC:
+            // - The index `rx - 1` is non-negative because `rx != 0` in this branch.
+            // - The index does not exceed `before_tx.len()` (which equals `tx`) because
+            //   `rx <= tx` in this branch.
+            (after_tx, &mut before_tx[..(rx - 1)])
         }
     }
 

-- 
2.52.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 4/4] gpu: nova-core: gsp: fix improper indexing in driver_read_area
  2026-01-23 12:12 [PATCH v2 0/4] gpu: nova-core: gsp: fix command queue ring buffer bugs Eliot Courtney
                   ` (2 preceding siblings ...)
  2026-01-23 12:12 ` [PATCH v2 3/4] gpu: nova-core: gsp: fix improper handling of empty slot in cmdq Eliot Courtney
@ 2026-01-23 12:12 ` Eliot Courtney
  2026-01-26 18:30   ` Gary Guo
  2026-01-28 11:57   ` Alexandre Courbot
  3 siblings, 2 replies; 15+ messages in thread
From: Eliot Courtney @ 2026-01-23 12:12 UTC (permalink / raw)
  To: Danilo Krummrich, Alexandre Courbot, Alice Ryhl, David Airlie,
	Simona Vetter, Alistair Popple
  Cc: nouveau, rust-for-linux, dri-devel, linux-kernel, Eliot Courtney

The current code indexes into `after_rx` using `tx` which is an index
for the whole buffer, not the split buffer `after_rx`.

Also add more rigorous no-panic proofs.

Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
 drivers/gpu/nova-core/gsp/cmdq.rs | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index aa8758fc7723..c26396fda29c 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -1,7 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 
 use core::{
-    cmp,
     mem,
     sync::atomic::{
         fence,
@@ -267,10 +266,20 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
         // PANIC: per the invariant of `cpu_read_ptr`, `rx` is `< MSGQ_NUM_PAGES`.
         let (before_rx, after_rx) = gsp_mem.gspq.msgq.data.split_at(rx);
 
-        match tx.cmp(&rx) {
-            cmp::Ordering::Equal => (&after_rx[0..0], &after_rx[0..0]),
-            cmp::Ordering::Greater => (&after_rx[..tx], &before_rx[0..0]),
-            cmp::Ordering::Less => (after_rx, &before_rx[..tx]),
+        // The area starting at `rx` and ending at `tx - 1` modulo MSGQ_NUM_PAGES, inclusive,
+        // belongs to the driver for reading.
+        if rx <= tx {
+            // The area is contiguous.
+            // PANIC:
+            // - The index `tx - rx` is non-negative because `rx <= tx` in this branch.
+            // - The index does not exceed `after_rx.len()` (which is `MSGQ_NUM_PAGES - rx`)
+            //   because `tx < MSGQ_NUM_PAGES` by the `gsp_write_ptr` invariant.
+            (&after_rx[..(tx - rx)], &after_rx[0..0])
+        } else {
+            // The area is discontiguous.
+            // PANIC: `tx` does not exceed `before_rx.len()` (which equals `rx`) because
+            //   `tx < rx` in this branch.
+            (after_rx, &before_rx[..tx])
         }
     }
 

-- 
2.52.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/4] gpu: nova-core: gsp: clarify comments about invariants and pointer roles
  2026-01-23 12:12 ` [PATCH v2 2/4] gpu: nova-core: gsp: clarify comments about invariants and pointer roles Eliot Courtney
@ 2026-01-26 18:04   ` Gary Guo
  2026-01-28  4:35     ` Eliot Courtney
  0 siblings, 1 reply; 15+ messages in thread
From: Gary Guo @ 2026-01-26 18:04 UTC (permalink / raw)
  To: Eliot Courtney, Danilo Krummrich, Alexandre Courbot, Alice Ryhl,
	David Airlie, Simona Vetter, Alistair Popple
  Cc: nouveau, rust-for-linux, dri-devel, linux-kernel

On Fri Jan 23, 2026 at 12:12 PM GMT, Eliot Courtney wrote:
> Disambiguate a few things in comments in cmdq.rs.
>
> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
> ---
>  drivers/gpu/nova-core/gsp/cmdq.rs | 18 ++++++++++--------
>  1 file changed, 10 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
> index f139aad7af3f..09c28eeb6f12 100644
> --- a/drivers/gpu/nova-core/gsp/cmdq.rs
> +++ b/drivers/gpu/nova-core/gsp/cmdq.rs
> @@ -161,12 +161,14 @@ struct GspMem {
>      /// Self-mapping page table entries.
>      ptes: PteArray<{ GSP_PAGE_SIZE / size_of::<u64>() }>,
>      /// CPU queue: the driver writes commands here, and the GSP reads them. It also contains the
> -    /// write and read pointers that the CPU updates.
> +    /// write and read pointers that the CPU updates. This means that the read pointer here is an
> +    /// index into the GSP queue.
>      ///
>      /// This member is read-only for the GSP.
>      cpuq: Msgq,
>      /// GSP queue: the GSP writes messages here, and the driver reads them. It also contains the
> -    /// write and read pointers that the GSP updates.
> +    /// write and read pointers that the GSP updates. This means that the read pointer here is an
> +    /// index into the CPU queue.
>      ///
>      /// This member is read-only for the driver.
>      gspq: Msgq,
> @@ -222,7 +224,7 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
>          // - We will only access the driver-owned part of the shared memory.
>          // - Per the safety statement of the function, no concurrent access will be performed.
>          let gsp_mem = &mut unsafe { self.0.as_slice_mut(0, 1) }.unwrap()[0];
> -        // PANIC: per the invariant of `cpu_write_ptr`, `tx` is `<= MSGQ_NUM_PAGES`.
> +        // PANIC: per the invariant of `cpu_write_ptr`, `tx` is `< MSGQ_NUM_PAGES`.

Can this just be `tx < MSGQ_NUM_PAGES`?

>          let (before_tx, after_tx) = gsp_mem.cpuq.msgq.data.split_at_mut(tx);
>  
>          if rx <= tx {
> @@ -257,7 +259,7 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
>          // - We will only access the driver-owned part of the shared memory.
>          // - Per the safety statement of the function, no concurrent access will be performed.
>          let gsp_mem = &unsafe { self.0.as_slice(0, 1) }.unwrap()[0];
> -        // PANIC: per the invariant of `cpu_read_ptr`, `xx` is `<= MSGQ_NUM_PAGES`.
> +        // PANIC: per the invariant of `cpu_read_ptr`, `rx` is `< MSGQ_NUM_PAGES`.
>          let (before_rx, after_rx) = gsp_mem.gspq.msgq.data.split_at(rx);
>  
>          match tx.cmp(&rx) {
> @@ -315,7 +317,7 @@ fn allocate_command(&mut self, size: usize) -> Result<GspCommand<'_>> {
>      //
>      // # Invariants
>      //
> -    // - The returned value is between `0` and `MSGQ_NUM_PAGES`.
> +    // - The returned value is between `0` and `MSGQ_NUM_PAGES - 1`, inclusive.

I wonder if this can be `is within 0..MSGQ_NUM_PAGES`. What do others think?

Best,
Gary

>      fn gsp_write_ptr(&self) -> u32 {
>          let gsp_mem = self.0.start_ptr();
>  
> @@ -329,7 +331,7 @@ fn gsp_write_ptr(&self) -> u32 {
>      //
>      // # Invariants
>      //
> -    // - The returned value is between `0` and `MSGQ_NUM_PAGES`.
> +    // - The returned value is between `0` and `MSGQ_NUM_PAGES - 1`, inclusive.
>      fn gsp_read_ptr(&self) -> u32 {
>          let gsp_mem = self.0.start_ptr();
>  
> @@ -343,7 +345,7 @@ fn gsp_read_ptr(&self) -> u32 {
>      //
>      // # Invariants
>      //
> -    // - The returned value is between `0` and `MSGQ_NUM_PAGES`.
> +    // - The returned value is between `0` and `MSGQ_NUM_PAGES - 1`, inclusive.
>      fn cpu_read_ptr(&self) -> u32 {
>          let gsp_mem = self.0.start_ptr();
>  
> @@ -372,7 +374,7 @@ fn advance_cpu_read_ptr(&mut self, elem_count: u32) {
>      //
>      // # Invariants
>      //
> -    // - The returned value is between `0` and `MSGQ_NUM_PAGES`.
> +    // - The returned value is between `0` and `MSGQ_NUM_PAGES - 1`, inclusive.
>      fn cpu_write_ptr(&self) -> u32 {
>          let gsp_mem = self.0.start_ptr();
>  


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 3/4] gpu: nova-core: gsp: fix improper handling of empty slot in cmdq
  2026-01-23 12:12 ` [PATCH v2 3/4] gpu: nova-core: gsp: fix improper handling of empty slot in cmdq Eliot Courtney
@ 2026-01-26 18:26   ` Gary Guo
  2026-01-28 11:42     ` Alexandre Courbot
  2026-01-28 11:39   ` Alexandre Courbot
  1 sibling, 1 reply; 15+ messages in thread
From: Gary Guo @ 2026-01-26 18:26 UTC (permalink / raw)
  To: Eliot Courtney, Danilo Krummrich, Alexandre Courbot, Alice Ryhl,
	David Airlie, Simona Vetter, Alistair Popple
  Cc: nouveau, rust-for-linux, dri-devel, linux-kernel

On Fri Jan 23, 2026 at 12:12 PM GMT, Eliot Courtney wrote:
> The current code hands out buffers that go all the way up to and
> including `rx - 1`, but we need to maintain an empty slot to prevent the
> ring buffer from wrapping around into having 'tx == rx', which means
> empty.
>
> Also add more rigorous no-panic proofs.
>
> Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
> ---
>  drivers/gpu/nova-core/gsp/cmdq.rs | 33 +++++++++++++++++++--------------
>  1 file changed, 19 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
> index 09c28eeb6f12..aa8758fc7723 100644
> --- a/drivers/gpu/nova-core/gsp/cmdq.rs
> +++ b/drivers/gpu/nova-core/gsp/cmdq.rs
> @@ -227,21 +227,26 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
>          // PANIC: per the invariant of `cpu_write_ptr`, `tx` is `< MSGQ_NUM_PAGES`.
>          let (before_tx, after_tx) = gsp_mem.cpuq.msgq.data.split_at_mut(tx);
>  
> -        if rx <= tx {
> -            // The area from `tx` up to the end of the ring, and from the beginning of the ring up
> -            // to `rx`, minus one unit, belongs to the driver.
> -            if rx == 0 {
> -                let last = after_tx.len() - 1;
> -                (&mut after_tx[..last], &mut before_tx[0..0])
> -            } else {
> -                (after_tx, &mut before_tx[..rx])
> -            }
> +        // The area starting at `tx` and ending at `rx - 2` modulo MSGQ_NUM_PAGES, inclusive,
> +        // belongs to the driver for writing.
> +        if rx == 0 {
> +            // Since `rx` is zero, leave an empty slot at end of the buffer.
> +            let last = after_tx.len() - 1;
> +            (&mut after_tx[..last], &mut before_tx[0..0])

Does the address actually matter? Otherwise I would find `&mut []` easier to
understand than an empty indexing.

> +        } else if rx > tx {
> +            // The area is contiguous and we leave an empty slot before `rx`.
> +            // PANIC:
> +            // - The index `rx - tx - 1` is non-negative because `rx > tx` in this branch.
> +            // - The index does not exceed `after_tx.len()` (which is `MSGQ_NUM_PAGES - tx`)
> +            //   because `rx < MSGQ_NUM_PAGES` by the `gsp_read_ptr` invariant.
> +            (&mut after_tx[..(rx - tx - 1)], &mut before_tx[0..0])
>          } else {
> -            // The area from `tx` to `rx`, minus one unit, belongs to the driver.
> -            //
> -            // PANIC: per the invariants of `cpu_write_ptr` and `gsp_read_ptr`, `rx` and `tx` are
> -            // `<= MSGQ_NUM_PAGES`, and the test above ensured that `rx > tx`.
> -            (after_tx.split_at_mut(rx - tx).0, &mut before_tx[0..0])
> +            // The area is discontiguous and we leave an empty slot before `rx`.
> +            // PANIC:
> +            // - The index `rx - 1` is non-negative because `rx != 0` in this branch.
> +            // - The index does not exceed `before_tx.len()` (which equals `tx`) because
> +            //   `rx <= tx` in this branch.
> +            (after_tx, &mut before_tx[..(rx - 1)])

If this is written with get_disjoint_mut, the indices would be so much easier to
understand... To bad that that API is only available from 1.86 onwards.

Best,
Gary

>          }
>      }
>  


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 4/4] gpu: nova-core: gsp: fix improper indexing in driver_read_area
  2026-01-23 12:12 ` [PATCH v2 4/4] gpu: nova-core: gsp: fix improper indexing in driver_read_area Eliot Courtney
@ 2026-01-26 18:30   ` Gary Guo
  2026-01-28 12:18     ` Alexandre Courbot
  2026-01-28 11:57   ` Alexandre Courbot
  1 sibling, 1 reply; 15+ messages in thread
From: Gary Guo @ 2026-01-26 18:30 UTC (permalink / raw)
  To: Eliot Courtney, Danilo Krummrich, Alexandre Courbot, Alice Ryhl,
	David Airlie, Simona Vetter, Alistair Popple
  Cc: nouveau, rust-for-linux, dri-devel, linux-kernel

On Fri Jan 23, 2026 at 12:12 PM GMT, Eliot Courtney wrote:
> The current code indexes into `after_rx` using `tx` which is an index
> for the whole buffer, not the split buffer `after_rx`.
>
> Also add more rigorous no-panic proofs.
>
> Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
> ---
>  drivers/gpu/nova-core/gsp/cmdq.rs | 19 ++++++++++++++-----
>  1 file changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
> index aa8758fc7723..c26396fda29c 100644
> --- a/drivers/gpu/nova-core/gsp/cmdq.rs
> +++ b/drivers/gpu/nova-core/gsp/cmdq.rs
> @@ -1,7 +1,6 @@
>  // SPDX-License-Identifier: GPL-2.0
>  
>  use core::{
> -    cmp,
>      mem,
>      sync::atomic::{
>          fence,
> @@ -267,10 +266,20 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
>          // PANIC: per the invariant of `cpu_read_ptr`, `rx` is `< MSGQ_NUM_PAGES`.
>          let (before_rx, after_rx) = gsp_mem.gspq.msgq.data.split_at(rx);

This code doesn't need splitting as it doesn't have the uniqueness issue that
mutable references have. While you're at it, probably it's chance to simplify
the code.

>  
> -        match tx.cmp(&rx) {
> -            cmp::Ordering::Equal => (&after_rx[0..0], &after_rx[0..0]),
> -            cmp::Ordering::Greater => (&after_rx[..tx], &before_rx[0..0]),
> -            cmp::Ordering::Less => (after_rx, &before_rx[..tx]),
> +        // The area starting at `rx` and ending at `tx - 1` modulo MSGQ_NUM_PAGES, inclusive,
> +        // belongs to the driver for reading.
> +        if rx <= tx {
> +            // The area is contiguous.
> +            // PANIC:
> +            // - The index `tx - rx` is non-negative because `rx <= tx` in this branch.
> +            // - The index does not exceed `after_rx.len()` (which is `MSGQ_NUM_PAGES - rx`)
> +            //   because `tx < MSGQ_NUM_PAGES` by the `gsp_write_ptr` invariant.
> +            (&after_rx[..(tx - rx)], &after_rx[0..0])

This can be just `(&data[rx..tx], &[])` without the split.

> +        } else {
> +            // The area is discontiguous.
> +            // PANIC: `tx` does not exceed `before_rx.len()` (which equals `rx`) because
> +            //   `tx < rx` in this branch.
> +            (after_rx, &before_rx[..tx])

This can be just `(&data[rx..], &data[..tx])` without the split.

Best,
Gary


>          }
>      }
>  


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/4] gpu: nova-core: gsp: clarify comments about invariants and pointer roles
  2026-01-26 18:04   ` Gary Guo
@ 2026-01-28  4:35     ` Eliot Courtney
  2026-01-28  8:17       ` Alexandre Courbot
  0 siblings, 1 reply; 15+ messages in thread
From: Eliot Courtney @ 2026-01-28  4:35 UTC (permalink / raw)
  To: Gary Guo, Eliot Courtney, Danilo Krummrich, Alexandre Courbot,
	Alice Ryhl, David Airlie, Simona Vetter, Alistair Popple
  Cc: nouveau, rust-for-linux, dri-devel, linux-kernel, dri-devel

On Tue Jan 27, 2026 at 3:04 AM JST, Gary Guo wrote:
>>          // - We will only access the driver-owned part of the shared memory.
>>          // - Per the safety statement of the function, no concurrent access will be performed.
>>          let gsp_mem = &mut unsafe { self.0.as_slice_mut(0, 1) }.unwrap()[0];
>> -        // PANIC: per the invariant of `cpu_write_ptr`, `tx` is `<= MSGQ_NUM_PAGES`.
>> +        // PANIC: per the invariant of `cpu_write_ptr`, `tx` is `< MSGQ_NUM_PAGES`.
>
> Can this just be `tx < MSGQ_NUM_PAGES`?

In previous discussion[1] it's been noted that it's important to
explain why the preconditions are satisfied, not just what they are,
so that's the reason I kept this in. Happy to hear arguments either
way though.

[1]: https://lore.kernel.org/all/CAH5fLgg0O=t2T2MQH2hvm4eZnCOa_NffzRw=XZPi9+7+=XsmRg@mail.gmail.com/

>>      // # Invariants
>>      //
>> -    // - The returned value is between `0` and `MSGQ_NUM_PAGES`.
>> +    // - The returned value is between `0` and `MSGQ_NUM_PAGES - 1`, inclusive.
>
> I wonder if this can be `is within 0..MSGQ_NUM_PAGES`. What do others think?

I think this is very reasonable, since this is part of the rust
range syntax so it should be understandable. I also considered the
mathematical syntax `[0, MSGQ_NUM_PAGES)`, but not sure if this would
be conventional - it does seem that this notation is used in a bunch
of places though. Will apply your suggestion in the next version unless
there is a definitive convention for this.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/4] gpu: nova-core: gsp: clarify comments about invariants and pointer roles
  2026-01-28  4:35     ` Eliot Courtney
@ 2026-01-28  8:17       ` Alexandre Courbot
  2026-01-28 10:46         ` Danilo Krummrich
  0 siblings, 1 reply; 15+ messages in thread
From: Alexandre Courbot @ 2026-01-28  8:17 UTC (permalink / raw)
  To: Eliot Courtney
  Cc: Gary Guo, Danilo Krummrich, Alice Ryhl, David Airlie,
	Simona Vetter, Alistair Popple, nouveau, rust-for-linux,
	dri-devel, linux-kernel, dri-devel

On Wed Jan 28, 2026 at 1:35 PM JST, Eliot Courtney wrote:
> On Tue Jan 27, 2026 at 3:04 AM JST, Gary Guo wrote:
>>>          // - We will only access the driver-owned part of the shared memory.
>>>          // - Per the safety statement of the function, no concurrent access will be performed.
>>>          let gsp_mem = &mut unsafe { self.0.as_slice_mut(0, 1) }.unwrap()[0];
>>> -        // PANIC: per the invariant of `cpu_write_ptr`, `tx` is `<= MSGQ_NUM_PAGES`.
>>> +        // PANIC: per the invariant of `cpu_write_ptr`, `tx` is `< MSGQ_NUM_PAGES`.
>>
>> Can this just be `tx < MSGQ_NUM_PAGES`?
>
> In previous discussion[1] it's been noted that it's important to
> explain why the preconditions are satisfied, not just what they are,
> so that's the reason I kept this in. Happy to hear arguments either
> way though.
>
> [1]: https://lore.kernel.org/all/CAH5fLgg0O=t2T2MQH2hvm4eZnCOa_NffzRw=XZPi9+7+=XsmRg@mail.gmail.com/

I agree it's a good idea to mention the source of the invariant, lest we
may lose track of why it is indeed true.

>
>>>      // # Invariants
>>>      //
>>> -    // - The returned value is between `0` and `MSGQ_NUM_PAGES`.
>>> +    // - The returned value is between `0` and `MSGQ_NUM_PAGES - 1`, inclusive.
>>
>> I wonder if this can be `is within 0..MSGQ_NUM_PAGES`. What do others think?
>
> I think this is very reasonable, since this is part of the rust
> range syntax so it should be understandable. I also considered the
> mathematical syntax `[0, MSGQ_NUM_PAGES)`, but not sure if this would
> be conventional - it does seem that this notation is used in a bunch
> of places though. Will apply your suggestion in the next version unless
> there is a definitive convention for this.

Since this is Rust code, the Rust syntax to express ranges (within ``
quotes) makes sense IMHO.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/4] gpu: nova-core: gsp: clarify comments about invariants and pointer roles
  2026-01-28  8:17       ` Alexandre Courbot
@ 2026-01-28 10:46         ` Danilo Krummrich
  0 siblings, 0 replies; 15+ messages in thread
From: Danilo Krummrich @ 2026-01-28 10:46 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Eliot Courtney, Gary Guo, Alice Ryhl, David Airlie, Simona Vetter,
	Alistair Popple, nouveau, rust-for-linux, dri-devel, linux-kernel,
	dri-devel

On Wed Jan 28, 2026 at 9:17 AM CET, Alexandre Courbot wrote:
> On Wed Jan 28, 2026 at 1:35 PM JST, Eliot Courtney wrote:
>> On Tue Jan 27, 2026 at 3:04 AM JST, Gary Guo wrote:
>>> I wonder if this can be `is within 0..MSGQ_NUM_PAGES`. What do others think?
>>
>> I think this is very reasonable, since this is part of the rust
>> range syntax so it should be understandable. I also considered the
>> mathematical syntax `[0, MSGQ_NUM_PAGES)`, but not sure if this would
>> be conventional - it does seem that this notation is used in a bunch
>> of places though. Will apply your suggestion in the next version unless
>> there is a definitive convention for this.
>
> Since this is Rust code, the Rust syntax to express ranges (within ``
> quotes) makes sense IMHO.

While I really like the mathematical syntax, I think using the Rust syntax is
superior, as it requires zero mental cycles to translate it to what is likely to
be found in the code as well.

(There also have been some considerations of using tools to validate safety
comments or invariants to some extend eventually.)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 3/4] gpu: nova-core: gsp: fix improper handling of empty slot in cmdq
  2026-01-23 12:12 ` [PATCH v2 3/4] gpu: nova-core: gsp: fix improper handling of empty slot in cmdq Eliot Courtney
  2026-01-26 18:26   ` Gary Guo
@ 2026-01-28 11:39   ` Alexandre Courbot
  1 sibling, 0 replies; 15+ messages in thread
From: Alexandre Courbot @ 2026-01-28 11:39 UTC (permalink / raw)
  To: Eliot Courtney
  Cc: Danilo Krummrich, Alice Ryhl, David Airlie, Simona Vetter,
	Alistair Popple, nouveau, rust-for-linux, dri-devel, linux-kernel

On Fri Jan 23, 2026 at 9:12 PM JST, Eliot Courtney wrote:
> The current code hands out buffers that go all the way up to and
> including `rx - 1`, but we need to maintain an empty slot to prevent the
> ring buffer from wrapping around into having 'tx == rx', which means
> empty.
>
> Also add more rigorous no-panic proofs.
>
> Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
> ---
>  drivers/gpu/nova-core/gsp/cmdq.rs | 33 +++++++++++++++++++--------------
>  1 file changed, 19 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
> index 09c28eeb6f12..aa8758fc7723 100644
> --- a/drivers/gpu/nova-core/gsp/cmdq.rs
> +++ b/drivers/gpu/nova-core/gsp/cmdq.rs
> @@ -227,21 +227,26 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
>          // PANIC: per the invariant of `cpu_write_ptr`, `tx` is `< MSGQ_NUM_PAGES`.
>          let (before_tx, after_tx) = gsp_mem.cpuq.msgq.data.split_at_mut(tx);
>  
> -        if rx <= tx {
> -            // The area from `tx` up to the end of the ring, and from the beginning of the ring up
> -            // to `rx`, minus one unit, belongs to the driver.
> -            if rx == 0 {
> -                let last = after_tx.len() - 1;
> -                (&mut after_tx[..last], &mut before_tx[0..0])
> -            } else {
> -                (after_tx, &mut before_tx[..rx])

Indeed, the comment clearly states "minus one unit" but the second
branch (and the one below) ignores that. >_<

> -            }
> +        // The area starting at `tx` and ending at `rx - 2` modulo MSGQ_NUM_PAGES, inclusive,
> +        // belongs to the driver for writing.

I have trouble parsing this comment - it says `rx - 2` but nowhere does
that expression appear in the code. It's particularly confusing because
it apparently applies to all 3 blocks, but the position suggests it is
only the first.  I'd add a blank line after it to signal that it is
relevant to the rest of the method.

> +        if rx == 0 {
> +            // Since `rx` is zero, leave an empty slot at end of the buffer.
> +            let last = after_tx.len() - 1;
> +            (&mut after_tx[..last], &mut before_tx[0..0])
> +        } else if rx > tx {

In the original code, the `rx < tx` case appears first. Can we preserve
this order so the corresponding diffs also appear next to one another?

> +            // The area is contiguous and we leave an empty slot before `rx`.
> +            // PANIC:
> +            // - The index `rx - tx - 1` is non-negative because `rx > tx` in this branch.
> +            // - The index does not exceed `after_tx.len()` (which is `MSGQ_NUM_PAGES - tx`)
> +            //   because `rx < MSGQ_NUM_PAGES` by the `gsp_read_ptr` invariant.
> +            (&mut after_tx[..(rx - tx - 1)], &mut before_tx[0..0])
>          } else {
> -            // The area from `tx` to `rx`, minus one unit, belongs to the driver.
> -            //
> -            // PANIC: per the invariants of `cpu_write_ptr` and `gsp_read_ptr`, `rx` and `tx` are
> -            // `<= MSGQ_NUM_PAGES`, and the test above ensured that `rx > tx`.
> -            (after_tx.split_at_mut(rx - tx).0, &mut before_tx[0..0])
> +            // The area is discontiguous and we leave an empty slot before `rx`.
> +            // PANIC:
> +            // - The index `rx - 1` is non-negative because `rx != 0` in this branch.
> +            // - The index does not exceed `before_tx.len()` (which equals `tx`) because
> +            //   `rx <= tx` in this branch.
> +            (after_tx, &mut before_tx[..(rx - 1)])

In any case, nice catch! This would have bitten us sooner or later.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 3/4] gpu: nova-core: gsp: fix improper handling of empty slot in cmdq
  2026-01-26 18:26   ` Gary Guo
@ 2026-01-28 11:42     ` Alexandre Courbot
  0 siblings, 0 replies; 15+ messages in thread
From: Alexandre Courbot @ 2026-01-28 11:42 UTC (permalink / raw)
  To: Gary Guo
  Cc: Eliot Courtney, Danilo Krummrich, Alice Ryhl, David Airlie,
	Simona Vetter, Alistair Popple, nouveau, rust-for-linux,
	dri-devel, linux-kernel

On Tue Jan 27, 2026 at 3:26 AM JST, Gary Guo wrote:
> On Fri Jan 23, 2026 at 12:12 PM GMT, Eliot Courtney wrote:
>> The current code hands out buffers that go all the way up to and
>> including `rx - 1`, but we need to maintain an empty slot to prevent the
>> ring buffer from wrapping around into having 'tx == rx', which means
>> empty.
>>
>> Also add more rigorous no-panic proofs.
>>
>> Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
>> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
>> ---
>>  drivers/gpu/nova-core/gsp/cmdq.rs | 33 +++++++++++++++++++--------------
>>  1 file changed, 19 insertions(+), 14 deletions(-)
>>
>> diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
>> index 09c28eeb6f12..aa8758fc7723 100644
>> --- a/drivers/gpu/nova-core/gsp/cmdq.rs
>> +++ b/drivers/gpu/nova-core/gsp/cmdq.rs
>> @@ -227,21 +227,26 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
>>          // PANIC: per the invariant of `cpu_write_ptr`, `tx` is `< MSGQ_NUM_PAGES`.
>>          let (before_tx, after_tx) = gsp_mem.cpuq.msgq.data.split_at_mut(tx);
>>  
>> -        if rx <= tx {
>> -            // The area from `tx` up to the end of the ring, and from the beginning of the ring up
>> -            // to `rx`, minus one unit, belongs to the driver.
>> -            if rx == 0 {
>> -                let last = after_tx.len() - 1;
>> -                (&mut after_tx[..last], &mut before_tx[0..0])
>> -            } else {
>> -                (after_tx, &mut before_tx[..rx])
>> -            }
>> +        // The area starting at `tx` and ending at `rx - 2` modulo MSGQ_NUM_PAGES, inclusive,
>> +        // belongs to the driver for writing.
>> +        if rx == 0 {
>> +            // Since `rx` is zero, leave an empty slot at end of the buffer.
>> +            let last = after_tx.len() - 1;
>> +            (&mut after_tx[..last], &mut before_tx[0..0])
>
> Does the address actually matter? Otherwise I would find `&mut []` easier to
> understand than an empty indexing.

The address doesn't matter, and indeed I am not sure why we did that
(possibly a lifetime issue with some previous version of the code?).

Your suggestion seems to work fine, so Eliot feel free to include it in
your series (as a separate patch please to make sure we can isolate the
effects of both changes).

>
>> +        } else if rx > tx {
>> +            // The area is contiguous and we leave an empty slot before `rx`.
>> +            // PANIC:
>> +            // - The index `rx - tx - 1` is non-negative because `rx > tx` in this branch.
>> +            // - The index does not exceed `after_tx.len()` (which is `MSGQ_NUM_PAGES - tx`)
>> +            //   because `rx < MSGQ_NUM_PAGES` by the `gsp_read_ptr` invariant.
>> +            (&mut after_tx[..(rx - tx - 1)], &mut before_tx[0..0])
>>          } else {
>> -            // The area from `tx` to `rx`, minus one unit, belongs to the driver.
>> -            //
>> -            // PANIC: per the invariants of `cpu_write_ptr` and `gsp_read_ptr`, `rx` and `tx` are
>> -            // `<= MSGQ_NUM_PAGES`, and the test above ensured that `rx > tx`.
>> -            (after_tx.split_at_mut(rx - tx).0, &mut before_tx[0..0])
>> +            // The area is discontiguous and we leave an empty slot before `rx`.
>> +            // PANIC:
>> +            // - The index `rx - 1` is non-negative because `rx != 0` in this branch.
>> +            // - The index does not exceed `before_tx.len()` (which equals `tx`) because
>> +            //   `rx <= tx` in this branch.
>> +            (after_tx, &mut before_tx[..(rx - 1)])
>
> If this is written with get_disjoint_mut, the indices would be so much easier to
> understand... To bad that that API is only available from 1.86 onwards.

I for one am longing for `split_at_checked`... :)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 4/4] gpu: nova-core: gsp: fix improper indexing in driver_read_area
  2026-01-23 12:12 ` [PATCH v2 4/4] gpu: nova-core: gsp: fix improper indexing in driver_read_area Eliot Courtney
  2026-01-26 18:30   ` Gary Guo
@ 2026-01-28 11:57   ` Alexandre Courbot
  1 sibling, 0 replies; 15+ messages in thread
From: Alexandre Courbot @ 2026-01-28 11:57 UTC (permalink / raw)
  To: Eliot Courtney
  Cc: Danilo Krummrich, Alice Ryhl, David Airlie, Simona Vetter,
	Alistair Popple, nouveau, rust-for-linux, dri-devel, linux-kernel

On Fri Jan 23, 2026 at 9:12 PM JST, Eliot Courtney wrote:
> The current code indexes into `after_rx` using `tx` which is an index
> for the whole buffer, not the split buffer `after_rx`.

Another dangerous one! Well found.

>
> Also add more rigorous no-panic proofs.
>
> Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
> ---
>  drivers/gpu/nova-core/gsp/cmdq.rs | 19 ++++++++++++++-----
>  1 file changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
> index aa8758fc7723..c26396fda29c 100644
> --- a/drivers/gpu/nova-core/gsp/cmdq.rs
> +++ b/drivers/gpu/nova-core/gsp/cmdq.rs
> @@ -1,7 +1,6 @@
>  // SPDX-License-Identifier: GPL-2.0
>  
>  use core::{
> -    cmp,
>      mem,
>      sync::atomic::{
>          fence,
> @@ -267,10 +266,20 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
>          // PANIC: per the invariant of `cpu_read_ptr`, `rx` is `< MSGQ_NUM_PAGES`.
>          let (before_rx, after_rx) = gsp_mem.gspq.msgq.data.split_at(rx);
>  
> -        match tx.cmp(&rx) {
> -            cmp::Ordering::Equal => (&after_rx[0..0], &after_rx[0..0]),
> -            cmp::Ordering::Greater => (&after_rx[..tx], &before_rx[0..0]),
> -            cmp::Ordering::Less => (after_rx, &before_rx[..tx]),
> +        // The area starting at `rx` and ending at `tx - 1` modulo MSGQ_NUM_PAGES, inclusive,
> +        // belongs to the driver for reading.
> +        if rx <= tx {
> +            // The area is contiguous.
> +            // PANIC:
> +            // - The index `tx - rx` is non-negative because `rx <= tx` in this branch.
> +            // - The index does not exceed `after_rx.len()` (which is `MSGQ_NUM_PAGES - rx`)
> +            //   because `tx < MSGQ_NUM_PAGES` by the `gsp_write_ptr` invariant.
> +            (&after_rx[..(tx - rx)], &after_rx[0..0])

I guess we can also use `&[]` here (maybe convert the whole file in the
same patch).

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 4/4] gpu: nova-core: gsp: fix improper indexing in driver_read_area
  2026-01-26 18:30   ` Gary Guo
@ 2026-01-28 12:18     ` Alexandre Courbot
  0 siblings, 0 replies; 15+ messages in thread
From: Alexandre Courbot @ 2026-01-28 12:18 UTC (permalink / raw)
  To: Gary Guo
  Cc: Eliot Courtney, Danilo Krummrich, Alice Ryhl, David Airlie,
	Simona Vetter, Alistair Popple, nouveau, rust-for-linux,
	dri-devel, linux-kernel

On Tue Jan 27, 2026 at 3:30 AM JST, Gary Guo wrote:
> On Fri Jan 23, 2026 at 12:12 PM GMT, Eliot Courtney wrote:
>> The current code indexes into `after_rx` using `tx` which is an index
>> for the whole buffer, not the split buffer `after_rx`.
>>
>> Also add more rigorous no-panic proofs.
>>
>> Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
>> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
>> ---
>>  drivers/gpu/nova-core/gsp/cmdq.rs | 19 ++++++++++++++-----
>>  1 file changed, 14 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
>> index aa8758fc7723..c26396fda29c 100644
>> --- a/drivers/gpu/nova-core/gsp/cmdq.rs
>> +++ b/drivers/gpu/nova-core/gsp/cmdq.rs
>> @@ -1,7 +1,6 @@
>>  // SPDX-License-Identifier: GPL-2.0
>>  
>>  use core::{
>> -    cmp,
>>      mem,
>>      sync::atomic::{
>>          fence,
>> @@ -267,10 +266,20 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
>>          // PANIC: per the invariant of `cpu_read_ptr`, `rx` is `< MSGQ_NUM_PAGES`.
>>          let (before_rx, after_rx) = gsp_mem.gspq.msgq.data.split_at(rx);
>
> This code doesn't need splitting as it doesn't have the uniqueness issue that
> mutable references have. While you're at it, probably it's chance to simplify
> the code.
>
>>  
>> -        match tx.cmp(&rx) {
>> -            cmp::Ordering::Equal => (&after_rx[0..0], &after_rx[0..0]),
>> -            cmp::Ordering::Greater => (&after_rx[..tx], &before_rx[0..0]),
>> -            cmp::Ordering::Less => (after_rx, &before_rx[..tx]),
>> +        // The area starting at `rx` and ending at `tx - 1` modulo MSGQ_NUM_PAGES, inclusive,
>> +        // belongs to the driver for reading.
>> +        if rx <= tx {
>> +            // The area is contiguous.
>> +            // PANIC:
>> +            // - The index `tx - rx` is non-negative because `rx <= tx` in this branch.
>> +            // - The index does not exceed `after_rx.len()` (which is `MSGQ_NUM_PAGES - rx`)
>> +            //   because `tx < MSGQ_NUM_PAGES` by the `gsp_write_ptr` invariant.
>> +            (&after_rx[..(tx - rx)], &after_rx[0..0])
>
> This can be just `(&data[rx..tx], &[])` without the split.
>
>> +        } else {
>> +            // The area is discontiguous.
>> +            // PANIC: `tx` does not exceed `before_rx.len()` (which equals `rx`) because
>> +            //   `tx < rx` in this branch.
>> +            (after_rx, &before_rx[..tx])
>
> This can be just `(&data[rx..], &data[..tx])` without the split.

Indeed, this is arguably easier to understand.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2026-01-28 12:18 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-01-23 12:12 [PATCH v2 0/4] gpu: nova-core: gsp: fix command queue ring buffer bugs Eliot Courtney
2026-01-23 12:12 ` [PATCH v2 1/4] gpu: nova-core: gsp: fix incorrect advancing of write pointer Eliot Courtney
2026-01-23 12:12 ` [PATCH v2 2/4] gpu: nova-core: gsp: clarify comments about invariants and pointer roles Eliot Courtney
2026-01-26 18:04   ` Gary Guo
2026-01-28  4:35     ` Eliot Courtney
2026-01-28  8:17       ` Alexandre Courbot
2026-01-28 10:46         ` Danilo Krummrich
2026-01-23 12:12 ` [PATCH v2 3/4] gpu: nova-core: gsp: fix improper handling of empty slot in cmdq Eliot Courtney
2026-01-26 18:26   ` Gary Guo
2026-01-28 11:42     ` Alexandre Courbot
2026-01-28 11:39   ` Alexandre Courbot
2026-01-23 12:12 ` [PATCH v2 4/4] gpu: nova-core: gsp: fix improper indexing in driver_read_area Eliot Courtney
2026-01-26 18:30   ` Gary Guo
2026-01-28 12:18     ` Alexandre Courbot
2026-01-28 11:57   ` Alexandre Courbot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox