* [PATCH v5 1/9] gpu: nova-core: gsp: sort `MsgFunction` variants alphabetically
2026-03-04 1:42 [PATCH v5 0/9] gpu: nova-core: gsp: add continuation record support Eliot Courtney
@ 2026-03-04 1:42 ` Eliot Courtney
2026-03-04 1:42 ` [PATCH v5 2/9] gpu: nova-core: gsp: add mechanism to wait for space on command queue Eliot Courtney
` (7 subsequent siblings)
8 siblings, 0 replies; 22+ messages in thread
From: Eliot Courtney @ 2026-03-04 1:42 UTC (permalink / raw)
To: Danilo Krummrich, Alice Ryhl, Alexandre Courbot, David Airlie,
Simona Vetter, Miguel Ojeda, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg,
Trevor Gross
Cc: Zhi Wang, John Hubbard, Alistair Popple, Joel Fernandes,
Timur Tabi, nouveau, dri-devel, linux-kernel, rust-for-linux,
Eliot Courtney
There is no particular order required here and keeping them alphabetical
will help prevent future mistakes.
Tested-by: Zhi Wang <zhiw@nvidia.com>
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
drivers/gpu/nova-core/gsp/fw.rs | 67 +++++++++++++++++++++--------------------
1 file changed, 35 insertions(+), 32 deletions(-)
diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index f1797e1f0d9d..4b998485360b 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -191,34 +191,34 @@ pub(crate) fn new(gsp_firmware: &GspFirmware, fb_layout: &FbLayout) -> Self {
#[repr(u32)]
pub(crate) enum MsgFunction {
// Common function codes
- Nop = bindings::NV_VGPU_MSG_FUNCTION_NOP,
- SetGuestSystemInfo = bindings::NV_VGPU_MSG_FUNCTION_SET_GUEST_SYSTEM_INFO,
- AllocRoot = bindings::NV_VGPU_MSG_FUNCTION_ALLOC_ROOT,
+ AllocChannelDma = bindings::NV_VGPU_MSG_FUNCTION_ALLOC_CHANNEL_DMA,
+ AllocCtxDma = bindings::NV_VGPU_MSG_FUNCTION_ALLOC_CTX_DMA,
AllocDevice = bindings::NV_VGPU_MSG_FUNCTION_ALLOC_DEVICE,
AllocMemory = bindings::NV_VGPU_MSG_FUNCTION_ALLOC_MEMORY,
- AllocCtxDma = bindings::NV_VGPU_MSG_FUNCTION_ALLOC_CTX_DMA,
- AllocChannelDma = bindings::NV_VGPU_MSG_FUNCTION_ALLOC_CHANNEL_DMA,
- MapMemory = bindings::NV_VGPU_MSG_FUNCTION_MAP_MEMORY,
- BindCtxDma = bindings::NV_VGPU_MSG_FUNCTION_BIND_CTX_DMA,
AllocObject = bindings::NV_VGPU_MSG_FUNCTION_ALLOC_OBJECT,
+ AllocRoot = bindings::NV_VGPU_MSG_FUNCTION_ALLOC_ROOT,
+ BindCtxDma = bindings::NV_VGPU_MSG_FUNCTION_BIND_CTX_DMA,
Free = bindings::NV_VGPU_MSG_FUNCTION_FREE,
- Log = bindings::NV_VGPU_MSG_FUNCTION_LOG,
GetGspStaticInfo = bindings::NV_VGPU_MSG_FUNCTION_GET_GSP_STATIC_INFO,
- SetRegistry = bindings::NV_VGPU_MSG_FUNCTION_SET_REGISTRY,
- GspSetSystemInfo = bindings::NV_VGPU_MSG_FUNCTION_GSP_SET_SYSTEM_INFO,
+ GetStaticInfo = bindings::NV_VGPU_MSG_FUNCTION_GET_STATIC_INFO,
GspInitPostObjGpu = bindings::NV_VGPU_MSG_FUNCTION_GSP_INIT_POST_OBJGPU,
GspRmControl = bindings::NV_VGPU_MSG_FUNCTION_GSP_RM_CONTROL,
- GetStaticInfo = bindings::NV_VGPU_MSG_FUNCTION_GET_STATIC_INFO,
+ GspSetSystemInfo = bindings::NV_VGPU_MSG_FUNCTION_GSP_SET_SYSTEM_INFO,
+ Log = bindings::NV_VGPU_MSG_FUNCTION_LOG,
+ MapMemory = bindings::NV_VGPU_MSG_FUNCTION_MAP_MEMORY,
+ Nop = bindings::NV_VGPU_MSG_FUNCTION_NOP,
+ SetGuestSystemInfo = bindings::NV_VGPU_MSG_FUNCTION_SET_GUEST_SYSTEM_INFO,
+ SetRegistry = bindings::NV_VGPU_MSG_FUNCTION_SET_REGISTRY,
// Event codes
GspInitDone = bindings::NV_VGPU_MSG_EVENT_GSP_INIT_DONE,
+ GspLockdownNotice = bindings::NV_VGPU_MSG_EVENT_GSP_LOCKDOWN_NOTICE,
+ GspPostNoCat = bindings::NV_VGPU_MSG_EVENT_GSP_POST_NOCAT_RECORD,
GspRunCpuSequencer = bindings::NV_VGPU_MSG_EVENT_GSP_RUN_CPU_SEQUENCER,
- PostEvent = bindings::NV_VGPU_MSG_EVENT_POST_EVENT,
- RcTriggered = bindings::NV_VGPU_MSG_EVENT_RC_TRIGGERED,
MmuFaultQueued = bindings::NV_VGPU_MSG_EVENT_MMU_FAULT_QUEUED,
OsErrorLog = bindings::NV_VGPU_MSG_EVENT_OS_ERROR_LOG,
- GspPostNoCat = bindings::NV_VGPU_MSG_EVENT_GSP_POST_NOCAT_RECORD,
- GspLockdownNotice = bindings::NV_VGPU_MSG_EVENT_GSP_LOCKDOWN_NOTICE,
+ PostEvent = bindings::NV_VGPU_MSG_EVENT_POST_EVENT,
+ RcTriggered = bindings::NV_VGPU_MSG_EVENT_RC_TRIGGERED,
UcodeLibOsPrint = bindings::NV_VGPU_MSG_EVENT_UCODE_LIBOS_PRINT,
}
@@ -227,38 +227,41 @@ impl TryFrom<u32> for MsgFunction {
fn try_from(value: u32) -> Result<MsgFunction> {
match value {
- bindings::NV_VGPU_MSG_FUNCTION_NOP => Ok(MsgFunction::Nop),
- bindings::NV_VGPU_MSG_FUNCTION_SET_GUEST_SYSTEM_INFO => {
- Ok(MsgFunction::SetGuestSystemInfo)
- }
- bindings::NV_VGPU_MSG_FUNCTION_ALLOC_ROOT => Ok(MsgFunction::AllocRoot),
+ // Common function codes
+ bindings::NV_VGPU_MSG_FUNCTION_ALLOC_CHANNEL_DMA => Ok(MsgFunction::AllocChannelDma),
+ bindings::NV_VGPU_MSG_FUNCTION_ALLOC_CTX_DMA => Ok(MsgFunction::AllocCtxDma),
bindings::NV_VGPU_MSG_FUNCTION_ALLOC_DEVICE => Ok(MsgFunction::AllocDevice),
bindings::NV_VGPU_MSG_FUNCTION_ALLOC_MEMORY => Ok(MsgFunction::AllocMemory),
- bindings::NV_VGPU_MSG_FUNCTION_ALLOC_CTX_DMA => Ok(MsgFunction::AllocCtxDma),
- bindings::NV_VGPU_MSG_FUNCTION_ALLOC_CHANNEL_DMA => Ok(MsgFunction::AllocChannelDma),
- bindings::NV_VGPU_MSG_FUNCTION_MAP_MEMORY => Ok(MsgFunction::MapMemory),
- bindings::NV_VGPU_MSG_FUNCTION_BIND_CTX_DMA => Ok(MsgFunction::BindCtxDma),
bindings::NV_VGPU_MSG_FUNCTION_ALLOC_OBJECT => Ok(MsgFunction::AllocObject),
+ bindings::NV_VGPU_MSG_FUNCTION_ALLOC_ROOT => Ok(MsgFunction::AllocRoot),
+ bindings::NV_VGPU_MSG_FUNCTION_BIND_CTX_DMA => Ok(MsgFunction::BindCtxDma),
bindings::NV_VGPU_MSG_FUNCTION_FREE => Ok(MsgFunction::Free),
- bindings::NV_VGPU_MSG_FUNCTION_LOG => Ok(MsgFunction::Log),
bindings::NV_VGPU_MSG_FUNCTION_GET_GSP_STATIC_INFO => Ok(MsgFunction::GetGspStaticInfo),
- bindings::NV_VGPU_MSG_FUNCTION_SET_REGISTRY => Ok(MsgFunction::SetRegistry),
- bindings::NV_VGPU_MSG_FUNCTION_GSP_SET_SYSTEM_INFO => Ok(MsgFunction::GspSetSystemInfo),
+ bindings::NV_VGPU_MSG_FUNCTION_GET_STATIC_INFO => Ok(MsgFunction::GetStaticInfo),
bindings::NV_VGPU_MSG_FUNCTION_GSP_INIT_POST_OBJGPU => {
Ok(MsgFunction::GspInitPostObjGpu)
}
bindings::NV_VGPU_MSG_FUNCTION_GSP_RM_CONTROL => Ok(MsgFunction::GspRmControl),
- bindings::NV_VGPU_MSG_FUNCTION_GET_STATIC_INFO => Ok(MsgFunction::GetStaticInfo),
+ bindings::NV_VGPU_MSG_FUNCTION_GSP_SET_SYSTEM_INFO => Ok(MsgFunction::GspSetSystemInfo),
+ bindings::NV_VGPU_MSG_FUNCTION_LOG => Ok(MsgFunction::Log),
+ bindings::NV_VGPU_MSG_FUNCTION_MAP_MEMORY => Ok(MsgFunction::MapMemory),
+ bindings::NV_VGPU_MSG_FUNCTION_NOP => Ok(MsgFunction::Nop),
+ bindings::NV_VGPU_MSG_FUNCTION_SET_GUEST_SYSTEM_INFO => {
+ Ok(MsgFunction::SetGuestSystemInfo)
+ }
+ bindings::NV_VGPU_MSG_FUNCTION_SET_REGISTRY => Ok(MsgFunction::SetRegistry),
+
+ // Event codes
bindings::NV_VGPU_MSG_EVENT_GSP_INIT_DONE => Ok(MsgFunction::GspInitDone),
+ bindings::NV_VGPU_MSG_EVENT_GSP_LOCKDOWN_NOTICE => Ok(MsgFunction::GspLockdownNotice),
+ bindings::NV_VGPU_MSG_EVENT_GSP_POST_NOCAT_RECORD => Ok(MsgFunction::GspPostNoCat),
bindings::NV_VGPU_MSG_EVENT_GSP_RUN_CPU_SEQUENCER => {
Ok(MsgFunction::GspRunCpuSequencer)
}
- bindings::NV_VGPU_MSG_EVENT_POST_EVENT => Ok(MsgFunction::PostEvent),
- bindings::NV_VGPU_MSG_EVENT_RC_TRIGGERED => Ok(MsgFunction::RcTriggered),
bindings::NV_VGPU_MSG_EVENT_MMU_FAULT_QUEUED => Ok(MsgFunction::MmuFaultQueued),
bindings::NV_VGPU_MSG_EVENT_OS_ERROR_LOG => Ok(MsgFunction::OsErrorLog),
- bindings::NV_VGPU_MSG_EVENT_GSP_POST_NOCAT_RECORD => Ok(MsgFunction::GspPostNoCat),
- bindings::NV_VGPU_MSG_EVENT_GSP_LOCKDOWN_NOTICE => Ok(MsgFunction::GspLockdownNotice),
+ bindings::NV_VGPU_MSG_EVENT_POST_EVENT => Ok(MsgFunction::PostEvent),
+ bindings::NV_VGPU_MSG_EVENT_RC_TRIGGERED => Ok(MsgFunction::RcTriggered),
bindings::NV_VGPU_MSG_EVENT_UCODE_LIBOS_PRINT => Ok(MsgFunction::UcodeLibOsPrint),
_ => Err(EINVAL),
}
--
2.53.0
^ permalink raw reply related [flat|nested] 22+ messages in thread* [PATCH v5 2/9] gpu: nova-core: gsp: add mechanism to wait for space on command queue
2026-03-04 1:42 [PATCH v5 0/9] gpu: nova-core: gsp: add continuation record support Eliot Courtney
2026-03-04 1:42 ` [PATCH v5 1/9] gpu: nova-core: gsp: sort `MsgFunction` variants alphabetically Eliot Courtney
@ 2026-03-04 1:42 ` Eliot Courtney
2026-03-04 11:39 ` Danilo Krummrich
2026-03-04 1:42 ` [PATCH v5 3/9] rust: add EMSGSIZE error code Eliot Courtney
` (6 subsequent siblings)
8 siblings, 1 reply; 22+ messages in thread
From: Eliot Courtney @ 2026-03-04 1:42 UTC (permalink / raw)
To: Danilo Krummrich, Alice Ryhl, Alexandre Courbot, David Airlie,
Simona Vetter, Miguel Ojeda, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg,
Trevor Gross
Cc: Zhi Wang, John Hubbard, Alistair Popple, Joel Fernandes,
Timur Tabi, nouveau, dri-devel, linux-kernel, rust-for-linux,
Eliot Courtney
Add a timeout to `allocate_command` which waits for space on the GSP
command queue. It uses a similar timeout to nouveau.
This lets `send_command` wait for space to free up in the command queue.
This is required to support continuation records which can fill up the
queue.
Tested-by: Zhi Wang <zhiw@nvidia.com>
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
drivers/gpu/nova-core/gsp/cmdq.rs | 42 ++++++++++++++++++++++++++++-----------
1 file changed, 30 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index 87dbbd6d1be9..5cb845bae123 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -250,6 +250,19 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
}
}
+ /// Returns the size of the region of the CPU message queue that the driver is currently allowed
+ /// to write to, in bytes.
+ fn driver_write_area_size(&self) -> usize {
+ let tx = self.cpu_write_ptr();
+ let rx = self.gsp_read_ptr();
+
+ // `rx` and `tx` are both in `0..MSGQ_NUM_PAGES` per the invariants of `gsp_read_ptr` and
+ // `cpu_write_ptr`. The minimum value case is where `rx == 0` and `tx == MSGQ_NUM_PAGES -
+ // 1`, which gives `0 + MSGQ_NUM_PAGES - (MSGQ_NUM_PAGES - 1) - 1 == 0`.
+ let slots = (rx + MSGQ_NUM_PAGES - tx - 1) % MSGQ_NUM_PAGES;
+ num::u32_as_usize(slots) * GSP_PAGE_SIZE
+ }
+
/// Returns the region of the GSP message queue that the driver is currently allowed to read
/// from.
///
@@ -281,15 +294,22 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
}
/// Allocates a region on the command queue that is large enough to send a command of `size`
- /// bytes.
+ /// bytes, waiting for space to become available based on the provided timeout.
///
/// This returns a [`GspCommand`] ready to be written to by the caller.
///
/// # Errors
///
- /// - `EAGAIN` if the driver area is too small to hold the requested command.
+ /// - `ETIMEDOUT` if space does not become available within the timeout.
/// - `EIO` if the command header is not properly aligned.
- fn allocate_command(&mut self, size: usize) -> Result<GspCommand<'_>> {
+ fn allocate_command(&mut self, size: usize, timeout: Delta) -> Result<GspCommand<'_>> {
+ read_poll_timeout(
+ || Ok(self.driver_write_area_size()),
+ |available_bytes| *available_bytes >= size_of::<GspMsgElement>() + size,
+ Delta::ZERO,
+ timeout,
+ )?;
+
// Get the current writable area as an array of bytes.
let (slice_1, slice_2) = {
let (slice_1, slice_2) = self.driver_write_area();
@@ -298,13 +318,6 @@ fn allocate_command(&mut self, size: usize) -> Result<GspCommand<'_>> {
(slice_1.as_flattened_mut(), slice_2.as_flattened_mut())
};
- // If the GSP is still processing previous messages the shared region
- // may be full in which case we will have to retry once the GSP has
- // processed the existing commands.
- if size_of::<GspMsgElement>() + size > slice_1.len() + slice_2.len() {
- return Err(EAGAIN);
- }
-
// Extract area for the `GspMsgElement`.
let (header, slice_1) = GspMsgElement::from_bytes_mut_prefix(slice_1).ok_or(EIO)?;
@@ -462,6 +475,9 @@ impl Cmdq {
/// Number of page table entries for the GSP shared region.
pub(crate) const NUM_PTES: usize = size_of::<GspMem>() >> GSP_PAGE_SHIFT;
+ /// Timeout for waiting for space on the command queue.
+ const ALLOCATE_TIMEOUT: Delta = Delta::from_secs(1);
+
/// Creates a new command queue for `dev`.
pub(crate) fn new(dev: &device::Device<device::Bound>) -> Result<Cmdq> {
let gsp_mem = DmaGspMem::new(dev)?;
@@ -497,7 +513,7 @@ fn notify_gsp(bar: &Bar0) {
///
/// # Errors
///
- /// - `EAGAIN` if there was not enough space in the command queue to send the command.
+ /// - `ETIMEDOUT` if space does not become available within the timeout.
/// - `EIO` if the variable payload requested by the command has not been entirely
/// written to by its [`CommandToGsp::init_variable_payload`] method.
///
@@ -509,7 +525,9 @@ pub(crate) fn send_command<M>(&mut self, bar: &Bar0, command: M) -> Result
Error: From<M::InitError>,
{
let command_size = size_of::<M::Command>() + command.variable_payload_len();
- let dst = self.gsp_mem.allocate_command(command_size)?;
+ let dst = self
+ .gsp_mem
+ .allocate_command(command_size, Self::ALLOCATE_TIMEOUT)?;
// Extract area for the command itself.
let (cmd, payload_1) = M::Command::from_bytes_mut_prefix(dst.contents.0).ok_or(EIO)?;
--
2.53.0
^ permalink raw reply related [flat|nested] 22+ messages in thread* Re: [PATCH v5 2/9] gpu: nova-core: gsp: add mechanism to wait for space on command queue
2026-03-04 1:42 ` [PATCH v5 2/9] gpu: nova-core: gsp: add mechanism to wait for space on command queue Eliot Courtney
@ 2026-03-04 11:39 ` Danilo Krummrich
2026-03-05 7:37 ` Eliot Courtney
0 siblings, 1 reply; 22+ messages in thread
From: Danilo Krummrich @ 2026-03-04 11:39 UTC (permalink / raw)
To: Eliot Courtney
Cc: Alice Ryhl, Alexandre Courbot, David Airlie, Simona Vetter,
Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, Andreas Hindborg, Trevor Gross, Zhi Wang,
John Hubbard, Alistair Popple, Joel Fernandes, Timur Tabi,
nouveau, dri-devel, linux-kernel, rust-for-linux
On Wed Mar 4, 2026 at 2:42 AM CET, Eliot Courtney wrote:
> + fn allocate_command(&mut self, size: usize, timeout: Delta) -> Result<GspCommand<'_>> {
> + read_poll_timeout(
> + || Ok(self.driver_write_area_size()),
> + |available_bytes| *available_bytes >= size_of::<GspMsgElement>() + size,
> + Delta::ZERO,
Isn't this either creating unnecessary thrashing of the memory controller or
unnecessary contention at the cache-coherency level?
I think we should probably add at least a small delay of something around 1us.
> + timeout,
> + )?;
^ permalink raw reply [flat|nested] 22+ messages in thread* Re: [PATCH v5 2/9] gpu: nova-core: gsp: add mechanism to wait for space on command queue
2026-03-04 11:39 ` Danilo Krummrich
@ 2026-03-05 7:37 ` Eliot Courtney
2026-03-05 7:50 ` John Hubbard
` (2 more replies)
0 siblings, 3 replies; 22+ messages in thread
From: Eliot Courtney @ 2026-03-05 7:37 UTC (permalink / raw)
To: Danilo Krummrich, Eliot Courtney
Cc: Alice Ryhl, Alexandre Courbot, David Airlie, Simona Vetter,
Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, Andreas Hindborg, Trevor Gross, Zhi Wang,
John Hubbard, Alistair Popple, Joel Fernandes, Timur Tabi,
nouveau, dri-devel, linux-kernel, rust-for-linux, dri-devel
On Wed Mar 4, 2026 at 8:39 PM JST, Danilo Krummrich wrote:
> On Wed Mar 4, 2026 at 2:42 AM CET, Eliot Courtney wrote:
>> + fn allocate_command(&mut self, size: usize, timeout: Delta) -> Result<GspCommand<'_>> {
>> + read_poll_timeout(
>> + || Ok(self.driver_write_area_size()),
>> + |available_bytes| *available_bytes >= size_of::<GspMsgElement>() + size,
>> + Delta::ZERO,
>
> Isn't this either creating unnecessary thrashing of the memory controller or
> unnecessary contention at the cache-coherency level?
>
> I think we should probably add at least a small delay of something around 1us.
This is what nouveau does (specifically `usleep_range(1, 2)`). OTOH,
openrm just does a busy wait, which is what I replicated here for now.
GSP command queue not having space IIUC is meant to be very exceptional.
I am not sure which is best, maybe Alex has an opinion, but also happy
to change it because that reasoning makes sense to me and I don't know
enough about the distribution of how often it would actually need
to wait to know if 0 delay is justified.
^ permalink raw reply [flat|nested] 22+ messages in thread* Re: [PATCH v5 2/9] gpu: nova-core: gsp: add mechanism to wait for space on command queue
2026-03-05 7:37 ` Eliot Courtney
@ 2026-03-05 7:50 ` John Hubbard
2026-03-05 8:01 ` Eliot Courtney
2026-03-05 10:38 ` Alexandre Courbot
2026-03-05 11:16 ` Danilo Krummrich
2 siblings, 1 reply; 22+ messages in thread
From: John Hubbard @ 2026-03-05 7:50 UTC (permalink / raw)
To: Eliot Courtney, Danilo Krummrich
Cc: Alice Ryhl, Alexandre Courbot, David Airlie, Simona Vetter,
Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, Andreas Hindborg, Trevor Gross, Zhi Wang,
Alistair Popple, Joel Fernandes, Timur Tabi, nouveau, dri-devel,
linux-kernel, rust-for-linux, dri-devel
On 3/4/26 11:37 PM, Eliot Courtney wrote:
> On Wed Mar 4, 2026 at 8:39 PM JST, Danilo Krummrich wrote:
>> On Wed Mar 4, 2026 at 2:42 AM CET, Eliot Courtney wrote:
>>> + fn allocate_command(&mut self, size: usize, timeout: Delta) -> Result<GspCommand<'_>> {
>>> + read_poll_timeout(
>>> + || Ok(self.driver_write_area_size()),
>>> + |available_bytes| *available_bytes >= size_of::<GspMsgElement>() + size,
>>> + Delta::ZERO,
>>
>> Isn't this either creating unnecessary thrashing of the memory controller or
>> unnecessary contention at the cache-coherency level?
>>
>> I think we should probably add at least a small delay of something around 1us.
>
> This is what nouveau does (specifically `usleep_range(1, 2)`). OTOH,
> openrm just does a busy wait, which is what I replicated here for now.
Open RM has some ancient bad habits!
> GSP command queue not having space IIUC is meant to be very exceptional.
> I am not sure which is best, maybe Alex has an opinion, but also happy
> to change it because that reasoning makes sense to me and I don't know
> enough about the distribution of how often it would actually need
> to wait to know if 0 delay is justified.
Almost never! There is a big big difference in how the OS behaves,
between 0 delay and non-zero delay. Sleeping, if you can afford to
(and we can, here, IIUC) is far better than a hard spin loop: the
scheduler can do something reasonable, for one thing.
thanks,
--
John Hubbard
^ permalink raw reply [flat|nested] 22+ messages in thread* Re: [PATCH v5 2/9] gpu: nova-core: gsp: add mechanism to wait for space on command queue
2026-03-05 7:50 ` John Hubbard
@ 2026-03-05 8:01 ` Eliot Courtney
0 siblings, 0 replies; 22+ messages in thread
From: Eliot Courtney @ 2026-03-05 8:01 UTC (permalink / raw)
To: John Hubbard, Eliot Courtney, Danilo Krummrich
Cc: Alice Ryhl, Alexandre Courbot, David Airlie, Simona Vetter,
Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, Andreas Hindborg, Trevor Gross, Zhi Wang,
Alistair Popple, Joel Fernandes, Timur Tabi, nouveau, dri-devel,
linux-kernel, rust-for-linux, dri-devel
On Thu Mar 5, 2026 at 4:50 PM JST, John Hubbard wrote:
> On 3/4/26 11:37 PM, Eliot Courtney wrote:
>> On Wed Mar 4, 2026 at 8:39 PM JST, Danilo Krummrich wrote:
>>> On Wed Mar 4, 2026 at 2:42 AM CET, Eliot Courtney wrote:
>>>> + fn allocate_command(&mut self, size: usize, timeout: Delta) -> Result<GspCommand<'_>> {
>>>> + read_poll_timeout(
>>>> + || Ok(self.driver_write_area_size()),
>>>> + |available_bytes| *available_bytes >= size_of::<GspMsgElement>() + size,
>>>> + Delta::ZERO,
>>>
>>> Isn't this either creating unnecessary thrashing of the memory controller or
>>> unnecessary contention at the cache-coherency level?
>>>
>>> I think we should probably add at least a small delay of something around 1us.
>>
>> This is what nouveau does (specifically `usleep_range(1, 2)`). OTOH,
>> openrm just does a busy wait, which is what I replicated here for now.
>
> Open RM has some ancient bad habits!
>
>> GSP command queue not having space IIUC is meant to be very exceptional.
>> I am not sure which is best, maybe Alex has an opinion, but also happy
>> to change it because that reasoning makes sense to me and I don't know
>> enough about the distribution of how often it would actually need
>> to wait to know if 0 delay is justified.
>
> Almost never! There is a big big difference in how the OS behaves,
> between 0 delay and non-zero delay. Sleeping, if you can afford to
> (and we can, here, IIUC) is far better than a hard spin loop: the
> scheduler can do something reasonable, for one thing.
In this case the Rust `read_poll_timeout` has a preemption point, so
maybe the scheduler would be fine?
But, if we can afford to sleep here it does seem better to me.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v5 2/9] gpu: nova-core: gsp: add mechanism to wait for space on command queue
2026-03-05 7:37 ` Eliot Courtney
2026-03-05 7:50 ` John Hubbard
@ 2026-03-05 10:38 ` Alexandre Courbot
2026-03-05 11:16 ` Danilo Krummrich
2 siblings, 0 replies; 22+ messages in thread
From: Alexandre Courbot @ 2026-03-05 10:38 UTC (permalink / raw)
To: Eliot Courtney
Cc: Danilo Krummrich, Alice Ryhl, David Airlie, Simona Vetter,
Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, Andreas Hindborg, Trevor Gross, Zhi Wang,
John Hubbard, Alistair Popple, Joel Fernandes, Timur Tabi,
nouveau, dri-devel, linux-kernel, rust-for-linux, dri-devel
On Thu Mar 5, 2026 at 4:37 PM JST, Eliot Courtney wrote:
> On Wed Mar 4, 2026 at 8:39 PM JST, Danilo Krummrich wrote:
>> On Wed Mar 4, 2026 at 2:42 AM CET, Eliot Courtney wrote:
>>> + fn allocate_command(&mut self, size: usize, timeout: Delta) -> Result<GspCommand<'_>> {
>>> + read_poll_timeout(
>>> + || Ok(self.driver_write_area_size()),
>>> + |available_bytes| *available_bytes >= size_of::<GspMsgElement>() + size,
>>> + Delta::ZERO,
>>
>> Isn't this either creating unnecessary thrashing of the memory controller or
>> unnecessary contention at the cache-coherency level?
>>
>> I think we should probably add at least a small delay of something around 1us.
>
> This is what nouveau does (specifically `usleep_range(1, 2)`). OTOH,
> openrm just does a busy wait, which is what I replicated here for now.
> GSP command queue not having space IIUC is meant to be very exceptional.
> I am not sure which is best, maybe Alex has an opinion, but also happy
> to change it because that reasoning makes sense to me and I don't know
> enough about the distribution of how often it would actually need
> to wait to know if 0 delay is justified.
We tend to align on OpenRM generally speaking, but since waiting should
be exceptional anyway it shouldn't really matter if we add a very short
delay between polls. Agree that it looks better than a brute busy-loop.
^ permalink raw reply [flat|nested] 22+ messages in thread* Re: [PATCH v5 2/9] gpu: nova-core: gsp: add mechanism to wait for space on command queue
2026-03-05 7:37 ` Eliot Courtney
2026-03-05 7:50 ` John Hubbard
2026-03-05 10:38 ` Alexandre Courbot
@ 2026-03-05 11:16 ` Danilo Krummrich
2026-03-06 0:48 ` Alexandre Courbot
2026-03-06 5:21 ` Alexandre Courbot
2 siblings, 2 replies; 22+ messages in thread
From: Danilo Krummrich @ 2026-03-05 11:16 UTC (permalink / raw)
To: Eliot Courtney, Alexandre Courbot
Cc: Alice Ryhl, Alexandre Courbot, David Airlie, Simona Vetter,
Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, Andreas Hindborg, Trevor Gross, Zhi Wang,
John Hubbard, Alistair Popple, Joel Fernandes, Timur Tabi,
nouveau, dri-devel, linux-kernel, rust-for-linux, dri-devel
On Thu Mar 5, 2026 at 8:37 AM CET, Eliot Courtney wrote:
> On Wed Mar 4, 2026 at 8:39 PM JST, Danilo Krummrich wrote:
>> On Wed Mar 4, 2026 at 2:42 AM CET, Eliot Courtney wrote:
>>> + fn allocate_command(&mut self, size: usize, timeout: Delta) -> Result<GspCommand<'_>> {
>>> + read_poll_timeout(
>>> + || Ok(self.driver_write_area_size()),
>>> + |available_bytes| *available_bytes >= size_of::<GspMsgElement>() + size,
>>> + Delta::ZERO,
>>
>> Isn't this either creating unnecessary thrashing of the memory controller or
>> unnecessary contention at the cache-coherency level?
>>
>> I think we should probably add at least a small delay of something around 1us.
>
> This is what nouveau does (specifically `usleep_range(1, 2)`). OTOH,
> openrm just does a busy wait, which is what I replicated here for now.
> GSP command queue not having space IIUC is meant to be very exceptional.
> I am not sure which is best, maybe Alex has an opinion, but also happy
> to change it because that reasoning makes sense to me and I don't know
> enough about the distribution of how often it would actually need
> to wait to know if 0 delay is justified.
Well, what this code says is "let's hammer the cache / memory controller as fast
as we can for up to one second".
This really should come with some justification why it is actually needed for
proper operation of the driver.
@Alex: It also seems that this is based on broken code, i.e. I noticed how the
DMA read is done in this case in e.g. gsp_read_ptr().
fn cpu_read_ptr(&self) -> u32 {
let gsp_mem = self.0.start_ptr();
// SAFETY:
// - The ['CoherentAllocation'] contains at least one object.
// - By the invariants of CoherentAllocation the pointer is valid.
(unsafe { (*gsp_mem).cpuq.rx.read_ptr() } % MSGQ_NUM_PAGES)
}
Why isn't this using dma_read!()? I think creating this reference is UB.
^ permalink raw reply [flat|nested] 22+ messages in thread* Re: [PATCH v5 2/9] gpu: nova-core: gsp: add mechanism to wait for space on command queue
2026-03-05 11:16 ` Danilo Krummrich
@ 2026-03-06 0:48 ` Alexandre Courbot
2026-03-06 1:46 ` Gary Guo
2026-03-06 11:03 ` Danilo Krummrich
2026-03-06 5:21 ` Alexandre Courbot
1 sibling, 2 replies; 22+ messages in thread
From: Alexandre Courbot @ 2026-03-06 0:48 UTC (permalink / raw)
To: Danilo Krummrich
Cc: Eliot Courtney, Alice Ryhl, David Airlie, Simona Vetter,
Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, Andreas Hindborg, Trevor Gross, Zhi Wang,
John Hubbard, Alistair Popple, Joel Fernandes, Timur Tabi,
nouveau, dri-devel, linux-kernel, rust-for-linux, dri-devel
On Thu Mar 5, 2026 at 8:16 PM JST, Danilo Krummrich wrote:
> @Alex: It also seems that this is based on broken code, i.e. I noticed how the
> DMA read is done in this case in e.g. gsp_read_ptr().
>
> fn cpu_read_ptr(&self) -> u32 {
> let gsp_mem = self.0.start_ptr();
>
> // SAFETY:
> // - The ['CoherentAllocation'] contains at least one object.
> // - By the invariants of CoherentAllocation the pointer is valid.
> (unsafe { (*gsp_mem).cpuq.rx.read_ptr() } % MSGQ_NUM_PAGES)
> }
>
> Why isn't this using dma_read!()? I think creating this reference is UB.
We can't - technically we would have to have the `dma_read` in `cmdq.rs`
so it can access the `CoherentAllocation` (and do an unwrap in the
process):
dma_read!(self.0, 0, .gspq.rx.0.readPtr).unwrap()
... but that cannot be done as `MsgqRxHeader` is part of the bindings
(i.e. in `fw.rs`) and thus its internal fields are not visible to
`cmdq.rs`, as per our policy of making the bindings opaque.
This can probably be done better with I/O projections, but for now we
have to do the read_volatile by ourselves. What makes this reference UB
btw?
^ permalink raw reply [flat|nested] 22+ messages in thread* Re: [PATCH v5 2/9] gpu: nova-core: gsp: add mechanism to wait for space on command queue
2026-03-06 0:48 ` Alexandre Courbot
@ 2026-03-06 1:46 ` Gary Guo
2026-03-06 11:03 ` Danilo Krummrich
1 sibling, 0 replies; 22+ messages in thread
From: Gary Guo @ 2026-03-06 1:46 UTC (permalink / raw)
To: Alexandre Courbot, Danilo Krummrich
Cc: Eliot Courtney, Alice Ryhl, David Airlie, Simona Vetter,
Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, Andreas Hindborg, Trevor Gross, Zhi Wang,
John Hubbard, Alistair Popple, Joel Fernandes, Timur Tabi,
nouveau, dri-devel, linux-kernel, rust-for-linux, dri-devel
On Fri Mar 6, 2026 at 12:48 AM GMT, Alexandre Courbot wrote:
> On Thu Mar 5, 2026 at 8:16 PM JST, Danilo Krummrich wrote:
>> @Alex: It also seems that this is based on broken code, i.e. I noticed how the
>> DMA read is done in this case in e.g. gsp_read_ptr().
>>
>> fn cpu_read_ptr(&self) -> u32 {
>> let gsp_mem = self.0.start_ptr();
>>
>> // SAFETY:
>> // - The ['CoherentAllocation'] contains at least one object.
>> // - By the invariants of CoherentAllocation the pointer is valid.
>> (unsafe { (*gsp_mem).cpuq.rx.read_ptr() } % MSGQ_NUM_PAGES)
>> }
>>
>> Why isn't this using dma_read!()? I think creating this reference is UB.
>
> We can't - technically we would have to have the `dma_read` in `cmdq.rs`
> so it can access the `CoherentAllocation` (and do an unwrap in the
> process):
>
> dma_read!(self.0, 0, .gspq.rx.0.readPtr).unwrap()
>
> ... but that cannot be done as `MsgqRxHeader` is part of the bindings
> (i.e. in `fw.rs`) and thus its internal fields are not visible to
> `cmdq.rs`, as per our policy of making the bindings opaque.
>
> This can probably be done better with I/O projections, but for now we
> have to do the read_volatile by ourselves. What makes this reference UB
> btw?
MsgqRxHeader does not have interior mutability and is not pinned. Thus data must
not change underneath a `&MsgRxHeader`, which isn't true.
To correct this you need to make all methods take a pointer rather than
reference, or wrap the raw binding data inside `Opaque<>`.
Best,
Gary
^ permalink raw reply [flat|nested] 22+ messages in thread* Re: [PATCH v5 2/9] gpu: nova-core: gsp: add mechanism to wait for space on command queue
2026-03-06 0:48 ` Alexandre Courbot
2026-03-06 1:46 ` Gary Guo
@ 2026-03-06 11:03 ` Danilo Krummrich
1 sibling, 0 replies; 22+ messages in thread
From: Danilo Krummrich @ 2026-03-06 11:03 UTC (permalink / raw)
To: Alexandre Courbot
Cc: Eliot Courtney, Alice Ryhl, David Airlie, Simona Vetter,
Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, Andreas Hindborg, Trevor Gross, Zhi Wang,
John Hubbard, Alistair Popple, Joel Fernandes, Timur Tabi,
nouveau, dri-devel, linux-kernel, rust-for-linux, dri-devel
On Fri Mar 6, 2026 at 1:48 AM CET, Alexandre Courbot wrote:
> On Thu Mar 5, 2026 at 8:16 PM JST, Danilo Krummrich wrote:
>> @Alex: It also seems that this is based on broken code, i.e. I noticed how the
>> DMA read is done in this case in e.g. gsp_read_ptr().
>>
>> fn cpu_read_ptr(&self) -> u32 {
>> let gsp_mem = self.0.start_ptr();
>>
>> // SAFETY:
>> // - The ['CoherentAllocation'] contains at least one object.
>> // - By the invariants of CoherentAllocation the pointer is valid.
>> (unsafe { (*gsp_mem).cpuq.rx.read_ptr() } % MSGQ_NUM_PAGES)
>> }
>>
>> Why isn't this using dma_read!()? I think creating this reference is UB.
>
> We can't - technically we would have to have the `dma_read` in `cmdq.rs`
> so it can access the `CoherentAllocation` (and do an unwrap in the
> process):
>
> dma_read!(self.0, 0, .gspq.rx.0.readPtr).unwrap()
>
> ... but that cannot be done as `MsgqRxHeader` is part of the bindings
> (i.e. in `fw.rs`) and thus its internal fields are not visible to
> `cmdq.rs`, as per our policy of making the bindings opaque.
We can have helpers for doing such dma_read!() calls in gsp/fw.rs instead, and
just forward from the actual methods.
fn cpu_read_ptr(&self) -> u32 {
fw::gsp_mem::cpu_rx_ptr(self) % MSGQ_NUM_PAGES
}
> This can probably be done better with I/O projections, but for now we have to
> do the read_volatile by ourselves.
Not necessarily, see above.
> What makes this reference UB btw?
Gary explained this in another reply already; I think fixing this with Opaque or
passing raw pointers instead involves even more unsafe. Whereas the simple
indirection from above is fully safe and can easily replaced with I/O
projections once we have them.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v5 2/9] gpu: nova-core: gsp: add mechanism to wait for space on command queue
2026-03-05 11:16 ` Danilo Krummrich
2026-03-06 0:48 ` Alexandre Courbot
@ 2026-03-06 5:21 ` Alexandre Courbot
1 sibling, 0 replies; 22+ messages in thread
From: Alexandre Courbot @ 2026-03-06 5:21 UTC (permalink / raw)
To: Danilo Krummrich
Cc: Eliot Courtney, Alice Ryhl, David Airlie, Simona Vetter,
Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, Andreas Hindborg, Trevor Gross, Zhi Wang,
John Hubbard, Alistair Popple, Joel Fernandes, Timur Tabi,
nouveau, dri-devel, linux-kernel, rust-for-linux, dri-devel
On Thu Mar 5, 2026 at 8:16 PM JST, Danilo Krummrich wrote:
> On Thu Mar 5, 2026 at 8:37 AM CET, Eliot Courtney wrote:
>> On Wed Mar 4, 2026 at 8:39 PM JST, Danilo Krummrich wrote:
>>> On Wed Mar 4, 2026 at 2:42 AM CET, Eliot Courtney wrote:
>>>> + fn allocate_command(&mut self, size: usize, timeout: Delta) -> Result<GspCommand<'_>> {
>>>> + read_poll_timeout(
>>>> + || Ok(self.driver_write_area_size()),
>>>> + |available_bytes| *available_bytes >= size_of::<GspMsgElement>() + size,
>>>> + Delta::ZERO,
>>>
>>> Isn't this either creating unneccessary thrashing of the memory controller or
>>> unnecessary contention at the cache-coherency level?
>>>
>>> I think we should probably add at least a small delay of something around 1us.
>>
>> This is what nouveau does (specifically `usleep_range(1, 2)`). OTOH,
>> openrm just does a busy wait, which is what I replicated here for now.
>> GSP command queue not having space IIUC is meant to be very exceptional.
>> I am not sure which is best, maybe Alex has an opinion, but also happy
>> to change it because that reasoning makes sense to me and I don't know
>> enough about the distribution of how often it would actually need
>> to wait to know if 0 delay is justified.
>
> Well, what this code says is "let's hammer the cache / memory controller as fast
> as we can for up to one second".
>
> This really should come with some justification why it is actually needed for
> proper operation of the driver.
A 1us delay sounds very reasonable. I can add it when applying if there
is no feedback justifying a respin.
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH v5 3/9] rust: add EMSGSIZE error code
2026-03-04 1:42 [PATCH v5 0/9] gpu: nova-core: gsp: add continuation record support Eliot Courtney
2026-03-04 1:42 ` [PATCH v5 1/9] gpu: nova-core: gsp: sort `MsgFunction` variants alphabetically Eliot Courtney
2026-03-04 1:42 ` [PATCH v5 2/9] gpu: nova-core: gsp: add mechanism to wait for space on command queue Eliot Courtney
@ 2026-03-04 1:42 ` Eliot Courtney
2026-03-04 1:42 ` [PATCH v5 4/9] gpu: nova-core: gsp: add checking oversized commands Eliot Courtney
` (5 subsequent siblings)
8 siblings, 0 replies; 22+ messages in thread
From: Eliot Courtney @ 2026-03-04 1:42 UTC (permalink / raw)
To: Danilo Krummrich, Alice Ryhl, Alexandre Courbot, David Airlie,
Simona Vetter, Miguel Ojeda, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg,
Trevor Gross
Cc: Zhi Wang, John Hubbard, Alistair Popple, Joel Fernandes,
Timur Tabi, nouveau, dri-devel, linux-kernel, rust-for-linux,
Eliot Courtney
Add the EMSGSIZE error code, which indicates that a message is too
long.
Tested-by: Zhi Wang <zhiw@nvidia.com>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
rust/kernel/error.rs | 1 +
1 file changed, 1 insertion(+)
diff --git a/rust/kernel/error.rs b/rust/kernel/error.rs
index 258b12afdcba..10fcf1f0404d 100644
--- a/rust/kernel/error.rs
+++ b/rust/kernel/error.rs
@@ -67,6 +67,7 @@ macro_rules! declare_err {
declare_err!(EDOM, "Math argument out of domain of func.");
declare_err!(ERANGE, "Math result not representable.");
declare_err!(EOVERFLOW, "Value too large for defined data type.");
+ declare_err!(EMSGSIZE, "Message too long.");
declare_err!(ETIMEDOUT, "Connection timed out.");
declare_err!(ERESTARTSYS, "Restart the system call.");
declare_err!(ERESTARTNOINTR, "System call was interrupted by a signal and will be restarted.");
--
2.53.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH v5 4/9] gpu: nova-core: gsp: add checking oversized commands
2026-03-04 1:42 [PATCH v5 0/9] gpu: nova-core: gsp: add continuation record support Eliot Courtney
` (2 preceding siblings ...)
2026-03-04 1:42 ` [PATCH v5 3/9] rust: add EMSGSIZE error code Eliot Courtney
@ 2026-03-04 1:42 ` Eliot Courtney
2026-03-04 1:42 ` [PATCH v5 5/9] gpu: nova-core: gsp: clarify invariant on command queue Eliot Courtney
` (4 subsequent siblings)
8 siblings, 0 replies; 22+ messages in thread
From: Eliot Courtney @ 2026-03-04 1:42 UTC (permalink / raw)
To: Danilo Krummrich, Alice Ryhl, Alexandre Courbot, David Airlie,
Simona Vetter, Miguel Ojeda, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg,
Trevor Gross
Cc: Zhi Wang, John Hubbard, Alistair Popple, Joel Fernandes,
Timur Tabi, nouveau, dri-devel, linux-kernel, rust-for-linux,
Eliot Courtney
The limit is 16 pages for a single command sent to the GSP. Return an
error if `allocate_command` is called with a too large size.
Tested-by: Zhi Wang <zhiw@nvidia.com>
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
drivers/gpu/nova-core/gsp/cmdq.rs | 7 ++++++-
drivers/gpu/nova-core/gsp/fw.rs | 4 ++++
drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs | 1 +
3 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index 5cb845bae123..73012da436e5 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -32,7 +32,8 @@
GspMsgElement,
MsgFunction,
MsgqRxHeader,
- MsgqTxHeader, //
+ MsgqTxHeader,
+ GSP_MSG_QUEUE_ELEMENT_SIZE_MAX, //
},
PteArray,
GSP_PAGE_SHIFT,
@@ -300,9 +301,13 @@ fn driver_write_area_size(&self) -> usize {
///
/// # Errors
///
+ /// - `EMSGSIZE` if the command is larger than [`GSP_MSG_QUEUE_ELEMENT_SIZE_MAX`].
/// - `ETIMEDOUT` if space does not become available within the timeout.
/// - `EIO` if the command header is not properly aligned.
fn allocate_command(&mut self, size: usize, timeout: Delta) -> Result<GspCommand<'_>> {
+ if size_of::<GspMsgElement>() + size > GSP_MSG_QUEUE_ELEMENT_SIZE_MAX {
+ return Err(EMSGSIZE);
+ }
read_poll_timeout(
|| Ok(self.driver_write_area_size()),
|available_bytes| *available_bytes >= size_of::<GspMsgElement>() + size,
diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index 4b998485360b..6005362450cb 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -39,6 +39,10 @@
},
};
+/// Maximum size of a single GSP message queue element in bytes.
+pub(crate) const GSP_MSG_QUEUE_ELEMENT_SIZE_MAX: usize =
+ num::u32_as_usize(bindings::GSP_MSG_QUEUE_ELEMENT_SIZE_MAX);
+
/// Empty type to group methods related to heap parameters for running the GSP firmware.
enum GspFwHeapParams {}
diff --git a/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs b/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs
index 6d25fe0bffa9..334e8be5fde8 100644
--- a/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs
+++ b/drivers/gpu/nova-core/gsp/fw/r570_144/bindings.rs
@@ -43,6 +43,7 @@ fn fmt(&self, fmt: &mut ::core::fmt::Formatter<'_>) -> ::core::fmt::Result {
pub const GSP_FW_WPR_META_REVISION: u32 = 1;
pub const GSP_FW_WPR_META_MAGIC: i64 = -2577556379034558285;
pub const REGISTRY_TABLE_ENTRY_TYPE_DWORD: u32 = 1;
+pub const GSP_MSG_QUEUE_ELEMENT_SIZE_MAX: u32 = 65536;
pub type __u8 = ffi::c_uchar;
pub type __u16 = ffi::c_ushort;
pub type __u32 = ffi::c_uint;
--
2.53.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH v5 5/9] gpu: nova-core: gsp: clarify invariant on command queue
2026-03-04 1:42 [PATCH v5 0/9] gpu: nova-core: gsp: add continuation record support Eliot Courtney
` (3 preceding siblings ...)
2026-03-04 1:42 ` [PATCH v5 4/9] gpu: nova-core: gsp: add checking oversized commands Eliot Courtney
@ 2026-03-04 1:42 ` Eliot Courtney
2026-03-04 1:42 ` [PATCH v5 6/9] gpu: nova-core: gsp: unconditionally call variable payload handling Eliot Courtney
` (3 subsequent siblings)
8 siblings, 0 replies; 22+ messages in thread
From: Eliot Courtney @ 2026-03-04 1:42 UTC (permalink / raw)
To: Danilo Krummrich, Alice Ryhl, Alexandre Courbot, David Airlie,
Simona Vetter, Miguel Ojeda, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg,
Trevor Gross
Cc: Zhi Wang, John Hubbard, Alistair Popple, Joel Fernandes,
Timur Tabi, nouveau, dri-devel, linux-kernel, rust-for-linux,
Eliot Courtney
Clarify why using only the first returned slice from allocate_command
for the message headers is okay.
Tested-by: Zhi Wang <zhiw@nvidia.com>
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
drivers/gpu/nova-core/gsp/cmdq.rs | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index 73012da436e5..c91da368206d 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -534,7 +534,9 @@ pub(crate) fn send_command<M>(&mut self, bar: &Bar0, command: M) -> Result
.gsp_mem
.allocate_command(command_size, Self::ALLOCATE_TIMEOUT)?;
- // Extract area for the command itself.
+ // Extract area for the command itself. The GSP message header and the command header
+ // together are guaranteed to fit entirely into a single page, so it's ok to only look
+ // at `dst.contents.0` here.
let (cmd, payload_1) = M::Command::from_bytes_mut_prefix(dst.contents.0).ok_or(EIO)?;
// Fill the header and command in-place.
--
2.53.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH v5 6/9] gpu: nova-core: gsp: unconditionally call variable payload handling
2026-03-04 1:42 [PATCH v5 0/9] gpu: nova-core: gsp: add continuation record support Eliot Courtney
` (4 preceding siblings ...)
2026-03-04 1:42 ` [PATCH v5 5/9] gpu: nova-core: gsp: clarify invariant on command queue Eliot Courtney
@ 2026-03-04 1:42 ` Eliot Courtney
2026-03-04 1:42 ` [PATCH v5 7/9] gpu: nova-core: gsp: add `size_in_bytes` helper to `CommandToGsp` Eliot Courtney
` (2 subsequent siblings)
8 siblings, 0 replies; 22+ messages in thread
From: Eliot Courtney @ 2026-03-04 1:42 UTC (permalink / raw)
To: Danilo Krummrich, Alice Ryhl, Alexandre Courbot, David Airlie,
Simona Vetter, Miguel Ojeda, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg,
Trevor Gross
Cc: Zhi Wang, John Hubbard, Alistair Popple, Joel Fernandes,
Timur Tabi, nouveau, dri-devel, linux-kernel, rust-for-linux,
Eliot Courtney
Unconditionally call the variable length payload code, which is a no-op
if there is no such payload but could defensively catch some coding
errors by e.g. checking that the allocated size is completely filled.
Tested-by: Zhi Wang <zhiw@nvidia.com>
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
drivers/gpu/nova-core/gsp/cmdq.rs | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index c91da368206d..48cf28b41f39 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -548,16 +548,14 @@ pub(crate) fn send_command<M>(&mut self, bar: &Bar0, command: M) -> Result
command.init().__init(core::ptr::from_mut(cmd))?;
}
- // Fill the variable-length payload.
- if command_size > size_of::<M::Command>() {
- let mut sbuffer =
- SBufferIter::new_writer([&mut payload_1[..], &mut dst.contents.1[..]]);
- command.init_variable_payload(&mut sbuffer)?;
+ // Fill the variable-length payload, which may be empty.
+ let mut sbuffer = SBufferIter::new_writer([&mut payload_1[..], &mut dst.contents.1[..]]);
+ command.init_variable_payload(&mut sbuffer)?;
- if !sbuffer.is_empty() {
- return Err(EIO);
- }
+ if !sbuffer.is_empty() {
+ return Err(EIO);
}
+ drop(sbuffer);
// Compute checksum now that the whole message is ready.
dst.header
--
2.53.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH v5 7/9] gpu: nova-core: gsp: add `size_in_bytes` helper to `CommandToGsp`
2026-03-04 1:42 [PATCH v5 0/9] gpu: nova-core: gsp: add continuation record support Eliot Courtney
` (5 preceding siblings ...)
2026-03-04 1:42 ` [PATCH v5 6/9] gpu: nova-core: gsp: unconditionally call variable payload handling Eliot Courtney
@ 2026-03-04 1:42 ` Eliot Courtney
2026-03-04 2:53 ` Alexandre Courbot
2026-03-04 1:42 ` [PATCH v5 8/9] gpu: nova-core: gsp: support large RPCs via continuation record Eliot Courtney
2026-03-04 1:42 ` [PATCH v5 9/9] gpu: nova-core: gsp: add tests for continuation records Eliot Courtney
8 siblings, 1 reply; 22+ messages in thread
From: Eliot Courtney @ 2026-03-04 1:42 UTC (permalink / raw)
To: Danilo Krummrich, Alice Ryhl, Alexandre Courbot, David Airlie,
Simona Vetter, Miguel Ojeda, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg,
Trevor Gross
Cc: Zhi Wang, John Hubbard, Alistair Popple, Joel Fernandes,
Timur Tabi, nouveau, dri-devel, linux-kernel, rust-for-linux,
Eliot Courtney
Add a default method to `CommandToGsp` which computes the size of a
command.
Tested-by: Zhi Wang <zhiw@nvidia.com>
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
drivers/gpu/nova-core/gsp/cmdq.rs | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index 48cf28b41f39..3424be4e15f8 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -94,6 +94,12 @@ fn init_variable_payload(
) -> Result {
Ok(())
}
+
+ /// Total size of the command (including its variable-length payload) without the
+ /// [`GspMsgElement`] header.
+ fn size_in_bytes(&self) -> usize {
+ size_of::<Self::Command>() + self.variable_payload_len()
+ }
}
/// Trait representing messages received from the GSP.
@@ -529,10 +535,10 @@ pub(crate) fn send_command<M>(&mut self, bar: &Bar0, command: M) -> Result
// This allows all error types, including `Infallible`, to be used for `M::InitError`.
Error: From<M::InitError>,
{
- let command_size = size_of::<M::Command>() + command.variable_payload_len();
+ let size_in_bytes = command.size_in_bytes();
let dst = self
.gsp_mem
- .allocate_command(command_size, Self::ALLOCATE_TIMEOUT)?;
+ .allocate_command(size_in_bytes, Self::ALLOCATE_TIMEOUT)?;
// Extract area for the command itself. The GSP message header and the command header
// together are guaranteed to fit entirely into a single page, so it's ok to only look
@@ -540,7 +546,7 @@ pub(crate) fn send_command<M>(&mut self, bar: &Bar0, command: M) -> Result
let (cmd, payload_1) = M::Command::from_bytes_mut_prefix(dst.contents.0).ok_or(EIO)?;
// Fill the header and command in-place.
- let msg_element = GspMsgElement::init(self.seq, command_size, M::FUNCTION);
+ let msg_element = GspMsgElement::init(self.seq, size_in_bytes, M::FUNCTION);
// SAFETY: `msg_header` and `cmd` are valid references, and not touched if the initializer
// fails.
unsafe {
--
2.53.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH v5 7/9] gpu: nova-core: gsp: add `size_in_bytes` helper to `CommandToGsp`
2026-03-04 1:42 ` [PATCH v5 7/9] gpu: nova-core: gsp: add `size_in_bytes` helper to `CommandToGsp` Eliot Courtney
@ 2026-03-04 2:53 ` Alexandre Courbot
2026-03-04 3:10 ` Eliot Courtney
0 siblings, 1 reply; 22+ messages in thread
From: Alexandre Courbot @ 2026-03-04 2:53 UTC (permalink / raw)
To: Eliot Courtney
Cc: Danilo Krummrich, Alice Ryhl, David Airlie, Simona Vetter,
Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, Andreas Hindborg, Trevor Gross, Zhi Wang,
Alistair Popple, Joel Fernandes, nouveau, dri-devel, linux-kernel,
rust-for-linux
On Wed Mar 4, 2026 at 10:42 AM JST, Eliot Courtney wrote:
> Add a default method to `CommandToGsp` which computes the size of a
> command.
>
> Tested-by: Zhi Wang <zhiw@nvidia.com>
> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
> ---
> drivers/gpu/nova-core/gsp/cmdq.rs | 12 +++++++++---
> 1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
> index 48cf28b41f39..3424be4e15f8 100644
> --- a/drivers/gpu/nova-core/gsp/cmdq.rs
> +++ b/drivers/gpu/nova-core/gsp/cmdq.rs
> @@ -94,6 +94,12 @@ fn init_variable_payload(
> ) -> Result {
> Ok(())
> }
> +
> + /// Total size of the command (including its variable-length payload) without the
> + /// [`GspMsgElement`] header.
> + fn size_in_bytes(&self) -> usize {
We usually don't specify the unit in the method name (the doccomment is
a better place for that). I can fix this when applying.
^ permalink raw reply [flat|nested] 22+ messages in thread* Re: [PATCH v5 7/9] gpu: nova-core: gsp: add `size_in_bytes` helper to `CommandToGsp`
2026-03-04 2:53 ` Alexandre Courbot
@ 2026-03-04 3:10 ` Eliot Courtney
0 siblings, 0 replies; 22+ messages in thread
From: Eliot Courtney @ 2026-03-04 3:10 UTC (permalink / raw)
To: Alexandre Courbot, Eliot Courtney
Cc: Danilo Krummrich, Alice Ryhl, David Airlie, Simona Vetter,
Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, Andreas Hindborg, Trevor Gross, Zhi Wang,
Alistair Popple, Joel Fernandes, nouveau, dri-devel, linux-kernel,
rust-for-linux, dri-devel
On Wed Mar 4, 2026 at 11:53 AM JST, Alexandre Courbot wrote:
> On Wed Mar 4, 2026 at 10:42 AM JST, Eliot Courtney wrote:
>> Add a default method to `CommandToGsp` which computes the size of a
>> command.
>>
>> Tested-by: Zhi Wang <zhiw@nvidia.com>
>> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
>> ---
>> drivers/gpu/nova-core/gsp/cmdq.rs | 12 +++++++++---
>> 1 file changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
>> index 48cf28b41f39..3424be4e15f8 100644
>> --- a/drivers/gpu/nova-core/gsp/cmdq.rs
>> +++ b/drivers/gpu/nova-core/gsp/cmdq.rs
>> @@ -94,6 +94,12 @@ fn init_variable_payload(
>> ) -> Result {
>> Ok(())
>> }
>> +
>> + /// Total size of the command (including its variable-length payload) without the
>> + /// [`GspMsgElement`] header.
>> + fn size_in_bytes(&self) -> usize {
>
> We usually don't specify the unit in the method name (the doccomment is
> a better place for that). I can fix this when applying.
Thanks, I will make a note of this so I don't make this mistake again. I
wanted to clarify this since size for a command could mean a few things.
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH v5 8/9] gpu: nova-core: gsp: support large RPCs via continuation record
2026-03-04 1:42 [PATCH v5 0/9] gpu: nova-core: gsp: add continuation record support Eliot Courtney
` (6 preceding siblings ...)
2026-03-04 1:42 ` [PATCH v5 7/9] gpu: nova-core: gsp: add `size_in_bytes` helper to `CommandToGsp` Eliot Courtney
@ 2026-03-04 1:42 ` Eliot Courtney
2026-03-04 1:42 ` [PATCH v5 9/9] gpu: nova-core: gsp: add tests for continuation records Eliot Courtney
8 siblings, 0 replies; 22+ messages in thread
From: Eliot Courtney @ 2026-03-04 1:42 UTC (permalink / raw)
To: Danilo Krummrich, Alice Ryhl, Alexandre Courbot, David Airlie,
Simona Vetter, Miguel Ojeda, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg,
Trevor Gross
Cc: Zhi Wang, John Hubbard, Alistair Popple, Joel Fernandes,
Timur Tabi, nouveau, dri-devel, linux-kernel, rust-for-linux,
Eliot Courtney
Splits large RPCs if necessary and sends the remaining parts using
continuation records. RPCs that do not need continuation records
continue to write directly into the command buffer. Ones that do write
into a staging buffer first, so there is one copy.
Continuation record for receive is not necessary to support at the
moment because those replies do not need to be read and are currently
drained by retrying `receive_msg` on `ERANGE`.
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
drivers/gpu/nova-core/gsp/cmdq.rs | 42 ++++++-
drivers/gpu/nova-core/gsp/cmdq/continuation.rs | 163 +++++++++++++++++++++++++
drivers/gpu/nova-core/gsp/fw.rs | 4 +
3 files changed, 207 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index 3424be4e15f8..492e9489e808 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -1,5 +1,7 @@
// SPDX-License-Identifier: GPL-2.0
+mod continuation;
+
use core::{
mem,
sync::atomic::{
@@ -25,6 +27,11 @@
},
};
+use continuation::{
+ ContinuationRecord,
+ SplitState, //
+};
+
use crate::{
driver::Bar0,
gsp::{
@@ -520,7 +527,7 @@ fn notify_gsp(bar: &Bar0) {
.write(bar);
}
- /// Sends `command` to the GSP.
+ /// Sends `command` to the GSP, without splitting it.
///
/// # Errors
///
@@ -529,7 +536,7 @@ fn notify_gsp(bar: &Bar0) {
/// written to by its [`CommandToGsp::init_variable_payload`] method.
///
/// Error codes returned by the command initializers are propagated as-is.
- pub(crate) fn send_command<M>(&mut self, bar: &Bar0, command: M) -> Result
+ fn send_single_command<M>(&mut self, bar: &Bar0, command: M) -> Result
where
M: CommandToGsp,
// This allows all error types, including `Infallible`, to be used for `M::InitError`.
@@ -588,6 +595,37 @@ pub(crate) fn send_command<M>(&mut self, bar: &Bar0, command: M) -> Result
Ok(())
}
+ /// Sends `command` to the GSP.
+ ///
+ /// The command may be split into multiple messages if it is large.
+ ///
+ /// # Errors
+ ///
+ /// - `ETIMEDOUT` if space does not become available within the timeout.
+ /// - `EIO` if the variable payload requested by the command has not been entirely
+ /// written to by its [`CommandToGsp::init_variable_payload`] method.
+ ///
+ /// Error codes returned by the command initializers are propagated as-is.
+ pub(crate) fn send_command<M>(&mut self, bar: &Bar0, command: M) -> Result
+ where
+ M: CommandToGsp,
+ Error: From<M::InitError>,
+ {
+ match SplitState::new(command)? {
+ SplitState::Single(command) => self.send_single_command(bar, command),
+ SplitState::Split(command, mut continuations) => {
+ self.send_single_command(bar, command)?;
+
+ while let Some(continuation) = continuations.next() {
+ // Turbofish needed because the compiler cannot infer M here.
+ self.send_single_command::<ContinuationRecord<'_>>(bar, continuation)?;
+ }
+
+ Ok(())
+ }
+ }
+ }
+
/// Wait for a message to become available on the message queue.
///
/// This works purely at the transport layer and does not interpret or validate the message
diff --git a/drivers/gpu/nova-core/gsp/cmdq/continuation.rs b/drivers/gpu/nova-core/gsp/cmdq/continuation.rs
new file mode 100644
index 000000000000..3b83cdcbb39e
--- /dev/null
+++ b/drivers/gpu/nova-core/gsp/cmdq/continuation.rs
@@ -0,0 +1,163 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Support for splitting large GSP commands across continuation records.
+
+use core::convert::Infallible;
+
+use kernel::prelude::*;
+
+use super::CommandToGsp;
+
+use crate::{
+ gsp::fw::{
+ GspMsgElement,
+ MsgFunction,
+ GSP_MSG_QUEUE_ELEMENT_SIZE_MAX, //
+ },
+ sbuffer::SBufferIter,
+};
+
+/// Maximum command size that fits in a single queue element.
+const MAX_CMD_SIZE: usize = GSP_MSG_QUEUE_ELEMENT_SIZE_MAX - size_of::<GspMsgElement>();
+
+/// Acts as an iterator over the continuation records for a split command.
+pub(super) struct ContinuationRecords {
+ payload: KVVec<u8>,
+ offset: usize,
+}
+
+impl ContinuationRecords {
+ /// Creates a new iterator over continuation records for the given payload.
+ fn new(payload: KVVec<u8>) -> Self {
+ Self { payload, offset: 0 }
+ }
+
+ /// Returns the next continuation record, or [`None`] if there are no more.
+ pub(super) fn next(&mut self) -> Option<ContinuationRecord<'_>> {
+ let remaining = self.payload.len() - self.offset;
+
+ if remaining > 0 {
+ let chunk_size = remaining.min(MAX_CMD_SIZE);
+ let record =
+ ContinuationRecord::new(&self.payload[self.offset..(self.offset + chunk_size)]);
+ self.offset += chunk_size;
+ Some(record)
+ } else {
+ None
+ }
+ }
+}
+
+/// The [`ContinuationRecord`] command.
+pub(super) struct ContinuationRecord<'a> {
+ data: &'a [u8],
+}
+
+impl<'a> ContinuationRecord<'a> {
+ /// Creates a new [`ContinuationRecord`] command with the given data.
+ fn new(data: &'a [u8]) -> Self {
+ Self { data }
+ }
+}
+
+impl<'a> CommandToGsp for ContinuationRecord<'a> {
+ const FUNCTION: MsgFunction = MsgFunction::ContinuationRecord;
+ type Command = ();
+ type InitError = Infallible;
+
+ fn init(&self) -> impl Init<Self::Command, Self::InitError> {
+ <()>::init_zeroed()
+ }
+
+ fn variable_payload_len(&self) -> usize {
+ self.data.len()
+ }
+
+ fn init_variable_payload(
+ &self,
+ dst: &mut SBufferIter<core::array::IntoIter<&mut [u8], 2>>,
+ ) -> Result {
+ dst.write_all(self.data)
+ }
+}
+
+/// Whether a command needs to be split across continuation records or not.
+pub(super) enum SplitState<C: CommandToGsp> {
+ /// A command that fits in a single queue element.
+ Single(C),
+ /// A command split across continuation records.
+ Split(SplitCommand<C>, ContinuationRecords),
+}
+
+impl<C: CommandToGsp> SplitState<C> {
+ /// Maximum variable payload size that fits in the first command alongside the command header.
+ const MAX_FIRST_PAYLOAD: usize = MAX_CMD_SIZE - size_of::<C::Command>();
+
+ /// Creates a new [`SplitState`] for the given command.
+ ///
+ /// If the command is too large, it will be split into a main command and some number of
+ /// continuation records.
+ pub(super) fn new(command: C) -> Result<Self> {
+ let payload_len = command.variable_payload_len();
+
+ if command.size_in_bytes() > MAX_CMD_SIZE {
+ let mut command_payload =
+ KVVec::<u8>::from_elem(0u8, payload_len.min(Self::MAX_FIRST_PAYLOAD), GFP_KERNEL)?;
+ let mut continuation_payload =
+ KVVec::<u8>::from_elem(0u8, payload_len - command_payload.len(), GFP_KERNEL)?;
+ let mut sbuffer = SBufferIter::new_writer([
+ command_payload.as_mut_slice(),
+ continuation_payload.as_mut_slice(),
+ ]);
+
+ command.init_variable_payload(&mut sbuffer)?;
+ if !sbuffer.is_empty() {
+ return Err(EIO);
+ }
+ drop(sbuffer);
+
+ Ok(Self::Split(
+ SplitCommand::new(command, command_payload),
+ ContinuationRecords::new(continuation_payload),
+ ))
+ } else {
+ Ok(Self::Single(command))
+ }
+ }
+}
+
+/// A command that has been truncated to maximum accepted length of the command queue.
+///
+/// The remainder of its payload is expected to be sent using [`ContinuationRecords`].
+pub(super) struct SplitCommand<C: CommandToGsp> {
+ command: C,
+ payload: KVVec<u8>,
+}
+
+impl<C: CommandToGsp> SplitCommand<C> {
+ /// Creates a new [`SplitCommand`] wrapping `command` with the given truncated payload.
+ fn new(command: C, payload: KVVec<u8>) -> Self {
+ Self { command, payload }
+ }
+}
+
+impl<C: CommandToGsp> CommandToGsp for SplitCommand<C> {
+ const FUNCTION: MsgFunction = C::FUNCTION;
+ type Command = C::Command;
+ type InitError = C::InitError;
+
+ fn init(&self) -> impl Init<Self::Command, Self::InitError> {
+ self.command.init()
+ }
+
+ fn variable_payload_len(&self) -> usize {
+ self.payload.len()
+ }
+
+ fn init_variable_payload(
+ &self,
+ dst: &mut SBufferIter<core::array::IntoIter<&mut [u8], 2>>,
+ ) -> Result {
+ dst.write_all(&self.payload)
+ }
+}
diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index 6005362450cb..25fca1f6db2c 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -202,6 +202,7 @@ pub(crate) enum MsgFunction {
AllocObject = bindings::NV_VGPU_MSG_FUNCTION_ALLOC_OBJECT,
AllocRoot = bindings::NV_VGPU_MSG_FUNCTION_ALLOC_ROOT,
BindCtxDma = bindings::NV_VGPU_MSG_FUNCTION_BIND_CTX_DMA,
+ ContinuationRecord = bindings::NV_VGPU_MSG_FUNCTION_CONTINUATION_RECORD,
Free = bindings::NV_VGPU_MSG_FUNCTION_FREE,
GetGspStaticInfo = bindings::NV_VGPU_MSG_FUNCTION_GET_GSP_STATIC_INFO,
GetStaticInfo = bindings::NV_VGPU_MSG_FUNCTION_GET_STATIC_INFO,
@@ -239,6 +240,9 @@ fn try_from(value: u32) -> Result<MsgFunction> {
bindings::NV_VGPU_MSG_FUNCTION_ALLOC_OBJECT => Ok(MsgFunction::AllocObject),
bindings::NV_VGPU_MSG_FUNCTION_ALLOC_ROOT => Ok(MsgFunction::AllocRoot),
bindings::NV_VGPU_MSG_FUNCTION_BIND_CTX_DMA => Ok(MsgFunction::BindCtxDma),
+ bindings::NV_VGPU_MSG_FUNCTION_CONTINUATION_RECORD => {
+ Ok(MsgFunction::ContinuationRecord)
+ }
bindings::NV_VGPU_MSG_FUNCTION_FREE => Ok(MsgFunction::Free),
bindings::NV_VGPU_MSG_FUNCTION_GET_GSP_STATIC_INFO => Ok(MsgFunction::GetGspStaticInfo),
bindings::NV_VGPU_MSG_FUNCTION_GET_STATIC_INFO => Ok(MsgFunction::GetStaticInfo),
--
2.53.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH v5 9/9] gpu: nova-core: gsp: add tests for continuation records
2026-03-04 1:42 [PATCH v5 0/9] gpu: nova-core: gsp: add continuation record support Eliot Courtney
` (7 preceding siblings ...)
2026-03-04 1:42 ` [PATCH v5 8/9] gpu: nova-core: gsp: support large RPCs via continuation record Eliot Courtney
@ 2026-03-04 1:42 ` Eliot Courtney
8 siblings, 0 replies; 22+ messages in thread
From: Eliot Courtney @ 2026-03-04 1:42 UTC (permalink / raw)
To: Danilo Krummrich, Alice Ryhl, Alexandre Courbot, David Airlie,
Simona Vetter, Miguel Ojeda, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg,
Trevor Gross
Cc: Zhi Wang, John Hubbard, Alistair Popple, Joel Fernandes,
Timur Tabi, nouveau, dri-devel, linux-kernel, rust-for-linux,
Eliot Courtney
Add tests for continuation record splitting. They cover boundary
conditions at the split points to make sure the right number of
continuation records are made. They also check that the data
concatenated is correct.
Tested-by: Zhi Wang <zhiw@nvidia.com>
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
drivers/gpu/nova-core/gsp/cmdq/continuation.rs | 138 +++++++++++++++++++++++++
1 file changed, 138 insertions(+)
diff --git a/drivers/gpu/nova-core/gsp/cmdq/continuation.rs b/drivers/gpu/nova-core/gsp/cmdq/continuation.rs
index 3b83cdcbb39e..637942917237 100644
--- a/drivers/gpu/nova-core/gsp/cmdq/continuation.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq/continuation.rs
@@ -161,3 +161,141 @@ fn init_variable_payload(
dst.write_all(&self.payload)
}
}
+
+#[kunit_tests(nova_core_gsp_continuation)]
+mod tests {
+ use super::*;
+
+ use kernel::transmute::{
+ AsBytes,
+ FromBytes, //
+ };
+
+ /// Non-zero-sized command header for testing.
+ #[repr(C)]
+ #[derive(Clone, Copy, Zeroable)]
+ struct TestHeader([u8; 64]);
+
+ // SAFETY: `TestHeader` is a plain array of bytes for which all bit patterns are valid.
+ unsafe impl FromBytes for TestHeader {}
+
+ // SAFETY: `TestHeader` is a plain array of bytes for which all bit patterns are valid.
+ unsafe impl AsBytes for TestHeader {}
+
+ struct TestPayload {
+ data: KVVec<u8>,
+ }
+
+ impl TestPayload {
+ fn generate_pattern(len: usize) -> Result<KVVec<u8>> {
+ let mut data = KVVec::with_capacity(len, GFP_KERNEL)?;
+ for i in 0..len {
+ // Mix in higher bits so the pattern does not repeat every 256 bytes.
+ data.push((i ^ (i >> 8)) as u8, GFP_KERNEL)?;
+ }
+ Ok(data)
+ }
+
+ fn new(len: usize) -> Result<Self> {
+ Ok(Self {
+ data: Self::generate_pattern(len)?,
+ })
+ }
+ }
+
+ impl CommandToGsp for TestPayload {
+ const FUNCTION: MsgFunction = MsgFunction::Nop;
+ type Command = TestHeader;
+ type InitError = Infallible;
+
+ fn init(&self) -> impl Init<Self::Command, Self::InitError> {
+ TestHeader::init_zeroed()
+ }
+
+ fn variable_payload_len(&self) -> usize {
+ self.data.len()
+ }
+
+ fn init_variable_payload(
+ &self,
+ dst: &mut SBufferIter<core::array::IntoIter<&mut [u8], 2>>,
+ ) -> Result {
+ dst.write_all(self.data.as_slice())
+ }
+ }
+
+ /// Maximum variable payload size that fits in the first command alongside the header.
+ const MAX_FIRST_PAYLOAD: usize = SplitState::<TestPayload>::MAX_FIRST_PAYLOAD;
+
+ fn read_payload(cmd: impl CommandToGsp) -> Result<KVVec<u8>> {
+ let len = cmd.variable_payload_len();
+ let mut buf = KVVec::from_elem(0u8, len, GFP_KERNEL)?;
+ let mut sbuf = SBufferIter::new_writer([buf.as_mut_slice(), &mut []]);
+ cmd.init_variable_payload(&mut sbuf)?;
+ drop(sbuf);
+ Ok(buf)
+ }
+
+ struct SplitTest {
+ payload_size: usize,
+ num_continuations: usize,
+ }
+
+ fn check_split(t: SplitTest) -> Result {
+ let payload = TestPayload::new(t.payload_size)?;
+ let mut num_continuations = 0;
+
+ let buf = match SplitState::new(payload)? {
+ SplitState::Single(cmd) => read_payload(cmd)?,
+ SplitState::Split(cmd, mut continuations) => {
+ let mut buf = read_payload(cmd)?;
+ assert!(size_of::<TestHeader>() + buf.len() <= MAX_CMD_SIZE);
+
+ while let Some(cont) = continuations.next() {
+ let payload = read_payload(cont)?;
+ assert!(payload.len() <= MAX_CMD_SIZE);
+ buf.extend_from_slice(&payload, GFP_KERNEL)?;
+ num_continuations += 1;
+ }
+
+ buf
+ }
+ };
+
+ assert_eq!(num_continuations, t.num_continuations);
+ assert_eq!(
+ buf.as_slice(),
+ TestPayload::generate_pattern(t.payload_size)?.as_slice()
+ );
+ Ok(())
+ }
+
+ #[test]
+ fn split_command() -> Result {
+ check_split(SplitTest {
+ payload_size: 0,
+ num_continuations: 0,
+ })?;
+ check_split(SplitTest {
+ payload_size: MAX_FIRST_PAYLOAD,
+ num_continuations: 0,
+ })?;
+ check_split(SplitTest {
+ payload_size: MAX_FIRST_PAYLOAD + 1,
+ num_continuations: 1,
+ })?;
+ check_split(SplitTest {
+ payload_size: MAX_FIRST_PAYLOAD + MAX_CMD_SIZE,
+ num_continuations: 1,
+ })?;
+ check_split(SplitTest {
+ payload_size: MAX_FIRST_PAYLOAD + MAX_CMD_SIZE + 1,
+ num_continuations: 2,
+ })?;
+ check_split(SplitTest {
+ payload_size: MAX_FIRST_PAYLOAD + MAX_CMD_SIZE * 3 + MAX_CMD_SIZE / 2,
+ num_continuations: 4,
+ })?;
+ Ok(())
+ }
+}
--
2.53.0
^ permalink raw reply related [flat|nested] 22+ messages in thread