* [PATCH v5 1/5] gpu: nova-core: gsp: fix stale doc comments on command queue methods
2026-03-18 4:07 [PATCH v5 0/5] gpu: nova-core: gsp: add locking to Cmdq Eliot Courtney
@ 2026-03-18 4:07 ` Eliot Courtney
2026-03-18 4:07 ` [PATCH v5 2/5] gpu: nova-core: gsp: add `RECEIVE_TIMEOUT` constant for command queue Eliot Courtney
` (4 subsequent siblings)
5 siblings, 0 replies; 13+ messages in thread
From: Eliot Courtney @ 2026-03-18 4:07 UTC (permalink / raw)
To: Danilo Krummrich, Alice Ryhl, Alexandre Courbot, David Airlie,
Simona Vetter, Benno Lossin, Gary Guo
Cc: John Hubbard, Alistair Popple, Joel Fernandes, Timur Tabi,
dri-devel, linux-kernel, rust-for-linux, Eliot Courtney, Zhi Wang
Fix some inaccurate and outdated doc comments on the command queue methods.
Reviewed-by: Zhi Wang <zhiw@nvidia.com>
Tested-by: Zhi Wang <zhiw@nvidia.com>
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
drivers/gpu/nova-core/gsp/cmdq.rs | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index efa1aab1568f..f7ca6856ff35 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -502,6 +502,7 @@ fn notify_gsp(bar: &Bar0) {
///
/// # Errors
///
+ /// - `EMSGSIZE` if the command exceeds the maximum queue element size.
/// - `ETIMEDOUT` if space does not become available within the timeout.
/// - `EIO` if the variable payload requested by the command has not been entirely
/// written to by its [`CommandToGsp::init_variable_payload`] method.
@@ -682,22 +683,20 @@ fn wait_for_msg(&self, timeout: Delta) -> Result<GspMessage<'_>> {
/// Receive a message from the GSP.
///
- /// `init` is a closure tasked with processing the message. It receives a reference to the
- /// message in the message queue, and a [`SBufferIter`] pointing to its variable-length
- /// payload, if any.
+ /// The expected message type is specified using the `M` generic parameter. If the pending
+ /// message has a different function code, `ERANGE` is returned and the message is consumed.
///
- /// The expected message is specified using the `M` generic parameter. If the pending message
- /// is different, `EAGAIN` is returned and the unexpected message is dropped.
- ///
- /// This design is by no means final, but it is simple and will let us go through GSP
- /// initialization.
+ /// The read pointer is always advanced past the message, regardless of whether it matched.
///
/// # Errors
///
/// - `ETIMEDOUT` if `timeout` has elapsed before any message becomes available.
/// - `EIO` if there was some inconsistency (e.g. message shorter than advertised) on the
/// message queue.
- /// - `EINVAL` if the function of the message was unrecognized.
+ /// - `EINVAL` if the function code of the message was not recognized.
+ /// - `ERANGE` if the message had a recognized but non-matching function code.
+ ///
+ /// Error codes returned by [`MessageFromGsp::read`] are propagated as-is.
pub(crate) fn receive_msg<M: MessageFromGsp>(&mut self, timeout: Delta) -> Result<M>
where
// This allows all error types, including `Infallible`, to be used for `M::InitError`.
--
2.53.0
^ permalink raw reply related [flat|nested] 13+ messages in thread

* [PATCH v5 2/5] gpu: nova-core: gsp: add `RECEIVE_TIMEOUT` constant for command queue
2026-03-18 4:07 [PATCH v5 0/5] gpu: nova-core: gsp: add locking to Cmdq Eliot Courtney
2026-03-18 4:07 ` [PATCH v5 1/5] gpu: nova-core: gsp: fix stale doc comments on command queue methods Eliot Courtney
@ 2026-03-18 4:07 ` Eliot Courtney
2026-03-18 4:07 ` [PATCH v5 3/5] gpu: nova-core: gsp: add reply/no-reply info to `CommandToGsp` Eliot Courtney
` (3 subsequent siblings)
5 siblings, 0 replies; 13+ messages in thread
From: Eliot Courtney @ 2026-03-18 4:07 UTC (permalink / raw)
To: Danilo Krummrich, Alice Ryhl, Alexandre Courbot, David Airlie,
Simona Vetter, Benno Lossin, Gary Guo
Cc: John Hubbard, Alistair Popple, Joel Fernandes, Timur Tabi,
dri-devel, linux-kernel, rust-for-linux, Eliot Courtney, Zhi Wang
Remove magic numbers and add a default timeout for callers to use.
Tested-by: Zhi Wang <zhiw@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
drivers/gpu/nova-core/gsp/cmdq.rs | 3 +++
drivers/gpu/nova-core/gsp/commands.rs | 5 ++---
drivers/gpu/nova-core/gsp/sequencer.rs | 2 +-
3 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index f7ca6856ff35..c62db727a2a9 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -467,6 +467,9 @@ impl Cmdq {
/// Timeout for waiting for space on the command queue.
const ALLOCATE_TIMEOUT: Delta = Delta::from_secs(1);
+ /// Default timeout for receiving a message from the GSP.
+ pub(super) const RECEIVE_TIMEOUT: Delta = Delta::from_secs(5);
+
/// Creates a new command queue for `dev`.
pub(crate) fn new(dev: &device::Device<device::Bound>) -> Result<Cmdq> {
let gsp_mem = DmaGspMem::new(dev)?;
diff --git a/drivers/gpu/nova-core/gsp/commands.rs b/drivers/gpu/nova-core/gsp/commands.rs
index 8f270eca33be..88df117ba575 100644
--- a/drivers/gpu/nova-core/gsp/commands.rs
+++ b/drivers/gpu/nova-core/gsp/commands.rs
@@ -11,7 +11,6 @@
device,
pci,
prelude::*,
- time::Delta,
transmute::{
AsBytes,
FromBytes, //
@@ -165,7 +164,7 @@ fn read(
/// Waits for GSP initialization to complete.
pub(crate) fn wait_gsp_init_done(cmdq: &mut Cmdq) -> Result {
loop {
- match cmdq.receive_msg::<GspInitDone>(Delta::from_secs(10)) {
+ match cmdq.receive_msg::<GspInitDone>(Cmdq::RECEIVE_TIMEOUT) {
Ok(_) => break Ok(()),
Err(ERANGE) => continue,
Err(e) => break Err(e),
@@ -235,7 +234,7 @@ pub(crate) fn get_gsp_info(cmdq: &mut Cmdq, bar: &Bar0) -> Result<GetGspStaticIn
cmdq.send_command(bar, GetGspStaticInfo)?;
loop {
- match cmdq.receive_msg::<GetGspStaticInfoReply>(Delta::from_secs(5)) {
+ match cmdq.receive_msg::<GetGspStaticInfoReply>(Cmdq::RECEIVE_TIMEOUT) {
Ok(info) => return Ok(info),
Err(ERANGE) => continue,
Err(e) => return Err(e),
diff --git a/drivers/gpu/nova-core/gsp/sequencer.rs b/drivers/gpu/nova-core/gsp/sequencer.rs
index 0cfbedc47fcf..ce2b3bb05d22 100644
--- a/drivers/gpu/nova-core/gsp/sequencer.rs
+++ b/drivers/gpu/nova-core/gsp/sequencer.rs
@@ -358,7 +358,7 @@ pub(crate) struct GspSequencerParams<'a> {
impl<'a> GspSequencer<'a> {
pub(crate) fn run(cmdq: &mut Cmdq, params: GspSequencerParams<'a>) -> Result {
let seq_info = loop {
- match cmdq.receive_msg::<GspSequence>(Delta::from_secs(10)) {
+ match cmdq.receive_msg::<GspSequence>(Cmdq::RECEIVE_TIMEOUT) {
Ok(seq_info) => break seq_info,
Err(ERANGE) => continue,
Err(e) => return Err(e),
--
2.53.0
* [PATCH v5 3/5] gpu: nova-core: gsp: add reply/no-reply info to `CommandToGsp`
2026-03-18 4:07 [PATCH v5 0/5] gpu: nova-core: gsp: add locking to Cmdq Eliot Courtney
2026-03-18 4:07 ` [PATCH v5 1/5] gpu: nova-core: gsp: fix stale doc comments on command queue methods Eliot Courtney
2026-03-18 4:07 ` [PATCH v5 2/5] gpu: nova-core: gsp: add `RECEIVE_TIMEOUT` constant for command queue Eliot Courtney
@ 2026-03-18 4:07 ` Eliot Courtney
2026-03-18 4:07 ` [PATCH v5 4/5] gpu: nova-core: gsp: make `Cmdq` a pinned type Eliot Courtney
` (2 subsequent siblings)
5 siblings, 0 replies; 13+ messages in thread
From: Eliot Courtney @ 2026-03-18 4:07 UTC (permalink / raw)
To: Danilo Krummrich, Alice Ryhl, Alexandre Courbot, David Airlie,
Simona Vetter, Benno Lossin, Gary Guo
Cc: John Hubbard, Alistair Popple, Joel Fernandes, Timur Tabi,
dri-devel, linux-kernel, rust-for-linux, Eliot Courtney, Zhi Wang
Add type infrastructure so that the reply expected from each
`CommandToGsp` is known at the type level. Use a marker type `NoReply`,
which does not implement `MessageFromGsp`, to mark commands that don't
expect a response.
Update `send_command` to wait for a reply, and add `send_command_no_wait`
which sends a command that has no reply, without blocking.
This prepares for adding locking to the queue.
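The commit message's marker-type pattern can be sketched in plain Rust
(hypothetical, simplified names and payloads; the real traits carry more
associated items). Because `NoReply` does not implement `MessageFromGsp`,
calling the reply-waiting sender on a fire-and-forget command is a
compile-time error:

```rust
/// Marker type: commands with `Reply = NoReply` cannot be used with
/// `send_command`, only with `send_command_no_wait`.
struct NoReply;

// Stand-in for the real trait; decoding is stubbed out.
trait MessageFromGsp {
    fn decode() -> Self;
}

// Stand-in for `CommandToGsp`: each command names its expected reply.
trait CommandToGsp {
    type Reply;
    fn encode(&self) -> u32;
}

struct GetInfo;
struct GetInfoReply(u32);

impl MessageFromGsp for GetInfoReply {
    fn decode() -> Self {
        GetInfoReply(42)
    }
}

impl CommandToGsp for GetInfo {
    type Reply = GetInfoReply;
    fn encode(&self) -> u32 {
        1
    }
}

struct SetRegistry;

impl CommandToGsp for SetRegistry {
    type Reply = NoReply; // fire-and-forget
    fn encode(&self) -> u32 {
        2
    }
}

// Only callable when the reply type implements `MessageFromGsp`.
fn send_command<C>(cmd: C) -> C::Reply
where
    C: CommandToGsp,
    C::Reply: MessageFromGsp,
{
    let _ = cmd.encode();
    C::Reply::decode()
}

// Only callable for commands that expect no reply.
fn send_command_no_wait<C: CommandToGsp<Reply = NoReply>>(cmd: C) {
    let _ = cmd.encode();
}

fn main() {
    let reply = send_command(GetInfo);
    assert_eq!(reply.0, 42);
    send_command_no_wait(SetRegistry);
    // `send_command(SetRegistry)` would not compile: `NoReply` does not
    // implement `MessageFromGsp`.
}
```

The point of the design is that misuse is rejected by the type checker
rather than surfacing as a runtime timeout waiting for a reply that will
never come.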
Tested-by: Zhi Wang <zhiw@nvidia.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
drivers/gpu/nova-core/gsp/boot.rs | 5 ++-
drivers/gpu/nova-core/gsp/cmdq.rs | 62 ++++++++++++++++++++++++--
drivers/gpu/nova-core/gsp/cmdq/continuation.rs | 8 +++-
drivers/gpu/nova-core/gsp/commands.rs | 16 +++----
4 files changed, 75 insertions(+), 16 deletions(-)
diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
index 6db2decbc6f5..ffc478b33640 100644
--- a/drivers/gpu/nova-core/gsp/boot.rs
+++ b/drivers/gpu/nova-core/gsp/boot.rs
@@ -169,8 +169,9 @@ pub(crate) fn boot(
dma_write!(wpr_meta, [0]?, GspFwWprMeta::new(&gsp_fw, &fb_layout));
self.cmdq
- .send_command(bar, commands::SetSystemInfo::new(pdev))?;
- self.cmdq.send_command(bar, commands::SetRegistry::new())?;
+ .send_command_no_wait(bar, commands::SetSystemInfo::new(pdev))?;
+ self.cmdq
+ .send_command_no_wait(bar, commands::SetRegistry::new())?;
gsp_falcon.reset(bar)?;
let libos_handle = self.libos.dma_handle();
diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index c62db727a2a9..4fc14689d38e 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -45,10 +45,14 @@
sbuffer::SBufferIter, //
};
+/// Marker type representing the absence of a reply for a command. Commands using this as their
+/// reply type are sent using [`Cmdq::send_command_no_wait`].
+pub(crate) struct NoReply;
+
/// Trait implemented by types representing a command to send to the GSP.
///
-/// The main purpose of this trait is to provide [`Cmdq::send_command`] with the information it
-/// needs to send a given command.
+/// The main purpose of this trait is to provide [`Cmdq`] with the information it needs to send
+/// a given command.
///
/// [`CommandToGsp::init`] in particular is responsible for initializing the command directly
/// into the space reserved for it in the command queue buffer.
@@ -63,6 +67,10 @@ pub(crate) trait CommandToGsp {
/// Type generated by [`CommandToGsp::init`], to be written into the command queue buffer.
type Command: FromBytes + AsBytes;
+ /// Type of the reply expected from the GSP, or [`NoReply`] for commands that don't
+ /// have a reply.
+ type Reply;
+
/// Error type returned by [`CommandToGsp::init`].
type InitError;
@@ -581,7 +589,7 @@ fn send_single_command<M>(&mut self, bar: &Bar0, command: M) -> Result
/// written to by its [`CommandToGsp::init_variable_payload`] method.
///
/// Error codes returned by the command initializers are propagated as-is.
- pub(crate) fn send_command<M>(&mut self, bar: &Bar0, command: M) -> Result
+ fn send_command_internal<M>(&mut self, bar: &Bar0, command: M) -> Result
where
M: CommandToGsp,
Error: From<M::InitError>,
@@ -601,6 +609,54 @@ pub(crate) fn send_command<M>(&mut self, bar: &Bar0, command: M) -> Result
}
}
+ /// Sends `command` to the GSP and waits for the reply.
+ ///
+ /// Messages with non-matching function codes are silently consumed until the expected reply
+ /// arrives.
+ ///
+ /// # Errors
+ ///
+ /// - `ETIMEDOUT` if space does not become available to send the command, or if the reply is
+ /// not received within the timeout.
+ /// - `EIO` if the variable payload requested by the command has not been entirely
+ /// written to by its [`CommandToGsp::init_variable_payload`] method.
+ ///
+ /// Error codes returned by the command and reply initializers are propagated as-is.
+ pub(crate) fn send_command<M>(&mut self, bar: &Bar0, command: M) -> Result<M::Reply>
+ where
+ M: CommandToGsp,
+ M::Reply: MessageFromGsp,
+ Error: From<M::InitError>,
+ Error: From<<M::Reply as MessageFromGsp>::InitError>,
+ {
+ self.send_command_internal(bar, command)?;
+
+ loop {
+ match self.receive_msg::<M::Reply>(Self::RECEIVE_TIMEOUT) {
+ Ok(reply) => break Ok(reply),
+ Err(ERANGE) => continue,
+ Err(e) => break Err(e),
+ }
+ }
+ }
+
+ /// Sends `command` to the GSP without waiting for a reply.
+ ///
+ /// # Errors
+ ///
+ /// - `ETIMEDOUT` if space does not become available within the timeout.
+ /// - `EIO` if the variable payload requested by the command has not been entirely
+ /// written to by its [`CommandToGsp::init_variable_payload`] method.
+ ///
+ /// Error codes returned by the command initializers are propagated as-is.
+ pub(crate) fn send_command_no_wait<M>(&mut self, bar: &Bar0, command: M) -> Result
+ where
+ M: CommandToGsp<Reply = NoReply>,
+ Error: From<M::InitError>,
+ {
+ self.send_command_internal(bar, command)
+ }
+
/// Wait for a message to become available on the message queue.
///
/// This works purely at the transport layer and does not interpret or validate the message
diff --git a/drivers/gpu/nova-core/gsp/cmdq/continuation.rs b/drivers/gpu/nova-core/gsp/cmdq/continuation.rs
index 2aa17caac2e0..05e904f18097 100644
--- a/drivers/gpu/nova-core/gsp/cmdq/continuation.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq/continuation.rs
@@ -6,7 +6,10 @@
use kernel::prelude::*;
-use super::CommandToGsp;
+use super::{
+ CommandToGsp,
+ NoReply, //
+};
use crate::{
gsp::fw::{
@@ -63,6 +66,7 @@ fn new(data: &'a [u8]) -> Self {
impl<'a> CommandToGsp for ContinuationRecord<'a> {
const FUNCTION: MsgFunction = MsgFunction::ContinuationRecord;
type Command = ();
+ type Reply = NoReply;
type InitError = Infallible;
fn init(&self) -> impl Init<Self::Command, Self::InitError> {
@@ -144,6 +148,7 @@ fn new(command: C, payload: KVVec<u8>) -> Self {
impl<C: CommandToGsp> CommandToGsp for SplitCommand<C> {
const FUNCTION: MsgFunction = C::FUNCTION;
type Command = C::Command;
+ type Reply = C::Reply;
type InitError = C::InitError;
fn init(&self) -> impl Init<Self::Command, Self::InitError> {
@@ -206,6 +211,7 @@ fn new(len: usize) -> Result<Self> {
impl CommandToGsp for TestPayload {
const FUNCTION: MsgFunction = MsgFunction::Nop;
type Command = TestHeader;
+ type Reply = NoReply;
type InitError = Infallible;
fn init(&self) -> impl Init<Self::Command, Self::InitError> {
diff --git a/drivers/gpu/nova-core/gsp/commands.rs b/drivers/gpu/nova-core/gsp/commands.rs
index 88df117ba575..77054c92fcc2 100644
--- a/drivers/gpu/nova-core/gsp/commands.rs
+++ b/drivers/gpu/nova-core/gsp/commands.rs
@@ -23,7 +23,8 @@
cmdq::{
Cmdq,
CommandToGsp,
- MessageFromGsp, //
+ MessageFromGsp,
+ NoReply, //
},
fw::{
commands::*,
@@ -48,6 +49,7 @@ pub(crate) fn new(pdev: &'a pci::Device<device::Bound>) -> Self {
impl<'a> CommandToGsp for SetSystemInfo<'a> {
const FUNCTION: MsgFunction = MsgFunction::GspSetSystemInfo;
type Command = GspSetSystemInfo;
+ type Reply = NoReply;
type InitError = Error;
fn init(&self) -> impl Init<Self::Command, Self::InitError> {
@@ -99,6 +101,7 @@ pub(crate) fn new() -> Self {
impl CommandToGsp for SetRegistry {
const FUNCTION: MsgFunction = MsgFunction::SetRegistry;
type Command = PackedRegistryTable;
+ type Reply = NoReply;
type InitError = Infallible;
fn init(&self) -> impl Init<Self::Command, Self::InitError> {
@@ -178,6 +181,7 @@ pub(crate) fn wait_gsp_init_done(cmdq: &mut Cmdq) -> Result {
impl CommandToGsp for GetGspStaticInfo {
const FUNCTION: MsgFunction = MsgFunction::GetGspStaticInfo;
type Command = GspStaticConfigInfo;
+ type Reply = GetGspStaticInfoReply;
type InitError = Infallible;
fn init(&self) -> impl Init<Self::Command, Self::InitError> {
@@ -231,13 +235,5 @@ pub(crate) fn gpu_name(&self) -> core::result::Result<&str, GpuNameError> {
/// Send the [`GetGspInfo`] command and awaits for its reply.
pub(crate) fn get_gsp_info(cmdq: &mut Cmdq, bar: &Bar0) -> Result<GetGspStaticInfoReply> {
- cmdq.send_command(bar, GetGspStaticInfo)?;
-
- loop {
- match cmdq.receive_msg::<GetGspStaticInfoReply>(Cmdq::RECEIVE_TIMEOUT) {
- Ok(info) => return Ok(info),
- Err(ERANGE) => continue,
- Err(e) => return Err(e),
- }
- }
+ cmdq.send_command(bar, GetGspStaticInfo)
}
--
2.53.0
* [PATCH v5 4/5] gpu: nova-core: gsp: make `Cmdq` a pinned type
2026-03-18 4:07 [PATCH v5 0/5] gpu: nova-core: gsp: add locking to Cmdq Eliot Courtney
` (2 preceding siblings ...)
2026-03-18 4:07 ` [PATCH v5 3/5] gpu: nova-core: gsp: add reply/no-reply info to `CommandToGsp` Eliot Courtney
@ 2026-03-18 4:07 ` Eliot Courtney
2026-03-18 4:07 ` [PATCH v5 5/5] gpu: nova-core: gsp: add mutex locking to Cmdq Eliot Courtney
2026-03-19 2:10 ` [PATCH v5 0/5] gpu: nova-core: gsp: add " Alexandre Courbot
5 siblings, 0 replies; 13+ messages in thread
From: Eliot Courtney @ 2026-03-18 4:07 UTC (permalink / raw)
To: Danilo Krummrich, Alice Ryhl, Alexandre Courbot, David Airlie,
Simona Vetter, Benno Lossin, Gary Guo
Cc: John Hubbard, Alistair Popple, Joel Fernandes, Timur Tabi,
dri-devel, linux-kernel, rust-for-linux, Eliot Courtney, Zhi Wang
Make `Cmdq` a pinned type. This is needed to use `Mutex`, which in turn
is needed to add locking to `Cmdq`.
Reviewed-by: Zhi Wang <zhiw@nvidia.com>
Tested-by: Zhi Wang <zhiw@nvidia.com>
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
drivers/gpu/nova-core/gsp.rs | 5 +++--
drivers/gpu/nova-core/gsp/cmdq.rs | 9 ++++-----
2 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/nova-core/gsp.rs b/drivers/gpu/nova-core/gsp.rs
index c69adaa92bbe..72f173726f87 100644
--- a/drivers/gpu/nova-core/gsp.rs
+++ b/drivers/gpu/nova-core/gsp.rs
@@ -114,6 +114,7 @@ pub(crate) struct Gsp {
/// RM log buffer.
logrm: LogBuffer,
/// Command queue.
+ #[pin]
pub(crate) cmdq: Cmdq,
/// RM arguments.
rmargs: CoherentAllocation<GspArgumentsPadded>,
@@ -134,7 +135,7 @@ pub(crate) fn new(pdev: &pci::Device<device::Bound>) -> impl PinInit<Self, Error
loginit: LogBuffer::new(dev)?,
logintr: LogBuffer::new(dev)?,
logrm: LogBuffer::new(dev)?,
- cmdq: Cmdq::new(dev)?,
+ cmdq <- Cmdq::new(dev),
rmargs: CoherentAllocation::<GspArgumentsPadded>::alloc_coherent(
dev,
1,
@@ -151,7 +152,7 @@ pub(crate) fn new(pdev: &pci::Device<device::Bound>) -> impl PinInit<Self, Error
libos, [1]?, LibosMemoryRegionInitArgument::new("LOGINTR", &logintr.0)
);
dma_write!(libos, [2]?, LibosMemoryRegionInitArgument::new("LOGRM", &logrm.0));
- dma_write!(rmargs, [0]?.inner, fw::GspArgumentsCached::new(cmdq));
+ dma_write!(rmargs, [0]?.inner, fw::GspArgumentsCached::new(&cmdq));
dma_write!(libos, [3]?, LibosMemoryRegionInitArgument::new("RMARGS", rmargs));
},
}))
diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index 4fc14689d38e..86ff9a3d1732 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -446,6 +446,7 @@ struct GspMessage<'a> {
///
/// Provides the ability to send commands and receive messages from the GSP using a shared memory
/// area.
+#[pin_data]
pub(crate) struct Cmdq {
/// Device this command queue belongs to.
dev: ARef<device::Device>,
@@ -479,13 +480,11 @@ impl Cmdq {
pub(super) const RECEIVE_TIMEOUT: Delta = Delta::from_secs(5);
/// Creates a new command queue for `dev`.
- pub(crate) fn new(dev: &device::Device<device::Bound>) -> Result<Cmdq> {
- let gsp_mem = DmaGspMem::new(dev)?;
-
- Ok(Cmdq {
+ pub(crate) fn new(dev: &device::Device<device::Bound>) -> impl PinInit<Self, Error> + '_ {
+ try_pin_init!(Self {
+ gsp_mem: DmaGspMem::new(dev)?,
dev: dev.into(),
seq: 0,
- gsp_mem,
})
}
--
2.53.0
* [PATCH v5 5/5] gpu: nova-core: gsp: add mutex locking to Cmdq
2026-03-18 4:07 [PATCH v5 0/5] gpu: nova-core: gsp: add locking to Cmdq Eliot Courtney
` (3 preceding siblings ...)
2026-03-18 4:07 ` [PATCH v5 4/5] gpu: nova-core: gsp: make `Cmdq` a pinned type Eliot Courtney
@ 2026-03-18 4:07 ` Eliot Courtney
2026-03-18 15:27 ` Alexandre Courbot
2026-03-19 2:10 ` [PATCH v5 0/5] gpu: nova-core: gsp: add " Alexandre Courbot
5 siblings, 1 reply; 13+ messages in thread
From: Eliot Courtney @ 2026-03-18 4:07 UTC (permalink / raw)
To: Danilo Krummrich, Alice Ryhl, Alexandre Courbot, David Airlie,
Simona Vetter, Benno Lossin, Gary Guo
Cc: John Hubbard, Alistair Popple, Joel Fernandes, Timur Tabi,
dri-devel, linux-kernel, rust-for-linux, Eliot Courtney, Zhi Wang
Wrap `Cmdq`'s mutable state in a new struct `CmdqInner` and protect it
with a `Mutex`. This lets `Cmdq` methods take `&self` instead of `&mut
self`, so that required commands can still be sent e.g. while unloading
the driver.
The mutex is held over both send and receive in `send_command` to make
sure it does not pick up the reply of some other command that was sent
just beforehand.
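The locking shape described above can be sketched with `std::sync::Mutex`
standing in for the kernel mutex (hypothetical, heavily simplified types;
the real send and receive operate on the shared memory queue). The guard
is held across both the send and the receive, so no interleaved command
can consume this command's reply:

```rust
use std::sync::Mutex;

// Mutable state moved behind the mutex, as in the patch's `CmdqInner`.
struct CmdqInner {
    seq: u32,
}

impl CmdqInner {
    // Stub send: bump and return the sequence number.
    fn send_command(&mut self) -> u32 {
        self.seq += 1;
        self.seq
    }

    // Stub receive: pretend the reply echoes the last sequence number.
    fn receive_msg(&self) -> u32 {
        self.seq
    }
}

struct Cmdq {
    inner: Mutex<CmdqInner>,
}

impl Cmdq {
    fn new() -> Self {
        Cmdq {
            inner: Mutex::new(CmdqInner { seq: 0 }),
        }
    }

    // Takes `&self`: callable where no `&mut Cmdq` is available.
    fn send_command(&self) -> u32 {
        let mut inner = self.inner.lock().unwrap();
        let seq = inner.send_command();
        // The guard is still held here, so another thread cannot send
        // a command and steal this one's reply in between.
        let reply = inner.receive_msg();
        assert_eq!(reply, seq);
        reply
    }
}

fn main() {
    let q = Cmdq::new();
    assert_eq!(q.send_command(), 1);
    assert_eq!(q.send_command(), 2);
}
```

This mirrors why the patch keeps one lock around the whole send+receive
cycle instead of locking each half separately.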
Reviewed-by: Zhi Wang <zhiw@nvidia.com>
Tested-by: Zhi Wang <zhiw@nvidia.com>
Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
drivers/gpu/nova-core/gsp/boot.rs | 8 +-
drivers/gpu/nova-core/gsp/cmdq.rs | 170 +++++++++++++++++++--------------
drivers/gpu/nova-core/gsp/commands.rs | 4 +-
drivers/gpu/nova-core/gsp/sequencer.rs | 2 +-
4 files changed, 107 insertions(+), 77 deletions(-)
diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
index ffc478b33640..5e73bd769dcc 100644
--- a/drivers/gpu/nova-core/gsp/boot.rs
+++ b/drivers/gpu/nova-core/gsp/boot.rs
@@ -137,7 +137,7 @@ fn run_fwsec_frts(
///
/// Upon return, the GSP is up and running, and its runtime object given as return value.
pub(crate) fn boot(
- mut self: Pin<&mut Self>,
+ self: Pin<&mut Self>,
pdev: &pci::Device<device::Bound>,
bar: &Bar0,
chipset: Chipset,
@@ -223,13 +223,13 @@ pub(crate) fn boot(
dev: pdev.as_ref().into(),
bar,
};
- GspSequencer::run(&mut self.cmdq, seq_params)?;
+ GspSequencer::run(&self.cmdq, seq_params)?;
// Wait until GSP is fully initialized.
- commands::wait_gsp_init_done(&mut self.cmdq)?;
+ commands::wait_gsp_init_done(&self.cmdq)?;
// Obtain and display basic GPU information.
- let info = commands::get_gsp_info(&mut self.cmdq, bar)?;
+ let info = commands::get_gsp_info(&self.cmdq, bar)?;
match info.gpu_name() {
Ok(name) => dev_info!(pdev, "GPU name: {}\n", name),
Err(e) => dev_warn!(pdev, "GPU name unavailable: {:?}\n", e),
diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index 86ff9a3d1732..d36a62ba1c60 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -12,8 +12,12 @@
},
dma_write,
io::poll::read_poll_timeout,
+ new_mutex,
prelude::*,
- sync::aref::ARef,
+ sync::{
+ aref::ARef,
+ Mutex, //
+ },
time::Delta,
transmute::{
AsBytes,
@@ -448,12 +452,9 @@ struct GspMessage<'a> {
/// area.
#[pin_data]
pub(crate) struct Cmdq {
- /// Device this command queue belongs to.
- dev: ARef<device::Device>,
- /// Current command sequence number.
- seq: u32,
- /// Memory area shared with the GSP for communicating commands and messages.
- gsp_mem: DmaGspMem,
+ /// Inner mutex-protected state.
+ #[pin]
+ inner: Mutex<CmdqInner>,
}
impl Cmdq {
@@ -473,18 +474,17 @@ impl Cmdq {
/// Number of page table entries for the GSP shared region.
pub(crate) const NUM_PTES: usize = size_of::<GspMem>() >> GSP_PAGE_SHIFT;
- /// Timeout for waiting for space on the command queue.
- const ALLOCATE_TIMEOUT: Delta = Delta::from_secs(1);
-
/// Default timeout for receiving a message from the GSP.
pub(super) const RECEIVE_TIMEOUT: Delta = Delta::from_secs(5);
/// Creates a new command queue for `dev`.
pub(crate) fn new(dev: &device::Device<device::Bound>) -> impl PinInit<Self, Error> + '_ {
try_pin_init!(Self {
- gsp_mem: DmaGspMem::new(dev)?,
- dev: dev.into(),
- seq: 0,
+ inner <- new_mutex!(CmdqInner {
+ dev: dev.into(),
+ gsp_mem: DmaGspMem::new(dev)?,
+ seq: 0,
+ }),
})
}
@@ -508,6 +508,89 @@ fn notify_gsp(bar: &Bar0) {
.write(bar);
}
+ /// Sends `command` to the GSP and waits for the reply.
+ ///
+ /// Messages with non-matching function codes are silently consumed until the expected reply
+ /// arrives.
+ ///
+ /// The queue is locked for the entire send+receive cycle to ensure that no other command can
+ /// be interleaved.
+ ///
+ /// # Errors
+ ///
+ /// - `ETIMEDOUT` if space does not become available to send the command, or if the reply is
+ /// not received within the timeout.
+ /// - `EIO` if the variable payload requested by the command has not been entirely
+ /// written to by its [`CommandToGsp::init_variable_payload`] method.
+ ///
+ /// Error codes returned by the command and reply initializers are propagated as-is.
+ pub(crate) fn send_command<M>(&self, bar: &Bar0, command: M) -> Result<M::Reply>
+ where
+ M: CommandToGsp,
+ M::Reply: MessageFromGsp,
+ Error: From<M::InitError>,
+ Error: From<<M::Reply as MessageFromGsp>::InitError>,
+ {
+ let mut inner = self.inner.lock();
+ inner.send_command(bar, command)?;
+
+ loop {
+ match inner.receive_msg::<M::Reply>(Self::RECEIVE_TIMEOUT) {
+ Ok(reply) => break Ok(reply),
+ Err(ERANGE) => continue,
+ Err(e) => break Err(e),
+ }
+ }
+ }
+
+ /// Sends `command` to the GSP without waiting for a reply.
+ ///
+ /// # Errors
+ ///
+ /// - `ETIMEDOUT` if space does not become available within the timeout.
+ /// - `EIO` if the variable payload requested by the command has not been entirely
+ /// written to by its [`CommandToGsp::init_variable_payload`] method.
+ ///
+ /// Error codes returned by the command initializers are propagated as-is.
+ pub(crate) fn send_command_no_wait<M>(&self, bar: &Bar0, command: M) -> Result
+ where
+ M: CommandToGsp<Reply = NoReply>,
+ Error: From<M::InitError>,
+ {
+ self.inner.lock().send_command(bar, command)
+ }
+
+ /// Receive a message from the GSP.
+ ///
+ /// See [`CmdqInner::receive_msg`] for details.
+ pub(crate) fn receive_msg<M: MessageFromGsp>(&self, timeout: Delta) -> Result<M>
+ where
+ // This allows all error types, including `Infallible`, to be used for `M::InitError`.
+ Error: From<M::InitError>,
+ {
+ self.inner.lock().receive_msg(timeout)
+ }
+
+ /// Returns the DMA handle of the command queue's shared memory region.
+ pub(crate) fn dma_handle(&self) -> DmaAddress {
+ self.inner.lock().gsp_mem.0.dma_handle()
+ }
+}
+
+/// Inner mutex protected state of [`Cmdq`].
+struct CmdqInner {
+ /// Device this command queue belongs to.
+ dev: ARef<device::Device>,
+ /// Current command sequence number.
+ seq: u32,
+ /// Memory area shared with the GSP for communicating commands and messages.
+ gsp_mem: DmaGspMem,
+}
+
+impl CmdqInner {
+ /// Timeout for waiting for space on the command queue.
+ const ALLOCATE_TIMEOUT: Delta = Delta::from_secs(1);
+
/// Sends `command` to the GSP, without splitting it.
///
/// # Errors
@@ -588,7 +671,7 @@ fn send_single_command<M>(&mut self, bar: &Bar0, command: M) -> Result
/// written to by its [`CommandToGsp::init_variable_payload`] method.
///
/// Error codes returned by the command initializers are propagated as-is.
- fn send_command_internal<M>(&mut self, bar: &Bar0, command: M) -> Result
+ fn send_command<M>(&mut self, bar: &Bar0, command: M) -> Result
where
M: CommandToGsp,
Error: From<M::InitError>,
@@ -608,54 +691,6 @@ fn send_command_internal<M>(&mut self, bar: &Bar0, command: M) -> Result
}
}
- /// Sends `command` to the GSP and waits for the reply.
- ///
- /// Messages with non-matching function codes are silently consumed until the expected reply
- /// arrives.
- ///
- /// # Errors
- ///
- /// - `ETIMEDOUT` if space does not become available to send the command, or if the reply is
- /// not received within the timeout.
- /// - `EIO` if the variable payload requested by the command has not been entirely
- /// written to by its [`CommandToGsp::init_variable_payload`] method.
- ///
- /// Error codes returned by the command and reply initializers are propagated as-is.
- pub(crate) fn send_command<M>(&mut self, bar: &Bar0, command: M) -> Result<M::Reply>
- where
- M: CommandToGsp,
- M::Reply: MessageFromGsp,
- Error: From<M::InitError>,
- Error: From<<M::Reply as MessageFromGsp>::InitError>,
- {
- self.send_command_internal(bar, command)?;
-
- loop {
- match self.receive_msg::<M::Reply>(Self::RECEIVE_TIMEOUT) {
- Ok(reply) => break Ok(reply),
- Err(ERANGE) => continue,
- Err(e) => break Err(e),
- }
- }
- }
-
- /// Sends `command` to the GSP without waiting for a reply.
- ///
- /// # Errors
- ///
- /// - `ETIMEDOUT` if space does not become available within the timeout.
- /// - `EIO` if the variable payload requested by the command has not been entirely
- /// written to by its [`CommandToGsp::init_variable_payload`] method.
- ///
- /// Error codes returned by the command initializers are propagated as-is.
- pub(crate) fn send_command_no_wait<M>(&mut self, bar: &Bar0, command: M) -> Result
- where
- M: CommandToGsp<Reply = NoReply>,
- Error: From<M::InitError>,
- {
- self.send_command_internal(bar, command)
- }
-
/// Wait for a message to become available on the message queue.
///
/// This works purely at the transport layer and does not interpret or validate the message
@@ -691,7 +726,7 @@ fn wait_for_msg(&self, timeout: Delta) -> Result<GspMessage<'_>> {
let (header, slice_1) = GspMsgElement::from_bytes_prefix(slice_1).ok_or(EIO)?;
dev_dbg!(
- self.dev,
+ &self.dev,
"GSP RPC: receive: seq# {}, function={:?}, length=0x{:x}\n",
header.sequence(),
header.function(),
@@ -726,7 +761,7 @@ fn wait_for_msg(&self, timeout: Delta) -> Result<GspMessage<'_>> {
])) != 0
{
dev_err!(
- self.dev,
+ &self.dev,
"GSP RPC: receive: Call {} - bad checksum\n",
header.sequence()
);
@@ -755,7 +790,7 @@ fn wait_for_msg(&self, timeout: Delta) -> Result<GspMessage<'_>> {
/// - `ERANGE` if the message had a recognized but non-matching function code.
///
/// Error codes returned by [`MessageFromGsp::read`] are propagated as-is.
- pub(crate) fn receive_msg<M: MessageFromGsp>(&mut self, timeout: Delta) -> Result<M>
+ fn receive_msg<M: MessageFromGsp>(&mut self, timeout: Delta) -> Result<M>
where
// This allows all error types, including `Infallible`, to be used for `M::InitError`.
Error: From<M::InitError>,
@@ -791,9 +826,4 @@ pub(crate) fn receive_msg<M: MessageFromGsp>(&mut self, timeout: Delta) -> Resul
result
}
-
- /// Returns the DMA handle of the command queue's shared memory region.
- pub(crate) fn dma_handle(&self) -> DmaAddress {
- self.gsp_mem.0.dma_handle()
- }
}
diff --git a/drivers/gpu/nova-core/gsp/commands.rs b/drivers/gpu/nova-core/gsp/commands.rs
index 77054c92fcc2..c89c7b57a751 100644
--- a/drivers/gpu/nova-core/gsp/commands.rs
+++ b/drivers/gpu/nova-core/gsp/commands.rs
@@ -165,7 +165,7 @@ fn read(
}
/// Waits for GSP initialization to complete.
-pub(crate) fn wait_gsp_init_done(cmdq: &mut Cmdq) -> Result {
+pub(crate) fn wait_gsp_init_done(cmdq: &Cmdq) -> Result {
loop {
match cmdq.receive_msg::<GspInitDone>(Cmdq::RECEIVE_TIMEOUT) {
Ok(_) => break Ok(()),
@@ -234,6 +234,6 @@ pub(crate) fn gpu_name(&self) -> core::result::Result<&str, GpuNameError> {
}
/// Send the [`GetGspInfo`] command and awaits for its reply.
-pub(crate) fn get_gsp_info(cmdq: &mut Cmdq, bar: &Bar0) -> Result<GetGspStaticInfoReply> {
+pub(crate) fn get_gsp_info(cmdq: &Cmdq, bar: &Bar0) -> Result<GetGspStaticInfoReply> {
cmdq.send_command(bar, GetGspStaticInfo)
}
diff --git a/drivers/gpu/nova-core/gsp/sequencer.rs b/drivers/gpu/nova-core/gsp/sequencer.rs
index ce2b3bb05d22..474e4c8021db 100644
--- a/drivers/gpu/nova-core/gsp/sequencer.rs
+++ b/drivers/gpu/nova-core/gsp/sequencer.rs
@@ -356,7 +356,7 @@ pub(crate) struct GspSequencerParams<'a> {
}
impl<'a> GspSequencer<'a> {
- pub(crate) fn run(cmdq: &mut Cmdq, params: GspSequencerParams<'a>) -> Result {
+ pub(crate) fn run(cmdq: &Cmdq, params: GspSequencerParams<'a>) -> Result {
let seq_info = loop {
match cmdq.receive_msg::<GspSequence>(Cmdq::RECEIVE_TIMEOUT) {
Ok(seq_info) => break seq_info,
--
2.53.0
* Re: [PATCH v5 5/5] gpu: nova-core: gsp: add mutex locking to Cmdq
2026-03-18 4:07 ` [PATCH v5 5/5] gpu: nova-core: gsp: add mutex locking to Cmdq Eliot Courtney
@ 2026-03-18 15:27 ` Alexandre Courbot
2026-03-18 15:40 ` Danilo Krummrich
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Alexandre Courbot @ 2026-03-18 15:27 UTC (permalink / raw)
To: Eliot Courtney
Cc: Danilo Krummrich, Alice Ryhl, David Airlie, Simona Vetter,
Benno Lossin, Gary Guo, John Hubbard, Alistair Popple,
Joel Fernandes, Timur Tabi, dri-devel, linux-kernel,
rust-for-linux, Zhi Wang
On Wed Mar 18, 2026 at 1:07 PM JST, Eliot Courtney wrote:
<snip>
> + /// Returns the DMA handle of the command queue's shared memory region.
> + pub(crate) fn dma_handle(&self) -> DmaAddress {
> + self.inner.lock().gsp_mem.0.dma_handle()
> + }
Just noticed that we now need to lock to get the DMA handle. It's
inconsequential in practice but a bit inelegant.
Since the DMA handle never changes, and is only ever needed during
initialization, I think I will just insert a patch before this one that
adds a `pub(super) dma_handle` member to `Cmdq`. That way we only need
to obtain the handle at construction time and can get rid of this
method, which keeps the public API focused on message handling.
No need to resend, I will apply this patch on top of mine and merge.
* Re: [PATCH v5 5/5] gpu: nova-core: gsp: add mutex locking to Cmdq
2026-03-18 15:27 ` Alexandre Courbot
@ 2026-03-18 15:40 ` Danilo Krummrich
2026-03-20 14:45 ` Alexandre Courbot
2026-03-18 23:21 ` Alexandre Courbot
2026-03-19 0:58 ` Eliot Courtney
2 siblings, 1 reply; 13+ messages in thread
From: Danilo Krummrich @ 2026-03-18 15:40 UTC (permalink / raw)
To: Alexandre Courbot
Cc: Eliot Courtney, Alice Ryhl, David Airlie, Simona Vetter,
Benno Lossin, Gary Guo, John Hubbard, Alistair Popple,
Joel Fernandes, Timur Tabi, dri-devel, linux-kernel,
rust-for-linux, Zhi Wang
On Wed Mar 18, 2026 at 4:27 PM CET, Alexandre Courbot wrote:
> On Wed Mar 18, 2026 at 1:07 PM JST, Eliot Courtney wrote:
> <snip>
>> + /// Returns the DMA handle of the command queue's shared memory region.
>> + pub(crate) fn dma_handle(&self) -> DmaAddress {
>> + self.inner.lock().gsp_mem.0.dma_handle()
>> + }
>
> Just noticed that we now need to lock to get the DMA handle. It's
> inconsequential in practice but a bit inelegant.
>
> Since the DMA handle never changes, and is only ever needed during
> initialization, I think I will just insert a patch before this one that
> adds a `pub(super) dma_handle` member to `Cmdq`. That way we only need
> to obtain the handle at construction time and can get rid of this
> method, which keeps the public API focused on message handling.
>
> No need to resend, I will apply this patch on top of mine and merge.
It is a minor inconvenience, but it may indicate that dma::Coherent should
probably support locking, i.e. you want to protect the data within the
dma::Coherent allocation, not the dma::Coherent object itself.
* Re: [PATCH v5 5/5] gpu: nova-core: gsp: add mutex locking to Cmdq
2026-03-18 15:40 ` Danilo Krummrich
@ 2026-03-20 14:45 ` Alexandre Courbot
0 siblings, 0 replies; 13+ messages in thread
From: Alexandre Courbot @ 2026-03-20 14:45 UTC (permalink / raw)
To: Danilo Krummrich
Cc: Eliot Courtney, Alice Ryhl, David Airlie, Simona Vetter,
Benno Lossin, Gary Guo, John Hubbard, Alistair Popple,
Joel Fernandes, Timur Tabi, dri-devel, linux-kernel,
rust-for-linux, Zhi Wang
On Thu Mar 19, 2026 at 12:40 AM JST, Danilo Krummrich wrote:
> On Wed Mar 18, 2026 at 4:27 PM CET, Alexandre Courbot wrote:
>> On Wed Mar 18, 2026 at 1:07 PM JST, Eliot Courtney wrote:
>> <snip>
>>> + /// Returns the DMA handle of the command queue's shared memory region.
>>> + pub(crate) fn dma_handle(&self) -> DmaAddress {
>>> + self.inner.lock().gsp_mem.0.dma_handle()
>>> + }
>>
>> Just noticed that we now need to lock to get the DMA handle. It's
>> inconsequential in practice but a bit inelegant.
>>
>> Since the DMA handle never changes, and is only ever needed during
>> initialization, I think I will just insert a patch before this one that
>> adds a `pub(super) dma_handle` member to `Cmdq`. That way we only need
>> to obtain the handle at construction time and can get rid of this
>> method, which keeps the public API focused on message handling.
>>
>> No need to resend, I will apply this patch on top of mine and merge.
>
> It is a minor inconvinience, but it may indicate that dma::Coherent should
> probaly support locking, i.e. you want to protect the data within the
> dma::Coherent allocation, not the dma::Coherent object itself.
Not quite sure I understand what you mean here - can you elaborate? This
sounds to me like you want to add a `Mutex` to every `dma::Coherent`, so
I am likely misunderstanding. :)
* Re: [PATCH v5 5/5] gpu: nova-core: gsp: add mutex locking to Cmdq
2026-03-18 15:27 ` Alexandre Courbot
2026-03-18 15:40 ` Danilo Krummrich
@ 2026-03-18 23:21 ` Alexandre Courbot
2026-03-19 0:34 ` John Hubbard
2026-03-19 0:58 ` Eliot Courtney
2 siblings, 1 reply; 13+ messages in thread
From: Alexandre Courbot @ 2026-03-18 23:21 UTC (permalink / raw)
To: Eliot Courtney
Cc: Danilo Krummrich, Alice Ryhl, David Airlie, Simona Vetter,
Benno Lossin, Gary Guo, John Hubbard, Alistair Popple,
Joel Fernandes, Timur Tabi, dri-devel, linux-kernel,
rust-for-linux, Zhi Wang
On Thu Mar 19, 2026 at 12:27 AM JST, Alexandre Courbot wrote:
> On Wed Mar 18, 2026 at 1:07 PM JST, Eliot Courtney wrote:
> <snip>
>> + /// Returns the DMA handle of the command queue's shared memory region.
>> + pub(crate) fn dma_handle(&self) -> DmaAddress {
>> + self.inner.lock().gsp_mem.0.dma_handle()
>> + }
>
> Just noticed that we now need to lock to get the DMA handle. It's
> inconsequential in practice but a bit inelegant.
>
> Since the DMA handle never changes, and is only ever needed during
> initialization, I think I will just insert a patch before this one that
> adds a `pub(super) dma_handle` member to `Cmdq`. That way we only need
> to obtain the handle at construction time and can get rid of this
> method, which keeps the public API focused on message handling.
>
> No need to resend, I will apply this patch on top of mine and merge.
Actually let's fix that afterwards: what I proposed above would
introduce code that hasn't undergone review, and doesn't reduce the
number of patches in the series, so doesn't bring any benefit. I'll merge
the series as-is for now.
* Re: [PATCH v5 5/5] gpu: nova-core: gsp: add mutex locking to Cmdq
2026-03-18 23:21 ` Alexandre Courbot
@ 2026-03-19 0:34 ` John Hubbard
0 siblings, 0 replies; 13+ messages in thread
From: John Hubbard @ 2026-03-19 0:34 UTC (permalink / raw)
To: Alexandre Courbot, Eliot Courtney
Cc: Danilo Krummrich, Alice Ryhl, David Airlie, Simona Vetter,
Benno Lossin, Gary Guo, Alistair Popple, Joel Fernandes,
Timur Tabi, dri-devel, linux-kernel, rust-for-linux, Zhi Wang
On 3/18/26 4:21 PM, Alexandre Courbot wrote:
> On Thu Mar 19, 2026 at 12:27 AM JST, Alexandre Courbot wrote:
>> On Wed Mar 18, 2026 at 1:07 PM JST, Eliot Courtney wrote:
>> <snip>
>>> + /// Returns the DMA handle of the command queue's shared memory region.
>>> + pub(crate) fn dma_handle(&self) -> DmaAddress {
>>> + self.inner.lock().gsp_mem.0.dma_handle()
>>> + }
>>
>> Just noticed that we now need to lock to get the DMA handle. It's
>> inconsequential in practice but a bit inelegant.
>>
>> Since the DMA handle never changes, and is only ever needed during
>> initialization, I think I will just insert a patch before this one that
>> adds a `pub(super) dma_handle` member to `Cmdq`. That way we only need
>> to obtain the handle at construction time and can get rid of this
>> method, which keeps the public API focused on message handling.
>>
>> No need to resend, I will apply this patch on top of mine and merge.
>
> Actually let's fix that afterwards, what I proposed above would
> introduce code that hasn't undergone review, and doesn't reduce the
> number of patches in the series so doesn't bring any benefit. I'll merge
> the series as-is for now.
For what it's worth, this avoided an email from me, to the effect of
"there is no such thing as a quick locking fixup, don't do this".
So, good. :)
thanks,
--
John Hubbard
* Re: [PATCH v5 5/5] gpu: nova-core: gsp: add mutex locking to Cmdq
2026-03-18 15:27 ` Alexandre Courbot
2026-03-18 15:40 ` Danilo Krummrich
2026-03-18 23:21 ` Alexandre Courbot
@ 2026-03-19 0:58 ` Eliot Courtney
2 siblings, 0 replies; 13+ messages in thread
From: Eliot Courtney @ 2026-03-19 0:58 UTC (permalink / raw)
To: Alexandre Courbot, Eliot Courtney
Cc: Danilo Krummrich, Alice Ryhl, David Airlie, Simona Vetter,
Benno Lossin, Gary Guo, John Hubbard, Alistair Popple,
Joel Fernandes, Timur Tabi, dri-devel, linux-kernel,
rust-for-linux, Zhi Wang
On Thu Mar 19, 2026 at 12:27 AM JST, Alexandre Courbot wrote:
> On Wed Mar 18, 2026 at 1:07 PM JST, Eliot Courtney wrote:
> <snip>
>> + /// Returns the DMA handle of the command queue's shared memory region.
>> + pub(crate) fn dma_handle(&self) -> DmaAddress {
>> + self.inner.lock().gsp_mem.0.dma_handle()
>> + }
>
> Just noticed that we now need to lock to get the DMA handle. It's
> inconsequential in practice but a bit inelegant.
>
> Since the DMA handle never changes, and is only ever needed during
> initialization, I think I will just insert a patch before this one that
> adds a `pub(super) dma_handle` member to `Cmdq`. That way we only need
> to obtain the handle at construction time and can get rid of this
> method, which keeps the public API focused on message handling.
>
> No need to resend, I will apply this patch on top of mine and merge.
Yerp it's not ideal, but we only call this method in one location, once,
when creating the command queue. The locking in this series is very
basic and we probably want to improve it later to be more performant, so
I think it's simpler and better to just leave this as is for now.
* Re: [PATCH v5 0/5] gpu: nova-core: gsp: add locking to Cmdq
2026-03-18 4:07 [PATCH v5 0/5] gpu: nova-core: gsp: add locking to Cmdq Eliot Courtney
` (4 preceding siblings ...)
2026-03-18 4:07 ` [PATCH v5 5/5] gpu: nova-core: gsp: add mutex locking to Cmdq Eliot Courtney
@ 2026-03-19 2:10 ` Alexandre Courbot
5 siblings, 0 replies; 13+ messages in thread
From: Alexandre Courbot @ 2026-03-19 2:10 UTC (permalink / raw)
To: Eliot Courtney
Cc: Danilo Krummrich, Alice Ryhl, David Airlie, Simona Vetter,
Benno Lossin, Gary Guo, John Hubbard, Alistair Popple,
Joel Fernandes, Timur Tabi, dri-devel, linux-kernel,
rust-for-linux, Zhi Wang
On Wed Mar 18, 2026 at 1:07 PM JST, Eliot Courtney wrote:
> Add locking to Cmdq. This is required e.g. for unloading the driver,
> which needs to send the UnloadingGuestDriver command via the command
> queue on unbind, which may happen on a different thread.
>
> We have commands that need a reply and commands that don't. For
> commands with a reply we want to make sure that they don't get
> the reply of a different command back. The approach this patch series
> takes is by making those commands block until they get a response. For
> now this should be ok, and we expect GSP to be fast anyway.
>
> To do this, we need to know which commands expect a reply and which
> don't. John's existing series[1] adds IS_ASYNC which solves part of the
> problem, but we need to know a bit more. So instead, add an
> associated type called Reply which tells us what the reply is.
>
> An alternative would be to define traits inheriting CommandToGsp, e.g.
> CommandWithReply and CommandWithoutReply, instead of using the
> associated type. I implemented the associated type version because it
> feels more compositional rather than inheritance-based, so it seemed a
> bit better to me. But both of these approaches work and are fine, IMO.
>
> In summary, this patch series has three steps:
> 1. Add the type infrastructure to know what replies are expected for a
> command and update each caller to explicitly wait for the reply or
> not.
> 2. Make Cmdq pinned so we can use Mutex
> 3. Add a Mutex to protect Cmdq by moving the relevant state to an
> inner struct.
>
> [1]: https://lore.kernel.org/all/20260211000451.192109-1-jhubbard@nvidia.com/
>
> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
Merged into drm-rust-next, thanks!