Re: [PATCH 1/2] gpu: nova-core: fix GSP RPC send sequence numbers to match Open RM

public inbox for rust-for-linux@vger.kernel.org
 help / color / mirror / Atom feed

From: "Danilo Krummrich" <dakr@kernel.org>
To: "Alexandre Courbot" <acourbot@nvidia.com>
Cc: "John Hubbard" <jhubbard@nvidia.com>,
	"Joel Fernandes" <joelagnelf@nvidia.com>,
	"Alistair Popple" <apopple@nvidia.com>,
	"Eliot Courtney" <ecourtney@nvidia.com>,
	"Zhi Wang" <zhiw@nvidia.com>, "Simona Vetter" <simona@ffwll.ch>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Miguel Ojeda" <ojeda@kernel.org>,
	"Alex Gaynor" <alex.gaynor@gmail.com>,
	"Boqun Feng" <boqun.feng@gmail.com>,
	"Gary Guo" <gary@garyguo.net>,
	"Björn Roy Baron" <bjorn3_gh@protonmail.com>,
	"Benno Lossin" <lossin@kernel.org>,
	"Andreas Hindborg" <a.hindborg@kernel.org>,
	"Alice Ryhl" <aliceryhl@google.com>,
	"Trevor Gross" <tmgross@umich.edu>,
	nouveau@lists.freedesktop.org, rust-for-linux@vger.kernel.org,
	LKML <linux-kernel@vger.kernel.org>,
	"Maneet Singh" <mmaneetsingh@nvidia.com>
Subject: Re: [PATCH 1/2] gpu: nova-core: fix GSP RPC send sequence numbers to match Open RM
Date: Tue, 17 Feb 2026 12:52:31 +0100	[thread overview]
Message-ID: <DGH7S2HJRDJJ.210D7DLTNFRNR@kernel.org> (raw)
In-Reply-To: <DGH35X9GCFYO.EBY01G1C6XQM@nvidia.com>

On Tue Feb 17, 2026 at 9:15 AM CET, Alexandre Courbot wrote:
>> diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
>> index 02eec2961b5f..f769e234dae6 100644
>> --- a/drivers/gpu/nova-core/gsp/boot.rs
>> +++ b/drivers/gpu/nova-core/gsp/boot.rs
>> @@ -403,7 +403,11 @@ pub(crate) fn boot(
>>  
>>          dev_dbg!(dev, "RISC-V active? {}\n", gsp_falcon.is_riscv_active(bar));
>>  
>> -        // Now that GSP is active, send system info and registry
>> +        // Now that GSP is active, send system info and registry.
>> +        //
>> +        // These are async (fire-and-forget) RPCs: no response comes back from GSP.
>> +        // GSP does not include them in its sequence number counting today, but a
>> +        // future GSP firmware update will fold them into the normal sequence space.
>
> From the point of view of the caller this is a superfluous
> implementation detail. This comment is actually confusing as it mentions
> a sequence number that the caller does not need to provide, so I'd just
> remove it altogether.

The comment is not a doc-comment, hence it can more be seen as a comment for the
audience of nova-core developers in general, rather than a description of a
specific API. So, I think it is OK to talk about an implementation detail here.

To me personally it seems useful and I'd like to keep it.

>
>>          self.cmdq
>>              .send_command(bar, commands::SetSystemInfo::new(pdev, chipset))?;
>>          self.cmdq.send_command(bar, commands::SetRegistry::new())?;
>> diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
>> index 16895f5281b7..7d6d7d81287c 100644
>> --- a/drivers/gpu/nova-core/gsp/cmdq.rs
>> +++ b/drivers/gpu/nova-core/gsp/cmdq.rs
>> @@ -58,6 +58,13 @@ pub(crate) trait CommandToGsp {
>>      /// Function identifying this command to the GSP.
>>      const FUNCTION: MsgFunction;
>>  
>> +    /// Whether this command is async (fire-and-forget), meaning no response is expected from GSP.
>> +    ///
>> +    /// Async commands get inner `rpc.sequence` set to 0. Sync commands get inner `rpc.sequence`
>> +    /// set to the transport counter, matching Open RM. The outer `seqNum` always increments
>> +    /// regardless.
>
> This here is too much detail as well IMHO, even more if you follow the
> last suggestion in this review. :)

As this one is a doc-comment, agreed.

>
>> +    const IS_ASYNC: bool = false;
>> +
>>      /// Type generated by [`CommandToGsp::init`], to be written into the command queue buffer.
>>      type Command: FromBytes + AsBytes;
>>  
>> @@ -439,7 +446,8 @@ struct GspMessage<'a> {
>>  pub(crate) struct Cmdq {
>>      /// Device this command queue belongs to.
>>      dev: ARef<device::Device>,
>> -    /// Current command sequence number.
>> +    /// Transport-level sequence number, incremented for every send. Used for the outer
>> +    /// GSP_MSG_QUEUE_ELEMENT.seqNum. Also used as the inner rpc.sequence for sync commands.
>
> Note that these types (especially `GSP_MSG_QUEUE_ELEMENT`) are never
> visible in this module, so mentioning them here is assuming context that
> the reader can't easily get.
>
> Instead, how about just "Used for [`GspMsgElement`]'s sequence counter"?

Same here.

>>      seq: u32,
>>      /// Memory area shared with the GSP for communicating commands and messages.
>>      gsp_mem: DmaGspMem,
>> @@ -514,8 +522,13 @@ pub(crate) fn send_command<M>(&mut self, bar: &Bar0, command: M) -> Result
>>          // Extract area for the command itself.
>>          let (cmd, payload_1) = M::Command::from_bytes_mut_prefix(dst.contents.0).ok_or(EIO)?;
>>  
>> +        // The outer seqNum always increments (transport-level, unique per message).
>> +        // The inner rpc.sequence is 0 for async (fire-and-forget) commands, or the
>> +        // sync counter for command/response pairs, matching Open RM behavior.
>> +        let rpc_seq = if M::IS_ASYNC { 0 } else { self.seq };
>> +
>>          // Fill the header and command in-place.
>> -        let msg_element = GspMsgElement::init(self.seq, command_size, M::FUNCTION);
>> +        let msg_element = GspMsgElement::init(self.seq, rpc_seq, command_size, M::FUNCTION);
>>          // SAFETY: `msg_header` and `cmd` are valid references, and not touched if the initializer
>>          // fails.
>>          unsafe {
>> diff --git a/drivers/gpu/nova-core/gsp/commands.rs b/drivers/gpu/nova-core/gsp/commands.rs
>> index e6a9a1fc6296..c8a73bd30051 100644
>> --- a/drivers/gpu/nova-core/gsp/commands.rs
>> +++ b/drivers/gpu/nova-core/gsp/commands.rs
>> @@ -50,6 +50,7 @@ pub(crate) fn new(pdev: &'a pci::Device<device::Bound>, chipset: Chipset) -> Sel
>>  
>>  impl<'a> CommandToGsp for SetSystemInfo<'a> {
>>      const FUNCTION: MsgFunction = MsgFunction::GspSetSystemInfo;
>> +    const IS_ASYNC: bool = true;
>>      type Command = GspSetSystemInfo;
>>      type InitError = Error;
>>  
>> @@ -101,6 +102,7 @@ pub(crate) fn new() -> Self {
>>  
>>  impl CommandToGsp for SetRegistry {
>>      const FUNCTION: MsgFunction = MsgFunction::SetRegistry;
>> +    const IS_ASYNC: bool = true;
>>      type Command = PackedRegistryTable;
>>      type InitError = Infallible;
>>  
>> diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
>> index 927bcee6a5a5..e417ed58419f 100644
>> --- a/drivers/gpu/nova-core/gsp/fw.rs
>> +++ b/drivers/gpu/nova-core/gsp/fw.rs
>> @@ -260,6 +260,26 @@ pub(crate) enum MsgFunction {
>>      UcodeLibOsPrint = bindings::NV_VGPU_MSG_EVENT_UCODE_LIBOS_PRINT,
>>  }
>>  
>> +impl MsgFunction {
>> +    /// Returns true if this is a GSP-initiated async event (NV_VGPU_MSG_EVENT_*), as opposed to
>> +    /// a command response (NV_VGPU_MSG_FUNCTION_*).
>> +    #[expect(dead_code)]
>> +    pub(crate) fn is_event(&self) -> bool {
>> +        matches!(
>> +            self,
>> +            Self::GspInitDone
>> +                | Self::GspRunCpuSequencer
>> +                | Self::PostEvent
>> +                | Self::RcTriggered
>> +                | Self::MmuFaultQueued
>> +                | Self::OsErrorLog
>> +                | Self::GspPostNoCat
>> +                | Self::GspLockdownNotice
>> +                | Self::UcodeLibOsPrint //
>> +        )
>> +    }
>
> Mmm using a method for this is quite fragile, as we are always at risk
> of forgetting to handle a newly-added function. I wanted to eventually
> split `MsgFunction` between its command and events constituants, maybe
> that would be a good opportunity to do so.

I also think we should leverage the type system for such things.

> Also, this is marked with `dead_code`? Looks like it belongs to the next
> patch in this case.
>
>> +}
>> +
>>  impl fmt::Display for MsgFunction {
>>      fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
>>          match self {
>> @@ -816,7 +836,7 @@ fn new() -> Self {
>>  }
>>  
>>  impl bindings::rpc_message_header_v {
>> -    fn init(cmd_size: usize, function: MsgFunction) -> impl Init<Self, Error> {
>> +    fn init(cmd_size: usize, function: MsgFunction, sequence: u32) -> impl Init<Self, Error> {
>>          type RpcMessageHeader = bindings::rpc_message_header_v;
>>  
>>          try_init!(RpcMessageHeader {
>> @@ -829,6 +849,7 @@ fn init(cmd_size: usize, function: MsgFunction) -> impl Init<Self, Error> {
>>                  .and_then(|v| v.try_into().map_err(|_| EINVAL))?,
>>              rpc_result: 0xffffffff,
>>              rpc_result_private: 0xffffffff,
>> +            sequence,
>>              ..Zeroable::init_zeroed()
>>          })
>>      }
>> @@ -847,26 +868,31 @@ impl GspMsgElement {
>>      ///
>>      /// # Arguments
>>      ///
>> -    /// * `sequence` - Sequence number of the message.
>> +    /// * `transport_seq` - Transport-level sequence number for the outer message header
>> +    ///   (`GSP_MSG_QUEUE_ELEMENT.seqNum`). Must be unique per message.
>> +    /// * `rpc_seq` - RPC-level sequence number for the inner RPC header
>> +    ///   (`rpc_message_header_v.sequence`). Set to 0 for async (fire-and-forget) commands,
>> +    ///   or to the sync counter for command/response pairs.
>>      /// * `cmd_size` - Size of the command (not including the message element), in bytes.
>>      /// * `function` - Function of the message.
>>      #[allow(non_snake_case)]
>>      pub(crate) fn init(
>> -        sequence: u32,
>> +        transport_seq: u32,
>> +        rpc_seq: u32,
>
> The caller site of `init` mentions that `rpc_seq` will always either be
> equal to `transport_seq`, or be 0, depending on whether this is an async
> command or not, yet this constructor allows any value to be used. I'd
> like to enforce this invariant by making it impossible to build invalid
> combinations of sequences.
>
> As a matter of fact we already have this information in
> `CommandToGsp::IS_ASYNC`, so the simpler way would be to pass *that* as
> an extra argument to `init`, and let it handle the sequencing
> internally. It also has the benefit of not making this message-layer
> detail leak into the command queue code, making the discussion about
> this detail even more irrelevant outside of the `fw` module.
>
> An even better solution would be to to pass the `CommandToGsp` as a
> generic argument - that way `init` could extract both the `FUNCTION` and
> `IS_ASYNC`. Maybe use a public generic proxy method that calls into a
> non-generic private one to limit monomorphization.

I think we should go with the generic argument; but I don't think we need a
proxy method to fight monomorphization, i.e. I don't think we need to do work
for the optimizer in this case.

next prev parent reply	other threads:[~2026-02-17 11:52 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-11  0:04 [PATCH 0/2] gpu: nova-core: use GSP RPC sequence numbers properly John Hubbard
2026-02-11  0:04 ` [PATCH 1/2] gpu: nova-core: fix GSP RPC send sequence numbers to match Open RM John Hubbard
2026-02-17  8:15   ` Alexandre Courbot
2026-02-17 11:52     ` Danilo Krummrich [this message]
2026-02-11  0:04 ` [PATCH 2/2] gpu: nova-core: improve GSP RPC debug logging with message classification John Hubbard
2026-02-17 11:04   ` Alexandre Courbot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DGH7S2HJRDJJ.210D7DLTNFRNR@kernel.org \
    --to=dakr@kernel.org \
    --cc=a.hindborg@kernel.org \
    --cc=acourbot@nvidia.com \
    --cc=alex.gaynor@gmail.com \
    --cc=aliceryhl@google.com \
    --cc=apopple@nvidia.com \
    --cc=bhelgaas@google.com \
    --cc=bjorn3_gh@protonmail.com \
    --cc=boqun.feng@gmail.com \
    --cc=ecourtney@nvidia.com \
    --cc=gary@garyguo.net \
    --cc=jhubbard@nvidia.com \
    --cc=joelagnelf@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lossin@kernel.org \
    --cc=mmaneetsingh@nvidia.com \
    --cc=nouveau@lists.freedesktop.org \
    --cc=ojeda@kernel.org \
    --cc=rust-for-linux@vger.kernel.org \
    --cc=simona@ffwll.ch \
    --cc=tmgross@umich.edu \
    --cc=zhiw@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox