($INBOX_DIR/description missing)
 help / color / mirror / Atom feed
* Re: [RFC PATCH v3 1/6] uapi: add goldfish_address_space userspace ABI header
From: Arnd Bergmann @ 2026-04-13 16:28 UTC (permalink / raw)
  To: Wenzhao Liao, rust-for-linux, linux-pci
  Cc: Miguel Ojeda, Danilo Krummrich, bhelgaas,
	Krzysztof Wilczyński, Greg Kroah-Hartman, linux-kernel,
	linux-api
In-Reply-To: <20260406165120.166928-2-wenzhaoliao@ruc.edu.cn>

On Mon, Apr 6, 2026, at 18:51, Wenzhao Liao wrote:

> +struct goldfish_address_space_allocate_block {
> +	__u64 size;
> +	__u64 offset;
> +	__u64 phys_addr;
> +};
> +
> +struct goldfish_address_space_ping {
> +	__u64 offset;
> +	__u64 size;
> +	__u64 metadata;
> +	__u32 version;
> +	__u32 wait_fd;
> +	__u32 wait_flags;
> +	__u32 direction;
> +};
> +
> +struct goldfish_address_space_claim_shared {
> +	__u64 offset;
> +	__u64 size;
> +};

All these ioctl structures are well-formed in the sense that they
are portable across architectures and won't leak kernel data
through implicit padding.

Two of the members are a bit worrying, but that may just
be my own understanding:

- the 'phys_addr' member sounds like it refers to a physical
  memory location in the CPU address space, which in general
  should not be visible to user space, as that tends to
  expose security problems if users with access to the
  device can use this to access data they should not.

- the 'version' field may refer to the version of the ioctl
  command, which is similarly discouraged since it is
  harder to deal with than just coming up with new ioctl
  command codes. If this refers to the version of the
  remote side, this is probably fine.

> +#define GOLDFISH_ADDRESS_SPACE_IOCTL_MAGIC 'G'
> +
> +#define GOLDFISH_ADDRESS_SPACE_IOCTL_OP(OP, T) \
> +	_IOWR(GOLDFISH_ADDRESS_SPACE_IOCTL_MAGIC, OP, T)

I think it would be better to remove this intermediate macro, since
it prevents easy scraping of ioctl command codes from looking
at the source file with regular expressions.

It is also unusual that all commands are both reading
and writing the data. Please check if you can make some
of them read-only or write-only.

     Arnd

^ permalink raw reply

* Re: [PATCH] gpu: nova-core: fb: make sure to unregister SysmemFlush on boot failure
From: Danilo Krummrich @ 2026-04-13 14:00 UTC (permalink / raw)
  To: Eliot Courtney
  Cc: Gary Guo, Alexandre Courbot, Alice Ryhl, David Airlie,
	Simona Vetter, John Hubbard, Alistair Popple, Joel Fernandes,
	Timur Tabi, rust-for-linux, dri-devel, linux-kernel
In-Reply-To: <DHS2DLJW7ISY.1QW0Y0WC1NSZJ@nvidia.com>

On Mon Apr 13, 2026 at 3:33 PM CEST, Eliot Courtney wrote:
> Yeah, I agree that this patch is the wrong long term solution. If we
> don't have the infra for the proper way to do it soon, it might be worth
> taking it anyway since it adds minimal complexity and fixes a real
> issue.

Yeah, let's see how it goes the next weeks.

> I had a brief look into the Devres chain stuff that exists on the C side
> and it looks like it doesn't provide any actual guarantees about the
> lifetime (it seems possible to delete from the middle of the chain, so
> descendants can't assume ancestors exist in general, AFAICT), so the
> the translation of that into lifetimes in rust might get interesting

There is no such API that does what we want here on the C side. In C it simply
is the responsibility of drivers to take care of this. So, I'm not sure what you
are referring to.

In case you are talking about devres groups, they have a different purpose. They
are used in cases where you e.g. have some middle layer API between drivers and
a subsystem, so the middle layer can acquire resources on the driver's behalf
and manage them. If the middle layer fails on optional features, it can simply
release the group of resources acquired for the purpose of the optional feature.

I.e. devres groups are simply markers for a certain range within the list of
devres nodes of a specific device.

^ permalink raw reply

* Re: [PATCH] gpu: nova-core: fb: make sure to unregister SysmemFlush on boot failure
From: Eliot Courtney @ 2026-04-13 13:33 UTC (permalink / raw)
  To: Danilo Krummrich, Gary Guo, Eliot Courtney
  Cc: Alexandre Courbot, Alice Ryhl, David Airlie, Simona Vetter,
	John Hubbard, Alistair Popple, Joel Fernandes, Timur Tabi,
	rust-for-linux, dri-devel, linux-kernel
In-Reply-To: <DHRZ01R16ZXR.3T0D56PREGR18@kernel.org>

On Mon Apr 13, 2026 at 7:55 PM JST, Danilo Krummrich wrote:
> On Fri Apr 10, 2026 at 5:57 PM CEST, Gary Guo wrote:
>>> +impl Drop for SysmemFlush {
>>> +    fn drop(&mut self) {
>>> +        let _ = self.bar.try_access_with(|bar| self.unregister(bar));
>>
>> I feel that this is the wrong solution to the problem.
>
> Yeah, it is pretty fragile as it relies on outer implementation details, such as
> the fact that SysmemFlush is part of the device private data and hence
> try_access_with() will fail when not dropped in an unwind path.
>
>> The thing we want is to *ensure* that `SysmemFlush` Drop is called with device
>> still being bound.
>>
>> It's not yet fully clear to me how we'd want to guarantee that, but one API that
>> might make sense is to create a DevRes API that allows you to reference an
>> existing `DevRes` and have driver-core making sure that the tear down happens in
>> reverse order. So inside the `Drop` the `bar` can still be unconditionally
>> access.
>
> Yes, I have something like this on my list of things I want to look into for
> while (my list entry calls it DevresChain).
>
> I want to leverage the internal reference count of Devres for this, which is not
> exactly straight forward, but should be possible.
>
> I will prioritize this and have a look.

Yeah, I agree that this patch is the wrong long term solution. If we
don't have the infra for the proper way to do it soon, it might be worth
taking it anyway since it adds minimal complexity and fixes a real
issue.

I had a brief look into the Devres chain stuff that exists on the C side
and it looks like it doesn't provide any actual guarantees about the
lifetime (it seems possible to delete from the middle of the chain, so
descendants can't assume ancestors exist in general, AFAICT), so the
the translation of that into lifetimes in rust might get interesting

thanks

>
> Thanks,
> Danilo


^ permalink raw reply

* Re: [PATCH 5/5] gpu: nova-core: vbios: use checked ops and accesses in `FwSecBiosImage::ucode`
From: Alexandre Courbot @ 2026-04-13 12:59 UTC (permalink / raw)
  To: Eliot Courtney
  Cc: Joel Fernandes, Danilo Krummrich, Alice Ryhl, David Airlie,
	Simona Vetter, John Hubbard, Alistair Popple, Timur Tabi,
	rust-for-linux, dri-devel, linux-kernel
In-Reply-To: <DHRT5WO1KNM9.7QPD97ADIC3O@nvidia.com>

On Mon Apr 13, 2026 at 3:20 PM JST, Eliot Courtney wrote:
> On Sat Apr 11, 2026 at 12:05 AM JST, Joel Fernandes wrote:
>> Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com> , one comment below
>>
>> On 4/10/2026 4:38 AM, Eliot Courtney wrote:
>>> Use checked arithmetic and access for extracting the microcode since the
>>> offsets are firmware derived.
>>> 
>>> Fixes: 47c4846e4319 ("gpu: nova-core: vbios: Add support for FWSEC ucode extraction")
>>> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
>>> ---
>>>  drivers/gpu/nova-core/vbios.rs | 19 ++++++++++++-------
>>>  1 file changed, 12 insertions(+), 7 deletions(-)
>>> 
>>> diff --git a/drivers/gpu/nova-core/vbios.rs b/drivers/gpu/nova-core/vbios.rs
>>> index 3bd3ac3a69f2..b509cd8407a5 100644
>>> --- a/drivers/gpu/nova-core/vbios.rs
>>> +++ b/drivers/gpu/nova-core/vbios.rs
>>> @@ -1027,16 +1027,21 @@ pub(crate) fn header(&self) -> Result<FalconUCodeDesc> {
>>>  
>>>      /// Get the ucode data as a byte slice
>>>      pub(crate) fn ucode(&self, desc: &FalconUCodeDesc) -> Result<&[u8]> {
>>> -        let falcon_ucode_offset = self.falcon_ucode_offset;
>>> -
>>>          // The ucode data follows the descriptor.
>>> -        let ucode_data_offset = falcon_ucode_offset + desc.size();
>>> -        let size = usize::from_safe_cast(desc.imem_load_size() + desc.dmem_load_size());
>>> +        let data = self
>>> +            .base
>>> +            .data
>>> +            .get(self.falcon_ucode_offset..)
>>> +            .ok_or(ERANGE)?;
>>> +        let size = usize::from_safe_cast(
>>> +            desc.imem_load_size()
>>> +                .checked_add(desc.dmem_load_size())
>>> +                .ok_or(ERANGE)?,
>>> +        );
>>>  
>>>          // Get the data slice, checking bounds in a single operation.
>>> -        self.base
>>> -            .data
>>> -            .get(ucode_data_offset..ucode_data_offset + size)
>>> +        data.get(desc.size()..)
>>> +            .and_then(|data| data.get(..size))
>>
>> It might be worth adding something like:
>>
>> data.get_slice(start, size) -> Result
>>
>> in R4L longer term if the data.get(start..).and_then(|data|
>> data.get(..size)) pattern is common. It seems to be so in this series.
>
> Yeah, this sounds reasonable although I think we'd need a name that
> makes it more obvious it takes a size. I had a look at rust standard
> library code and it seems a common pattern is to use split_at_checked
> for the initial split. Unfortunately we can't use that yet I think since
> it's in 1.80.0, but maybe we can use it very soon.

IIUC MRSV will be 1.85 when -rc1 is released, so we should be able to
use `split_at_checked`. I've been looking forward to it as well.

>
> I had a look and it looks like there aren't really other existing
> patterns like .get(start..).and_then(.. get(..size)) in the rust kernel
> code. And in nova-core other usages of getting start..start+size slices
> have local proofs that start+size doesn't overflow.
>
> For the most part in other code hopefully we have local proofs that
> start..start+size won't overflow and we can just .get(start..start+size).

Rust is really missing a way to build a range using a starting point and
a size... but I guess there is a good reason for that.


^ permalink raw reply

* Re: [PATCH 0/1] rust: module_param: support bool parameters
From: Greg KH @ 2026-04-13 12:25 UTC (permalink / raw)
  To: Andreas Hindborg
  Cc: Wenzhao Liao, mcgrof, petr.pavlu, da.gomez, samitolvanen, ojeda,
	linux-modules, rust-for-linux, atomlin, boqun, gary, bjorn3_gh,
	lossin, aliceryhl, tmgross, dakr, linux-kernel
In-Reply-To: <87fr4zvzn2.fsf@kernel.org>

On Mon, Apr 13, 2026 at 02:00:17PM +0200, Andreas Hindborg wrote:
> "Greg KH" <greg@kroah.com> writes:
> 
> > On Sat, Apr 11, 2026 at 09:02:53AM -0400, Wenzhao Liao wrote:
> >> Sorry for the earlier noise and for our unfamiliarity with parts of the
> >> kernel submission process, which created extra burden for maintainers.
> >>
> >> This patch adds boolean module parameter support to the Rust `module!`
> >> parameter path.
> >>
> >> It implements `ModuleParam` for `bool` and wires `PARAM_OPS_BOOL` into
> >> the Rust module parameter machinery, so Rust-side parsing reuses the
> >> existing kernel `kstrtobool()` semantics through `kstrtobool_bytes()`
> >> instead of introducing a separate parser. A boolean parameter is also
> >> added to `samples/rust/rust_minimal.rs` as a small reference user and
> >> build-time validation point.
> >
> > What driver needs this feature?  Module options should be very rare
> > going forward as they are 1990's technology and do not fit the "modern"
> > kernel model at all.
> 
> Rust null block uses module parameters, and was requested to use proper
> boolean parsing rather than overloading u8 parsing for boolean
> parameters [1].
> 
> Best regards,
> Andreas Hindborg
> 
> 
> [1] https://lore.kernel.org/rust-for-linux/abfK4eji5jKSeO_W@google.com/
> 

Ok, then that needs to be said somewhere, and ideally, have the code
that uses that as part of the patch series.

thanks,
greg k-h

^ permalink raw reply

* Re: [PATCH 0/1] rust: module_param: support bool parameters
From: Andreas Hindborg @ 2026-04-13 12:00 UTC (permalink / raw)
  To: Greg KH, Wenzhao Liao
  Cc: mcgrof, petr.pavlu, da.gomez, samitolvanen, ojeda, linux-modules,
	rust-for-linux, atomlin, boqun, gary, bjorn3_gh, lossin,
	aliceryhl, tmgross, dakr, linux-kernel
In-Reply-To: <2026041118-croak-serving-ff5e@gregkh>

"Greg KH" <greg@kroah.com> writes:

> On Sat, Apr 11, 2026 at 09:02:53AM -0400, Wenzhao Liao wrote:
>> Sorry for the earlier noise and for our unfamiliarity with parts of the
>> kernel submission process, which created extra burden for maintainers.
>>
>> This patch adds boolean module parameter support to the Rust `module!`
>> parameter path.
>>
>> It implements `ModuleParam` for `bool` and wires `PARAM_OPS_BOOL` into
>> the Rust module parameter machinery, so Rust-side parsing reuses the
>> existing kernel `kstrtobool()` semantics through `kstrtobool_bytes()`
>> instead of introducing a separate parser. A boolean parameter is also
>> added to `samples/rust/rust_minimal.rs` as a small reference user and
>> build-time validation point.
>
> What driver needs this feature?  Module options should be very rare
> going forward as they are 1990's technology and do not fit the "modern"
> kernel model at all.

Rust null block uses module parameters, and was requested to use proper
boolean parsing rather than overloading u8 parsing for boolean
parameters [1].

Best regards,
Andreas Hindborg


[1] https://lore.kernel.org/rust-for-linux/abfK4eji5jKSeO_W@google.com/


^ permalink raw reply

* Re: [PATCH 3/3] gpu: nova-core: fix wrong use of barriers in GSP code
From: Gary Guo @ 2026-04-13 11:13 UTC (permalink / raw)
  To: Eliot Courtney, Gary Guo, Miguel Ojeda, Boqun Feng,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, Alexandre Courbot, David Airlie,
	Simona Vetter
  Cc: Alan Stern, Andrea Parri, Will Deacon, Peter Zijlstra,
	Nicholas Piggin, David Howells, Jade Alglave, Luc Maranget,
	Paul E. McKenney, Akira Yokosawa, Daniel Lustig, Joel Fernandes,
	rust-for-linux, nouveau, linux-kernel, linux-arch, lkmm,
	dri-devel
In-Reply-To: <DHRS5I7ZTI01.27WY0OTHA7ZMQ@nvidia.com>

On Mon Apr 13, 2026 at 6:33 AM BST, Eliot Courtney wrote:
> On Fri Apr 3, 2026 at 12:24 AM JST, Gary Guo wrote:
>> From: Gary Guo <gary@garyguo.net>
>>
>> Currently, in the GSP->CPU messaging path, the current code misses a read
>> barrier before data read. The barrier after read is updated to a DMA
>> barrier (with release ordering desired), instead of the existing (Rust)
>> SeqCst SMP barrier; the location of barrier is also moved to the beginning
>> of function, because the barrier is needed to synchronizing between data
>> and ring-buffer pointer, the RMW operation does not internally need a
>> barrier (nor it has to be atomic, as CPU pointers are updated by CPU only).
>>
>> In the CPU->GSP messaging path, the current code misses a write barrier
>> after data write and before updating the CPU write pointer. Barrier is not
>> needed before data write due to control dependency, this fact is documented
>> explicitly. This could be replaced with an acquire barrier if needed.
>>
>> Signed-off-by: Gary Guo <gary@garyguo.net>
>
> nit: should this have
> Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
>
> ?
>
>> ---
>>  drivers/gpu/nova-core/gsp/cmdq.rs | 19 +++++++++++++++++++
>>  drivers/gpu/nova-core/gsp/fw.rs   | 12 ------------
>>  2 files changed, 19 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
>> index 2224896ccc89..7e4315b13984 100644
>> --- a/drivers/gpu/nova-core/gsp/cmdq.rs
>> +++ b/drivers/gpu/nova-core/gsp/cmdq.rs
>> @@ -19,6 +19,12 @@
>>      prelude::*,
>>      sync::{
>>          aref::ARef,
>> +        barrier::{
>> +            dma_mb,
>> +            Read,
>> +            Release,
>> +            Write, //
>> +        },
>>          Mutex, //
>>      },
>>      time::Delta,
>> @@ -258,6 +264,9 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
>>          let tx = self.cpu_write_ptr() as usize;
>>          let rx = self.gsp_read_ptr() as usize;
>>  
>> +        // ORDERING: control dependency provides necessary LOAD->STORE ordering.
>> +        // `dma_mb(Acquire)` may be used here if we don't want to rely on control dependency.
>> +
>>          // SAFETY:
>>          // - We will only access the driver-owned part of the shared memory.
>>          // - Per the safety statement of the function, no concurrent access will be performed.
>> @@ -311,6 +320,9 @@ fn driver_write_area_size(&self) -> usize {
>>          let tx = self.gsp_write_ptr() as usize;
>>          let rx = self.cpu_read_ptr() as usize;
>>  
>> +        // ORDERING: Ensure data load is ordered after load of GSP write pointer.
>> +        dma_mb(Read);
>> +
>>          // SAFETY:
>>          // - We will only access the driver-owned part of the shared memory.
>>          // - Per the safety statement of the function, no concurrent access will be performed.
>> @@ -408,6 +420,10 @@ fn cpu_read_ptr(&self) -> u32 {
>>  
>>      // Informs the GSP that it can send `elem_count` new pages into the message queue.
>>      fn advance_cpu_read_ptr(&mut self, elem_count: u32) {
>> +        // ORDERING: Ensure read pointer is properly ordered.
>
> What about a more specific comment that describes exactly what is
> ordered, e.g. something like:
> Ensure all reads of message data by the CPU have completed before writing
> the updated read pointer to the GSP, since it may overwrite that data.
>
> Maybe this is just me but it's a lot easier for me to think of the
> orderings as a pair of (load? store? -> load? store?) which works for
> everything hw actually supports except for ll+ls+ss, rather than mapping
> 'Release' to (load+store -> store) in my head. e.g. here IIUC we need to
> make sure all loads by the CPU are done before we do the store for the
> pointer, so we need to make sure loads don't cross ahead of this
> barrier but also that stores don't cross behind it, so (load -> store)
> should be sufficient? So, depending on what you want to do with the
> memory model, this could be tightened IMO. Unlike the one below that
> only needs to order stores with eachother (ss).

The pair of `(access, access)` is actually the first iteration of the API that I've designed.
You can see it implemented here: https://github.com/nbdd0121/lkmm-rs/blob/trunk/src/lib.rs
A good thing is that it maps nicely to loongarch and RISC-V. A bad thing is that
it misses ll+ls+ss as you pointed out.

However, I think the more tricky part of this approach is now the semantics is
harder to fit into LKMM; it is now became 9 types of barrier that we need to
reason about, and that's a huge addition. I am not sure it is even feasible to
implement, given the KCSAN only have 5 types of barriers that it could encode
about.

On the other hand, acquire/barriers are much easier to reason about (and they
already exist in C11/Rust memory models), and KCSAN already implements marked
release accesses as ONCE + release barrier (with acquire accesses being
implicit), so it probably doesn't even need changing.

Best,
Gary

>
>> +        //
>
> nit: stray //
>
>> +        dma_mb(Release);
>> +
>>          super::fw::gsp_mem::advance_cpu_read_ptr(&self.0, elem_count)
>>      }
>>  
>> @@ -422,6 +438,9 @@ fn cpu_write_ptr(&self) -> u32 {
>>  
>>      // Informs the GSP that it can process `elem_count` new pages from the command queue.
>>      fn advance_cpu_write_ptr(&mut self, elem_count: u32) {
>> +        // ORDERING: Ensure all command data is visible before updateing ring buffer pointer.
>> +        dma_mb(Write);
>> +
>>          super::fw::gsp_mem::advance_cpu_write_ptr(&self.0, elem_count)
>>      }
>>  }
>> diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
>> index 0c8a74f0e8ac..62c2cf1b030c 100644
>> --- a/drivers/gpu/nova-core/gsp/fw.rs
>> +++ b/drivers/gpu/nova-core/gsp/fw.rs
>> @@ -42,11 +42,6 @@
>>  
>>  // TODO: Replace with `IoView` projections once available.
>>  pub(super) mod gsp_mem {
>> -    use core::sync::atomic::{
>> -        fence,
>> -        Ordering, //
>> -    };
>> -
>>      use kernel::{
>>          dma::Coherent,
>>          dma_read,
>> @@ -72,10 +67,6 @@ pub(in crate::gsp) fn cpu_read_ptr(qs: &Coherent<GspMem>) -> u32 {
>>  
>>      pub(in crate::gsp) fn advance_cpu_read_ptr(qs: &Coherent<GspMem>, count: u32) {
>>          let rptr = cpu_read_ptr(qs).wrapping_add(count) % MSGQ_NUM_PAGES;
>> -
>> -        // Ensure read pointer is properly ordered.
>> -        fence(Ordering::SeqCst);
>> -
>>          dma_write!(qs, .cpuq.rx.0.readPtr, rptr);
>>      }
>>  
>> @@ -87,9 +78,6 @@ pub(in crate::gsp) fn advance_cpu_write_ptr(qs: &Coherent<GspMem>, count: u32) {
>>          let wptr = cpu_write_ptr(qs).wrapping_add(count) % MSGQ_NUM_PAGES;
>>  
>>          dma_write!(qs, .cpuq.tx.0.writePtr, wptr);
>> -
>> -        // Ensure all command data is visible before triggering the GSP read.
>> -        fence(Ordering::SeqCst);
>>      }
>>  }
>>  


^ permalink raw reply

* Re: [PATCH] gpu: nova-core: fb: make sure to unregister SysmemFlush on boot failure
From: Danilo Krummrich @ 2026-04-13 10:55 UTC (permalink / raw)
  To: Gary Guo, Eliot Courtney
  Cc: Alexandre Courbot, Alice Ryhl, David Airlie, Simona Vetter,
	John Hubbard, Alistair Popple, Joel Fernandes, Timur Tabi,
	rust-for-linux, dri-devel, linux-kernel
In-Reply-To: <DHPLKB40FM08.1L3H2S589WUW@garyguo.net>

On Fri Apr 10, 2026 at 5:57 PM CEST, Gary Guo wrote:
>> +impl Drop for SysmemFlush {
>> +    fn drop(&mut self) {
>> +        let _ = self.bar.try_access_with(|bar| self.unregister(bar));
>
> I feel that this is the wrong solution to the problem.

Yeah, it is pretty fragile as it relies on outer implementation details, such as
the fact that SysmemFlush is part of the device private data and hence
try_access_with() will fail when not dropped in an unwind path.

> The thing we want is to *ensure* that `SysmemFlush` Drop is called with device
> still being bound.
>
> It's not yet fully clear to me how we'd want to guarantee that, but one API that
> might make sense is to create a DevRes API that allows you to reference an
> existing `DevRes` and have driver-core making sure that the tear down happens in
> reverse order. So inside the `Drop` the `bar` can still be unconditionally
> access.

Yes, I have something like this on my list of things I want to look into for
while (my list entry calls it DevresChain).

I want to leverage the internal reference count of Devres for this, which is not
exactly straight forward, but should be possible.

I will prioritize this and have a look.

Thanks,
Danilo

^ permalink raw reply

* Re: [PATCH 3/5] gpu: nova-core: vbios: use checked accesses in `setup_falcon_data`
From: Alexandre Courbot @ 2026-04-13  7:10 UTC (permalink / raw)
  To: Eliot Courtney
  Cc: Joel Fernandes, Danilo Krummrich, Alice Ryhl, David Airlie,
	Simona Vetter, John Hubbard, Alistair Popple, Timur Tabi,
	rust-for-linux, dri-devel, linux-kernel
In-Reply-To: <DHRSTLTQNYTB.1ZQO8FBOU678Q@nvidia.com>

On Mon Apr 13, 2026 at 3:04 PM JST, Eliot Courtney wrote:
> On Fri Apr 10, 2026 at 11:53 PM JST, Joel Fernandes wrote:
>> Hi Eliot,
>>
>> On 4/10/2026 4:38 AM, Eliot Courtney wrote:
>>> Use checked arithmetic and accesses where the values are firmware
>>> derived to prevent potential overflow.
>>> 
>>> Fixes: dc70c6ae2441 ("gpu: nova-core: vbios: Add support to look up PMU table in FWSEC")
>>> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
>>> ---
>>>  drivers/gpu/nova-core/vbios.rs | 20 ++++++++------------
>>>  1 file changed, 8 insertions(+), 12 deletions(-)
>>> 
>>> diff --git a/drivers/gpu/nova-core/vbios.rs b/drivers/gpu/nova-core/vbios.rs
>>> index de856000de23..2b0dc1a9125d 100644
>>> --- a/drivers/gpu/nova-core/vbios.rs
>>> +++ b/drivers/gpu/nova-core/vbios.rs
>>> @@ -936,17 +936,12 @@ fn setup_falcon_data(
>>>  
>>>          self.falcon_data_offset = Some(offset);
>>>  
>>> -        if pmu_in_first_fwsec {
>>> -            self.pmu_lookup_table = Some(PmuLookupTable::new(
>>> -                &self.base.dev,
>>> -                &first_fwsec.base.data[offset..],
>>> -            )?);
>>> +        let pmu_lookup_data = if pmu_in_first_fwsec {
>>> +            &first_fwsec.base.data[offset..]
>>
>> I suggest use get() here as well for consistency with your use of get()
>> further below.
>>   first_fwsec.base.data.get(offset..).ok_or(EINVAL)?
>
> This one has a local proof that it won't ever OOB, so I didn't use
> get(). Not sure what the convention is, but what makes most sense to me
> is to use get() if there is no local proof that it will always succeed
> and use [] if there is such a proof. WDYT? Do you know if there's a
> decided convention for this?

Ideally we use the type system to maintain the proof that OOB cannot
happen - typically by calling `get` early and working with the returned
slice from then on. The problem with this code is that while there is a
local proof that OOB cannot occur *today*, there is no guarantee that
this proof won't be modified (and break the invariant we rely on) by
future code.

Looking at the code it looks like it deserves a larger refactor. We are
setting `pmu_in_first_fwsec` if the offset is valid for the first fwsec,
and modify `offset` if not. Then we check `pmu_in_first_fwsec` to get
the PMU lookup table from the right source. And after that neither
`pmu_in_first_fwsec` not `offset` are ever used again. So this looks
like this could be factored out into a single test (maybe a match on the
result of `get`?), where we simplify things further and don't mutate
variables. Things tend to fall into place with properly guaranteed
invariants when we do that.

^ permalink raw reply

* Re: [PATCH v3 1/1] rust: pci: add extended capability and SR-IOV support
From: Alexandre Courbot @ 2026-04-13  6:52 UTC (permalink / raw)
  To: Zhi Wang
  Cc: rust-for-linux, linux-pci, linux-kernel, dakr, aliceryhl,
	bhelgaas, kwilczynski, ojeda, boqun, gary, bjorn3_gh, lossin,
	a.hindborg, tmgross, markus.probst, cjia, smitra, ankita, aniketa,
	kwankhede, targupta, joelagnelf, jhubbard, kjaju, zhiwang
In-Reply-To: <20260409185254.3869808-2-zhiw@nvidia.com>

Hi Zhi,

On Fri Apr 10, 2026 at 3:52 AM JST, Zhi Wang wrote:
<snip>
> +/// An extended PCI capability that implements [`Io`].
> +///
> +/// # Examples
> +///
> +/// ```no_run
> +/// use kernel::pci::{
> +///     self,
> +///     ExtSriovCapability, //
> +/// };
> +/// use kernel::io::Io;
> +///
> +/// fn probe_sriov(pdev: &pci::Device<kernel::device::Core>) -> Result<(), kernel::error::Error> {
> +///     let config = pdev.config_space_extended()?;
> +///     let sriov = ExtSriovCapability::find(&config)?;
> +///
> +///     let total_vfs = kernel::io_read!(&sriov, .total_vfs);
> +///     let vf_offset = kernel::io_read!(&sriov, .vf_offset);
> +///     let bar0 = kernel::io_read!(&sriov, .vf_bar[0]);
> +///     kernel::io_write!(&sriov, .num_vfs, 4u16);
> +///     let bar0_64 = sriov.read_vf_bar64(0)?;
> +///
> +///     Ok(())
> +/// }
> +/// ```
> +///
> +/// # Invariants
> +///
> +/// `ptr` is within the device's extended configuration space at a valid
> +/// capability. For sized `T`, the region is at least `size_of::<T>()` bytes.
> +pub struct ExtCapability<'a, T: ?Sized + KnownSize = Region<0>> {
> +    config: &'a ConfigSpace<'a, Extended>,
> +    ptr: *mut T,
> +}

This strongly looks like this is reinventing `io::View`. :) Even the
internals look similar. Can you check whether `io::View` can replace
this type? This would remove all the macro business and impl blocks and
simplify this patch considerably.

> +
> +impl<T: ?Sized + KnownSize> Io for ExtCapability<'_, T> {
> +    type Type = T;
> +
> +    #[inline]
> +    fn as_ptr(&self) -> *mut T {
> +        self.ptr
> +    }
> +}
> +
> +macro_rules! impl_ext_cap_io_capable {
> +    ($ty:ty) => {
> +        impl<T: ?Sized + KnownSize> IoCapable<$ty> for ExtCapability<'_, T> {
> +            #[inline]
> +            unsafe fn io_read(&self, address: *mut $ty) -> $ty {
> +                // SAFETY: The caller guarantees `address` is within bounds of
> +                // this capability, which is within the config space.
> +                unsafe { self.config.io_read(address) }
> +            }
> +
> +            #[inline]
> +            unsafe fn io_write(&self, value: $ty, address: *mut $ty) {
> +                // SAFETY: The caller guarantees `address` is within bounds of
> +                // this capability, which is within the config space.
> +                unsafe { self.config.io_write(value, address) }
> +            }
> +        }
> +    };
> +}
> +
> +impl_ext_cap_io_capable!(u8);
> +impl_ext_cap_io_capable!(u16);
> +impl_ext_cap_io_capable!(u32);
> +
> +impl<'a> ExtCapability<'a> {
> +    /// Base offset of this capability in configuration space.
> +    #[inline]
> +    pub fn offset(&self) -> usize {
> +        self.ptr.addr()
> +    }
> +
> +    /// Size of this capability region in bytes.
> +    #[inline]
> +    pub fn size(&self) -> usize {
> +        KnownSize::size(self.ptr)
> +    }
> +
> +    /// Cast to a typed capability, checking that the region is large enough.
> +    pub fn cast_sized<U>(self) -> Result<ExtCapability<'a, U>> {

This allows for any cast, including invalid ones, as long as `U` fits.
While not unsafe, this is still incorrect - I don't see any reason to
make this public, which could be a way to mitigate this.

> +        if self.size() < core::mem::size_of::<U>() {
> +            return Err(EINVAL);
> +        }
> +
> +        // INVARIANT: `self` already satisfies the invariant (ptr is within extended config
> +        // space at a valid capability), and the size check above guarantees the region is at
> +        // least `size_of::<U>()` bytes.
> +        Ok(ExtCapability {
> +            config: self.config,
> +            ptr: core::ptr::without_provenance_mut(self.offset()),
> +        })
> +    }
> +}
> +
> +impl ConfigSpace<'_, Extended> {
> +    /// Finds an extended capability by ID, returning an untyped [`ExtCapability`].
> +    pub fn find_ext_capability(&self, cap: ExtCapId) -> Result<ExtCapability<'_>> {
> +        let offset = usize::from(
> +            // SAFETY: `self.pdev` is valid by the type invariant of `ConfigSpace`.
> +            unsafe {
> +                bindings::pci_find_ext_capability(self.pdev.as_raw(), i32::from(cap.as_raw()))
> +            },
> +        );
> +
> +        if offset == 0 {
> +            return Err(ENODEV);
> +        }
> +
> +        Ok(self.make_ext_capability(offset))
> +    }
> +
> +    /// Finds the next extended capability with `cap` after `start`.
> +    pub fn find_next_ext_capability(&self, start: u16, cap: ExtCapId) -> Result<ExtCapability<'_>> {

This `start` offset can be anything and potentially lead to invalid
results being returned (I don't know what the C binding does, but
assuming the worst for safety).

Listing capabilities should be done through an iterator as it provides
all their benefits to users. I understand we also want a `find` API to
look for a specific capability and that an iterator is not needed for
this, but if we end up providing a way to list them, it should be
through an iterator.

In any case, this method seems to be unused, so you can safely drop it
for now.

> +        let offset = usize::from(
> +            // SAFETY: `self.pdev` is valid by the type invariant of `ConfigSpace`.
> +            unsafe {
> +                bindings::pci_find_next_ext_capability(
> +                    self.pdev.as_raw(),
> +                    start,
> +                    i32::from(cap.as_raw()),
> +                )
> +            },
> +        );
> +
> +        if offset == 0 {
> +            return Err(ENODEV);
> +        }
> +
> +        Ok(self.make_ext_capability(offset))
> +    }
> +
> +    fn make_ext_capability(&self, offset: usize) -> ExtCapability<'_> {
> +        let size = self.calculate_ext_cap_size(offset);
> +
> +        let ptr = core::ptr::slice_from_raw_parts_mut::<u8>(
> +            core::ptr::without_provenance_mut(offset),
> +            size,
> +            // CAST: `Region<0>` is a DST like `[u8]`, so this pointer cast preserves metadata.
> +        ) as *mut Region<0>;
> +
> +        // INVARIANT: `offset` was returned by `pci_find_ext_capability` /

This method cannot make assumptions about the origin of its arguments
without being `unsafe`, as its correct behavior depends on the goodwill
of the caller. Given that we will drop `find_next_ext_capability`, the
code can be rolled into `find_ext_capability` which is now the only user.

> +        // `pci_find_next_ext_capability`, which guarantees it points to a valid capability
> +        // within the extended configuration space. `size` is bounded by the next capability
> +        // offset or the end of the configuration space.
> +        ExtCapability { config: self, ptr }
> +    }
> +
> +    fn calculate_ext_cap_size(&self, offset: usize) -> usize {

This method lacks documentation.

> +        let header = self.try_read32(offset).unwrap_or(0);
> +        // SAFETY: Pure bit manipulation, no preconditions.
> +        // CAST: The next-cap pointer is a 12-bit field (max 0xFFC), always fits in `usize`.
> +        let next_ptr = unsafe { bindings::pci_ext_cap_next(header) } as usize;
> +
> +        if next_ptr > offset {
> +            next_ptr - offset
> +        } else {
> +            KnownSize::size(self.as_ptr()) - offset
> +        }
> +    }
> +}
> +
> +/// SR-IOV register layout per PCIe spec (64 bytes starting at cap offset).
> +#[repr(C)]
> +pub struct ExtSriovRegs {
> +    /// Extended capability header.
> +    pub header: u32,
> +    /// SR-IOV capabilities.
> +    pub cap: u32,
> +    /// SR-IOV control.
> +    pub ctrl: u16,
> +    /// SR-IOV status.
> +    pub status: u16,
> +    /// Initial VFs.
> +    pub initial_vfs: u16,
> +    /// Total VFs.
> +    pub total_vfs: u16,
> +    /// Number of VFs.
> +    pub num_vfs: u16,
> +    /// Function dependency link.
> +    pub func_dep_link: u16,
> +    /// First VF offset.
> +    pub vf_offset: u16,
> +    /// VF stride.
> +    pub vf_stride: u16,
> +    _reserved: u16,
> +    /// VF device ID.
> +    pub vf_device_id: u16,
> +    /// Supported page sizes.
> +    pub supported_page_sizes: u32,
> +    /// System page size.
> +    pub system_page_size: u32,
> +    /// VF BARs (BAR0–BAR5).
> +    pub vf_bar: [u32; 6],
> +    /// VF migration state array offset.
> +    pub migration_state: u32,
> +}
> +
> +/// SR-IOV capability. See [`ExtCapability`] for usage.
> +pub type ExtSriovCapability<'a> = ExtCapability<'a, ExtSriovRegs>;
> +
> +impl ExtCapability<'_, ExtSriovRegs> {
> +    /// Find the SR-IOV capability, or `ENODEV` if not present.
> +    #[inline]
> +    pub fn find<'a>(
> +        config: &'a ConfigSpace<'_, Extended>,
> +    ) -> Result<ExtCapability<'a, ExtSriovRegs>> {
> +        config.find_ext_capability(ExtCapId::Sriov)?.cast_sized()
> +    }

This method looks like it belongs to `ConfigSpace`, and should be
generic. Something like

  impl<'a> ConfigSpace<'a, Extended> {
    pub fn find_ext_capability::<C: ExtCapability>(&self) 
      -> Result<io::View<...>>
  }

If we use `io::View` to return capabilities, we can recycle the
`ExtCapability` name for a trait that provides the information required
for `find_ext_capability` to work properly, like the associated constant
for the capability ID. In this patch, it would be implemented on
`ExtSriovRegs`, providing the projection of the view through `C`.

This also opens the way for a `find_capability` method that handles
normal capabilities and is available to both variants of `ConfigSpace`.
I am not saying this needs to be done in this patch though.

> +
> +    /// Reads a 64-bit VF BAR from two consecutive 32-bit slots.
> +    #[inline]
> +    pub fn read_vf_bar64(&self, bar_index: usize) -> Result<u64> {
> +        if bar_index >= 5 {

Why is this value hardcoded? If this is correct, let's make it a
constant with an informative name (and possibly a comment justifying its
value).

> +            return Err(EINVAL);
> +        }
> +        let low = crate::io_read!(self, .vf_bar[bar_index]?);
> +        let high = crate::io_read!(self, .vf_bar[bar_index + 1]?);

Don't we want to check whether the bar in question is actually a 64-bit BAR?

I wonder whether this also doesn't call for an iterator-based solution
returning an enum with the properly-sized BAR.

^ permalink raw reply

* Re: [PATCH 5/5] gpu: nova-core: vbios: use checked ops and accesses in `FwSecBiosImage::ucode`
From: Eliot Courtney @ 2026-04-13  6:20 UTC (permalink / raw)
  To: Joel Fernandes, Eliot Courtney, Danilo Krummrich, Alice Ryhl,
	Alexandre Courbot, David Airlie, Simona Vetter
  Cc: John Hubbard, Alistair Popple, Timur Tabi, rust-for-linux,
	dri-devel, linux-kernel
In-Reply-To: <edff8db4-a3a5-4b21-a8e6-9871206ddeb9@nvidia.com>

On Sat Apr 11, 2026 at 12:05 AM JST, Joel Fernandes wrote:
> Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com> , one comment below
>
> On 4/10/2026 4:38 AM, Eliot Courtney wrote:
>> Use checked arithmetic and access for extracting the microcode since the
>> offsets are firmware derived.
>> 
>> Fixes: 47c4846e4319 ("gpu: nova-core: vbios: Add support for FWSEC ucode extraction")
>> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
>> ---
>>  drivers/gpu/nova-core/vbios.rs | 19 ++++++++++++-------
>>  1 file changed, 12 insertions(+), 7 deletions(-)
>> 
>> diff --git a/drivers/gpu/nova-core/vbios.rs b/drivers/gpu/nova-core/vbios.rs
>> index 3bd3ac3a69f2..b509cd8407a5 100644
>> --- a/drivers/gpu/nova-core/vbios.rs
>> +++ b/drivers/gpu/nova-core/vbios.rs
>> @@ -1027,16 +1027,21 @@ pub(crate) fn header(&self) -> Result<FalconUCodeDesc> {
>>  
>>      /// Get the ucode data as a byte slice
>>      pub(crate) fn ucode(&self, desc: &FalconUCodeDesc) -> Result<&[u8]> {
>> -        let falcon_ucode_offset = self.falcon_ucode_offset;
>> -
>>          // The ucode data follows the descriptor.
>> -        let ucode_data_offset = falcon_ucode_offset + desc.size();
>> -        let size = usize::from_safe_cast(desc.imem_load_size() + desc.dmem_load_size());
>> +        let data = self
>> +            .base
>> +            .data
>> +            .get(self.falcon_ucode_offset..)
>> +            .ok_or(ERANGE)?;
>> +        let size = usize::from_safe_cast(
>> +            desc.imem_load_size()
>> +                .checked_add(desc.dmem_load_size())
>> +                .ok_or(ERANGE)?,
>> +        );
>>  
>>          // Get the data slice, checking bounds in a single operation.
>> -        self.base
>> -            .data
>> -            .get(ucode_data_offset..ucode_data_offset + size)
>> +        data.get(desc.size()..)
>> +            .and_then(|data| data.get(..size))
>
> It might be worth adding something like:
>
> data.get_slice(start, size) -> Result
>
> in R4L longer term if the data.get(start..).and_then(|data|
> data.get(..size)) pattern is common. It seems to be so in this series.

Yeah, this sounds reasonable although I think we'd need a name that
makes it more obvious it takes a size. I had a look at rust standard
library code and it seems a common pattern is to use split_at_checked
for the initial split. Unfortunately we can't use that yet I think since
it's in 1.80.0, but maybe we can use it very soon.

I had a look and it looks like there aren't really other existing
patterns like .get(start..).and_then(.. get(..size)) in the rust kernel
code. And in nova-core other usages of getting start..start+size slices
have local proofs that start+size doesn't overflow.

For the most part in other code hopefully we have local proofs that
start..start+size won't overflow and we can just .get(start..start+size).

^ permalink raw reply

* Re: [PATCH 3/5] gpu: nova-core: vbios: use checked accesses in `setup_falcon_data`
From: Eliot Courtney @ 2026-04-13  6:04 UTC (permalink / raw)
  To: Joel Fernandes, Eliot Courtney, Danilo Krummrich, Alice Ryhl,
	Alexandre Courbot, David Airlie, Simona Vetter
  Cc: John Hubbard, Alistair Popple, Timur Tabi, rust-for-linux,
	dri-devel, linux-kernel
In-Reply-To: <8bf30537-de54-4be0-a676-3e8aad6fb312@nvidia.com>

On Fri Apr 10, 2026 at 11:53 PM JST, Joel Fernandes wrote:
> Hi Eliot,
>
> On 4/10/2026 4:38 AM, Eliot Courtney wrote:
>> Use checked arithmetic and accesses where the values are firmware
>> derived to prevent potential overflow.
>> 
>> Fixes: dc70c6ae2441 ("gpu: nova-core: vbios: Add support to look up PMU table in FWSEC")
>> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
>> ---
>>  drivers/gpu/nova-core/vbios.rs | 20 ++++++++------------
>>  1 file changed, 8 insertions(+), 12 deletions(-)
>> 
>> diff --git a/drivers/gpu/nova-core/vbios.rs b/drivers/gpu/nova-core/vbios.rs
>> index de856000de23..2b0dc1a9125d 100644
>> --- a/drivers/gpu/nova-core/vbios.rs
>> +++ b/drivers/gpu/nova-core/vbios.rs
>> @@ -936,17 +936,12 @@ fn setup_falcon_data(
>>  
>>          self.falcon_data_offset = Some(offset);
>>  
>> -        if pmu_in_first_fwsec {
>> -            self.pmu_lookup_table = Some(PmuLookupTable::new(
>> -                &self.base.dev,
>> -                &first_fwsec.base.data[offset..],
>> -            )?);
>> +        let pmu_lookup_data = if pmu_in_first_fwsec {
>> +            &first_fwsec.base.data[offset..]
>
> I suggest use get() here as well for consistency with your use of get()
> further below.
>   first_fwsec.base.data.get(offset..).ok_or(EINVAL)?

This one has a local proof that it won't ever OOB, so I didn't use
get(). Not sure what the convention is, but what makes most sense to me
is to use get() if there is no local proof that it will always succeed
and use [] if there is such a proof. WDYT? Do you know if there's a
decided convention for this?

>
>>          } else {
>> -            self.pmu_lookup_table = Some(PmuLookupTable::new(
>> -                &self.base.dev,
>> -                &self.base.data[offset..],
>> -            )?);
>> -        }
>> +            self.base.data.get(offset..).ok_or(EINVAL)?
>> +        };
>> +        self.pmu_lookup_table = Some(PmuLookupTable::new(&self.base.dev, pmu_lookup_data)?);
>>  
>>          match self
>>              .pmu_lookup_table
>> @@ -955,8 +950,9 @@ fn setup_falcon_data(
>>              .find_entry_by_type(FALCON_UCODE_ENTRY_APPID_FWSEC_PROD)
>>          {
>>              Ok(entry) => {
>> -                let mut ucode_offset = usize::from_safe_cast(entry.data);
>> -                ucode_offset -= pci_at_image.base.data.len();
>> +                let mut ucode_offset = usize::from_safe_cast(entry.data)
>> +                    .checked_sub(pci_at_image.base.data.len())
>> +                    .ok_or(EINVAL)?;
>>                  if ucode_offset < first_fwsec.base.data.len() {
>>                      dev_err!(self.base.dev, "Falcon Ucode offset not in second Fwsec.\n");
>>                      return Err(EINVAL);
>
> How about replace this whole block with:
>
>   self.falcon_ucode_offset = Some(
>       usize::from_safe_cast(entry.data)
>           .checked_sub(pci_at_image.base.data.len())
>           .and_then(|o| o.checked_sub(first_fwsec.base.data.len()))
>           .ok_or(EINVAL)
>           .inspect_err(|_| {
>               dev_err!(self.base.dev,
>                        "Falcon Ucode offset not in second Fwsec.\n");
>           })?,
>   );
>
> That way, the error message also shows up when
> checked_sub(pci_at_image.base.data.len()) fails and it is a bit cleaner.

Yeah, this looks like a better way to do it. Thanks! Will apply.

>
> If you agree with both the above suggestions:
>
> Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com>
>
> thanks,
>
> --
> Joel Fernandes


^ permalink raw reply

* Re: [RFC v3 00/27] lib: Rust implementation of SPDM
From: Alistair Francis @ 2026-04-13  5:42 UTC (permalink / raw)
  To: djbw
  Cc: bhelgaas, lukas, rust-for-linux, akpm, linux-pci,
	Jonathan.Cameron, linux-cxl, linux-kernel, alex.gaynor,
	benno.lossin, boqun.feng, a.hindborg, gary, bjorn3_gh, tmgross,
	ojeda, wilfred.mallawa, aliceryhl, Alistair Francis
In-Reply-To: <CAKmqyKPyR165rACXRMJXgyUOp6OUdQqSsym-gfYVFnKLZ4ednQ@mail.gmail.com>

On Thu, Apr 9, 2026 at 1:39 PM Alistair Francis <alistair23@gmail.com> wrote:
>
> On Thu, Feb 12, 2026 at 3:56 PM <dan.j.williams@intel.com> wrote:
> >
> > alistair23@ wrote:
> > > From: Alistair Francis <alistair.francis@wdc.com>
> >
> > Hi Alistair, quite a bit to digest here and details to dig into. Before
> > getting into that, I will say that at a broad strokes level, no immune
> > response to the core proposal of depending on a Rust SPDM library and
> > forgoing a C SPDM library.
> >
> > Most of that allergy relief comes from how this organizes the C to
> > Rust interactions. The core SPDM implementation calls out to C for the
> > presentation layer (Netlink) or is invoked by sysfs. That makes it
> > amenable for sharing those presentation mechanics.
> >
> > Specifically, my primary concern is integration and refactoring
> > opportunities with the PCI TSM implementation [1]. The PCI TSM case
> > should use the same uABI transport for requesting + conveying device
> > certificate chain, SPDM transcript, and device measurements as PCI CMA.
> > Note that in the TSM case the SPDM implementation is in platform
> > firmware and will bypass this library. The TSM SPDM session is mutually
> > exclusive with the CMA SPDM session.
> >
> > [1]: http://lore.kernel.org/69339e215b09f_1e0210057@dwillia2-mobl4.notmuch
> >
> > > Security Protocols and Data Models (SPDM) [1] is used for authentication,
> > > attestation and key exchange. SPDM is generally used over a range of
> > > transports, such as PCIe, MCTP/SMBus/I3C, ATA, SCSI, NVMe or TCP.
> > >
> > > From the kernels perspective SPDM is used to authenticate and attest devices.
> > > In this threat model a device is considered untrusted until it can be verified
> > > by the kernel and userspace using SPDM. As such SPDM data is untrusted data
> > > that can be mallicious.
> > >
> > > The SPDM specification is also complex, with the 1.2.1 spec being almost 200
> > > pages and the 1.3.0 spec being almost 250 pages long.
> > >
> > > As such we have the kernel parsing untrusted responses from a complex
> > > specification, which sounds like a possible exploit vector. This is the type
> > > of place where Rust excels!
> > >
> > > This series implements a SPDM requester in Rust.
> > >
> > > This is very similar to Lukas' implementation [2]. This series includes patches
> > > and files from Lukas' C SPDM implementation, which isn't in mainline.
> > >
> > > This is a standalone series and doesn't depend on Lukas' implementation.
> >
> > So, I *am* allergic to how this series references Lukas' work by
> > pointing to random points in his personal git tree. I trust that was
> > done for RFC purposes, but it would have helped to call that out in the
> > changelog and set expectations about the ideal path to coordinate with
> > that work.
> >
> > > To help with maintaining compatibility it's designed in a way to match Lukas'
> > > design and the state struct stores the same information, although in a Rust
> > > struct instead of the original C one.
> > >
> > > This series exposes the data to userspace via netlink, with a single sysfs
> > > atrribute to allow reauthentication.
> > >
> > > All of the patches are included in the RFC, as it depends on some patches
> > > that aren't upstream yet.
> > >
> > > Now that Rust is no longer experimental I have picked this back up. If the
> > > community is generally on board with a Rust implementation I can work on
> > > sending a non-RFC version and push towards getting that merged.
> >
> > As long as this stays explicitly designed to minimize exposure to
> > "refactor across language boundary" events (as initially seems to be the
> > case), then it seems workable.
> >
> > > The entire tree can be seen here: https://github.com/alistair23/linux/tree/alistair/spdm-rust
> > >
> > > I'm testing the netlink data by running the following
> > >
> > > ```shell
> > > cargo run -- --qemu-server response
> > >
> > > qemu-system-x86_64 \
> > >   -nic none \
> > >   -object rng-random,filename=/dev/urandom,id=rng0 \
> > >   -device virtio-rng-pci,rng=rng0 \
> > >   -drive file=deploy/images/qemux86-64/core-image-pcie-qemux86-64.rootfs.ext4,if=virtio,format=raw \
> > >   -usb -device usb-tablet -usb -device usb-kbd \
> > >   -cpu Skylake-Client \
> > >   -machine q35,i8042=off \
> > >   -smp 4 -m 2G \
> > >   -drive file=blknvme,if=none,id=mynvme,format=raw \
> > >   -device nvme,drive=mynvme,serial=deadbeef,spdm_port=2323,spdm_trans=doe \
> > >   -snapshot \
> > >   -serial mon:stdio -serial null -nographic \
> > >   -kernel deploy/images/qemux86-64/bzImage \
> > >   -append 'root=/dev/vda rw  console=ttyS0 console=ttyS1 oprofile.timer=1 tsc=reliable no_timer_check rcupdate.rcu_expedited=1 swiotlb=0 '
> > >
> > > spdm_utils identify &
> > > sleep 1
> > > echo re > /sys/devices/pci0000:00/0000:00:03.0/authenticated
> >
> > So this is where it will collide with TSM that also emits an
> > authenticated attribute. See Documentation/ABI/testing/sysfs-bus-pci.
> >
> > The rough plan Lukas and I worked out is that switching between TSM and
> > CMA based authentication would use sysfs visibility to coordinate. I.e.
> > TSM to CMA conversion hides the TSM "authenticated" attribute and
> > unhides the CMA attribute of the same name.
>
> That seems straightforward and is already documented upstream as well,
> so that's pretty easy.
>
> >
> > The most significant unsolved point of contention between TSM and CMA is
> > the policy on when authentication is mandated and the driver probe
> > policy. The proposed model for enforcing device security for
> > Confidential Computing is make it completely amenable to userspace
> > policy. Draft details here [2] to be refreshed "soon" when I send out
> > the next version of that.
> >
> > [2]: http://lore.kernel.org/20250827035259.1356758-6-dan.j.williams@intel.com
>
> CMA will eventually need to support some sort of drive probe policy as
> well, but that can wait until later and isn't going to be dealt with
> in this series.
>
> >
> > To be clear I am ok if there is an incremental option to have auto_cma
> > and/or auto_tsm that arranges for authentication or link encryption to
> > happen without asking. I take issue with auto_cma being the only hard
> > coded option.
>
> I have been working through all of the comments and discussions and I
> think I have addressed everything, except for this one.
>
> To summerise, is the high level issue is how do we know if we should
> use CMA or TSM?
>
> Do you have any more thoughts on this?

+ Dan's Kernel.org email

Alistair

^ permalink raw reply

* Re: [PATCH 1/3] rust: sync: add helpers for mb, dma_mb and friends
From: Eliot Courtney @ 2026-04-13  5:33 UTC (permalink / raw)
  To: Gary Guo, Miguel Ojeda, Boqun Feng, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alice Ryhl, Trevor Gross,
	Danilo Krummrich
  Cc: Alan Stern, Andrea Parri, Will Deacon, Peter Zijlstra,
	Nicholas Piggin, David Howells, Jade Alglave, Luc Maranget,
	Paul E. McKenney, Akira Yokosawa, Daniel Lustig, Joel Fernandes,
	rust-for-linux, nouveau, linux-kernel, linux-arch, lkmm
In-Reply-To: <20260402152443.1059634-3-gary@kernel.org>

On Fri Apr 3, 2026 at 12:24 AM JST, Gary Guo wrote:
> From: Gary Guo <gary@garyguo.net>
>
> They supplement the existing smp_mb, smp_rmb and smp_wmb.
>
> Signed-off-by: Gary Guo <gary@garyguo.net>
> ---

Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>

^ permalink raw reply

* Re: [PATCH 3/3] gpu: nova-core: fix wrong use of barriers in GSP code
From: Eliot Courtney @ 2026-04-13  5:33 UTC (permalink / raw)
  To: Gary Guo, Miguel Ojeda, Boqun Feng, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alice Ryhl, Trevor Gross,
	Danilo Krummrich, Alexandre Courbot, David Airlie, Simona Vetter
  Cc: Alan Stern, Andrea Parri, Will Deacon, Peter Zijlstra,
	Nicholas Piggin, David Howells, Jade Alglave, Luc Maranget,
	Paul E. McKenney, Akira Yokosawa, Daniel Lustig, Joel Fernandes,
	rust-for-linux, nouveau, linux-kernel, linux-arch, lkmm,
	dri-devel
In-Reply-To: <20260402152443.1059634-5-gary@kernel.org>

On Fri Apr 3, 2026 at 12:24 AM JST, Gary Guo wrote:
> From: Gary Guo <gary@garyguo.net>
>
> Currently, in the GSP->CPU messaging path, the current code misses a read
> barrier before data read. The barrier after read is updated to a DMA
> barrier (with release ordering desired), instead of the existing (Rust)
> SeqCst SMP barrier; the location of barrier is also moved to the beginning
> of function, because the barrier is needed to synchronizing between data
> and ring-buffer pointer, the RMW operation does not internally need a
> barrier (nor it has to be atomic, as CPU pointers are updated by CPU only).
>
> In the CPU->GSP messaging path, the current code misses a write barrier
> after data write and before updating the CPU write pointer. Barrier is not
> needed before data write due to control dependency, this fact is documented
> explicitly. This could be replaced with an acquire barrier if needed.
>
> Signed-off-by: Gary Guo <gary@garyguo.net>

nit: should this have
Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")

?

> ---
>  drivers/gpu/nova-core/gsp/cmdq.rs | 19 +++++++++++++++++++
>  drivers/gpu/nova-core/gsp/fw.rs   | 12 ------------
>  2 files changed, 19 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
> index 2224896ccc89..7e4315b13984 100644
> --- a/drivers/gpu/nova-core/gsp/cmdq.rs
> +++ b/drivers/gpu/nova-core/gsp/cmdq.rs
> @@ -19,6 +19,12 @@
>      prelude::*,
>      sync::{
>          aref::ARef,
> +        barrier::{
> +            dma_mb,
> +            Read,
> +            Release,
> +            Write, //
> +        },
>          Mutex, //
>      },
>      time::Delta,
> @@ -258,6 +264,9 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
>          let tx = self.cpu_write_ptr() as usize;
>          let rx = self.gsp_read_ptr() as usize;
>  
> +        // ORDERING: control dependency provides necessary LOAD->STORE ordering.
> +        // `dma_mb(Acquire)` may be used here if we don't want to rely on control dependency.
> +
>          // SAFETY:
>          // - We will only access the driver-owned part of the shared memory.
>          // - Per the safety statement of the function, no concurrent access will be performed.
> @@ -311,6 +320,9 @@ fn driver_write_area_size(&self) -> usize {
>          let tx = self.gsp_write_ptr() as usize;
>          let rx = self.cpu_read_ptr() as usize;
>  
> +        // ORDERING: Ensure data load is ordered after load of GSP write pointer.
> +        dma_mb(Read);
> +
>          // SAFETY:
>          // - We will only access the driver-owned part of the shared memory.
>          // - Per the safety statement of the function, no concurrent access will be performed.
> @@ -408,6 +420,10 @@ fn cpu_read_ptr(&self) -> u32 {
>  
>      // Informs the GSP that it can send `elem_count` new pages into the message queue.
>      fn advance_cpu_read_ptr(&mut self, elem_count: u32) {
> +        // ORDERING: Ensure read pointer is properly ordered.

What about a more specific comment that describes exactly what is
ordered, e.g. something like:
Ensure all reads of message data by the CPU have completed before writing
the updated read pointer to the GSP, since it may overwrite that data.

Maybe this is just me but it's a lot easier for me to think of the
orderings as a pair of (load? store? -> load? store?) which works for
everything hw actually supports except for ll+ls+ss, rather than mapping
'Release' to (load+store -> store) in my head. e.g. here IIUC we need to
make sure all loads by the CPU are done before we do the store for the
pointer, so we need to make sure loads don't cross ahead of this
barrier but also that stores don't cross behind it, so (load -> store)
should be sufficient? So, depending on what you want to do with the
memory model, this could be tightened IMO. Unlike the one below that
only needs to order stores with eachother (ss).

> +        //

nit: stray //

> +        dma_mb(Release);
> +
>          super::fw::gsp_mem::advance_cpu_read_ptr(&self.0, elem_count)
>      }
>  
> @@ -422,6 +438,9 @@ fn cpu_write_ptr(&self) -> u32 {
>  
>      // Informs the GSP that it can process `elem_count` new pages from the command queue.
>      fn advance_cpu_write_ptr(&mut self, elem_count: u32) {
> +        // ORDERING: Ensure all command data is visible before updateing ring buffer pointer.
> +        dma_mb(Write);
> +
>          super::fw::gsp_mem::advance_cpu_write_ptr(&self.0, elem_count)
>      }
>  }
> diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
> index 0c8a74f0e8ac..62c2cf1b030c 100644
> --- a/drivers/gpu/nova-core/gsp/fw.rs
> +++ b/drivers/gpu/nova-core/gsp/fw.rs
> @@ -42,11 +42,6 @@
>  
>  // TODO: Replace with `IoView` projections once available.
>  pub(super) mod gsp_mem {
> -    use core::sync::atomic::{
> -        fence,
> -        Ordering, //
> -    };
> -
>      use kernel::{
>          dma::Coherent,
>          dma_read,
> @@ -72,10 +67,6 @@ pub(in crate::gsp) fn cpu_read_ptr(qs: &Coherent<GspMem>) -> u32 {
>  
>      pub(in crate::gsp) fn advance_cpu_read_ptr(qs: &Coherent<GspMem>, count: u32) {
>          let rptr = cpu_read_ptr(qs).wrapping_add(count) % MSGQ_NUM_PAGES;
> -
> -        // Ensure read pointer is properly ordered.
> -        fence(Ordering::SeqCst);
> -
>          dma_write!(qs, .cpuq.rx.0.readPtr, rptr);
>      }
>  
> @@ -87,9 +78,6 @@ pub(in crate::gsp) fn advance_cpu_write_ptr(qs: &Coherent<GspMem>, count: u32) {
>          let wptr = cpu_write_ptr(qs).wrapping_add(count) % MSGQ_NUM_PAGES;
>  
>          dma_write!(qs, .cpuq.tx.0.writePtr, wptr);
> -
> -        // Ensure all command data is visible before triggering the GSP read.
> -        fence(Ordering::SeqCst);
>      }
>  }
>  


^ permalink raw reply

* Re: [PATCH 5/6] gpu: nova-core: skip the IFR header if present
From: Eliot Courtney @ 2026-04-13  4:53 UTC (permalink / raw)
  To: Timur Tabi, Danilo Krummrich, Alexandre Courbot, Joel Fernandes,
	Eliot Courtney, John Hubbard, rust-for-linux
In-Reply-To: <20260410203722.1586938-6-ttabi@nvidia.com>

On Sat Apr 11, 2026 at 5:37 AM JST, Timur Tabi wrote:
> The GPU's ROM may begin with an Init-from-ROM (IFR) header that precedes
> the PCI Expansion ROM images (VBIOS).  When present, the PROM shadow
> method must parse this header to determine the offset where the PCI ROM
> images actually begin, and adjust all subsequent reads accordingly.
>
> On most GPUs this is not needed because the IFR microcode has already
> applied the ROM offset so that PROM reads transparently skip the header.
> On GA100, for whatever reason, the IFR offset is not applied to PROM
> reads.  Therefore, the search for the PCI expansion must skip the IFR
> header, if found.

I like this commit message a lot because it made it really easy for me
to understand why we are doing this. Could we concisely put a comment in
the code as well explaining why it's necessary?

>
> Signed-off-by: Timur Tabi <ttabi@nvidia.com>
> ---
>  drivers/gpu/nova-core/vbios.rs | 39 +++++++++++++++++++++++++++++++++-
>  1 file changed, 38 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/nova-core/vbios.rs b/drivers/gpu/nova-core/vbios.rs
> index e726594eb130..a7ec68448a4a 100644
> --- a/drivers/gpu/nova-core/vbios.rs
> +++ b/drivers/gpu/nova-core/vbios.rs
> @@ -89,13 +89,50 @@ struct VbiosIterator<'a> {
>      last_found: bool,
>  }
>  
> +/// IFR signature: ASCII "NVGI" as a little-endian u32.
> +const IFR_SIGNATURE: u32 = 0x4947564E;
> +/// ROM directory signature: ASCII "RFRD" as a little-endian u32.
> +const ROM_DIRECTORY_SIGNATURE: u32 = 0x44524652;

I wonder if we can use the bindings for some of these constants,
including the fixed value offsets below. I found these values in OpenRM
in dev_bus.h, for example (but it was in tu102 so even though the values
are the same not sure if it's semantically correct):

```
#define NV_PBUS_IFR_FMT_FIXED0                        0x00000000 /*       */
#define NV_PBUS_IFR_FMT_FIXED0_SIGNATURE                    31:0 /*       */
#define NV_PBUS_IFR_FMT_FIXED0_SIGNATURE_VALUE        0x4947564E /*       */
#define NV_PBUS_IFR_FMT_FIXED1                        0x00000004 /*       */
#define NV_PBUS_IFR_FMT_FIXED1_VERSIONSW                    15:8 /*       */
#define NV_PBUS_IFR_FMT_FIXED1_FIXED_DATA_SIZE             30:16 /*       */
#define NV_PBUS_IFR_FMT_FIXED2                        0x00000008 /*       */
#define NV_PBUS_IFR_FMT_FIXED2_TOTAL_DATA_SIZE              19:0 /*       */
```

Unfortunately IIUC we can't use the bitfield range specifies via
bindgen, but the FIXEDX offsets and some of the signature values might
be doable via bindings. WDYT?

> +
>  impl<'a> VbiosIterator<'a> {
>      fn new(dev: &'a device::Device, bar0: &'a Bar0) -> Result<Self> {
> +        let sig = bar0.try_read32(ROM_OFFSET)?;
> +
> +        let current_offset = if sig == IFR_SIGNATURE {
> +            let fixed1 = bar0.try_read32(ROM_OFFSET + 4)?;
> +            let version = ((fixed1 >> 8) & 0xff) as u8;
> +
> +            match version {
> +                // Note: We do not actually expect to see v1 or v2 on these GPUs
> +                1 | 2 => {
> +                    let fixed_data_size = ((fixed1 >> 16) & 0x7fff) as usize;
> +                    bar0.try_read32(ROM_OFFSET + fixed_data_size + 4)? as usize

Should this be an error instead if we don't expect to see it?

> +                }
> +                3 => {
> +                    let fixed2 = bar0.try_read32(ROM_OFFSET + 8)?;
> +                    let total_data_size = (fixed2 & 0x000f_ffff) as usize;

What about using the bitfield! macro to define bitfields for these
things? e.g.

```
bitfield! {
    struct IfrFixed1(u32) {
        15:8 version as u8;
        30:16 fixed_data_size as u16;
    }
}

bitfield! {
    struct IfrFixed2(u32) {
        19:0 total_data_size as u32;
    }
}
```

> +                    let dir_offset = bar0.try_read32(ROM_OFFSET + total_data_size)? as usize + 4096;

What about making an inline constant for 4096?

> +                    let dir_sig = bar0.try_read32(ROM_OFFSET + dir_offset)?;
> +                    if dir_sig != ROM_DIRECTORY_SIGNATURE {
> +                        dev_err!(dev, "could not find IFR ROM directory\n");
> +                        return Err(EINVAL);
> +                    }
> +                    bar0.try_read32(ROM_OFFSET + dir_offset + 8)? as usize
> +                }
> +                _ => {
> +                    dev_err!(dev, "unsupported IFR header version {}\n", version);
> +                    return Err(EINVAL);
> +                }
> +            }
> +        } else {
> +            0
> +        };
> +
>          Ok(Self {
>              dev,
>              bar0,
>              data: KVec::new(),
> -            current_offset: 0,
> +            current_offset,
>              last_found: false,
>          })
>      }


^ permalink raw reply

* Re: [PATCH 6/6] gpu: nova-core: enable GA100
From: Eliot Courtney @ 2026-04-13  4:20 UTC (permalink / raw)
  To: Timur Tabi, Danilo Krummrich, Alexandre Courbot, Joel Fernandes,
	Eliot Courtney, John Hubbard, rust-for-linux
In-Reply-To: <20260410203722.1586938-7-ttabi@nvidia.com>

On Sat Apr 11, 2026 at 5:37 AM JST, Timur Tabi wrote:
> GA100 is a compute-only variant of GA102 that boots GSP-RM like a
> Turing, although it also has its own unique requirements.
>
> Now that all the pieces are in place, we can enable GA100 support.
> Although architecturally like an Ampere, GA100 uses the same GSP-RM
> firmware files as Turing, and therefore must boot it like Turing does.
> However, as a compute-only part, GA100 has no display engine.
>
> Signed-off-by: Timur Tabi <ttabi@nvidia.com>
> ---
>  drivers/gpu/nova-core/falcon/hal.rs | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/nova-core/falcon/hal.rs b/drivers/gpu/nova-core/falcon/hal.rs
> index a7e5ea8d0272..5c2f92629eec 100644
> --- a/drivers/gpu/nova-core/falcon/hal.rs
> +++ b/drivers/gpu/nova-core/falcon/hal.rs
> @@ -77,13 +77,12 @@ pub(super) fn falcon_hal<E: FalconEngine + 'static>(
>      use Chipset::*;
>  
>      let hal = match chipset {
> -        TU102 | TU104 | TU106 | TU116 | TU117 => {
> +        TU102 | TU104 | TU106 | TU116 | TU117 | GA100 => {
>              KBox::new(tu102::Tu102::<E>::new(), GFP_KERNEL)? as KBox<dyn FalconHal<E>>
>          }
>          GA102 | GA103 | GA104 | GA106 | GA107 | AD102 | AD103 | AD104 | AD106 | AD107 => {
>              KBox::new(ga102::Ga102::<E>::new(), GFP_KERNEL)? as KBox<dyn FalconHal<E>>
>          }
> -        _ => return Err(ENOTSUPP),
>      };
>  
>      Ok(hal)

Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>

^ permalink raw reply

* Re: [PATCH 3/6] gpu: nova-core: only boot FRTS if it actually exists
From: Eliot Courtney @ 2026-04-13  4:19 UTC (permalink / raw)
  To: Timur Tabi, Danilo Krummrich, Alexandre Courbot, Joel Fernandes,
	Eliot Courtney, John Hubbard, rust-for-linux
In-Reply-To: <20260410203722.1586938-4-ttabi@nvidia.com>

On Sat Apr 11, 2026 at 5:37 AM JST, Timur Tabi wrote:
> On some Nvidia GPUs (i.e. GA100), the FRTS region is not allocated
> because there is no display hardware.  If the region does not exist,
> then FWSEC-FRTS should not be run.
>
> Signed-off-by: Timur Tabi <ttabi@nvidia.com>
> ---
>  drivers/gpu/nova-core/gsp/boot.rs | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
> index 18f356c9178e..d50dc9f41831 100644
> --- a/drivers/gpu/nova-core/gsp/boot.rs
> +++ b/drivers/gpu/nova-core/gsp/boot.rs
> @@ -155,7 +155,10 @@ pub(crate) fn boot(
>          let fb_layout = FbLayout::new(chipset, bar, &gsp_fw)?;
>          dev_dbg!(dev, "{:#x?}\n", fb_layout);
>  
> -        Self::run_fwsec_frts(dev, chipset, gsp_falcon, bar, &bios, &fb_layout)?;
> +        // Only boot FRTS if it actually exists.
> +        if !fb_layout.frts.is_empty() {
> +            Self::run_fwsec_frts(dev, chipset, gsp_falcon, bar, &bios, &fb_layout)?;
> +        }
>  
>          let booter_loader = BooterFirmware::new(
>              dev,

If FRTS is optional, WDYT about making fb_layout.frts an Option<FbRange>
and having the frts_size hal method return Option<NonZeroUsize> for
example? I think it will make it clearer in other locations that use
FRTS that it possibly might not exist.

^ permalink raw reply

* Re: [PATCH 2/6] gpu: nova-core: do not consider 0xBB77 as a valid PCI ROM header signature
From: Eliot Courtney @ 2026-04-13  4:11 UTC (permalink / raw)
  To: Timur Tabi, Danilo Krummrich, Alexandre Courbot, Joel Fernandes,
	Eliot Courtney, John Hubbard, rust-for-linux
In-Reply-To: <20260410203722.1586938-3-ttabi@nvidia.com>

On Sat Apr 11, 2026 at 5:37 AM JST, Timur Tabi wrote:
> Nvidia GPUs have some PCI expansion ROM sections that have an Nvidia-
> specific signature instead of 0xAA55.  Signature 0xBB77 is actually an
> internal-only value that has been deprecated for over a decade.
> Nova-core will never encounter a GPU with that signature, so don't look
> for it.
>
> Signed-off-by: Timur Tabi <ttabi@nvidia.com>
> ---
>  drivers/gpu/nova-core/vbios.rs | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/nova-core/vbios.rs b/drivers/gpu/nova-core/vbios.rs
> index ebda28e596c5..e726594eb130 100644
> --- a/drivers/gpu/nova-core/vbios.rs
> +++ b/drivers/gpu/nova-core/vbios.rs
> @@ -491,7 +491,7 @@ fn new(dev: &device::Device, data: &[u8]) -> Result<Self> {
>  
>          // Check for valid ROM signatures.
>          match signature {
> -            0xAA55 | 0xBB77 | 0x4E56 => {}
> +            0xAA55 | 0x4E56 => {}
>              _ => {
>                  dev_err!(dev, "ROM signature unknown {:#x}\n", signature);
>                  return Err(EINVAL);

Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>

^ permalink raw reply

* Re: [PATCH 1/6] gpu: nova-core: use correct fwsignature for GA100
From: Eliot Courtney @ 2026-04-13  4:10 UTC (permalink / raw)
  To: Timur Tabi, Danilo Krummrich, Alexandre Courbot, Joel Fernandes,
	Eliot Courtney, John Hubbard, rust-for-linux
In-Reply-To: <20260410203722.1586938-2-ttabi@nvidia.com>

On Sat Apr 11, 2026 at 5:37 AM JST, Timur Tabi wrote:
> Although GA100 uses the same GSP-RM firmware as Turing, it has a different
> signature specifically for it.
>
> Fixes: 121ea04cd9f2 ("gpu: nova-core: add support for Turing/GA100 fwsignature")
> Signed-off-by: Timur Tabi <ttabi@nvidia.com>
> ---
>  drivers/gpu/nova-core/firmware/gsp.rs | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/nova-core/firmware/gsp.rs b/drivers/gpu/nova-core/firmware/gsp.rs
> index 2fcc255c3bc8..63e464788c0b 100644
> --- a/drivers/gpu/nova-core/firmware/gsp.rs
> +++ b/drivers/gpu/nova-core/firmware/gsp.rs
> @@ -139,7 +139,7 @@ pub(crate) fn new<'a>(
>                          }
>                          Architecture::Turing => ".fwsignature_tu10x",
>                          // GA100 uses the same firmware as Turing
> -                        Architecture::Ampere if chipset == Chipset::GA100 => ".fwsignature_tu10x",
> +                        Architecture::Ampere if chipset == Chipset::GA100 => ".fwsignature_ga100",
>                          Architecture::Ampere => ".fwsignature_ga10x",
>                          Architecture::Ada => ".fwsignature_ad10x",
>                      };

Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>


^ permalink raw reply

* Re: [PATCH v4 1/4] rust: devres: return reference in `devres::register`
From: Viresh Kumar @ 2026-04-13  3:58 UTC (permalink / raw)
  To: markus.probst
  Cc: Rob Herring, Greg Kroah-Hartman, Jiri Slaby, Miguel Ojeda,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Alice Ryhl, Trevor Gross, Danilo Krummrich, Kari Argillander,
	Rafael J. Wysocki, Boqun Feng, David Airlie, Simona Vetter,
	linux-serial, linux-kernel, rust-for-linux, linux-pm, driver-core,
	dri-devel
In-Reply-To: <20260411-rust_serdev-v4-1-845e960c6627@posteo.de>

On 11-04-26, 17:10, Markus Probst via B4 Relay wrote:
> From: Markus Probst <markus.probst@posteo.de>
> 
> Return the reference to the initialized data in the `devres::register`
> function.
> 
> This is needed in a following commit (rust: add basic serial device bus
> abstractions).
> 
> Signed-off-by: Markus Probst <markus.probst@posteo.de>
> ---
>  rust/kernel/cpufreq.rs    |  3 ++-
>  rust/kernel/devres.rs     | 15 +++++++++++++--
>  rust/kernel/drm/driver.rs |  3 ++-
>  3 files changed, 17 insertions(+), 4 deletions(-)
> 
> diff --git a/rust/kernel/cpufreq.rs b/rust/kernel/cpufreq.rs
> index f5adee48d40c..31bf7e685097 100644
> --- a/rust/kernel/cpufreq.rs
> +++ b/rust/kernel/cpufreq.rs
> @@ -1052,7 +1052,8 @@ pub fn new_foreign_owned(dev: &Device<Bound>) -> Result
>      where
>          T: 'static,
>      {
> -        devres::register(dev, Self::new()?, GFP_KERNEL)
> +        devres::register(dev, Self::new()?, GFP_KERNEL)?;
> +        Ok(())
>      }
>  }

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

^ permalink raw reply

* Re: [PATCH v2 1/3] rust: extract `bitfield!` macro from `register!`
From: Eliot Courtney @ 2026-04-13  2:29 UTC (permalink / raw)
  To: Alexandre Courbot, Joel Fernandes, Yury Norov, Miguel Ojeda,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, Danilo Krummrich,
	Daniel Almeida, David Airlie, Simona Vetter
  Cc: John Hubbard, Alistair Popple, Timur Tabi, Zhi Wang,
	Eliot Courtney, linux-kernel, rust-for-linux, driver-core,
	dri-devel
In-Reply-To: <20260409-bitfield-v2-1-23ac400071cb@nvidia.com>

On Thu Apr 9, 2026 at 11:58 PM JST, Alexandre Courbot wrote:
> Extract the bitfield-defining part of the `register!` macro into an
> independent macro used to define bitfield types with bounds-checked
> accessors.
>
> Each field is represented as a `Bounded` of the appropriate bit width,
> ensuring field values are never silently truncated.
>
> Fields can optionally be converted to/from custom types, either fallibly
> or infallibly.
>
> Appropriate documentation is also added, and a MAINTAINERS entry created
> for the new module.
>
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> ---
>  MAINTAINERS                |   8 +
>  rust/kernel/bitfield.rs    | 491 +++++++++++++++++++++++++++++++++++++++++++++
>  rust/kernel/io/register.rs | 246 +----------------------
>  rust/kernel/lib.rs         |   1 +
>  4 files changed, 502 insertions(+), 244 deletions(-)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index b01791963e25..77f2617ade5d 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -23186,6 +23186,14 @@ F:	scripts/*rust*
>  F:	tools/testing/selftests/rust/
>  K:	\b(?i:rust)\b
>  
> +RUST [BITFIELD]
> +M:	Alexandre Courbot <acourbot@nvidia.com>
> +M:	Joel Fernandes <joelagnelf@nvidia.com>
> +R:	Yury Norov <yury.norov@gmail.com>
> +L:	rust-for-linux@vger.kernel.org
> +S:	Maintained
> +F:	rust/kernel/bitfield.rs
> +
>  RUST [ALLOC]
>  M:	Danilo Krummrich <dakr@kernel.org>
>  R:	Lorenzo Stoakes <ljs@kernel.org>

nit: Should this be kept in alphabetical order (e.g. with ALLOC here
below?).

> diff --git a/rust/kernel/bitfield.rs b/rust/kernel/bitfield.rs
> new file mode 100644
> index 000000000000..f5948eec8a76
> --- /dev/null
> +++ b/rust/kernel/bitfield.rs
> @@ -0,0 +1,491 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! Support for defining bitfields as Rust structures.
> +//!
> +//! The [`bitfield!`](kernel::bitfield!) macro declares integer types that are split into distinct
> +//! bit fields of arbitrary length. Each field is typed using [`Bounded`](kernel::num::Bounded) to
> +//! ensure values are properly validated and to avoid implicit data loss.
> +//!
> +//! # Example
> +//!
> +//! ```rust
> +//! use kernel::bitfield;
> +//! use kernel::num::Bounded;
> +//!
> +//! bitfield! {
> +//!     pub struct Rgb(u16) {
> +//!         15:11 blue;
> +//!         10:5 green;
> +//!         4:0 red;
> +//!     }
> +//! }
> +//!
> +//! // Valid value for the `blue` field.
> +//! let blue = Bounded::<u16, 5>::new::<0x18>();
> +//!
> +//! // Setters can be chained. Values ranges are checked at compile-time.
> +//! let color = Rgb::zeroed()
> +//!     // Compile-time bounds check of constant value.
> +//!     .with_const_red::<0x10>()
> +//!     .with_const_green::<0x1f>()
> +//!     // A `Bounded` can also be passed.
> +//!     .with_blue(blue);
> +//!
> +//! assert_eq!(color.red(), 0x10);
> +//! assert_eq!(color.green(), 0x1f);
> +//! assert_eq!(color.blue(), 0x18);
> +//! assert_eq!(
> +//!     color.into_raw(),
> +//!     (0x18 << Rgb::BLUE_SHIFT) + (0x1f << Rgb::GREEN_SHIFT) + 0x10,
> +//! );
> +//!
> +//! // Convert to/from the backing storage type.
> +//! let raw: u16 = color.into();
> +//! assert_eq!(Rgb::from(raw), color);
> +//! ```
> +//!
> +//! # Syntax
> +//!
> +//! ```text
> +//! bitfield! {
> +//!     #[attributes]
> +//!     // Documentation for `Name`.
> +//!     pub struct Name(storage_type) {
> +//!         // `field_1` documentation.
> +//!         hi:lo field_1;
> +//!         // `field_2` documentation.
> +//!         hi:lo field_2 => ConvertedType;
> +//!         // `field_3` documentation.
> +//!         hi:lo field_3 ?=> ConvertedType;
> +//!         ...
> +//!     }
> +//! }
> +//! ```
> +//!
> +//! - `storage_type`: The underlying integer type (`u8`, `u16`, `u32`, `u64`).
> +//! - `hi:lo`: Bit range (inclusive), where `hi >= lo`.
> +//! - `=> Type`: Optional infallible conversion (see [below](#infallible-conversion-)).
> +//! - `?=> Type`: Optional fallible conversion (see [below](#fallible-conversion-)).
> +//! - Documentation strings and attributes are optional.
> +//!
> +//! # Generated code
> +//!
> +//! Each field is internally represented as a [`Bounded`] parameterized by its bit width. Field
> +//! values can either be set/retrieved directly, or converted from/to another type.
> +//!
> +//! The use of `Bounded` for each field enforces bounds-checking (at build time or runtime) of every
> +//! value assigned to a field. This ensures that data is never accidentally truncated.
> +//!
> +//! The macro generates the bitfield type, [`From`] and [`Into`] implementations for its storage
> +//! type, as well as [`Debug`] and [`Zeroable`](pin_init::Zeroable) implementations.
> +//!
> +//! For each field, it also generates:
> +//!
> +//! - `with_field(value)` — infallible setter; the argument type must be statically known to fit
> +//!   the field width.
> +//! - `with_const_field::<VALUE>()` — const setter; the value is validated at compile time.
> +//!   Usually shorter to use than `with_field` for constant values as it doesn't require
> +//!   constructing a `Bounded`.
> +//! - `try_with_field(value)` — fallible setter. Returns an error if the value is out of range.
> +//! - `FIELD_MASK`, `FIELD_SHIFT`, `FIELD_RANGE` - constants for manual bit manipulation.
> +//!
> +//! # Implicit conversions
> +//!
> +//! Types that fit entirely within a field's bit width can be used directly with setters. For
> +//! example, `bool` works with single-bit fields, and `u8` works with 8-bit fields:
> +//!
> +//! ```rust
> +//! use kernel::bitfield;
> +//!
> +//! bitfield! {
> +//!     pub struct Flags(u32) {
> +//!         15:8 byte_field;
> +//!         0:0 flag;
> +//!     }
> +//! }
> +//!
> +//! let flags = Flags::zeroed()
> +//!     .with_byte_field(0x42_u8)
> +//!     .with_flag(true);
> +//!
> +//! assert_eq!(flags.into_raw(), (0x42 << Flags::BYTE_FIELD_SHIFT) | 1);
> +//! ```
> +//!
> +//! # Runtime bounds checking
> +//!
> +//! When a value is not known at compile time, use `try_with_field()` to check bounds at runtime:
> +//!
> +//! ```rust
> +//! use kernel::bitfield;
> +//!
> +//! bitfield! {
> +//!     pub struct Config(u8) {
> +//!         3:0 nibble;
> +//!     }
> +//! }
> +//!
> +//! fn set_nibble(config: Config, value: u8) -> Result<Config, Error> {
> +//!     // Returns `EOVERFLOW` if `value > 0xf`.
> +//!     config.try_with_nibble(value)
> +//! }
> +//! # Ok::<(), Error>(())
> +//! ```
> +//!
> +//! # Type conversion
> +//!
> +//! Fields can be automatically converted to/from a custom type using `=>` (infallible) or `?=>`
> +//! (fallible). The custom type must implement the appropriate `From` or `TryFrom` traits with
> +//! `Bounded`.
> +//!
> +//! ## Infallible conversion (`=>`)
> +//!
> +//! Use this when all possible bit patterns of a field map to valid values:
> +//!
> +//! ```rust
> +//! use kernel::bitfield;
> +//! use kernel::num::Bounded;
> +//!
> +//! #[derive(Debug, Clone, Copy, PartialEq)]
> +//! enum Power {
> +//!     Off,
> +//!     On,
> +//! }
> +//!
> +//! impl From<Bounded<u32, 1>> for Power {
> +//!     fn from(v: Bounded<u32, 1>) -> Self {
> +//!         match *v {
> +//!             0 => Power::Off,
> +//!             _ => Power::On,
> +//!         }
> +//!     }
> +//! }
> +//!
> +//! impl From<Power> for Bounded<u32, 1> {
> +//!     fn from(p: Power) -> Self {
> +//!         (p as u32 != 0).into()
> +//!     }
> +//! }
> +//!
> +//! bitfield! {
> +//!     pub struct Control(u32) {
> +//!         0:0 power => Power;
> +//!     }
> +//! }
> +//!
> +//! let ctrl = Control::zeroed().with_power(Power::On);
> +//! assert_eq!(ctrl.power(), Power::On);
> +//! ```
> +//!
> +//! ## Fallible conversion (`?=>`)
> +//!
> +//! Use this when some bit patterns of a field are invalid. The getter returns a [`Result`]:
> +//!
> +//! ```rust
> +//! use kernel::bitfield;
> +//! use kernel::num::Bounded;
> +//!
> +//! #[derive(Debug, Clone, Copy, PartialEq)]
> +//! enum Mode {
> +//!     Low = 0,
> +//!     High = 1,
> +//!     Auto = 2,
> +//!     // 3 is invalid
> +//! }
> +//!
> +//! impl TryFrom<Bounded<u32, 2>> for Mode {
> +//!     type Error = u32;
> +//!
> +//!     fn try_from(v: Bounded<u32, 2>) -> Result<Self, u32> {
> +//!         match *v {
> +//!             0 => Ok(Mode::Low),
> +//!             1 => Ok(Mode::High),
> +//!             2 => Ok(Mode::Auto),
> +//!             n => Err(n),
> +//!         }
> +//!     }
> +//! }
> +//!
> +//! impl From<Mode> for Bounded<u32, 2> {
> +//!     fn from(m: Mode) -> Self {
> +//!         match m {
> +//!             Mode::Low => Bounded::<u32, _>::new::<0>(),
> +//!             Mode::High => Bounded::<u32, _>::new::<1>(),
> +//!             Mode::Auto => Bounded::<u32, _>::new::<2>(),
> +//!         }
> +//!     }
> +//! }
> +//!
> +//! bitfield! {
> +//!     pub struct Config(u32) {
> +//!         1:0 mode ?=> Mode;
> +//!     }
> +//! }
> +//!
> +//! let cfg = Config::zeroed().with_mode(Mode::Auto);
> +//! assert_eq!(cfg.mode(), Ok(Mode::Auto));
> +//!
> +//! // Invalid bit pattern returns an error.
> +//! assert_eq!(Config::from(0b11).mode(), Err(3));
> +//! ```
> +//!
> +//! [`Bounded`]: kernel::num::Bounded

In the nova version of bitfield we had @check_field_bounds. If we put
in the bit numbers the wrong way around, this patch gives a compile
error like:

```
attempt to compute `4_u32 - 7_u32`, which would overflow
```

The original nova version looks like it used build_assert, but I think
we can do it with const assert!, so we should be able to get a better
build error message for this case:

```
const _: () = assert!($hi >= $lo, "bitfield: hi bit must be >= lo bit");
```

With just that we get an extra build error, although it still spams the
confusing build error. We could also consider adding a function like:

```
pub const fn bitfield_width(hi: u32, lo: u32) -> u32 {
  assert!(hi >= lo, "bitfield: hi bit must be >= lo bit");
  hi + 1 - lo
}
```

Using this instead gets rid of some confusing build errors since we can
also use it in type bounds. But to get rid of all of them we would need
to do it for the mask etc and others.

WDYT?

^ permalink raw reply

* Re: [PATCH v2 2/3] rust: bitfield: Add KUNIT tests for bitfield
From: Eliot Courtney @ 2026-04-13  2:28 UTC (permalink / raw)
  To: Alexandre Courbot, Joel Fernandes, Yury Norov, Miguel Ojeda,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, Danilo Krummrich,
	Daniel Almeida, David Airlie, Simona Vetter
  Cc: John Hubbard, Alistair Popple, Timur Tabi, Zhi Wang,
	Eliot Courtney, linux-kernel, rust-for-linux, driver-core,
	dri-devel
In-Reply-To: <20260409-bitfield-v2-2-23ac400071cb@nvidia.com>

On Thu Apr 9, 2026 at 11:58 PM JST, Alexandre Courbot wrote:
> From: Joel Fernandes <joelagnelf@nvidia.com>
>
> Add KUNIT tests to make sure the macro is working correctly.
>
> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
> [acourbot: update code to latest bitfield! macro.]
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> ---

Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>

^ permalink raw reply

* Re: [PATCH v2 3/3] gpu: nova-core: switch to kernel bitfield macro
From: Eliot Courtney @ 2026-04-13  2:01 UTC (permalink / raw)
  To: Alexandre Courbot, Joel Fernandes, Yury Norov, Miguel Ojeda,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, Danilo Krummrich,
	Daniel Almeida, David Airlie, Simona Vetter
  Cc: John Hubbard, Alistair Popple, Timur Tabi, Zhi Wang,
	Eliot Courtney, linux-kernel, rust-for-linux, driver-core,
	dri-devel
In-Reply-To: <20260409-bitfield-v2-3-23ac400071cb@nvidia.com>

On Thu Apr 9, 2026 at 11:58 PM JST, Alexandre Courbot wrote:
> Replace uses of the Nova-internal `bitfield!` macro by the kernel one,
> and remove the now-unneeded local macro.
>
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> ---

Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>


^ permalink raw reply

* Re: [PATCH v6] kbuild: host: use single executable for rustc -C linker
From: Mohamad Alsadhan @ 2026-04-12 14:19 UTC (permalink / raw)
  To: Nicolas Schier, Nathan Chancellor, Miguel Ojeda, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Alice Ryhl, Trevor Gross, Danilo Krummrich, Yoann Congal,
	linux-kbuild, linux-kernel, rust-for-linux
In-Reply-To: <adVpQ_ZvXTPUegig@levanger>

On 26/04/07 10:29pm, Nicolas Schier wrote:
> 
> What is the reason for re-adding the 'exec'?

I was under the impression of that it won't hurt, unless I am missing
something.

'exec' would replace the shell process entirely rather than leaving 
it alive waiting for the compiler to finish.

Best regards,
Mo

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox