All of lore.kernel.org
 help / color / mirror / Atom feed
From: Zhi Wang <zhiw@nvidia.com>
To: Alexandre Courbot <acourbot@nvidia.com>
Cc: <rust-for-linux@vger.kernel.org>, <linux-pci@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>, <dakr@kernel.org>,
	<aliceryhl@google.com>, <bhelgaas@google.com>,
	<kwilczynski@kernel.org>, <ojeda@kernel.org>, <boqun@kernel.org>,
	<gary@garyguo.net>, <bjorn3_gh@protonmail.com>,
	<lossin@kernel.org>, <a.hindborg@kernel.org>, <tmgross@umich.edu>,
	<markus.probst@posteo.de>, <cjia@nvidia.com>, <smitra@nvidia.com>,
	<ankita@nvidia.com>, <aniketa@nvidia.com>, <kwankhede@nvidia.com>,
	<targupta@nvidia.com>, <joelagnelf@nvidia.com>,
	<jhubbard@nvidia.com>, <kjaju@nvidia.com>, <zhiwang@kernel.org>
Subject: Re: [PATCH v3 1/1] rust: pci: add extended capability and SR-IOV support
Date: Mon, 20 Apr 2026 13:13:56 +0300	[thread overview]
Message-ID: <20260420131356.189e7209@inno-dell> (raw)
In-Reply-To: <DHRTUAF52GNI.1J98TSAG1LS6Q@nvidia.com>

On Mon, 13 Apr 2026 15:52:28 +0900
"Alexandre Courbot" <acourbot@nvidia.com> wrote:

> Hi Zhi,
> 
> On Fri Apr 10, 2026 at 3:52 AM JST, Zhi Wang wrote:
> <snip>
> > +/// An extended PCI capability that implements [`Io`].
> > +///
> > +/// # Examples
> > +///
> > +/// ```no_run
> > +/// use kernel::pci::{
> > +///     self,
> > +///     ExtSriovCapability, //
> > +/// };
> > +/// use kernel::io::Io;
> > +///
> > +/// fn probe_sriov(pdev: &pci::Device<kernel::device::Core>) ->
> > Result<(), kernel::error::Error> { +///     let config =
> > pdev.config_space_extended()?; +///     let sriov =
> > ExtSriovCapability::find(&config)?; +///
> > +///     let total_vfs = kernel::io_read!(&sriov, .total_vfs);
> > +///     let vf_offset = kernel::io_read!(&sriov, .vf_offset);
> > +///     let bar0 = kernel::io_read!(&sriov, .vf_bar[0]);
> > +///     kernel::io_write!(&sriov, .num_vfs, 4u16);
> > +///     let bar0_64 = sriov.read_vf_bar64(0)?;
> > +///
> > +///     Ok(())
> > +/// }
> > +/// ```
> > +///
> > +/// # Invariants
> > +///
> > +/// `ptr` is within the device's extended configuration space at a
> > valid +/// capability. For sized `T`, the region is at least
> > `size_of::<T>()` bytes. +pub struct ExtCapability<'a, T: ?Sized +
> > KnownSize = Region<0>> {
> > +    config: &'a ConfigSpace<'a, Extended>,
> > +    ptr: *mut T,
> > +}  
> 
> This strongly looks like this is reinventing `io::View`. :) Even the
> internals look similar. Can you check whether `io::View` can replace
> this type? This would remove all the macro business and impl blocks
> and simplify this patch considerably.
> 

Thanks for catching this. This should be updated with Gary's patches
posted in the mailing list. Will address this in V4.

> > +
> > +impl<T: ?Sized + KnownSize> Io for ExtCapability<'_, T> {
> > +    type Type = T;
> > +
> > +    #[inline]
> > +    fn as_ptr(&self) -> *mut T {
> > +        self.ptr
> > +    }
> > +}
> > +
> > +macro_rules! impl_ext_cap_io_capable {
> > +    ($ty:ty) => {
> > +        impl<T: ?Sized + KnownSize> IoCapable<$ty> for
> > ExtCapability<'_, T> {
> > +            #[inline]
> > +            unsafe fn io_read(&self, address: *mut $ty) -> $ty {
> > +                // SAFETY: The caller guarantees `address` is
> > within bounds of
> > +                // this capability, which is within the config
> > space.
> > +                unsafe { self.config.io_read(address) }
> > +            }
> > +
> > +            #[inline]
> > +            unsafe fn io_write(&self, value: $ty, address: *mut
> > $ty) {
> > +                // SAFETY: The caller guarantees `address` is
> > within bounds of
> > +                // this capability, which is within the config
> > space.
> > +                unsafe { self.config.io_write(value, address) }
> > +            }
> > +        }
> > +    };
> > +}
> > +
> > +impl_ext_cap_io_capable!(u8);
> > +impl_ext_cap_io_capable!(u16);
> > +impl_ext_cap_io_capable!(u32);
> > +
> > +impl<'a> ExtCapability<'a> {
> > +    /// Base offset of this capability in configuration space.
> > +    #[inline]
> > +    pub fn offset(&self) -> usize {
> > +        self.ptr.addr()
> > +    }
> > +
> > +    /// Size of this capability region in bytes.
> > +    #[inline]
> > +    pub fn size(&self) -> usize {
> > +        KnownSize::size(self.ptr)
> > +    }
> > +
> > +    /// Cast to a typed capability, checking that the region is
> > large enough.
> > +    pub fn cast_sized<U>(self) -> Result<ExtCapability<'a, U>> {  
> 
> This allows for any cast, including invalid ones, as long as `U` fits.
> While not unsafe, this is still incorrect - I don't see any reason to
> make this public, which could be a way to mitigate this.
> 

Agreed. With the io::View rework above, cast_sized is gone. The size
check can be done internally in find_sriov().

> > +        if self.size() < core::mem::size_of::<U>() {
> > +            return Err(EINVAL);
> > +        }
> > +
> > +        // INVARIANT: `self` already satisfies the invariant (ptr
> > is within extended config
> > +        // space at a valid capability), and the size check above
> > guarantees the region is at
> > +        // least `size_of::<U>()` bytes.
> > +        Ok(ExtCapability {
> > +            config: self.config,
> > +            ptr: core::ptr::without_provenance_mut(self.offset()),
> > +        })
> > +    }
> > +}
> > +
> > +impl ConfigSpace<'_, Extended> {
> > +    /// Finds an extended capability by ID, returning an untyped
> > [`ExtCapability`].
> > +    pub fn find_ext_capability(&self, cap: ExtCapId) ->
> > Result<ExtCapability<'_>> {
> > +        let offset = usize::from(
> > +            // SAFETY: `self.pdev` is valid by the type invariant
> > of `ConfigSpace`.
> > +            unsafe {
> > +
> > bindings::pci_find_ext_capability(self.pdev.as_raw(),
> > i32::from(cap.as_raw()))
> > +            },
> > +        );
> > +
> > +        if offset == 0 {
> > +            return Err(ENODEV);
> > +        }
> > +
> > +        Ok(self.make_ext_capability(offset))
> > +    }
> > +
> > +    /// Finds the next extended capability with `cap` after
> > `start`.
> > +    pub fn find_next_ext_capability(&self, start: u16, cap:
> > ExtCapId) -> Result<ExtCapability<'_>> {  
> 
> This `start` offset can be anything and potentially lead to invalid
> results being returned (I don't know what the C binding does, but
> assuming the worst for safety).
> 
> Listing capabilities should be done through an iterator as it provides
> all their benefits to users. I understand we also want a `find` API to
> look for a specific capability and that an iterator is not needed for
> this, but if we end up providing a way to list them, it should be
> through an iterator.
> 
> In any case, this method seems to be unused, so you can safely drop it
> for now.
> 

Dropped in v4.

> > +        let offset = usize::from(
> > +            // SAFETY: `self.pdev` is valid by the type invariant
> > of `ConfigSpace`.
> > +            unsafe {
> > +                bindings::pci_find_next_ext_capability(
> > +                    self.pdev.as_raw(),
> > +                    start,
> > +                    i32::from(cap.as_raw()),
> > +                )
> > +            },
> > +        );
> > +
> > +        if offset == 0 {
> > +            return Err(ENODEV);
> > +        }
> > +
> > +        Ok(self.make_ext_capability(offset))
> > +    }
> > +
> > +    fn make_ext_capability(&self, offset: usize) ->
> > ExtCapability<'_> {
> > +        let size = self.calculate_ext_cap_size(offset);
> > +
> > +        let ptr = core::ptr::slice_from_raw_parts_mut::<u8>(
> > +            core::ptr::without_provenance_mut(offset),
> > +            size,
> > +            // CAST: `Region<0>` is a DST like `[u8]`, so this
> > pointer cast preserves metadata.
> > +        ) as *mut Region<0>;
> > +
> > +        // INVARIANT: `offset` was returned by
> > `pci_find_ext_capability` /  
> 
> This method cannot make assumptions about the origin of its arguments
> without being `unsafe`, as its correct behavior depends on the
> goodwill of the caller. Given that we will drop
> `find_next_ext_capability`, the code can be rolled into
> `find_ext_capability` which is now the only user.
> 

Will inline it into find_ext_capability in v4.

> > +        // `pci_find_next_ext_capability`, which guarantees it
> > points to a valid capability
> > +        // within the extended configuration space. `size` is
> > bounded by the next capability
> > +        // offset or the end of the configuration space.
> > +        ExtCapability { config: self, ptr }
> > +    }
> > +
> > +    fn calculate_ext_cap_size(&self, offset: usize) -> usize {  
> 
> This method lacks documentation.
> 
> > +        let header = self.try_read32(offset).unwrap_or(0);
> > +        // SAFETY: Pure bit manipulation, no preconditions.
> > +        // CAST: The next-cap pointer is a 12-bit field (max
> > 0xFFC), always fits in `usize`.
> > +        let next_ptr = unsafe { bindings::pci_ext_cap_next(header)
> > } as usize; +
> > +        if next_ptr > offset {
> > +            next_ptr - offset
> > +        } else {
> > +            KnownSize::size(self.as_ptr()) - offset
> > +        }
> > +    }
> > +}
> > +
> > +/// SR-IOV register layout per PCIe spec (64 bytes starting at cap
> > offset). +#[repr(C)]
> > +pub struct ExtSriovRegs {
> > +    /// Extended capability header.
> > +    pub header: u32,
> > +    /// SR-IOV capabilities.
> > +    pub cap: u32,
> > +    /// SR-IOV control.
> > +    pub ctrl: u16,
> > +    /// SR-IOV status.
> > +    pub status: u16,
> > +    /// Initial VFs.
> > +    pub initial_vfs: u16,
> > +    /// Total VFs.
> > +    pub total_vfs: u16,
> > +    /// Number of VFs.
> > +    pub num_vfs: u16,
> > +    /// Function dependency link.
> > +    pub func_dep_link: u16,
> > +    /// First VF offset.
> > +    pub vf_offset: u16,
> > +    /// VF stride.
> > +    pub vf_stride: u16,
> > +    _reserved: u16,
> > +    /// VF device ID.
> > +    pub vf_device_id: u16,
> > +    /// Supported page sizes.
> > +    pub supported_page_sizes: u32,
> > +    /// System page size.
> > +    pub system_page_size: u32,
> > +    /// VF BARs (BAR0–BAR5).
> > +    pub vf_bar: [u32; 6],
> > +    /// VF migration state array offset.
> > +    pub migration_state: u32,
> > +}
> > +
> > +/// SR-IOV capability. See [`ExtCapability`] for usage.
> > +pub type ExtSriovCapability<'a> = ExtCapability<'a, ExtSriovRegs>;
> > +
> > +impl ExtCapability<'_, ExtSriovRegs> {
> > +    /// Find the SR-IOV capability, or `ENODEV` if not present.
> > +    #[inline]
> > +    pub fn find<'a>(
> > +        config: &'a ConfigSpace<'_, Extended>,
> > +    ) -> Result<ExtCapability<'a, ExtSriovRegs>> {
> > +        config.find_ext_capability(ExtCapId::Sriov)?.cast_sized()
> > +    }  
> 
> This method looks like it belongs to `ConfigSpace`, and should be
> generic. Something like
> 
>   impl<'a> ConfigSpace<'a, Extended> {
>     pub fn find_ext_capability::<C: ExtCapability>(&self) 
>       -> Result<io::View<...>>  
>   }
> 
> If we use `io::View` to return capabilities, we can recycle the
> `ExtCapability` name for a trait that provides the information
> required for `find_ext_capability` to work properly, like the
> associated constant for the capability ID. In this patch, it would be
> implemented on `ExtSriovRegs`, providing the projection of the view
> through `C`.
> 
> This also opens the way for a `find_capability` method that handles
> normal capabilities and is available to both variants of
> `ConfigSpace`. I am not saying this needs to be done in this patch
> though.
>

I've moved find to ConfigSpace::find_sriov() for now. I agree the
trait-based generics direction makes sense, but with only SR-IOV as a
user today I'd prefer to defer that to a follow-up when we have a
second capability type. :)

> > +
> > +    /// Reads a 64-bit VF BAR from two consecutive 32-bit slots.
> > +    #[inline]
> > +    pub fn read_vf_bar64(&self, bar_index: usize) -> Result<u64> {
> > +        if bar_index >= 5 {  
> 
> Why is this value hardcoded? If this is correct, let's make it a
> constant with an informative name (and possibly a comment justifying
> its value).
> 

Sure. I changed it to use PCI_SRIOV_NUM_BARS. 

> > +            return Err(EINVAL);
> > +        }
> > +        let low = crate::io_read!(self, .vf_bar[bar_index]?);
> > +        let high = crate::io_read!(self, .vf_bar[bar_index + 1]?);
> >  
> 
> Don't we want to check whether the bar in question is actually a
> 64-bit BAR?
> 
> I wonder whether this also doesn't call for an iterator-based solution
> returning an enum with the properly-sized BAR.

Good point. I was assuming that the driver actually know its
devcie support 64-bit bar or not. Do you prefer that we can have an
iterator-based solution at this time? I think it makes the code looks
nicer though. :)

Z.

  reply	other threads:[~2026-04-20 10:14 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-09 18:52 [PATCH 0/1] Rust PCI capability infrastructure and SR-IOV support Zhi Wang
2026-04-09 18:52 ` [PATCH v3 1/1] rust: pci: add extended capability " Zhi Wang
2026-04-13  6:52   ` Alexandre Courbot
2026-04-20 10:13     ` Zhi Wang [this message]
2026-04-20 10:24       ` Gary Guo
2026-04-26  4:12       ` Alexandre Courbot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260420131356.189e7209@inno-dell \
    --to=zhiw@nvidia.com \
    --cc=a.hindborg@kernel.org \
    --cc=acourbot@nvidia.com \
    --cc=aliceryhl@google.com \
    --cc=aniketa@nvidia.com \
    --cc=ankita@nvidia.com \
    --cc=bhelgaas@google.com \
    --cc=bjorn3_gh@protonmail.com \
    --cc=boqun@kernel.org \
    --cc=cjia@nvidia.com \
    --cc=dakr@kernel.org \
    --cc=gary@garyguo.net \
    --cc=jhubbard@nvidia.com \
    --cc=joelagnelf@nvidia.com \
    --cc=kjaju@nvidia.com \
    --cc=kwankhede@nvidia.com \
    --cc=kwilczynski@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lossin@kernel.org \
    --cc=markus.probst@posteo.de \
    --cc=ojeda@kernel.org \
    --cc=rust-for-linux@vger.kernel.org \
    --cc=smitra@nvidia.com \
    --cc=targupta@nvidia.com \
    --cc=tmgross@umich.edu \
    --cc=zhiwang@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.