public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/8] dma::Coherent & dma::CoherentInit API
@ 2026-03-03 16:22 ` Danilo Krummrich
  2026-03-03 16:22   ` [PATCH 1/8] rust: dma: use "kernel vertical" style for imports Danilo Krummrich
                     ` (8 more replies)
  0 siblings, 9 replies; 30+ messages in thread
From: Danilo Krummrich @ 2026-03-03 16:22 UTC (permalink / raw)
  To: aliceryhl, acourbot, ojeda, boqun, gary, bjorn3_gh, lossin,
	a.hindborg, tmgross, abdiel.janulgue, daniel.almeida,
	robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel,
	Danilo Krummrich

This patch series introduces the dma::Coherent API Gary worked out in the
context of his I/O projection work.

Additionally, introduce dma::CoherentInit, a type that encapsulates a
dma::Coherent object before its DMA address is exposed to the device.
dma::CoherentInit can guarantee exclusive access to the inner dma::Coherent
object and implement Deref and DerefMut.

Also add Coherent::init() and Coherent::init_with_attrs() so we can directly
initialize a new dma::Coherent object through an impl Init<T, E>.

Danilo Krummrich (5):
  rust: dma: use "kernel vertical" style for imports
  rust: dma: introduce dma::CoherentInit for memory initialization
  rust: dma: add Coherent:init() and Coherent::init_with_attrs()
  gpu: nova-core: use Coherent::init to initialize GspFwWprMeta
  gpu: nova-core: convert Gsp::new() to use CoherentInit

Gary Guo (3):
  rust: dma: add generalized container for types other than slices
  rust: dma: add zeroed constructor to `Coherent`
  gpu: nova-core: convert to new dma::Coherent API

 drivers/gpu/nova-core/dma.rs      |  19 +-
 drivers/gpu/nova-core/falcon.rs   |   7 +-
 drivers/gpu/nova-core/firmware.rs |  10 +-
 drivers/gpu/nova-core/gsp.rs      |  65 ++--
 drivers/gpu/nova-core/gsp/boot.rs |   7 +-
 drivers/gpu/nova-core/gsp/cmdq.rs |  55 +--
 drivers/gpu/nova-core/gsp/fw.rs   |  82 ++--
 rust/kernel/device.rs             |   4 +-
 rust/kernel/dma.rs                | 626 +++++++++++++++++++++++-------
 samples/rust/rust_dma.rs          |   8 +-
 10 files changed, 619 insertions(+), 264 deletions(-)


base-commit: 1195fcbda62f12108dc9be56fa4173897905b90c
-- 
2.53.0


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 1/8] rust: dma: use "kernel vertical" style for imports
  2026-03-03 16:22 ` [PATCH 0/8] dma::Coherent & dma::CoherentInit API Danilo Krummrich
@ 2026-03-03 16:22   ` Danilo Krummrich
  2026-03-03 17:12     ` Gary Guo
  2026-03-03 16:22   ` [PATCH 2/8] rust: dma: add generalized container for types other than slices Danilo Krummrich
                     ` (7 subsequent siblings)
  8 siblings, 1 reply; 30+ messages in thread
From: Danilo Krummrich @ 2026-03-03 16:22 UTC (permalink / raw)
  To: aliceryhl, acourbot, ojeda, boqun, gary, bjorn3_gh, lossin,
	a.hindborg, tmgross, abdiel.janulgue, daniel.almeida,
	robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel,
	Danilo Krummrich

Convert all imports to use "kernel vertical" style.

With this, subsequent patches neither introduce unrelated changes nor
leave an inconsistent import pattern.

While at it, drop unnecessary imports covered by prelude::*.

Link: https://docs.kernel.org/rust/coding-guidelines.html#imports
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
---
 rust/kernel/dma.rs | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
index cd2957b5f260..54512278085e 100644
--- a/rust/kernel/dma.rs
+++ b/rust/kernel/dma.rs
@@ -5,12 +5,20 @@
 //! C header: [`include/linux/dma-mapping.h`](srctree/include/linux/dma-mapping.h)
 
 use crate::{
-    bindings, build_assert, device,
-    device::{Bound, Core},
-    error::{to_result, Result},
+    bindings,
+    build_assert,
+    device::{
+        self,
+        Bound,
+        Core, //
+    },
+    error::to_result,
     prelude::*,
     sync::aref::ARef,
-    transmute::{AsBytes, FromBytes},
+    transmute::{
+        AsBytes,
+        FromBytes, //
+    }, //
 };
 use core::ptr::NonNull;
 
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 2/8] rust: dma: add generalized container for types other than slices
  2026-03-03 16:22 ` [PATCH 0/8] dma::Coherent & dma::CoherentInit API Danilo Krummrich
  2026-03-03 16:22   ` [PATCH 1/8] rust: dma: use "kernel vertical" style for imports Danilo Krummrich
@ 2026-03-03 16:22   ` Danilo Krummrich
  2026-03-17  6:50     ` Alice Ryhl
  2026-03-03 16:22   ` [PATCH 3/8] rust: dma: add zeroed constructor to `Coherent` Danilo Krummrich
                     ` (6 subsequent siblings)
  8 siblings, 1 reply; 30+ messages in thread
From: Danilo Krummrich @ 2026-03-03 16:22 UTC (permalink / raw)
  To: aliceryhl, acourbot, ojeda, boqun, gary, bjorn3_gh, lossin,
	a.hindborg, tmgross, abdiel.janulgue, daniel.almeida,
	robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel,
	Danilo Krummrich

From: Gary Guo <gary@garyguo.net>

Currently, `CoherentAllocation` is concecptually a DMA coherent container
of a slice of `[T]` of runtime-checked length. Generalize it by creating
`dma::Coherent<T>` which can hold any value of `T`.
`Coherent::alloc_with_attrs` is implemented but not yet exposed, as I
believe we should not expose the way to obtain an uninitialized coherent
region.

`Coherent<[T]>` provides a `len` method instead of the previous `count()`
method to be consistent with methods on slices.

The existing type is re-defined as a type alias of `Coherent<[T]>` to ease
transition. Methods in use are not yet removed.

Signed-off-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
---
 rust/kernel/device.rs |   4 +-
 rust/kernel/dma.rs    | 361 ++++++++++++++++++++++++++----------------
 2 files changed, 226 insertions(+), 139 deletions(-)

diff --git a/rust/kernel/device.rs b/rust/kernel/device.rs
index 94e0548e7687..379058eca2ed 100644
--- a/rust/kernel/device.rs
+++ b/rust/kernel/device.rs
@@ -575,7 +575,7 @@ pub trait DeviceContext: private::Sealed {}
 /// The bound context indicates that for the entire duration of the lifetime of a [`Device<Bound>`]
 /// reference, the [`Device`] is guaranteed to be bound to a driver.
 ///
-/// Some APIs, such as [`dma::CoherentAllocation`] or [`Devres`] rely on the [`Device`] to be bound,
+/// Some APIs, such as [`dma::Coherent`] or [`Devres`] rely on the [`Device`] to be bound,
 /// which can be proven with the [`Bound`] device context.
 ///
 /// Any abstraction that can guarantee a scope where the corresponding bus device is bound, should
@@ -584,7 +584,7 @@ pub trait DeviceContext: private::Sealed {}
 ///
 /// [`Devres`]: kernel::devres::Devres
 /// [`Devres::access`]: kernel::devres::Devres::access
-/// [`dma::CoherentAllocation`]: kernel::dma::CoherentAllocation
+/// [`dma::Coherent`]: kernel::dma::Coherent
 pub struct Bound;
 
 mod private {
diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
index 54512278085e..3a692ac76ba2 100644
--- a/rust/kernel/dma.rs
+++ b/rust/kernel/dma.rs
@@ -6,7 +6,6 @@
 
 use crate::{
     bindings,
-    build_assert,
     device::{
         self,
         Bound,
@@ -14,6 +13,7 @@
     },
     error::to_result,
     prelude::*,
+    ptr::KnownSize,
     sync::aref::ARef,
     transmute::{
         AsBytes,
@@ -357,18 +357,17 @@ fn from(direction: DataDirection) -> Self {
 /// This is an abstraction around the `dma_alloc_coherent` API which is used to allocate and map
 /// large coherent DMA regions.
 ///
-/// A [`CoherentAllocation`] instance contains a pointer to the allocated region (in the
+/// A [`Coherent`] instance contains a pointer to the allocated region (in the
 /// processor's virtual address space) and the device address which can be given to the device
-/// as the DMA address base of the region. The region is released once [`CoherentAllocation`]
+/// as the DMA address base of the region. The region is released once [`Coherent`]
 /// is dropped.
 ///
 /// # Invariants
 ///
-/// - For the lifetime of an instance of [`CoherentAllocation`], the `cpu_addr` is a valid pointer
+/// - For the lifetime of an instance of [`Coherent`], the `cpu_addr` is a valid pointer
 ///   to an allocated region of coherent memory and `dma_handle` is the DMA address base of the
 ///   region.
-/// - The size in bytes of the allocation is equal to `size_of::<T> * count`.
-/// - `size_of::<T> * count` fits into a `usize`.
+/// - The size in bytes of the allocation is equal to size information via pointer.
 // TODO
 //
 // DMA allocations potentially carry device resources (e.g.IOMMU mappings), hence for soundness
@@ -379,124 +378,260 @@ fn from(direction: DataDirection) -> Self {
 // allocation from surviving device unbind; it would require RCU read side critical sections to
 // access the memory, which may require subsequent unnecessary copies.
 //
-// Hence, find a way to revoke the device resources of a `CoherentAllocation`, but not the
-// entire `CoherentAllocation` including the allocated memory itself.
-pub struct CoherentAllocation<T: AsBytes + FromBytes> {
+// Hence, find a way to revoke the device resources of a `Coherent`, but not the
+// entire `Coherent` including the allocated memory itself.
+pub struct Coherent<T: AsBytes + FromBytes + KnownSize + ?Sized> {
     dev: ARef<device::Device>,
     dma_handle: DmaAddress,
-    count: usize,
     cpu_addr: NonNull<T>,
     dma_attrs: Attrs,
 }
 
-impl<T: AsBytes + FromBytes> CoherentAllocation<T> {
-    /// Allocates a region of `size_of::<T> * count` of coherent memory.
+impl<T: AsBytes + FromBytes + KnownSize + ?Sized> Coherent<T> {
+    /// Returns the size in bytes of this allocation.
+    #[inline]
+    pub fn size(&self) -> usize {
+        T::size(self.cpu_addr.as_ptr())
+    }
+
+    /// Returns the raw pointer to the allocated region in the CPU's virtual address space.
+    #[inline]
+    pub fn as_ptr(&self) -> *const T {
+        self.cpu_addr.as_ptr()
+    }
+
+    /// Returns the raw pointer to the allocated region in the CPU's virtual address space as
+    /// a mutable pointer.
+    #[inline]
+    pub fn as_mut_ptr(&self) -> *mut T {
+        self.cpu_addr.as_ptr()
+    }
+
+    /// Returns a DMA handle which may be given to the device as the DMA address base of
+    /// the region.
+    #[inline]
+    pub fn dma_handle(&self) -> DmaAddress {
+        self.dma_handle
+    }
+
+    /// Returns a reference to the data in the region.
     ///
-    /// # Examples
+    /// # Safety
     ///
-    /// ```
-    /// # use kernel::device::{Bound, Device};
-    /// use kernel::dma::{attrs::*, CoherentAllocation};
+    /// * Callers must ensure that the device does not read/write to/from memory while the returned
+    ///   slice is live.
+    /// * Callers must ensure that this call does not race with a write to the same region while
+    ///   the returned slice is live.
+    #[inline]
+    pub unsafe fn as_ref(&self) -> &T {
+        // SAFETY: per safety requirement.
+        unsafe { &*self.as_ptr() }
+    }
+
+    /// Returns a mutable reference to the data in the region.
     ///
-    /// # fn test(dev: &Device<Bound>) -> Result {
-    /// let c: CoherentAllocation<u64> =
-    ///     CoherentAllocation::alloc_attrs(dev, 4, GFP_KERNEL, DMA_ATTR_NO_WARN)?;
-    /// # Ok::<(), Error>(()) }
-    /// ```
-    pub fn alloc_attrs(
+    /// # Safety
+    ///
+    /// * Callers must ensure that the device does not read/write to/from memory while the returned
+    ///   slice is live.
+    /// * Callers must ensure that this call does not race with a read or write to the same region
+    ///   while the returned slice is live.
+    #[expect(clippy::mut_from_ref, reason = "unsafe to use API")]
+    #[inline]
+    pub unsafe fn as_mut(&self) -> &mut T {
+        // SAFETY: per safety requirement.
+        unsafe { &mut *self.as_mut_ptr() }
+    }
+
+    /// Reads the value of `field` and ensures that its type is [`FromBytes`].
+    ///
+    /// # Safety
+    ///
+    /// This must be called from the [`dma_read`] macro which ensures that the `field` pointer is
+    /// validated beforehand.
+    ///
+    /// Public but hidden since it should only be used from [`dma_read`] macro.
+    #[doc(hidden)]
+    pub unsafe fn field_read<F: FromBytes>(&self, field: *const F) -> F {
+        // SAFETY:
+        // - By the safety requirements field is valid.
+        // - Using read_volatile() here is not sound as per the usual rules, the usage here is
+        // a special exception with the following notes in place. When dealing with a potential
+        // race from a hardware or code outside kernel (e.g. user-space program), we need that
+        // read on a valid memory is not UB. Currently read_volatile() is used for this, and the
+        // rationale behind is that it should generate the same code as READ_ONCE() which the
+        // kernel already relies on to avoid UB on data races. Note that the usage of
+        // read_volatile() is limited to this particular case, it cannot be used to prevent
+        // the UB caused by racing between two kernel functions nor do they provide atomicity.
+        unsafe { field.read_volatile() }
+    }
+
+    /// Writes a value to `field` and ensures that its type is [`AsBytes`].
+    ///
+    /// # Safety
+    ///
+    /// This must be called from the [`dma_write`] macro which ensures that the `field` pointer is
+    /// validated beforehand.
+    ///
+    /// Public but hidden since it should only be used from [`dma_write`] macro.
+    #[doc(hidden)]
+    pub unsafe fn field_write<F: AsBytes>(&self, field: *mut F, val: F) {
+        // SAFETY:
+        // - By the safety requirements field is valid.
+        // - Using write_volatile() here is not sound as per the usual rules, the usage here is
+        // a special exception with the following notes in place. When dealing with a potential
+        // race from a hardware or code outside kernel (e.g. user-space program), we need that
+        // write on a valid memory is not UB. Currently write_volatile() is used for this, and the
+        // rationale behind is that it should generate the same code as WRITE_ONCE() which the
+        // kernel already relies on to avoid UB on data races. Note that the usage of
+        // write_volatile() is limited to this particular case, it cannot be used to prevent
+        // the UB caused by racing between two kernel functions nor do they provide atomicity.
+        unsafe { field.write_volatile(val) }
+    }
+}
+
+impl<T: AsBytes + FromBytes> Coherent<T> {
+    /// Allocates a region of `T` of coherent memory.
+    #[expect(unused)]
+    fn alloc_with_attrs(
         dev: &device::Device<Bound>,
-        count: usize,
         gfp_flags: kernel::alloc::Flags,
         dma_attrs: Attrs,
-    ) -> Result<CoherentAllocation<T>> {
-        build_assert!(
-            core::mem::size_of::<T>() > 0,
-            "It doesn't make sense for the allocated type to be a ZST"
-        );
+    ) -> Result<Self> {
+        const {
+            assert!(
+                core::mem::size_of::<T>() > 0,
+                "It doesn't make sense for the allocated type to be a ZST"
+            );
+        }
 
-        let size = count
-            .checked_mul(core::mem::size_of::<T>())
-            .ok_or(EOVERFLOW)?;
         let mut dma_handle = 0;
         // SAFETY: Device pointer is guaranteed as valid by the type invariant on `Device`.
         let addr = unsafe {
             bindings::dma_alloc_attrs(
                 dev.as_raw(),
-                size,
+                core::mem::size_of::<T>(),
                 &mut dma_handle,
                 gfp_flags.as_raw(),
                 dma_attrs.as_raw(),
             )
         };
-        let addr = NonNull::new(addr).ok_or(ENOMEM)?;
+        let cpu_addr = NonNull::new(addr.cast()).ok_or(ENOMEM)?;
         // INVARIANT:
-        // - We just successfully allocated a coherent region which is accessible for
-        //   `count` elements, hence the cpu address is valid. We also hold a refcounted reference
-        //   to the device.
-        // - The allocated `size` is equal to `size_of::<T> * count`.
-        // - The allocated `size` fits into a `usize`.
+        // - We just successfully allocated a coherent region which is adequately sized for `T`,
+        //   hence the cpu address is valid.
+        // - We also hold a refcounted reference to the device.
         Ok(Self {
             dev: dev.into(),
             dma_handle,
-            count,
-            cpu_addr: addr.cast(),
+            cpu_addr,
             dma_attrs,
         })
     }
 
-    /// Performs the same functionality as [`CoherentAllocation::alloc_attrs`], except the
-    /// `dma_attrs` is 0 by default.
-    pub fn alloc_coherent(
+    /// Allocates a region of `[T; len]` of coherent memory.
+    fn alloc_slice_with_attrs(
         dev: &device::Device<Bound>,
-        count: usize,
+        len: usize,
         gfp_flags: kernel::alloc::Flags,
-    ) -> Result<CoherentAllocation<T>> {
-        CoherentAllocation::alloc_attrs(dev, count, gfp_flags, Attrs(0))
+        dma_attrs: Attrs,
+    ) -> Result<Coherent<[T]>> {
+        const {
+            assert!(
+                core::mem::size_of::<T>() > 0,
+                "It doesn't make sense for the allocated type to be a ZST"
+            );
+        }
+
+        // `dma_alloc_attrs` cannot handle zero-length allocation, bail early.
+        if len == 0 {
+            Err(EINVAL)?;
+        }
+
+        let size = core::mem::size_of::<T>().checked_mul(len).ok_or(ENOMEM)?;
+        let mut dma_handle = 0;
+        // SAFETY: Device pointer is guaranteed as valid by the type invariant on `Device`.
+        let addr = unsafe {
+            bindings::dma_alloc_attrs(
+                dev.as_raw(),
+                size,
+                &mut dma_handle,
+                gfp_flags.as_raw(),
+                dma_attrs.as_raw(),
+            )
+        };
+        let cpu_addr = NonNull::slice_from_raw_parts(NonNull::new(addr.cast()).ok_or(ENOMEM)?, len);
+        // INVARIANT:
+        // - We just successfully allocated a coherent region which is adequately sized for
+        //   `[T; len]`, hence the cpu address is valid.
+        // - We also hold a refcounted reference to the device.
+        Ok(Coherent {
+            dev: dev.into(),
+            dma_handle,
+            cpu_addr,
+            dma_attrs,
+        })
     }
+}
 
+impl<T: AsBytes + FromBytes> Coherent<[T]> {
     /// Returns the number of elements `T` in this allocation.
     ///
     /// Note that this is not the size of the allocation in bytes, which is provided by
     /// [`Self::size`].
-    pub fn count(&self) -> usize {
-        self.count
+    #[inline]
+    #[expect(clippy::len_without_is_empty, reason = "Coherent slice is never empty")]
+    pub fn len(&self) -> usize {
+        self.cpu_addr.len()
     }
+}
 
-    /// Returns the size in bytes of this allocation.
-    pub fn size(&self) -> usize {
-        // INVARIANT: The type invariant of `Self` guarantees that `size_of::<T> * count` fits into
-        // a `usize`.
-        self.count * core::mem::size_of::<T>()
-    }
+// Type alias for compatibility.
+#[doc(hidden)]
+pub type CoherentAllocation<T> = Coherent<[T]>;
 
-    /// Returns the raw pointer to the allocated region in the CPU's virtual address space.
-    #[inline]
-    pub fn as_ptr(&self) -> *const [T] {
-        core::ptr::slice_from_raw_parts(self.cpu_addr.as_ptr(), self.count)
+impl<T: AsBytes + FromBytes> CoherentAllocation<T> {
+    /// Allocates a region of `size_of::<T> * count` of coherent memory.
+    ///
+    /// # Examples
+    ///
+    /// ```
+    /// # use kernel::device::{Bound, Device};
+    /// use kernel::dma::{attrs::*, CoherentAllocation};
+    ///
+    /// # fn test(dev: &Device<Bound>) -> Result {
+    /// let c: CoherentAllocation<u64> =
+    ///     CoherentAllocation::alloc_attrs(dev, 4, GFP_KERNEL, DMA_ATTR_NO_WARN)?;
+    /// # Ok::<(), Error>(()) }
+    /// ```
+    pub fn alloc_attrs(
+        dev: &device::Device<Bound>,
+        count: usize,
+        gfp_flags: kernel::alloc::Flags,
+        dma_attrs: Attrs,
+    ) -> Result<CoherentAllocation<T>> {
+        Coherent::alloc_slice_with_attrs(dev, count, gfp_flags, dma_attrs)
     }
 
-    /// Returns the raw pointer to the allocated region in the CPU's virtual address space as
-    /// a mutable pointer.
-    #[inline]
-    pub fn as_mut_ptr(&self) -> *mut [T] {
-        core::ptr::slice_from_raw_parts_mut(self.cpu_addr.as_ptr(), self.count)
+    /// Performs the same functionality as [`CoherentAllocation::alloc_attrs`], except the
+    /// `dma_attrs` is 0 by default.
+    pub fn alloc_coherent(
+        dev: &device::Device<Bound>,
+        count: usize,
+        gfp_flags: kernel::alloc::Flags,
+    ) -> Result<CoherentAllocation<T>> {
+        CoherentAllocation::alloc_attrs(dev, count, gfp_flags, Attrs(0))
     }
 
     /// Returns the base address to the allocated region in the CPU's virtual address space.
     pub fn start_ptr(&self) -> *const T {
-        self.cpu_addr.as_ptr()
+        self.as_ptr().cast()
     }
 
     /// Returns the base address to the allocated region in the CPU's virtual address space as
     /// a mutable pointer.
     pub fn start_ptr_mut(&mut self) -> *mut T {
-        self.cpu_addr.as_ptr()
-    }
-
-    /// Returns a DMA handle which may be given to the device as the DMA address base of
-    /// the region.
-    pub fn dma_handle(&self) -> DmaAddress {
-        self.dma_handle
+        self.as_mut_ptr().cast()
     }
 
     /// Returns a DMA handle starting at `offset` (in units of `T`) which may be given to the
@@ -504,11 +639,9 @@ pub fn dma_handle(&self) -> DmaAddress {
     ///
     /// Returns `EINVAL` if `offset` is not within the bounds of the allocation.
     pub fn dma_handle_with_offset(&self, offset: usize) -> Result<DmaAddress> {
-        if offset >= self.count {
+        if offset >= self.len() {
             Err(EINVAL)
         } else {
-            // INVARIANT: The type invariant of `Self` guarantees that `size_of::<T> * count` fits
-            // into a `usize`, and `offset` is inferior to `count`.
             Ok(self.dma_handle + (offset * core::mem::size_of::<T>()) as DmaAddress)
         }
     }
@@ -516,7 +649,7 @@ pub fn dma_handle_with_offset(&self, offset: usize) -> Result<DmaAddress> {
     /// Common helper to validate a range applied from the allocated region in the CPU's virtual
     /// address space.
     fn validate_range(&self, offset: usize, count: usize) -> Result {
-        if offset.checked_add(count).ok_or(EOVERFLOW)? > self.count {
+        if offset.checked_add(count).ok_or(EOVERFLOW)? > self.len() {
             return Err(EINVAL);
         }
         Ok(())
@@ -601,66 +734,20 @@ pub unsafe fn write(&mut self, src: &[T], offset: usize) -> Result {
         };
         Ok(())
     }
-
-    /// Reads the value of `field` and ensures that its type is [`FromBytes`].
-    ///
-    /// # Safety
-    ///
-    /// This must be called from the [`dma_read`] macro which ensures that the `field` pointer is
-    /// validated beforehand.
-    ///
-    /// Public but hidden since it should only be used from [`dma_read`] macro.
-    #[doc(hidden)]
-    pub unsafe fn field_read<F: FromBytes>(&self, field: *const F) -> F {
-        // SAFETY:
-        // - By the safety requirements field is valid.
-        // - Using read_volatile() here is not sound as per the usual rules, the usage here is
-        // a special exception with the following notes in place. When dealing with a potential
-        // race from a hardware or code outside kernel (e.g. user-space program), we need that
-        // read on a valid memory is not UB. Currently read_volatile() is used for this, and the
-        // rationale behind is that it should generate the same code as READ_ONCE() which the
-        // kernel already relies on to avoid UB on data races. Note that the usage of
-        // read_volatile() is limited to this particular case, it cannot be used to prevent
-        // the UB caused by racing between two kernel functions nor do they provide atomicity.
-        unsafe { field.read_volatile() }
-    }
-
-    /// Writes a value to `field` and ensures that its type is [`AsBytes`].
-    ///
-    /// # Safety
-    ///
-    /// This must be called from the [`dma_write`] macro which ensures that the `field` pointer is
-    /// validated beforehand.
-    ///
-    /// Public but hidden since it should only be used from [`dma_write`] macro.
-    #[doc(hidden)]
-    pub unsafe fn field_write<F: AsBytes>(&self, field: *mut F, val: F) {
-        // SAFETY:
-        // - By the safety requirements field is valid.
-        // - Using write_volatile() here is not sound as per the usual rules, the usage here is
-        // a special exception with the following notes in place. When dealing with a potential
-        // race from a hardware or code outside kernel (e.g. user-space program), we need that
-        // write on a valid memory is not UB. Currently write_volatile() is used for this, and the
-        // rationale behind is that it should generate the same code as WRITE_ONCE() which the
-        // kernel already relies on to avoid UB on data races. Note that the usage of
-        // write_volatile() is limited to this particular case, it cannot be used to prevent
-        // the UB caused by racing between two kernel functions nor do they provide atomicity.
-        unsafe { field.write_volatile(val) }
-    }
 }
 
 /// Note that the device configured to do DMA must be halted before this object is dropped.
-impl<T: AsBytes + FromBytes> Drop for CoherentAllocation<T> {
+impl<T: AsBytes + FromBytes + KnownSize + ?Sized> Drop for Coherent<T> {
     fn drop(&mut self) {
-        let size = self.count * core::mem::size_of::<T>();
+        let size = T::size(self.cpu_addr.as_ptr());
         // SAFETY: Device pointer is guaranteed as valid by the type invariant on `Device`.
         // The cpu address, and the dma handle are valid due to the type invariants on
-        // `CoherentAllocation`.
+        // `Coherent`.
         unsafe {
             bindings::dma_free_attrs(
                 self.dev.as_raw(),
                 size,
-                self.start_ptr_mut().cast(),
+                self.cpu_addr.as_ptr().cast(),
                 self.dma_handle,
                 self.dma_attrs.as_raw(),
             )
@@ -668,20 +755,20 @@ fn drop(&mut self) {
     }
 }
 
-// SAFETY: It is safe to send a `CoherentAllocation` to another thread if `T`
+// SAFETY: It is safe to send a `Coherent` to another thread if `T`
 // can be sent to another thread.
-unsafe impl<T: AsBytes + FromBytes + Send> Send for CoherentAllocation<T> {}
+unsafe impl<T: AsBytes + FromBytes + KnownSize + Send + ?Sized> Send for Coherent<T> {}
 
 /// Reads a field of an item from an allocated region of structs.
 ///
 /// The syntax is of form `kernel::dma_read!(dma, proj)` where `dma` is an expression to an
-/// [`CoherentAllocation`] and `proj` is a [projection specification](kernel::ptr::project!).
+/// [`Coherent`] and `proj` is a [projection specification](kernel::ptr::project!).
 ///
 /// # Examples
 ///
 /// ```
 /// use kernel::device::Device;
-/// use kernel::dma::{attrs::*, CoherentAllocation};
+/// use kernel::dma::{attrs::*, Coherent};
 ///
 /// struct MyStruct { field: u32, }
 ///
@@ -690,7 +777,7 @@ unsafe impl<T: AsBytes + FromBytes + Send> Send for CoherentAllocation<T> {}
 /// // SAFETY: Instances of `MyStruct` have no uninitialized portions.
 /// unsafe impl kernel::transmute::AsBytes for MyStruct{};
 ///
-/// # fn test(alloc: &kernel::dma::CoherentAllocation<MyStruct>) -> Result {
+/// # fn test(alloc: &kernel::dma::Coherent<[MyStruct]>) -> Result {
 /// let whole = kernel::dma_read!(alloc, [2]?);
 /// let field = kernel::dma_read!(alloc, [1]?.field);
 /// # Ok::<(), Error>(()) }
@@ -700,17 +787,17 @@ macro_rules! dma_read {
     ($dma:expr, $($proj:tt)*) => {{
         let dma = &$dma;
         let ptr = $crate::ptr::project!(
-            $crate::dma::CoherentAllocation::as_ptr(dma), $($proj)*
+            $crate::dma::Coherent::as_ptr(dma), $($proj)*
         );
         // SAFETY: pointer created by projection is within DMA region.
-        unsafe { $crate::dma::CoherentAllocation::field_read(dma, ptr) }
+        unsafe { $crate::dma::Coherent::field_read(dma, ptr) }
     }};
 }
 
 /// Writes to a field of an item from an allocated region of structs.
 ///
 /// The syntax is of form `kernel::dma_write!(dma, proj, val)` where `dma` is an expression to an
-/// [`CoherentAllocation`] and `proj` is a [projection specification](kernel::ptr::project!),
+/// [`Coherent`] and `proj` is a [projection specification](kernel::ptr::project!),
 /// and `val` is the value to be written to the projected location.
 ///
 ///
@@ -718,7 +805,7 @@ macro_rules! dma_read {
 ///
 /// ```
 /// use kernel::device::Device;
-/// use kernel::dma::{attrs::*, CoherentAllocation};
+/// use kernel::dma::{attrs::*, Coherent};
 ///
 /// struct MyStruct { member: u32, }
 ///
@@ -727,7 +814,7 @@ macro_rules! dma_read {
 /// // SAFETY: Instances of `MyStruct` have no uninitialized portions.
 /// unsafe impl kernel::transmute::AsBytes for MyStruct{};
 ///
-/// # fn test(alloc: &kernel::dma::CoherentAllocation<MyStruct>) -> Result {
+/// # fn test(alloc: &kernel::dma::Coherent<[MyStruct]>) -> Result {
 /// kernel::dma_write!(alloc, [2]?.member, 0xf);
 /// kernel::dma_write!(alloc, [1]?, MyStruct { member: 0xf });
 /// # Ok::<(), Error>(()) }
@@ -737,11 +824,11 @@ macro_rules! dma_write {
     (@parse [$dma:expr] [$($proj:tt)*] [, $val:expr]) => {{
         let dma = &$dma;
         let ptr = $crate::ptr::project!(
-            mut $crate::dma::CoherentAllocation::as_mut_ptr(dma), $($proj)*
+            mut $crate::dma::Coherent::as_mut_ptr(dma), $($proj)*
         );
         let val = $val;
         // SAFETY: pointer created by projection is within DMA region.
-        unsafe { $crate::dma::CoherentAllocation::field_write(dma, ptr, val) }
+        unsafe { $crate::dma::Coherent::field_write(dma, ptr, val) }
     }};
     (@parse [$dma:expr] [$($proj:tt)*] [.$field:tt $($rest:tt)*]) => {
         $crate::dma_write!(@parse [$dma] [$($proj)* .$field] [$($rest)*])
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 3/8] rust: dma: add zeroed constructor to `Coherent`
  2026-03-03 16:22 ` [PATCH 0/8] dma::Coherent & dma::CoherentInit API Danilo Krummrich
  2026-03-03 16:22   ` [PATCH 1/8] rust: dma: use "kernel vertical" style for imports Danilo Krummrich
  2026-03-03 16:22   ` [PATCH 2/8] rust: dma: add generalized container for types other than slices Danilo Krummrich
@ 2026-03-03 16:22   ` Danilo Krummrich
  2026-03-17  6:47     ` Alice Ryhl
  2026-03-20 14:35     ` Alexandre Courbot
  2026-03-03 16:22   ` [PATCH 4/8] rust: dma: introduce dma::CoherentInit for memory initialization Danilo Krummrich
                     ` (5 subsequent siblings)
  8 siblings, 2 replies; 30+ messages in thread
From: Danilo Krummrich @ 2026-03-03 16:22 UTC (permalink / raw)
  To: aliceryhl, acourbot, ojeda, boqun, gary, bjorn3_gh, lossin,
	a.hindborg, tmgross, abdiel.janulgue, daniel.almeida,
	robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel,
	Danilo Krummrich

From: Gary Guo <gary@garyguo.net>

These constructors create a coherent container of a single object
instead of slice. They are named `zeroed` and `zeroed_with_attrs` to
emphasis that they are created initialized zeroed. It is intended that
there'll be new constructors that take `PinInit` instead of zeroing.

Signed-off-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
---
 rust/kernel/dma.rs       | 77 +++++++++++++++++++++++++++++++++++-----
 samples/rust/rust_dma.rs |  8 ++---
 2 files changed, 73 insertions(+), 12 deletions(-)

diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
index 3a692ac76ba2..291fdea3b52b 100644
--- a/rust/kernel/dma.rs
+++ b/rust/kernel/dma.rs
@@ -47,7 +47,7 @@ pub trait Device: AsRef<device::Device<Core>> {
     /// # Safety
     ///
     /// This method must not be called concurrently with any DMA allocation or mapping primitives,
-    /// such as [`CoherentAllocation::alloc_attrs`].
+    /// such as [`Coherent::zeroed`].
     unsafe fn dma_set_mask(&self, mask: DmaMask) -> Result {
         // SAFETY:
         // - By the type invariant of `device::Device`, `self.as_ref().as_raw()` is valid.
@@ -64,7 +64,7 @@ unsafe fn dma_set_mask(&self, mask: DmaMask) -> Result {
     /// # Safety
     ///
     /// This method must not be called concurrently with any DMA allocation or mapping primitives,
-    /// such as [`CoherentAllocation::alloc_attrs`].
+    /// such as [`Coherent::zeroed`].
     unsafe fn dma_set_coherent_mask(&self, mask: DmaMask) -> Result {
         // SAFETY:
         // - By the type invariant of `device::Device`, `self.as_ref().as_raw()` is valid.
@@ -83,7 +83,7 @@ unsafe fn dma_set_coherent_mask(&self, mask: DmaMask) -> Result {
     /// # Safety
     ///
     /// This method must not be called concurrently with any DMA allocation or mapping primitives,
-    /// such as [`CoherentAllocation::alloc_attrs`].
+    /// such as [`Coherent::zeroed`].
     unsafe fn dma_set_mask_and_coherent(&self, mask: DmaMask) -> Result {
         // SAFETY:
         // - By the type invariant of `device::Device`, `self.as_ref().as_raw()` is valid.
@@ -102,7 +102,7 @@ unsafe fn dma_set_mask_and_coherent(&self, mask: DmaMask) -> Result {
     /// # Safety
     ///
     /// This method must not be called concurrently with any DMA allocation or mapping primitives,
-    /// such as [`CoherentAllocation::alloc_attrs`].
+    /// such as [`Coherent::zeroed`].
     unsafe fn dma_set_max_seg_size(&self, size: u32) {
         // SAFETY:
         // - By the type invariant of `device::Device`, `self.as_ref().as_raw()` is valid.
@@ -202,12 +202,12 @@ pub const fn value(&self) -> u64 {
 ///
 /// ```
 /// # use kernel::device::{Bound, Device};
-/// use kernel::dma::{attrs::*, CoherentAllocation};
+/// use kernel::dma::{attrs::*, Coherent};
 ///
 /// # fn test(dev: &Device<Bound>) -> Result {
 /// let attribs = DMA_ATTR_FORCE_CONTIGUOUS | DMA_ATTR_NO_WARN;
-/// let c: CoherentAllocation<u64> =
-///     CoherentAllocation::alloc_attrs(dev, 4, GFP_KERNEL, attribs)?;
+/// let c: Coherent<[u64]> =
+///     Coherent::zeroed_slice_with_attrs(dev, 4, GFP_KERNEL, attribs)?;
 /// # Ok::<(), Error>(()) }
 /// ```
 #[derive(Clone, Copy, PartialEq)]
@@ -492,7 +492,6 @@ pub unsafe fn field_write<F: AsBytes>(&self, field: *mut F, val: F) {
 
 impl<T: AsBytes + FromBytes> Coherent<T> {
     /// Allocates a region of `T` of coherent memory.
-    #[expect(unused)]
     fn alloc_with_attrs(
         dev: &device::Device<Bound>,
         gfp_flags: kernel::alloc::Flags,
@@ -529,6 +528,33 @@ fn alloc_with_attrs(
         })
     }
 
+    /// Allocates a region of type `T` of coherent memory.
+    ///
+    /// # Examples
+    ///
+    /// ```
+    /// # use kernel::device::{Bound, Device};
+    /// use kernel::dma::{attrs::*, Coherent};
+    ///
+    /// # fn test(dev: &Device<Bound>) -> Result {
+    /// let c: Coherent<[u64; 4]> =
+    ///     Coherent::zeroed_with_attrs(dev, GFP_KERNEL, DMA_ATTR_NO_WARN)?;
+    /// # Ok::<(), Error>(()) }
+    /// ```
+    pub fn zeroed_with_attrs(
+        dev: &device::Device<Bound>,
+        gfp_flags: kernel::alloc::Flags,
+        dma_attrs: Attrs,
+    ) -> Result<Self> {
+        Self::alloc_with_attrs(dev, gfp_flags | __GFP_ZERO, dma_attrs)
+    }
+
+    /// Performs the same functionality as [`Coherent::zeroed_with_attrs`], except the
+    /// `dma_attrs` is 0 by default.
+    pub fn zeroed(dev: &device::Device<Bound>, gfp_flags: kernel::alloc::Flags) -> Result<Self> {
+        Self::zeroed_with_attrs(dev, gfp_flags, Attrs(0))
+    }
+
     /// Allocates a region of `[T; len]` of coherent memory.
     fn alloc_slice_with_attrs(
         dev: &device::Device<Bound>,
@@ -572,6 +598,41 @@ fn alloc_slice_with_attrs(
             dma_attrs,
         })
     }
+
+    /// Allocates a zeroed region of type `T` of coherent memory.
+    ///
+    /// Unlike `Coherent::<[T; N]>::zeroed_with_attrs`, `Coherent::<T>::zeored_slices` support
+    /// a runtime length.
+    ///
+    /// # Examples
+    ///
+    /// ```
+    /// # use kernel::device::{Bound, Device};
+    /// use kernel::dma::{attrs::*, Coherent};
+    ///
+    /// # fn test(dev: &Device<Bound>) -> Result {
+    /// let c: Coherent<[u64]> =
+    ///     Coherent::zeroed_slice_with_attrs(dev, 4, GFP_KERNEL, DMA_ATTR_NO_WARN)?;
+    /// # Ok::<(), Error>(()) }
+    /// ```
+    pub fn zeroed_slice_with_attrs(
+        dev: &device::Device<Bound>,
+        len: usize,
+        gfp_flags: kernel::alloc::Flags,
+        dma_attrs: Attrs,
+    ) -> Result<Coherent<[T]>> {
+        Coherent::alloc_slice_with_attrs(dev, len, gfp_flags | __GFP_ZERO, dma_attrs)
+    }
+
+    /// Performs the same functionality as [`Coherent::zeroed_slice_with_attrs`], except the
+    /// `dma_attrs` is 0 by default.
+    pub fn zeroed_slice(
+        dev: &device::Device<Bound>,
+        len: usize,
+        gfp_flags: kernel::alloc::Flags,
+    ) -> Result<Coherent<[T]>> {
+        Self::zeroed_slice_with_attrs(dev, len, gfp_flags, Attrs(0))
+    }
 }
 
 impl<T: AsBytes + FromBytes> Coherent<[T]> {
diff --git a/samples/rust/rust_dma.rs b/samples/rust/rust_dma.rs
index ce39b5545097..314ef51cd86c 100644
--- a/samples/rust/rust_dma.rs
+++ b/samples/rust/rust_dma.rs
@@ -6,7 +6,7 @@
 
 use kernel::{
     device::Core,
-    dma::{CoherentAllocation, DataDirection, Device, DmaMask},
+    dma::{Coherent, DataDirection, Device, DmaMask},
     page, pci,
     prelude::*,
     scatterlist::{Owned, SGTable},
@@ -16,7 +16,7 @@
 #[pin_data(PinnedDrop)]
 struct DmaSampleDriver {
     pdev: ARef<pci::Device>,
-    ca: CoherentAllocation<MyStruct>,
+    ca: Coherent<[MyStruct]>,
     #[pin]
     sgt: SGTable<Owned<VVec<u8>>>,
 }
@@ -64,8 +64,8 @@ fn probe(pdev: &pci::Device<Core>, _info: &Self::IdInfo) -> impl PinInit<Self, E
             // SAFETY: There are no concurrent calls to DMA allocation and mapping primitives.
             unsafe { pdev.dma_set_mask_and_coherent(mask)? };
 
-            let ca: CoherentAllocation<MyStruct> =
-                CoherentAllocation::alloc_coherent(pdev.as_ref(), TEST_VALUES.len(), GFP_KERNEL)?;
+            let ca: Coherent<[MyStruct]> =
+                Coherent::zeroed_slice(pdev.as_ref(), TEST_VALUES.len(), GFP_KERNEL)?;
 
             for (i, value) in TEST_VALUES.into_iter().enumerate() {
                 kernel::dma_write!(ca, [i]?, MyStruct::new(value.0, value.1));
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 4/8] rust: dma: introduce dma::CoherentInit for memory initialization
  2026-03-03 16:22 ` [PATCH 0/8] dma::Coherent & dma::CoherentInit API Danilo Krummrich
                     ` (2 preceding siblings ...)
  2026-03-03 16:22   ` [PATCH 3/8] rust: dma: add zeroed constructor to `Coherent` Danilo Krummrich
@ 2026-03-03 16:22   ` Danilo Krummrich
  2026-03-12 13:51     ` Gary Guo
                       ` (2 more replies)
  2026-03-03 16:22   ` [PATCH 5/8] rust: dma: add Coherent:init() and Coherent::init_with_attrs() Danilo Krummrich
                     ` (4 subsequent siblings)
  8 siblings, 3 replies; 30+ messages in thread
From: Danilo Krummrich @ 2026-03-03 16:22 UTC (permalink / raw)
  To: aliceryhl, acourbot, ojeda, boqun, gary, bjorn3_gh, lossin,
	a.hindborg, tmgross, abdiel.janulgue, daniel.almeida,
	robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel,
	Danilo Krummrich

Currently, dma::Coherent cannot safely provide (mutable) access to its
underlying memory because the memory might be concurrently accessed by a
DMA device. This makes it difficult to safely initialize the memory
before handing it over to the hardware.

Introduce dma::CoherentInit, a type that encapsulates a dma::Coherent
before its DMA address is exposed to the device. dma::CoherentInit can
guarantee exclusive access to the inner dma::Coherent and implement
Deref and DerefMut.

Once the memory is properly initialized, dma::CoherentInit can be
converted into a regular dma::Coherent.

Signed-off-by: Danilo Krummrich <dakr@kernel.org>
---
 rust/kernel/dma.rs | 153 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 152 insertions(+), 1 deletion(-)

diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
index 291fdea3b52b..79dd8717ac47 100644
--- a/rust/kernel/dma.rs
+++ b/rust/kernel/dma.rs
@@ -20,7 +20,13 @@
         FromBytes, //
     }, //
 };
-use core::ptr::NonNull;
+use core::{
+    ops::{
+        Deref,
+        DerefMut, //
+    },
+    ptr::NonNull, //
+};
 
 /// DMA address type.
 ///
@@ -352,6 +358,151 @@ fn from(direction: DataDirection) -> Self {
     }
 }
 
+/// Initializer type for [`Coherent`].
+///
+/// A [`Coherent`] object can't provide access to its memory as (mutable) slice safely, since it
+/// can't fulfill the requirements for creating a slice. For instance, it is not valid to have a
+/// (mutable) slice to of the memory of a [`Coherent`] while the memory might be accessed by a
+/// device.
+///
+/// In contrast, this initializer type is able to fulfill the requirements to safely obtain a
+/// (mutable) slice, as it neither provides access to the DMA address of the embedded [`Coherent`],
+/// nor can it be used with the DMA projection accessors.
+///
+/// Once initialized, this type can be converted to a regular [`Coherent`] object.
+///
+/// # Examples
+///
+/// `CoherentInit<T>`:
+///
+/// ```
+/// # use kernel::device::{
+/// #     Bound,
+/// #     Device,
+/// # };
+/// use kernel::dma::{attrs::*,
+///     Coherent,
+///     CoherentInit,
+/// };
+///
+/// # fn test(dev: &Device<Bound>) -> Result {
+/// let mut dmem: CoherentInit<u64> =
+///     CoherentInit::zeroed_with_attrs(dev, GFP_KERNEL, DMA_ATTR_NO_WARN)?;
+/// *dmem = 42;
+/// let dmem: Coherent<u64> = dmem.into();
+/// # Ok::<(), Error>(()) }
+/// ```
+///
+/// `CoherentInit<[T]>`:
+///
+///
+/// ```
+/// # use kernel::device::{
+/// #     Bound,
+/// #     Device,
+/// # };
+/// use kernel::dma::{attrs::*,
+///     Coherent,
+///     CoherentInit,
+/// };
+///
+/// # fn test(dev: &Device<Bound>) -> Result {
+/// let mut dmem: CoherentInit<[u64]> =
+///     CoherentInit::zeroed_slice_with_attrs(dev, 4, GFP_KERNEL, DMA_ATTR_NO_WARN)?;
+/// dmem.fill(42);
+/// let dmem: Coherent<[u64]> = dmem.into();
+/// # Ok::<(), Error>(()) }
+/// ```
+pub struct CoherentInit<T: AsBytes + FromBytes + KnownSize + ?Sized>(Coherent<T>);
+
+impl<T: AsBytes + FromBytes> CoherentInit<[T]> {
+    /// Initializer variant of [`Coherent::zeroed_slice_with_attrs`].
+    pub fn zeroed_slice_with_attrs(
+        dev: &device::Device<Bound>,
+        count: usize,
+        gfp_flags: kernel::alloc::Flags,
+        dma_attrs: Attrs,
+    ) -> Result<Self> {
+        Coherent::zeroed_slice_with_attrs(dev, count, gfp_flags, dma_attrs).map(Self)
+    }
+
+    /// Same as [CoherentInit::zeroed_slice_with_attrs], but with `dma::Attrs(0)`.
+    pub fn zeroed_slice(
+        dev: &device::Device<Bound>,
+        count: usize,
+        gfp_flags: kernel::alloc::Flags,
+    ) -> Result<Self> {
+        Self::zeroed_slice_with_attrs(dev, count, gfp_flags, Attrs(0))
+    }
+
+    /// Initializes the element at `i` using the given initializer.
+    ///
+    /// Returns `EINVAL` if `i` is out of bounds.
+    pub fn init_at<E>(&mut self, i: usize, init: impl Init<T, E>) -> Result
+    where
+        Error: From<E>,
+    {
+        if i >= self.0.len() {
+            return Err(EINVAL);
+        }
+
+        let ptr = core::ptr::from_mut(&mut self[i]);
+
+        // SAFETY:
+        // - `ptr` is valid, properly aligned, and within this allocation.
+        // - `T: AsBytes + FromBytes` guarantees all bit patterns are valid, so partial writes on
+        //   error cannot leave the element in an invalid state.
+        // - The DMA address has not been exposed yet, so there is no concurrent device access.
+        unsafe { init.__init(ptr)? };
+
+        Ok(())
+    }
+}
+
+impl<T: AsBytes + FromBytes> CoherentInit<T> {
+    /// Same as [`CoherentInit::zeroed_slice_with_attrs`], but for a single element.
+    pub fn zeroed_with_attrs(
+        dev: &device::Device<Bound>,
+        gfp_flags: kernel::alloc::Flags,
+        dma_attrs: Attrs,
+    ) -> Result<Self> {
+        Coherent::zeroed_with_attrs(dev, gfp_flags, dma_attrs).map(Self)
+    }
+
+    /// Same as [`CoherentInit::zeroed_slice`], but for a single element.
+    pub fn zeroed(dev: &device::Device<Bound>, gfp_flags: kernel::alloc::Flags) -> Result<Self> {
+        Self::zeroed_with_attrs(dev, gfp_flags, Attrs(0))
+    }
+}
+
+impl<T: AsBytes + FromBytes + KnownSize + ?Sized> Deref for CoherentInit<T> {
+    type Target = T;
+
+    fn deref(&self) -> &Self::Target {
+        // SAFETY:
+        // - We have not exposed the DMA address yet, so there can't be any concurrent access by a
+        //   device.
+        // - We have exclusive access to `self.0`.
+        unsafe { self.0.as_ref() }
+    }
+}
+
+impl<T: AsBytes + FromBytes + KnownSize + ?Sized> DerefMut for CoherentInit<T> {
+    fn deref_mut(&mut self) -> &mut Self::Target {
+        // SAFETY:
+        // - We have not exposed the DMA address yet, so there can't be any concurrent access by a
+        //   device.
+        // - We have exclusive access to `self.0`.
+        unsafe { self.0.as_mut() }
+    }
+}
+
+impl<T: AsBytes + FromBytes + KnownSize + ?Sized> From<CoherentInit<T>> for Coherent<T> {
+    fn from(value: CoherentInit<T>) -> Self {
+        value.0
+    }
+}
+
 /// An abstraction of the `dma_alloc_coherent` API.
 ///
 /// This is an abstraction around the `dma_alloc_coherent` API which is used to allocate and map
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 5/8] rust: dma: add Coherent:init() and Coherent::init_with_attrs()
  2026-03-03 16:22 ` [PATCH 0/8] dma::Coherent & dma::CoherentInit API Danilo Krummrich
                     ` (3 preceding siblings ...)
  2026-03-03 16:22   ` [PATCH 4/8] rust: dma: introduce dma::CoherentInit for memory initialization Danilo Krummrich
@ 2026-03-03 16:22   ` Danilo Krummrich
  2026-03-03 16:22   ` [PATCH 6/8] gpu: nova-core: use Coherent::init to initialize GspFwWprMeta Danilo Krummrich
                     ` (3 subsequent siblings)
  8 siblings, 0 replies; 30+ messages in thread
From: Danilo Krummrich @ 2026-03-03 16:22 UTC (permalink / raw)
  To: aliceryhl, acourbot, ojeda, boqun, gary, bjorn3_gh, lossin,
	a.hindborg, tmgross, abdiel.janulgue, daniel.almeida,
	robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel,
	Danilo Krummrich

Analogous to Coherent::zeroed() and Coherent::zeroed_with_attrs(), add
Coherent:init() and Coherent::init_with_attrs() which both take an impl
Init<T, E> argument initializing the DMA coherent memory.

Compared to CoherentInit, Coherent::init() is a one-shot constructor
that runs an Init closure and immediately exposes the DMA handle,
whereas CoherentInit is a multi-stage initializer that provides safe
&mut T access by withholding the DMA address until converted to
Coherent.

Signed-off-by: Danilo Krummrich <dakr@kernel.org>
---
 rust/kernel/dma.rs | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
index 79dd8717ac47..d77c1b6cb0c4 100644
--- a/rust/kernel/dma.rs
+++ b/rust/kernel/dma.rs
@@ -706,6 +706,43 @@ pub fn zeroed(dev: &device::Device<Bound>, gfp_flags: kernel::alloc::Flags) -> R
         Self::zeroed_with_attrs(dev, gfp_flags, Attrs(0))
     }
 
+    /// Same as [`Coherent::zeroed_with_attrs`], but instead of a zero-initialization the memory is
+    /// initialized with `init`.
+    pub fn init_with_attrs<E>(
+        dev: &device::Device<Bound>,
+        gfp_flags: kernel::alloc::Flags,
+        dma_attrs: Attrs,
+        init: impl Init<T, E>,
+    ) -> Result<Self>
+    where
+        Error: From<E>,
+    {
+        let dmem = Self::alloc_with_attrs(dev, gfp_flags, dma_attrs)?;
+        let ptr = dmem.as_mut_ptr();
+
+        // SAFETY:
+        // - `ptr` is valid, properly aligned, and points to exclusively owned memory.
+        // - If `__init` fails, `self` is dropped, which safely frees the underlying `Coherent`'s
+        //   DMA memory. `T: AsBytes + FromBytes` ensures there are no complex `Drop` requirements
+        //   we are bypassing.
+        unsafe { init.__init(ptr)? };
+
+        Ok(dmem)
+    }
+
+    /// Same as [`Coherent::zeroed`], but instead of a zero-initialization the memory is initialized
+    /// with `init`.
+    pub fn init<E>(
+        dev: &device::Device<Bound>,
+        gfp_flags: kernel::alloc::Flags,
+        init: impl Init<T, E>,
+    ) -> Result<Self>
+    where
+        Error: From<E>,
+    {
+        Self::init_with_attrs(dev, gfp_flags, Attrs(0), init)
+    }
+
     /// Allocates a region of `[T; len]` of coherent memory.
     fn alloc_slice_with_attrs(
         dev: &device::Device<Bound>,
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 6/8] gpu: nova-core: use Coherent::init to initialize GspFwWprMeta
  2026-03-03 16:22 ` [PATCH 0/8] dma::Coherent & dma::CoherentInit API Danilo Krummrich
                     ` (4 preceding siblings ...)
  2026-03-03 16:22   ` [PATCH 5/8] rust: dma: add Coherent:init() and Coherent::init_with_attrs() Danilo Krummrich
@ 2026-03-03 16:22   ` Danilo Krummrich
  2026-03-03 17:21     ` Gary Guo
  2026-03-20 14:36     ` Alexandre Courbot
  2026-03-03 16:22   ` [PATCH 7/8] gpu: nova-core: convert Gsp::new() to use CoherentInit Danilo Krummrich
                     ` (2 subsequent siblings)
  8 siblings, 2 replies; 30+ messages in thread
From: Danilo Krummrich @ 2026-03-03 16:22 UTC (permalink / raw)
  To: aliceryhl, acourbot, ojeda, boqun, gary, bjorn3_gh, lossin,
	a.hindborg, tmgross, abdiel.janulgue, daniel.almeida,
	robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel,
	Danilo Krummrich

Convert wpr_meta to use Coherent::init() and simplify the
initialization.  It also avoids a separate initialization of
GspFwWprMeta on the stack.

Signed-off-by: Danilo Krummrich <dakr@kernel.org>
---
 drivers/gpu/nova-core/gsp/boot.rs |  7 ++-----
 drivers/gpu/nova-core/gsp/fw.rs   | 20 +++++++++++++++-----
 2 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
index 7f46fa5e9b50..1a4d9ee4f256 100644
--- a/drivers/gpu/nova-core/gsp/boot.rs
+++ b/drivers/gpu/nova-core/gsp/boot.rs
@@ -2,8 +2,7 @@
 
 use kernel::{
     device,
-    dma::CoherentAllocation,
-    dma_write,
+    dma::Coherent,
     io::poll::read_poll_timeout,
     pci,
     prelude::*,
@@ -155,9 +154,7 @@ pub(crate) fn boot(
             bar,
         )?;
 
-        let wpr_meta =
-            CoherentAllocation::<GspFwWprMeta>::alloc_coherent(dev, 1, GFP_KERNEL | __GFP_ZERO)?;
-        dma_write!(wpr_meta, [0]?, GspFwWprMeta::new(&gsp_fw, &fb_layout));
+        let wpr_meta = Coherent::init(dev, GFP_KERNEL, GspFwWprMeta::new(&gsp_fw, &fb_layout))?;
 
         self.cmdq
             .send_command(bar, commands::SetSystemInfo::new(pdev))?;
diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index f1797e1f0d9d..751d5447214d 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -131,7 +131,9 @@ pub(crate) fn wpr_heap_size(&self, chipset: Chipset, fb_size: u64) -> u64 {
 /// Structure passed to the GSP bootloader, containing the framebuffer layout as well as the DMA
 /// addresses of the GSP bootloader and firmware.
 #[repr(transparent)]
-pub(crate) struct GspFwWprMeta(bindings::GspFwWprMeta);
+pub(crate) struct GspFwWprMeta {
+    inner: bindings::GspFwWprMeta,
+}
 
 // SAFETY: Padding is explicit and does not contain uninitialized data.
 unsafe impl AsBytes for GspFwWprMeta {}
@@ -144,10 +146,14 @@ unsafe impl FromBytes for GspFwWprMeta {}
 type GspFwWprMetaBootInfo = bindings::GspFwWprMeta__bindgen_ty_1__bindgen_ty_1;
 
 impl GspFwWprMeta {
-    /// Fill in and return a `GspFwWprMeta` suitable for booting `gsp_firmware` using the
+    /// Returns an initializer for a `GspFwWprMeta` suitable for booting `gsp_firmware` using the
     /// `fb_layout` layout.
-    pub(crate) fn new(gsp_firmware: &GspFirmware, fb_layout: &FbLayout) -> Self {
-        Self(bindings::GspFwWprMeta {
+    pub(crate) fn new<'a>(
+        gsp_firmware: &'a GspFirmware,
+        fb_layout: &'a FbLayout,
+    ) -> impl Init<Self> + 'a {
+        #[allow(non_snake_case)]
+        let init_inner = init!(bindings::GspFwWprMeta {
             // CAST: we want to store the bits of `GSP_FW_WPR_META_MAGIC` unmodified.
             magic: bindings::GSP_FW_WPR_META_MAGIC as u64,
             revision: u64::from(bindings::GSP_FW_WPR_META_REVISION),
@@ -182,7 +188,11 @@ pub(crate) fn new(gsp_firmware: &GspFirmware, fb_layout: &FbLayout) -> Self {
             fbSize: fb_layout.fb.end - fb_layout.fb.start,
             vgaWorkspaceOffset: fb_layout.vga_workspace.start,
             vgaWorkspaceSize: fb_layout.vga_workspace.end - fb_layout.vga_workspace.start,
-            ..Default::default()
+            ..Zeroable::init_zeroed()
+        });
+
+        init!(GspFwWprMeta {
+            inner <- init_inner,
         })
     }
 }
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 7/8] gpu: nova-core: convert Gsp::new() to use CoherentInit
  2026-03-03 16:22 ` [PATCH 0/8] dma::Coherent & dma::CoherentInit API Danilo Krummrich
                     ` (5 preceding siblings ...)
  2026-03-03 16:22   ` [PATCH 6/8] gpu: nova-core: use Coherent::init to initialize GspFwWprMeta Danilo Krummrich
@ 2026-03-03 16:22   ` Danilo Krummrich
  2026-03-03 17:33     ` Gary Guo
  2026-03-03 16:22   ` [PATCH 8/8] gpu: nova-core: convert to new dma::Coherent API Danilo Krummrich
  2026-03-24 12:33   ` [PATCH 0/8] dma::Coherent & dma::CoherentInit API Andreas Hindborg
  8 siblings, 1 reply; 30+ messages in thread
From: Danilo Krummrich @ 2026-03-03 16:22 UTC (permalink / raw)
  To: aliceryhl, acourbot, ojeda, boqun, gary, bjorn3_gh, lossin,
	a.hindborg, tmgross, abdiel.janulgue, daniel.almeida,
	robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel,
	Danilo Krummrich

Convert libos (LibosMemoryRegionInitArgument) and rmargs
(GspArgumentsPadded) to use CoherentInit / Coherent::init() and simplify
the initialization. This also avoids separate initialization on the
stack.

Signed-off-by: Danilo Krummrich <dakr@kernel.org>
---
 drivers/gpu/nova-core/gsp.rs    | 47 +++++++++++--------------
 drivers/gpu/nova-core/gsp/fw.rs | 62 +++++++++++++++++++++++----------
 2 files changed, 65 insertions(+), 44 deletions(-)

diff --git a/drivers/gpu/nova-core/gsp.rs b/drivers/gpu/nova-core/gsp.rs
index 25cd48514c77..cb7f6b4dc0f8 100644
--- a/drivers/gpu/nova-core/gsp.rs
+++ b/drivers/gpu/nova-core/gsp.rs
@@ -5,10 +5,11 @@
 use kernel::{
     device,
     dma::{
+        Coherent,
         CoherentAllocation,
+        CoherentInit,
         DmaAddress, //
     },
-    dma_write,
     pci,
     prelude::*,
     transmute::AsBytes, //
@@ -104,7 +105,7 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
 #[pin_data]
 pub(crate) struct Gsp {
     /// Libos arguments.
-    pub(crate) libos: CoherentAllocation<LibosMemoryRegionInitArgument>,
+    pub(crate) libos: Coherent<[LibosMemoryRegionInitArgument]>,
     /// Init log buffer.
     loginit: LogBuffer,
     /// Interrupts log buffer.
@@ -114,7 +115,7 @@ pub(crate) struct Gsp {
     /// Command queue.
     pub(crate) cmdq: Cmdq,
     /// RM arguments.
-    rmargs: CoherentAllocation<GspArgumentsPadded>,
+    rmargs: Coherent<GspArgumentsPadded>,
 }
 
 impl Gsp {
@@ -123,34 +124,28 @@ pub(crate) fn new(pdev: &pci::Device<device::Bound>) -> impl PinInit<Self, Error
         pin_init::pin_init_scope(move || {
             let dev = pdev.as_ref();
 
+            // Initialise the logging structures. The OpenRM equivalents are in:
+            // _kgspInitLibosLoggingStructures (allocates memory for buffers)
+            // kgspSetupLibosInitArgs_IMPL (creates pLibosInitArgs[] array)
             Ok(try_pin_init!(Self {
-                libos: CoherentAllocation::<LibosMemoryRegionInitArgument>::alloc_coherent(
-                    dev,
-                    GSP_PAGE_SIZE / size_of::<LibosMemoryRegionInitArgument>(),
-                    GFP_KERNEL | __GFP_ZERO,
-                )?,
                 loginit: LogBuffer::new(dev)?,
                 logintr: LogBuffer::new(dev)?,
                 logrm: LogBuffer::new(dev)?,
                 cmdq: Cmdq::new(dev)?,
-                rmargs: CoherentAllocation::<GspArgumentsPadded>::alloc_coherent(
-                    dev,
-                    1,
-                    GFP_KERNEL | __GFP_ZERO,
-                )?,
-                _: {
-                    // Initialise the logging structures. The OpenRM equivalents are in:
-                    // _kgspInitLibosLoggingStructures (allocates memory for buffers)
-                    // kgspSetupLibosInitArgs_IMPL (creates pLibosInitArgs[] array)
-                    dma_write!(
-                        libos, [0]?, LibosMemoryRegionInitArgument::new("LOGINIT", &loginit.0)
-                    );
-                    dma_write!(
-                        libos, [1]?, LibosMemoryRegionInitArgument::new("LOGINTR", &logintr.0)
-                    );
-                    dma_write!(libos, [2]?, LibosMemoryRegionInitArgument::new("LOGRM", &logrm.0));
-                    dma_write!(rmargs, [0]?.inner, fw::GspArgumentsCached::new(cmdq));
-                    dma_write!(libos, [3]?, LibosMemoryRegionInitArgument::new("RMARGS", rmargs));
+                rmargs: Coherent::init(dev, GFP_KERNEL, GspArgumentsPadded::new(cmdq))?,
+                libos: {
+                    let mut libos = CoherentInit::zeroed_slice(
+                        dev,
+                        GSP_PAGE_SIZE / size_of::<LibosMemoryRegionInitArgument>(),
+                        GFP_KERNEL,
+                    )?;
+
+                    libos.init_at(0, LibosMemoryRegionInitArgument::new("LOGINIT", &loginit.0))?;
+                    libos.init_at(1, LibosMemoryRegionInitArgument::new("LOGINTR", &logintr.0))?;
+                    libos.init_at(2, LibosMemoryRegionInitArgument::new("LOGRM", &logrm.0))?;
+                    libos.init_at(3, LibosMemoryRegionInitArgument::new("RMARGS", rmargs))?;
+
+                    libos.into()
                 },
             }))
         })
diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index 751d5447214d..59cb03a9b238 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -9,11 +9,12 @@
 use core::ops::Range;
 
 use kernel::{
-    dma::CoherentAllocation,
+    dma::Coherent,
     prelude::*,
     ptr::{
         Alignable,
-        Alignment, //
+        Alignment,
+        KnownSize, //
     },
     sizes::{
         SZ_128K,
@@ -568,7 +569,9 @@ unsafe impl AsBytes for RunCpuSequencer {}
 /// The memory allocated for the arguments must remain until the GSP sends the
 /// init_done RPC.
 #[repr(transparent)]
-pub(crate) struct LibosMemoryRegionInitArgument(bindings::LibosMemoryRegionInitArgument);
+pub(crate) struct LibosMemoryRegionInitArgument {
+    inner: bindings::LibosMemoryRegionInitArgument,
+}
 
 // SAFETY: Padding is explicit and does not contain uninitialized data.
 unsafe impl AsBytes for LibosMemoryRegionInitArgument {}
@@ -578,10 +581,10 @@ unsafe impl AsBytes for LibosMemoryRegionInitArgument {}
 unsafe impl FromBytes for LibosMemoryRegionInitArgument {}
 
 impl LibosMemoryRegionInitArgument {
-    pub(crate) fn new<A: AsBytes + FromBytes>(
+    pub(crate) fn new<'a, A: AsBytes + FromBytes + KnownSize + ?Sized>(
         name: &'static str,
-        obj: &CoherentAllocation<A>,
-    ) -> Self {
+        obj: &'a Coherent<A>,
+    ) -> impl Init<Self> + 'a {
         /// Generates the `ID8` identifier required for some GSP objects.
         fn id8(name: &str) -> u64 {
             let mut bytes = [0u8; core::mem::size_of::<u64>()];
@@ -593,7 +596,8 @@ fn id8(name: &str) -> u64 {
             u64::from_ne_bytes(bytes)
         }
 
-        Self(bindings::LibosMemoryRegionInitArgument {
+        #[allow(non_snake_case)]
+        let init_inner = init!(bindings::LibosMemoryRegionInitArgument {
             id8: id8(name),
             pa: obj.dma_handle(),
             size: num::usize_as_u64(obj.size()),
@@ -603,7 +607,11 @@ fn id8(name: &str) -> u64 {
             loc: num::u32_into_u8::<
                 { bindings::LibosMemoryRegionLoc_LIBOS_MEMORY_REGION_LOC_SYSMEM },
             >(),
-            ..Default::default()
+            ..Zeroable::init_zeroed()
+        });
+
+        init!(LibosMemoryRegionInitArgument {
+            inner <- init_inner,
         })
     }
 }
@@ -814,15 +822,23 @@ unsafe impl FromBytes for GspMsgElement {}
 
 /// Arguments for GSP startup.
 #[repr(transparent)]
-pub(crate) struct GspArgumentsCached(bindings::GSP_ARGUMENTS_CACHED);
+#[derive(Zeroable)]
+pub(crate) struct GspArgumentsCached {
+    inner: bindings::GSP_ARGUMENTS_CACHED,
+}
 
 impl GspArgumentsCached {
     /// Creates the arguments for starting the GSP up using `cmdq` as its command queue.
-    pub(crate) fn new(cmdq: &Cmdq) -> Self {
-        Self(bindings::GSP_ARGUMENTS_CACHED {
-            messageQueueInitArguments: MessageQueueInitArguments::new(cmdq).0,
+    pub(crate) fn new(cmdq: &Cmdq) -> impl Init<Self> + '_ {
+        #[allow(non_snake_case)]
+        let init_inner = init!(bindings::GSP_ARGUMENTS_CACHED {
+            messageQueueInitArguments <- MessageQueueInitArguments::new(cmdq),
             bDmemStack: 1,
-            ..Default::default()
+            ..Zeroable::init_zeroed()
+        });
+
+        init!(GspArgumentsCached {
+            inner <- init_inner,
         })
     }
 }
@@ -834,11 +850,21 @@ unsafe impl AsBytes for GspArgumentsCached {}
 /// must all be a multiple of GSP_PAGE_SIZE in size, so add padding to force it
 /// to that size.
 #[repr(C)]
+#[derive(Zeroable)]
 pub(crate) struct GspArgumentsPadded {
     pub(crate) inner: GspArgumentsCached,
     _padding: [u8; GSP_PAGE_SIZE - core::mem::size_of::<bindings::GSP_ARGUMENTS_CACHED>()],
 }
 
+impl GspArgumentsPadded {
+    pub(crate) fn new(cmdq: &Cmdq) -> impl Init<Self> + '_ {
+        init!(GspArgumentsPadded {
+            inner <- GspArgumentsCached::new(cmdq),
+            ..Zeroable::init_zeroed()
+        })
+    }
+}
+
 // SAFETY: Padding is explicit and will not contain uninitialized data.
 unsafe impl AsBytes for GspArgumentsPadded {}
 
@@ -847,18 +873,18 @@ unsafe impl AsBytes for GspArgumentsPadded {}
 unsafe impl FromBytes for GspArgumentsPadded {}
 
 /// Init arguments for the message queue.
-#[repr(transparent)]
-struct MessageQueueInitArguments(bindings::MESSAGE_QUEUE_INIT_ARGUMENTS);
+type MessageQueueInitArguments = bindings::MESSAGE_QUEUE_INIT_ARGUMENTS;
 
 impl MessageQueueInitArguments {
     /// Creates a new init arguments structure for `cmdq`.
-    fn new(cmdq: &Cmdq) -> Self {
-        Self(bindings::MESSAGE_QUEUE_INIT_ARGUMENTS {
+    #[allow(non_snake_case)]
+    fn new(cmdq: &Cmdq) -> impl Init<Self> + '_ {
+        init!(MessageQueueInitArguments {
             sharedMemPhysAddr: cmdq.dma_handle(),
             pageTableEntryCount: num::usize_into_u32::<{ Cmdq::NUM_PTES }>(),
             cmdQueueOffset: num::usize_as_u64(Cmdq::CMDQ_OFFSET),
             statQueueOffset: num::usize_as_u64(Cmdq::STATQ_OFFSET),
-            ..Default::default()
+            ..Zeroable::init_zeroed()
         })
     }
 }
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 8/8] gpu: nova-core: convert to new dma::Coherent API
  2026-03-03 16:22 ` [PATCH 0/8] dma::Coherent & dma::CoherentInit API Danilo Krummrich
                     ` (6 preceding siblings ...)
  2026-03-03 16:22   ` [PATCH 7/8] gpu: nova-core: convert Gsp::new() to use CoherentInit Danilo Krummrich
@ 2026-03-03 16:22   ` Danilo Krummrich
  2026-03-20 14:36     ` Alexandre Courbot
  2026-03-24 12:33   ` [PATCH 0/8] dma::Coherent & dma::CoherentInit API Andreas Hindborg
  8 siblings, 1 reply; 30+ messages in thread
From: Danilo Krummrich @ 2026-03-03 16:22 UTC (permalink / raw)
  To: aliceryhl, acourbot, ojeda, boqun, gary, bjorn3_gh, lossin,
	a.hindborg, tmgross, abdiel.janulgue, daniel.almeida,
	robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel,
	Danilo Krummrich

From: Gary Guo <gary@garyguo.net>

Remove all usages of dma::CoherentAllocation and use the new
dma::Coherent type instead.

Note that there are still remainders of the old dma::CoherentAllocation
API, such as as_slice() and as_slice_mut().

Signed-off-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
---
 drivers/gpu/nova-core/dma.rs      | 19 +++++------
 drivers/gpu/nova-core/falcon.rs   |  7 ++--
 drivers/gpu/nova-core/firmware.rs | 10 ++----
 drivers/gpu/nova-core/gsp.rs      | 18 ++++++----
 drivers/gpu/nova-core/gsp/cmdq.rs | 55 ++++++++++++-------------------
 5 files changed, 46 insertions(+), 63 deletions(-)

diff --git a/drivers/gpu/nova-core/dma.rs b/drivers/gpu/nova-core/dma.rs
index 7215398969da..3c19d5ffcfe8 100644
--- a/drivers/gpu/nova-core/dma.rs
+++ b/drivers/gpu/nova-core/dma.rs
@@ -9,13 +9,13 @@
 
 use kernel::{
     device,
-    dma::CoherentAllocation,
+    dma::Coherent,
     page::PAGE_SIZE,
     prelude::*, //
 };
 
 pub(crate) struct DmaObject {
-    dma: CoherentAllocation<u8>,
+    dma: Coherent<[u8]>,
 }
 
 impl DmaObject {
@@ -24,23 +24,22 @@ pub(crate) fn new(dev: &device::Device<device::Bound>, len: usize) -> Result<Sel
             .map_err(|_| EINVAL)?
             .pad_to_align()
             .size();
-        let dma = CoherentAllocation::alloc_coherent(dev, len, GFP_KERNEL | __GFP_ZERO)?;
+        let dma = Coherent::zeroed_slice(dev, len, GFP_KERNEL)?;
 
         Ok(Self { dma })
     }
 
     pub(crate) fn from_data(dev: &device::Device<device::Bound>, data: &[u8]) -> Result<Self> {
-        Self::new(dev, data.len()).and_then(|mut dma_obj| {
-            // SAFETY: We have just allocated the DMA memory, we are the only users and
-            // we haven't made the device aware of the handle yet.
-            unsafe { dma_obj.write(data, 0)? }
-            Ok(dma_obj)
-        })
+        let dma_obj = Self::new(dev, data.len())?;
+        // SAFETY: We have just allocated the DMA memory, we are the only users and
+        // we haven't made the device aware of the handle yet.
+        unsafe { dma_obj.as_mut()[..data.len()].copy_from_slice(data) };
+        Ok(dma_obj)
     }
 }
 
 impl Deref for DmaObject {
-    type Target = CoherentAllocation<u8>;
+    type Target = Coherent<[u8]>;
 
     fn deref(&self) -> &Self::Target {
         &self.dma
diff --git a/drivers/gpu/nova-core/falcon.rs b/drivers/gpu/nova-core/falcon.rs
index 37bfee1d0949..39f5df568ddb 100644
--- a/drivers/gpu/nova-core/falcon.rs
+++ b/drivers/gpu/nova-core/falcon.rs
@@ -25,10 +25,7 @@
     driver::Bar0,
     falcon::hal::LoadMethod,
     gpu::Chipset,
-    num::{
-        FromSafeCast,
-        IntoSafeCast, //
-    },
+    num::FromSafeCast,
     regs,
     regs::macros::RegisterBase, //
 };
@@ -434,7 +431,7 @@ fn dma_wr<F: FalconFirmware<Target = E>>(
             }
             FalconMem::Dmem => (
                 0,
-                fw.dma_handle_with_offset(load_offsets.src_start.into_safe_cast())?,
+                fw.dma_handle() + DmaAddress::from(load_offsets.src_start),
             ),
         };
         if dma_start % DmaAddress::from(DMA_LEN) > 0 {
diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 815e8000bf81..efb20ef34f31 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -310,7 +310,7 @@ trait FirmwareSignature<F: FalconFirmware>: AsRef<[u8]> {}
 impl<F: FalconFirmware> FirmwareDmaObject<F, Unsigned> {
     /// Patches the firmware at offset `sig_base_img` with `signature`.
     fn patch_signature<S: FirmwareSignature<F>>(
-        mut self,
+        self,
         signature: &S,
         sig_base_img: usize,
     ) -> Result<FirmwareDmaObject<F, Signed>> {
@@ -320,12 +320,8 @@ fn patch_signature<S: FirmwareSignature<F>>(
         }
 
         // SAFETY: We are the only user of this object, so there cannot be any race.
-        let dst = unsafe { self.0.start_ptr_mut().add(sig_base_img) };
-
-        // SAFETY: `signature` and `dst` are valid, properly aligned, and do not overlap.
-        unsafe {
-            core::ptr::copy_nonoverlapping(signature_bytes.as_ptr(), dst, signature_bytes.len())
-        };
+        let dst = unsafe { self.0.as_mut() };
+        dst[sig_base_img..][..signature_bytes.len()].copy_from_slice(signature_bytes);
 
         Ok(FirmwareDmaObject(self.0, PhantomData))
     }
diff --git a/drivers/gpu/nova-core/gsp.rs b/drivers/gpu/nova-core/gsp.rs
index cb7f6b4dc0f8..fcb3020acf8d 100644
--- a/drivers/gpu/nova-core/gsp.rs
+++ b/drivers/gpu/nova-core/gsp.rs
@@ -6,13 +6,15 @@
     device,
     dma::{
         Coherent,
-        CoherentAllocation,
         CoherentInit,
         DmaAddress, //
     },
     pci,
     prelude::*,
-    transmute::AsBytes, //
+    transmute::{
+        AsBytes,
+        FromBytes, //
+    }, //
 };
 
 pub(crate) mod cmdq;
@@ -44,6 +46,9 @@
 #[repr(C)]
 struct PteArray<const NUM_ENTRIES: usize>([u64; NUM_ENTRIES]);
 
+/// SAFETY: arrays of `u64` implement `FromBytes` and we are but a wrapper around one.
+unsafe impl<const NUM_ENTRIES: usize> FromBytes for PteArray<NUM_ENTRIES> {}
+
 /// SAFETY: arrays of `u64` implement `AsBytes` and we are but a wrapper around one.
 unsafe impl<const NUM_ENTRIES: usize> AsBytes for PteArray<NUM_ENTRIES> {}
 
@@ -75,25 +80,24 @@ fn new(start: DmaAddress) -> Result<Self> {
 /// then pp points to index into the buffer where the next logging entry will
 /// be written. Therefore, the logging data is valid if:
 ///   1 <= pp < sizeof(buffer)/sizeof(u64)
-struct LogBuffer(CoherentAllocation<u8>);
+struct LogBuffer(Coherent<[u8]>);
 
 impl LogBuffer {
     /// Creates a new `LogBuffer` mapped on `dev`.
     fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
         const NUM_PAGES: usize = RM_LOG_BUFFER_NUM_PAGES;
 
-        let mut obj = Self(CoherentAllocation::<u8>::alloc_coherent(
+        let obj = Self(Coherent::<u8>::zeroed_slice(
             dev,
             NUM_PAGES * GSP_PAGE_SIZE,
-            GFP_KERNEL | __GFP_ZERO,
+            GFP_KERNEL,
         )?);
         let ptes = PteArray::<NUM_PAGES>::new(obj.0.dma_handle())?;
 
         // SAFETY: `obj` has just been created and we are its sole user.
         unsafe {
             // Copy the self-mapping PTE at the expected location.
-            obj.0
-                .as_slice_mut(size_of::<u64>(), size_of_val(&ptes))?
+            obj.0.as_mut()[size_of::<u64>()..][..size_of_val(&ptes)]
                 .copy_from_slice(ptes.as_bytes())
         };
 
diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index 0056bfbf0a44..05c5f70dd4a9 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -11,7 +11,7 @@
 use kernel::{
     device,
     dma::{
-        CoherentAllocation,
+        Coherent,
         DmaAddress, //
     },
     dma_write,
@@ -181,7 +181,7 @@ unsafe impl AsBytes for GspMem {}
 // that is not a problem because they are not used outside the kernel.
 unsafe impl FromBytes for GspMem {}
 
-/// Wrapper around [`GspMem`] to share it with the GPU using a [`CoherentAllocation`].
+/// Wrapper around [`GspMem`] to share it with the GPU using a [`Coherent`].
 ///
 /// This provides the low-level functionality to communicate with the GSP, including allocation of
 /// queue space to write messages to and management of read/write pointers.
@@ -192,7 +192,7 @@ unsafe impl FromBytes for GspMem {}
 ///   pointer and the GSP read pointer. This region is returned by [`Self::driver_write_area`].
 /// * The driver owns (i.e. can read from) the part of the GSP message queue between the CPU read
 ///   pointer and the GSP write pointer. This region is returned by [`Self::driver_read_area`].
-struct DmaGspMem(CoherentAllocation<GspMem>);
+struct DmaGspMem(Coherent<GspMem>);
 
 impl DmaGspMem {
     /// Allocate a new instance and map it for `dev`.
@@ -200,15 +200,10 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
         const MSGQ_SIZE: u32 = num::usize_into_u32::<{ size_of::<Msgq>() }>();
         const RX_HDR_OFF: u32 = num::usize_into_u32::<{ mem::offset_of!(Msgq, rx) }>();
 
-        let gsp_mem =
-            CoherentAllocation::<GspMem>::alloc_coherent(dev, 1, GFP_KERNEL | __GFP_ZERO)?;
-        dma_write!(gsp_mem, [0]?.ptes, PteArray::new(gsp_mem.dma_handle())?);
-        dma_write!(
-            gsp_mem,
-            [0]?.cpuq.tx,
-            MsgqTxHeader::new(MSGQ_SIZE, RX_HDR_OFF, MSGQ_NUM_PAGES)
-        );
-        dma_write!(gsp_mem, [0]?.cpuq.rx, MsgqRxHeader::new());
+        let gsp_mem = Coherent::<GspMem>::zeroed(dev, GFP_KERNEL)?;
+        dma_write!(gsp_mem, .ptes, PteArray::new(gsp_mem.dma_handle())?);
+        dma_write!(gsp_mem, .cpuq.tx, MsgqTxHeader::new(MSGQ_SIZE, RX_HDR_OFF, MSGQ_NUM_PAGES));
+        dma_write!(gsp_mem, .cpuq.rx, MsgqRxHeader::new());
 
         Ok(Self(gsp_mem))
     }
@@ -223,10 +218,9 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
         let rx = self.gsp_read_ptr() as usize;
 
         // SAFETY:
-        // - The `CoherentAllocation` contains exactly one object.
         // - We will only access the driver-owned part of the shared memory.
         // - Per the safety statement of the function, no concurrent access will be performed.
-        let gsp_mem = &mut unsafe { self.0.as_slice_mut(0, 1) }.unwrap()[0];
+        let gsp_mem = unsafe { &mut *self.0.as_mut_ptr() };
         // PANIC: per the invariant of `cpu_write_ptr`, `tx` is `< MSGQ_NUM_PAGES`.
         let (before_tx, after_tx) = gsp_mem.cpuq.msgq.data.split_at_mut(tx);
 
@@ -264,10 +258,9 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
         let rx = self.cpu_read_ptr() as usize;
 
         // SAFETY:
-        // - The `CoherentAllocation` contains exactly one object.
         // - We will only access the driver-owned part of the shared memory.
         // - Per the safety statement of the function, no concurrent access will be performed.
-        let gsp_mem = &unsafe { self.0.as_slice(0, 1) }.unwrap()[0];
+        let gsp_mem = unsafe { &*self.0.as_ptr() };
         let data = &gsp_mem.gspq.msgq.data;
 
         // The area starting at `rx` and ending at `tx - 1` modulo MSGQ_NUM_PAGES, inclusive,
@@ -334,11 +327,10 @@ fn allocate_command(&mut self, size: usize) -> Result<GspCommand<'_>> {
     //
     // - The returned value is within `0..MSGQ_NUM_PAGES`.
     fn gsp_write_ptr(&self) -> u32 {
-        let gsp_mem = self.0.start_ptr();
+        let gsp_mem = self.0.as_ptr();
 
         // SAFETY:
-        //  - The 'CoherentAllocation' contains at least one object.
-        //  - By the invariants of `CoherentAllocation` the pointer is valid.
+        //  - By the invariants of `Coherent` the pointer is valid.
         (unsafe { (*gsp_mem).gspq.tx.write_ptr() } % MSGQ_NUM_PAGES)
     }
 
@@ -348,11 +340,10 @@ fn gsp_write_ptr(&self) -> u32 {
     //
     // - The returned value is within `0..MSGQ_NUM_PAGES`.
     fn gsp_read_ptr(&self) -> u32 {
-        let gsp_mem = self.0.start_ptr();
+        let gsp_mem = self.0.as_ptr();
 
         // SAFETY:
-        //  - The 'CoherentAllocation' contains at least one object.
-        //  - By the invariants of `CoherentAllocation` the pointer is valid.
+        //  - By the invariants of `Coherent` the pointer is valid.
         (unsafe { (*gsp_mem).gspq.rx.read_ptr() } % MSGQ_NUM_PAGES)
     }
 
@@ -362,11 +353,10 @@ fn gsp_read_ptr(&self) -> u32 {
     //
     // - The returned value is within `0..MSGQ_NUM_PAGES`.
     fn cpu_read_ptr(&self) -> u32 {
-        let gsp_mem = self.0.start_ptr();
+        let gsp_mem = self.0.as_ptr();
 
         // SAFETY:
-        //  - The ['CoherentAllocation'] contains at least one object.
-        //  - By the invariants of CoherentAllocation the pointer is valid.
+        //  - By the invariants of `Coherent` the pointer is valid.
         (unsafe { (*gsp_mem).cpuq.rx.read_ptr() } % MSGQ_NUM_PAGES)
     }
 
@@ -377,11 +367,10 @@ fn advance_cpu_read_ptr(&mut self, elem_count: u32) {
         // Ensure read pointer is properly ordered.
         fence(Ordering::SeqCst);
 
-        let gsp_mem = self.0.start_ptr_mut();
+        let gsp_mem = self.0.as_mut_ptr();
 
         // SAFETY:
-        //  - The 'CoherentAllocation' contains at least one object.
-        //  - By the invariants of `CoherentAllocation` the pointer is valid.
+        //  - By the invariants of `Coherent` the pointer is valid.
         unsafe { (*gsp_mem).cpuq.rx.set_read_ptr(rptr) };
     }
 
@@ -391,22 +380,20 @@ fn advance_cpu_read_ptr(&mut self, elem_count: u32) {
     //
     // - The returned value is within `0..MSGQ_NUM_PAGES`.
     fn cpu_write_ptr(&self) -> u32 {
-        let gsp_mem = self.0.start_ptr();
+        let gsp_mem = self.0.as_ptr();
 
         // SAFETY:
-        //  - The 'CoherentAllocation' contains at least one object.
-        //  - By the invariants of `CoherentAllocation` the pointer is valid.
+        //  - By the invariants of `Coherent` the pointer is valid.
         (unsafe { (*gsp_mem).cpuq.tx.write_ptr() } % MSGQ_NUM_PAGES)
     }
 
     // Informs the GSP that it can process `elem_count` new pages from the command queue.
     fn advance_cpu_write_ptr(&mut self, elem_count: u32) {
         let wptr = self.cpu_write_ptr().wrapping_add(elem_count) % MSGQ_NUM_PAGES;
-        let gsp_mem = self.0.start_ptr_mut();
+        let gsp_mem = self.0.as_mut_ptr();
 
         // SAFETY:
-        //  - The 'CoherentAllocation' contains at least one object.
-        //  - By the invariants of `CoherentAllocation` the pointer is valid.
+        //  - By the invariants of `Coherent` the pointer is valid.
         unsafe { (*gsp_mem).cpuq.tx.set_write_ptr(wptr) };
 
         // Ensure all command data is visible before triggering the GSP read.
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/8] rust: dma: use "kernel vertical" style for imports
  2026-03-03 16:22   ` [PATCH 1/8] rust: dma: use "kernel vertical" style for imports Danilo Krummrich
@ 2026-03-03 17:12     ` Gary Guo
  0 siblings, 0 replies; 30+ messages in thread
From: Gary Guo @ 2026-03-03 17:12 UTC (permalink / raw)
  To: Danilo Krummrich, aliceryhl, acourbot, ojeda, boqun, gary,
	bjorn3_gh, lossin, a.hindborg, tmgross, abdiel.janulgue,
	daniel.almeida, robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On Tue Mar 3, 2026 at 4:22 PM GMT, Danilo Krummrich wrote:
> Convert all imports to use "kernel vertical" style.
> 
> With this, subsequent patches neither introduce unrelated changes nor
> leave an inconsistent import pattern.
> 
> While at it, drop unnecessary imports covered by prelude::*.
> 
> Link: https://docs.kernel.org/rust/coding-guidelines.html#imports
> Signed-off-by: Danilo Krummrich <dakr@kernel.org>

Reviewed-by: Gary Guo <gary@garyguo.net>

> ---
>  rust/kernel/dma.rs | 16 ++++++++++++----
>  1 file changed, 12 insertions(+), 4 deletions(-)


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 6/8] gpu: nova-core: use Coherent::init to initialize GspFwWprMeta
  2026-03-03 16:22   ` [PATCH 6/8] gpu: nova-core: use Coherent::init to initialize GspFwWprMeta Danilo Krummrich
@ 2026-03-03 17:21     ` Gary Guo
  2026-03-20 14:36     ` Alexandre Courbot
  1 sibling, 0 replies; 30+ messages in thread
From: Gary Guo @ 2026-03-03 17:21 UTC (permalink / raw)
  To: Danilo Krummrich, aliceryhl, acourbot, ojeda, boqun, gary,
	bjorn3_gh, lossin, a.hindborg, tmgross, abdiel.janulgue,
	daniel.almeida, robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On Tue Mar 3, 2026 at 4:22 PM GMT, Danilo Krummrich wrote:
> Convert wpr_meta to use Coherent::init() and simplify the
> initialization.  It also avoids a separate initialization of
> GspFwWprMeta on the stack.
>
> Signed-off-by: Danilo Krummrich <dakr@kernel.org>
> ---
>  drivers/gpu/nova-core/gsp/boot.rs |  7 ++-----
>  drivers/gpu/nova-core/gsp/fw.rs   | 20 +++++++++++++++-----
>  2 files changed, 17 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
> index 7f46fa5e9b50..1a4d9ee4f256 100644
> --- a/drivers/gpu/nova-core/gsp/boot.rs
> +++ b/drivers/gpu/nova-core/gsp/boot.rs
> @@ -2,8 +2,7 @@
>  
>  use kernel::{
>      device,
> -    dma::CoherentAllocation,
> -    dma_write,
> +    dma::Coherent,
>      io::poll::read_poll_timeout,
>      pci,
>      prelude::*,
> @@ -155,9 +154,7 @@ pub(crate) fn boot(
>              bar,
>          )?;
>  
> -        let wpr_meta =
> -            CoherentAllocation::<GspFwWprMeta>::alloc_coherent(dev, 1, GFP_KERNEL | __GFP_ZERO)?;
> -        dma_write!(wpr_meta, [0]?, GspFwWprMeta::new(&gsp_fw, &fb_layout));
> +        let wpr_meta = Coherent::init(dev, GFP_KERNEL, GspFwWprMeta::new(&gsp_fw, &fb_layout))?;
>  
>          self.cmdq
>              .send_command(bar, commands::SetSystemInfo::new(pdev))?;
> diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
> index f1797e1f0d9d..751d5447214d 100644
> --- a/drivers/gpu/nova-core/gsp/fw.rs
> +++ b/drivers/gpu/nova-core/gsp/fw.rs
> @@ -131,7 +131,9 @@ pub(crate) fn wpr_heap_size(&self, chipset: Chipset, fb_size: u64) -> u64 {
>  /// Structure passed to the GSP bootloader, containing the framebuffer layout as well as the DMA
>  /// addresses of the GSP bootloader and firmware.
>  #[repr(transparent)]
> -pub(crate) struct GspFwWprMeta(bindings::GspFwWprMeta);
> +pub(crate) struct GspFwWprMeta {
> +    inner: bindings::GspFwWprMeta,
> +}
>  
>  // SAFETY: Padding is explicit and does not contain uninitialized data.
>  unsafe impl AsBytes for GspFwWprMeta {}
> @@ -144,10 +146,14 @@ unsafe impl FromBytes for GspFwWprMeta {}
>  type GspFwWprMetaBootInfo = bindings::GspFwWprMeta__bindgen_ty_1__bindgen_ty_1;
>  
>  impl GspFwWprMeta {
> -    /// Fill in and return a `GspFwWprMeta` suitable for booting `gsp_firmware` using the
> +    /// Returns an initializer for a `GspFwWprMeta` suitable for booting `gsp_firmware` using the
>      /// `fb_layout` layout.
> -    pub(crate) fn new(gsp_firmware: &GspFirmware, fb_layout: &FbLayout) -> Self {
> -        Self(bindings::GspFwWprMeta {
> +    pub(crate) fn new<'a>(
> +        gsp_firmware: &'a GspFirmware,
> +        fb_layout: &'a FbLayout,
> +    ) -> impl Init<Self> + 'a {
> +        #[allow(non_snake_case)]

I suppose this is from the field accessor generated?

@Benno we probably should add `#[allow(nonstandard_style)]` on the accessor
generated.

Best,
Gary

> +        let init_inner = init!(bindings::GspFwWprMeta {
>              // CAST: we want to store the bits of `GSP_FW_WPR_META_MAGIC` unmodified.
>              magic: bindings::GSP_FW_WPR_META_MAGIC as u64,
>              revision: u64::from(bindings::GSP_FW_WPR_META_REVISION),
> @@ -182,7 +188,11 @@ pub(crate) fn new(gsp_firmware: &GspFirmware, fb_layout: &FbLayout) -> Self {
>              fbSize: fb_layout.fb.end - fb_layout.fb.start,
>              vgaWorkspaceOffset: fb_layout.vga_workspace.start,
>              vgaWorkspaceSize: fb_layout.vga_workspace.end - fb_layout.vga_workspace.start,
> -            ..Default::default()
> +            ..Zeroable::init_zeroed()
> +        });
> +
> +        init!(GspFwWprMeta {
> +            inner <- init_inner,
>          })
>      }
>  }


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 7/8] gpu: nova-core: convert Gsp::new() to use CoherentInit
  2026-03-03 16:22   ` [PATCH 7/8] gpu: nova-core: convert Gsp::new() to use CoherentInit Danilo Krummrich
@ 2026-03-03 17:33     ` Gary Guo
  0 siblings, 0 replies; 30+ messages in thread
From: Gary Guo @ 2026-03-03 17:33 UTC (permalink / raw)
  To: Danilo Krummrich, aliceryhl, acourbot, ojeda, boqun, gary,
	bjorn3_gh, lossin, a.hindborg, tmgross, abdiel.janulgue,
	daniel.almeida, robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On Tue Mar 3, 2026 at 4:22 PM GMT, Danilo Krummrich wrote:
> Convert libos (LibosMemoryRegionInitArgument) and rmargs
> (GspArgumentsPadded) to use CoherentInit / Coherent::init() and simplify
> the initialization. This also avoids separate initialization on the
> stack.
>
> Signed-off-by: Danilo Krummrich <dakr@kernel.org>
> ---
>  drivers/gpu/nova-core/gsp.rs    | 47 +++++++++++--------------
>  drivers/gpu/nova-core/gsp/fw.rs | 62 +++++++++++++++++++++++----------
>  2 files changed, 65 insertions(+), 44 deletions(-)
>
> diff --git a/drivers/gpu/nova-core/gsp.rs b/drivers/gpu/nova-core/gsp.rs
> index 25cd48514c77..cb7f6b4dc0f8 100644
> --- a/drivers/gpu/nova-core/gsp.rs
> +++ b/drivers/gpu/nova-core/gsp.rs
> @@ -5,10 +5,11 @@
>  use kernel::{
>      device,
>      dma::{
> +        Coherent,
>          CoherentAllocation,
> +        CoherentInit,
>          DmaAddress, //
>      },
> -    dma_write,
>      pci,
>      prelude::*,
>      transmute::AsBytes, //
> @@ -104,7 +105,7 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
>  #[pin_data]
>  pub(crate) struct Gsp {
>      /// Libos arguments.
> -    pub(crate) libos: CoherentAllocation<LibosMemoryRegionInitArgument>,
> +    pub(crate) libos: Coherent<[LibosMemoryRegionInitArgument]>,
>      /// Init log buffer.
>      loginit: LogBuffer,
>      /// Interrupts log buffer.
> @@ -114,7 +115,7 @@ pub(crate) struct Gsp {
>      /// Command queue.
>      pub(crate) cmdq: Cmdq,
>      /// RM arguments.
> -    rmargs: CoherentAllocation<GspArgumentsPadded>,
> +    rmargs: Coherent<GspArgumentsPadded>,
>  }
>  
>  impl Gsp {
> @@ -123,34 +124,28 @@ pub(crate) fn new(pdev: &pci::Device<device::Bound>) -> impl PinInit<Self, Error
>          pin_init::pin_init_scope(move || {
>              let dev = pdev.as_ref();
>  
> +            // Initialise the logging structures. The OpenRM equivalents are in:
> +            // _kgspInitLibosLoggingStructures (allocates memory for buffers)
> +            // kgspSetupLibosInitArgs_IMPL (creates pLibosInitArgs[] array)
>              Ok(try_pin_init!(Self {
> -                libos: CoherentAllocation::<LibosMemoryRegionInitArgument>::alloc_coherent(
> -                    dev,
> -                    GSP_PAGE_SIZE / size_of::<LibosMemoryRegionInitArgument>(),
> -                    GFP_KERNEL | __GFP_ZERO,
> -                )?,
>                  loginit: LogBuffer::new(dev)?,
>                  logintr: LogBuffer::new(dev)?,
>                  logrm: LogBuffer::new(dev)?,
>                  cmdq: Cmdq::new(dev)?,
> -                rmargs: CoherentAllocation::<GspArgumentsPadded>::alloc_coherent(
> -                    dev,
> -                    1,
> -                    GFP_KERNEL | __GFP_ZERO,
> -                )?,
> -                _: {
> -                    // Initialise the logging structures. The OpenRM equivalents are in:
> -                    // _kgspInitLibosLoggingStructures (allocates memory for buffers)
> -                    // kgspSetupLibosInitArgs_IMPL (creates pLibosInitArgs[] array)
> -                    dma_write!(
> -                        libos, [0]?, LibosMemoryRegionInitArgument::new("LOGINIT", &loginit.0)
> -                    );
> -                    dma_write!(
> -                        libos, [1]?, LibosMemoryRegionInitArgument::new("LOGINTR", &logintr.0)
> -                    );
> -                    dma_write!(libos, [2]?, LibosMemoryRegionInitArgument::new("LOGRM", &logrm.0));
> -                    dma_write!(rmargs, [0]?.inner, fw::GspArgumentsCached::new(cmdq));
> -                    dma_write!(libos, [3]?, LibosMemoryRegionInitArgument::new("RMARGS", rmargs));
> +                rmargs: Coherent::init(dev, GFP_KERNEL, GspArgumentsPadded::new(cmdq))?,
> +                libos: {
> +                    let mut libos = CoherentInit::zeroed_slice(
> +                        dev,
> +                        GSP_PAGE_SIZE / size_of::<LibosMemoryRegionInitArgument>(),
> +                        GFP_KERNEL,
> +                    )?;
> +
> +                    libos.init_at(0, LibosMemoryRegionInitArgument::new("LOGINIT", &loginit.0))?;
> +                    libos.init_at(1, LibosMemoryRegionInitArgument::new("LOGINTR", &logintr.0))?;
> +                    libos.init_at(2, LibosMemoryRegionInitArgument::new("LOGRM", &logrm.0))?;
> +                    libos.init_at(3, LibosMemoryRegionInitArgument::new("RMARGS", rmargs))?;
> +
> +                    libos.into()
>                  },
>              }))
>          })
> diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
> index 751d5447214d..59cb03a9b238 100644
> --- a/drivers/gpu/nova-core/gsp/fw.rs
> +++ b/drivers/gpu/nova-core/gsp/fw.rs
> @@ -9,11 +9,12 @@
>  use core::ops::Range;
>  
>  use kernel::{
> -    dma::CoherentAllocation,
> +    dma::Coherent,
>      prelude::*,
>      ptr::{
>          Alignable,
> -        Alignment, //
> +        Alignment,
> +        KnownSize, //
>      },
>      sizes::{
>          SZ_128K,
> @@ -568,7 +569,9 @@ unsafe impl AsBytes for RunCpuSequencer {}
>  /// The memory allocated for the arguments must remain until the GSP sends the
>  /// init_done RPC.
>  #[repr(transparent)]
> -pub(crate) struct LibosMemoryRegionInitArgument(bindings::LibosMemoryRegionInitArgument);
> +pub(crate) struct LibosMemoryRegionInitArgument {
> +    inner: bindings::LibosMemoryRegionInitArgument,
> +}

FYI tuple struct construction is being worked at now:
https://github.com/Rust-for-Linux/pin-init/pull/113

>  
>  // SAFETY: Padding is explicit and does not contain uninitialized data.
>  unsafe impl AsBytes for LibosMemoryRegionInitArgument {}
> @@ -578,10 +581,10 @@ unsafe impl AsBytes for LibosMemoryRegionInitArgument {}
>  unsafe impl FromBytes for LibosMemoryRegionInitArgument {}
>  
>  impl LibosMemoryRegionInitArgument {
> -    pub(crate) fn new<A: AsBytes + FromBytes>(
> +    pub(crate) fn new<'a, A: AsBytes + FromBytes + KnownSize + ?Sized>(
>          name: &'static str,
> -        obj: &CoherentAllocation<A>,
> -    ) -> Self {
> +        obj: &'a Coherent<A>,
> +    ) -> impl Init<Self> + 'a {
>          /// Generates the `ID8` identifier required for some GSP objects.
>          fn id8(name: &str) -> u64 {
>              let mut bytes = [0u8; core::mem::size_of::<u64>()];
> @@ -593,7 +596,8 @@ fn id8(name: &str) -> u64 {
>              u64::from_ne_bytes(bytes)
>          }
>  
> -        Self(bindings::LibosMemoryRegionInitArgument {
> +        #[allow(non_snake_case)]
> +        let init_inner = init!(bindings::LibosMemoryRegionInitArgument {
>              id8: id8(name),
>              pa: obj.dma_handle(),
>              size: num::usize_as_u64(obj.size()),
> @@ -603,7 +607,11 @@ fn id8(name: &str) -> u64 {
>              loc: num::u32_into_u8::<
>                  { bindings::LibosMemoryRegionLoc_LIBOS_MEMORY_REGION_LOC_SYSMEM },
>              >(),
> -            ..Default::default()
> +            ..Zeroable::init_zeroed()
> +        });
> +
> +        init!(LibosMemoryRegionInitArgument {
> +            inner <- init_inner,
>          })
>      }
>  }
> @@ -814,15 +822,23 @@ unsafe impl FromBytes for GspMsgElement {}
>  
>  /// Arguments for GSP startup.
>  #[repr(transparent)]
> -pub(crate) struct GspArgumentsCached(bindings::GSP_ARGUMENTS_CACHED);
> +#[derive(Zeroable)]
> +pub(crate) struct GspArgumentsCached {
> +    inner: bindings::GSP_ARGUMENTS_CACHED,
> +}
>  
>  impl GspArgumentsCached {
>      /// Creates the arguments for starting the GSP up using `cmdq` as its command queue.
> -    pub(crate) fn new(cmdq: &Cmdq) -> Self {
> -        Self(bindings::GSP_ARGUMENTS_CACHED {
> -            messageQueueInitArguments: MessageQueueInitArguments::new(cmdq).0,
> +    pub(crate) fn new(cmdq: &Cmdq) -> impl Init<Self> + '_ {

This struct is not that big, so I think we can just keep the by-value
constructor.

Best,
Gary

> +        #[allow(non_snake_case)]
> +        let init_inner = init!(bindings::GSP_ARGUMENTS_CACHED {
> +            messageQueueInitArguments <- MessageQueueInitArguments::new(cmdq),
>              bDmemStack: 1,
> -            ..Default::default()
> +            ..Zeroable::init_zeroed()
> +        });
> +
> +        init!(GspArgumentsCached {
> +            inner <- init_inner,
>          })
>      }
>  }
> @@ -834,11 +850,21 @@ unsafe impl AsBytes for GspArgumentsCached {}
>  /// must all be a multiple of GSP_PAGE_SIZE in size, so add padding to force it
>  /// to that size.
>  #[repr(C)]
> +#[derive(Zeroable)]
>  pub(crate) struct GspArgumentsPadded {
>      pub(crate) inner: GspArgumentsCached,
>      _padding: [u8; GSP_PAGE_SIZE - core::mem::size_of::<bindings::GSP_ARGUMENTS_CACHED>()],
>  }
>  
> +impl GspArgumentsPadded {
> +    pub(crate) fn new(cmdq: &Cmdq) -> impl Init<Self> + '_ {
> +        init!(GspArgumentsPadded {
> +            inner <- GspArgumentsCached::new(cmdq),
> +            ..Zeroable::init_zeroed()
> +        })
> +    }
> +}
> +
>  // SAFETY: Padding is explicit and will not contain uninitialized data.
>  unsafe impl AsBytes for GspArgumentsPadded {}
>  
> @@ -847,18 +873,18 @@ unsafe impl AsBytes for GspArgumentsPadded {}
>  unsafe impl FromBytes for GspArgumentsPadded {}
>  
>  /// Init arguments for the message queue.
> -#[repr(transparent)]
> -struct MessageQueueInitArguments(bindings::MESSAGE_QUEUE_INIT_ARGUMENTS);
> +type MessageQueueInitArguments = bindings::MESSAGE_QUEUE_INIT_ARGUMENTS;
>  
>  impl MessageQueueInitArguments {
>      /// Creates a new init arguments structure for `cmdq`.
> -    fn new(cmdq: &Cmdq) -> Self {
> -        Self(bindings::MESSAGE_QUEUE_INIT_ARGUMENTS {
> +    #[allow(non_snake_case)]
> +    fn new(cmdq: &Cmdq) -> impl Init<Self> + '_ {
> +        init!(MessageQueueInitArguments {
>              sharedMemPhysAddr: cmdq.dma_handle(),
>              pageTableEntryCount: num::usize_into_u32::<{ Cmdq::NUM_PTES }>(),
>              cmdQueueOffset: num::usize_as_u64(Cmdq::CMDQ_OFFSET),
>              statQueueOffset: num::usize_as_u64(Cmdq::STATQ_OFFSET),
> -            ..Default::default()
> +            ..Zeroable::init_zeroed()
>          })
>      }
>  }


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/8] rust: dma: introduce dma::CoherentInit for memory initialization
  2026-03-03 16:22   ` [PATCH 4/8] rust: dma: introduce dma::CoherentInit for memory initialization Danilo Krummrich
@ 2026-03-12 13:51     ` Gary Guo
  2026-03-17  6:52     ` Alice Ryhl
  2026-03-20 14:35     ` Alexandre Courbot
  2 siblings, 0 replies; 30+ messages in thread
From: Gary Guo @ 2026-03-12 13:51 UTC (permalink / raw)
  To: Danilo Krummrich, aliceryhl, acourbot, ojeda, boqun, gary,
	bjorn3_gh, lossin, a.hindborg, tmgross, abdiel.janulgue,
	daniel.almeida, robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On Tue Mar 3, 2026 at 4:22 PM GMT, Danilo Krummrich wrote:
> Currently, dma::Coherent cannot safely provide (mutable) access to its
> underlying memory because the memory might be concurrently accessed by a
> DMA device. This makes it difficult to safely initialize the memory
> before handing it over to the hardware.
>
> Introduce dma::CoherentInit, a type that encapsulates a dma::Coherent
> before its DMA address is exposed to the device. dma::CoherentInit can
> guarantee exclusive access to the inner dma::Coherent and implement
> Deref and DerefMut.
>
> Once the memory is properly initialized, dma::CoherentInit can be
> converted into a regular dma::Coherent.
>
> Signed-off-by: Danilo Krummrich <dakr@kernel.org>
> ---
>  rust/kernel/dma.rs | 153 ++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 152 insertions(+), 1 deletion(-)
>
> diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
> index 291fdea3b52b..79dd8717ac47 100644
> --- a/rust/kernel/dma.rs
> +++ b/rust/kernel/dma.rs
> @@ -20,7 +20,13 @@
>          FromBytes, //
>      }, //
>  };
> -use core::ptr::NonNull;
> +use core::{
> +    ops::{
> +        Deref,
> +        DerefMut, //
> +    },
> +    ptr::NonNull, //
> +};
>  
>  /// DMA address type.
>  ///
> @@ -352,6 +358,151 @@ fn from(direction: DataDirection) -> Self {
>      }
>  }
>  
> +/// Initializer type for [`Coherent`].
> +///
> +/// A [`Coherent`] object can't provide access to its memory as (mutable) slice safely, since it
> +/// can't fulfill the requirements for creating a slice. For instance, it is not valid to have a
> +/// (mutable) slice to of the memory of a [`Coherent`] while the memory might be accessed by a
> +/// device.
> +///
> +/// In contrast, this initializer type is able to fulfill the requirements to safely obtain a
> +/// (mutable) slice, as it neither provides access to the DMA address of the embedded [`Coherent`],
> +/// nor can it be used with the DMA projection accessors.
> +///
> +/// Once initialized, this type can be converted to a regular [`Coherent`] object.
> +///
> +/// # Examples
> +///
> +/// `CoherentInit<T>`:
> +///
> +/// ```
> +/// # use kernel::device::{
> +/// #     Bound,
> +/// #     Device,
> +/// # };
> +/// use kernel::dma::{attrs::*,
> +///     Coherent,
> +///     CoherentInit,
> +/// };
> +///
> +/// # fn test(dev: &Device<Bound>) -> Result {
> +/// let mut dmem: CoherentInit<u64> =
> +///     CoherentInit::zeroed_with_attrs(dev, GFP_KERNEL, DMA_ATTR_NO_WARN)?;
> +/// *dmem = 42;
> +/// let dmem: Coherent<u64> = dmem.into();
> +/// # Ok::<(), Error>(()) }
> +/// ```
> +///
> +/// `CoherentInit<[T]>`:
> +///
> +///
> +/// ```
> +/// # use kernel::device::{
> +/// #     Bound,
> +/// #     Device,
> +/// # };
> +/// use kernel::dma::{attrs::*,
> +///     Coherent,
> +///     CoherentInit,
> +/// };
> +///
> +/// # fn test(dev: &Device<Bound>) -> Result {
> +/// let mut dmem: CoherentInit<[u64]> =
> +///     CoherentInit::zeroed_slice_with_attrs(dev, 4, GFP_KERNEL, DMA_ATTR_NO_WARN)?;
> +/// dmem.fill(42);
> +/// let dmem: Coherent<[u64]> = dmem.into();
> +/// # Ok::<(), Error>(()) }
> +/// ```
> +pub struct CoherentInit<T: AsBytes + FromBytes + KnownSize + ?Sized>(Coherent<T>);
> +
> +impl<T: AsBytes + FromBytes> CoherentInit<[T]> {
> +    /// Initializer variant of [`Coherent::zeroed_slice_with_attrs`].
> +    pub fn zeroed_slice_with_attrs(
> +        dev: &device::Device<Bound>,
> +        count: usize,
> +        gfp_flags: kernel::alloc::Flags,
> +        dma_attrs: Attrs,
> +    ) -> Result<Self> {
> +        Coherent::zeroed_slice_with_attrs(dev, count, gfp_flags, dma_attrs).map(Self)
> +    }
> +
> +    /// Same as [CoherentInit::zeroed_slice_with_attrs], but with `dma::Attrs(0)`.
> +    pub fn zeroed_slice(
> +        dev: &device::Device<Bound>,
> +        count: usize,
> +        gfp_flags: kernel::alloc::Flags,
> +    ) -> Result<Self> {
> +        Self::zeroed_slice_with_attrs(dev, count, gfp_flags, Attrs(0))
> +    }
> +
> +    /// Initializes the element at `i` using the given initializer.
> +    ///
> +    /// Returns `EINVAL` if `i` is out of bounds.
> +    pub fn init_at<E>(&mut self, i: usize, init: impl Init<T, E>) -> Result
> +    where
> +        Error: From<E>,
> +    {
> +        if i >= self.0.len() {
> +            return Err(EINVAL);
> +        }
> +
> +        let ptr = core::ptr::from_mut(&mut self[i]);
> +
> +        // SAFETY:
> +        // - `ptr` is valid, properly aligned, and within this allocation.
> +        // - `T: AsBytes + FromBytes` guarantees all bit patterns are valid, so partial writes on
> +        //   error cannot leave the element in an invalid state.
> +        // - The DMA address has not been exposed yet, so there is no concurrent device access.
> +        unsafe { init.__init(ptr)? };
> +
> +        Ok(())
> +    }
> +}
> +
> +impl<T: AsBytes + FromBytes> CoherentInit<T> {
> +    /// Same as [`CoherentInit::zeroed_slice_with_attrs`], but for a single element.
> +    pub fn zeroed_with_attrs(
> +        dev: &device::Device<Bound>,
> +        gfp_flags: kernel::alloc::Flags,
> +        dma_attrs: Attrs,
> +    ) -> Result<Self> {
> +        Coherent::zeroed_with_attrs(dev, gfp_flags, dma_attrs).map(Self)
> +    }
> +
> +    /// Same as [`CoherentInit::zeroed_slice`], but for a single element.
> +    pub fn zeroed(dev: &device::Device<Bound>, gfp_flags: kernel::alloc::Flags) -> Result<Self> {
> +        Self::zeroed_with_attrs(dev, gfp_flags, Attrs(0))
> +    }
> +}
> +
> +impl<T: AsBytes + FromBytes + KnownSize + ?Sized> Deref for CoherentInit<T> {
> +    type Target = T;
> +

The deref impls should be `#[inline]`.

Some other methods could be as well.

Best,
Gary

> +    fn deref(&self) -> &Self::Target {
> +        // SAFETY:
> +        // - We have not exposed the DMA address yet, so there can't be any concurrent access by a
> +        //   device.
> +        // - We have exclusive access to `self.0`.
> +        unsafe { self.0.as_ref() }
> +    }
> +}
> +
> +impl<T: AsBytes + FromBytes + KnownSize + ?Sized> DerefMut for CoherentInit<T> {
> +    fn deref_mut(&mut self) -> &mut Self::Target {
> +        // SAFETY:
> +        // - We have not exposed the DMA address yet, so there can't be any concurrent access by a
> +        //   device.
> +        // - We have exclusive access to `self.0`.
> +        unsafe { self.0.as_mut() }
> +    }
> +}
> +
> +impl<T: AsBytes + FromBytes + KnownSize + ?Sized> From<CoherentInit<T>> for Coherent<T> {
> +    fn from(value: CoherentInit<T>) -> Self {
> +        value.0
> +    }
> +}
> +
>  /// An abstraction of the `dma_alloc_coherent` API.
>  ///
>  /// This is an abstraction around the `dma_alloc_coherent` API which is used to allocate and map


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/8] rust: dma: add zeroed constructor to `Coherent`
  2026-03-03 16:22   ` [PATCH 3/8] rust: dma: add zeroed constructor to `Coherent` Danilo Krummrich
@ 2026-03-17  6:47     ` Alice Ryhl
  2026-03-20 14:35     ` Alexandre Courbot
  1 sibling, 0 replies; 30+ messages in thread
From: Alice Ryhl @ 2026-03-17  6:47 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: acourbot, ojeda, boqun, gary, bjorn3_gh, lossin, a.hindborg,
	tmgross, abdiel.janulgue, daniel.almeida, robin.murphy,
	driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On Tue, Mar 03, 2026 at 05:22:54PM +0100, Danilo Krummrich wrote:
> From: Gary Guo <gary@garyguo.net>
> 
> These constructors create a coherent container of a single object
> instead of slice. They are named `zeroed` and `zeroed_with_attrs` to
> emphasis that they are created initialized zeroed. It is intended that
> there'll be new constructors that take `PinInit` instead of zeroing.
> 
> Signed-off-by: Gary Guo <gary@garyguo.net>
> Signed-off-by: Danilo Krummrich <dakr@kernel.org>

Reviewed-by: Alice Ryhl <aliceryhl@google.com>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/8] rust: dma: add generalized container for types other than slices
  2026-03-03 16:22   ` [PATCH 2/8] rust: dma: add generalized container for types other than slices Danilo Krummrich
@ 2026-03-17  6:50     ` Alice Ryhl
  2026-03-17 12:56       ` Gary Guo
  0 siblings, 1 reply; 30+ messages in thread
From: Alice Ryhl @ 2026-03-17  6:50 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: acourbot, ojeda, boqun, gary, bjorn3_gh, lossin, a.hindborg,
	tmgross, abdiel.janulgue, daniel.almeida, robin.murphy,
	driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On Tue, Mar 03, 2026 at 05:22:53PM +0100, Danilo Krummrich wrote:
> From: Gary Guo <gary@garyguo.net>
> 
> Currently, `CoherentAllocation` is concecptually a DMA coherent container
> of a slice of `[T]` of runtime-checked length. Generalize it by creating
> `dma::Coherent<T>` which can hold any value of `T`.
> `Coherent::alloc_with_attrs` is implemented but not yet exposed, as I
> believe we should not expose the way to obtain an uninitialized coherent
> region.
> 
> `Coherent<[T]>` provides a `len` method instead of the previous `count()`
> method to be consistent with methods on slices.
> 
> The existing type is re-defined as a type alias of `Coherent<[T]>` to ease
> transition. Methods in use are not yet removed.
> 
> Signed-off-by: Gary Guo <gary@garyguo.net>
> Signed-off-by: Danilo Krummrich <dakr@kernel.org>

I see you have both `size()` and `len()`. I think it would be clearer to
rename `size()` to `len_bytes()`.

Otherwise LGTM.
Reviewed-by: Alice Ryhl <aliceryhl@google.com>

Alice


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/8] rust: dma: introduce dma::CoherentInit for memory initialization
  2026-03-03 16:22   ` [PATCH 4/8] rust: dma: introduce dma::CoherentInit for memory initialization Danilo Krummrich
  2026-03-12 13:51     ` Gary Guo
@ 2026-03-17  6:52     ` Alice Ryhl
  2026-03-17 14:40       ` Danilo Krummrich
  2026-03-20 14:35     ` Alexandre Courbot
  2 siblings, 1 reply; 30+ messages in thread
From: Alice Ryhl @ 2026-03-17  6:52 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: acourbot, ojeda, boqun, gary, bjorn3_gh, lossin, a.hindborg,
	tmgross, abdiel.janulgue, daniel.almeida, robin.murphy,
	driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On Tue, Mar 03, 2026 at 05:22:55PM +0100, Danilo Krummrich wrote:
> Currently, dma::Coherent cannot safely provide (mutable) access to its
> underlying memory because the memory might be concurrently accessed by a
> DMA device. This makes it difficult to safely initialize the memory
> before handing it over to the hardware.
> 
> Introduce dma::CoherentInit, a type that encapsulates a dma::Coherent
> before its DMA address is exposed to the device. dma::CoherentInit can
> guarantee exclusive access to the inner dma::Coherent and implement
> Deref and DerefMut.
> 
> Once the memory is properly initialized, dma::CoherentInit can be
> converted into a regular dma::Coherent.
> 
> Signed-off-by: Danilo Krummrich <dakr@kernel.org>

overall LGTM

Reviewed-by: Alice Ryhl <aliceryhl@google.com>

> +        let ptr = core::ptr::from_mut(&mut self[i]);

&raw mut self[i].

Alice

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/8] rust: dma: add generalized container for types other than slices
  2026-03-17  6:50     ` Alice Ryhl
@ 2026-03-17 12:56       ` Gary Guo
  2026-03-20 13:58         ` Alexandre Courbot
  0 siblings, 1 reply; 30+ messages in thread
From: Gary Guo @ 2026-03-17 12:56 UTC (permalink / raw)
  To: Alice Ryhl, Danilo Krummrich
  Cc: acourbot, ojeda, boqun, gary, bjorn3_gh, lossin, a.hindborg,
	tmgross, abdiel.janulgue, daniel.almeida, robin.murphy,
	driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On Tue Mar 17, 2026 at 6:50 AM GMT, Alice Ryhl wrote:
> On Tue, Mar 03, 2026 at 05:22:53PM +0100, Danilo Krummrich wrote:
>> From: Gary Guo <gary@garyguo.net>
>> 
>> Currently, `CoherentAllocation` is concecptually a DMA coherent container
>> of a slice of `[T]` of runtime-checked length. Generalize it by creating
>> `dma::Coherent<T>` which can hold any value of `T`.
>> `Coherent::alloc_with_attrs` is implemented but not yet exposed, as I
>> believe we should not expose the way to obtain an uninitialized coherent
>> region.
>> 
>> `Coherent<[T]>` provides a `len` method instead of the previous `count()`
>> method to be consistent with methods on slices.
>> 
>> The existing type is re-defined as a type alias of `Coherent<[T]>` to ease
>> transition. Methods in use are not yet removed.
>> 
>> Signed-off-by: Gary Guo <gary@garyguo.net>
>> Signed-off-by: Danilo Krummrich <dakr@kernel.org>
>
> I see you have both `size()` and `len()`. I think it would be clearer to
> rename `size()` to `len_bytes()`.

This is an API that already exists and I simply moved it around so it is
implemented for all types, not just `[T]`.

I also think that `size()` quite unambiguously means the size in bytes (like, in
all allocation and I/O APIs, `size` means size of object in bytes); I don't see
much benefit in renaming it `len_bytes`.

Best,
Gary

>
> Otherwise LGTM.
> Reviewed-by: Alice Ryhl <aliceryhl@google.com>
>
> Alice


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/8] rust: dma: introduce dma::CoherentInit for memory initialization
  2026-03-17  6:52     ` Alice Ryhl
@ 2026-03-17 14:40       ` Danilo Krummrich
  2026-03-17 14:43         ` Alice Ryhl
  0 siblings, 1 reply; 30+ messages in thread
From: Danilo Krummrich @ 2026-03-17 14:40 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: acourbot, ojeda, boqun, gary, bjorn3_gh, lossin, a.hindborg,
	tmgross, abdiel.janulgue, daniel.almeida, robin.murphy,
	driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On Tue Mar 17, 2026 at 7:52 AM CET, Alice Ryhl wrote:
> On Tue, Mar 03, 2026 at 05:22:55PM +0100, Danilo Krummrich wrote:
>> Currently, dma::Coherent cannot safely provide (mutable) access to its
>> underlying memory because the memory might be concurrently accessed by a
>> DMA device. This makes it difficult to safely initialize the memory
>> before handing it over to the hardware.
>> 
>> Introduce dma::CoherentInit, a type that encapsulates a dma::Coherent
>> before its DMA address is exposed to the device. dma::CoherentInit can
>> guarantee exclusive access to the inner dma::Coherent and implement
>> Deref and DerefMut.
>> 
>> Once the memory is properly initialized, dma::CoherentInit can be
>> converted into a regular dma::Coherent.
>> 
>> Signed-off-by: Danilo Krummrich <dakr@kernel.org>
>
> overall LGTM
>
> Reviewed-by: Alice Ryhl <aliceryhl@google.com>

Gary and me concluded that dma::CoherentBox would probably be a better name for
what this structure represents. I.e. you can carry it around and use it similar
to a "real" Box until it is converted to dma::Coherent in order to make it
available to the device.

Is your RB still valid with this rename?

>> +        let ptr = core::ptr::from_mut(&mut self[i]);
>
> &raw mut self[i].
>
> Alice


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/8] rust: dma: introduce dma::CoherentInit for memory initialization
  2026-03-17 14:40       ` Danilo Krummrich
@ 2026-03-17 14:43         ` Alice Ryhl
  0 siblings, 0 replies; 30+ messages in thread
From: Alice Ryhl @ 2026-03-17 14:43 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: acourbot, ojeda, boqun, gary, bjorn3_gh, lossin, a.hindborg,
	tmgross, abdiel.janulgue, daniel.almeida, robin.murphy,
	driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On Tue, Mar 17, 2026 at 3:40 PM Danilo Krummrich <dakr@kernel.org> wrote:
>
> On Tue Mar 17, 2026 at 7:52 AM CET, Alice Ryhl wrote:
> > On Tue, Mar 03, 2026 at 05:22:55PM +0100, Danilo Krummrich wrote:
> >> Currently, dma::Coherent cannot safely provide (mutable) access to its
> >> underlying memory because the memory might be concurrently accessed by a
> >> DMA device. This makes it difficult to safely initialize the memory
> >> before handing it over to the hardware.
> >>
> >> Introduce dma::CoherentInit, a type that encapsulates a dma::Coherent
> >> before its DMA address is exposed to the device. dma::CoherentInit can
> >> guarantee exclusive access to the inner dma::Coherent and implement
> >> Deref and DerefMut.
> >>
> >> Once the memory is properly initialized, dma::CoherentInit can be
> >> converted into a regular dma::Coherent.
> >>
> >> Signed-off-by: Danilo Krummrich <dakr@kernel.org>
> >
> > overall LGTM
> >
> > Reviewed-by: Alice Ryhl <aliceryhl@google.com>
>
> Gary and me concluded that dma::CoherentBox would probably be a better name for
> what this structure represents. I.e. you can carry it around and use it similar
> to a "real" Box until it is converted to dma::Coherent in order to make it
> available to the device.
>
> Is your RB still valid with this rename?

Yes.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/8] rust: dma: introduce dma::CoherentInit for memory initialization
  2026-03-20 14:35     ` Alexandre Courbot
@ 2026-03-20 13:51       ` Danilo Krummrich
  0 siblings, 0 replies; 30+ messages in thread
From: Danilo Krummrich @ 2026-03-20 13:51 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: aliceryhl, ojeda, boqun, gary, bjorn3_gh, lossin, a.hindborg,
	tmgross, abdiel.janulgue, daniel.almeida, robin.murphy,
	driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On 3/20/2026 3:35 PM, Alexandre Courbot wrote:
> On Wed Mar 4, 2026 at 1:22 AM JST, Danilo Krummrich wrote:
>> +/// # fn test(dev: &Device<Bound>) -> Result {
>> +/// let mut dmem: CoherentInit<u64> =
>> +///     CoherentInit::zeroed_with_attrs(dev, GFP_KERNEL, DMA_ATTR_NO_WARN)?;
> 
> Since this is an example, shall we use the simpler
> `CoherentInit::zeroed`?

Sure.

>> +    /// Initializes the element at `i` using the given initializer.
>> +    ///
>> +    /// Returns `EINVAL` if `i` is out of bounds.
>> +    pub fn init_at<E>(&mut self, i: usize, init: impl Init<T, E>) -> Result
> 
> Should this method be introduced in the next patch, or even in its own
> patch? It feels a bit out of place at this stage since the non-array
> `CoherentInit` doesn't have an equivalent.

The non-slice variant doesn't need it.

I don't see an advantage creating a separate patch for this.

> I was also wondering whether we could have an `init` method that
> initializes all the elements without having to zero the whole array
> first, but I guess it might be a bit difficult to implement in a
> flexible enough way.

You have this in the next patch, Coherent::init() works with arrays.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/8] rust: dma: add generalized container for types other than slices
  2026-03-17 12:56       ` Gary Guo
@ 2026-03-20 13:58         ` Alexandre Courbot
  0 siblings, 0 replies; 30+ messages in thread
From: Alexandre Courbot @ 2026-03-20 13:58 UTC (permalink / raw)
  To: Gary Guo
  Cc: Alice Ryhl, Danilo Krummrich, ojeda, boqun, bjorn3_gh, lossin,
	a.hindborg, tmgross, abdiel.janulgue, daniel.almeida,
	robin.murphy, driver-core, nouveau, dri-devel, rust-for-linux,
	linux-kernel

On Tue Mar 17, 2026 at 9:56 PM JST, Gary Guo wrote:
> On Tue Mar 17, 2026 at 6:50 AM GMT, Alice Ryhl wrote:
>> On Tue, Mar 03, 2026 at 05:22:53PM +0100, Danilo Krummrich wrote:
>>> From: Gary Guo <gary@garyguo.net>
>>> 
>>> Currently, `CoherentAllocation` is concecptually a DMA coherent container
>>> of a slice of `[T]` of runtime-checked length. Generalize it by creating
>>> `dma::Coherent<T>` which can hold any value of `T`.
>>> `Coherent::alloc_with_attrs` is implemented but not yet exposed, as I
>>> believe we should not expose the way to obtain an uninitialized coherent
>>> region.
>>> 
>>> `Coherent<[T]>` provides a `len` method instead of the previous `count()`
>>> method to be consistent with methods on slices.
>>> 
>>> The existing type is re-defined as a type alias of `Coherent<[T]>` to ease
>>> transition. Methods in use are not yet removed.
>>> 
>>> Signed-off-by: Gary Guo <gary@garyguo.net>
>>> Signed-off-by: Danilo Krummrich <dakr@kernel.org>
>>
>> I see you have both `size()` and `len()`. I think it would be clearer to
>> rename `size()` to `len_bytes()`.
>
> This is an API that already exists and I simply moved it around so it is
> implemented for all types, not just `[T]`.
>
> I also think that `size()` quite unambiguously means the size in bytes (like, in
> all allocation and I/O APIs, `size` means size of object in bytes); I don't see
> much benefit in renaming it `len_bytes`.

Agreed, furthermore `len` is used in the standard library so the pattern
of it returning a number of elements is pretty established.

Generally speaking I tend to prefer having the units mentioned in the
doccomment rather than the method name.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/8] rust: dma: add zeroed constructor to `Coherent`
  2026-03-03 16:22   ` [PATCH 3/8] rust: dma: add zeroed constructor to `Coherent` Danilo Krummrich
  2026-03-17  6:47     ` Alice Ryhl
@ 2026-03-20 14:35     ` Alexandre Courbot
  1 sibling, 0 replies; 30+ messages in thread
From: Alexandre Courbot @ 2026-03-20 14:35 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: aliceryhl, ojeda, boqun, gary, bjorn3_gh, lossin, a.hindborg,
	tmgross, abdiel.janulgue, daniel.almeida, robin.murphy,
	driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On Wed Mar 4, 2026 at 1:22 AM JST, Danilo Krummrich wrote:
<snip>
> @@ -572,6 +598,41 @@ fn alloc_slice_with_attrs(
>              dma_attrs,
>          })
>      }
> +
> +    /// Allocates a zeroed region of type `T` of coherent memory.
> +    ///
> +    /// Unlike `Coherent::<[T; N]>::zeroed_with_attrs`, `Coherent::<T>::zeored_slices` support

s/zeored_slices/zeroed_slices

Otherwise,

Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/8] rust: dma: introduce dma::CoherentInit for memory initialization
  2026-03-03 16:22   ` [PATCH 4/8] rust: dma: introduce dma::CoherentInit for memory initialization Danilo Krummrich
  2026-03-12 13:51     ` Gary Guo
  2026-03-17  6:52     ` Alice Ryhl
@ 2026-03-20 14:35     ` Alexandre Courbot
  2026-03-20 13:51       ` Danilo Krummrich
  2 siblings, 1 reply; 30+ messages in thread
From: Alexandre Courbot @ 2026-03-20 14:35 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: aliceryhl, ojeda, boqun, gary, bjorn3_gh, lossin, a.hindborg,
	tmgross, abdiel.janulgue, daniel.almeida, robin.murphy,
	driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On Wed Mar 4, 2026 at 1:22 AM JST, Danilo Krummrich wrote:
<snip>
> @@ -352,6 +358,151 @@ fn from(direction: DataDirection) -> Self {
>      }
>  }
>  
> +/// Initializer type for [`Coherent`].
> +///
> +/// A [`Coherent`] object can't provide access to its memory as (mutable) slice safely, since it
> +/// can't fulfill the requirements for creating a slice. For instance, it is not valid to have a
> +/// (mutable) slice to of the memory of a [`Coherent`] while the memory might be accessed by a
> +/// device.
> +///
> +/// In contrast, this initializer type is able to fulfill the requirements to safely obtain a
> +/// (mutable) slice, as it neither provides access to the DMA address of the embedded [`Coherent`],
> +/// nor can it be used with the DMA projection accessors.
> +///
> +/// Once initialized, this type can be converted to a regular [`Coherent`] object.
> +///
> +/// # Examples
> +///
> +/// `CoherentInit<T>`:
> +///
> +/// ```
> +/// # use kernel::device::{
> +/// #     Bound,
> +/// #     Device,
> +/// # };
> +/// use kernel::dma::{attrs::*,
> +///     Coherent,
> +///     CoherentInit,
> +/// };
> +///
> +/// # fn test(dev: &Device<Bound>) -> Result {
> +/// let mut dmem: CoherentInit<u64> =
> +///     CoherentInit::zeroed_with_attrs(dev, GFP_KERNEL, DMA_ATTR_NO_WARN)?;

Since this is an example, shall we use the simpler
`CoherentInit::zeroed`?

> +/// *dmem = 42;
> +/// let dmem: Coherent<u64> = dmem.into();
> +/// # Ok::<(), Error>(()) }
> +/// ```
> +///
> +/// `CoherentInit<[T]>`:
> +///
> +///
> +/// ```
> +/// # use kernel::device::{
> +/// #     Bound,
> +/// #     Device,
> +/// # };
> +/// use kernel::dma::{attrs::*,
> +///     Coherent,
> +///     CoherentInit,
> +/// };
> +///
> +/// # fn test(dev: &Device<Bound>) -> Result {
> +/// let mut dmem: CoherentInit<[u64]> =
> +///     CoherentInit::zeroed_slice_with_attrs(dev, 4, GFP_KERNEL, DMA_ATTR_NO_WARN)?;

Same here.

> +/// dmem.fill(42);
> +/// let dmem: Coherent<[u64]> = dmem.into();
> +/// # Ok::<(), Error>(()) }
> +/// ```
> +pub struct CoherentInit<T: AsBytes + FromBytes + KnownSize + ?Sized>(Coherent<T>);
> +
> +impl<T: AsBytes + FromBytes> CoherentInit<[T]> {
> +    /// Initializer variant of [`Coherent::zeroed_slice_with_attrs`].
> +    pub fn zeroed_slice_with_attrs(
> +        dev: &device::Device<Bound>,
> +        count: usize,
> +        gfp_flags: kernel::alloc::Flags,
> +        dma_attrs: Attrs,
> +    ) -> Result<Self> {
> +        Coherent::zeroed_slice_with_attrs(dev, count, gfp_flags, dma_attrs).map(Self)
> +    }
> +
> +    /// Same as [CoherentInit::zeroed_slice_with_attrs], but with `dma::Attrs(0)`.
> +    pub fn zeroed_slice(
> +        dev: &device::Device<Bound>,
> +        count: usize,
> +        gfp_flags: kernel::alloc::Flags,
> +    ) -> Result<Self> {
> +        Self::zeroed_slice_with_attrs(dev, count, gfp_flags, Attrs(0))
> +    }
> +
> +    /// Initializes the element at `i` using the given initializer.
> +    ///
> +    /// Returns `EINVAL` if `i` is out of bounds.
> +    pub fn init_at<E>(&mut self, i: usize, init: impl Init<T, E>) -> Result

Should this method be introduced in the next patch, or even in its own
patch? It feels a bit out of place at this stage since the non-array
`CoherentInit` doesn't have an equivalent.

I was also wondering whether we could have an `init` method that
initializes all the elements without having to zero the whole array
first, but I guess it might be a bit difficult to implement in a
flexible enough way.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 6/8] gpu: nova-core: use Coherent::init to initialize GspFwWprMeta
  2026-03-03 16:22   ` [PATCH 6/8] gpu: nova-core: use Coherent::init to initialize GspFwWprMeta Danilo Krummrich
  2026-03-03 17:21     ` Gary Guo
@ 2026-03-20 14:36     ` Alexandre Courbot
  1 sibling, 0 replies; 30+ messages in thread
From: Alexandre Courbot @ 2026-03-20 14:36 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: aliceryhl, ojeda, boqun, gary, bjorn3_gh, lossin, a.hindborg,
	tmgross, abdiel.janulgue, daniel.almeida, robin.murphy,
	driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On Wed Mar 4, 2026 at 1:22 AM JST, Danilo Krummrich wrote:
> Convert wpr_meta to use Coherent::init() and simplify the
> initialization.  It also avoids a separate initialization of
> GspFwWprMeta on the stack.
>
> Signed-off-by: Danilo Krummrich <dakr@kernel.org>

Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 8/8] gpu: nova-core: convert to new dma::Coherent API
  2026-03-03 16:22   ` [PATCH 8/8] gpu: nova-core: convert to new dma::Coherent API Danilo Krummrich
@ 2026-03-20 14:36     ` Alexandre Courbot
  2026-03-20 15:05       ` Gary Guo
  0 siblings, 1 reply; 30+ messages in thread
From: Alexandre Courbot @ 2026-03-20 14:36 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: aliceryhl, ojeda, boqun, gary, bjorn3_gh, lossin, a.hindborg,
	tmgross, abdiel.janulgue, daniel.almeida, robin.murphy,
	driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On Wed Mar 4, 2026 at 1:22 AM JST, Danilo Krummrich wrote:
> From: Gary Guo <gary@garyguo.net>
>
> Remove all usages of dma::CoherentAllocation and use the new
> dma::Coherent type instead.
>
> Note that there are still remainders of the old dma::CoherentAllocation
> API, such as as_slice() and as_slice_mut().
>
> Signed-off-by: Gary Guo <gary@garyguo.net>
> Signed-off-by: Danilo Krummrich <dakr@kernel.org>
> ---
>  drivers/gpu/nova-core/dma.rs      | 19 +++++------
>  drivers/gpu/nova-core/falcon.rs   |  7 ++--
>  drivers/gpu/nova-core/firmware.rs | 10 ++----
>  drivers/gpu/nova-core/gsp.rs      | 18 ++++++----
>  drivers/gpu/nova-core/gsp/cmdq.rs | 55 ++++++++++++-------------------
>  5 files changed, 46 insertions(+), 63 deletions(-)
>
> diff --git a/drivers/gpu/nova-core/dma.rs b/drivers/gpu/nova-core/dma.rs
> index 7215398969da..3c19d5ffcfe8 100644
> --- a/drivers/gpu/nova-core/dma.rs
> +++ b/drivers/gpu/nova-core/dma.rs
> @@ -9,13 +9,13 @@
>  
>  use kernel::{
>      device,
> -    dma::CoherentAllocation,
> +    dma::Coherent,
>      page::PAGE_SIZE,
>      prelude::*, //
>  };
>  
>  pub(crate) struct DmaObject {
> -    dma: CoherentAllocation<u8>,
> +    dma: Coherent<[u8]>,
>  }

Actually, do we even still have a need for `DmaObject` now that
`Coherent` is available? It would be nice to check (as a follow-up
patch, I can look into it) whether we can remove that whole nova-local
`dma` module.

>  
>  impl DmaObject {
> @@ -24,23 +24,22 @@ pub(crate) fn new(dev: &device::Device<device::Bound>, len: usize) -> Result<Sel
>              .map_err(|_| EINVAL)?
>              .pad_to_align()
>              .size();
> -        let dma = CoherentAllocation::alloc_coherent(dev, len, GFP_KERNEL | __GFP_ZERO)?;
> +        let dma = Coherent::zeroed_slice(dev, len, GFP_KERNEL)?;
>  
>          Ok(Self { dma })
>      }
>  
>      pub(crate) fn from_data(dev: &device::Device<device::Bound>, data: &[u8]) -> Result<Self> {
> -        Self::new(dev, data.len()).and_then(|mut dma_obj| {
> -            // SAFETY: We have just allocated the DMA memory, we are the only users and
> -            // we haven't made the device aware of the handle yet.
> -            unsafe { dma_obj.write(data, 0)? }
> -            Ok(dma_obj)
> -        })
> +        let dma_obj = Self::new(dev, data.len())?;
> +        // SAFETY: We have just allocated the DMA memory, we are the only users and
> +        // we haven't made the device aware of the handle yet.
> +        unsafe { dma_obj.as_mut()[..data.len()].copy_from_slice(data) };
> +        Ok(dma_obj)
>      }
>  }
>  
>  impl Deref for DmaObject {
> -    type Target = CoherentAllocation<u8>;
> +    type Target = Coherent<[u8]>;
>  
>      fn deref(&self) -> &Self::Target {
>          &self.dma
> diff --git a/drivers/gpu/nova-core/falcon.rs b/drivers/gpu/nova-core/falcon.rs
> index 37bfee1d0949..39f5df568ddb 100644
> --- a/drivers/gpu/nova-core/falcon.rs
> +++ b/drivers/gpu/nova-core/falcon.rs
> @@ -25,10 +25,7 @@
>      driver::Bar0,
>      falcon::hal::LoadMethod,
>      gpu::Chipset,
> -    num::{
> -        FromSafeCast,
> -        IntoSafeCast, //
> -    },
> +    num::FromSafeCast,
>      regs,
>      regs::macros::RegisterBase, //
>  };
> @@ -434,7 +431,7 @@ fn dma_wr<F: FalconFirmware<Target = E>>(
>              }
>              FalconMem::Dmem => (
>                  0,
> -                fw.dma_handle_with_offset(load_offsets.src_start.into_safe_cast())?,
> +                fw.dma_handle() + DmaAddress::from(load_offsets.src_start),

Aren't we losing the bounds checking done by `dma_handle_with_offset`?

Mmm, but looking below we are checking that the end of the transfer
doesn't go beyond the object bounds, so that was probably redundant
anyway. Maybe we should do a `checked_add` still for safety.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 8/8] gpu: nova-core: convert to new dma::Coherent API
  2026-03-20 14:36     ` Alexandre Courbot
@ 2026-03-20 15:05       ` Gary Guo
  0 siblings, 0 replies; 30+ messages in thread
From: Gary Guo @ 2026-03-20 15:05 UTC (permalink / raw)
  To: Alexandre Courbot, Danilo Krummrich
  Cc: aliceryhl, ojeda, boqun, gary, bjorn3_gh, lossin, a.hindborg,
	tmgross, abdiel.janulgue, daniel.almeida, robin.murphy,
	driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On Fri Mar 20, 2026 at 2:36 PM GMT, Alexandre Courbot wrote:
> On Wed Mar 4, 2026 at 1:22 AM JST, Danilo Krummrich wrote:
>> From: Gary Guo <gary@garyguo.net>
>>
>> Remove all usages of dma::CoherentAllocation and use the new
>> dma::Coherent type instead.
>>
>> Note that there are still remainders of the old dma::CoherentAllocation
>> API, such as as_slice() and as_slice_mut().
>>
>> Signed-off-by: Gary Guo <gary@garyguo.net>
>> Signed-off-by: Danilo Krummrich <dakr@kernel.org>
>> ---
>>  drivers/gpu/nova-core/dma.rs      | 19 +++++------
>>  drivers/gpu/nova-core/falcon.rs   |  7 ++--
>>  drivers/gpu/nova-core/firmware.rs | 10 ++----
>>  drivers/gpu/nova-core/gsp.rs      | 18 ++++++----
>>  drivers/gpu/nova-core/gsp/cmdq.rs | 55 ++++++++++++-------------------
>>  5 files changed, 46 insertions(+), 63 deletions(-)
>>
>> diff --git a/drivers/gpu/nova-core/dma.rs b/drivers/gpu/nova-core/dma.rs
>> index 7215398969da..3c19d5ffcfe8 100644
>> --- a/drivers/gpu/nova-core/dma.rs
>> +++ b/drivers/gpu/nova-core/dma.rs
>> @@ -9,13 +9,13 @@
>>  
>>  use kernel::{
>>      device,
>> -    dma::CoherentAllocation,
>> +    dma::Coherent,
>>      page::PAGE_SIZE,
>>      prelude::*, //
>>  };
>>  
>>  pub(crate) struct DmaObject {
>> -    dma: CoherentAllocation<u8>,
>> +    dma: Coherent<[u8]>,
>>  }
>
> Actually, do we even still have a need for `DmaObject` now that
> `Coherent` is available? It would be nice to check (as a follow-up
> patch, I can look into it) whether we can remove that whole nova-local
> `dma` module.
>
>>  
>>  impl DmaObject {
>> @@ -24,23 +24,22 @@ pub(crate) fn new(dev: &device::Device<device::Bound>, len: usize) -> Result<Sel
>>              .map_err(|_| EINVAL)?
>>              .pad_to_align()
>>              .size();
>> -        let dma = CoherentAllocation::alloc_coherent(dev, len, GFP_KERNEL | __GFP_ZERO)?;
>> +        let dma = Coherent::zeroed_slice(dev, len, GFP_KERNEL)?;
>>  
>>          Ok(Self { dma })
>>      }
>>  
>>      pub(crate) fn from_data(dev: &device::Device<device::Bound>, data: &[u8]) -> Result<Self> {
>> -        Self::new(dev, data.len()).and_then(|mut dma_obj| {
>> -            // SAFETY: We have just allocated the DMA memory, we are the only users and
>> -            // we haven't made the device aware of the handle yet.
>> -            unsafe { dma_obj.write(data, 0)? }
>> -            Ok(dma_obj)
>> -        })
>> +        let dma_obj = Self::new(dev, data.len())?;
>> +        // SAFETY: We have just allocated the DMA memory, we are the only users and
>> +        // we haven't made the device aware of the handle yet.
>> +        unsafe { dma_obj.as_mut()[..data.len()].copy_from_slice(data) };
>> +        Ok(dma_obj)
>>      }
>>  }
>>  
>>  impl Deref for DmaObject {
>> -    type Target = CoherentAllocation<u8>;
>> +    type Target = Coherent<[u8]>;
>>  
>>      fn deref(&self) -> &Self::Target {
>>          &self.dma
>> diff --git a/drivers/gpu/nova-core/falcon.rs b/drivers/gpu/nova-core/falcon.rs
>> index 37bfee1d0949..39f5df568ddb 100644
>> --- a/drivers/gpu/nova-core/falcon.rs
>> +++ b/drivers/gpu/nova-core/falcon.rs
>> @@ -25,10 +25,7 @@
>>      driver::Bar0,
>>      falcon::hal::LoadMethod,
>>      gpu::Chipset,
>> -    num::{
>> -        FromSafeCast,
>> -        IntoSafeCast, //
>> -    },
>> +    num::FromSafeCast,
>>      regs,
>>      regs::macros::RegisterBase, //
>>  };
>> @@ -434,7 +431,7 @@ fn dma_wr<F: FalconFirmware<Target = E>>(
>>              }
>>              FalconMem::Dmem => (
>>                  0,
>> -                fw.dma_handle_with_offset(load_offsets.src_start.into_safe_cast())?,
>> +                fw.dma_handle() + DmaAddress::from(load_offsets.src_start),
>
> Aren't we losing the bounds checking done by `dma_handle_with_offset`?
>
> Mmm, but looking below we are checking that the end of the transfer
> doesn't go beyond the object bounds, so that was probably redundant
> anyway. Maybe we should do a `checked_add` still for safety.

I've had this as `dma_handle_with_offset` no longer make sense in the context of
generic `Coherent` type. Although, I can probably add something like:

impl View<'_, Coherent<...>, ...> {
    fn dma_handle(&self) -> DmaAddress;
}

to replace this so you can get the DMA handle for any projections of the
coherent object (and the bounds checking is done when projecting).

Best,
Gary

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/8] dma::Coherent & dma::CoherentInit API
  2026-03-03 16:22 ` [PATCH 0/8] dma::Coherent & dma::CoherentInit API Danilo Krummrich
                     ` (7 preceding siblings ...)
  2026-03-03 16:22   ` [PATCH 8/8] gpu: nova-core: convert to new dma::Coherent API Danilo Krummrich
@ 2026-03-24 12:33   ` Andreas Hindborg
  2026-03-24 12:36     ` Danilo Krummrich
  8 siblings, 1 reply; 30+ messages in thread
From: Andreas Hindborg @ 2026-03-24 12:33 UTC (permalink / raw)
  To: Danilo Krummrich, aliceryhl, acourbot, ojeda, boqun, gary,
	bjorn3_gh, lossin, tmgross, abdiel.janulgue, daniel.almeida,
	robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel,
	Danilo Krummrich

"Danilo Krummrich" <dakr@kernel.org> writes:

> This patch series introduces the dma::Coherent API Gary worked out in the
> context of his I/O projection work.
>
> Additionally, introduce dma::CoherentInit, a type that encapsulates a
> dma::Coherent object before its DMA address is exposed to the device.
> dma::CoherentInit can guarantee exclusive access to the inner dma::Coherent
> object and implement Deref and DerefMut.
>
> Also add Coherent::init() and Coherent::init_with_attrs() so we can directly
> initialize a new dma::Coherent object through an impl Init<T, E>.
>
> Danilo Krummrich (5):
>   rust: dma: use "kernel vertical" style for imports
>   rust: dma: introduce dma::CoherentInit for memory initialization
>   rust: dma: add Coherent:init() and Coherent::init_with_attrs()
>   gpu: nova-core: use Coherent::init to initialize GspFwWprMeta
>   gpu: nova-core: convert Gsp::new() to use CoherentInit
>
> Gary Guo (3):
>   rust: dma: add generalized container for types other than slices
>   rust: dma: add zeroed constructor to `Coherent`
>   gpu: nova-core: convert to new dma::Coherent API
>
>  drivers/gpu/nova-core/dma.rs      |  19 +-
>  drivers/gpu/nova-core/falcon.rs   |   7 +-
>  drivers/gpu/nova-core/firmware.rs |  10 +-
>  drivers/gpu/nova-core/gsp.rs      |  65 ++--
>  drivers/gpu/nova-core/gsp/boot.rs |   7 +-
>  drivers/gpu/nova-core/gsp/cmdq.rs |  55 +--
>  drivers/gpu/nova-core/gsp/fw.rs   |  82 ++--
>  rust/kernel/device.rs             |   4 +-
>  rust/kernel/dma.rs                | 626 +++++++++++++++++++++++-------
>  samples/rust/rust_dma.rs          |   8 +-
>  10 files changed, 619 insertions(+), 264 deletions(-)
>
>
> base-commit: 1195fcbda62f12108dc9be56fa4173897905b90c

What did you base this on?


Best regards,
Andreas Hindborg




^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/8] dma::Coherent & dma::CoherentInit API
  2026-03-24 12:33   ` [PATCH 0/8] dma::Coherent & dma::CoherentInit API Andreas Hindborg
@ 2026-03-24 12:36     ` Danilo Krummrich
  2026-03-24 12:46       ` Andreas Hindborg
  0 siblings, 1 reply; 30+ messages in thread
From: Danilo Krummrich @ 2026-03-24 12:36 UTC (permalink / raw)
  To: Andreas Hindborg
  Cc: aliceryhl, acourbot, ojeda, boqun, gary, bjorn3_gh, lossin,
	tmgross, abdiel.janulgue, daniel.almeida, robin.murphy,
	driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On 3/24/26 1:33 PM, Andreas Hindborg wrote:
> What did you base this on?

drm-rust-next

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/8] dma::Coherent & dma::CoherentInit API
  2026-03-24 12:36     ` Danilo Krummrich
@ 2026-03-24 12:46       ` Andreas Hindborg
  2026-03-24 12:51         ` Danilo Krummrich
  0 siblings, 1 reply; 30+ messages in thread
From: Andreas Hindborg @ 2026-03-24 12:46 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: aliceryhl, acourbot, ojeda, boqun, gary, bjorn3_gh, lossin,
	tmgross, abdiel.janulgue, daniel.almeida, robin.murphy,
	driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

"Danilo Krummrich" <dakr@kernel.org> writes:

> On 3/24/26 1:33 PM, Andreas Hindborg wrote:
>> What did you base this on?
>
> drm-rust-next

I don't think 1195fcbda62f12108dc9be56fa4173897905b90c is in
drm-rust-next - or any of the other branches in of
https://gitlab.freedesktop.org/drm/rust/kernel.git .


Best regards,
Andreas Hindborg



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/8] dma::Coherent & dma::CoherentInit API
  2026-03-24 12:46       ` Andreas Hindborg
@ 2026-03-24 12:51         ` Danilo Krummrich
  0 siblings, 0 replies; 30+ messages in thread
From: Danilo Krummrich @ 2026-03-24 12:51 UTC (permalink / raw)
  To: Andreas Hindborg
  Cc: aliceryhl, acourbot, ojeda, boqun, gary, bjorn3_gh, lossin,
	tmgross, abdiel.janulgue, daniel.almeida, robin.murphy,
	driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On 3/24/26 1:46 PM, Andreas Hindborg wrote:
> "Danilo Krummrich" <dakr@kernel.org> writes:
> 
>> On 3/24/26 1:33 PM, Andreas Hindborg wrote:
>>> What did you base this on?
>>
>> drm-rust-next
> 
> I don't think 1195fcbda62f12108dc9be56fa4173897905b90c is in
> drm-rust-next - or any of the other branches in of
> https://gitlab.freedesktop.org/drm/rust/kernel.git .

Oh, I did not notice that you replied to v1, there's already a v2.

[1] https://lore.kernel.org/driver-core/20260320194626.36263-1-dakr@kernel.org/

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2026-03-24 12:51 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <9A1gWen63AbJmi06Y_vu_qF1S86nj3fcQEil9RzLAn8QkIYpCOrLMNO7JcgIP0BccPHhzaYaQ_te1S_gKuHRgg==@protonmail.internalid>
2026-03-03 16:22 ` [PATCH 0/8] dma::Coherent & dma::CoherentInit API Danilo Krummrich
2026-03-03 16:22   ` [PATCH 1/8] rust: dma: use "kernel vertical" style for imports Danilo Krummrich
2026-03-03 17:12     ` Gary Guo
2026-03-03 16:22   ` [PATCH 2/8] rust: dma: add generalized container for types other than slices Danilo Krummrich
2026-03-17  6:50     ` Alice Ryhl
2026-03-17 12:56       ` Gary Guo
2026-03-20 13:58         ` Alexandre Courbot
2026-03-03 16:22   ` [PATCH 3/8] rust: dma: add zeroed constructor to `Coherent` Danilo Krummrich
2026-03-17  6:47     ` Alice Ryhl
2026-03-20 14:35     ` Alexandre Courbot
2026-03-03 16:22   ` [PATCH 4/8] rust: dma: introduce dma::CoherentInit for memory initialization Danilo Krummrich
2026-03-12 13:51     ` Gary Guo
2026-03-17  6:52     ` Alice Ryhl
2026-03-17 14:40       ` Danilo Krummrich
2026-03-17 14:43         ` Alice Ryhl
2026-03-20 14:35     ` Alexandre Courbot
2026-03-20 13:51       ` Danilo Krummrich
2026-03-03 16:22   ` [PATCH 5/8] rust: dma: add Coherent:init() and Coherent::init_with_attrs() Danilo Krummrich
2026-03-03 16:22   ` [PATCH 6/8] gpu: nova-core: use Coherent::init to initialize GspFwWprMeta Danilo Krummrich
2026-03-03 17:21     ` Gary Guo
2026-03-20 14:36     ` Alexandre Courbot
2026-03-03 16:22   ` [PATCH 7/8] gpu: nova-core: convert Gsp::new() to use CoherentInit Danilo Krummrich
2026-03-03 17:33     ` Gary Guo
2026-03-03 16:22   ` [PATCH 8/8] gpu: nova-core: convert to new dma::Coherent API Danilo Krummrich
2026-03-20 14:36     ` Alexandre Courbot
2026-03-20 15:05       ` Gary Guo
2026-03-24 12:33   ` [PATCH 0/8] dma::Coherent & dma::CoherentInit API Andreas Hindborg
2026-03-24 12:36     ` Danilo Krummrich
2026-03-24 12:46       ` Andreas Hindborg
2026-03-24 12:51         ` Danilo Krummrich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox