public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/8] dma::Coherent & dma::CoherentBox API
@ 2026-03-20 19:45 Danilo Krummrich
  2026-03-20 19:45 ` [PATCH v2 1/8] rust: dma: use "kernel vertical" style for imports Danilo Krummrich
                   ` (9 more replies)
  0 siblings, 10 replies; 31+ messages in thread
From: Danilo Krummrich @ 2026-03-20 19:45 UTC (permalink / raw)
  To: aliceryhl, acourbot, ojeda, boqun, gary, bjorn3_gh, lossin,
	a.hindborg, tmgross, abdiel.janulgue, daniel.almeida,
	robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel,
	Danilo Krummrich

This patch series introduces the dma::Coherent API Gary worked out in the
context of his I/O projection work.

Additionally, introduce dma::CoherentBox, a type that encapsulates a
dma::Coherent object before its DMA address is exposed to the device.
dma::CoherentBox can guarantee exclusive access to the inner dma::Coherent
object and implement Deref and DerefMut.

Also add Coherent::init() and Coherent::init_with_attrs() so we can directly
initialize a new dma::Coherent object through an impl Init<T, E>.

Changes in v2:
  - Rebase onto the DMA and nova-core fixes that went into -rc4.
  - Rename CoherentInit to CoherentBox.
  - Add a bunch of #[inline] attributes.
  - Remove a few unnecessary trait bounds in "rust: dma: add generalized
    container for types other than slices".
  - Fix a typo and convert core::ptr::from_mut(&mut self[i]) into
    &raw mut self[i].

Danilo Krummrich (5):
  rust: dma: use "kernel vertical" style for imports
  rust: dma: introduce dma::CoherentBox for memory initialization
  rust: dma: add Coherent:init() and Coherent::init_with_attrs()
  gpu: nova-core: use Coherent::init to initialize GspFwWprMeta
  gpu: nova-core: convert Gsp::new() to use CoherentBox

Gary Guo (3):
  rust: dma: add generalized container for types other than slices
  rust: dma: add zeroed constructor to `Coherent`
  gpu: nova-core: convert to new dma::Coherent API

 drivers/gpu/nova-core/dma.rs      |  19 +-
 drivers/gpu/nova-core/falcon.rs   |   5 +-
 drivers/gpu/nova-core/gsp.rs      |  68 ++--
 drivers/gpu/nova-core/gsp/boot.rs |   7 +-
 drivers/gpu/nova-core/gsp/cmdq.rs |  21 +-
 drivers/gpu/nova-core/gsp/fw.rs   | 128 +++---
 rust/kernel/device.rs             |   4 +-
 rust/kernel/dma.rs                | 632 +++++++++++++++++++++++-------
 samples/rust/rust_dma.rs          |   8 +-
 9 files changed, 626 insertions(+), 266 deletions(-)


base-commit: a19457958c3018783881c4416f272cd594f13049
-- 
2.53.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v2 1/8] rust: dma: use "kernel vertical" style for imports
  2026-03-20 19:45 [PATCH v2 0/8] dma::Coherent & dma::CoherentBox API Danilo Krummrich
@ 2026-03-20 19:45 ` Danilo Krummrich
  2026-03-24 14:02   ` Andreas Hindborg
  2026-03-20 19:45 ` [PATCH v2 2/8] rust: dma: add generalized container for types other than slices Danilo Krummrich
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 31+ messages in thread
From: Danilo Krummrich @ 2026-03-20 19:45 UTC (permalink / raw)
  To: aliceryhl, acourbot, ojeda, boqun, gary, bjorn3_gh, lossin,
	a.hindborg, tmgross, abdiel.janulgue, daniel.almeida,
	robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel,
	Danilo Krummrich

Convert all imports to use "kernel vertical" style.

With this, subsequent patches neither introduce unrelated changes nor
leave an inconsistent import pattern.

While at it, drop unnecessary imports covered by prelude::*.

Link: https://docs.kernel.org/rust/coding-guidelines.html#imports
Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
---
 rust/kernel/dma.rs | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
index a396f8435739..2eea7e2f8f04 100644
--- a/rust/kernel/dma.rs
+++ b/rust/kernel/dma.rs
@@ -5,12 +5,20 @@
 //! C header: [`include/linux/dma-mapping.h`](srctree/include/linux/dma-mapping.h)
 
 use crate::{
-    bindings, build_assert, device,
-    device::{Bound, Core},
-    error::{to_result, Result},
+    bindings,
+    build_assert,
+    device::{
+        self,
+        Bound,
+        Core, //
+    },
+    error::to_result,
     prelude::*,
     sync::aref::ARef,
-    transmute::{AsBytes, FromBytes},
+    transmute::{
+        AsBytes,
+        FromBytes, //
+    }, //
 };
 use core::ptr::NonNull;
 
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v2 2/8] rust: dma: add generalized container for types other than slices
  2026-03-20 19:45 [PATCH v2 0/8] dma::Coherent & dma::CoherentBox API Danilo Krummrich
  2026-03-20 19:45 ` [PATCH v2 1/8] rust: dma: use "kernel vertical" style for imports Danilo Krummrich
@ 2026-03-20 19:45 ` Danilo Krummrich
  2026-03-21 23:22   ` Aditya Rajan
  2026-03-24 13:42   ` Andreas Hindborg
  2026-03-20 19:45 ` [PATCH v2 3/8] rust: dma: add zeroed constructor to `Coherent` Danilo Krummrich
                   ` (7 subsequent siblings)
  9 siblings, 2 replies; 31+ messages in thread
From: Danilo Krummrich @ 2026-03-20 19:45 UTC (permalink / raw)
  To: aliceryhl, acourbot, ojeda, boqun, gary, bjorn3_gh, lossin,
	a.hindborg, tmgross, abdiel.janulgue, daniel.almeida,
	robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel,
	Danilo Krummrich

From: Gary Guo <gary@garyguo.net>

Currently, `CoherentAllocation` is concecptually a DMA coherent container
of a slice of `[T]` of runtime-checked length. Generalize it by creating
`dma::Coherent<T>` which can hold any value of `T`.
`Coherent::alloc_with_attrs` is implemented but not yet exposed, as I
believe we should not expose the way to obtain an uninitialized coherent
region.

`Coherent<[T]>` provides a `len` method instead of the previous `count()`
method to be consistent with methods on slices.

The existing type is re-defined as a type alias of `Coherent<[T]>` to ease
transition. Methods in use are not yet removed.

Signed-off-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
---
 rust/kernel/device.rs |   4 +-
 rust/kernel/dma.rs    | 361 ++++++++++++++++++++++++++----------------
 2 files changed, 226 insertions(+), 139 deletions(-)

diff --git a/rust/kernel/device.rs b/rust/kernel/device.rs
index 94e0548e7687..379058eca2ed 100644
--- a/rust/kernel/device.rs
+++ b/rust/kernel/device.rs
@@ -575,7 +575,7 @@ pub trait DeviceContext: private::Sealed {}
 /// The bound context indicates that for the entire duration of the lifetime of a [`Device<Bound>`]
 /// reference, the [`Device`] is guaranteed to be bound to a driver.
 ///
-/// Some APIs, such as [`dma::CoherentAllocation`] or [`Devres`] rely on the [`Device`] to be bound,
+/// Some APIs, such as [`dma::Coherent`] or [`Devres`] rely on the [`Device`] to be bound,
 /// which can be proven with the [`Bound`] device context.
 ///
 /// Any abstraction that can guarantee a scope where the corresponding bus device is bound, should
@@ -584,7 +584,7 @@ pub trait DeviceContext: private::Sealed {}
 ///
 /// [`Devres`]: kernel::devres::Devres
 /// [`Devres::access`]: kernel::devres::Devres::access
-/// [`dma::CoherentAllocation`]: kernel::dma::CoherentAllocation
+/// [`dma::Coherent`]: kernel::dma::Coherent
 pub struct Bound;
 
 mod private {
diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
index 2eea7e2f8f04..ff3e147f1a23 100644
--- a/rust/kernel/dma.rs
+++ b/rust/kernel/dma.rs
@@ -6,7 +6,6 @@
 
 use crate::{
     bindings,
-    build_assert,
     device::{
         self,
         Bound,
@@ -14,6 +13,7 @@
     },
     error::to_result,
     prelude::*,
+    ptr::KnownSize,
     sync::aref::ARef,
     transmute::{
         AsBytes,
@@ -357,18 +357,17 @@ fn from(direction: DataDirection) -> Self {
 /// This is an abstraction around the `dma_alloc_coherent` API which is used to allocate and map
 /// large coherent DMA regions.
 ///
-/// A [`CoherentAllocation`] instance contains a pointer to the allocated region (in the
+/// A [`Coherent`] instance contains a pointer to the allocated region (in the
 /// processor's virtual address space) and the device address which can be given to the device
-/// as the DMA address base of the region. The region is released once [`CoherentAllocation`]
+/// as the DMA address base of the region. The region is released once [`Coherent`]
 /// is dropped.
 ///
 /// # Invariants
 ///
-/// - For the lifetime of an instance of [`CoherentAllocation`], the `cpu_addr` is a valid pointer
+/// - For the lifetime of an instance of [`Coherent`], the `cpu_addr` is a valid pointer
 ///   to an allocated region of coherent memory and `dma_handle` is the DMA address base of the
 ///   region.
-/// - The size in bytes of the allocation is equal to `size_of::<T> * count`.
-/// - `size_of::<T> * count` fits into a `usize`.
+/// - The size in bytes of the allocation is equal to size information via pointer.
 // TODO
 //
 // DMA allocations potentially carry device resources (e.g.IOMMU mappings), hence for soundness
@@ -379,124 +378,260 @@ fn from(direction: DataDirection) -> Self {
 // allocation from surviving device unbind; it would require RCU read side critical sections to
 // access the memory, which may require subsequent unnecessary copies.
 //
-// Hence, find a way to revoke the device resources of a `CoherentAllocation`, but not the
-// entire `CoherentAllocation` including the allocated memory itself.
-pub struct CoherentAllocation<T: AsBytes + FromBytes> {
+// Hence, find a way to revoke the device resources of a `Coherent`, but not the
+// entire `Coherent` including the allocated memory itself.
+pub struct Coherent<T: KnownSize + ?Sized> {
     dev: ARef<device::Device>,
     dma_handle: DmaAddress,
-    count: usize,
     cpu_addr: NonNull<T>,
     dma_attrs: Attrs,
 }
 
-impl<T: AsBytes + FromBytes> CoherentAllocation<T> {
-    /// Allocates a region of `size_of::<T> * count` of coherent memory.
+impl<T: KnownSize + ?Sized> Coherent<T> {
+    /// Returns the size in bytes of this allocation.
+    #[inline]
+    pub fn size(&self) -> usize {
+        T::size(self.cpu_addr.as_ptr())
+    }
+
+    /// Returns the raw pointer to the allocated region in the CPU's virtual address space.
+    #[inline]
+    pub fn as_ptr(&self) -> *const T {
+        self.cpu_addr.as_ptr()
+    }
+
+    /// Returns the raw pointer to the allocated region in the CPU's virtual address space as
+    /// a mutable pointer.
+    #[inline]
+    pub fn as_mut_ptr(&self) -> *mut T {
+        self.cpu_addr.as_ptr()
+    }
+
+    /// Returns a DMA handle which may be given to the device as the DMA address base of
+    /// the region.
+    #[inline]
+    pub fn dma_handle(&self) -> DmaAddress {
+        self.dma_handle
+    }
+
+    /// Returns a reference to the data in the region.
     ///
-    /// # Examples
+    /// # Safety
     ///
-    /// ```
-    /// # use kernel::device::{Bound, Device};
-    /// use kernel::dma::{attrs::*, CoherentAllocation};
+    /// * Callers must ensure that the device does not read/write to/from memory while the returned
+    ///   slice is live.
+    /// * Callers must ensure that this call does not race with a write to the same region while
+    ///   the returned slice is live.
+    #[inline]
+    pub unsafe fn as_ref(&self) -> &T {
+        // SAFETY: per safety requirement.
+        unsafe { &*self.as_ptr() }
+    }
+
+    /// Returns a mutable reference to the data in the region.
     ///
-    /// # fn test(dev: &Device<Bound>) -> Result {
-    /// let c: CoherentAllocation<u64> =
-    ///     CoherentAllocation::alloc_attrs(dev, 4, GFP_KERNEL, DMA_ATTR_NO_WARN)?;
-    /// # Ok::<(), Error>(()) }
-    /// ```
-    pub fn alloc_attrs(
+    /// # Safety
+    ///
+    /// * Callers must ensure that the device does not read/write to/from memory while the returned
+    ///   slice is live.
+    /// * Callers must ensure that this call does not race with a read or write to the same region
+    ///   while the returned slice is live.
+    #[expect(clippy::mut_from_ref, reason = "unsafe to use API")]
+    #[inline]
+    pub unsafe fn as_mut(&self) -> &mut T {
+        // SAFETY: per safety requirement.
+        unsafe { &mut *self.as_mut_ptr() }
+    }
+
+    /// Reads the value of `field` and ensures that its type is [`FromBytes`].
+    ///
+    /// # Safety
+    ///
+    /// This must be called from the [`dma_read`] macro which ensures that the `field` pointer is
+    /// validated beforehand.
+    ///
+    /// Public but hidden since it should only be used from [`dma_read`] macro.
+    #[doc(hidden)]
+    pub unsafe fn field_read<F: FromBytes>(&self, field: *const F) -> F {
+        // SAFETY:
+        // - By the safety requirements field is valid.
+        // - Using read_volatile() here is not sound as per the usual rules, the usage here is
+        // a special exception with the following notes in place. When dealing with a potential
+        // race from a hardware or code outside kernel (e.g. user-space program), we need that
+        // read on a valid memory is not UB. Currently read_volatile() is used for this, and the
+        // rationale behind is that it should generate the same code as READ_ONCE() which the
+        // kernel already relies on to avoid UB on data races. Note that the usage of
+        // read_volatile() is limited to this particular case, it cannot be used to prevent
+        // the UB caused by racing between two kernel functions nor do they provide atomicity.
+        unsafe { field.read_volatile() }
+    }
+
+    /// Writes a value to `field` and ensures that its type is [`AsBytes`].
+    ///
+    /// # Safety
+    ///
+    /// This must be called from the [`dma_write`] macro which ensures that the `field` pointer is
+    /// validated beforehand.
+    ///
+    /// Public but hidden since it should only be used from [`dma_write`] macro.
+    #[doc(hidden)]
+    pub unsafe fn field_write<F: AsBytes>(&self, field: *mut F, val: F) {
+        // SAFETY:
+        // - By the safety requirements field is valid.
+        // - Using write_volatile() here is not sound as per the usual rules, the usage here is
+        // a special exception with the following notes in place. When dealing with a potential
+        // race from a hardware or code outside kernel (e.g. user-space program), we need that
+        // write on a valid memory is not UB. Currently write_volatile() is used for this, and the
+        // rationale behind is that it should generate the same code as WRITE_ONCE() which the
+        // kernel already relies on to avoid UB on data races. Note that the usage of
+        // write_volatile() is limited to this particular case, it cannot be used to prevent
+        // the UB caused by racing between two kernel functions nor do they provide atomicity.
+        unsafe { field.write_volatile(val) }
+    }
+}
+
+impl<T: AsBytes + FromBytes> Coherent<T> {
+    /// Allocates a region of `T` of coherent memory.
+    #[expect(unused)]
+    fn alloc_with_attrs(
         dev: &device::Device<Bound>,
-        count: usize,
         gfp_flags: kernel::alloc::Flags,
         dma_attrs: Attrs,
-    ) -> Result<CoherentAllocation<T>> {
-        build_assert!(
-            core::mem::size_of::<T>() > 0,
-            "It doesn't make sense for the allocated type to be a ZST"
-        );
+    ) -> Result<Self> {
+        const {
+            assert!(
+                core::mem::size_of::<T>() > 0,
+                "It doesn't make sense for the allocated type to be a ZST"
+            );
+        }
 
-        let size = count
-            .checked_mul(core::mem::size_of::<T>())
-            .ok_or(EOVERFLOW)?;
         let mut dma_handle = 0;
         // SAFETY: Device pointer is guaranteed as valid by the type invariant on `Device`.
         let addr = unsafe {
             bindings::dma_alloc_attrs(
                 dev.as_raw(),
-                size,
+                core::mem::size_of::<T>(),
                 &mut dma_handle,
                 gfp_flags.as_raw(),
                 dma_attrs.as_raw(),
             )
         };
-        let addr = NonNull::new(addr).ok_or(ENOMEM)?;
+        let cpu_addr = NonNull::new(addr.cast()).ok_or(ENOMEM)?;
         // INVARIANT:
-        // - We just successfully allocated a coherent region which is accessible for
-        //   `count` elements, hence the cpu address is valid. We also hold a refcounted reference
-        //   to the device.
-        // - The allocated `size` is equal to `size_of::<T> * count`.
-        // - The allocated `size` fits into a `usize`.
+        // - We just successfully allocated a coherent region which is adequately sized for `T`,
+        //   hence the cpu address is valid.
+        // - We also hold a refcounted reference to the device.
         Ok(Self {
             dev: dev.into(),
             dma_handle,
-            count,
-            cpu_addr: addr.cast(),
+            cpu_addr,
             dma_attrs,
         })
     }
 
-    /// Performs the same functionality as [`CoherentAllocation::alloc_attrs`], except the
-    /// `dma_attrs` is 0 by default.
-    pub fn alloc_coherent(
+    /// Allocates a region of `[T; len]` of coherent memory.
+    fn alloc_slice_with_attrs(
         dev: &device::Device<Bound>,
-        count: usize,
+        len: usize,
         gfp_flags: kernel::alloc::Flags,
-    ) -> Result<CoherentAllocation<T>> {
-        CoherentAllocation::alloc_attrs(dev, count, gfp_flags, Attrs(0))
+        dma_attrs: Attrs,
+    ) -> Result<Coherent<[T]>> {
+        const {
+            assert!(
+                core::mem::size_of::<T>() > 0,
+                "It doesn't make sense for the allocated type to be a ZST"
+            );
+        }
+
+        // `dma_alloc_attrs` cannot handle zero-length allocation, bail early.
+        if len == 0 {
+            Err(EINVAL)?;
+        }
+
+        let size = core::mem::size_of::<T>().checked_mul(len).ok_or(ENOMEM)?;
+        let mut dma_handle = 0;
+        // SAFETY: Device pointer is guaranteed as valid by the type invariant on `Device`.
+        let addr = unsafe {
+            bindings::dma_alloc_attrs(
+                dev.as_raw(),
+                size,
+                &mut dma_handle,
+                gfp_flags.as_raw(),
+                dma_attrs.as_raw(),
+            )
+        };
+        let cpu_addr = NonNull::slice_from_raw_parts(NonNull::new(addr.cast()).ok_or(ENOMEM)?, len);
+        // INVARIANT:
+        // - We just successfully allocated a coherent region which is adequately sized for
+        //   `[T; len]`, hence the cpu address is valid.
+        // - We also hold a refcounted reference to the device.
+        Ok(Coherent {
+            dev: dev.into(),
+            dma_handle,
+            cpu_addr,
+            dma_attrs,
+        })
     }
+}
 
+impl<T> Coherent<[T]> {
     /// Returns the number of elements `T` in this allocation.
     ///
     /// Note that this is not the size of the allocation in bytes, which is provided by
     /// [`Self::size`].
-    pub fn count(&self) -> usize {
-        self.count
+    #[inline]
+    #[expect(clippy::len_without_is_empty, reason = "Coherent slice is never empty")]
+    pub fn len(&self) -> usize {
+        self.cpu_addr.len()
     }
+}
 
-    /// Returns the size in bytes of this allocation.
-    pub fn size(&self) -> usize {
-        // INVARIANT: The type invariant of `Self` guarantees that `size_of::<T> * count` fits into
-        // a `usize`.
-        self.count * core::mem::size_of::<T>()
-    }
+// Type alias for compatibility.
+#[doc(hidden)]
+pub type CoherentAllocation<T> = Coherent<[T]>;
 
-    /// Returns the raw pointer to the allocated region in the CPU's virtual address space.
-    #[inline]
-    pub fn as_ptr(&self) -> *const [T] {
-        core::ptr::slice_from_raw_parts(self.cpu_addr.as_ptr(), self.count)
+impl<T: AsBytes + FromBytes> CoherentAllocation<T> {
+    /// Allocates a region of `size_of::<T> * count` of coherent memory.
+    ///
+    /// # Examples
+    ///
+    /// ```
+    /// # use kernel::device::{Bound, Device};
+    /// use kernel::dma::{attrs::*, CoherentAllocation};
+    ///
+    /// # fn test(dev: &Device<Bound>) -> Result {
+    /// let c: CoherentAllocation<u64> =
+    ///     CoherentAllocation::alloc_attrs(dev, 4, GFP_KERNEL, DMA_ATTR_NO_WARN)?;
+    /// # Ok::<(), Error>(()) }
+    /// ```
+    pub fn alloc_attrs(
+        dev: &device::Device<Bound>,
+        count: usize,
+        gfp_flags: kernel::alloc::Flags,
+        dma_attrs: Attrs,
+    ) -> Result<CoherentAllocation<T>> {
+        Coherent::alloc_slice_with_attrs(dev, count, gfp_flags, dma_attrs)
     }
 
-    /// Returns the raw pointer to the allocated region in the CPU's virtual address space as
-    /// a mutable pointer.
-    #[inline]
-    pub fn as_mut_ptr(&self) -> *mut [T] {
-        core::ptr::slice_from_raw_parts_mut(self.cpu_addr.as_ptr(), self.count)
+    /// Performs the same functionality as [`CoherentAllocation::alloc_attrs`], except the
+    /// `dma_attrs` is 0 by default.
+    pub fn alloc_coherent(
+        dev: &device::Device<Bound>,
+        count: usize,
+        gfp_flags: kernel::alloc::Flags,
+    ) -> Result<CoherentAllocation<T>> {
+        CoherentAllocation::alloc_attrs(dev, count, gfp_flags, Attrs(0))
     }
 
     /// Returns the base address to the allocated region in the CPU's virtual address space.
     pub fn start_ptr(&self) -> *const T {
-        self.cpu_addr.as_ptr()
+        self.as_ptr().cast()
     }
 
     /// Returns the base address to the allocated region in the CPU's virtual address space as
     /// a mutable pointer.
     pub fn start_ptr_mut(&mut self) -> *mut T {
-        self.cpu_addr.as_ptr()
-    }
-
-    /// Returns a DMA handle which may be given to the device as the DMA address base of
-    /// the region.
-    pub fn dma_handle(&self) -> DmaAddress {
-        self.dma_handle
+        self.as_mut_ptr().cast()
     }
 
     /// Returns a DMA handle starting at `offset` (in units of `T`) which may be given to the
@@ -504,11 +639,9 @@ pub fn dma_handle(&self) -> DmaAddress {
     ///
     /// Returns `EINVAL` if `offset` is not within the bounds of the allocation.
     pub fn dma_handle_with_offset(&self, offset: usize) -> Result<DmaAddress> {
-        if offset >= self.count {
+        if offset >= self.len() {
             Err(EINVAL)
         } else {
-            // INVARIANT: The type invariant of `Self` guarantees that `size_of::<T> * count` fits
-            // into a `usize`, and `offset` is inferior to `count`.
             Ok(self.dma_handle + (offset * core::mem::size_of::<T>()) as DmaAddress)
         }
     }
@@ -516,7 +649,7 @@ pub fn dma_handle_with_offset(&self, offset: usize) -> Result<DmaAddress> {
     /// Common helper to validate a range applied from the allocated region in the CPU's virtual
     /// address space.
     fn validate_range(&self, offset: usize, count: usize) -> Result {
-        if offset.checked_add(count).ok_or(EOVERFLOW)? > self.count {
+        if offset.checked_add(count).ok_or(EOVERFLOW)? > self.len() {
             return Err(EINVAL);
         }
         Ok(())
@@ -601,66 +734,20 @@ pub unsafe fn write(&mut self, src: &[T], offset: usize) -> Result {
         };
         Ok(())
     }
-
-    /// Reads the value of `field` and ensures that its type is [`FromBytes`].
-    ///
-    /// # Safety
-    ///
-    /// This must be called from the [`dma_read`] macro which ensures that the `field` pointer is
-    /// validated beforehand.
-    ///
-    /// Public but hidden since it should only be used from [`dma_read`] macro.
-    #[doc(hidden)]
-    pub unsafe fn field_read<F: FromBytes>(&self, field: *const F) -> F {
-        // SAFETY:
-        // - By the safety requirements field is valid.
-        // - Using read_volatile() here is not sound as per the usual rules, the usage here is
-        // a special exception with the following notes in place. When dealing with a potential
-        // race from a hardware or code outside kernel (e.g. user-space program), we need that
-        // read on a valid memory is not UB. Currently read_volatile() is used for this, and the
-        // rationale behind is that it should generate the same code as READ_ONCE() which the
-        // kernel already relies on to avoid UB on data races. Note that the usage of
-        // read_volatile() is limited to this particular case, it cannot be used to prevent
-        // the UB caused by racing between two kernel functions nor do they provide atomicity.
-        unsafe { field.read_volatile() }
-    }
-
-    /// Writes a value to `field` and ensures that its type is [`AsBytes`].
-    ///
-    /// # Safety
-    ///
-    /// This must be called from the [`dma_write`] macro which ensures that the `field` pointer is
-    /// validated beforehand.
-    ///
-    /// Public but hidden since it should only be used from [`dma_write`] macro.
-    #[doc(hidden)]
-    pub unsafe fn field_write<F: AsBytes>(&self, field: *mut F, val: F) {
-        // SAFETY:
-        // - By the safety requirements field is valid.
-        // - Using write_volatile() here is not sound as per the usual rules, the usage here is
-        // a special exception with the following notes in place. When dealing with a potential
-        // race from a hardware or code outside kernel (e.g. user-space program), we need that
-        // write on a valid memory is not UB. Currently write_volatile() is used for this, and the
-        // rationale behind is that it should generate the same code as WRITE_ONCE() which the
-        // kernel already relies on to avoid UB on data races. Note that the usage of
-        // write_volatile() is limited to this particular case, it cannot be used to prevent
-        // the UB caused by racing between two kernel functions nor do they provide atomicity.
-        unsafe { field.write_volatile(val) }
-    }
 }
 
 /// Note that the device configured to do DMA must be halted before this object is dropped.
-impl<T: AsBytes + FromBytes> Drop for CoherentAllocation<T> {
+impl<T: KnownSize + ?Sized> Drop for Coherent<T> {
     fn drop(&mut self) {
-        let size = self.count * core::mem::size_of::<T>();
+        let size = T::size(self.cpu_addr.as_ptr());
         // SAFETY: Device pointer is guaranteed as valid by the type invariant on `Device`.
         // The cpu address, and the dma handle are valid due to the type invariants on
-        // `CoherentAllocation`.
+        // `Coherent`.
         unsafe {
             bindings::dma_free_attrs(
                 self.dev.as_raw(),
                 size,
-                self.start_ptr_mut().cast(),
+                self.cpu_addr.as_ptr().cast(),
                 self.dma_handle,
                 self.dma_attrs.as_raw(),
             )
@@ -668,20 +755,20 @@ fn drop(&mut self) {
     }
 }
 
-// SAFETY: It is safe to send a `CoherentAllocation` to another thread if `T`
+// SAFETY: It is safe to send a `Coherent` to another thread if `T`
 // can be sent to another thread.
-unsafe impl<T: AsBytes + FromBytes + Send> Send for CoherentAllocation<T> {}
+unsafe impl<T: KnownSize + Send + ?Sized> Send for Coherent<T> {}
 
 /// Reads a field of an item from an allocated region of structs.
 ///
 /// The syntax is of the form `kernel::dma_read!(dma, proj)` where `dma` is an expression evaluating
-/// to a [`CoherentAllocation`] and `proj` is a [projection specification](kernel::ptr::project!).
+/// to a [`Coherent`] and `proj` is a [projection specification](kernel::ptr::project!).
 ///
 /// # Examples
 ///
 /// ```
 /// use kernel::device::Device;
-/// use kernel::dma::{attrs::*, CoherentAllocation};
+/// use kernel::dma::{attrs::*, Coherent};
 ///
 /// struct MyStruct { field: u32, }
 ///
@@ -690,7 +777,7 @@ unsafe impl<T: AsBytes + FromBytes + Send> Send for CoherentAllocation<T> {}
 /// // SAFETY: Instances of `MyStruct` have no uninitialized portions.
 /// unsafe impl kernel::transmute::AsBytes for MyStruct{};
 ///
-/// # fn test(alloc: &kernel::dma::CoherentAllocation<MyStruct>) -> Result {
+/// # fn test(alloc: &kernel::dma::Coherent<[MyStruct]>) -> Result {
 /// let whole = kernel::dma_read!(alloc, [2]?);
 /// let field = kernel::dma_read!(alloc, [1]?.field);
 /// # Ok::<(), Error>(()) }
@@ -700,17 +787,17 @@ macro_rules! dma_read {
     ($dma:expr, $($proj:tt)*) => {{
         let dma = &$dma;
         let ptr = $crate::ptr::project!(
-            $crate::dma::CoherentAllocation::as_ptr(dma), $($proj)*
+            $crate::dma::Coherent::as_ptr(dma), $($proj)*
         );
         // SAFETY: The pointer created by the projection is within the DMA region.
-        unsafe { $crate::dma::CoherentAllocation::field_read(dma, ptr) }
+        unsafe { $crate::dma::Coherent::field_read(dma, ptr) }
     }};
 }
 
 /// Writes to a field of an item from an allocated region of structs.
 ///
 /// The syntax is of the form `kernel::dma_write!(dma, proj, val)` where `dma` is an expression
-/// evaluating to a [`CoherentAllocation`], `proj` is a
+/// evaluating to a [`Coherent`], `proj` is a
 /// [projection specification](kernel::ptr::project!), and `val` is the value to be written to the
 /// projected location.
 ///
@@ -718,7 +805,7 @@ macro_rules! dma_read {
 ///
 /// ```
 /// use kernel::device::Device;
-/// use kernel::dma::{attrs::*, CoherentAllocation};
+/// use kernel::dma::{attrs::*, Coherent};
 ///
 /// struct MyStruct { member: u32, }
 ///
@@ -727,7 +814,7 @@ macro_rules! dma_read {
 /// // SAFETY: Instances of `MyStruct` have no uninitialized portions.
 /// unsafe impl kernel::transmute::AsBytes for MyStruct{};
 ///
-/// # fn test(alloc: &kernel::dma::CoherentAllocation<MyStruct>) -> Result {
+/// # fn test(alloc: &kernel::dma::Coherent<[MyStruct]>) -> Result {
 /// kernel::dma_write!(alloc, [2]?.member, 0xf);
 /// kernel::dma_write!(alloc, [1]?, MyStruct { member: 0xf });
 /// # Ok::<(), Error>(()) }
@@ -737,11 +824,11 @@ macro_rules! dma_write {
     (@parse [$dma:expr] [$($proj:tt)*] [, $val:expr]) => {{
         let dma = &$dma;
         let ptr = $crate::ptr::project!(
-            mut $crate::dma::CoherentAllocation::as_mut_ptr(dma), $($proj)*
+            mut $crate::dma::Coherent::as_mut_ptr(dma), $($proj)*
         );
         let val = $val;
         // SAFETY: The pointer created by the projection is within the DMA region.
-        unsafe { $crate::dma::CoherentAllocation::field_write(dma, ptr, val) }
+        unsafe { $crate::dma::Coherent::field_write(dma, ptr, val) }
     }};
     (@parse [$dma:expr] [$($proj:tt)*] [.$field:tt $($rest:tt)*]) => {
         $crate::dma_write!(@parse [$dma] [$($proj)* .$field] [$($rest)*])
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v2 3/8] rust: dma: add zeroed constructor to `Coherent`
  2026-03-20 19:45 [PATCH v2 0/8] dma::Coherent & dma::CoherentBox API Danilo Krummrich
  2026-03-20 19:45 ` [PATCH v2 1/8] rust: dma: use "kernel vertical" style for imports Danilo Krummrich
  2026-03-20 19:45 ` [PATCH v2 2/8] rust: dma: add generalized container for types other than slices Danilo Krummrich
@ 2026-03-20 19:45 ` Danilo Krummrich
  2026-03-21  6:37   ` Alexandre Courbot
  2026-03-24 13:46   ` Andreas Hindborg
  2026-03-20 19:45 ` [PATCH v2 4/8] rust: dma: introduce dma::CoherentBox for memory initialization Danilo Krummrich
                   ` (6 subsequent siblings)
  9 siblings, 2 replies; 31+ messages in thread
From: Danilo Krummrich @ 2026-03-20 19:45 UTC (permalink / raw)
  To: aliceryhl, acourbot, ojeda, boqun, gary, bjorn3_gh, lossin,
	a.hindborg, tmgross, abdiel.janulgue, daniel.almeida,
	robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel,
	Danilo Krummrich

From: Gary Guo <gary@garyguo.net>

These constructors create a coherent container of a single object
instead of slice. They are named `zeroed` and `zeroed_with_attrs` to
emphasis that they are created initialized zeroed. It is intended that
there'll be new constructors that take `PinInit` instead of zeroing.

Signed-off-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
---
 rust/kernel/dma.rs       | 81 ++++++++++++++++++++++++++++++++++++----
 samples/rust/rust_dma.rs |  8 ++--
 2 files changed, 77 insertions(+), 12 deletions(-)

diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
index ff3e147f1a23..db645b01bdd0 100644
--- a/rust/kernel/dma.rs
+++ b/rust/kernel/dma.rs
@@ -47,7 +47,7 @@ pub trait Device: AsRef<device::Device<Core>> {
     /// # Safety
     ///
     /// This method must not be called concurrently with any DMA allocation or mapping primitives,
-    /// such as [`CoherentAllocation::alloc_attrs`].
+    /// such as [`Coherent::zeroed`].
     unsafe fn dma_set_mask(&self, mask: DmaMask) -> Result {
         // SAFETY:
         // - By the type invariant of `device::Device`, `self.as_ref().as_raw()` is valid.
@@ -64,7 +64,7 @@ unsafe fn dma_set_mask(&self, mask: DmaMask) -> Result {
     /// # Safety
     ///
     /// This method must not be called concurrently with any DMA allocation or mapping primitives,
-    /// such as [`CoherentAllocation::alloc_attrs`].
+    /// such as [`Coherent::zeroed`].
     unsafe fn dma_set_coherent_mask(&self, mask: DmaMask) -> Result {
         // SAFETY:
         // - By the type invariant of `device::Device`, `self.as_ref().as_raw()` is valid.
@@ -83,7 +83,7 @@ unsafe fn dma_set_coherent_mask(&self, mask: DmaMask) -> Result {
     /// # Safety
     ///
     /// This method must not be called concurrently with any DMA allocation or mapping primitives,
-    /// such as [`CoherentAllocation::alloc_attrs`].
+    /// such as [`Coherent::zeroed`].
     unsafe fn dma_set_mask_and_coherent(&self, mask: DmaMask) -> Result {
         // SAFETY:
         // - By the type invariant of `device::Device`, `self.as_ref().as_raw()` is valid.
@@ -102,7 +102,7 @@ unsafe fn dma_set_mask_and_coherent(&self, mask: DmaMask) -> Result {
     /// # Safety
     ///
     /// This method must not be called concurrently with any DMA allocation or mapping primitives,
-    /// such as [`CoherentAllocation::alloc_attrs`].
+    /// such as [`Coherent::zeroed`].
     unsafe fn dma_set_max_seg_size(&self, size: u32) {
         // SAFETY:
         // - By the type invariant of `device::Device`, `self.as_ref().as_raw()` is valid.
@@ -202,12 +202,12 @@ pub const fn value(&self) -> u64 {
 ///
 /// ```
 /// # use kernel::device::{Bound, Device};
-/// use kernel::dma::{attrs::*, CoherentAllocation};
+/// use kernel::dma::{attrs::*, Coherent};
 ///
 /// # fn test(dev: &Device<Bound>) -> Result {
 /// let attribs = DMA_ATTR_FORCE_CONTIGUOUS | DMA_ATTR_NO_WARN;
-/// let c: CoherentAllocation<u64> =
-///     CoherentAllocation::alloc_attrs(dev, 4, GFP_KERNEL, attribs)?;
+/// let c: Coherent<[u64]> =
+///     Coherent::zeroed_slice_with_attrs(dev, 4, GFP_KERNEL, attribs)?;
 /// # Ok::<(), Error>(()) }
 /// ```
 #[derive(Clone, Copy, PartialEq)]
@@ -492,7 +492,6 @@ pub unsafe fn field_write<F: AsBytes>(&self, field: *mut F, val: F) {
 
 impl<T: AsBytes + FromBytes> Coherent<T> {
     /// Allocates a region of `T` of coherent memory.
-    #[expect(unused)]
     fn alloc_with_attrs(
         dev: &device::Device<Bound>,
         gfp_flags: kernel::alloc::Flags,
@@ -529,6 +528,35 @@ fn alloc_with_attrs(
         })
     }
 
+    /// Allocates a region of type `T` of coherent memory.
+    ///
+    /// # Examples
+    ///
+    /// ```
+    /// # use kernel::device::{Bound, Device};
+    /// use kernel::dma::{attrs::*, Coherent};
+    ///
+    /// # fn test(dev: &Device<Bound>) -> Result {
+    /// let c: Coherent<[u64; 4]> =
+    ///     Coherent::zeroed_with_attrs(dev, GFP_KERNEL, DMA_ATTR_NO_WARN)?;
+    /// # Ok::<(), Error>(()) }
+    /// ```
+    #[inline]
+    pub fn zeroed_with_attrs(
+        dev: &device::Device<Bound>,
+        gfp_flags: kernel::alloc::Flags,
+        dma_attrs: Attrs,
+    ) -> Result<Self> {
+        Self::alloc_with_attrs(dev, gfp_flags | __GFP_ZERO, dma_attrs)
+    }
+
+    /// Performs the same functionality as [`Coherent::zeroed_with_attrs`], except the
+    /// `dma_attrs` is 0 by default.
+    #[inline]
+    pub fn zeroed(dev: &device::Device<Bound>, gfp_flags: kernel::alloc::Flags) -> Result<Self> {
+        Self::zeroed_with_attrs(dev, gfp_flags, Attrs(0))
+    }
+
     /// Allocates a region of `[T; len]` of coherent memory.
     fn alloc_slice_with_attrs(
         dev: &device::Device<Bound>,
@@ -572,6 +600,43 @@ fn alloc_slice_with_attrs(
             dma_attrs,
         })
     }
+
+    /// Allocates a zeroed region of type `T` of coherent memory.
+    ///
+    /// Unlike `Coherent::<[T; N]>::zeroed_with_attrs`, `Coherent::<T>::zeroed_slices` support
+    /// a runtime length.
+    ///
+    /// # Examples
+    ///
+    /// ```
+    /// # use kernel::device::{Bound, Device};
+    /// use kernel::dma::{attrs::*, Coherent};
+    ///
+    /// # fn test(dev: &Device<Bound>) -> Result {
+    /// let c: Coherent<[u64]> =
+    ///     Coherent::zeroed_slice_with_attrs(dev, 4, GFP_KERNEL, DMA_ATTR_NO_WARN)?;
+    /// # Ok::<(), Error>(()) }
+    /// ```
+    #[inline]
+    pub fn zeroed_slice_with_attrs(
+        dev: &device::Device<Bound>,
+        len: usize,
+        gfp_flags: kernel::alloc::Flags,
+        dma_attrs: Attrs,
+    ) -> Result<Coherent<[T]>> {
+        Coherent::alloc_slice_with_attrs(dev, len, gfp_flags | __GFP_ZERO, dma_attrs)
+    }
+
+    /// Performs the same functionality as [`Coherent::zeroed_slice_with_attrs`], except the
+    /// `dma_attrs` is 0 by default.
+    #[inline]
+    pub fn zeroed_slice(
+        dev: &device::Device<Bound>,
+        len: usize,
+        gfp_flags: kernel::alloc::Flags,
+    ) -> Result<Coherent<[T]>> {
+        Self::zeroed_slice_with_attrs(dev, len, gfp_flags, Attrs(0))
+    }
 }
 
 impl<T> Coherent<[T]> {
diff --git a/samples/rust/rust_dma.rs b/samples/rust/rust_dma.rs
index ce39b5545097..314ef51cd86c 100644
--- a/samples/rust/rust_dma.rs
+++ b/samples/rust/rust_dma.rs
@@ -6,7 +6,7 @@
 
 use kernel::{
     device::Core,
-    dma::{CoherentAllocation, DataDirection, Device, DmaMask},
+    dma::{Coherent, DataDirection, Device, DmaMask},
     page, pci,
     prelude::*,
     scatterlist::{Owned, SGTable},
@@ -16,7 +16,7 @@
 #[pin_data(PinnedDrop)]
 struct DmaSampleDriver {
     pdev: ARef<pci::Device>,
-    ca: CoherentAllocation<MyStruct>,
+    ca: Coherent<[MyStruct]>,
     #[pin]
     sgt: SGTable<Owned<VVec<u8>>>,
 }
@@ -64,8 +64,8 @@ fn probe(pdev: &pci::Device<Core>, _info: &Self::IdInfo) -> impl PinInit<Self, E
             // SAFETY: There are no concurrent calls to DMA allocation and mapping primitives.
             unsafe { pdev.dma_set_mask_and_coherent(mask)? };
 
-            let ca: CoherentAllocation<MyStruct> =
-                CoherentAllocation::alloc_coherent(pdev.as_ref(), TEST_VALUES.len(), GFP_KERNEL)?;
+            let ca: Coherent<[MyStruct]> =
+                Coherent::zeroed_slice(pdev.as_ref(), TEST_VALUES.len(), GFP_KERNEL)?;
 
             for (i, value) in TEST_VALUES.into_iter().enumerate() {
                 kernel::dma_write!(ca, [i]?, MyStruct::new(value.0, value.1));
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v2 4/8] rust: dma: introduce dma::CoherentBox for memory initialization
  2026-03-20 19:45 [PATCH v2 0/8] dma::Coherent & dma::CoherentBox API Danilo Krummrich
                   ` (2 preceding siblings ...)
  2026-03-20 19:45 ` [PATCH v2 3/8] rust: dma: add zeroed constructor to `Coherent` Danilo Krummrich
@ 2026-03-20 19:45 ` Danilo Krummrich
  2026-03-20 20:55   ` Gary Guo
  2026-03-24 13:57   ` Andreas Hindborg
  2026-03-20 19:45 ` [PATCH v2 5/8] rust: dma: add Coherent:init() and Coherent::init_with_attrs() Danilo Krummrich
                   ` (5 subsequent siblings)
  9 siblings, 2 replies; 31+ messages in thread
From: Danilo Krummrich @ 2026-03-20 19:45 UTC (permalink / raw)
  To: aliceryhl, acourbot, ojeda, boqun, gary, bjorn3_gh, lossin,
	a.hindborg, tmgross, abdiel.janulgue, daniel.almeida,
	robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel,
	Danilo Krummrich

Currently, dma::Coherent cannot safely provide (mutable) access to its
underlying memory because the memory might be concurrently accessed by a
DMA device. This makes it difficult to safely initialize the memory
before handing it over to the hardware.

Introduce dma::CoherentBox, a type that encapsulates a dma::Coherent
before its DMA address is exposed to the device. dma::CoherentBox can
guarantee exclusive access to the inner dma::Coherent and implement
Deref and DerefMut.

Once the memory is properly initialized, dma::CoherentBox can be
converted into a regular dma::Coherent.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
---
 rust/kernel/dma.rs | 154 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 153 insertions(+), 1 deletion(-)

diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
index db645b01bdd0..cefb54f0424a 100644
--- a/rust/kernel/dma.rs
+++ b/rust/kernel/dma.rs
@@ -20,7 +20,13 @@
         FromBytes, //
     }, //
 };
-use core::ptr::NonNull;
+use core::{
+    ops::{
+        Deref,
+        DerefMut, //
+    },
+    ptr::NonNull, //
+};
 
 /// DMA address type.
 ///
@@ -352,6 +358,152 @@ fn from(direction: DataDirection) -> Self {
     }
 }
 
+/// CPU-owned DMA allocation that can be converted into a device-shared [`Coherent`] object.
+///
+/// Unlike [`Coherent`], a [`CoherentBox`] is guaranteed to be fully owned by the CPU -- its DMA
+/// address is not exposed and it cannot be accessed by a device. This means it can safely be used
+/// like a normal boxed allocation (e.g. direct reads, writes, and mutable slices are all safe).
+///
+/// A typical use is to allocate a [`CoherentBox`], populate it with normal CPU access, and then
+/// convert it into a [`Coherent`] object to share it with the device.
+///
+/// # Examples
+///
+/// `CoherentBox<T>`:
+///
+/// ```
+/// # use kernel::device::{
+/// #     Bound,
+/// #     Device,
+/// # };
+/// use kernel::dma::{attrs::*,
+///     Coherent,
+///     CoherentBox,
+/// };
+///
+/// # fn test(dev: &Device<Bound>) -> Result {
+/// let mut dmem: CoherentBox<u64> = CoherentBox::zeroed(dev, GFP_KERNEL)?;
+/// *dmem = 42;
+/// let dmem: Coherent<u64> = dmem.into();
+/// # Ok::<(), Error>(()) }
+/// ```
+///
+/// `CoherentBox<[T]>`:
+///
+///
+/// ```
+/// # use kernel::device::{
+/// #     Bound,
+/// #     Device,
+/// # };
+/// use kernel::dma::{attrs::*,
+///     Coherent,
+///     CoherentBox,
+/// };
+///
+/// # fn test(dev: &Device<Bound>) -> Result {
+/// let mut dmem: CoherentBox<[u64]> = CoherentBox::zeroed_slice(dev, 4, GFP_KERNEL)?;
+/// dmem.fill(42);
+/// let dmem: Coherent<[u64]> = dmem.into();
+/// # Ok::<(), Error>(()) }
+/// ```
+pub struct CoherentBox<T: AsBytes + FromBytes + KnownSize + ?Sized>(Coherent<T>);
+
+impl<T: AsBytes + FromBytes> CoherentBox<[T]> {
+    /// [`CoherentBox`] variant of [`Coherent::zeroed_slice_with_attrs`].
+    #[inline]
+    pub fn zeroed_slice_with_attrs(
+        dev: &device::Device<Bound>,
+        count: usize,
+        gfp_flags: kernel::alloc::Flags,
+        dma_attrs: Attrs,
+    ) -> Result<Self> {
+        Coherent::zeroed_slice_with_attrs(dev, count, gfp_flags, dma_attrs).map(Self)
+    }
+
+    /// Same as [CoherentBox::zeroed_slice_with_attrs], but with `dma::Attrs(0)`.
+    #[inline]
+    pub fn zeroed_slice(
+        dev: &device::Device<Bound>,
+        count: usize,
+        gfp_flags: kernel::alloc::Flags,
+    ) -> Result<Self> {
+        Self::zeroed_slice_with_attrs(dev, count, gfp_flags, Attrs(0))
+    }
+
+    /// Initializes the element at `i` using the given initializer.
+    ///
+    /// Returns `EINVAL` if `i` is out of bounds.
+    pub fn init_at<E>(&mut self, i: usize, init: impl Init<T, E>) -> Result
+    where
+        Error: From<E>,
+    {
+        if i >= self.0.len() {
+            return Err(EINVAL);
+        }
+
+        let ptr = &raw mut self[i];
+
+        // SAFETY:
+        // - `ptr` is valid, properly aligned, and within this allocation.
+        // - `T: AsBytes + FromBytes` guarantees all bit patterns are valid, so partial writes on
+        //   error cannot leave the element in an invalid state.
+        // - The DMA address has not been exposed yet, so there is no concurrent device access.
+        unsafe { init.__init(ptr)? };
+
+        Ok(())
+    }
+}
+
+impl<T: AsBytes + FromBytes> CoherentBox<T> {
+    /// Same as [`CoherentBox::zeroed_slice_with_attrs`], but for a single element.
+    #[inline]
+    pub fn zeroed_with_attrs(
+        dev: &device::Device<Bound>,
+        gfp_flags: kernel::alloc::Flags,
+        dma_attrs: Attrs,
+    ) -> Result<Self> {
+        Coherent::zeroed_with_attrs(dev, gfp_flags, dma_attrs).map(Self)
+    }
+
+    /// Same as [`CoherentBox::zeroed_slice`], but for a single element.
+    #[inline]
+    pub fn zeroed(dev: &device::Device<Bound>, gfp_flags: kernel::alloc::Flags) -> Result<Self> {
+        Self::zeroed_with_attrs(dev, gfp_flags, Attrs(0))
+    }
+}
+
+impl<T: AsBytes + FromBytes + KnownSize + ?Sized> Deref for CoherentBox<T> {
+    type Target = T;
+
+    #[inline]
+    fn deref(&self) -> &Self::Target {
+        // SAFETY:
+        // - We have not exposed the DMA address yet, so there can't be any concurrent access by a
+        //   device.
+        // - We have exclusive access to `self.0`.
+        unsafe { self.0.as_ref() }
+    }
+}
+
+impl<T: AsBytes + FromBytes + KnownSize + ?Sized> DerefMut for CoherentBox<T> {
+    #[inline]
+    fn deref_mut(&mut self) -> &mut Self::Target {
+        // SAFETY:
+        // - We have not exposed the DMA address yet, so there can't be any concurrent access by a
+        //   device.
+        // - We have exclusive access to `self.0`.
+        unsafe { self.0.as_mut() }
+    }
+}
+
+impl<T: AsBytes + FromBytes + KnownSize + ?Sized> From<CoherentBox<T>> for Coherent<T> {
+    #[inline]
+    fn from(value: CoherentBox<T>) -> Self {
+        value.0
+    }
+}
+
 /// An abstraction of the `dma_alloc_coherent` API.
 ///
 /// This is an abstraction around the `dma_alloc_coherent` API which is used to allocate and map
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v2 5/8] rust: dma: add Coherent:init() and Coherent::init_with_attrs()
  2026-03-20 19:45 [PATCH v2 0/8] dma::Coherent & dma::CoherentBox API Danilo Krummrich
                   ` (3 preceding siblings ...)
  2026-03-20 19:45 ` [PATCH v2 4/8] rust: dma: introduce dma::CoherentBox for memory initialization Danilo Krummrich
@ 2026-03-20 19:45 ` Danilo Krummrich
  2026-03-20 20:56   ` Gary Guo
  2026-03-24 14:00   ` Andreas Hindborg
  2026-03-20 19:45 ` [PATCH v2 6/8] gpu: nova-core: use Coherent::init to initialize GspFwWprMeta Danilo Krummrich
                   ` (4 subsequent siblings)
  9 siblings, 2 replies; 31+ messages in thread
From: Danilo Krummrich @ 2026-03-20 19:45 UTC (permalink / raw)
  To: aliceryhl, acourbot, ojeda, boqun, gary, bjorn3_gh, lossin,
	a.hindborg, tmgross, abdiel.janulgue, daniel.almeida,
	robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel,
	Danilo Krummrich

Analogous to Coherent::zeroed() and Coherent::zeroed_with_attrs(), add
Coherent:init() and Coherent::init_with_attrs() which both take an impl
Init<T, E> argument initializing the DMA coherent memory.

Compared to CoherentInit, Coherent::init() is a one-shot constructor
that runs an Init closure and immediately exposes the DMA handle,
whereas CoherentInit is a multi-stage initializer that provides safe
&mut T access by withholding the DMA address until converted to
Coherent.

Signed-off-by: Danilo Krummrich <dakr@kernel.org>
---
 rust/kernel/dma.rs | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
index cefb54f0424a..6d2bec52806b 100644
--- a/rust/kernel/dma.rs
+++ b/rust/kernel/dma.rs
@@ -709,6 +709,44 @@ pub fn zeroed(dev: &device::Device<Bound>, gfp_flags: kernel::alloc::Flags) -> R
         Self::zeroed_with_attrs(dev, gfp_flags, Attrs(0))
     }
 
+    /// Same as [`Coherent::zeroed_with_attrs`], but instead of a zero-initialization the memory is
+    /// initialized with `init`.
+    pub fn init_with_attrs<E>(
+        dev: &device::Device<Bound>,
+        gfp_flags: kernel::alloc::Flags,
+        dma_attrs: Attrs,
+        init: impl Init<T, E>,
+    ) -> Result<Self>
+    where
+        Error: From<E>,
+    {
+        let dmem = Self::alloc_with_attrs(dev, gfp_flags, dma_attrs)?;
+        let ptr = dmem.as_mut_ptr();
+
+        // SAFETY:
+        // - `ptr` is valid, properly aligned, and points to exclusively owned memory.
+        // - If `__init` fails, `self` is dropped, which safely frees the underlying `Coherent`'s
+        //   DMA memory. `T: AsBytes + FromBytes` ensures there are no complex `Drop` requirements
+        //   we are bypassing.
+        unsafe { init.__init(ptr)? };
+
+        Ok(dmem)
+    }
+
+    /// Same as [`Coherent::zeroed`], but instead of a zero-initialization the memory is initialized
+    /// with `init`.
+    #[inline]
+    pub fn init<E>(
+        dev: &device::Device<Bound>,
+        gfp_flags: kernel::alloc::Flags,
+        init: impl Init<T, E>,
+    ) -> Result<Self>
+    where
+        Error: From<E>,
+    {
+        Self::init_with_attrs(dev, gfp_flags, Attrs(0), init)
+    }
+
     /// Allocates a region of `[T; len]` of coherent memory.
     fn alloc_slice_with_attrs(
         dev: &device::Device<Bound>,
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v2 6/8] gpu: nova-core: use Coherent::init to initialize GspFwWprMeta
  2026-03-20 19:45 [PATCH v2 0/8] dma::Coherent & dma::CoherentBox API Danilo Krummrich
                   ` (4 preceding siblings ...)
  2026-03-20 19:45 ` [PATCH v2 5/8] rust: dma: add Coherent:init() and Coherent::init_with_attrs() Danilo Krummrich
@ 2026-03-20 19:45 ` Danilo Krummrich
  2026-03-20 21:04   ` Gary Guo
  2026-03-20 19:45 ` [PATCH v2 7/8] gpu: nova-core: convert Gsp::new() to use CoherentBox Danilo Krummrich
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 31+ messages in thread
From: Danilo Krummrich @ 2026-03-20 19:45 UTC (permalink / raw)
  To: aliceryhl, acourbot, ojeda, boqun, gary, bjorn3_gh, lossin,
	a.hindborg, tmgross, abdiel.janulgue, daniel.almeida,
	robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel,
	Danilo Krummrich

Convert wpr_meta to use Coherent::init() and simplify the
initialization.  It also avoids a separate initialization of
GspFwWprMeta on the stack.

Signed-off-by: Danilo Krummrich <dakr@kernel.org>
---
 drivers/gpu/nova-core/gsp/boot.rs |  7 ++-----
 drivers/gpu/nova-core/gsp/fw.rs   | 20 +++++++++++++++-----
 2 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
index 5e73bd769dcc..e55210ebb6d1 100644
--- a/drivers/gpu/nova-core/gsp/boot.rs
+++ b/drivers/gpu/nova-core/gsp/boot.rs
@@ -2,8 +2,7 @@
 
 use kernel::{
     device,
-    dma::CoherentAllocation,
-    dma_write,
+    dma::Coherent,
     io::poll::read_poll_timeout,
     pci,
     prelude::*,
@@ -164,9 +163,7 @@ pub(crate) fn boot(
             bar,
         )?;
 
-        let wpr_meta =
-            CoherentAllocation::<GspFwWprMeta>::alloc_coherent(dev, 1, GFP_KERNEL | __GFP_ZERO)?;
-        dma_write!(wpr_meta, [0]?, GspFwWprMeta::new(&gsp_fw, &fb_layout));
+        let wpr_meta = Coherent::init(dev, GFP_KERNEL, GspFwWprMeta::new(&gsp_fw, &fb_layout))?;
 
         self.cmdq
             .send_command_no_wait(bar, commands::SetSystemInfo::new(pdev))?;
diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index a061131b5412..4e3bfc6c4c47 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -204,7 +204,9 @@ pub(crate) fn wpr_heap_size(&self, chipset: Chipset, fb_size: u64) -> u64 {
 /// Structure passed to the GSP bootloader, containing the framebuffer layout as well as the DMA
 /// addresses of the GSP bootloader and firmware.
 #[repr(transparent)]
-pub(crate) struct GspFwWprMeta(bindings::GspFwWprMeta);
+pub(crate) struct GspFwWprMeta {
+    inner: bindings::GspFwWprMeta,
+}
 
 // SAFETY: Padding is explicit and does not contain uninitialized data.
 unsafe impl AsBytes for GspFwWprMeta {}
@@ -217,10 +219,14 @@ unsafe impl FromBytes for GspFwWprMeta {}
 type GspFwWprMetaBootInfo = bindings::GspFwWprMeta__bindgen_ty_1__bindgen_ty_1;
 
 impl GspFwWprMeta {
-    /// Fill in and return a `GspFwWprMeta` suitable for booting `gsp_firmware` using the
+    /// Returns an initializer for a `GspFwWprMeta` suitable for booting `gsp_firmware` using the
     /// `fb_layout` layout.
-    pub(crate) fn new(gsp_firmware: &GspFirmware, fb_layout: &FbLayout) -> Self {
-        Self(bindings::GspFwWprMeta {
+    pub(crate) fn new<'a>(
+        gsp_firmware: &'a GspFirmware,
+        fb_layout: &'a FbLayout,
+    ) -> impl Init<Self> + 'a {
+        #[allow(non_snake_case)]
+        let init_inner = init!(bindings::GspFwWprMeta {
             // CAST: we want to store the bits of `GSP_FW_WPR_META_MAGIC` unmodified.
             magic: bindings::GSP_FW_WPR_META_MAGIC as u64,
             revision: u64::from(bindings::GSP_FW_WPR_META_REVISION),
@@ -255,7 +261,11 @@ pub(crate) fn new(gsp_firmware: &GspFirmware, fb_layout: &FbLayout) -> Self {
             fbSize: fb_layout.fb.end - fb_layout.fb.start,
             vgaWorkspaceOffset: fb_layout.vga_workspace.start,
             vgaWorkspaceSize: fb_layout.vga_workspace.end - fb_layout.vga_workspace.start,
-            ..Default::default()
+            ..Zeroable::init_zeroed()
+        });
+
+        init!(GspFwWprMeta {
+            inner <- init_inner,
         })
     }
 }
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v2 7/8] gpu: nova-core: convert Gsp::new() to use CoherentBox
  2026-03-20 19:45 [PATCH v2 0/8] dma::Coherent & dma::CoherentBox API Danilo Krummrich
                   ` (5 preceding siblings ...)
  2026-03-20 19:45 ` [PATCH v2 6/8] gpu: nova-core: use Coherent::init to initialize GspFwWprMeta Danilo Krummrich
@ 2026-03-20 19:45 ` Danilo Krummrich
  2026-03-20 21:06   ` Gary Guo
  2026-03-20 19:45 ` [PATCH v2 8/8] gpu: nova-core: convert to new dma::Coherent API Danilo Krummrich
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 31+ messages in thread
From: Danilo Krummrich @ 2026-03-20 19:45 UTC (permalink / raw)
  To: aliceryhl, acourbot, ojeda, boqun, gary, bjorn3_gh, lossin,
	a.hindborg, tmgross, abdiel.janulgue, daniel.almeida,
	robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel,
	Danilo Krummrich

Convert libos (LibosMemoryRegionInitArgument) and rmargs
(GspArgumentsPadded) to use CoherentBox / Coherent::init() and simplify
the initialization. This also avoids separate initialization on the
stack.

Signed-off-by: Danilo Krummrich <dakr@kernel.org>
---
 drivers/gpu/nova-core/gsp.rs    | 47 +++++++++++--------------
 drivers/gpu/nova-core/gsp/fw.rs | 62 +++++++++++++++++++++++----------
 2 files changed, 65 insertions(+), 44 deletions(-)

diff --git a/drivers/gpu/nova-core/gsp.rs b/drivers/gpu/nova-core/gsp.rs
index 72f173726f87..f0a50bdc4c00 100644
--- a/drivers/gpu/nova-core/gsp.rs
+++ b/drivers/gpu/nova-core/gsp.rs
@@ -5,10 +5,11 @@
 use kernel::{
     device,
     dma::{
+        Coherent,
         CoherentAllocation,
+        CoherentBox,
         DmaAddress, //
     },
-    dma_write,
     pci,
     prelude::*,
     transmute::AsBytes, //
@@ -106,7 +107,7 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
 #[pin_data]
 pub(crate) struct Gsp {
     /// Libos arguments.
-    pub(crate) libos: CoherentAllocation<LibosMemoryRegionInitArgument>,
+    pub(crate) libos: Coherent<[LibosMemoryRegionInitArgument]>,
     /// Init log buffer.
     loginit: LogBuffer,
     /// Interrupts log buffer.
@@ -117,7 +118,7 @@ pub(crate) struct Gsp {
     #[pin]
     pub(crate) cmdq: Cmdq,
     /// RM arguments.
-    rmargs: CoherentAllocation<GspArgumentsPadded>,
+    rmargs: Coherent<GspArgumentsPadded>,
 }
 
 impl Gsp {
@@ -126,34 +127,28 @@ pub(crate) fn new(pdev: &pci::Device<device::Bound>) -> impl PinInit<Self, Error
         pin_init::pin_init_scope(move || {
             let dev = pdev.as_ref();
 
+            // Initialise the logging structures. The OpenRM equivalents are in:
+            // _kgspInitLibosLoggingStructures (allocates memory for buffers)
+            // kgspSetupLibosInitArgs_IMPL (creates pLibosInitArgs[] array)
             Ok(try_pin_init!(Self {
-                libos: CoherentAllocation::<LibosMemoryRegionInitArgument>::alloc_coherent(
-                    dev,
-                    GSP_PAGE_SIZE / size_of::<LibosMemoryRegionInitArgument>(),
-                    GFP_KERNEL | __GFP_ZERO,
-                )?,
                 loginit: LogBuffer::new(dev)?,
                 logintr: LogBuffer::new(dev)?,
                 logrm: LogBuffer::new(dev)?,
                 cmdq <- Cmdq::new(dev),
-                rmargs: CoherentAllocation::<GspArgumentsPadded>::alloc_coherent(
-                    dev,
-                    1,
-                    GFP_KERNEL | __GFP_ZERO,
-                )?,
-                _: {
-                    // Initialise the logging structures. The OpenRM equivalents are in:
-                    // _kgspInitLibosLoggingStructures (allocates memory for buffers)
-                    // kgspSetupLibosInitArgs_IMPL (creates pLibosInitArgs[] array)
-                    dma_write!(
-                        libos, [0]?, LibosMemoryRegionInitArgument::new("LOGINIT", &loginit.0)
-                    );
-                    dma_write!(
-                        libos, [1]?, LibosMemoryRegionInitArgument::new("LOGINTR", &logintr.0)
-                    );
-                    dma_write!(libos, [2]?, LibosMemoryRegionInitArgument::new("LOGRM", &logrm.0));
-                    dma_write!(rmargs, [0]?.inner, fw::GspArgumentsCached::new(&cmdq));
-                    dma_write!(libos, [3]?, LibosMemoryRegionInitArgument::new("RMARGS", rmargs));
+                rmargs: Coherent::init(dev, GFP_KERNEL, GspArgumentsPadded::new(&cmdq))?,
+                libos: {
+                    let mut libos = CoherentBox::zeroed_slice(
+                        dev,
+                        GSP_PAGE_SIZE / size_of::<LibosMemoryRegionInitArgument>(),
+                        GFP_KERNEL,
+                    )?;
+
+                    libos.init_at(0, LibosMemoryRegionInitArgument::new("LOGINIT", &loginit.0))?;
+                    libos.init_at(1, LibosMemoryRegionInitArgument::new("LOGINTR", &logintr.0))?;
+                    libos.init_at(2, LibosMemoryRegionInitArgument::new("LOGRM", &logrm.0))?;
+                    libos.init_at(3, LibosMemoryRegionInitArgument::new("RMARGS", rmargs))?;
+
+                    libos.into()
                 },
             }))
         })
diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index 4e3bfc6c4c47..0d8daf6a80b7 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -9,11 +9,12 @@
 use core::ops::Range;
 
 use kernel::{
-    dma::CoherentAllocation,
+    dma::Coherent,
     prelude::*,
     ptr::{
         Alignable,
-        Alignment, //
+        Alignment,
+        KnownSize, //
     },
     sizes::{
         SZ_128K,
@@ -648,7 +649,9 @@ unsafe impl AsBytes for RunCpuSequencer {}
 /// The memory allocated for the arguments must remain until the GSP sends the
 /// init_done RPC.
 #[repr(transparent)]
-pub(crate) struct LibosMemoryRegionInitArgument(bindings::LibosMemoryRegionInitArgument);
+pub(crate) struct LibosMemoryRegionInitArgument {
+    inner: bindings::LibosMemoryRegionInitArgument,
+}
 
 // SAFETY: Padding is explicit and does not contain uninitialized data.
 unsafe impl AsBytes for LibosMemoryRegionInitArgument {}
@@ -658,10 +661,10 @@ unsafe impl AsBytes for LibosMemoryRegionInitArgument {}
 unsafe impl FromBytes for LibosMemoryRegionInitArgument {}
 
 impl LibosMemoryRegionInitArgument {
-    pub(crate) fn new<A: AsBytes + FromBytes>(
+    pub(crate) fn new<'a, A: AsBytes + FromBytes + KnownSize + ?Sized>(
         name: &'static str,
-        obj: &CoherentAllocation<A>,
-    ) -> Self {
+        obj: &'a Coherent<A>,
+    ) -> impl Init<Self> + 'a {
         /// Generates the `ID8` identifier required for some GSP objects.
         fn id8(name: &str) -> u64 {
             let mut bytes = [0u8; core::mem::size_of::<u64>()];
@@ -673,7 +676,8 @@ fn id8(name: &str) -> u64 {
             u64::from_ne_bytes(bytes)
         }
 
-        Self(bindings::LibosMemoryRegionInitArgument {
+        #[allow(non_snake_case)]
+        let init_inner = init!(bindings::LibosMemoryRegionInitArgument {
             id8: id8(name),
             pa: obj.dma_handle(),
             size: num::usize_as_u64(obj.size()),
@@ -683,7 +687,11 @@ fn id8(name: &str) -> u64 {
             loc: num::u32_into_u8::<
                 { bindings::LibosMemoryRegionLoc_LIBOS_MEMORY_REGION_LOC_SYSMEM },
             >(),
-            ..Default::default()
+            ..Zeroable::init_zeroed()
+        });
+
+        init!(LibosMemoryRegionInitArgument {
+            inner <- init_inner,
         })
     }
 }
@@ -862,15 +870,23 @@ unsafe impl FromBytes for GspMsgElement {}
 
 /// Arguments for GSP startup.
 #[repr(transparent)]
-pub(crate) struct GspArgumentsCached(bindings::GSP_ARGUMENTS_CACHED);
+#[derive(Zeroable)]
+pub(crate) struct GspArgumentsCached {
+    inner: bindings::GSP_ARGUMENTS_CACHED,
+}
 
 impl GspArgumentsCached {
     /// Creates the arguments for starting the GSP up using `cmdq` as its command queue.
-    pub(crate) fn new(cmdq: &Cmdq) -> Self {
-        Self(bindings::GSP_ARGUMENTS_CACHED {
-            messageQueueInitArguments: MessageQueueInitArguments::new(cmdq).0,
+    pub(crate) fn new(cmdq: &Cmdq) -> impl Init<Self> + '_ {
+        #[allow(non_snake_case)]
+        let init_inner = init!(bindings::GSP_ARGUMENTS_CACHED {
+            messageQueueInitArguments <- MessageQueueInitArguments::new(cmdq),
             bDmemStack: 1,
-            ..Default::default()
+            ..Zeroable::init_zeroed()
+        });
+
+        init!(GspArgumentsCached {
+            inner <- init_inner,
         })
     }
 }
@@ -882,11 +898,21 @@ unsafe impl AsBytes for GspArgumentsCached {}
 /// must all be a multiple of GSP_PAGE_SIZE in size, so add padding to force it
 /// to that size.
 #[repr(C)]
+#[derive(Zeroable)]
 pub(crate) struct GspArgumentsPadded {
     pub(crate) inner: GspArgumentsCached,
     _padding: [u8; GSP_PAGE_SIZE - core::mem::size_of::<bindings::GSP_ARGUMENTS_CACHED>()],
 }
 
+impl GspArgumentsPadded {
+    pub(crate) fn new(cmdq: &Cmdq) -> impl Init<Self> + '_ {
+        init!(GspArgumentsPadded {
+            inner <- GspArgumentsCached::new(cmdq),
+            ..Zeroable::init_zeroed()
+        })
+    }
+}
+
 // SAFETY: Padding is explicit and will not contain uninitialized data.
 unsafe impl AsBytes for GspArgumentsPadded {}
 
@@ -895,18 +921,18 @@ unsafe impl AsBytes for GspArgumentsPadded {}
 unsafe impl FromBytes for GspArgumentsPadded {}
 
 /// Init arguments for the message queue.
-#[repr(transparent)]
-struct MessageQueueInitArguments(bindings::MESSAGE_QUEUE_INIT_ARGUMENTS);
+type MessageQueueInitArguments = bindings::MESSAGE_QUEUE_INIT_ARGUMENTS;
 
 impl MessageQueueInitArguments {
     /// Creates a new init arguments structure for `cmdq`.
-    fn new(cmdq: &Cmdq) -> Self {
-        Self(bindings::MESSAGE_QUEUE_INIT_ARGUMENTS {
+    #[allow(non_snake_case)]
+    fn new(cmdq: &Cmdq) -> impl Init<Self> + '_ {
+        init!(MessageQueueInitArguments {
             sharedMemPhysAddr: cmdq.dma_handle(),
             pageTableEntryCount: num::usize_into_u32::<{ Cmdq::NUM_PTES }>(),
             cmdQueueOffset: num::usize_as_u64(Cmdq::CMDQ_OFFSET),
             statQueueOffset: num::usize_as_u64(Cmdq::STATQ_OFFSET),
-            ..Default::default()
+            ..Zeroable::init_zeroed()
         })
     }
 }
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v2 8/8] gpu: nova-core: convert to new dma::Coherent API
  2026-03-20 19:45 [PATCH v2 0/8] dma::Coherent & dma::CoherentBox API Danilo Krummrich
                   ` (6 preceding siblings ...)
  2026-03-20 19:45 ` [PATCH v2 7/8] gpu: nova-core: convert Gsp::new() to use CoherentBox Danilo Krummrich
@ 2026-03-20 19:45 ` Danilo Krummrich
  2026-03-21 16:50   ` Gary Guo
  2026-03-21  5:13 ` [PATCH v2 0/8] dma::Coherent & dma::CoherentBox API Alexandre Courbot
  2026-03-23 21:56 ` Danilo Krummrich
  9 siblings, 1 reply; 31+ messages in thread
From: Danilo Krummrich @ 2026-03-20 19:45 UTC (permalink / raw)
  To: aliceryhl, acourbot, ojeda, boqun, gary, bjorn3_gh, lossin,
	a.hindborg, tmgross, abdiel.janulgue, daniel.almeida,
	robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel,
	Danilo Krummrich

From: Gary Guo <gary@garyguo.net>

Remove all usages of dma::CoherentAllocation and use the new
dma::Coherent type instead.

Note that there are still remainders of the old dma::CoherentAllocation
API, such as as_slice() and as_slice_mut().

Signed-off-by: Gary Guo <gary@garyguo.net>
Co-developed-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
---
 drivers/gpu/nova-core/dma.rs      | 19 ++++++-------
 drivers/gpu/nova-core/falcon.rs   |  5 ++--
 drivers/gpu/nova-core/gsp.rs      | 21 ++++++++------
 drivers/gpu/nova-core/gsp/cmdq.rs | 21 ++++++--------
 drivers/gpu/nova-core/gsp/fw.rs   | 46 ++++++++++---------------------
 5 files changed, 47 insertions(+), 65 deletions(-)

diff --git a/drivers/gpu/nova-core/dma.rs b/drivers/gpu/nova-core/dma.rs
index 7215398969da..3c19d5ffcfe8 100644
--- a/drivers/gpu/nova-core/dma.rs
+++ b/drivers/gpu/nova-core/dma.rs
@@ -9,13 +9,13 @@
 
 use kernel::{
     device,
-    dma::CoherentAllocation,
+    dma::Coherent,
     page::PAGE_SIZE,
     prelude::*, //
 };
 
 pub(crate) struct DmaObject {
-    dma: CoherentAllocation<u8>,
+    dma: Coherent<[u8]>,
 }
 
 impl DmaObject {
@@ -24,23 +24,22 @@ pub(crate) fn new(dev: &device::Device<device::Bound>, len: usize) -> Result<Sel
             .map_err(|_| EINVAL)?
             .pad_to_align()
             .size();
-        let dma = CoherentAllocation::alloc_coherent(dev, len, GFP_KERNEL | __GFP_ZERO)?;
+        let dma = Coherent::zeroed_slice(dev, len, GFP_KERNEL)?;
 
         Ok(Self { dma })
     }
 
     pub(crate) fn from_data(dev: &device::Device<device::Bound>, data: &[u8]) -> Result<Self> {
-        Self::new(dev, data.len()).and_then(|mut dma_obj| {
-            // SAFETY: We have just allocated the DMA memory, we are the only users and
-            // we haven't made the device aware of the handle yet.
-            unsafe { dma_obj.write(data, 0)? }
-            Ok(dma_obj)
-        })
+        let dma_obj = Self::new(dev, data.len())?;
+        // SAFETY: We have just allocated the DMA memory, we are the only users and
+        // we haven't made the device aware of the handle yet.
+        unsafe { dma_obj.as_mut()[..data.len()].copy_from_slice(data) };
+        Ok(dma_obj)
     }
 }
 
 impl Deref for DmaObject {
-    type Target = CoherentAllocation<u8>;
+    type Target = Coherent<[u8]>;
 
     fn deref(&self) -> &Self::Target {
         &self.dma
diff --git a/drivers/gpu/nova-core/falcon.rs b/drivers/gpu/nova-core/falcon.rs
index 7097a206ec3c..5bf8da8760bf 100644
--- a/drivers/gpu/nova-core/falcon.rs
+++ b/drivers/gpu/nova-core/falcon.rs
@@ -26,8 +26,7 @@
     gpu::Chipset,
     num::{
         self,
-        FromSafeCast,
-        IntoSafeCast, //
+        FromSafeCast, //
     },
     regs,
     regs::macros::RegisterBase, //
@@ -653,7 +652,7 @@ fn dma_wr(
             }
             FalconMem::Dmem => (
                 0,
-                dma_obj.dma_handle_with_offset(load_offsets.src_start.into_safe_cast())?,
+                dma_obj.dma_handle() + DmaAddress::from(load_offsets.src_start),
             ),
         };
         if dma_start % DmaAddress::from(DMA_LEN) > 0 {
diff --git a/drivers/gpu/nova-core/gsp.rs b/drivers/gpu/nova-core/gsp.rs
index f0a50bdc4c00..a045c4189989 100644
--- a/drivers/gpu/nova-core/gsp.rs
+++ b/drivers/gpu/nova-core/gsp.rs
@@ -6,13 +6,15 @@
     device,
     dma::{
         Coherent,
-        CoherentAllocation,
         CoherentBox,
         DmaAddress, //
     },
     pci,
     prelude::*,
-    transmute::AsBytes, //
+    transmute::{
+        AsBytes,
+        FromBytes, //
+    }, //
 };
 
 pub(crate) mod cmdq;
@@ -44,6 +46,9 @@
 #[repr(C)]
 struct PteArray<const NUM_ENTRIES: usize>([u64; NUM_ENTRIES]);
 
+/// SAFETY: arrays of `u64` implement `FromBytes` and we are but a wrapper around one.
+unsafe impl<const NUM_ENTRIES: usize> FromBytes for PteArray<NUM_ENTRIES> {}
+
 /// SAFETY: arrays of `u64` implement `AsBytes` and we are but a wrapper around one.
 unsafe impl<const NUM_ENTRIES: usize> AsBytes for PteArray<NUM_ENTRIES> {}
 
@@ -71,26 +76,24 @@ fn entry(start: DmaAddress, index: usize) -> Result<u64> {
 /// then pp points to index into the buffer where the next logging entry will
 /// be written. Therefore, the logging data is valid if:
 ///   1 <= pp < sizeof(buffer)/sizeof(u64)
-struct LogBuffer(CoherentAllocation<u8>);
+struct LogBuffer(Coherent<[u8]>);
 
 impl LogBuffer {
     /// Creates a new `LogBuffer` mapped on `dev`.
     fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
         const NUM_PAGES: usize = RM_LOG_BUFFER_NUM_PAGES;
 
-        let mut obj = Self(CoherentAllocation::<u8>::alloc_coherent(
+        let obj = Self(Coherent::<u8>::zeroed_slice(
             dev,
             NUM_PAGES * GSP_PAGE_SIZE,
-            GFP_KERNEL | __GFP_ZERO,
+            GFP_KERNEL,
         )?);
 
         let start_addr = obj.0.dma_handle();
 
         // SAFETY: `obj` has just been created and we are its sole user.
-        let pte_region = unsafe {
-            obj.0
-                .as_slice_mut(size_of::<u64>(), NUM_PAGES * size_of::<u64>())?
-        };
+        let pte_region =
+            unsafe { &mut obj.0.as_mut()[size_of::<u64>()..][..NUM_PAGES * size_of::<u64>()] };
 
         // Write values one by one to avoid an on-stack instance of `PteArray`.
         for (i, chunk) in pte_region.chunks_exact_mut(size_of::<u64>()).enumerate() {
diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index d36a62ba1c60..f38790601a0f 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -7,7 +7,7 @@
 use kernel::{
     device,
     dma::{
-        CoherentAllocation,
+        Coherent,
         DmaAddress, //
     },
     dma_write,
@@ -207,7 +207,7 @@ unsafe impl AsBytes for GspMem {}
 // that is not a problem because they are not used outside the kernel.
 unsafe impl FromBytes for GspMem {}
 
-/// Wrapper around [`GspMem`] to share it with the GPU using a [`CoherentAllocation`].
+/// Wrapper around [`GspMem`] to share it with the GPU using a [`Coherent`].
 ///
 /// This provides the low-level functionality to communicate with the GSP, including allocation of
 /// queue space to write messages to and management of read/write pointers.
@@ -218,7 +218,7 @@ unsafe impl FromBytes for GspMem {}
 ///   pointer and the GSP read pointer. This region is returned by [`Self::driver_write_area`].
 /// * The driver owns (i.e. can read from) the part of the GSP message queue between the CPU read
 ///   pointer and the GSP write pointer. This region is returned by [`Self::driver_read_area`].
-struct DmaGspMem(CoherentAllocation<GspMem>);
+struct DmaGspMem(Coherent<GspMem>);
 
 impl DmaGspMem {
     /// Allocate a new instance and map it for `dev`.
@@ -226,21 +226,20 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
         const MSGQ_SIZE: u32 = num::usize_into_u32::<{ size_of::<Msgq>() }>();
         const RX_HDR_OFF: u32 = num::usize_into_u32::<{ mem::offset_of!(Msgq, rx) }>();
 
-        let gsp_mem =
-            CoherentAllocation::<GspMem>::alloc_coherent(dev, 1, GFP_KERNEL | __GFP_ZERO)?;
+        let gsp_mem = Coherent::<GspMem>::zeroed(dev, GFP_KERNEL)?;
 
         let start = gsp_mem.dma_handle();
         // Write values one by one to avoid an on-stack instance of `PteArray`.
         for i in 0..GspMem::PTE_ARRAY_SIZE {
-            dma_write!(gsp_mem, [0]?.ptes.0[i], PteArray::<0>::entry(start, i)?);
+            dma_write!(gsp_mem, .ptes.0[i], PteArray::<0>::entry(start, i)?);
         }
 
         dma_write!(
             gsp_mem,
-            [0]?.cpuq.tx,
+            .cpuq.tx,
             MsgqTxHeader::new(MSGQ_SIZE, RX_HDR_OFF, MSGQ_NUM_PAGES)
         );
-        dma_write!(gsp_mem, [0]?.cpuq.rx, MsgqRxHeader::new());
+        dma_write!(gsp_mem, .cpuq.rx, MsgqRxHeader::new());
 
         Ok(Self(gsp_mem))
     }
@@ -255,10 +254,9 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
         let rx = self.gsp_read_ptr() as usize;
 
         // SAFETY:
-        // - The `CoherentAllocation` contains exactly one object.
         // - We will only access the driver-owned part of the shared memory.
         // - Per the safety statement of the function, no concurrent access will be performed.
-        let gsp_mem = &mut unsafe { self.0.as_slice_mut(0, 1) }.unwrap()[0];
+        let gsp_mem = unsafe { &mut *self.0.as_mut() };
         // PANIC: per the invariant of `cpu_write_ptr`, `tx` is `< MSGQ_NUM_PAGES`.
         let (before_tx, after_tx) = gsp_mem.cpuq.msgq.data.split_at_mut(tx);
 
@@ -309,10 +307,9 @@ fn driver_write_area_size(&self) -> usize {
         let rx = self.cpu_read_ptr() as usize;
 
         // SAFETY:
-        // - The `CoherentAllocation` contains exactly one object.
         // - We will only access the driver-owned part of the shared memory.
         // - Per the safety statement of the function, no concurrent access will be performed.
-        let gsp_mem = &unsafe { self.0.as_slice(0, 1) }.unwrap()[0];
+        let gsp_mem = unsafe { &*self.0.as_ptr() };
         let data = &gsp_mem.gspq.msgq.data;
 
         // The area starting at `rx` and ending at `tx - 1` modulo MSGQ_NUM_PAGES, inclusive,
diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index 0d8daf6a80b7..847b5eb215d4 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -40,8 +40,7 @@
     },
 };
 
-// TODO: Replace with `IoView` projections once available; the `unwrap()` calls go away once we
-// switch to the new `dma::Coherent` API.
+// TODO: Replace with `IoView` projections once available.
 pub(super) mod gsp_mem {
     use core::sync::atomic::{
         fence,
@@ -49,10 +48,9 @@ pub(super) mod gsp_mem {
     };
 
     use kernel::{
-        dma::CoherentAllocation,
+        dma::Coherent,
         dma_read,
-        dma_write,
-        prelude::*, //
+        dma_write, //
     };
 
     use crate::gsp::cmdq::{
@@ -60,49 +58,35 @@ pub(super) mod gsp_mem {
         MSGQ_NUM_PAGES, //
     };
 
-    pub(in crate::gsp) fn gsp_write_ptr(qs: &CoherentAllocation<GspMem>) -> u32 {
-        // PANIC: A `dma::CoherentAllocation` always contains at least one element.
-        || -> Result<u32> { Ok(dma_read!(qs, [0]?.gspq.tx.0.writePtr) % MSGQ_NUM_PAGES) }().unwrap()
+    pub(in crate::gsp) fn gsp_write_ptr(qs: &Coherent<GspMem>) -> u32 {
+        dma_read!(qs, .gspq.tx.0.writePtr) % MSGQ_NUM_PAGES
     }
 
-    pub(in crate::gsp) fn gsp_read_ptr(qs: &CoherentAllocation<GspMem>) -> u32 {
-        // PANIC: A `dma::CoherentAllocation` always contains at least one element.
-        || -> Result<u32> { Ok(dma_read!(qs, [0]?.gspq.rx.0.readPtr) % MSGQ_NUM_PAGES) }().unwrap()
+    pub(in crate::gsp) fn gsp_read_ptr(qs: &Coherent<GspMem>) -> u32 {
+        dma_read!(qs, .gspq.rx.0.readPtr) % MSGQ_NUM_PAGES
     }
 
-    pub(in crate::gsp) fn cpu_read_ptr(qs: &CoherentAllocation<GspMem>) -> u32 {
-        // PANIC: A `dma::CoherentAllocation` always contains at least one element.
-        || -> Result<u32> { Ok(dma_read!(qs, [0]?.cpuq.rx.0.readPtr) % MSGQ_NUM_PAGES) }().unwrap()
+    pub(in crate::gsp) fn cpu_read_ptr(qs: &Coherent<GspMem>) -> u32 {
+        dma_read!(qs, .cpuq.rx.0.readPtr) % MSGQ_NUM_PAGES
     }
 
-    pub(in crate::gsp) fn advance_cpu_read_ptr(qs: &CoherentAllocation<GspMem>, count: u32) {
+    pub(in crate::gsp) fn advance_cpu_read_ptr(qs: &Coherent<GspMem>, count: u32) {
         let rptr = cpu_read_ptr(qs).wrapping_add(count) % MSGQ_NUM_PAGES;
 
         // Ensure read pointer is properly ordered.
         fence(Ordering::SeqCst);
 
-        // PANIC: A `dma::CoherentAllocation` always contains at least one element.
-        || -> Result {
-            dma_write!(qs, [0]?.cpuq.rx.0.readPtr, rptr);
-            Ok(())
-        }()
-        .unwrap()
+        dma_write!(qs, .cpuq.rx.0.readPtr, rptr);
     }
 
-    pub(in crate::gsp) fn cpu_write_ptr(qs: &CoherentAllocation<GspMem>) -> u32 {
-        // PANIC: A `dma::CoherentAllocation` always contains at least one element.
-        || -> Result<u32> { Ok(dma_read!(qs, [0]?.cpuq.tx.0.writePtr) % MSGQ_NUM_PAGES) }().unwrap()
+    pub(in crate::gsp) fn cpu_write_ptr(qs: &Coherent<GspMem>) -> u32 {
+        dma_read!(qs, .cpuq.tx.0.writePtr) % MSGQ_NUM_PAGES
     }
 
-    pub(in crate::gsp) fn advance_cpu_write_ptr(qs: &CoherentAllocation<GspMem>, count: u32) {
+    pub(in crate::gsp) fn advance_cpu_write_ptr(qs: &Coherent<GspMem>, count: u32) {
         let wptr = cpu_write_ptr(qs).wrapping_add(count) % MSGQ_NUM_PAGES;
 
-        // PANIC: A `dma::CoherentAllocation` always contains at least one element.
-        || -> Result {
-            dma_write!(qs, [0]?.cpuq.tx.0.writePtr, wptr);
-            Ok(())
-        }()
-        .unwrap();
+        dma_write!(qs, .cpuq.tx.0.writePtr, wptr);
 
         // Ensure all command data is visible before triggering the GSP read.
         fence(Ordering::SeqCst);
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 4/8] rust: dma: introduce dma::CoherentBox for memory initialization
  2026-03-20 19:45 ` [PATCH v2 4/8] rust: dma: introduce dma::CoherentBox for memory initialization Danilo Krummrich
@ 2026-03-20 20:55   ` Gary Guo
  2026-03-24 13:57   ` Andreas Hindborg
  1 sibling, 0 replies; 31+ messages in thread
From: Gary Guo @ 2026-03-20 20:55 UTC (permalink / raw)
  To: Danilo Krummrich, aliceryhl, acourbot, ojeda, boqun, gary,
	bjorn3_gh, lossin, a.hindborg, tmgross, abdiel.janulgue,
	daniel.almeida, robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On Fri Mar 20, 2026 at 7:45 PM GMT, Danilo Krummrich wrote:
> Currently, dma::Coherent cannot safely provide (mutable) access to its
> underlying memory because the memory might be concurrently accessed by a
> DMA device. This makes it difficult to safely initialize the memory
> before handing it over to the hardware.
>
> Introduce dma::CoherentBox, a type that encapsulates a dma::Coherent
> before its DMA address is exposed to the device. dma::CoherentBox can
> guarantee exclusive access to the inner dma::Coherent and implement
> Deref and DerefMut.
>
> Once the memory is properly initialized, dma::CoherentBox can be
> converted into a regular dma::Coherent.
>
> Reviewed-by: Alice Ryhl <aliceryhl@google.com>
> Signed-off-by: Danilo Krummrich <dakr@kernel.org>

Reviewed-by: Gary Guo <gary@garyguo.net>

See some nits below:

> ---
>  rust/kernel/dma.rs | 154 ++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 153 insertions(+), 1 deletion(-)
>
> diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
> index db645b01bdd0..cefb54f0424a 100644
> --- a/rust/kernel/dma.rs
> +++ b/rust/kernel/dma.rs
> @@ -20,7 +20,13 @@
>          FromBytes, //
>      }, //
>  };
> -use core::ptr::NonNull;
> +use core::{
> +    ops::{
> +        Deref,
> +        DerefMut, //
> +    },
> +    ptr::NonNull, //
> +};
>  
>  /// DMA address type.
>  ///
> @@ -352,6 +358,152 @@ fn from(direction: DataDirection) -> Self {
>      }
>  }
>  
> +/// CPU-owned DMA allocation that can be converted into a device-shared [`Coherent`] object.
> +///
> +/// Unlike [`Coherent`], a [`CoherentBox`] is guaranteed to be fully owned by the CPU -- its DMA
> +/// address is not exposed and it cannot be accessed by a device. This means it can safely be used
> +/// like a normal boxed allocation (e.g. direct reads, writes, and mutable slices are all safe).
> +///
> +/// A typical use is to allocate a [`CoherentBox`], populate it with normal CPU access, and then
> +/// convert it into a [`Coherent`] object to share it with the device.
> +///
> +/// # Examples
> +///
> +/// `CoherentBox<T>`:
> +///
> +/// ```
> +/// # use kernel::device::{
> +/// #     Bound,
> +/// #     Device,
> +/// # };
> +/// use kernel::dma::{attrs::*,
> +///     Coherent,
> +///     CoherentBox,
> +/// };
> +///
> +/// # fn test(dev: &Device<Bound>) -> Result {
> +/// let mut dmem: CoherentBox<u64> = CoherentBox::zeroed(dev, GFP_KERNEL)?;
> +/// *dmem = 42;
> +/// let dmem: Coherent<u64> = dmem.into();
> +/// # Ok::<(), Error>(()) }
> +/// ```
> +///
> +/// `CoherentBox<[T]>`:
> +///
> +///
> +/// ```
> +/// # use kernel::device::{
> +/// #     Bound,
> +/// #     Device,
> +/// # };
> +/// use kernel::dma::{attrs::*,
> +///     Coherent,
> +///     CoherentBox,
> +/// };
> +///
> +/// # fn test(dev: &Device<Bound>) -> Result {
> +/// let mut dmem: CoherentBox<[u64]> = CoherentBox::zeroed_slice(dev, 4, GFP_KERNEL)?;
> +/// dmem.fill(42);
> +/// let dmem: Coherent<[u64]> = dmem.into();
> +/// # Ok::<(), Error>(()) }
> +/// ```
> +pub struct CoherentBox<T: AsBytes + FromBytes + KnownSize + ?Sized>(Coherent<T>);

Similar to the changes I've made to `Coherent`, this can also just have
`KnownSize + ?Sized` bound on the struct, and only have those bounds on the
constructor.

This saves a quite bit of duplication where everything needs to say `T: AsBytes
+ FromBytes`.

Should be something fix-up-able on apply, or something left for future cleanup.

> +
> +impl<T: AsBytes + FromBytes> CoherentBox<[T]> {

this and ...

> +    /// [`CoherentBox`] variant of [`Coherent::zeroed_slice_with_attrs`].
> +    #[inline]
> +    pub fn zeroed_slice_with_attrs(
> +        dev: &device::Device<Bound>,
> +        count: usize,
> +        gfp_flags: kernel::alloc::Flags,
> +        dma_attrs: Attrs,
> +    ) -> Result<Self> {
> +        Coherent::zeroed_slice_with_attrs(dev, count, gfp_flags, dma_attrs).map(Self)
> +    }
> +
> +    /// Same as [CoherentBox::zeroed_slice_with_attrs], but with `dma::Attrs(0)`.
> +    #[inline]
> +    pub fn zeroed_slice(
> +        dev: &device::Device<Bound>,
> +        count: usize,
> +        gfp_flags: kernel::alloc::Flags,
> +    ) -> Result<Self> {
> +        Self::zeroed_slice_with_attrs(dev, count, gfp_flags, Attrs(0))
> +    }
> +
> +    /// Initializes the element at `i` using the given initializer.
> +    ///
> +    /// Returns `EINVAL` if `i` is out of bounds.
> +    pub fn init_at<E>(&mut self, i: usize, init: impl Init<T, E>) -> Result
> +    where
> +        Error: From<E>,
> +    {
> +        if i >= self.0.len() {
> +            return Err(EINVAL);
> +        }
> +
> +        let ptr = &raw mut self[i];
> +
> +        // SAFETY:
> +        // - `ptr` is valid, properly aligned, and within this allocation.
> +        // - `T: AsBytes + FromBytes` guarantees all bit patterns are valid, so partial writes on
> +        //   error cannot leave the element in an invalid state.
> +        // - The DMA address has not been exposed yet, so there is no concurrent device access.
> +        unsafe { init.__init(ptr)? };
> +
> +        Ok(())
> +    }
> +}
> +
> +impl<T: AsBytes + FromBytes> CoherentBox<T> {

this should be the only two places that need `AsBytes + FromBytes`.

Best,
Gary

> +    /// Same as [`CoherentBox::zeroed_slice_with_attrs`], but for a single element.
> +    #[inline]
> +    pub fn zeroed_with_attrs(
> +        dev: &device::Device<Bound>,
> +        gfp_flags: kernel::alloc::Flags,
> +        dma_attrs: Attrs,
> +    ) -> Result<Self> {
> +        Coherent::zeroed_with_attrs(dev, gfp_flags, dma_attrs).map(Self)
> +    }
> +
> +    /// Same as [`CoherentBox::zeroed_slice`], but for a single element.
> +    #[inline]
> +    pub fn zeroed(dev: &device::Device<Bound>, gfp_flags: kernel::alloc::Flags) -> Result<Self> {
> +        Self::zeroed_with_attrs(dev, gfp_flags, Attrs(0))
> +    }
> +}
> +
> +impl<T: AsBytes + FromBytes + KnownSize + ?Sized> Deref for CoherentBox<T> {
> +    type Target = T;
> +
> +    #[inline]
> +    fn deref(&self) -> &Self::Target {
> +        // SAFETY:
> +        // - We have not exposed the DMA address yet, so there can't be any concurrent access by a
> +        //   device.
> +        // - We have exclusive access to `self.0`.
> +        unsafe { self.0.as_ref() }
> +    }
> +}
> +
> +impl<T: AsBytes + FromBytes + KnownSize + ?Sized> DerefMut for CoherentBox<T> {
> +    #[inline]
> +    fn deref_mut(&mut self) -> &mut Self::Target {
> +        // SAFETY:
> +        // - We have not exposed the DMA address yet, so there can't be any concurrent access by a
> +        //   device.
> +        // - We have exclusive access to `self.0`.
> +        unsafe { self.0.as_mut() }
> +    }
> +}
> +
> +impl<T: AsBytes + FromBytes + KnownSize + ?Sized> From<CoherentBox<T>> for Coherent<T> {
> +    #[inline]
> +    fn from(value: CoherentBox<T>) -> Self {
> +        value.0
> +    }
> +}
> +
>  /// An abstraction of the `dma_alloc_coherent` API.
>  ///
>  /// This is an abstraction around the `dma_alloc_coherent` API which is used to allocate and map


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 5/8] rust: dma: add Coherent:init() and Coherent::init_with_attrs()
  2026-03-20 19:45 ` [PATCH v2 5/8] rust: dma: add Coherent:init() and Coherent::init_with_attrs() Danilo Krummrich
@ 2026-03-20 20:56   ` Gary Guo
  2026-03-24 14:00   ` Andreas Hindborg
  1 sibling, 0 replies; 31+ messages in thread
From: Gary Guo @ 2026-03-20 20:56 UTC (permalink / raw)
  To: Danilo Krummrich, aliceryhl, acourbot, ojeda, boqun, gary,
	bjorn3_gh, lossin, a.hindborg, tmgross, abdiel.janulgue,
	daniel.almeida, robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On Fri Mar 20, 2026 at 7:45 PM GMT, Danilo Krummrich wrote:
> Analogous to Coherent::zeroed() and Coherent::zeroed_with_attrs(), add
> Coherent:init() and Coherent::init_with_attrs() which both take an impl
> Init<T, E> argument initializing the DMA coherent memory.
> 
> Compared to CoherentInit, Coherent::init() is a one-shot constructor
> that runs an Init closure and immediately exposes the DMA handle,
> whereas CoherentInit is a multi-stage initializer that provides safe
> &mut T access by withholding the DMA address until converted to
> Coherent.
> 
> Signed-off-by: Danilo Krummrich <dakr@kernel.org>

Reviewed-by: Gary Guo <gary@garyguo.net>

> ---
>  rust/kernel/dma.rs | 38 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 38 insertions(+)


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 6/8] gpu: nova-core: use Coherent::init to initialize GspFwWprMeta
  2026-03-20 19:45 ` [PATCH v2 6/8] gpu: nova-core: use Coherent::init to initialize GspFwWprMeta Danilo Krummrich
@ 2026-03-20 21:04   ` Gary Guo
  0 siblings, 0 replies; 31+ messages in thread
From: Gary Guo @ 2026-03-20 21:04 UTC (permalink / raw)
  To: Danilo Krummrich, aliceryhl, acourbot, ojeda, boqun, gary,
	bjorn3_gh, lossin, a.hindborg, tmgross, abdiel.janulgue,
	daniel.almeida, robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On Fri Mar 20, 2026 at 7:45 PM GMT, Danilo Krummrich wrote:
> Convert wpr_meta to use Coherent::init() and simplify the
> initialization.  It also avoids a separate initialization of
> GspFwWprMeta on the stack.
> 
> Signed-off-by: Danilo Krummrich <dakr@kernel.org>

Reviewed-by: Gary Guo <gary@garyguo.net>

> ---
>  drivers/gpu/nova-core/gsp/boot.rs |  7 ++-----
>  drivers/gpu/nova-core/gsp/fw.rs   | 20 +++++++++++++++-----
>  2 files changed, 17 insertions(+), 10 deletions(-)


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 7/8] gpu: nova-core: convert Gsp::new() to use CoherentBox
  2026-03-20 19:45 ` [PATCH v2 7/8] gpu: nova-core: convert Gsp::new() to use CoherentBox Danilo Krummrich
@ 2026-03-20 21:06   ` Gary Guo
  0 siblings, 0 replies; 31+ messages in thread
From: Gary Guo @ 2026-03-20 21:06 UTC (permalink / raw)
  To: Danilo Krummrich, aliceryhl, acourbot, ojeda, boqun, gary,
	bjorn3_gh, lossin, a.hindborg, tmgross, abdiel.janulgue,
	daniel.almeida, robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On Fri Mar 20, 2026 at 7:45 PM GMT, Danilo Krummrich wrote:
> Convert libos (LibosMemoryRegionInitArgument) and rmargs
> (GspArgumentsPadded) to use CoherentBox / Coherent::init() and simplify
> the initialization. This also avoids separate initialization on the
> stack.
>
> Signed-off-by: Danilo Krummrich <dakr@kernel.org>
> ---
>  drivers/gpu/nova-core/gsp.rs    | 47 +++++++++++--------------
>  drivers/gpu/nova-core/gsp/fw.rs | 62 +++++++++++++++++++++++----------
>  2 files changed, 65 insertions(+), 44 deletions(-)
>
> diff --git a/drivers/gpu/nova-core/gsp.rs b/drivers/gpu/nova-core/gsp.rs
> index 72f173726f87..f0a50bdc4c00 100644
> --- a/drivers/gpu/nova-core/gsp.rs
> +++ b/drivers/gpu/nova-core/gsp.rs
> @@ -5,10 +5,11 @@
>  use kernel::{
>      device,
>      dma::{
> +        Coherent,
>          CoherentAllocation,
> +        CoherentBox,
>          DmaAddress, //
>      },
> -    dma_write,
>      pci,
>      prelude::*,
>      transmute::AsBytes, //
> @@ -106,7 +107,7 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
>  #[pin_data]
>  pub(crate) struct Gsp {
>      /// Libos arguments.
> -    pub(crate) libos: CoherentAllocation<LibosMemoryRegionInitArgument>,
> +    pub(crate) libos: Coherent<[LibosMemoryRegionInitArgument]>,
>      /// Init log buffer.
>      loginit: LogBuffer,
>      /// Interrupts log buffer.
> @@ -117,7 +118,7 @@ pub(crate) struct Gsp {
>      #[pin]
>      pub(crate) cmdq: Cmdq,
>      /// RM arguments.
> -    rmargs: CoherentAllocation<GspArgumentsPadded>,
> +    rmargs: Coherent<GspArgumentsPadded>,
>  }
>  
>  impl Gsp {
> @@ -126,34 +127,28 @@ pub(crate) fn new(pdev: &pci::Device<device::Bound>) -> impl PinInit<Self, Error
>          pin_init::pin_init_scope(move || {
>              let dev = pdev.as_ref();
>  
> +            // Initialise the logging structures. The OpenRM equivalents are in:
> +            // _kgspInitLibosLoggingStructures (allocates memory for buffers)
> +            // kgspSetupLibosInitArgs_IMPL (creates pLibosInitArgs[] array)
>              Ok(try_pin_init!(Self {
> -                libos: CoherentAllocation::<LibosMemoryRegionInitArgument>::alloc_coherent(
> -                    dev,
> -                    GSP_PAGE_SIZE / size_of::<LibosMemoryRegionInitArgument>(),
> -                    GFP_KERNEL | __GFP_ZERO,
> -                )?,
>                  loginit: LogBuffer::new(dev)?,
>                  logintr: LogBuffer::new(dev)?,
>                  logrm: LogBuffer::new(dev)?,
>                  cmdq <- Cmdq::new(dev),
> -                rmargs: CoherentAllocation::<GspArgumentsPadded>::alloc_coherent(
> -                    dev,
> -                    1,
> -                    GFP_KERNEL | __GFP_ZERO,
> -                )?,
> -                _: {
> -                    // Initialise the logging structures. The OpenRM equivalents are in:
> -                    // _kgspInitLibosLoggingStructures (allocates memory for buffers)
> -                    // kgspSetupLibosInitArgs_IMPL (creates pLibosInitArgs[] array)
> -                    dma_write!(
> -                        libos, [0]?, LibosMemoryRegionInitArgument::new("LOGINIT", &loginit.0)
> -                    );
> -                    dma_write!(
> -                        libos, [1]?, LibosMemoryRegionInitArgument::new("LOGINTR", &logintr.0)
> -                    );
> -                    dma_write!(libos, [2]?, LibosMemoryRegionInitArgument::new("LOGRM", &logrm.0));
> -                    dma_write!(rmargs, [0]?.inner, fw::GspArgumentsCached::new(&cmdq));
> -                    dma_write!(libos, [3]?, LibosMemoryRegionInitArgument::new("RMARGS", rmargs));
> +                rmargs: Coherent::init(dev, GFP_KERNEL, GspArgumentsPadded::new(&cmdq))?,
> +                libos: {
> +                    let mut libos = CoherentBox::zeroed_slice(
> +                        dev,
> +                        GSP_PAGE_SIZE / size_of::<LibosMemoryRegionInitArgument>(),
> +                        GFP_KERNEL,
> +                    )?;
> +
> +                    libos.init_at(0, LibosMemoryRegionInitArgument::new("LOGINIT", &loginit.0))?;
> +                    libos.init_at(1, LibosMemoryRegionInitArgument::new("LOGINTR", &logintr.0))?;
> +                    libos.init_at(2, LibosMemoryRegionInitArgument::new("LOGRM", &logrm.0))?;
> +                    libos.init_at(3, LibosMemoryRegionInitArgument::new("RMARGS", rmargs))?;
> +
> +                    libos.into()
>                  },
>              }))
>          })
> diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
> index 4e3bfc6c4c47..0d8daf6a80b7 100644
> --- a/drivers/gpu/nova-core/gsp/fw.rs
> +++ b/drivers/gpu/nova-core/gsp/fw.rs
> @@ -9,11 +9,12 @@
>  use core::ops::Range;
>  
>  use kernel::{
> -    dma::CoherentAllocation,
> +    dma::Coherent,
>      prelude::*,
>      ptr::{
>          Alignable,
> -        Alignment, //
> +        Alignment,
> +        KnownSize, //
>      },
>      sizes::{
>          SZ_128K,
> @@ -648,7 +649,9 @@ unsafe impl AsBytes for RunCpuSequencer {}
>  /// The memory allocated for the arguments must remain until the GSP sends the
>  /// init_done RPC.
>  #[repr(transparent)]
> -pub(crate) struct LibosMemoryRegionInitArgument(bindings::LibosMemoryRegionInitArgument);
> +pub(crate) struct LibosMemoryRegionInitArgument {
> +    inner: bindings::LibosMemoryRegionInitArgument,
> +}
>  
>  // SAFETY: Padding is explicit and does not contain uninitialized data.
>  unsafe impl AsBytes for LibosMemoryRegionInitArgument {}
> @@ -658,10 +661,10 @@ unsafe impl AsBytes for LibosMemoryRegionInitArgument {}
>  unsafe impl FromBytes for LibosMemoryRegionInitArgument {}
>  
>  impl LibosMemoryRegionInitArgument {
> -    pub(crate) fn new<A: AsBytes + FromBytes>(
> +    pub(crate) fn new<'a, A: AsBytes + FromBytes + KnownSize + ?Sized>(
>          name: &'static str,
> -        obj: &CoherentAllocation<A>,
> -    ) -> Self {
> +        obj: &'a Coherent<A>,
> +    ) -> impl Init<Self> + 'a {
>          /// Generates the `ID8` identifier required for some GSP objects.
>          fn id8(name: &str) -> u64 {
>              let mut bytes = [0u8; core::mem::size_of::<u64>()];
> @@ -673,7 +676,8 @@ fn id8(name: &str) -> u64 {
>              u64::from_ne_bytes(bytes)
>          }
>  
> -        Self(bindings::LibosMemoryRegionInitArgument {
> +        #[allow(non_snake_case)]
> +        let init_inner = init!(bindings::LibosMemoryRegionInitArgument {
>              id8: id8(name),
>              pa: obj.dma_handle(),
>              size: num::usize_as_u64(obj.size()),
> @@ -683,7 +687,11 @@ fn id8(name: &str) -> u64 {
>              loc: num::u32_into_u8::<
>                  { bindings::LibosMemoryRegionLoc_LIBOS_MEMORY_REGION_LOC_SYSMEM },
>              >(),
> -            ..Default::default()
> +            ..Zeroable::init_zeroed()
> +        });
> +
> +        init!(LibosMemoryRegionInitArgument {
> +            inner <- init_inner,
>          })
>      }
>  }
> @@ -862,15 +870,23 @@ unsafe impl FromBytes for GspMsgElement {}
>  
>  /// Arguments for GSP startup.
>  #[repr(transparent)]
> -pub(crate) struct GspArgumentsCached(bindings::GSP_ARGUMENTS_CACHED);
> +#[derive(Zeroable)]
> +pub(crate) struct GspArgumentsCached {
> +    inner: bindings::GSP_ARGUMENTS_CACHED,
> +}
>  
>  impl GspArgumentsCached {
>      /// Creates the arguments for starting the GSP up using `cmdq` as its command queue.
> -    pub(crate) fn new(cmdq: &Cmdq) -> Self {
> -        Self(bindings::GSP_ARGUMENTS_CACHED {
> -            messageQueueInitArguments: MessageQueueInitArguments::new(cmdq).0,
> +    pub(crate) fn new(cmdq: &Cmdq) -> impl Init<Self> + '_ {
> +        #[allow(non_snake_case)]
> +        let init_inner = init!(bindings::GSP_ARGUMENTS_CACHED {
> +            messageQueueInitArguments <- MessageQueueInitArguments::new(cmdq),
>              bDmemStack: 1,
> -            ..Default::default()
> +            ..Zeroable::init_zeroed()
> +        });
> +
> +        init!(GspArgumentsCached {
> +            inner <- init_inner,
>          })
>      }
>  }
> @@ -882,11 +898,21 @@ unsafe impl AsBytes for GspArgumentsCached {}
>  /// must all be a multiple of GSP_PAGE_SIZE in size, so add padding to force it
>  /// to that size.
>  #[repr(C)]
> +#[derive(Zeroable)]
>  pub(crate) struct GspArgumentsPadded {
>      pub(crate) inner: GspArgumentsCached,
>      _padding: [u8; GSP_PAGE_SIZE - core::mem::size_of::<bindings::GSP_ARGUMENTS_CACHED>()],
>  }
>  
> +impl GspArgumentsPadded {
> +    pub(crate) fn new(cmdq: &Cmdq) -> impl Init<Self> + '_ {
> +        init!(GspArgumentsPadded {
> +            inner <- GspArgumentsCached::new(cmdq),
> +            ..Zeroable::init_zeroed()
> +        })
> +    }
> +}
> +
>  // SAFETY: Padding is explicit and will not contain uninitialized data.
>  unsafe impl AsBytes for GspArgumentsPadded {}
>  
> @@ -895,18 +921,18 @@ unsafe impl AsBytes for GspArgumentsPadded {}
>  unsafe impl FromBytes for GspArgumentsPadded {}
>  
>  /// Init arguments for the message queue.
> -#[repr(transparent)]
> -struct MessageQueueInitArguments(bindings::MESSAGE_QUEUE_INIT_ARGUMENTS);
> +type MessageQueueInitArguments = bindings::MESSAGE_QUEUE_INIT_ARGUMENTS;

Why is this typedef while the others new type? Because this one is not
`pub(crate)`?

Anyway, just a question.

Reviewed-by: Gary Guo <gary@garyguo.net>

>  
>  impl MessageQueueInitArguments {
>      /// Creates a new init arguments structure for `cmdq`.
> -    fn new(cmdq: &Cmdq) -> Self {
> -        Self(bindings::MESSAGE_QUEUE_INIT_ARGUMENTS {
> +    #[allow(non_snake_case)]
> +    fn new(cmdq: &Cmdq) -> impl Init<Self> + '_ {
> +        init!(MessageQueueInitArguments {
>              sharedMemPhysAddr: cmdq.dma_handle(),
>              pageTableEntryCount: num::usize_into_u32::<{ Cmdq::NUM_PTES }>(),
>              cmdQueueOffset: num::usize_as_u64(Cmdq::CMDQ_OFFSET),
>              statQueueOffset: num::usize_as_u64(Cmdq::STATQ_OFFSET),
> -            ..Default::default()
> +            ..Zeroable::init_zeroed()
>          })
>      }
>  }


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/8] dma::Coherent & dma::CoherentBox API
  2026-03-20 19:45 [PATCH v2 0/8] dma::Coherent & dma::CoherentBox API Danilo Krummrich
                   ` (7 preceding siblings ...)
  2026-03-20 19:45 ` [PATCH v2 8/8] gpu: nova-core: convert to new dma::Coherent API Danilo Krummrich
@ 2026-03-21  5:13 ` Alexandre Courbot
  2026-03-23 21:56 ` Danilo Krummrich
  9 siblings, 0 replies; 31+ messages in thread
From: Alexandre Courbot @ 2026-03-21  5:13 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: aliceryhl, ojeda, boqun, gary, bjorn3_gh, lossin, a.hindborg,
	tmgross, abdiel.janulgue, daniel.almeida, robin.murphy,
	driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On Sat Mar 21, 2026 at 4:45 AM JST, Danilo Krummrich wrote:
> This patch series introduces the dma::Coherent API Gary worked out in the
> context of his I/O projection work.
>
> Additionally, introduce dma::CoherentBox, a type that encapsulates a
> dma::Coherent object before its DMA address is exposed to the device.
> dma::CoherentBox can guarantee exclusive access to the inner dma::Coherent
> object and implement Deref and DerefMut.
>
> Also add Coherent::init() and Coherent::init_with_attrs() so we can directly
> initialize a new dma::Coherent object through an impl Init<T, E>.

The series,

Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>

Based on this I have been able to remove nova-core's `DmaObject` and
consequently its now-useless `dma` module. It's indeed much nicer. I'll
submit the series after some more cleanup and testing.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 3/8] rust: dma: add zeroed constructor to `Coherent`
  2026-03-20 19:45 ` [PATCH v2 3/8] rust: dma: add zeroed constructor to `Coherent` Danilo Krummrich
@ 2026-03-21  6:37   ` Alexandre Courbot
  2026-03-24 13:46   ` Andreas Hindborg
  1 sibling, 0 replies; 31+ messages in thread
From: Alexandre Courbot @ 2026-03-21  6:37 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: aliceryhl, ojeda, boqun, gary, bjorn3_gh, lossin, a.hindborg,
	tmgross, abdiel.janulgue, daniel.almeida, robin.murphy,
	driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On Sat Mar 21, 2026 at 4:45 AM JST, Danilo Krummrich wrote:
<snip>
> @@ -529,6 +528,35 @@ fn alloc_with_attrs(
>          })
>      }
>  
> +    /// Allocates a region of type `T` of coherent memory.
> +    ///
> +    /// # Examples
> +    ///
> +    /// ```
> +    /// # use kernel::device::{Bound, Device};
> +    /// use kernel::dma::{attrs::*, Coherent};

nit: aren't we using the kernel import format in examples as well?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 8/8] gpu: nova-core: convert to new dma::Coherent API
  2026-03-20 19:45 ` [PATCH v2 8/8] gpu: nova-core: convert to new dma::Coherent API Danilo Krummrich
@ 2026-03-21 16:50   ` Gary Guo
  2026-03-21 18:22     ` Danilo Krummrich
  0 siblings, 1 reply; 31+ messages in thread
From: Gary Guo @ 2026-03-21 16:50 UTC (permalink / raw)
  To: Danilo Krummrich, aliceryhl, acourbot, ojeda, boqun, gary,
	bjorn3_gh, lossin, a.hindborg, tmgross, abdiel.janulgue,
	daniel.almeida, robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On Fri Mar 20, 2026 at 7:45 PM GMT, Danilo Krummrich wrote:
> From: Gary Guo <gary@garyguo.net>
>
> Remove all usages of dma::CoherentAllocation and use the new
> dma::Coherent type instead.
>
> Note that there are still remainders of the old dma::CoherentAllocation
> API, such as as_slice() and as_slice_mut().

Hi Danilo,

It looks like with Alex's "gpu: nova-core: create falcon firmware DMA objects
lazily" landed, all others users of the old API are now gone.

So this line could be dropped and `impl CoherentAllocation` and the type alias
can be removed after this patch.

Best,
Gary

>
> Signed-off-by: Gary Guo <gary@garyguo.net>
> Co-developed-by: Danilo Krummrich <dakr@kernel.org>
> Signed-off-by: Danilo Krummrich <dakr@kernel.org>
> ---
>  drivers/gpu/nova-core/dma.rs      | 19 ++++++-------
>  drivers/gpu/nova-core/falcon.rs   |  5 ++--
>  drivers/gpu/nova-core/gsp.rs      | 21 ++++++++------
>  drivers/gpu/nova-core/gsp/cmdq.rs | 21 ++++++--------
>  drivers/gpu/nova-core/gsp/fw.rs   | 46 ++++++++++---------------------
>  5 files changed, 47 insertions(+), 65 deletions(-)
>
> diff --git a/drivers/gpu/nova-core/dma.rs b/drivers/gpu/nova-core/dma.rs
> index 7215398969da..3c19d5ffcfe8 100644
> --- a/drivers/gpu/nova-core/dma.rs
> +++ b/drivers/gpu/nova-core/dma.rs
> @@ -9,13 +9,13 @@
>  
>  use kernel::{
>      device,
> -    dma::CoherentAllocation,
> +    dma::Coherent,
>      page::PAGE_SIZE,
>      prelude::*, //
>  };
>  
>  pub(crate) struct DmaObject {
> -    dma: CoherentAllocation<u8>,
> +    dma: Coherent<[u8]>,
>  }
>  
>  impl DmaObject {
> @@ -24,23 +24,22 @@ pub(crate) fn new(dev: &device::Device<device::Bound>, len: usize) -> Result<Sel
>              .map_err(|_| EINVAL)?
>              .pad_to_align()
>              .size();
> -        let dma = CoherentAllocation::alloc_coherent(dev, len, GFP_KERNEL | __GFP_ZERO)?;
> +        let dma = Coherent::zeroed_slice(dev, len, GFP_KERNEL)?;
>  
>          Ok(Self { dma })
>      }
>  
>      pub(crate) fn from_data(dev: &device::Device<device::Bound>, data: &[u8]) -> Result<Self> {
> -        Self::new(dev, data.len()).and_then(|mut dma_obj| {
> -            // SAFETY: We have just allocated the DMA memory, we are the only users and
> -            // we haven't made the device aware of the handle yet.
> -            unsafe { dma_obj.write(data, 0)? }
> -            Ok(dma_obj)
> -        })
> +        let dma_obj = Self::new(dev, data.len())?;
> +        // SAFETY: We have just allocated the DMA memory, we are the only users and
> +        // we haven't made the device aware of the handle yet.
> +        unsafe { dma_obj.as_mut()[..data.len()].copy_from_slice(data) };
> +        Ok(dma_obj)
>      }
>  }
>  
>  impl Deref for DmaObject {
> -    type Target = CoherentAllocation<u8>;
> +    type Target = Coherent<[u8]>;
>  
>      fn deref(&self) -> &Self::Target {
>          &self.dma
> diff --git a/drivers/gpu/nova-core/falcon.rs b/drivers/gpu/nova-core/falcon.rs
> index 7097a206ec3c..5bf8da8760bf 100644
> --- a/drivers/gpu/nova-core/falcon.rs
> +++ b/drivers/gpu/nova-core/falcon.rs
> @@ -26,8 +26,7 @@
>      gpu::Chipset,
>      num::{
>          self,
> -        FromSafeCast,
> -        IntoSafeCast, //
> +        FromSafeCast, //
>      },
>      regs,
>      regs::macros::RegisterBase, //
> @@ -653,7 +652,7 @@ fn dma_wr(
>              }
>              FalconMem::Dmem => (
>                  0,
> -                dma_obj.dma_handle_with_offset(load_offsets.src_start.into_safe_cast())?,
> +                dma_obj.dma_handle() + DmaAddress::from(load_offsets.src_start),
>              ),
>          };
>          if dma_start % DmaAddress::from(DMA_LEN) > 0 {
> diff --git a/drivers/gpu/nova-core/gsp.rs b/drivers/gpu/nova-core/gsp.rs
> index f0a50bdc4c00..a045c4189989 100644
> --- a/drivers/gpu/nova-core/gsp.rs
> +++ b/drivers/gpu/nova-core/gsp.rs
> @@ -6,13 +6,15 @@
>      device,
>      dma::{
>          Coherent,
> -        CoherentAllocation,
>          CoherentBox,
>          DmaAddress, //
>      },
>      pci,
>      prelude::*,
> -    transmute::AsBytes, //
> +    transmute::{
> +        AsBytes,
> +        FromBytes, //
> +    }, //
>  };
>  
>  pub(crate) mod cmdq;
> @@ -44,6 +46,9 @@
>  #[repr(C)]
>  struct PteArray<const NUM_ENTRIES: usize>([u64; NUM_ENTRIES]);
>  
> +/// SAFETY: arrays of `u64` implement `FromBytes` and we are but a wrapper around one.
> +unsafe impl<const NUM_ENTRIES: usize> FromBytes for PteArray<NUM_ENTRIES> {}
> +
>  /// SAFETY: arrays of `u64` implement `AsBytes` and we are but a wrapper around one.
>  unsafe impl<const NUM_ENTRIES: usize> AsBytes for PteArray<NUM_ENTRIES> {}
>  
> @@ -71,26 +76,24 @@ fn entry(start: DmaAddress, index: usize) -> Result<u64> {
>  /// then pp points to index into the buffer where the next logging entry will
>  /// be written. Therefore, the logging data is valid if:
>  ///   1 <= pp < sizeof(buffer)/sizeof(u64)
> -struct LogBuffer(CoherentAllocation<u8>);
> +struct LogBuffer(Coherent<[u8]>);
>  
>  impl LogBuffer {
>      /// Creates a new `LogBuffer` mapped on `dev`.
>      fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
>          const NUM_PAGES: usize = RM_LOG_BUFFER_NUM_PAGES;
>  
> -        let mut obj = Self(CoherentAllocation::<u8>::alloc_coherent(
> +        let obj = Self(Coherent::<u8>::zeroed_slice(
>              dev,
>              NUM_PAGES * GSP_PAGE_SIZE,
> -            GFP_KERNEL | __GFP_ZERO,
> +            GFP_KERNEL,
>          )?);
>  
>          let start_addr = obj.0.dma_handle();
>  
>          // SAFETY: `obj` has just been created and we are its sole user.
> -        let pte_region = unsafe {
> -            obj.0
> -                .as_slice_mut(size_of::<u64>(), NUM_PAGES * size_of::<u64>())?
> -        };
> +        let pte_region =
> +            unsafe { &mut obj.0.as_mut()[size_of::<u64>()..][..NUM_PAGES * size_of::<u64>()] };
>  
>          // Write values one by one to avoid an on-stack instance of `PteArray`.
>          for (i, chunk) in pte_region.chunks_exact_mut(size_of::<u64>()).enumerate() {
> diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
> index d36a62ba1c60..f38790601a0f 100644
> --- a/drivers/gpu/nova-core/gsp/cmdq.rs
> +++ b/drivers/gpu/nova-core/gsp/cmdq.rs
> @@ -7,7 +7,7 @@
>  use kernel::{
>      device,
>      dma::{
> -        CoherentAllocation,
> +        Coherent,
>          DmaAddress, //
>      },
>      dma_write,
> @@ -207,7 +207,7 @@ unsafe impl AsBytes for GspMem {}
>  // that is not a problem because they are not used outside the kernel.
>  unsafe impl FromBytes for GspMem {}
>  
> -/// Wrapper around [`GspMem`] to share it with the GPU using a [`CoherentAllocation`].
> +/// Wrapper around [`GspMem`] to share it with the GPU using a [`Coherent`].
>  ///
>  /// This provides the low-level functionality to communicate with the GSP, including allocation of
>  /// queue space to write messages to and management of read/write pointers.
> @@ -218,7 +218,7 @@ unsafe impl FromBytes for GspMem {}
>  ///   pointer and the GSP read pointer. This region is returned by [`Self::driver_write_area`].
>  /// * The driver owns (i.e. can read from) the part of the GSP message queue between the CPU read
>  ///   pointer and the GSP write pointer. This region is returned by [`Self::driver_read_area`].
> -struct DmaGspMem(CoherentAllocation<GspMem>);
> +struct DmaGspMem(Coherent<GspMem>);
>  
>  impl DmaGspMem {
>      /// Allocate a new instance and map it for `dev`.
> @@ -226,21 +226,20 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
>          const MSGQ_SIZE: u32 = num::usize_into_u32::<{ size_of::<Msgq>() }>();
>          const RX_HDR_OFF: u32 = num::usize_into_u32::<{ mem::offset_of!(Msgq, rx) }>();
>  
> -        let gsp_mem =
> -            CoherentAllocation::<GspMem>::alloc_coherent(dev, 1, GFP_KERNEL | __GFP_ZERO)?;
> +        let gsp_mem = Coherent::<GspMem>::zeroed(dev, GFP_KERNEL)?;
>  
>          let start = gsp_mem.dma_handle();
>          // Write values one by one to avoid an on-stack instance of `PteArray`.
>          for i in 0..GspMem::PTE_ARRAY_SIZE {
> -            dma_write!(gsp_mem, [0]?.ptes.0[i], PteArray::<0>::entry(start, i)?);
> +            dma_write!(gsp_mem, .ptes.0[i], PteArray::<0>::entry(start, i)?);
>          }
>  
>          dma_write!(
>              gsp_mem,
> -            [0]?.cpuq.tx,
> +            .cpuq.tx,
>              MsgqTxHeader::new(MSGQ_SIZE, RX_HDR_OFF, MSGQ_NUM_PAGES)
>          );
> -        dma_write!(gsp_mem, [0]?.cpuq.rx, MsgqRxHeader::new());
> +        dma_write!(gsp_mem, .cpuq.rx, MsgqRxHeader::new());
>  
>          Ok(Self(gsp_mem))
>      }
> @@ -255,10 +254,9 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
>          let rx = self.gsp_read_ptr() as usize;
>  
>          // SAFETY:
> -        // - The `CoherentAllocation` contains exactly one object.
>          // - We will only access the driver-owned part of the shared memory.
>          // - Per the safety statement of the function, no concurrent access will be performed.
> -        let gsp_mem = &mut unsafe { self.0.as_slice_mut(0, 1) }.unwrap()[0];
> +        let gsp_mem = unsafe { &mut *self.0.as_mut() };
>          // PANIC: per the invariant of `cpu_write_ptr`, `tx` is `< MSGQ_NUM_PAGES`.
>          let (before_tx, after_tx) = gsp_mem.cpuq.msgq.data.split_at_mut(tx);
>  
> @@ -309,10 +307,9 @@ fn driver_write_area_size(&self) -> usize {
>          let rx = self.cpu_read_ptr() as usize;
>  
>          // SAFETY:
> -        // - The `CoherentAllocation` contains exactly one object.
>          // - We will only access the driver-owned part of the shared memory.
>          // - Per the safety statement of the function, no concurrent access will be performed.
> -        let gsp_mem = &unsafe { self.0.as_slice(0, 1) }.unwrap()[0];
> +        let gsp_mem = unsafe { &*self.0.as_ptr() };
>          let data = &gsp_mem.gspq.msgq.data;
>  
>          // The area starting at `rx` and ending at `tx - 1` modulo MSGQ_NUM_PAGES, inclusive,
> diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
> index 0d8daf6a80b7..847b5eb215d4 100644
> --- a/drivers/gpu/nova-core/gsp/fw.rs
> +++ b/drivers/gpu/nova-core/gsp/fw.rs
> @@ -40,8 +40,7 @@
>      },
>  };
>  
> -// TODO: Replace with `IoView` projections once available; the `unwrap()` calls go away once we
> -// switch to the new `dma::Coherent` API.
> +// TODO: Replace with `IoView` projections once available.
>  pub(super) mod gsp_mem {
>      use core::sync::atomic::{
>          fence,
> @@ -49,10 +48,9 @@ pub(super) mod gsp_mem {
>      };
>  
>      use kernel::{
> -        dma::CoherentAllocation,
> +        dma::Coherent,
>          dma_read,
> -        dma_write,
> -        prelude::*, //
> +        dma_write, //
>      };
>  
>      use crate::gsp::cmdq::{
> @@ -60,49 +58,35 @@ pub(super) mod gsp_mem {
>          MSGQ_NUM_PAGES, //
>      };
>  
> -    pub(in crate::gsp) fn gsp_write_ptr(qs: &CoherentAllocation<GspMem>) -> u32 {
> -        // PANIC: A `dma::CoherentAllocation` always contains at least one element.
> -        || -> Result<u32> { Ok(dma_read!(qs, [0]?.gspq.tx.0.writePtr) % MSGQ_NUM_PAGES) }().unwrap()
> +    pub(in crate::gsp) fn gsp_write_ptr(qs: &Coherent<GspMem>) -> u32 {
> +        dma_read!(qs, .gspq.tx.0.writePtr) % MSGQ_NUM_PAGES
>      }
>  
> -    pub(in crate::gsp) fn gsp_read_ptr(qs: &CoherentAllocation<GspMem>) -> u32 {
> -        // PANIC: A `dma::CoherentAllocation` always contains at least one element.
> -        || -> Result<u32> { Ok(dma_read!(qs, [0]?.gspq.rx.0.readPtr) % MSGQ_NUM_PAGES) }().unwrap()
> +    pub(in crate::gsp) fn gsp_read_ptr(qs: &Coherent<GspMem>) -> u32 {
> +        dma_read!(qs, .gspq.rx.0.readPtr) % MSGQ_NUM_PAGES
>      }
>  
> -    pub(in crate::gsp) fn cpu_read_ptr(qs: &CoherentAllocation<GspMem>) -> u32 {
> -        // PANIC: A `dma::CoherentAllocation` always contains at least one element.
> -        || -> Result<u32> { Ok(dma_read!(qs, [0]?.cpuq.rx.0.readPtr) % MSGQ_NUM_PAGES) }().unwrap()
> +    pub(in crate::gsp) fn cpu_read_ptr(qs: &Coherent<GspMem>) -> u32 {
> +        dma_read!(qs, .cpuq.rx.0.readPtr) % MSGQ_NUM_PAGES
>      }
>  
> -    pub(in crate::gsp) fn advance_cpu_read_ptr(qs: &CoherentAllocation<GspMem>, count: u32) {
> +    pub(in crate::gsp) fn advance_cpu_read_ptr(qs: &Coherent<GspMem>, count: u32) {
>          let rptr = cpu_read_ptr(qs).wrapping_add(count) % MSGQ_NUM_PAGES;
>  
>          // Ensure read pointer is properly ordered.
>          fence(Ordering::SeqCst);
>  
> -        // PANIC: A `dma::CoherentAllocation` always contains at least one element.
> -        || -> Result {
> -            dma_write!(qs, [0]?.cpuq.rx.0.readPtr, rptr);
> -            Ok(())
> -        }()
> -        .unwrap()
> +        dma_write!(qs, .cpuq.rx.0.readPtr, rptr);
>      }
>  
> -    pub(in crate::gsp) fn cpu_write_ptr(qs: &CoherentAllocation<GspMem>) -> u32 {
> -        // PANIC: A `dma::CoherentAllocation` always contains at least one element.
> -        || -> Result<u32> { Ok(dma_read!(qs, [0]?.cpuq.tx.0.writePtr) % MSGQ_NUM_PAGES) }().unwrap()
> +    pub(in crate::gsp) fn cpu_write_ptr(qs: &Coherent<GspMem>) -> u32 {
> +        dma_read!(qs, .cpuq.tx.0.writePtr) % MSGQ_NUM_PAGES
>      }
>  
> -    pub(in crate::gsp) fn advance_cpu_write_ptr(qs: &CoherentAllocation<GspMem>, count: u32) {
> +    pub(in crate::gsp) fn advance_cpu_write_ptr(qs: &Coherent<GspMem>, count: u32) {
>          let wptr = cpu_write_ptr(qs).wrapping_add(count) % MSGQ_NUM_PAGES;
>  
> -        // PANIC: A `dma::CoherentAllocation` always contains at least one element.
> -        || -> Result {
> -            dma_write!(qs, [0]?.cpuq.tx.0.writePtr, wptr);
> -            Ok(())
> -        }()
> -        .unwrap();
> +        dma_write!(qs, .cpuq.tx.0.writePtr, wptr);
>  
>          // Ensure all command data is visible before triggering the GSP read.
>          fence(Ordering::SeqCst);


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 8/8] gpu: nova-core: convert to new dma::Coherent API
  2026-03-21 16:50   ` Gary Guo
@ 2026-03-21 18:22     ` Danilo Krummrich
  2026-03-21 22:36       ` Gary Guo
  0 siblings, 1 reply; 31+ messages in thread
From: Danilo Krummrich @ 2026-03-21 18:22 UTC (permalink / raw)
  To: Gary Guo
  Cc: aliceryhl, acourbot, ojeda, boqun, bjorn3_gh, lossin, a.hindborg,
	tmgross, abdiel.janulgue, daniel.almeida, robin.murphy,
	driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On 3/21/26 5:50 PM, Gary Guo wrote:
> It looks like with Alex's "gpu: nova-core: create falcon firmware DMA objects
> lazily" landed, all others users of the old API are now gone.

Good catch.

> So this line could be dropped and `impl CoherentAllocation` and the type alias
> can be removed after this patch.

I will drop the line on apply and add the below patch to remove
CoherentAllocation.

- Danilo

commit e96bc0b65bec48ac0f1cf2fc15b39c1b26b9d973 (HEAD -> drm-rust-next)
Author: Danilo Krummrich <dakr@kernel.org>
Date:   Sat Mar 21 19:18:08 2026 +0100

    rust: dma: remove dma::CoherentAllocation<T>

    Now that everything has been converted to the new dma::Coherent<T> API,
    remove dma::CoherentAllocation<T>.

    Suggested-by: Gary Guo <gary@garyguo.net>
    Signed-off-by: Danilo Krummrich <dakr@kernel.org>

diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
index 6d2bec52806b..779d4babab9a 100644
--- a/rust/kernel/dma.rs
+++ b/rust/kernel/dma.rs
@@ -841,156 +841,6 @@ pub fn len(&self) -> usize {
     }
 }

-// Type alias for compatibility.
-#[doc(hidden)]
-pub type CoherentAllocation<T> = Coherent<[T]>;
-
-impl<T: AsBytes + FromBytes> CoherentAllocation<T> {
-    /// Allocates a region of `size_of::<T> * count` of coherent memory.
-    ///
-    /// # Examples
-    ///
-    /// ```
-    /// # use kernel::device::{Bound, Device};
-    /// use kernel::dma::{attrs::*, CoherentAllocation};
-    ///
-    /// # fn test(dev: &Device<Bound>) -> Result {
-    /// let c: CoherentAllocation<u64> =
-    ///     CoherentAllocation::alloc_attrs(dev, 4, GFP_KERNEL, DMA_ATTR_NO_WARN)?;
-    /// # Ok::<(), Error>(()) }
-    /// ```
-    pub fn alloc_attrs(
-        dev: &device::Device<Bound>,
-        count: usize,
-        gfp_flags: kernel::alloc::Flags,
-        dma_attrs: Attrs,
-    ) -> Result<CoherentAllocation<T>> {
-        Coherent::alloc_slice_with_attrs(dev, count, gfp_flags, dma_attrs)
-    }
-
-    /// Performs the same functionality as [`CoherentAllocation::alloc_attrs`], except the
-    /// `dma_attrs` is 0 by default.
-    pub fn alloc_coherent(
-        dev: &device::Device<Bound>,
-        count: usize,
-        gfp_flags: kernel::alloc::Flags,
-    ) -> Result<CoherentAllocation<T>> {
-        CoherentAllocation::alloc_attrs(dev, count, gfp_flags, Attrs(0))
-    }
-
-    /// Returns the base address to the allocated region in the CPU's virtual address space.
-    pub fn start_ptr(&self) -> *const T {
-        self.as_ptr().cast()
-    }
-
-    /// Returns the base address to the allocated region in the CPU's virtual address space as
-    /// a mutable pointer.
-    pub fn start_ptr_mut(&mut self) -> *mut T {
-        self.as_mut_ptr().cast()
-    }
-
-    /// Returns a DMA handle starting at `offset` (in units of `T`) which may be given to the
-    /// device as the DMA address base of the region.
-    ///
-    /// Returns `EINVAL` if `offset` is not within the bounds of the allocation.
-    pub fn dma_handle_with_offset(&self, offset: usize) -> Result<DmaAddress> {
-        if offset >= self.len() {
-            Err(EINVAL)
-        } else {
-            Ok(self.dma_handle + (offset * core::mem::size_of::<T>()) as DmaAddress)
-        }
-    }
-
-    /// Common helper to validate a range applied from the allocated region in the CPU's virtual
-    /// address space.
-    fn validate_range(&self, offset: usize, count: usize) -> Result {
-        if offset.checked_add(count).ok_or(EOVERFLOW)? > self.len() {
-            return Err(EINVAL);
-        }
-        Ok(())
-    }
-
-    /// Returns the data from the region starting from `offset` as a slice.
-    /// `offset` and `count` are in units of `T`, not the number of bytes.
-    ///
-    /// For ringbuffer type of r/w access or use-cases where the pointer to the live data is needed,
-    /// [`CoherentAllocation::start_ptr`] or [`CoherentAllocation::start_ptr_mut`] could be used
-    /// instead.
-    ///
-    /// # Safety
-    ///
-    /// * Callers must ensure that the device does not read/write to/from memory while the returned
-    ///   slice is live.
-    /// * Callers must ensure that this call does not race with a write to the same region while
-    ///   the returned slice is live.
-    pub unsafe fn as_slice(&self, offset: usize, count: usize) -> Result<&[T]> {
-        self.validate_range(offset, count)?;
-        // SAFETY:
-        // - The pointer is valid due to type invariant on `CoherentAllocation`,
-        //   we've just checked that the range and index is within bounds. The immutability of the
-        //   data is also guaranteed by the safety requirements of the function.
-        // - `offset + count` can't overflow since it is smaller than `self.count` and we've checked
-        //   that `self.count` won't overflow early in the constructor.
-        Ok(unsafe { core::slice::from_raw_parts(self.start_ptr().add(offset), count) })
-    }
-
-    /// Performs the same functionality as [`CoherentAllocation::as_slice`], except that a mutable
-    /// slice is returned.
-    ///
-    /// # Safety
-    ///
-    /// * Callers must ensure that the device does not read/write to/from memory while the returned
-    ///   slice is live.
-    /// * Callers must ensure that this call does not race with a read or write to the same region
-    ///   while the returned slice is live.
-    pub unsafe fn as_slice_mut(&mut self, offset: usize, count: usize) -> Result<&mut [T]> {
-        self.validate_range(offset, count)?;
-        // SAFETY:
-        // - The pointer is valid due to type invariant on `CoherentAllocation`,
-        //   we've just checked that the range and index is within bounds. The immutability of the
-        //   data is also guaranteed by the safety requirements of the function.
-        // - `offset + count` can't overflow since it is smaller than `self.count` and we've checked
-        //   that `self.count` won't overflow early in the constructor.
-        Ok(unsafe { core::slice::from_raw_parts_mut(self.start_ptr_mut().add(offset), count) })
-    }
-
-    /// Writes data to the region starting from `offset`. `offset` is in units of `T`, not the
-    /// number of bytes.
-    ///
-    /// # Safety
-    ///
-    /// * Callers must ensure that this call does not race with a read or write to the same region
-    ///   that overlaps with this write.
-    ///
-    /// # Examples
-    ///
-    /// ```
-    /// # fn test(alloc: &mut kernel::dma::CoherentAllocation<u8>) -> Result {
-    /// let somedata: [u8; 4] = [0xf; 4];
-    /// let buf: &[u8] = &somedata;
-    /// // SAFETY: There is no concurrent HW operation on the device and no other R/W access to the
-    /// // region.
-    /// unsafe { alloc.write(buf, 0)?; }
-    /// # Ok::<(), Error>(()) }
-    /// ```
-    pub unsafe fn write(&mut self, src: &[T], offset: usize) -> Result {
-        self.validate_range(offset, src.len())?;
-        // SAFETY:
-        // - The pointer is valid due to type invariant on `CoherentAllocation`
-        //   and we've just checked that the range and index is within bounds.
-        // - `offset + count` can't overflow since it is smaller than `self.count` and we've checked
-        //   that `self.count` won't overflow early in the constructor.
-        unsafe {
-            core::ptr::copy_nonoverlapping(
-                src.as_ptr(),
-                self.start_ptr_mut().add(offset),
-                src.len(),
-            )
-        };
-        Ok(())
-    }
-}
-
 /// Note that the device configured to do DMA must be halted before this object is dropped.
 impl<T: KnownSize + ?Sized> Drop for Coherent<T> {
     fn drop(&mut self) {

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 8/8] gpu: nova-core: convert to new dma::Coherent API
  2026-03-21 18:22     ` Danilo Krummrich
@ 2026-03-21 22:36       ` Gary Guo
  0 siblings, 0 replies; 31+ messages in thread
From: Gary Guo @ 2026-03-21 22:36 UTC (permalink / raw)
  To: Danilo Krummrich, Gary Guo
  Cc: aliceryhl, acourbot, ojeda, boqun, bjorn3_gh, lossin, a.hindborg,
	tmgross, abdiel.janulgue, daniel.almeida, robin.murphy,
	driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On Sat Mar 21, 2026 at 6:22 PM GMT, Danilo Krummrich wrote:
> On 3/21/26 5:50 PM, Gary Guo wrote:
> > It looks like with Alex's "gpu: nova-core: create falcon firmware DMA objects
> > lazily" landed, all others users of the old API are now gone.
> 
> Good catch.
> 
> > So this line could be dropped and `impl CoherentAllocation` and the type alias
> > can be removed after this patch.
> 
> I will drop the line on apply and add the below patch to remove
> CoherentAllocation.
> 
> - Danilo
> 
> commit e96bc0b65bec48ac0f1cf2fc15b39c1b26b9d973 (HEAD -> drm-rust-next)
> Author: Danilo Krummrich <dakr@kernel.org>
> Date:   Sat Mar 21 19:18:08 2026 +0100
> 
>     rust: dma: remove dma::CoherentAllocation<T>
> 
>     Now that everything has been converted to the new dma::Coherent<T> API,
>     remove dma::CoherentAllocation<T>.
> 
>     Suggested-by: Gary Guo <gary@garyguo.net>
>     Signed-off-by: Danilo Krummrich <dakr@kernel.org>

Reviewed-by: Gary Guo <gary@garyguo.net>



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 2/8] rust: dma: add generalized container for types other than slices
  2026-03-20 19:45 ` [PATCH v2 2/8] rust: dma: add generalized container for types other than slices Danilo Krummrich
@ 2026-03-21 23:22   ` Aditya Rajan
  2026-03-22  0:47     ` Gary Guo
  2026-03-24 13:42   ` Andreas Hindborg
  1 sibling, 1 reply; 31+ messages in thread
From: Aditya Rajan @ 2026-03-21 23:22 UTC (permalink / raw)
  To: Danilo Krummrich, aliceryhl, acourbot, ojeda, boqun, gary,
	bjorn3_gh, lossin, a.hindborg, tmgross, abdiel.janulgue,
	daniel.almeida, robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On Fri Mar 20, 2026 at 12:45 PM PDT, Danilo Krummrich wrote:
> diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
> index 2eea7e2f8f04..ff3e147f1a23 100644
> --- a/rust/kernel/dma.rs
> +++ b/rust/kernel/dma.rs
> +impl<T> Coherent<[T]> {
>      /// Returns the number of elements `T` in this allocation.
>      ///
>      /// Note that this is not the size of the allocation in bytes, which is provided by
>      /// [`Self::size`].
> -    pub fn count(&self) -> usize {
> -        self.count
There is still a mention of `self.count` in the safety comments; perhaps that should be changed to `self.len()` ?
For example, see the safety comments in the following functions: `as_slice`, `as_slice_mut`, and `write` in same file.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 2/8] rust: dma: add generalized container for types other than slices
  2026-03-21 23:22   ` Aditya Rajan
@ 2026-03-22  0:47     ` Gary Guo
  2026-03-22  6:24       ` Aditya Rajan
  0 siblings, 1 reply; 31+ messages in thread
From: Gary Guo @ 2026-03-22  0:47 UTC (permalink / raw)
  To: Aditya Rajan, Danilo Krummrich, aliceryhl, acourbot, ojeda, boqun,
	gary, bjorn3_gh, lossin, a.hindborg, tmgross, abdiel.janulgue,
	daniel.almeida, robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On Sat Mar 21, 2026 at 11:22 PM GMT, Aditya Rajan wrote:
> On Fri Mar 20, 2026 at 12:45 PM PDT, Danilo Krummrich wrote:
>> diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
>> index 2eea7e2f8f04..ff3e147f1a23 100644
>> --- a/rust/kernel/dma.rs
>> +++ b/rust/kernel/dma.rs
>> +impl<T> Coherent<[T]> {
>>      /// Returns the number of elements `T` in this allocation.
>>      ///
>>      /// Note that this is not the size of the allocation in bytes, which is provided by
>>      /// [`Self::size`].
>> -    pub fn count(&self) -> usize {
>> -        self.count
> There is still a mention of `self.count` in the safety comments; perhaps that should be changed to `self.len()` ?
> For example, see the safety comments in the following functions: `as_slice`, `as_slice_mut`, and `write` in same file.

They're going to be removed later, see https://lore.kernel.org/rust-for-linux/DH87CCNRR5EC.2SK3IT6N6Q8V5@nvidia.com/T/#m30142b65bb87af64860838565c784c304e02b563

Best,
Gary

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 2/8] rust: dma: add generalized container for types other than slices
  2026-03-22  0:47     ` Gary Guo
@ 2026-03-22  6:24       ` Aditya Rajan
  0 siblings, 0 replies; 31+ messages in thread
From: Aditya Rajan @ 2026-03-22  6:24 UTC (permalink / raw)
  To: Gary Guo, Aditya Rajan, Danilo Krummrich, aliceryhl, acourbot,
	ojeda, boqun, bjorn3_gh, lossin, a.hindborg, tmgross,
	abdiel.janulgue, daniel.almeida, robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On Sat Mar 21, 2026 at 5:47 PM PDT, Gary Guo wrote:
> They're going to be removed later, see https://lore.kernel.org/rust-for-linux/DH87CCNRR5EC.2SK3IT6N6Q8V5@nvidia.com/T/#m30142b65bb87af64860838565c784c304e02b563
Got it ! Patch LGTM !

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/8] dma::Coherent & dma::CoherentBox API
  2026-03-20 19:45 [PATCH v2 0/8] dma::Coherent & dma::CoherentBox API Danilo Krummrich
                   ` (8 preceding siblings ...)
  2026-03-21  5:13 ` [PATCH v2 0/8] dma::Coherent & dma::CoherentBox API Alexandre Courbot
@ 2026-03-23 21:56 ` Danilo Krummrich
  9 siblings, 0 replies; 31+ messages in thread
From: Danilo Krummrich @ 2026-03-23 21:56 UTC (permalink / raw)
  To: aliceryhl, acourbot, ojeda, boqun, gary, bjorn3_gh, lossin,
	a.hindborg, tmgross, abdiel.janulgue, daniel.almeida,
	robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel,
	Danilo Krummrich

On Fri Mar 20, 2026 at 8:45 PM CET, Danilo Krummrich wrote:

Applied to drm-rust-next, thanks!

> Danilo Krummrich (5):
>   rust: dma: use "kernel vertical" style for imports
>   rust: dma: introduce dma::CoherentBox for memory initialization

    [ Remove unnecessary trait bounds. - Danilo ]

>   rust: dma: add Coherent:init() and Coherent::init_with_attrs()
>   gpu: nova-core: use Coherent::init to initialize GspFwWprMeta
>   gpu: nova-core: convert Gsp::new() to use CoherentBox

    rust: dma: remove dma::CoherentAllocation<T>

> Gary Guo (3):
>   rust: dma: add generalized container for types other than slices
>   rust: dma: add zeroed constructor to `Coherent`

    [ Use kernel import style. - Danilo ]

>   gpu: nova-core: convert to new dma::Coherent API

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 2/8] rust: dma: add generalized container for types other than slices
  2026-03-20 19:45 ` [PATCH v2 2/8] rust: dma: add generalized container for types other than slices Danilo Krummrich
  2026-03-21 23:22   ` Aditya Rajan
@ 2026-03-24 13:42   ` Andreas Hindborg
  2026-03-24 14:06     ` Gary Guo
  1 sibling, 1 reply; 31+ messages in thread
From: Andreas Hindborg @ 2026-03-24 13:42 UTC (permalink / raw)
  To: Danilo Krummrich, aliceryhl, acourbot, ojeda, boqun, gary,
	bjorn3_gh, lossin, tmgross, abdiel.janulgue, daniel.almeida,
	robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel,
	Danilo Krummrich

"Danilo Krummrich" <dakr@kernel.org> writes:

> From: Gary Guo <gary@garyguo.net>
>
> Currently, `CoherentAllocation` is concecptually a DMA coherent container
> of a slice of `[T]` of runtime-checked length. Generalize it by creating
> `dma::Coherent<T>` which can hold any value of `T`.
> `Coherent::alloc_with_attrs` is implemented but not yet exposed, as I
> believe we should not expose the way to obtain an uninitialized coherent
> region.
>
> `Coherent<[T]>` provides a `len` method instead of the previous `count()`
> method to be consistent with methods on slices.
>
> The existing type is re-defined as a type alias of `Coherent<[T]>` to ease
> transition. Methods in use are not yet removed.
>
> Signed-off-by: Gary Guo <gary@garyguo.net>
> Reviewed-by: Alice Ryhl <aliceryhl@google.com>
> Signed-off-by: Danilo Krummrich <dakr@kernel.org>
> ---
>  rust/kernel/device.rs |   4 +-
>  rust/kernel/dma.rs    | 361 ++++++++++++++++++++++++++----------------
>  2 files changed, 226 insertions(+), 139 deletions(-)
>
> diff --git a/rust/kernel/device.rs b/rust/kernel/device.rs
> index 94e0548e7687..379058eca2ed 100644
> --- a/rust/kernel/device.rs
> +++ b/rust/kernel/device.rs
> @@ -575,7 +575,7 @@ pub trait DeviceContext: private::Sealed {}
>  /// The bound context indicates that for the entire duration of the lifetime of a [`Device<Bound>`]
>  /// reference, the [`Device`] is guaranteed to be bound to a driver.
>  ///
> -/// Some APIs, such as [`dma::CoherentAllocation`] or [`Devres`] rely on the [`Device`] to be bound,
> +/// Some APIs, such as [`dma::Coherent`] or [`Devres`] rely on the [`Device`] to be bound,
>  /// which can be proven with the [`Bound`] device context.
>  ///
>  /// Any abstraction that can guarantee a scope where the corresponding bus device is bound, should
> @@ -584,7 +584,7 @@ pub trait DeviceContext: private::Sealed {}
>  ///
>  /// [`Devres`]: kernel::devres::Devres
>  /// [`Devres::access`]: kernel::devres::Devres::access
> -/// [`dma::CoherentAllocation`]: kernel::dma::CoherentAllocation
> +/// [`dma::Coherent`]: kernel::dma::Coherent
>  pub struct Bound;
>
>  mod private {
> diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
> index 2eea7e2f8f04..ff3e147f1a23 100644
> --- a/rust/kernel/dma.rs
> +++ b/rust/kernel/dma.rs
> @@ -6,7 +6,6 @@
>
>  use crate::{
>      bindings,
> -    build_assert,
>      device::{
>          self,
>          Bound,
> @@ -14,6 +13,7 @@
>      },
>      error::to_result,
>      prelude::*,
> +    ptr::KnownSize,
>      sync::aref::ARef,
>      transmute::{
>          AsBytes,
> @@ -357,18 +357,17 @@ fn from(direction: DataDirection) -> Self {
>  /// This is an abstraction around the `dma_alloc_coherent` API which is used to allocate and map
>  /// large coherent DMA regions.
>  ///
> -/// A [`CoherentAllocation`] instance contains a pointer to the allocated region (in the
> +/// A [`Coherent`] instance contains a pointer to the allocated region (in the
>  /// processor's virtual address space) and the device address which can be given to the device
> -/// as the DMA address base of the region. The region is released once [`CoherentAllocation`]
> +/// as the DMA address base of the region. The region is released once [`Coherent`]
>  /// is dropped.
>  ///
>  /// # Invariants
>  ///
> -/// - For the lifetime of an instance of [`CoherentAllocation`], the `cpu_addr` is a valid pointer
> +/// - For the lifetime of an instance of [`Coherent`], the `cpu_addr` is a valid pointer
>  ///   to an allocated region of coherent memory and `dma_handle` is the DMA address base of the
>  ///   region.
> -/// - The size in bytes of the allocation is equal to `size_of::<T> * count`.
> -/// - `size_of::<T> * count` fits into a `usize`.
> +/// - The size in bytes of the allocation is equal to size information via pointer.

Should this be "equal to the size reported by `KnownSize::size`". I
don't follow the current wording "via pointer".

>  // TODO
>  //
>  // DMA allocations potentially carry device resources (e.g.IOMMU mappings), hence for soundness
> @@ -379,124 +378,260 @@ fn from(direction: DataDirection) -> Self {
>  // allocation from surviving device unbind; it would require RCU read side critical sections to
>  // access the memory, which may require subsequent unnecessary copies.
>  //
> -// Hence, find a way to revoke the device resources of a `CoherentAllocation`, but not the
> -// entire `CoherentAllocation` including the allocated memory itself.
> -pub struct CoherentAllocation<T: AsBytes + FromBytes> {
> +// Hence, find a way to revoke the device resources of a `Coherent`, but not the
> +// entire `Coherent` including the allocated memory itself.
> +pub struct Coherent<T: KnownSize + ?Sized> {
>      dev: ARef<device::Device>,
>      dma_handle: DmaAddress,
> -    count: usize,
>      cpu_addr: NonNull<T>,
>      dma_attrs: Attrs,
>  }
>
> -impl<T: AsBytes + FromBytes> CoherentAllocation<T> {
> -    /// Allocates a region of `size_of::<T> * count` of coherent memory.
> +impl<T: KnownSize + ?Sized> Coherent<T> {
> +    /// Returns the size in bytes of this allocation.
> +    #[inline]
> +    pub fn size(&self) -> usize {
> +        T::size(self.cpu_addr.as_ptr())
> +    }
> +
> +    /// Returns the raw pointer to the allocated region in the CPU's virtual address space.
> +    #[inline]
> +    pub fn as_ptr(&self) -> *const T {
> +        self.cpu_addr.as_ptr()
> +    }
> +
> +    /// Returns the raw pointer to the allocated region in the CPU's virtual address space as
> +    /// a mutable pointer.
> +    #[inline]
> +    pub fn as_mut_ptr(&self) -> *mut T {
> +        self.cpu_addr.as_ptr()
> +    }
> +
> +    /// Returns a DMA handle which may be given to the device as the DMA address base of
> +    /// the region.
> +    #[inline]
> +    pub fn dma_handle(&self) -> DmaAddress {
> +        self.dma_handle
> +    }
> +
> +    /// Returns a reference to the data in the region.
>      ///
> -    /// # Examples
> +    /// # Safety
>      ///
> -    /// ```
> -    /// # use kernel::device::{Bound, Device};
> -    /// use kernel::dma::{attrs::*, CoherentAllocation};
> +    /// * Callers must ensure that the device does not read/write to/from memory while the returned
> +    ///   slice is live.
> +    /// * Callers must ensure that this call does not race with a write to the same region while
> +    ///   the returned slice is live.
> +    #[inline]
> +    pub unsafe fn as_ref(&self) -> &T {
> +        // SAFETY: per safety requirement.

Is this enough? Don't you need to assert a valid bit pattern? I assume
you get this from `FromBytes`, but that bound is not on `T` in this impl
block.

> +        unsafe { &*self.as_ptr() }
> +    }
> +
> +    /// Returns a mutable reference to the data in the region.
>      ///
> -    /// # fn test(dev: &Device<Bound>) -> Result {
> -    /// let c: CoherentAllocation<u64> =
> -    ///     CoherentAllocation::alloc_attrs(dev, 4, GFP_KERNEL, DMA_ATTR_NO_WARN)?;
> -    /// # Ok::<(), Error>(()) }
> -    /// ```
> -    pub fn alloc_attrs(
> +    /// # Safety
> +    ///
> +    /// * Callers must ensure that the device does not read/write to/from memory while the returned
> +    ///   slice is live.
> +    /// * Callers must ensure that this call does not race with a read or write to the same region
> +    ///   while the returned slice is live.
> +    #[expect(clippy::mut_from_ref, reason = "unsafe to use API")]
> +    #[inline]
> +    pub unsafe fn as_mut(&self) -> &mut T {
> +        // SAFETY: per safety requirement.

Same.

> +        unsafe { &mut *self.as_mut_ptr() }
> +    }
> +
> +    /// Reads the value of `field` and ensures that its type is [`FromBytes`].
> +    ///
> +    /// # Safety
> +    ///
> +    /// This must be called from the [`dma_read`] macro which ensures that the `field` pointer is
> +    /// validated beforehand.
> +    ///
> +    /// Public but hidden since it should only be used from [`dma_read`] macro.
> +    #[doc(hidden)]
> +    pub unsafe fn field_read<F: FromBytes>(&self, field: *const F) -> F {
> +        // SAFETY:
> +        // - By the safety requirements field is valid.
> +        // - Using read_volatile() here is not sound as per the usual rules, the usage here is
> +        // a special exception with the following notes in place. When dealing with a potential
> +        // race from a hardware or code outside kernel (e.g. user-space program), we need that
> +        // read on a valid memory is not UB. Currently read_volatile() is used for this, and the
> +        // rationale behind is that it should generate the same code as READ_ONCE() which the
> +        // kernel already relies on to avoid UB on data races. Note that the usage of
> +        // read_volatile() is limited to this particular case, it cannot be used to prevent
> +        // the UB caused by racing between two kernel functions nor do they provide atomicity.
> +        unsafe { field.read_volatile() }
> +    }
> +
> +    /// Writes a value to `field` and ensures that its type is [`AsBytes`].
> +    ///
> +    /// # Safety
> +    ///
> +    /// This must be called from the [`dma_write`] macro which ensures that the `field` pointer is
> +    /// validated beforehand.
> +    ///
> +    /// Public but hidden since it should only be used from [`dma_write`] macro.
> +    #[doc(hidden)]
> +    pub unsafe fn field_write<F: AsBytes>(&self, field: *mut F, val: F) {
> +        // SAFETY:
> +        // - By the safety requirements field is valid.
> +        // - Using write_volatile() here is not sound as per the usual rules, the usage here is
> +        // a special exception with the following notes in place. When dealing with a potential
> +        // race from a hardware or code outside kernel (e.g. user-space program), we need that
> +        // write on a valid memory is not UB. Currently write_volatile() is used for this, and the
> +        // rationale behind is that it should generate the same code as WRITE_ONCE() which the
> +        // kernel already relies on to avoid UB on data races. Note that the usage of
> +        // write_volatile() is limited to this particular case, it cannot be used to prevent
> +        // the UB caused by racing between two kernel functions nor do they provide atomicity.
> +        unsafe { field.write_volatile(val) }
> +    }
> +}
> +
> +impl<T: AsBytes + FromBytes> Coherent<T> {
> +    /// Allocates a region of `T` of coherent memory.
> +    #[expect(unused)]
> +    fn alloc_with_attrs(
>          dev: &device::Device<Bound>,
> -        count: usize,
>          gfp_flags: kernel::alloc::Flags,
>          dma_attrs: Attrs,
> -    ) -> Result<CoherentAllocation<T>> {
> -        build_assert!(
> -            core::mem::size_of::<T>() > 0,
> -            "It doesn't make sense for the allocated type to be a ZST"
> -        );
> +    ) -> Result<Self> {
> +        const {
> +            assert!(
> +                core::mem::size_of::<T>() > 0,
> +                "It doesn't make sense for the allocated type to be a ZST"
> +            );
> +        }
>
> -        let size = count
> -            .checked_mul(core::mem::size_of::<T>())
> -            .ok_or(EOVERFLOW)?;
>          let mut dma_handle = 0;
>          // SAFETY: Device pointer is guaranteed as valid by the type invariant on `Device`.
>          let addr = unsafe {
>              bindings::dma_alloc_attrs(
>                  dev.as_raw(),
> -                size,
> +                core::mem::size_of::<T>(),
>                  &mut dma_handle,
>                  gfp_flags.as_raw(),
>                  dma_attrs.as_raw(),
>              )
>          };
> -        let addr = NonNull::new(addr).ok_or(ENOMEM)?;
> +        let cpu_addr = NonNull::new(addr.cast()).ok_or(ENOMEM)?;
>          // INVARIANT:
> -        // - We just successfully allocated a coherent region which is accessible for
> -        //   `count` elements, hence the cpu address is valid. We also hold a refcounted reference
> -        //   to the device.
> -        // - The allocated `size` is equal to `size_of::<T> * count`.
> -        // - The allocated `size` fits into a `usize`.
> +        // - We just successfully allocated a coherent region which is adequately sized for `T`,
> +        //   hence the cpu address is valid.

The invariant says "valid for the lifetime", do you need to argue for
the duration of the validity?

> +        // - We also hold a refcounted reference to the device.

This does not relate to the second invariant of `Self`.

>          Ok(Self {
>              dev: dev.into(),
>              dma_handle,
> -            count,
> -            cpu_addr: addr.cast(),
> +            cpu_addr,
>              dma_attrs,
>          })
>      }
>
> -    /// Performs the same functionality as [`CoherentAllocation::alloc_attrs`], except the
> -    /// `dma_attrs` is 0 by default.
> -    pub fn alloc_coherent(
> +    /// Allocates a region of `[T; len]` of coherent memory.
> +    fn alloc_slice_with_attrs(
>          dev: &device::Device<Bound>,
> -        count: usize,
> +        len: usize,
>          gfp_flags: kernel::alloc::Flags,
> -    ) -> Result<CoherentAllocation<T>> {
> -        CoherentAllocation::alloc_attrs(dev, count, gfp_flags, Attrs(0))
> +        dma_attrs: Attrs,
> +    ) -> Result<Coherent<[T]>> {
> +        const {
> +            assert!(
> +                core::mem::size_of::<T>() > 0,
> +                "It doesn't make sense for the allocated type to be a ZST"
> +            );
> +        }
> +
> +        // `dma_alloc_attrs` cannot handle zero-length allocation, bail early.
> +        if len == 0 {
> +            Err(EINVAL)?;
> +        }
> +
> +        let size = core::mem::size_of::<T>().checked_mul(len).ok_or(ENOMEM)?;
> +        let mut dma_handle = 0;
> +        // SAFETY: Device pointer is guaranteed as valid by the type invariant on `Device`.
> +        let addr = unsafe {
> +            bindings::dma_alloc_attrs(
> +                dev.as_raw(),
> +                size,
> +                &mut dma_handle,
> +                gfp_flags.as_raw(),
> +                dma_attrs.as_raw(),
> +            )
> +        };
> +        let cpu_addr = NonNull::slice_from_raw_parts(NonNull::new(addr.cast()).ok_or(ENOMEM)?, len);
> +        // INVARIANT:
> +        // - We just successfully allocated a coherent region which is adequately sized for
> +        //   `[T; len]`, hence the cpu address is valid.
> +        // - We also hold a refcounted reference to the device.

Same.


Best regards,
Andreas Hindborg



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 3/8] rust: dma: add zeroed constructor to `Coherent`
  2026-03-20 19:45 ` [PATCH v2 3/8] rust: dma: add zeroed constructor to `Coherent` Danilo Krummrich
  2026-03-21  6:37   ` Alexandre Courbot
@ 2026-03-24 13:46   ` Andreas Hindborg
  1 sibling, 0 replies; 31+ messages in thread
From: Andreas Hindborg @ 2026-03-24 13:46 UTC (permalink / raw)
  To: Danilo Krummrich, aliceryhl, acourbot, ojeda, boqun, gary,
	bjorn3_gh, lossin, tmgross, abdiel.janulgue, daniel.almeida,
	robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel,
	Danilo Krummrich

"Danilo Krummrich" <dakr@kernel.org> writes:

> From: Gary Guo <gary@garyguo.net>
>
> These constructors create a coherent container of a single object
> instead of slice. They are named `zeroed` and `zeroed_with_attrs` to
> emphasis that they are created initialized zeroed. It is intended that
> there'll be new constructors that take `PinInit` instead of zeroing.
>
> Signed-off-by: Gary Guo <gary@garyguo.net>
> Reviewed-by: Alice Ryhl <aliceryhl@google.com>
> Signed-off-by: Danilo Krummrich <dakr@kernel.org>
> ---
>  rust/kernel/dma.rs       | 81 ++++++++++++++++++++++++++++++++++++----
>  samples/rust/rust_dma.rs |  8 ++--
>  2 files changed, 77 insertions(+), 12 deletions(-)
>
> diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
> index ff3e147f1a23..db645b01bdd0 100644
> --- a/rust/kernel/dma.rs
> +++ b/rust/kernel/dma.rs
> @@ -47,7 +47,7 @@ pub trait Device: AsRef<device::Device<Core>> {
>      /// # Safety
>      ///
>      /// This method must not be called concurrently with any DMA allocation or mapping primitives,
> -    /// such as [`CoherentAllocation::alloc_attrs`].
> +    /// such as [`Coherent::zeroed`].
>      unsafe fn dma_set_mask(&self, mask: DmaMask) -> Result {
>          // SAFETY:
>          // - By the type invariant of `device::Device`, `self.as_ref().as_raw()` is valid.
> @@ -64,7 +64,7 @@ unsafe fn dma_set_mask(&self, mask: DmaMask) -> Result {
>      /// # Safety
>      ///
>      /// This method must not be called concurrently with any DMA allocation or mapping primitives,
> -    /// such as [`CoherentAllocation::alloc_attrs`].
> +    /// such as [`Coherent::zeroed`].
>      unsafe fn dma_set_coherent_mask(&self, mask: DmaMask) -> Result {
>          // SAFETY:
>          // - By the type invariant of `device::Device`, `self.as_ref().as_raw()` is valid.
> @@ -83,7 +83,7 @@ unsafe fn dma_set_coherent_mask(&self, mask: DmaMask) -> Result {
>      /// # Safety
>      ///
>      /// This method must not be called concurrently with any DMA allocation or mapping primitives,
> -    /// such as [`CoherentAllocation::alloc_attrs`].
> +    /// such as [`Coherent::zeroed`].
>      unsafe fn dma_set_mask_and_coherent(&self, mask: DmaMask) -> Result {
>          // SAFETY:
>          // - By the type invariant of `device::Device`, `self.as_ref().as_raw()` is valid.
> @@ -102,7 +102,7 @@ unsafe fn dma_set_mask_and_coherent(&self, mask: DmaMask) -> Result {
>      /// # Safety
>      ///
>      /// This method must not be called concurrently with any DMA allocation or mapping primitives,
> -    /// such as [`CoherentAllocation::alloc_attrs`].
> +    /// such as [`Coherent::zeroed`].
>      unsafe fn dma_set_max_seg_size(&self, size: u32) {
>          // SAFETY:
>          // - By the type invariant of `device::Device`, `self.as_ref().as_raw()` is valid.
> @@ -202,12 +202,12 @@ pub const fn value(&self) -> u64 {
>  ///
>  /// ```
>  /// # use kernel::device::{Bound, Device};
> -/// use kernel::dma::{attrs::*, CoherentAllocation};
> +/// use kernel::dma::{attrs::*, Coherent};
>  ///
>  /// # fn test(dev: &Device<Bound>) -> Result {
>  /// let attribs = DMA_ATTR_FORCE_CONTIGUOUS | DMA_ATTR_NO_WARN;
> -/// let c: CoherentAllocation<u64> =
> -///     CoherentAllocation::alloc_attrs(dev, 4, GFP_KERNEL, attribs)?;
> +/// let c: Coherent<[u64]> =
> +///     Coherent::zeroed_slice_with_attrs(dev, 4, GFP_KERNEL, attribs)?;
>  /// # Ok::<(), Error>(()) }
>  /// ```
>  #[derive(Clone, Copy, PartialEq)]
> @@ -492,7 +492,6 @@ pub unsafe fn field_write<F: AsBytes>(&self, field: *mut F, val: F) {
>
>  impl<T: AsBytes + FromBytes> Coherent<T> {
>      /// Allocates a region of `T` of coherent memory.
> -    #[expect(unused)]
>      fn alloc_with_attrs(
>          dev: &device::Device<Bound>,
>          gfp_flags: kernel::alloc::Flags,
> @@ -529,6 +528,35 @@ fn alloc_with_attrs(
>          })
>      }
>
> +    /// Allocates a region of type `T` of coherent memory.

For consistency "Allocates a ZEROED region...".

> +    ///
> +    /// # Examples
> +    ///
> +    /// ```
> +    /// # use kernel::device::{Bound, Device};
> +    /// use kernel::dma::{attrs::*, Coherent};
> +    ///
> +    /// # fn test(dev: &Device<Bound>) -> Result {
> +    /// let c: Coherent<[u64; 4]> =
> +    ///     Coherent::zeroed_with_attrs(dev, GFP_KERNEL, DMA_ATTR_NO_WARN)?;
> +    /// # Ok::<(), Error>(()) }
> +    /// ```
> +    #[inline]
> +    pub fn zeroed_with_attrs(
> +        dev: &device::Device<Bound>,
> +        gfp_flags: kernel::alloc::Flags,
> +        dma_attrs: Attrs,
> +    ) -> Result<Self> {
> +        Self::alloc_with_attrs(dev, gfp_flags | __GFP_ZERO, dma_attrs)
> +    }
> +
> +    /// Performs the same functionality as [`Coherent::zeroed_with_attrs`], except the
> +    /// `dma_attrs` is 0 by default.
> +    #[inline]
> +    pub fn zeroed(dev: &device::Device<Bound>, gfp_flags: kernel::alloc::Flags) -> Result<Self> {
> +        Self::zeroed_with_attrs(dev, gfp_flags, Attrs(0))
> +    }
> +
>      /// Allocates a region of `[T; len]` of coherent memory.
>      fn alloc_slice_with_attrs(
>          dev: &device::Device<Bound>,
> @@ -572,6 +600,43 @@ fn alloc_slice_with_attrs(
>              dma_attrs,
>          })
>      }
> +
> +    /// Allocates a zeroed region of type `T` of coherent memory.

Should it be "of type `[T]`"?


Best regards,
Andreas Hindborg


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 4/8] rust: dma: introduce dma::CoherentBox for memory initialization
  2026-03-20 19:45 ` [PATCH v2 4/8] rust: dma: introduce dma::CoherentBox for memory initialization Danilo Krummrich
  2026-03-20 20:55   ` Gary Guo
@ 2026-03-24 13:57   ` Andreas Hindborg
  1 sibling, 0 replies; 31+ messages in thread
From: Andreas Hindborg @ 2026-03-24 13:57 UTC (permalink / raw)
  To: Danilo Krummrich, aliceryhl, acourbot, ojeda, boqun, gary,
	bjorn3_gh, lossin, tmgross, abdiel.janulgue, daniel.almeida,
	robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel,
	Danilo Krummrich

"Danilo Krummrich" <dakr@kernel.org> writes:

> Currently, dma::Coherent cannot safely provide (mutable) access to its
> underlying memory because the memory might be concurrently accessed by a
> DMA device. This makes it difficult to safely initialize the memory
> before handing it over to the hardware.
>
> Introduce dma::CoherentBox, a type that encapsulates a dma::Coherent
> before its DMA address is exposed to the device. dma::CoherentBox can
> guarantee exclusive access to the inner dma::Coherent and implement
> Deref and DerefMut.
>
> Once the memory is properly initialized, dma::CoherentBox can be
> converted into a regular dma::Coherent.
>
> Reviewed-by: Alice Ryhl <aliceryhl@google.com>
> Signed-off-by: Danilo Krummrich <dakr@kernel.org>
> ---
>  rust/kernel/dma.rs | 154 ++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 153 insertions(+), 1 deletion(-)
>
> diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
> index db645b01bdd0..cefb54f0424a 100644
> --- a/rust/kernel/dma.rs
> +++ b/rust/kernel/dma.rs
> @@ -20,7 +20,13 @@
>          FromBytes, //
>      }, //
>  };
> -use core::ptr::NonNull;
> +use core::{
> +    ops::{
> +        Deref,
> +        DerefMut, //
> +    },
> +    ptr::NonNull, //
> +};
>
>  /// DMA address type.
>  ///
> @@ -352,6 +358,152 @@ fn from(direction: DataDirection) -> Self {
>      }
>  }
>
> +/// CPU-owned DMA allocation that can be converted into a device-shared [`Coherent`] object.
> +///
> +/// Unlike [`Coherent`], a [`CoherentBox`] is guaranteed to be fully owned by the CPU -- its DMA
> +/// address is not exposed and it cannot be accessed by a device. This means it can safely be used
> +/// like a normal boxed allocation (e.g. direct reads, writes, and mutable slices are all safe).
> +///
> +/// A typical use is to allocate a [`CoherentBox`], populate it with normal CPU access, and then
> +/// convert it into a [`Coherent`] object to share it with the device.
> +///
> +/// # Examples
> +///
> +/// `CoherentBox<T>`:
> +///
> +/// ```
> +/// # use kernel::device::{
> +/// #     Bound,
> +/// #     Device,
> +/// # };
> +/// use kernel::dma::{attrs::*,
> +///     Coherent,
> +///     CoherentBox,
> +/// };
> +///
> +/// # fn test(dev: &Device<Bound>) -> Result {
> +/// let mut dmem: CoherentBox<u64> = CoherentBox::zeroed(dev, GFP_KERNEL)?;
> +/// *dmem = 42;
> +/// let dmem: Coherent<u64> = dmem.into();
> +/// # Ok::<(), Error>(()) }
> +/// ```
> +///
> +/// `CoherentBox<[T]>`:
> +///
> +///
> +/// ```
> +/// # use kernel::device::{
> +/// #     Bound,
> +/// #     Device,
> +/// # };
> +/// use kernel::dma::{attrs::*,
> +///     Coherent,
> +///     CoherentBox,
> +/// };
> +///
> +/// # fn test(dev: &Device<Bound>) -> Result {
> +/// let mut dmem: CoherentBox<[u64]> = CoherentBox::zeroed_slice(dev, 4, GFP_KERNEL)?;
> +/// dmem.fill(42);
> +/// let dmem: Coherent<[u64]> = dmem.into();
> +/// # Ok::<(), Error>(()) }
> +/// ```
> +pub struct CoherentBox<T: AsBytes + FromBytes + KnownSize + ?Sized>(Coherent<T>);
> +
> +impl<T: AsBytes + FromBytes> CoherentBox<[T]> {
> +    /// [`CoherentBox`] variant of [`Coherent::zeroed_slice_with_attrs`].
> +    #[inline]
> +    pub fn zeroed_slice_with_attrs(
> +        dev: &device::Device<Bound>,
> +        count: usize,
> +        gfp_flags: kernel::alloc::Flags,
> +        dma_attrs: Attrs,
> +    ) -> Result<Self> {
> +        Coherent::zeroed_slice_with_attrs(dev, count, gfp_flags, dma_attrs).map(Self)
> +    }
> +
> +    /// Same as [CoherentBox::zeroed_slice_with_attrs], but with `dma::Attrs(0)`.
> +    #[inline]
> +    pub fn zeroed_slice(
> +        dev: &device::Device<Bound>,
> +        count: usize,
> +        gfp_flags: kernel::alloc::Flags,
> +    ) -> Result<Self> {
> +        Self::zeroed_slice_with_attrs(dev, count, gfp_flags, Attrs(0))
> +    }
> +
> +    /// Initializes the element at `i` using the given initializer.
> +    ///
> +    /// Returns `EINVAL` if `i` is out of bounds.

Could you add an example for this function?


Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org>


Best regards,
Andreas Hindborg




^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 5/8] rust: dma: add Coherent:init() and Coherent::init_with_attrs()
  2026-03-20 19:45 ` [PATCH v2 5/8] rust: dma: add Coherent:init() and Coherent::init_with_attrs() Danilo Krummrich
  2026-03-20 20:56   ` Gary Guo
@ 2026-03-24 14:00   ` Andreas Hindborg
  2026-03-24 15:03     ` Danilo Krummrich
  1 sibling, 1 reply; 31+ messages in thread
From: Andreas Hindborg @ 2026-03-24 14:00 UTC (permalink / raw)
  To: Danilo Krummrich, aliceryhl, acourbot, ojeda, boqun, gary,
	bjorn3_gh, lossin, tmgross, abdiel.janulgue, daniel.almeida,
	robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel,
	Danilo Krummrich

"Danilo Krummrich" <dakr@kernel.org> writes:

> Analogous to Coherent::zeroed() and Coherent::zeroed_with_attrs(), add
> Coherent:init() and Coherent::init_with_attrs() which both take an impl
> Init<T, E> argument initializing the DMA coherent memory.
>
> Compared to CoherentInit, Coherent::init() is a one-shot constructor
> that runs an Init closure and immediately exposes the DMA handle,
> whereas CoherentInit is a multi-stage initializer that provides safe
> &mut T access by withholding the DMA address until converted to
> Coherent.

You forgot to update this to CoherentBox

>
> Signed-off-by: Danilo Krummrich <dakr@kernel.org>
> ---
>  rust/kernel/dma.rs | 38 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 38 insertions(+)
>
> diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
> index cefb54f0424a..6d2bec52806b 100644
> --- a/rust/kernel/dma.rs
> +++ b/rust/kernel/dma.rs
> @@ -709,6 +709,44 @@ pub fn zeroed(dev: &device::Device<Bound>, gfp_flags: kernel::alloc::Flags) -> R
>          Self::zeroed_with_attrs(dev, gfp_flags, Attrs(0))
>      }
>
> +    /// Same as [`Coherent::zeroed_with_attrs`], but instead of a zero-initialization the memory is
> +    /// initialized with `init`.
> +    pub fn init_with_attrs<E>(
> +        dev: &device::Device<Bound>,
> +        gfp_flags: kernel::alloc::Flags,
> +        dma_attrs: Attrs,
> +        init: impl Init<T, E>,
> +    ) -> Result<Self>
> +    where
> +        Error: From<E>,
> +    {
> +        let dmem = Self::alloc_with_attrs(dev, gfp_flags, dma_attrs)?;
> +        let ptr = dmem.as_mut_ptr();
> +
> +        // SAFETY:
> +        // - `ptr` is valid, properly aligned, and points to exclusively owned memory.
> +        // - If `__init` fails, `self` is dropped, which safely frees the underlying `Coherent`'s
> +        //   DMA memory. `T: AsBytes + FromBytes` ensures there are no complex `Drop` requirements
> +        //   we are bypassing.
> +        unsafe { init.__init(ptr)? };
> +
> +        Ok(dmem)
> +    }
> +
> +    /// Same as [`Coherent::zeroed`], but instead of a zero-initialization the memory is initialized
> +    /// with `init`.
> +    #[inline]
> +    pub fn init<E>(
> +        dev: &device::Device<Bound>,
> +        gfp_flags: kernel::alloc::Flags,
> +        init: impl Init<T, E>,
> +    ) -> Result<Self>
> +    where
> +        Error: From<E>,
> +    {
> +        Self::init_with_attrs(dev, gfp_flags, Attrs(0), init)
> +    }
> +

I think we are missing an array initializer for `Coherent<[T]>`.

Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org>


Best regards,
Andreas Hindborg




^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 1/8] rust: dma: use "kernel vertical" style for imports
  2026-03-20 19:45 ` [PATCH v2 1/8] rust: dma: use "kernel vertical" style for imports Danilo Krummrich
@ 2026-03-24 14:02   ` Andreas Hindborg
  0 siblings, 0 replies; 31+ messages in thread
From: Andreas Hindborg @ 2026-03-24 14:02 UTC (permalink / raw)
  To: Danilo Krummrich, aliceryhl, acourbot, ojeda, boqun, gary,
	bjorn3_gh, lossin, tmgross, abdiel.janulgue, daniel.almeida,
	robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel,
	Danilo Krummrich

"Danilo Krummrich" <dakr@kernel.org> writes:

> Convert all imports to use "kernel vertical" style.
>
> With this, subsequent patches neither introduce unrelated changes nor
> leave an inconsistent import pattern.
>
> While at it, drop unnecessary imports covered by prelude::*.
>
> Link: https://docs.kernel.org/rust/coding-guidelines.html#imports
> Reviewed-by: Gary Guo <gary@garyguo.net>
> Signed-off-by: Danilo Krummrich <dakr@kernel.org>

Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org>


Best regards,
Andreas Hindborg




^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 2/8] rust: dma: add generalized container for types other than slices
  2026-03-24 13:42   ` Andreas Hindborg
@ 2026-03-24 14:06     ` Gary Guo
  2026-03-24 14:37       ` Andreas Hindborg
  0 siblings, 1 reply; 31+ messages in thread
From: Gary Guo @ 2026-03-24 14:06 UTC (permalink / raw)
  To: Andreas Hindborg, Danilo Krummrich, aliceryhl, acourbot, ojeda,
	boqun, gary, bjorn3_gh, lossin, tmgross, abdiel.janulgue,
	daniel.almeida, robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On Tue Mar 24, 2026 at 1:42 PM GMT, Andreas Hindborg wrote:
> "Danilo Krummrich" <dakr@kernel.org> writes:
>
>> +    /// Returns a reference to the data in the region.
>>      ///
>> -    /// # Examples
>> +    /// # Safety
>>      ///
>> -    /// ```
>> -    /// # use kernel::device::{Bound, Device};
>> -    /// use kernel::dma::{attrs::*, CoherentAllocation};
>> +    /// * Callers must ensure that the device does not read/write to/from memory while the returned
>> +    ///   slice is live.
>> +    /// * Callers must ensure that this call does not race with a write to the same region while
>> +    ///   the returned slice is live.
>> +    #[inline]
>> +    pub unsafe fn as_ref(&self) -> &T {
>> +        // SAFETY: per safety requirement.
>
> Is this enough? Don't you need to assert a valid bit pattern? I assume
> you get this from `FromBytes`, but that bound is not on `T` in this impl
> block.

Concetually `Coherent<T>` is a container of `T` so it should always hold a valid
bit pattern for the type. Perhaps this should be added to the type invariant.

FromBytes is only needed at the constructor level as that's the only place where
you're asserting that initial bit pattern is valid.


>
>> +        unsafe { &*self.as_ptr() }
>> +    }
>> +
>> +    /// Returns a mutable reference to the data in the region.
>>      ///
>> -    /// # fn test(dev: &Device<Bound>) -> Result {
>> -    /// let c: CoherentAllocation<u64> =
>> -    ///     CoherentAllocation::alloc_attrs(dev, 4, GFP_KERNEL, DMA_ATTR_NO_WARN)?;
>> -    /// # Ok::<(), Error>(()) }
>> -    /// ```
>> -    pub fn alloc_attrs(
>> +    /// # Safety
>> +    ///
>> +    /// * Callers must ensure that the device does not read/write to/from memory while the returned
>> +    ///   slice is live.
>> +    /// * Callers must ensure that this call does not race with a read or write to the same region
>> +    ///   while the returned slice is live.
>> +    #[expect(clippy::mut_from_ref, reason = "unsafe to use API")]
>> +    #[inline]
>> +    pub unsafe fn as_mut(&self) -> &mut T {
>> +        // SAFETY: per safety requirement.
>
> Same.
>
>> +        unsafe { &mut *self.as_mut_ptr() }
>> +    }
>> +
>> +    /// Reads the value of `field` and ensures that its type is [`FromBytes`].
>> +    ///
>> +    /// # Safety
>> +    ///
>> +    /// This must be called from the [`dma_read`] macro which ensures that the `field` pointer is
>> +    /// validated beforehand.
>> +    ///
>> [...]
>> +
>> +impl<T: AsBytes + FromBytes> Coherent<T> {
>> +    /// Allocates a region of `T` of coherent memory.
>> +    #[expect(unused)]
>> +    fn alloc_with_attrs(
>>          dev: &device::Device<Bound>,
>> -        count: usize,
>>          gfp_flags: kernel::alloc::Flags,
>>          dma_attrs: Attrs,
>> -    ) -> Result<CoherentAllocation<T>> {
>> -        build_assert!(
>> -            core::mem::size_of::<T>() > 0,
>> -            "It doesn't make sense for the allocated type to be a ZST"
>> -        );
>> +    ) -> Result<Self> {
>> +        const {
>> +            assert!(
>> +                core::mem::size_of::<T>() > 0,
>> +                "It doesn't make sense for the allocated type to be a ZST"
>> +            );
>> +        }
>>
>> -        let size = count
>> -            .checked_mul(core::mem::size_of::<T>())
>> -            .ok_or(EOVERFLOW)?;
>>          let mut dma_handle = 0;
>>          // SAFETY: Device pointer is guaranteed as valid by the type invariant on `Device`.
>>          let addr = unsafe {
>>              bindings::dma_alloc_attrs(
>>                  dev.as_raw(),
>> -                size,
>> +                core::mem::size_of::<T>(),
>>                  &mut dma_handle,
>>                  gfp_flags.as_raw(),
>>                  dma_attrs.as_raw(),
>>              )
>>          };
>> -        let addr = NonNull::new(addr).ok_or(ENOMEM)?;
>> +        let cpu_addr = NonNull::new(addr.cast()).ok_or(ENOMEM)?;
>>          // INVARIANT:
>> -        // - We just successfully allocated a coherent region which is accessible for
>> -        //   `count` elements, hence the cpu address is valid. We also hold a refcounted reference
>> -        //   to the device.
>> -        // - The allocated `size` is equal to `size_of::<T> * count`.
>> -        // - The allocated `size` fits into a `usize`.
>> +        // - We just successfully allocated a coherent region which is adequately sized for `T`,
>> +        //   hence the cpu address is valid.
>
> The invariant says "valid for the lifetime", do you need to argue for
> the duration of the validity?

It says "For the lifetime of an instance of ..." which is basically equal to
saying nothing. I think that part should just be removed.

>
>> +        // - We also hold a refcounted reference to the device.
>
> This does not relate to the second invariant of `Self`.

The second invariant is about size, which is addressed in the first bullet
point. This is about the third invariant, which the doc-comemnt just says "TODO"
currently.

Best,
Gary


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 2/8] rust: dma: add generalized container for types other than slices
  2026-03-24 14:06     ` Gary Guo
@ 2026-03-24 14:37       ` Andreas Hindborg
  0 siblings, 0 replies; 31+ messages in thread
From: Andreas Hindborg @ 2026-03-24 14:37 UTC (permalink / raw)
  To: Gary Guo, Danilo Krummrich, aliceryhl, acourbot, ojeda, boqun,
	gary, bjorn3_gh, lossin, tmgross, abdiel.janulgue, daniel.almeida,
	robin.murphy
  Cc: driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

"Gary Guo" <gary@garyguo.net> writes:

> On Tue Mar 24, 2026 at 1:42 PM GMT, Andreas Hindborg wrote:
>> "Danilo Krummrich" <dakr@kernel.org> writes:
>>
>>> +    /// Returns a reference to the data in the region.
>>>      ///
>>> -    /// # Examples
>>> +    /// # Safety
>>>      ///
>>> -    /// ```
>>> -    /// # use kernel::device::{Bound, Device};
>>> -    /// use kernel::dma::{attrs::*, CoherentAllocation};
>>> +    /// * Callers must ensure that the device does not read/write to/from memory while the returned
>>> +    ///   slice is live.
>>> +    /// * Callers must ensure that this call does not race with a write to the same region while
>>> +    ///   the returned slice is live.
>>> +    #[inline]
>>> +    pub unsafe fn as_ref(&self) -> &T {
>>> +        // SAFETY: per safety requirement.
>>
>> Is this enough? Don't you need to assert a valid bit pattern? I assume
>> you get this from `FromBytes`, but that bound is not on `T` in this impl
>> block.
>
> Concetually `Coherent<T>` is a container of `T` so it should always hold a valid
> bit pattern for the type. Perhaps this should be added to the type invariant.
>
> FromBytes is only needed at the constructor level as that's the only place where
> you're asserting that initial bit pattern is valid.

Makes sense to add the invariant and assert at construction time.

>
>
>>
>>> +        unsafe { &*self.as_ptr() }
>>> +    }
>>> +
>>> +    /// Returns a mutable reference to the data in the region.
>>>      ///
>>> -    /// # fn test(dev: &Device<Bound>) -> Result {
>>> -    /// let c: CoherentAllocation<u64> =
>>> -    ///     CoherentAllocation::alloc_attrs(dev, 4, GFP_KERNEL, DMA_ATTR_NO_WARN)?;
>>> -    /// # Ok::<(), Error>(()) }
>>> -    /// ```
>>> -    pub fn alloc_attrs(
>>> +    /// # Safety
>>> +    ///
>>> +    /// * Callers must ensure that the device does not read/write to/from memory while the returned
>>> +    ///   slice is live.
>>> +    /// * Callers must ensure that this call does not race with a read or write to the same region
>>> +    ///   while the returned slice is live.
>>> +    #[expect(clippy::mut_from_ref, reason = "unsafe to use API")]
>>> +    #[inline]
>>> +    pub unsafe fn as_mut(&self) -> &mut T {
>>> +        // SAFETY: per safety requirement.
>>
>> Same.
>>
>>> +        unsafe { &mut *self.as_mut_ptr() }
>>> +    }
>>> +
>>> +    /// Reads the value of `field` and ensures that its type is [`FromBytes`].
>>> +    ///
>>> +    /// # Safety
>>> +    ///
>>> +    /// This must be called from the [`dma_read`] macro which ensures that the `field` pointer is
>>> +    /// validated beforehand.
>>> +    ///
>>> [...]
>>> +
>>> +impl<T: AsBytes + FromBytes> Coherent<T> {
>>> +    /// Allocates a region of `T` of coherent memory.
>>> +    #[expect(unused)]
>>> +    fn alloc_with_attrs(
>>>          dev: &device::Device<Bound>,
>>> -        count: usize,
>>>          gfp_flags: kernel::alloc::Flags,
>>>          dma_attrs: Attrs,
>>> -    ) -> Result<CoherentAllocation<T>> {
>>> -        build_assert!(
>>> -            core::mem::size_of::<T>() > 0,
>>> -            "It doesn't make sense for the allocated type to be a ZST"
>>> -        );
>>> +    ) -> Result<Self> {
>>> +        const {
>>> +            assert!(
>>> +                core::mem::size_of::<T>() > 0,
>>> +                "It doesn't make sense for the allocated type to be a ZST"
>>> +            );
>>> +        }
>>>
>>> -        let size = count
>>> -            .checked_mul(core::mem::size_of::<T>())
>>> -            .ok_or(EOVERFLOW)?;
>>>          let mut dma_handle = 0;
>>>          // SAFETY: Device pointer is guaranteed as valid by the type invariant on `Device`.
>>>          let addr = unsafe {
>>>              bindings::dma_alloc_attrs(
>>>                  dev.as_raw(),
>>> -                size,
>>> +                core::mem::size_of::<T>(),
>>>                  &mut dma_handle,
>>>                  gfp_flags.as_raw(),
>>>                  dma_attrs.as_raw(),
>>>              )
>>>          };
>>> -        let addr = NonNull::new(addr).ok_or(ENOMEM)?;
>>> +        let cpu_addr = NonNull::new(addr.cast()).ok_or(ENOMEM)?;
>>>          // INVARIANT:
>>> -        // - We just successfully allocated a coherent region which is accessible for
>>> -        //   `count` elements, hence the cpu address is valid. We also hold a refcounted reference
>>> -        //   to the device.
>>> -        // - The allocated `size` is equal to `size_of::<T> * count`.
>>> -        // - The allocated `size` fits into a `usize`.
>>> +        // - We just successfully allocated a coherent region which is adequately sized for `T`,
>>> +        //   hence the cpu address is valid.
>>
>> The invariant says "valid for the lifetime", do you need to argue for
>> the duration of the validity?
>
> It says "For the lifetime of an instance of ..." which is basically equal to
> saying nothing. I think that part should just be removed.

Sounds good.

>
>>
>>> +        // - We also hold a refcounted reference to the device.
>>
>> This does not relate to the second invariant of `Self`.
>
> The second invariant is about size, which is addressed in the first bullet
> point. This is about the third invariant, which the doc-comemnt just says "TODO"
> currently.

So it is not an invariant, and it should not be covered here.

I would suggest that if you use bullet list both in definition and
justification, please make sure the justification bullets match the
definition bullets. If you need to, you can make sub-lists by indenting.


Best regards,
Andreas Hindborg




^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 5/8] rust: dma: add Coherent:init() and Coherent::init_with_attrs()
  2026-03-24 14:00   ` Andreas Hindborg
@ 2026-03-24 15:03     ` Danilo Krummrich
  2026-03-24 15:40       ` Andreas Hindborg
  0 siblings, 1 reply; 31+ messages in thread
From: Danilo Krummrich @ 2026-03-24 15:03 UTC (permalink / raw)
  To: Andreas Hindborg
  Cc: aliceryhl, acourbot, ojeda, boqun, gary, bjorn3_gh, lossin,
	tmgross, abdiel.janulgue, daniel.almeida, robin.murphy,
	driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

On Tue Mar 24, 2026 at 3:00 PM CET, Andreas Hindborg wrote:
>> +    /// Same as [`Coherent::zeroed`], but instead of a zero-initialization the memory is initialized
>> +    /// with `init`.
>> +    #[inline]
>> +    pub fn init<E>(
>> +        dev: &device::Device<Bound>,
>> +        gfp_flags: kernel::alloc::Flags,
>> +        init: impl Init<T, E>,
>> +    ) -> Result<Self>
>> +    where
>> +        Error: From<E>,
>> +    {
>> +        Self::init_with_attrs(dev, gfp_flags, Attrs(0), init)
>> +    }
>> +
>
> I think we are missing an array initializer for `Coherent<[T]>`.

This method is already compatible with arrays, T can be an array type itself,
e.g. T = [MyStruct; 5].

For instance, the diff in [1] should work.

> Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org>

Thanks! The patch series has been applied yesterday, and the branch it has been
applied to is immutable, so I can't add additional tags.

However, the Link: included in the patch still points to this conversation.

Also note that subsequent review is still valued; we can always send follow-up
patches if required.

Thanks,
Danilo

[1]

diff --git a/samples/rust/rust_dma.rs b/samples/rust/rust_dma.rs
index 129bb4b39c04..3d3ffc21ea77 100644
--- a/samples/rust/rust_dma.rs
+++ b/samples/rust/rust_dma.rs
@@ -22,6 +22,7 @@
 struct DmaSampleDriver {
     pdev: ARef<pci::Device>,
     ca: Coherent<[MyStruct]>,
+    ca_init: Coherent<[MyStruct; 5]>,
     #[pin]
     sgt: SGTable<Owned<VVec<u8>>>,
 }
@@ -76,6 +77,15 @@ fn probe(pdev: &pci::Device<Core>, _info: &Self::IdInfo) -> impl PinInit<Self, E
                 kernel::dma_write!(ca, [i]?, MyStruct::new(value.0, value.1));
             }

+            let ca_init: Coherent<[MyStruct; 5]> = Coherent::init(
+                pdev.as_ref(),
+                GFP_KERNEL,
+                pin_init::init_array_from_fn(|i| {
+                    let (h, b) = TEST_VALUES[i];
+                    MyStruct::new(h, b)
+                }),
+            )?;
+
             let size = 4 * page::PAGE_SIZE;
             let pages = VVec::with_capacity(size, GFP_KERNEL)?;

@@ -84,6 +94,7 @@ fn probe(pdev: &pci::Device<Core>, _info: &Self::IdInfo) -> impl PinInit<Self, E
             Ok(try_pin_init!(Self {
                 pdev: pdev.into(),
                 ca,
+                ca_init,
                 sgt <- sgt,
             }))
         })
@@ -98,6 +109,12 @@ fn check_dma(&self) -> Result {

             assert_eq!(val0, value.0);
             assert_eq!(val1, value.1);
+
+            let val0 = kernel::dma_read!(self.ca_init, [i]?.h);
+            let val1 = kernel::dma_read!(self.ca_init, [i]?.b);
+
+            assert_eq!(val0, value.0);
+            assert_eq!(val1, value.1);
         }

         Ok(())

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 5/8] rust: dma: add Coherent:init() and Coherent::init_with_attrs()
  2026-03-24 15:03     ` Danilo Krummrich
@ 2026-03-24 15:40       ` Andreas Hindborg
  0 siblings, 0 replies; 31+ messages in thread
From: Andreas Hindborg @ 2026-03-24 15:40 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: aliceryhl, acourbot, ojeda, boqun, gary, bjorn3_gh, lossin,
	tmgross, abdiel.janulgue, daniel.almeida, robin.murphy,
	driver-core, nouveau, dri-devel, rust-for-linux, linux-kernel

"Danilo Krummrich" <dakr@kernel.org> writes:

> On Tue Mar 24, 2026 at 3:00 PM CET, Andreas Hindborg wrote:
>>> +    /// Same as [`Coherent::zeroed`], but instead of a zero-initialization the memory is initialized
>>> +    /// with `init`.
>>> +    #[inline]
>>> +    pub fn init<E>(
>>> +        dev: &device::Device<Bound>,
>>> +        gfp_flags: kernel::alloc::Flags,
>>> +        init: impl Init<T, E>,
>>> +    ) -> Result<Self>
>>> +    where
>>> +        Error: From<E>,
>>> +    {
>>> +        Self::init_with_attrs(dev, gfp_flags, Attrs(0), init)
>>> +    }
>>> +
>>
>> I think we are missing an array initializer for `Coherent<[T]>`.
>
> This method is already compatible with arrays, T can be an array type itself,
> e.g. T = [MyStruct; 5].
>
> For instance, the diff in [1] should work.

Cool!

>> Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org>
>
> Thanks! The patch series has been applied yesterday, and the branch it has been
> applied to is immutable, so I can't add additional tags.

You guys move fast!

> However, the Link: included in the patch still points to this conversation.
>
> Also note that subsequent review is still valued; we can always send follow-up
> patches if required.

At any rate I can send the changes I suggested if they are deemed valid.


Best regards,
Andreas Hindborg




^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2026-03-24 15:40 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-20 19:45 [PATCH v2 0/8] dma::Coherent & dma::CoherentBox API Danilo Krummrich
2026-03-20 19:45 ` [PATCH v2 1/8] rust: dma: use "kernel vertical" style for imports Danilo Krummrich
2026-03-24 14:02   ` Andreas Hindborg
2026-03-20 19:45 ` [PATCH v2 2/8] rust: dma: add generalized container for types other than slices Danilo Krummrich
2026-03-21 23:22   ` Aditya Rajan
2026-03-22  0:47     ` Gary Guo
2026-03-22  6:24       ` Aditya Rajan
2026-03-24 13:42   ` Andreas Hindborg
2026-03-24 14:06     ` Gary Guo
2026-03-24 14:37       ` Andreas Hindborg
2026-03-20 19:45 ` [PATCH v2 3/8] rust: dma: add zeroed constructor to `Coherent` Danilo Krummrich
2026-03-21  6:37   ` Alexandre Courbot
2026-03-24 13:46   ` Andreas Hindborg
2026-03-20 19:45 ` [PATCH v2 4/8] rust: dma: introduce dma::CoherentBox for memory initialization Danilo Krummrich
2026-03-20 20:55   ` Gary Guo
2026-03-24 13:57   ` Andreas Hindborg
2026-03-20 19:45 ` [PATCH v2 5/8] rust: dma: add Coherent:init() and Coherent::init_with_attrs() Danilo Krummrich
2026-03-20 20:56   ` Gary Guo
2026-03-24 14:00   ` Andreas Hindborg
2026-03-24 15:03     ` Danilo Krummrich
2026-03-24 15:40       ` Andreas Hindborg
2026-03-20 19:45 ` [PATCH v2 6/8] gpu: nova-core: use Coherent::init to initialize GspFwWprMeta Danilo Krummrich
2026-03-20 21:04   ` Gary Guo
2026-03-20 19:45 ` [PATCH v2 7/8] gpu: nova-core: convert Gsp::new() to use CoherentBox Danilo Krummrich
2026-03-20 21:06   ` Gary Guo
2026-03-20 19:45 ` [PATCH v2 8/8] gpu: nova-core: convert to new dma::Coherent API Danilo Krummrich
2026-03-21 16:50   ` Gary Guo
2026-03-21 18:22     ` Danilo Krummrich
2026-03-21 22:36       ` Gary Guo
2026-03-21  5:13 ` [PATCH v2 0/8] dma::Coherent & dma::CoherentBox API Alexandre Courbot
2026-03-23 21:56 ` Danilo Krummrich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox