rust-for-linux.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v7 0/2] Add dma coherent allocator abstraction
@ 2024-12-10 22:14 Abdiel Janulgue
  2024-12-10 22:14 ` [PATCH v7 1/2] rust: error: Add EOVERFLOW Abdiel Janulgue
  2024-12-10 22:14 ` [PATCH v7 2/2] rust: add dma coherent allocator abstraction Abdiel Janulgue
  0 siblings, 2 replies; 15+ messages in thread
From: Abdiel Janulgue @ 2024-12-10 22:14 UTC (permalink / raw)
  To: rust-for-linux
  Cc: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, Valentin Obst, open list,
	Christoph Hellwig, Marek Szyprowski, Robin Murphy, airlied,
	open list:DMA MAPPING HELPERS, Abdiel Janulgue

Changes since v6:
- Include the dma_attrs in the constructor, use alloc::Flags as inpiration

Changes since v5:
- Remove unnecessary lifetime annotation when returning the CPU buffer.

Changes since v4:
- Documentation and example fixes, use Markdown formatting (Miguel Ojeda).
- Discard read()/write() helpers to remove bound on Copy and fix overhead
  (Daniel Almeida).
- Improve error-handling in the constructor block (Andreas Hindborg).

Changes since v3:
- Reject ZST types by checking the type size in the constructor in
  addition to requiring FromBytes/AsBytes traits for the type (Alice Ryhl).

Changes since v2:
- Fixed missing header for generating the bindings.

Changes since v1:
- Fix missing info in commit log where EOVERFLOW is used.
- Restrict the dma coherent allocator to numeric types for now for valid
  behaviour (Daniel Almeida).
- Build slice dynamically.

Abdiel Janulgue (2):
  rust: error: Add EOVERFLOW
  rust: add dma coherent allocator abstraction.

 rust/bindings/bindings_helper.h |   1 +
 rust/kernel/dma.rs              | 223 ++++++++++++++++++++++++++++++++
 rust/kernel/error.rs            |   1 +
 rust/kernel/lib.rs              |   1 +
 4 files changed, 226 insertions(+)
 create mode 100644 rust/kernel/dma.rs


base-commit: fac04efc5c793dccbd07e2d59af9f90b7fc0dca4
-- 
2.43.0


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v7 1/2] rust: error: Add EOVERFLOW
  2024-12-10 22:14 [PATCH v7 0/2] Add dma coherent allocator abstraction Abdiel Janulgue
@ 2024-12-10 22:14 ` Abdiel Janulgue
  2024-12-10 22:14 ` [PATCH v7 2/2] rust: add dma coherent allocator abstraction Abdiel Janulgue
  1 sibling, 0 replies; 15+ messages in thread
From: Abdiel Janulgue @ 2024-12-10 22:14 UTC (permalink / raw)
  To: rust-for-linux
  Cc: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, Valentin Obst, open list,
	Christoph Hellwig, Marek Szyprowski, Robin Murphy, airlied,
	open list:DMA MAPPING HELPERS, Abdiel Janulgue

Trivial addition for missing EOVERFLOW error. This is used by a
subsequent patch that might require returning EOVERFLOW as a result
of `checked_mul`.

Signed-off-by: Abdiel Janulgue <abdiel.janulgue@gmail.com>
---
 rust/kernel/error.rs | 1 +
 1 file changed, 1 insertion(+)

diff --git a/rust/kernel/error.rs b/rust/kernel/error.rs
index 52c502432447..cd57fac7f1f9 100644
--- a/rust/kernel/error.rs
+++ b/rust/kernel/error.rs
@@ -63,6 +63,7 @@ macro_rules! declare_err {
     declare_err!(EPIPE, "Broken pipe.");
     declare_err!(EDOM, "Math argument out of domain of func.");
     declare_err!(ERANGE, "Math result not representable.");
+    declare_err!(EOVERFLOW, "Value too large for defined data type.");
     declare_err!(ERESTARTSYS, "Restart the system call.");
     declare_err!(ERESTARTNOINTR, "System call was interrupted by a signal and will be restarted.");
     declare_err!(ERESTARTNOHAND, "Restart if no handler.");
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v7 2/2] rust: add dma coherent allocator abstraction.
  2024-12-10 22:14 [PATCH v7 0/2] Add dma coherent allocator abstraction Abdiel Janulgue
  2024-12-10 22:14 ` [PATCH v7 1/2] rust: error: Add EOVERFLOW Abdiel Janulgue
@ 2024-12-10 22:14 ` Abdiel Janulgue
  2024-12-13  8:53   ` Abdiel Janulgue
                     ` (2 more replies)
  1 sibling, 3 replies; 15+ messages in thread
From: Abdiel Janulgue @ 2024-12-10 22:14 UTC (permalink / raw)
  To: rust-for-linux
  Cc: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, Valentin Obst, open list,
	Christoph Hellwig, Marek Szyprowski, Robin Murphy, airlied,
	open list:DMA MAPPING HELPERS, Abdiel Janulgue, Daniel Almeida

Add a simple dma coherent allocator rust abstraction. Based on
Andreas Hindborg's dma abstractions from the rnvme driver, which
was also based on earlier work by Wedson Almeida Filho.

Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Tested-by: Daniel Almeida <daniel.almeida@collabora.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@gmail.com>
---
 rust/bindings/bindings_helper.h |   1 +
 rust/kernel/dma.rs              | 223 ++++++++++++++++++++++++++++++++
 rust/kernel/lib.rs              |   1 +
 3 files changed, 225 insertions(+)
 create mode 100644 rust/kernel/dma.rs

diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index 5c4dfe22f41a..49bf713b9bb6 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -11,6 +11,7 @@
 #include <linux/blk_types.h>
 #include <linux/blkdev.h>
 #include <linux/cred.h>
+#include <linux/dma-mapping.h>
 #include <linux/errname.h>
 #include <linux/ethtool.h>
 #include <linux/file.h>
diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
new file mode 100644
index 000000000000..29ae744d6f2b
--- /dev/null
+++ b/rust/kernel/dma.rs
@@ -0,0 +1,223 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Direct memory access (DMA).
+//!
+//! C header: [`include/linux/dma-mapping.h`](srctree/include/linux/dma-mapping.h)
+
+use crate::{
+    bindings,
+    build_assert,
+    device::Device,
+    error::code::*,
+    error::Result,
+    types::ARef,
+    transmute::{AsBytes, FromBytes},
+};
+
+/// Possible attributes associated with a DMA mapping.
+///
+/// They can be combined with the operators `|`, `&`, and `!`.
+///
+/// Values can be used from the [`attrs`] module.
+#[derive(Clone, Copy, PartialEq)]
+pub struct Attribs(u32);
+
+impl Attribs {
+    /// Get the raw representation of this attribute.
+    pub(crate) fn as_raw(self) -> u64 {
+        self.0.into()
+    }
+
+    /// Check whether `flags` is contained in `self`.
+    pub fn contains(self, flags: Attribs) -> bool {
+        (self & flags) == flags
+    }
+}
+
+impl core::ops::BitOr for Attribs {
+    type Output = Self;
+    fn bitor(self, rhs: Self) -> Self::Output {
+        Self(self.0 | rhs.0)
+    }
+}
+
+impl core::ops::BitAnd for Attribs {
+    type Output = Self;
+    fn bitand(self, rhs: Self) -> Self::Output {
+        Self(self.0 & rhs.0)
+    }
+}
+
+impl core::ops::Not for Attribs {
+    type Output = Self;
+    fn not(self) -> Self::Output {
+        Self(!self.0)
+    }
+}
+
+/// DMA mapping attrributes.
+pub mod attrs {
+    use super::Attribs;
+
+    /// Specifies that reads and writes to the mapping may be weakly ordered, that is that reads
+    /// and writes may pass each other.
+    pub const DMA_ATTR_WEAK_ORDERING: Attribs = Attribs(bindings::DMA_ATTR_WEAK_ORDERING);
+
+    /// Specifies that writes to the mapping may be buffered to improve performance.
+    pub const DMA_ATTR_WRITE_COMBINE: Attribs = Attribs(bindings::DMA_ATTR_WRITE_COMBINE);
+
+    /// Lets the platform to avoid creating a kernel virtual mapping for the allocated buffer.
+    pub const DMA_ATTR_NO_KERNEL_MAPPING: Attribs = Attribs(bindings::DMA_ATTR_NO_KERNEL_MAPPING);
+
+    /// Allows platform code to skip synchronization of the CPU cache for the given buffer assuming
+    /// that it has been already transferred to 'device' domain.
+    pub const DMA_ATTR_SKIP_CPU_SYNC: Attribs = Attribs(bindings::DMA_ATTR_SKIP_CPU_SYNC);
+
+    /// Forces contiguous allocation of the buffer in physical memory.
+    pub const DMA_ATTR_FORCE_CONTIGUOUS: Attribs = Attribs(bindings::DMA_ATTR_FORCE_CONTIGUOUS);
+
+    /// This is a hint to the DMA-mapping subsystem that it's probably not worth the time to try
+    /// to allocate memory to in a way that gives better TLB efficiency.
+    pub const DMA_ATTR_ALLOC_SINGLE_PAGES: Attribs = Attribs(bindings::DMA_ATTR_ALLOC_SINGLE_PAGES);
+
+    /// This tells the DMA-mapping subsystem to suppress allocation failure reports (similarly to
+    /// __GFP_NOWARN).
+    pub const DMA_ATTR_NO_WARN: Attribs = Attribs(bindings::DMA_ATTR_NO_WARN);
+
+    /// Used to indicate that the buffer is fully accessible at an elevated privilege level (and
+    /// ideally inaccessible or at least read-only at lesser-privileged levels).
+    pub const DMA_ATTR_PRIVILEGED: Attribs = Attribs(bindings::DMA_ATTR_PRIVILEGED);
+}
+
+/// An abstraction of the `dma_alloc_coherent` API.
+///
+/// This is an abstraction around the `dma_alloc_coherent` API which is used to allocate and map
+/// large consistent DMA regions.
+///
+/// A [`CoherentAllocation`] instance contains a pointer to the allocated region (in the
+/// processor's virtual address space) and the device address which can be given to the device
+/// as the DMA address base of the region. The region is released once [`CoherentAllocation`]
+/// is dropped.
+///
+/// # Invariants
+///
+/// For the lifetime of an instance of [`CoherentAllocation`], the cpu address is a valid pointer
+/// to an allocated region of consistent memory and we hold a reference to the device.
+pub struct CoherentAllocation<T: AsBytes + FromBytes> {
+    dev: ARef<Device>,
+    dma_handle: bindings::dma_addr_t,
+    count: usize,
+    cpu_addr: *mut T,
+    dma_attrs: Attribs,
+}
+
+impl<T: AsBytes + FromBytes> CoherentAllocation<T> {
+    /// Allocates a region of `size_of::<T> * count` of consistent memory.
+    ///
+    /// # Examples
+    ///
+    /// ```
+    /// use kernel::device::Device;
+    /// use kernel::dma::{attrs::*, CoherentAllocation};
+    ///
+    /// # fn test(dev: &Device) -> Result {
+    /// let c: CoherentAllocation<u64> = CoherentAllocation::alloc_attrs(dev, 4, GFP_KERNEL,
+    ///                                                                  DMA_ATTR_NO_WARN)?;
+    /// # Ok::<(), Error>(()) }
+    /// ```
+    pub fn alloc_attrs(
+        dev: &Device,
+        count: usize,
+        gfp_flags: kernel::alloc::Flags,
+        dma_attrs: Attribs,
+    ) -> Result<CoherentAllocation<T>> {
+        build_assert!(core::mem::size_of::<T>() > 0,
+                      "It doesn't make sense for the allocated type to be a ZST");
+
+        let size = count.checked_mul(core::mem::size_of::<T>()).ok_or(EOVERFLOW)?;
+        let mut dma_handle = 0;
+        // SAFETY: device pointer is guaranteed as valid by invariant on `Device`.
+        // We ensure that we catch the failure on this function and throw an ENOMEM
+        let ret = unsafe {
+            bindings::dma_alloc_attrs(
+                dev.as_raw(),
+                size,
+                &mut dma_handle, gfp_flags.as_raw(),
+                dma_attrs.as_raw(),
+            )
+        };
+        if ret.is_null() {
+            return Err(ENOMEM)
+        }
+        // INVARIANT: We just successfully allocated a coherent region which is accessible for
+        // `count` elements, hence the cpu address is valid. We also hold a refcounted reference
+        // to the device.
+        Ok(Self {
+            dev: dev.into(),
+            dma_handle,
+            count,
+            cpu_addr: ret as *mut T,
+            dma_attrs,
+        })
+    }
+
+    /// Performs the same functionality as `alloc_attrs`, except the `dma_attrs` is 0 by default.
+    pub fn alloc_coherent(dev: &Device,
+                          count: usize,
+                          gfp_flags: kernel::alloc::Flags) -> Result<CoherentAllocation<T>> {
+        CoherentAllocation::alloc_attrs(dev, count, gfp_flags, Attribs(0))
+    }
+
+    /// Returns the base address to the allocated region and the dma handle. The caller takes
+    /// ownership of the returned resources.
+    pub fn into_parts(self) -> (usize, bindings::dma_addr_t) {
+        let ret = (self.cpu_addr as _, self.dma_handle);
+        core::mem::forget(self);
+        ret
+    }
+
+    /// Returns the base address to the allocated region in the CPU's virtual address space.
+    pub fn start_ptr(&self) -> *const T {
+        self.cpu_addr as _
+    }
+
+    /// Returns the base address to the allocated region in the CPU's virtual address space as
+    /// a mutable pointer.
+    pub fn start_ptr_mut(&mut self) -> *mut T {
+        self.cpu_addr
+    }
+
+    /// Returns a DMA handle which may given to the device as the DMA address base of
+    /// the region.
+    pub fn dma_handle(&self) -> bindings::dma_addr_t {
+        self.dma_handle
+    }
+
+    /// Returns the CPU-addressable region as a slice.
+    pub fn cpu_buf(&self) -> &[T]
+    {
+        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
+        // is valid for reads for `self.count * size_of::<T>` bytes.
+        unsafe { core::slice::from_raw_parts(self.cpu_addr, self.count) }
+    }
+
+    /// Performs the same functionality as `cpu_buf`, except that a mutable slice is returned.
+    pub fn cpu_buf_mut(&mut self) -> &mut [T]
+    {
+        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
+        // is valid for reads for `self.count * size_of::<T>` bytes.
+        unsafe { core::slice::from_raw_parts_mut(self.cpu_addr, self.count) }
+    }
+}
+
+impl<T: AsBytes + FromBytes> Drop for CoherentAllocation<T> {
+    fn drop(&mut self) {
+        let size = self.count * core::mem::size_of::<T>();
+        // SAFETY: the device, cpu address, and the dma handle is valid due to the
+        // type invariants on `CoherentAllocation`.
+        unsafe { bindings::dma_free_attrs(self.dev.as_raw(), size,
+                                          self.cpu_addr as _,
+                                          self.dma_handle,
+                                          self.dma_attrs.as_raw(),) }
+    }
+}
diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index e1065a7551a3..6e90ebf5a130 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -35,6 +35,7 @@
 mod build_assert;
 pub mod cred;
 pub mod device;
+pub mod dma;
 pub mod error;
 #[cfg(CONFIG_RUST_FW_LOADER_ABSTRACTIONS)]
 pub mod firmware;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v7 2/2] rust: add dma coherent allocator abstraction.
  2024-12-10 22:14 ` [PATCH v7 2/2] rust: add dma coherent allocator abstraction Abdiel Janulgue
@ 2024-12-13  8:53   ` Abdiel Janulgue
  2024-12-19 11:22     ` Andreas Hindborg
  2024-12-13 14:12   ` Daniel Almeida
  2024-12-13 14:27   ` Alice Ryhl
  2 siblings, 1 reply; 15+ messages in thread
From: Abdiel Janulgue @ 2024-12-13  8:53 UTC (permalink / raw)
  To: rust-for-linux, Daniel Almeida
  Cc: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, Valentin Obst, open list,
	Christoph Hellwig, Marek Szyprowski, Robin Murphy, airlied,
	open list:DMA MAPPING HELPERS

Gentle ping. Does this approach make sense? :)

On 11/12/2024 00:14, Abdiel Janulgue wrote:
> Add a simple dma coherent allocator rust abstraction. Based on
> Andreas Hindborg's dma abstractions from the rnvme driver, which
> was also based on earlier work by Wedson Almeida Filho.
> 
> Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
> Tested-by: Daniel Almeida <daniel.almeida@collabora.com>
> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@gmail.com>
> ---
>   rust/bindings/bindings_helper.h |   1 +
>   rust/kernel/dma.rs              | 223 ++++++++++++++++++++++++++++++++
>   rust/kernel/lib.rs              |   1 +
>   3 files changed, 225 insertions(+)
>   create mode 100644 rust/kernel/dma.rs
> 
> diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
> index 5c4dfe22f41a..49bf713b9bb6 100644
> --- a/rust/bindings/bindings_helper.h
> +++ b/rust/bindings/bindings_helper.h
> @@ -11,6 +11,7 @@
>   #include <linux/blk_types.h>
>   #include <linux/blkdev.h>
>   #include <linux/cred.h>
> +#include <linux/dma-mapping.h>
>   #include <linux/errname.h>
>   #include <linux/ethtool.h>
>   #include <linux/file.h>
> diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
> new file mode 100644
> index 000000000000..29ae744d6f2b
> --- /dev/null
> +++ b/rust/kernel/dma.rs
> @@ -0,0 +1,223 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! Direct memory access (DMA).
> +//!
> +//! C header: [`include/linux/dma-mapping.h`](srctree/include/linux/dma-mapping.h)
> +
> +use crate::{
> +    bindings,
> +    build_assert,
> +    device::Device,
> +    error::code::*,
> +    error::Result,
> +    types::ARef,
> +    transmute::{AsBytes, FromBytes},
> +};
> +
> +/// Possible attributes associated with a DMA mapping.
> +///
> +/// They can be combined with the operators `|`, `&`, and `!`.
> +///
> +/// Values can be used from the [`attrs`] module.
> +#[derive(Clone, Copy, PartialEq)]
> +pub struct Attribs(u32);
> +
> +impl Attribs {
> +    /// Get the raw representation of this attribute.
> +    pub(crate) fn as_raw(self) -> u64 {
> +        self.0.into()
> +    }
> +
> +    /// Check whether `flags` is contained in `self`.
> +    pub fn contains(self, flags: Attribs) -> bool {
> +        (self & flags) == flags
> +    }
> +}
> +
> +impl core::ops::BitOr for Attribs {
> +    type Output = Self;
> +    fn bitor(self, rhs: Self) -> Self::Output {
> +        Self(self.0 | rhs.0)
> +    }
> +}
> +
> +impl core::ops::BitAnd for Attribs {
> +    type Output = Self;
> +    fn bitand(self, rhs: Self) -> Self::Output {
> +        Self(self.0 & rhs.0)
> +    }
> +}
> +
> +impl core::ops::Not for Attribs {
> +    type Output = Self;
> +    fn not(self) -> Self::Output {
> +        Self(!self.0)
> +    }
> +}
> +
> +/// DMA mapping attrributes.
> +pub mod attrs {
> +    use super::Attribs;
> +
> +    /// Specifies that reads and writes to the mapping may be weakly ordered, that is that reads
> +    /// and writes may pass each other.
> +    pub const DMA_ATTR_WEAK_ORDERING: Attribs = Attribs(bindings::DMA_ATTR_WEAK_ORDERING);
> +
> +    /// Specifies that writes to the mapping may be buffered to improve performance.
> +    pub const DMA_ATTR_WRITE_COMBINE: Attribs = Attribs(bindings::DMA_ATTR_WRITE_COMBINE);
> +
> +    /// Lets the platform to avoid creating a kernel virtual mapping for the allocated buffer.
> +    pub const DMA_ATTR_NO_KERNEL_MAPPING: Attribs = Attribs(bindings::DMA_ATTR_NO_KERNEL_MAPPING);
> +
> +    /// Allows platform code to skip synchronization of the CPU cache for the given buffer assuming
> +    /// that it has been already transferred to 'device' domain.
> +    pub const DMA_ATTR_SKIP_CPU_SYNC: Attribs = Attribs(bindings::DMA_ATTR_SKIP_CPU_SYNC);
> +
> +    /// Forces contiguous allocation of the buffer in physical memory.
> +    pub const DMA_ATTR_FORCE_CONTIGUOUS: Attribs = Attribs(bindings::DMA_ATTR_FORCE_CONTIGUOUS);
> +
> +    /// This is a hint to the DMA-mapping subsystem that it's probably not worth the time to try
> +    /// to allocate memory to in a way that gives better TLB efficiency.
> +    pub const DMA_ATTR_ALLOC_SINGLE_PAGES: Attribs = Attribs(bindings::DMA_ATTR_ALLOC_SINGLE_PAGES);
> +
> +    /// This tells the DMA-mapping subsystem to suppress allocation failure reports (similarly to
> +    /// __GFP_NOWARN).
> +    pub const DMA_ATTR_NO_WARN: Attribs = Attribs(bindings::DMA_ATTR_NO_WARN);
> +
> +    /// Used to indicate that the buffer is fully accessible at an elevated privilege level (and
> +    /// ideally inaccessible or at least read-only at lesser-privileged levels).
> +    pub const DMA_ATTR_PRIVILEGED: Attribs = Attribs(bindings::DMA_ATTR_PRIVILEGED);
> +}
> +
> +/// An abstraction of the `dma_alloc_coherent` API.
> +///
> +/// This is an abstraction around the `dma_alloc_coherent` API which is used to allocate and map
> +/// large consistent DMA regions.
> +///
> +/// A [`CoherentAllocation`] instance contains a pointer to the allocated region (in the
> +/// processor's virtual address space) and the device address which can be given to the device
> +/// as the DMA address base of the region. The region is released once [`CoherentAllocation`]
> +/// is dropped.
> +///
> +/// # Invariants
> +///
> +/// For the lifetime of an instance of [`CoherentAllocation`], the cpu address is a valid pointer
> +/// to an allocated region of consistent memory and we hold a reference to the device.
> +pub struct CoherentAllocation<T: AsBytes + FromBytes> {
> +    dev: ARef<Device>,
> +    dma_handle: bindings::dma_addr_t,
> +    count: usize,
> +    cpu_addr: *mut T,
> +    dma_attrs: Attribs,
> +}
> +
> +impl<T: AsBytes + FromBytes> CoherentAllocation<T> {
> +    /// Allocates a region of `size_of::<T> * count` of consistent memory.
> +    ///
> +    /// # Examples
> +    ///
> +    /// ```
> +    /// use kernel::device::Device;
> +    /// use kernel::dma::{attrs::*, CoherentAllocation};
> +    ///
> +    /// # fn test(dev: &Device) -> Result {
> +    /// let c: CoherentAllocation<u64> = CoherentAllocation::alloc_attrs(dev, 4, GFP_KERNEL,
> +    ///                                                                  DMA_ATTR_NO_WARN)?;
> +    /// # Ok::<(), Error>(()) }
> +    /// ```
> +    pub fn alloc_attrs(
> +        dev: &Device,
> +        count: usize,
> +        gfp_flags: kernel::alloc::Flags,
> +        dma_attrs: Attribs,
> +    ) -> Result<CoherentAllocation<T>> {
> +        build_assert!(core::mem::size_of::<T>() > 0,
> +                      "It doesn't make sense for the allocated type to be a ZST");
> +
> +        let size = count.checked_mul(core::mem::size_of::<T>()).ok_or(EOVERFLOW)?;
> +        let mut dma_handle = 0;
> +        // SAFETY: device pointer is guaranteed as valid by invariant on `Device`.
> +        // We ensure that we catch the failure on this function and throw an ENOMEM
> +        let ret = unsafe {
> +            bindings::dma_alloc_attrs(
> +                dev.as_raw(),
> +                size,
> +                &mut dma_handle, gfp_flags.as_raw(),
> +                dma_attrs.as_raw(),
> +            )
> +        };
> +        if ret.is_null() {
> +            return Err(ENOMEM)
> +        }
> +        // INVARIANT: We just successfully allocated a coherent region which is accessible for
> +        // `count` elements, hence the cpu address is valid. We also hold a refcounted reference
> +        // to the device.
> +        Ok(Self {
> +            dev: dev.into(),
> +            dma_handle,
> +            count,
> +            cpu_addr: ret as *mut T,
> +            dma_attrs,
> +        })
> +    }
> +
> +    /// Performs the same functionality as `alloc_attrs`, except the `dma_attrs` is 0 by default.
> +    pub fn alloc_coherent(dev: &Device,
> +                          count: usize,
> +                          gfp_flags: kernel::alloc::Flags) -> Result<CoherentAllocation<T>> {
> +        CoherentAllocation::alloc_attrs(dev, count, gfp_flags, Attribs(0))
> +    }
> +
> +    /// Returns the base address to the allocated region and the dma handle. The caller takes
> +    /// ownership of the returned resources.
> +    pub fn into_parts(self) -> (usize, bindings::dma_addr_t) {
> +        let ret = (self.cpu_addr as _, self.dma_handle);
> +        core::mem::forget(self);
> +        ret
> +    }
> +
> +    /// Returns the base address to the allocated region in the CPU's virtual address space.
> +    pub fn start_ptr(&self) -> *const T {
> +        self.cpu_addr as _
> +    }
> +
> +    /// Returns the base address to the allocated region in the CPU's virtual address space as
> +    /// a mutable pointer.
> +    pub fn start_ptr_mut(&mut self) -> *mut T {
> +        self.cpu_addr
> +    }
> +
> +    /// Returns a DMA handle which may given to the device as the DMA address base of
> +    /// the region.
> +    pub fn dma_handle(&self) -> bindings::dma_addr_t {
> +        self.dma_handle
> +    }
> +
> +    /// Returns the CPU-addressable region as a slice.
> +    pub fn cpu_buf(&self) -> &[T]
> +    {
> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
> +        // is valid for reads for `self.count * size_of::<T>` bytes.
> +        unsafe { core::slice::from_raw_parts(self.cpu_addr, self.count) }
> +    }
> +
> +    /// Performs the same functionality as `cpu_buf`, except that a mutable slice is returned.
> +    pub fn cpu_buf_mut(&mut self) -> &mut [T]
> +    {
> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
> +        // is valid for reads for `self.count * size_of::<T>` bytes.
> +        unsafe { core::slice::from_raw_parts_mut(self.cpu_addr, self.count) }
> +    }
> +}
> +
> +impl<T: AsBytes + FromBytes> Drop for CoherentAllocation<T> {
> +    fn drop(&mut self) {
> +        let size = self.count * core::mem::size_of::<T>();
> +        // SAFETY: the device, cpu address, and the dma handle is valid due to the
> +        // type invariants on `CoherentAllocation`.
> +        unsafe { bindings::dma_free_attrs(self.dev.as_raw(), size,
> +                                          self.cpu_addr as _,
> +                                          self.dma_handle,
> +                                          self.dma_attrs.as_raw(),) }
> +    }
> +}
> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> index e1065a7551a3..6e90ebf5a130 100644
> --- a/rust/kernel/lib.rs
> +++ b/rust/kernel/lib.rs
> @@ -35,6 +35,7 @@
>   mod build_assert;
>   pub mod cred;
>   pub mod device;
> +pub mod dma;
>   pub mod error;
>   #[cfg(CONFIG_RUST_FW_LOADER_ABSTRACTIONS)]
>   pub mod firmware;


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v7 2/2] rust: add dma coherent allocator abstraction.
  2024-12-10 22:14 ` [PATCH v7 2/2] rust: add dma coherent allocator abstraction Abdiel Janulgue
  2024-12-13  8:53   ` Abdiel Janulgue
@ 2024-12-13 14:12   ` Daniel Almeida
  2024-12-13 14:27   ` Alice Ryhl
  2 siblings, 0 replies; 15+ messages in thread
From: Daniel Almeida @ 2024-12-13 14:12 UTC (permalink / raw)
  To: Abdiel Janulgue
  Cc: rust-for-linux, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, Valentin Obst, open list,
	Christoph Hellwig, Marek Szyprowski, Robin Murphy, airlied,
	open list:DMA MAPPING HELPERS

Hi Abdiel,

> On 10 Dec 2024, at 19:14, Abdiel Janulgue <abdiel.janulgue@gmail.com> wrote:
> 
> Add a simple dma coherent allocator rust abstraction. Based on
> Andreas Hindborg's dma abstractions from the rnvme driver, which
> was also based on earlier work by Wedson Almeida Filho.
> 
> Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
> Tested-by: Daniel Almeida <daniel.almeida@collabora.com>
> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@gmail.com>
> ---
> rust/bindings/bindings_helper.h |   1 +
> rust/kernel/dma.rs              | 223 ++++++++++++++++++++++++++++++++
> rust/kernel/lib.rs              |   1 +
> 3 files changed, 225 insertions(+)
> create mode 100644 rust/kernel/dma.rs
> 
> diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
> index 5c4dfe22f41a..49bf713b9bb6 100644
> --- a/rust/bindings/bindings_helper.h
> +++ b/rust/bindings/bindings_helper.h
> @@ -11,6 +11,7 @@
> #include <linux/blk_types.h>
> #include <linux/blkdev.h>
> #include <linux/cred.h>
> +#include <linux/dma-mapping.h>
> #include <linux/errname.h>
> #include <linux/ethtool.h>
> #include <linux/file.h>
> diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
> new file mode 100644
> index 000000000000..29ae744d6f2b
> --- /dev/null
> +++ b/rust/kernel/dma.rs
> @@ -0,0 +1,223 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! Direct memory access (DMA).
> +//!
> +//! C header: [`include/linux/dma-mapping.h`](srctree/include/linux/dma-mapping.h)
> +
> +use crate::{
> +    bindings,
> +    build_assert,
> +    device::Device,
> +    error::code::*,
> +    error::Result,
> +    types::ARef,
> +    transmute::{AsBytes, FromBytes},
> +};
> +
> +/// Possible attributes associated with a DMA mapping.
> +///
> +/// They can be combined with the operators `|`, `&`, and `!`.
> +///
> +/// Values can be used from the [`attrs`] module.
> +#[derive(Clone, Copy, PartialEq)]
> +pub struct Attribs(u32);
> +
> +impl Attribs {
> +    /// Get the raw representation of this attribute.
> +    pub(crate) fn as_raw(self) -> u64 {
> +        self.0.into()
> +    }
> +
> +    /// Check whether `flags` is contained in `self`.
> +    pub fn contains(self, flags: Attribs) -> bool {
> +        (self & flags) == flags
> +    }
> +}
> +
> +impl core::ops::BitOr for Attribs {
> +    type Output = Self;
> +    fn bitor(self, rhs: Self) -> Self::Output {
> +        Self(self.0 | rhs.0)
> +    }
> +}
> +
> +impl core::ops::BitAnd for Attribs {
> +    type Output = Self;
> +    fn bitand(self, rhs: Self) -> Self::Output {
> +        Self(self.0 & rhs.0)
> +    }
> +}
> +
> +impl core::ops::Not for Attribs {
> +    type Output = Self;
> +    fn not(self) -> Self::Output {
> +        Self(!self.0)
> +    }
> +}
> +
> +/// DMA mapping attrributes.
> +pub mod attrs {
> +    use super::Attribs;
> +
> +    /// Specifies that reads and writes to the mapping may be weakly ordered, that is that reads
> +    /// and writes may pass each other.
> +    pub const DMA_ATTR_WEAK_ORDERING: Attribs = Attribs(bindings::DMA_ATTR_WEAK_ORDERING);
> +
> +    /// Specifies that writes to the mapping may be buffered to improve performance.
> +    pub const DMA_ATTR_WRITE_COMBINE: Attribs = Attribs(bindings::DMA_ATTR_WRITE_COMBINE);
> +
> +    /// Lets the platform to avoid creating a kernel virtual mapping for the allocated buffer.
> +    pub const DMA_ATTR_NO_KERNEL_MAPPING: Attribs = Attribs(bindings::DMA_ATTR_NO_KERNEL_MAPPING);
> +
> +    /// Allows platform code to skip synchronization of the CPU cache for the given buffer assuming
> +    /// that it has been already transferred to 'device' domain.
> +    pub const DMA_ATTR_SKIP_CPU_SYNC: Attribs = Attribs(bindings::DMA_ATTR_SKIP_CPU_SYNC);
> +
> +    /// Forces contiguous allocation of the buffer in physical memory.
> +    pub const DMA_ATTR_FORCE_CONTIGUOUS: Attribs = Attribs(bindings::DMA_ATTR_FORCE_CONTIGUOUS);
> +
> +    /// This is a hint to the DMA-mapping subsystem that it's probably not worth the time to try
> +    /// to allocate memory to in a way that gives better TLB efficiency.
> +    pub const DMA_ATTR_ALLOC_SINGLE_PAGES: Attribs = Attribs(bindings::DMA_ATTR_ALLOC_SINGLE_PAGES);
> +
> +    /// This tells the DMA-mapping subsystem to suppress allocation failure reports (similarly to
> +    /// __GFP_NOWARN).
> +    pub const DMA_ATTR_NO_WARN: Attribs = Attribs(bindings::DMA_ATTR_NO_WARN);
> +
> +    /// Used to indicate that the buffer is fully accessible at an elevated privilege level (and
> +    /// ideally inaccessible or at least read-only at lesser-privileged levels).
> +    pub const DMA_ATTR_PRIVILEGED: Attribs = Attribs(bindings::DMA_ATTR_PRIVILEGED);
> +}
> +
> +/// An abstraction of the `dma_alloc_coherent` API.
> +///
> +/// This is an abstraction around the `dma_alloc_coherent` API which is used to allocate and map
> +/// large consistent DMA regions.
> +///
> +/// A [`CoherentAllocation`] instance contains a pointer to the allocated region (in the
> +/// processor's virtual address space) and the device address which can be given to the device
> +/// as the DMA address base of the region. The region is released once [`CoherentAllocation`]
> +/// is dropped.
> +///
> +/// # Invariants
> +///
> +/// For the lifetime of an instance of [`CoherentAllocation`], the cpu address is a valid pointer
> +/// to an allocated region of consistent memory and we hold a reference to the device.
> +pub struct CoherentAllocation<T: AsBytes + FromBytes> {
> +    dev: ARef<Device>,
> +    dma_handle: bindings::dma_addr_t,
> +    count: usize,
> +    cpu_addr: *mut T,
> +    dma_attrs: Attribs,
> +}
> +
> +impl<T: AsBytes + FromBytes> CoherentAllocation<T> {
> +    /// Allocates a region of `size_of::<T> * count` of consistent memory.
> +    ///
> +    /// # Examples
> +    ///
> +    /// ```
> +    /// use kernel::device::Device;
> +    /// use kernel::dma::{attrs::*, CoherentAllocation};
> +    ///
> +    /// # fn test(dev: &Device) -> Result {
> +    /// let c: CoherentAllocation<u64> = CoherentAllocation::alloc_attrs(dev, 4, GFP_KERNEL,
> +    ///                                                                  DMA_ATTR_NO_WARN)?;
> +    /// # Ok::<(), Error>(()) }
> +    /// ```
> +    pub fn alloc_attrs(
> +        dev: &Device,
> +        count: usize,
> +        gfp_flags: kernel::alloc::Flags,
> +        dma_attrs: Attribs,
> +    ) -> Result<CoherentAllocation<T>> {
> +        build_assert!(core::mem::size_of::<T>() > 0,
> +                      "It doesn't make sense for the allocated type to be a ZST");
> +
> +        let size = count.checked_mul(core::mem::size_of::<T>()).ok_or(EOVERFLOW)?;
> +        let mut dma_handle = 0;
> +        // SAFETY: device pointer is guaranteed as valid by invariant on `Device`.
> +        // We ensure that we catch the failure on this function and throw an ENOMEM
> +        let ret = unsafe {
> +            bindings::dma_alloc_attrs(
> +                dev.as_raw(),
> +                size,
> +                &mut dma_handle, gfp_flags.as_raw(),
> +                dma_attrs.as_raw(),
> +            )
> +        };
> +        if ret.is_null() {
> +            return Err(ENOMEM)
> +        }
> +        // INVARIANT: We just successfully allocated a coherent region which is accessible for
> +        // `count` elements, hence the cpu address is valid. We also hold a refcounted reference
> +        // to the device.
> +        Ok(Self {
> +            dev: dev.into(),
> +            dma_handle,
> +            count,
> +            cpu_addr: ret as *mut T,
> +            dma_attrs,
> +        })
> +    }
> +
> +    /// Performs the same functionality as `alloc_attrs`, except the `dma_attrs` is 0 by default.
> +    pub fn alloc_coherent(dev: &Device,
> +                          count: usize,
> +                          gfp_flags: kernel::alloc::Flags) -> Result<CoherentAllocation<T>> {
> +        CoherentAllocation::alloc_attrs(dev, count, gfp_flags, Attribs(0))
> +    }
> +
> +    /// Returns the base address to the allocated region and the dma handle. The caller takes
> +    /// ownership of the returned resources.
> +    pub fn into_parts(self) -> (usize, bindings::dma_addr_t) {
> +        let ret = (self.cpu_addr as _, self.dma_handle);
> +        core::mem::forget(self);
> +        ret
> +    }
> +
> +    /// Returns the base address to the allocated region in the CPU's virtual address space.
> +    pub fn start_ptr(&self) -> *const T {
> +        self.cpu_addr as _
> +    }
> +
> +    /// Returns the base address to the allocated region in the CPU's virtual address space as
> +    /// a mutable pointer.
> +    pub fn start_ptr_mut(&mut self) -> *mut T {
> +        self.cpu_addr
> +    }
> +
> +    /// Returns a DMA handle which may given to the device as the DMA address base of
> +    /// the region.
> +    pub fn dma_handle(&self) -> bindings::dma_addr_t {
> +        self.dma_handle
> +    }
> +
> +    /// Returns the CPU-addressable region as a slice.
> +    pub fn cpu_buf(&self) -> &[T]
> +    {
> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
> +        // is valid for reads for `self.count * size_of::<T>` bytes.
> +        unsafe { core::slice::from_raw_parts(self.cpu_addr, self.count) }
> +    }
> +
> +    /// Performs the same functionality as `cpu_buf`, except that a mutable slice is returned.
> +    pub fn cpu_buf_mut(&mut self) -> &mut [T]
> +    {
> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
> +        // is valid for reads for `self.count * size_of::<T>` bytes.
> +        unsafe { core::slice::from_raw_parts_mut(self.cpu_addr, self.count) }
> +    }
> +}
> +
> +impl<T: AsBytes + FromBytes> Drop for CoherentAllocation<T> {
> +    fn drop(&mut self) {
> +        let size = self.count * core::mem::size_of::<T>();
> +        // SAFETY: the device, cpu address, and the dma handle is valid due to the
> +        // type invariants on `CoherentAllocation`.
> +        unsafe { bindings::dma_free_attrs(self.dev.as_raw(), size,
> +                                          self.cpu_addr as _,
> +                                          self.dma_handle,
> +                                          self.dma_attrs.as_raw(),) }
> +    }
> +}
> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> index e1065a7551a3..6e90ebf5a130 100644
> --- a/rust/kernel/lib.rs
> +++ b/rust/kernel/lib.rs
> @@ -35,6 +35,7 @@
> mod build_assert;
> pub mod cred;
> pub mod device;
> +pub mod dma;
> pub mod error;
> #[cfg(CONFIG_RUST_FW_LOADER_ABSTRACTIONS)]
> pub mod firmware;
> -- 
> 2.43.0
> 


This still looks good to me.

— Daniel


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v7 2/2] rust: add dma coherent allocator abstraction.
  2024-12-10 22:14 ` [PATCH v7 2/2] rust: add dma coherent allocator abstraction Abdiel Janulgue
  2024-12-13  8:53   ` Abdiel Janulgue
  2024-12-13 14:12   ` Daniel Almeida
@ 2024-12-13 14:27   ` Alice Ryhl
  2024-12-13 14:47     ` Daniel Almeida
  2024-12-16 10:28     ` Abdiel Janulgue
  2 siblings, 2 replies; 15+ messages in thread
From: Alice Ryhl @ 2024-12-13 14:27 UTC (permalink / raw)
  To: Abdiel Janulgue
  Cc: rust-for-linux, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Trevor Gross, Danilo Krummrich, Valentin Obst, open list,
	Christoph Hellwig, Marek Szyprowski, Robin Murphy, airlied,
	open list:DMA MAPPING HELPERS, Daniel Almeida

On Tue, Dec 10, 2024 at 11:16 PM Abdiel Janulgue
<abdiel.janulgue@gmail.com> wrote:
>
> Add a simple dma coherent allocator rust abstraction. Based on
> Andreas Hindborg's dma abstractions from the rnvme driver, which
> was also based on earlier work by Wedson Almeida Filho.
>
> Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
> Tested-by: Daniel Almeida <daniel.almeida@collabora.com>
> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@gmail.com>
> ---
>  rust/bindings/bindings_helper.h |   1 +
>  rust/kernel/dma.rs              | 223 ++++++++++++++++++++++++++++++++
>  rust/kernel/lib.rs              |   1 +
>  3 files changed, 225 insertions(+)
>  create mode 100644 rust/kernel/dma.rs
>
> diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
> index 5c4dfe22f41a..49bf713b9bb6 100644
> --- a/rust/bindings/bindings_helper.h
> +++ b/rust/bindings/bindings_helper.h
> @@ -11,6 +11,7 @@
>  #include <linux/blk_types.h>
>  #include <linux/blkdev.h>
>  #include <linux/cred.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/errname.h>
>  #include <linux/ethtool.h>
>  #include <linux/file.h>
> diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
> new file mode 100644
> index 000000000000..29ae744d6f2b
> --- /dev/null
> +++ b/rust/kernel/dma.rs
> @@ -0,0 +1,223 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! Direct memory access (DMA).
> +//!
> +//! C header: [`include/linux/dma-mapping.h`](srctree/include/linux/dma-mapping.h)
> +
> +use crate::{
> +    bindings,
> +    build_assert,
> +    device::Device,
> +    error::code::*,
> +    error::Result,
> +    types::ARef,
> +    transmute::{AsBytes, FromBytes},
> +};
> +
> +/// Possible attributes associated with a DMA mapping.
> +///
> +/// They can be combined with the operators `|`, `&`, and `!`.
> +///
> +/// Values can be used from the [`attrs`] module.
> +#[derive(Clone, Copy, PartialEq)]
> +pub struct Attribs(u32);
> +
> +impl Attribs {
> +    /// Get the raw representation of this attribute.
> +    pub(crate) fn as_raw(self) -> u64 {
> +        self.0.into()
> +    }
> +
> +    /// Check whether `flags` is contained in `self`.
> +    pub fn contains(self, flags: Attribs) -> bool {
> +        (self & flags) == flags
> +    }
> +}
> +
> +impl core::ops::BitOr for Attribs {
> +    type Output = Self;
> +    fn bitor(self, rhs: Self) -> Self::Output {
> +        Self(self.0 | rhs.0)
> +    }
> +}
> +
> +impl core::ops::BitAnd for Attribs {
> +    type Output = Self;
> +    fn bitand(self, rhs: Self) -> Self::Output {
> +        Self(self.0 & rhs.0)
> +    }
> +}
> +
> +impl core::ops::Not for Attribs {
> +    type Output = Self;
> +    fn not(self) -> Self::Output {
> +        Self(!self.0)
> +    }
> +}
> +
> +/// DMA mapping attrributes.
> +pub mod attrs {
> +    use super::Attribs;
> +
> +    /// Specifies that reads and writes to the mapping may be weakly ordered, that is that reads
> +    /// and writes may pass each other.
> +    pub const DMA_ATTR_WEAK_ORDERING: Attribs = Attribs(bindings::DMA_ATTR_WEAK_ORDERING);
> +
> +    /// Specifies that writes to the mapping may be buffered to improve performance.
> +    pub const DMA_ATTR_WRITE_COMBINE: Attribs = Attribs(bindings::DMA_ATTR_WRITE_COMBINE);
> +
> +    /// Lets the platform to avoid creating a kernel virtual mapping for the allocated buffer.
> +    pub const DMA_ATTR_NO_KERNEL_MAPPING: Attribs = Attribs(bindings::DMA_ATTR_NO_KERNEL_MAPPING);
> +
> +    /// Allows platform code to skip synchronization of the CPU cache for the given buffer assuming
> +    /// that it has been already transferred to 'device' domain.
> +    pub const DMA_ATTR_SKIP_CPU_SYNC: Attribs = Attribs(bindings::DMA_ATTR_SKIP_CPU_SYNC);
> +
> +    /// Forces contiguous allocation of the buffer in physical memory.
> +    pub const DMA_ATTR_FORCE_CONTIGUOUS: Attribs = Attribs(bindings::DMA_ATTR_FORCE_CONTIGUOUS);
> +
> +    /// This is a hint to the DMA-mapping subsystem that it's probably not worth the time to try
> +    /// to allocate memory to in a way that gives better TLB efficiency.
> +    pub const DMA_ATTR_ALLOC_SINGLE_PAGES: Attribs = Attribs(bindings::DMA_ATTR_ALLOC_SINGLE_PAGES);
> +
> +    /// This tells the DMA-mapping subsystem to suppress allocation failure reports (similarly to
> +    /// __GFP_NOWARN).
> +    pub const DMA_ATTR_NO_WARN: Attribs = Attribs(bindings::DMA_ATTR_NO_WARN);
> +
> +    /// Used to indicate that the buffer is fully accessible at an elevated privilege level (and
> +    /// ideally inaccessible or at least read-only at lesser-privileged levels).
> +    pub const DMA_ATTR_PRIVILEGED: Attribs = Attribs(bindings::DMA_ATTR_PRIVILEGED);
> +}
> +
> +/// An abstraction of the `dma_alloc_coherent` API.
> +///
> +/// This is an abstraction around the `dma_alloc_coherent` API which is used to allocate and map
> +/// large consistent DMA regions.
> +///
> +/// A [`CoherentAllocation`] instance contains a pointer to the allocated region (in the
> +/// processor's virtual address space) and the device address which can be given to the device
> +/// as the DMA address base of the region. The region is released once [`CoherentAllocation`]
> +/// is dropped.
> +///
> +/// # Invariants
> +///
> +/// For the lifetime of an instance of [`CoherentAllocation`], the cpu address is a valid pointer
> +/// to an allocated region of consistent memory and we hold a reference to the device.
> +pub struct CoherentAllocation<T: AsBytes + FromBytes> {
> +    dev: ARef<Device>,
> +    dma_handle: bindings::dma_addr_t,
> +    count: usize,
> +    cpu_addr: *mut T,
> +    dma_attrs: Attribs,
> +}
> +
> +impl<T: AsBytes + FromBytes> CoherentAllocation<T> {
> +    /// Allocates a region of `size_of::<T> * count` of consistent memory.
> +    ///
> +    /// # Examples
> +    ///
> +    /// ```
> +    /// use kernel::device::Device;
> +    /// use kernel::dma::{attrs::*, CoherentAllocation};
> +    ///
> +    /// # fn test(dev: &Device) -> Result {
> +    /// let c: CoherentAllocation<u64> = CoherentAllocation::alloc_attrs(dev, 4, GFP_KERNEL,
> +    ///                                                                  DMA_ATTR_NO_WARN)?;
> +    /// # Ok::<(), Error>(()) }
> +    /// ```
> +    pub fn alloc_attrs(
> +        dev: &Device,
> +        count: usize,
> +        gfp_flags: kernel::alloc::Flags,
> +        dma_attrs: Attribs,
> +    ) -> Result<CoherentAllocation<T>> {
> +        build_assert!(core::mem::size_of::<T>() > 0,
> +                      "It doesn't make sense for the allocated type to be a ZST");
> +
> +        let size = count.checked_mul(core::mem::size_of::<T>()).ok_or(EOVERFLOW)?;
> +        let mut dma_handle = 0;
> +        // SAFETY: device pointer is guaranteed as valid by invariant on `Device`.
> +        // We ensure that we catch the failure on this function and throw an ENOMEM
> +        let ret = unsafe {
> +            bindings::dma_alloc_attrs(
> +                dev.as_raw(),
> +                size,
> +                &mut dma_handle, gfp_flags.as_raw(),
> +                dma_attrs.as_raw(),
> +            )
> +        };
> +        if ret.is_null() {
> +            return Err(ENOMEM)
> +        }
> +        // INVARIANT: We just successfully allocated a coherent region which is accessible for
> +        // `count` elements, hence the cpu address is valid. We also hold a refcounted reference
> +        // to the device.
> +        Ok(Self {
> +            dev: dev.into(),
> +            dma_handle,
> +            count,
> +            cpu_addr: ret as *mut T,
> +            dma_attrs,
> +        })
> +    }
> +
> +    /// Performs the same functionality as `alloc_attrs`, except the `dma_attrs` is 0 by default.
> +    pub fn alloc_coherent(dev: &Device,
> +                          count: usize,
> +                          gfp_flags: kernel::alloc::Flags) -> Result<CoherentAllocation<T>> {
> +        CoherentAllocation::alloc_attrs(dev, count, gfp_flags, Attribs(0))
> +    }

Please run rustfmt.

> +    /// Returns the base address to the allocated region and the dma handle. The caller takes
> +    /// ownership of the returned resources.
> +    pub fn into_parts(self) -> (usize, bindings::dma_addr_t) {
> +        let ret = (self.cpu_addr as _, self.dma_handle);
> +        core::mem::forget(self);
> +        ret
> +    }

Not only does this skip the destructor of `dma_free_attrs`. It also
skips the destructor of fields including the `dev` field. Did you
intend to leave the refcount on `struct device` without decrementing
it?

> +    /// Returns the base address to the allocated region in the CPU's virtual address space.
> +    pub fn start_ptr(&self) -> *const T {
> +        self.cpu_addr as _

This conversion happens without needing a cast.

> +    }
> +
> +    /// Returns the base address to the allocated region in the CPU's virtual address space as
> +    /// a mutable pointer.
> +    pub fn start_ptr_mut(&mut self) -> *mut T {
> +        self.cpu_addr
> +    }
> +
> +    /// Returns a DMA handle which may given to the device as the DMA address base of
> +    /// the region.
> +    pub fn dma_handle(&self) -> bindings::dma_addr_t {
> +        self.dma_handle
> +    }
> +
> +    /// Returns the CPU-addressable region as a slice.
> +    pub fn cpu_buf(&self) -> &[T]
> +    {
> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
> +        // is valid for reads for `self.count * size_of::<T>` bytes.
> +        unsafe { core::slice::from_raw_parts(self.cpu_addr, self.count) }

Immutable slices require that the data does not change while the
reference is live. Is that the case? If so, your safety comment should
explain that.

> +    }
> +
> +    /// Performs the same functionality as `cpu_buf`, except that a mutable slice is returned.
> +    pub fn cpu_buf_mut(&mut self) -> &mut [T]
> +    {
> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
> +        // is valid for reads for `self.count * size_of::<T>` bytes.
> +        unsafe { core::slice::from_raw_parts_mut(self.cpu_addr, self.count) }

Mutable slices require that the data is not written to *or read* by
anybody else while the reference is live. Is that the case? If so,
your safety comment should explain that.

> +    }
> +}
> +
> +impl<T: AsBytes + FromBytes> Drop for CoherentAllocation<T> {
> +    fn drop(&mut self) {
> +        let size = self.count * core::mem::size_of::<T>();
> +        // SAFETY: the device, cpu address, and the dma handle is valid due to the
> +        // type invariants on `CoherentAllocation`.
> +        unsafe { bindings::dma_free_attrs(self.dev.as_raw(), size,
> +                                          self.cpu_addr as _,
> +                                          self.dma_handle,
> +                                          self.dma_attrs.as_raw(),) }
> +    }
> +}
> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> index e1065a7551a3..6e90ebf5a130 100644
> --- a/rust/kernel/lib.rs
> +++ b/rust/kernel/lib.rs
> @@ -35,6 +35,7 @@
>  mod build_assert;
>  pub mod cred;
>  pub mod device;
> +pub mod dma;
>  pub mod error;
>  #[cfg(CONFIG_RUST_FW_LOADER_ABSTRACTIONS)]
>  pub mod firmware;
> --
> 2.43.0
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v7 2/2] rust: add dma coherent allocator abstraction.
  2024-12-13 14:27   ` Alice Ryhl
@ 2024-12-13 14:47     ` Daniel Almeida
  2024-12-13 15:28       ` Robin Murphy
  2024-12-16 10:28     ` Abdiel Janulgue
  1 sibling, 1 reply; 15+ messages in thread
From: Daniel Almeida @ 2024-12-13 14:47 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Abdiel Janulgue, rust-for-linux, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Trevor Gross, Danilo Krummrich, Valentin Obst,
	open list, Christoph Hellwig, Marek Szyprowski, Robin Murphy,
	airlied, open list:DMA MAPPING HELPERS



> On 13 Dec 2024, at 11:27, Alice Ryhl <aliceryhl@google.com> wrote:
> 
> On Tue, Dec 10, 2024 at 11:16 PM Abdiel Janulgue
> <abdiel.janulgue@gmail.com> wrote:
>> 
>> Add a simple dma coherent allocator rust abstraction. Based on
>> Andreas Hindborg's dma abstractions from the rnvme driver, which
>> was also based on earlier work by Wedson Almeida Filho.
>> 
>> Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
>> Tested-by: Daniel Almeida <daniel.almeida@collabora.com>
>> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@gmail.com>
>> ---
>> rust/bindings/bindings_helper.h |   1 +
>> rust/kernel/dma.rs              | 223 ++++++++++++++++++++++++++++++++
>> rust/kernel/lib.rs              |   1 +
>> 3 files changed, 225 insertions(+)
>> create mode 100644 rust/kernel/dma.rs
>> 
>> diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
>> index 5c4dfe22f41a..49bf713b9bb6 100644
>> --- a/rust/bindings/bindings_helper.h
>> +++ b/rust/bindings/bindings_helper.h
>> @@ -11,6 +11,7 @@
>> #include <linux/blk_types.h>
>> #include <linux/blkdev.h>
>> #include <linux/cred.h>
>> +#include <linux/dma-mapping.h>
>> #include <linux/errname.h>
>> #include <linux/ethtool.h>
>> #include <linux/file.h>
>> diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
>> new file mode 100644
>> index 000000000000..29ae744d6f2b
>> --- /dev/null
>> +++ b/rust/kernel/dma.rs
>> @@ -0,0 +1,223 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +
>> +//! Direct memory access (DMA).
>> +//!
>> +//! C header: [`include/linux/dma-mapping.h`](srctree/include/linux/dma-mapping.h)
>> +
>> +use crate::{
>> +    bindings,
>> +    build_assert,
>> +    device::Device,
>> +    error::code::*,
>> +    error::Result,
>> +    types::ARef,
>> +    transmute::{AsBytes, FromBytes},
>> +};
>> +
>> +/// Possible attributes associated with a DMA mapping.
>> +///
>> +/// They can be combined with the operators `|`, `&`, and `!`.
>> +///
>> +/// Values can be used from the [`attrs`] module.
>> +#[derive(Clone, Copy, PartialEq)]
>> +pub struct Attribs(u32);
>> +
>> +impl Attribs {
>> +    /// Get the raw representation of this attribute.
>> +    pub(crate) fn as_raw(self) -> u64 {
>> +        self.0.into()
>> +    }
>> +
>> +    /// Check whether `flags` is contained in `self`.
>> +    pub fn contains(self, flags: Attribs) -> bool {
>> +        (self & flags) == flags
>> +    }
>> +}
>> +
>> +impl core::ops::BitOr for Attribs {
>> +    type Output = Self;
>> +    fn bitor(self, rhs: Self) -> Self::Output {
>> +        Self(self.0 | rhs.0)
>> +    }
>> +}
>> +
>> +impl core::ops::BitAnd for Attribs {
>> +    type Output = Self;
>> +    fn bitand(self, rhs: Self) -> Self::Output {
>> +        Self(self.0 & rhs.0)
>> +    }
>> +}
>> +
>> +impl core::ops::Not for Attribs {
>> +    type Output = Self;
>> +    fn not(self) -> Self::Output {
>> +        Self(!self.0)
>> +    }
>> +}
>> +
>> +/// DMA mapping attrributes.
>> +pub mod attrs {
>> +    use super::Attribs;
>> +
>> +    /// Specifies that reads and writes to the mapping may be weakly ordered, that is that reads
>> +    /// and writes may pass each other.
>> +    pub const DMA_ATTR_WEAK_ORDERING: Attribs = Attribs(bindings::DMA_ATTR_WEAK_ORDERING);
>> +
>> +    /// Specifies that writes to the mapping may be buffered to improve performance.
>> +    pub const DMA_ATTR_WRITE_COMBINE: Attribs = Attribs(bindings::DMA_ATTR_WRITE_COMBINE);
>> +
>> +    /// Lets the platform to avoid creating a kernel virtual mapping for the allocated buffer.
>> +    pub const DMA_ATTR_NO_KERNEL_MAPPING: Attribs = Attribs(bindings::DMA_ATTR_NO_KERNEL_MAPPING);
>> +
>> +    /// Allows platform code to skip synchronization of the CPU cache for the given buffer assuming
>> +    /// that it has been already transferred to 'device' domain.
>> +    pub const DMA_ATTR_SKIP_CPU_SYNC: Attribs = Attribs(bindings::DMA_ATTR_SKIP_CPU_SYNC);
>> +
>> +    /// Forces contiguous allocation of the buffer in physical memory.
>> +    pub const DMA_ATTR_FORCE_CONTIGUOUS: Attribs = Attribs(bindings::DMA_ATTR_FORCE_CONTIGUOUS);
>> +
>> +    /// This is a hint to the DMA-mapping subsystem that it's probably not worth the time to try
>> +    /// to allocate memory to in a way that gives better TLB efficiency.
>> +    pub const DMA_ATTR_ALLOC_SINGLE_PAGES: Attribs = Attribs(bindings::DMA_ATTR_ALLOC_SINGLE_PAGES);
>> +
>> +    /// This tells the DMA-mapping subsystem to suppress allocation failure reports (similarly to
>> +    /// __GFP_NOWARN).
>> +    pub const DMA_ATTR_NO_WARN: Attribs = Attribs(bindings::DMA_ATTR_NO_WARN);
>> +
>> +    /// Used to indicate that the buffer is fully accessible at an elevated privilege level (and
>> +    /// ideally inaccessible or at least read-only at lesser-privileged levels).
>> +    pub const DMA_ATTR_PRIVILEGED: Attribs = Attribs(bindings::DMA_ATTR_PRIVILEGED);
>> +}
>> +
>> +/// An abstraction of the `dma_alloc_coherent` API.
>> +///
>> +/// This is an abstraction around the `dma_alloc_coherent` API which is used to allocate and map
>> +/// large consistent DMA regions.
>> +///
>> +/// A [`CoherentAllocation`] instance contains a pointer to the allocated region (in the
>> +/// processor's virtual address space) and the device address which can be given to the device
>> +/// as the DMA address base of the region. The region is released once [`CoherentAllocation`]
>> +/// is dropped.
>> +///
>> +/// # Invariants
>> +///
>> +/// For the lifetime of an instance of [`CoherentAllocation`], the cpu address is a valid pointer
>> +/// to an allocated region of consistent memory and we hold a reference to the device.
>> +pub struct CoherentAllocation<T: AsBytes + FromBytes> {
>> +    dev: ARef<Device>,
>> +    dma_handle: bindings::dma_addr_t,
>> +    count: usize,
>> +    cpu_addr: *mut T,
>> +    dma_attrs: Attribs,
>> +}
>> +
>> +impl<T: AsBytes + FromBytes> CoherentAllocation<T> {
>> +    /// Allocates a region of `size_of::<T> * count` of consistent memory.
>> +    ///
>> +    /// # Examples
>> +    ///
>> +    /// ```
>> +    /// use kernel::device::Device;
>> +    /// use kernel::dma::{attrs::*, CoherentAllocation};
>> +    ///
>> +    /// # fn test(dev: &Device) -> Result {
>> +    /// let c: CoherentAllocation<u64> = CoherentAllocation::alloc_attrs(dev, 4, GFP_KERNEL,
>> +    ///                                                                  DMA_ATTR_NO_WARN)?;
>> +    /// # Ok::<(), Error>(()) }
>> +    /// ```
>> +    pub fn alloc_attrs(
>> +        dev: &Device,
>> +        count: usize,
>> +        gfp_flags: kernel::alloc::Flags,
>> +        dma_attrs: Attribs,
>> +    ) -> Result<CoherentAllocation<T>> {
>> +        build_assert!(core::mem::size_of::<T>() > 0,
>> +                      "It doesn't make sense for the allocated type to be a ZST");
>> +
>> +        let size = count.checked_mul(core::mem::size_of::<T>()).ok_or(EOVERFLOW)?;
>> +        let mut dma_handle = 0;
>> +        // SAFETY: device pointer is guaranteed as valid by invariant on `Device`.
>> +        // We ensure that we catch the failure on this function and throw an ENOMEM
>> +        let ret = unsafe {
>> +            bindings::dma_alloc_attrs(
>> +                dev.as_raw(),
>> +                size,
>> +                &mut dma_handle, gfp_flags.as_raw(),
>> +                dma_attrs.as_raw(),
>> +            )
>> +        };
>> +        if ret.is_null() {
>> +            return Err(ENOMEM)
>> +        }
>> +        // INVARIANT: We just successfully allocated a coherent region which is accessible for
>> +        // `count` elements, hence the cpu address is valid. We also hold a refcounted reference
>> +        // to the device.
>> +        Ok(Self {
>> +            dev: dev.into(),
>> +            dma_handle,
>> +            count,
>> +            cpu_addr: ret as *mut T,
>> +            dma_attrs,
>> +        })
>> +    }
>> +
>> +    /// Performs the same functionality as `alloc_attrs`, except the `dma_attrs` is 0 by default.
>> +    pub fn alloc_coherent(dev: &Device,
>> +                          count: usize,
>> +                          gfp_flags: kernel::alloc::Flags) -> Result<CoherentAllocation<T>> {
>> +        CoherentAllocation::alloc_attrs(dev, count, gfp_flags, Attribs(0))
>> +    }
> 
> Please run rustfmt.
> 
>> +    /// Returns the base address to the allocated region and the dma handle. The caller takes
>> +    /// ownership of the returned resources.
>> +    pub fn into_parts(self) -> (usize, bindings::dma_addr_t) {
>> +        let ret = (self.cpu_addr as _, self.dma_handle);
>> +        core::mem::forget(self);
>> +        ret
>> +    }
> 
> Not only does this skip the destructor of `dma_free_attrs`. It also
> skips the destructor of fields including the `dev` field. Did you
> intend to leave the refcount on `struct device` without decrementing
> it?
> 
>> +    /// Returns the base address to the allocated region in the CPU's virtual address space.
>> +    pub fn start_ptr(&self) -> *const T {
>> +        self.cpu_addr as _
> 
> This conversion happens without needing a cast.
> 
>> +    }
>> +
>> +    /// Returns the base address to the allocated region in the CPU's virtual address space as
>> +    /// a mutable pointer.
>> +    pub fn start_ptr_mut(&mut self) -> *mut T {
>> +        self.cpu_addr
>> +    }
>> +
>> +    /// Returns a DMA handle which may given to the device as the DMA address base of
>> +    /// the region.
>> +    pub fn dma_handle(&self) -> bindings::dma_addr_t {
>> +        self.dma_handle
>> +    }
>> +
>> +    /// Returns the CPU-addressable region as a slice.
>> +    pub fn cpu_buf(&self) -> &[T]
>> +    {
>> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
>> +        // is valid for reads for `self.count * size_of::<T>` bytes.
>> +        unsafe { core::slice::from_raw_parts(self.cpu_addr, self.count) }
> 
> Immutable slices require that the data does not change while the
> reference is live. Is that the case? If so, your safety comment should
> explain that.
> 
>> +    }
>> +
>> +    /// Performs the same functionality as `cpu_buf`, except that a mutable slice is returned.
>> +    pub fn cpu_buf_mut(&mut self) -> &mut [T]
>> +    {
>> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
>> +        // is valid for reads for `self.count * size_of::<T>` bytes.
>> +        unsafe { core::slice::from_raw_parts_mut(self.cpu_addr, self.count) }
> 
> Mutable slices require that the data is not written to *or read* by
> anybody else while the reference is live. Is that the case? If so,
> your safety comment should explain that.
> 

The buffer will probably be shared between the CPU and some hardware device, since this is the
point of the dma mapping API.

It’s up to the caller to ensure that no hardware operations that involve the buffer are currently taking
place while the slices above are alive.

I think that adding that as a safety requirement to `cpu_buf` and `cpu_buf_mut` will be sufficient.

>> +    }
>> +}
>> +
>> +impl<T: AsBytes + FromBytes> Drop for CoherentAllocation<T> {
>> +    fn drop(&mut self) {
>> +        let size = self.count * core::mem::size_of::<T>();
>> +        // SAFETY: the device, cpu address, and the dma handle is valid due to the
>> +        // type invariants on `CoherentAllocation`.
>> +        unsafe { bindings::dma_free_attrs(self.dev.as_raw(), size,
>> +                                          self.cpu_addr as _,
>> +                                          self.dma_handle,
>> +                                          self.dma_attrs.as_raw(),) }
>> +    }
>> +}
>> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
>> index e1065a7551a3..6e90ebf5a130 100644
>> --- a/rust/kernel/lib.rs
>> +++ b/rust/kernel/lib.rs
>> @@ -35,6 +35,7 @@
>> mod build_assert;
>> pub mod cred;
>> pub mod device;
>> +pub mod dma;
>> pub mod error;
>> #[cfg(CONFIG_RUST_FW_LOADER_ABSTRACTIONS)]
>> pub mod firmware;
>> —
>> 2.43.0

— Daniel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v7 2/2] rust: add dma coherent allocator abstraction.
  2024-12-13 14:47     ` Daniel Almeida
@ 2024-12-13 15:28       ` Robin Murphy
  2024-12-13 19:08         ` Daniel Almeida
  0 siblings, 1 reply; 15+ messages in thread
From: Robin Murphy @ 2024-12-13 15:28 UTC (permalink / raw)
  To: Daniel Almeida, Alice Ryhl
  Cc: Abdiel Janulgue, rust-for-linux, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Trevor Gross, Danilo Krummrich, Valentin Obst,
	open list, Christoph Hellwig, Marek Szyprowski, airlied,
	open list:DMA MAPPING HELPERS

On 13/12/2024 2:47 pm, Daniel Almeida wrote:
[...]
>>> +    /// Returns the CPU-addressable region as a slice.
>>> +    pub fn cpu_buf(&self) -> &[T]
>>> +    {
>>> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
>>> +        // is valid for reads for `self.count * size_of::<T>` bytes.
>>> +        unsafe { core::slice::from_raw_parts(self.cpu_addr, self.count) }
>>
>> Immutable slices require that the data does not change while the
>> reference is live. Is that the case? If so, your safety comment should
>> explain that.
>>
>>> +    }
>>> +
>>> +    /// Performs the same functionality as `cpu_buf`, except that a mutable slice is returned.
>>> +    pub fn cpu_buf_mut(&mut self) -> &mut [T]
>>> +    {
>>> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
>>> +        // is valid for reads for `self.count * size_of::<T>` bytes.
>>> +        unsafe { core::slice::from_raw_parts_mut(self.cpu_addr, self.count) }
>>
>> Mutable slices require that the data is not written to *or read* by
>> anybody else while the reference is live. Is that the case? If so,
>> your safety comment should explain that.
>>
> 
> The buffer will probably be shared between the CPU and some hardware device, since this is the
> point of the dma mapping API.
> 
> It’s up to the caller to ensure that no hardware operations that involve the buffer are currently taking
> place while the slices above are alive.

Hmm, that sounds troublesome... the nature of coherent allocations is 
that both CPU and device may access them at any time, and you can 
definitely expect ringbuffer-style usage models where a CPU is writing 
to part of the buffer while the device is reading/writing another part, 
but also cases where a CPU needs to poll for a device write to a 
particular location.

Thanks,
Robin.

> I think that adding that as a safety requirement to `cpu_buf` and `cpu_buf_mut` will be sufficient.
> 
>>> +    }
>>> +}
>>> +
>>> +impl<T: AsBytes + FromBytes> Drop for CoherentAllocation<T> {
>>> +    fn drop(&mut self) {
>>> +        let size = self.count * core::mem::size_of::<T>();
>>> +        // SAFETY: the device, cpu address, and the dma handle is valid due to the
>>> +        // type invariants on `CoherentAllocation`.
>>> +        unsafe { bindings::dma_free_attrs(self.dev.as_raw(), size,
>>> +                                          self.cpu_addr as _,
>>> +                                          self.dma_handle,
>>> +                                          self.dma_attrs.as_raw(),) }
>>> +    }
>>> +}
>>> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
>>> index e1065a7551a3..6e90ebf5a130 100644
>>> --- a/rust/kernel/lib.rs
>>> +++ b/rust/kernel/lib.rs
>>> @@ -35,6 +35,7 @@
>>> mod build_assert;
>>> pub mod cred;
>>> pub mod device;
>>> +pub mod dma;
>>> pub mod error;
>>> #[cfg(CONFIG_RUST_FW_LOADER_ABSTRACTIONS)]
>>> pub mod firmware;
>>> —
>>> 2.43.0
> 
> — Daniel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v7 2/2] rust: add dma coherent allocator abstraction.
  2024-12-13 15:28       ` Robin Murphy
@ 2024-12-13 19:08         ` Daniel Almeida
  2024-12-16 10:23           ` Abdiel Janulgue
  0 siblings, 1 reply; 15+ messages in thread
From: Daniel Almeida @ 2024-12-13 19:08 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Alice Ryhl, Abdiel Janulgue, rust-for-linux, Miguel Ojeda,
	Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Trevor Gross, Danilo Krummrich,
	Valentin Obst, open list, Christoph Hellwig, Marek Szyprowski,
	airlied, open list:DMA MAPPING HELPERS

Hi Robin,

> On 13 Dec 2024, at 12:28, Robin Murphy <robin.murphy@arm.com> wrote:
> 
> On 13/12/2024 2:47 pm, Daniel Almeida wrote:
> [...]
>>>> +    /// Returns the CPU-addressable region as a slice.
>>>> +    pub fn cpu_buf(&self) -> &[T]
>>>> +    {
>>>> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
>>>> +        // is valid for reads for `self.count * size_of::<T>` bytes.
>>>> +        unsafe { core::slice::from_raw_parts(self.cpu_addr, self.count) }
>>> 
>>> Immutable slices require that the data does not change while the
>>> reference is live. Is that the case? If so, your safety comment should
>>> explain that.
>>> 
>>>> +    }
>>>> +
>>>> +    /// Performs the same functionality as `cpu_buf`, except that a mutable slice is returned.
>>>> +    pub fn cpu_buf_mut(&mut self) -> &mut [T]
>>>> +    {
>>>> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
>>>> +        // is valid for reads for `self.count * size_of::<T>` bytes.
>>>> +        unsafe { core::slice::from_raw_parts_mut(self.cpu_addr, self.count) }
>>> 
>>> Mutable slices require that the data is not written to *or read* by
>>> anybody else while the reference is live. Is that the case? If so,
>>> your safety comment should explain that.
>>> 
>> The buffer will probably be shared between the CPU and some hardware device, since this is the
>> point of the dma mapping API.
>> It’s up to the caller to ensure that no hardware operations that involve the buffer are currently taking
>> place while the slices above are alive.
> 
> Hmm, that sounds troublesome... the nature of coherent allocations is that both CPU and device may access them at any time, and you can definitely expect ringbuffer-style usage models where a CPU is writing to part of the buffer while the device is reading/writing another part, but also cases where a CPU needs to poll for a device write to a particular location.
> 

Ok, I had based my answer on some other drivers I’ve worked on in the past where the approach I cited would work.

I can see it not working for what you described, though.

This is a bit unfortunate, because it means we are back to square one, i.e.: back to read() and write() functions and
to the bound on `Copy`. That’s because, no matter how you try to dress this, there is no way to give safe and direct access
to the underlying memory if you can’t avoid situations where both the CPU and the device will be accessing the memory
at the same time.

I guess the only improvement that could be made over the approach used for v2 is to at least use copy_nonoverlapping
instead, because I assume the performance of things like:

+ /// Reads a value on a location specified by index.
+ pub fn read(&self, index: usize) -> Result<T>
+ where
+ T: Copy
+ {
+ if let Some(val) = self.cpu_buf().get(index) {
+ Ok(*val)
+ } else {
+ Err(EINVAL)
+ }
+ }

will be atrocious.


> Thanks,
> Robin.
> 
>> I think that adding that as a safety requirement to `cpu_buf` and `cpu_buf_mut` will be sufficient.
>>>> +    }
>>>> +}
>>>> +
>>>> +impl<T: AsBytes + FromBytes> Drop for CoherentAllocation<T> {
>>>> +    fn drop(&mut self) {
>>>> +        let size = self.count * core::mem::size_of::<T>();
>>>> +        // SAFETY: the device, cpu address, and the dma handle is valid due to the
>>>> +        // type invariants on `CoherentAllocation`.
>>>> +        unsafe { bindings::dma_free_attrs(self.dev.as_raw(), size,
>>>> +                                          self.cpu_addr as _,
>>>> +                                          self.dma_handle,
>>>> +                                          self.dma_attrs.as_raw(),) }
>>>> +    }
>>>> +}
>>>> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
>>>> index e1065a7551a3..6e90ebf5a130 100644
>>>> --- a/rust/kernel/lib.rs
>>>> +++ b/rust/kernel/lib.rs
>>>> @@ -35,6 +35,7 @@
>>>> mod build_assert;
>>>> pub mod cred;
>>>> pub mod device;
>>>> +pub mod dma;
>>>> pub mod error;
>>>> #[cfg(CONFIG_RUST_FW_LOADER_ABSTRACTIONS)]
>>>> pub mod firmware;
>>>> —
>>>> 2.43.0
>> — Daniel



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v7 2/2] rust: add dma coherent allocator abstraction.
  2024-12-13 19:08         ` Daniel Almeida
@ 2024-12-16 10:23           ` Abdiel Janulgue
  2024-12-16 11:00             ` Daniel Almeida
  0 siblings, 1 reply; 15+ messages in thread
From: Abdiel Janulgue @ 2024-12-16 10:23 UTC (permalink / raw)
  To: Daniel Almeida, Robin Murphy, Alice Ryhl
  Cc: rust-for-linux, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Trevor Gross, Danilo Krummrich, Valentin Obst, open list,
	Christoph Hellwig, Marek Szyprowski, airlied,
	open list:DMA MAPPING HELPERS



On 13/12/2024 21:08, Daniel Almeida wrote:
> Hi Robin,
> 
>> On 13 Dec 2024, at 12:28, Robin Murphy <robin.murphy@arm.com> wrote:
>>
>> On 13/12/2024 2:47 pm, Daniel Almeida wrote:
>> [...]
>>>>> +    /// Returns the CPU-addressable region as a slice.
>>>>> +    pub fn cpu_buf(&self) -> &[T]
>>>>> +    {
>>>>> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
>>>>> +        // is valid for reads for `self.count * size_of::<T>` bytes.
>>>>> +        unsafe { core::slice::from_raw_parts(self.cpu_addr, self.count) }
>>>>
>>>> Immutable slices require that the data does not change while the
>>>> reference is live. Is that the case? If so, your safety comment should
>>>> explain that.
>>>>
>>>>> +    }
>>>>> +
>>>>> +    /// Performs the same functionality as `cpu_buf`, except that a mutable slice is returned.
>>>>> +    pub fn cpu_buf_mut(&mut self) -> &mut [T]
>>>>> +    {
>>>>> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
>>>>> +        // is valid for reads for `self.count * size_of::<T>` bytes.
>>>>> +        unsafe { core::slice::from_raw_parts_mut(self.cpu_addr, self.count) }
>>>>
>>>> Mutable slices require that the data is not written to *or read* by
>>>> anybody else while the reference is live. Is that the case? If so,
>>>> your safety comment should explain that.
>>>>
>>> The buffer will probably be shared between the CPU and some hardware device, since this is the
>>> point of the dma mapping API.
>>> It’s up to the caller to ensure that no hardware operations that involve the buffer are currently taking
>>> place while the slices above are alive.
>>
>> Hmm, that sounds troublesome... the nature of coherent allocations is that both CPU and device may access them at any time, and you can definitely expect ringbuffer-style usage models where a CPU is writing to part of the buffer while the device is reading/writing another part, but also cases where a CPU needs to poll for a device write to a particular location.
>>
> 
> Ok, I had based my answer on some other drivers I’ve worked on in the past where the approach I cited would work.
> 
> I can see it not working for what you described, though.
> 
> This is a bit unfortunate, because it means we are back to square one, i.e.: back to read() and write() functions and
> to the bound on `Copy`. That’s because, no matter how you try to dress this, there is no way to give safe and direct access
> to the underlying memory if you can’t avoid situations where both the CPU and the device will be accessing the memory
> at the same time.
> 

This is unfortunate indeed. Thanks Alice for pointing out the 
limitations of slice.

Btw, do we have any other concerns in going back to plain old raw 
pointers instead? i.e.,

     pub fn read(&self, index: usize) -> Result<T> {
         if index >= self.count {
             return Err(EINVAL);
         }

         let ptr = self.cpu_addr.wrapping_add(index);
         // SAFETY: We just checked that the index is within bounds.
         Ok(unsafe { ptr.read() })
     }

     pub fn write(&self, index: usize, value: &T) -> Result
     where
         T: Copy,
     {
         if index >= self.count {
             return Err(EINVAL);
         }

         let ptr = self.cpu_addr.wrapping_add(index);
         // SAFETY: We just checked that the index is within bounds.
         unsafe { ptr.write(*value) };
         Ok(())
     }

> I guess the only improvement that could be made over the approach used for v2 is to at least use copy_nonoverlapping
> instead, 

You mean introduce something like read_raw(dst: *mut u8,...) and 
write_raw(&self, src: *const u8,...)?

Regards,
Abdiel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v7 2/2] rust: add dma coherent allocator abstraction.
  2024-12-13 14:27   ` Alice Ryhl
  2024-12-13 14:47     ` Daniel Almeida
@ 2024-12-16 10:28     ` Abdiel Janulgue
  1 sibling, 0 replies; 15+ messages in thread
From: Abdiel Janulgue @ 2024-12-16 10:28 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: rust-for-linux, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Trevor Gross, Danilo Krummrich, Valentin Obst, open list,
	Christoph Hellwig, Marek Szyprowski, Robin Murphy, airlied,
	open list:DMA MAPPING HELPERS, Daniel Almeida

On 13/12/2024 16:27, Alice Ryhl wrote:
> 
>> +    /// Returns the base address to the allocated region and the dma handle. The caller takes
>> +    /// ownership of the returned resources.
>> +    pub fn into_parts(self) -> (usize, bindings::dma_addr_t) {
>> +        let ret = (self.cpu_addr as _, self.dma_handle);
>> +        core::mem::forget(self);
>> +        ret
>> +    }
> 
> Not only does this skip the destructor of `dma_free_attrs`. It also
> skips the destructor of fields including the `dev` field. Did you
> intend to leave the refcount on `struct device` without decrementing
> it?
> 

Good catch. Yeah dev should be decremented here as well.
Will incorporate fix into next revision.

Regards,
Abdiel


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v7 2/2] rust: add dma coherent allocator abstraction.
  2024-12-16 10:23           ` Abdiel Janulgue
@ 2024-12-16 11:00             ` Daniel Almeida
  0 siblings, 0 replies; 15+ messages in thread
From: Daniel Almeida @ 2024-12-16 11:00 UTC (permalink / raw)
  To: Abdiel Janulgue
  Cc: Robin Murphy, Alice Ryhl, rust-for-linux, Miguel Ojeda,
	Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Trevor Gross, Danilo Krummrich,
	Valentin Obst, open list, Christoph Hellwig, Marek Szyprowski,
	airlied, open list:DMA MAPPING HELPERS



> On 16 Dec 2024, at 07:23, Abdiel Janulgue <abdiel.janulgue@gmail.com> wrote:
> 
> 
> 
> On 13/12/2024 21:08, Daniel Almeida wrote:
>> Hi Robin,
>>> On 13 Dec 2024, at 12:28, Robin Murphy <robin.murphy@arm.com> wrote:
>>> 
>>> On 13/12/2024 2:47 pm, Daniel Almeida wrote:
>>> [...]
>>>>>> +    /// Returns the CPU-addressable region as a slice.
>>>>>> +    pub fn cpu_buf(&self) -> &[T]
>>>>>> +    {
>>>>>> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
>>>>>> +        // is valid for reads for `self.count * size_of::<T>` bytes.
>>>>>> +        unsafe { core::slice::from_raw_parts(self.cpu_addr, self.count) }
>>>>> 
>>>>> Immutable slices require that the data does not change while the
>>>>> reference is live. Is that the case? If so, your safety comment should
>>>>> explain that.
>>>>> 
>>>>>> +    }
>>>>>> +
>>>>>> +    /// Performs the same functionality as `cpu_buf`, except that a mutable slice is returned.
>>>>>> +    pub fn cpu_buf_mut(&mut self) -> &mut [T]
>>>>>> +    {
>>>>>> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
>>>>>> +        // is valid for reads for `self.count * size_of::<T>` bytes.
>>>>>> +        unsafe { core::slice::from_raw_parts_mut(self.cpu_addr, self.count) }
>>>>> 
>>>>> Mutable slices require that the data is not written to *or read* by
>>>>> anybody else while the reference is live. Is that the case? If so,
>>>>> your safety comment should explain that.
>>>>> 
>>>> The buffer will probably be shared between the CPU and some hardware device, since this is the
>>>> point of the dma mapping API.
>>>> It’s up to the caller to ensure that no hardware operations that involve the buffer are currently taking
>>>> place while the slices above are alive.
>>> 
>>> Hmm, that sounds troublesome... the nature of coherent allocations is that both CPU and device may access them at any time, and you can definitely expect ringbuffer-style usage models where a CPU is writing to part of the buffer while the device is reading/writing another part, but also cases where a CPU needs to poll for a device write to a particular location.
>>> 
>> Ok, I had based my answer on some other drivers I’ve worked on in the past where the approach I cited would work.
>> I can see it not working for what you described, though.
>> This is a bit unfortunate, because it means we are back to square one, i.e.: back to read() and write() functions and
>> to the bound on `Copy`. That’s because, no matter how you try to dress this, there is no way to give safe and direct access
>> to the underlying memory if you can’t avoid situations where both the CPU and the device will be accessing the memory
>> at the same time.
> 
> This is unfortunate indeed. Thanks Alice for pointing out the limitations of slice.
> 
> Btw, do we have any other concerns in going back to plain old raw pointers instead? i.e.,
> 
>    pub fn read(&self, index: usize) -> Result<T> {
>        if index >= self.count {
>            return Err(EINVAL);
>        }
> 
>        let ptr = self.cpu_addr.wrapping_add(index);
>        // SAFETY: We just checked that the index is within bounds.
>        Ok(unsafe { ptr.read() })
>    }
> 
>    pub fn write(&self, index: usize, value: &T) -> Result
>    where
>        T: Copy,
>    {
>        if index >= self.count {
>            return Err(EINVAL);
>        }
> 
>        let ptr = self.cpu_addr.wrapping_add(index);
>        // SAFETY: We just checked that the index is within bounds.
>        unsafe { ptr.write(*value) };
>        Ok(())
>    }
> 
>> I guess the only improvement that could be made over the approach used for v2 is to at least use copy_nonoverlapping
>> instead, 
> 
> You mean introduce something like read_raw(dst: *mut u8,...) and write_raw(&self, src: *const u8,...)?


What I mean is that you don’t have to read (and bounds-check) a single index per function call.

Using copy_nonoverlapping lets you read/write multiple values with only a single function call and a
single check. It’s basically equivalent to a memcpy.


> 
> Regards,
> Abdiel



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v7 2/2] rust: add dma coherent allocator abstraction.
  2024-12-13  8:53   ` Abdiel Janulgue
@ 2024-12-19 11:22     ` Andreas Hindborg
  2025-01-08 12:26       ` Abdiel Janulgue
  0 siblings, 1 reply; 15+ messages in thread
From: Andreas Hindborg @ 2024-12-19 11:22 UTC (permalink / raw)
  To: Abdiel Janulgue
  Cc: rust-for-linux, Daniel Almeida, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Alice Ryhl, Trevor Gross, Danilo Krummrich, Valentin Obst,
	open list, Christoph Hellwig, Marek Szyprowski, Robin Murphy,
	airlied, open list:DMA MAPPING HELPERS

Abdiel Janulgue <abdiel.janulgue@gmail.com> writes:

> Gentle ping. Does this approach make sense? :)

If you include the `Allocator` trait and the `Pool` it would make my
life a lot easier when rebasing nvme. For now I'm just going to use out
of tree patches that include the dma pool.

If you do not have a user for dma pool, maybe just include the
`Allocator` trait and indicate that it is there to support other dma
allocators in the future?


Best regards,
Andreas Hindborg



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v7 2/2] rust: add dma coherent allocator abstraction.
  2024-12-19 11:22     ` Andreas Hindborg
@ 2025-01-08 12:26       ` Abdiel Janulgue
  2025-01-09  8:44         ` Andreas Hindborg
  0 siblings, 1 reply; 15+ messages in thread
From: Abdiel Janulgue @ 2025-01-08 12:26 UTC (permalink / raw)
  To: Andreas Hindborg
  Cc: rust-for-linux, Daniel Almeida, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Alice Ryhl, Trevor Gross, Danilo Krummrich, Valentin Obst,
	open list, Christoph Hellwig, Marek Szyprowski, Robin Murphy,
	airlied, open list:DMA MAPPING HELPERS

Hi Andreas,

On 19/12/2024 13:22, Andreas Hindborg wrote:
> If you include the `Allocator` trait and the `Pool` it would make my
> life a lot easier when rebasing nvme. For now I'm just going to use out
> of tree patches that include the dma pool.
> 
> If you do not have a user for dma pool, maybe just include the
> `Allocator` trait and indicate that it is there to support other dma
> allocators in the future?
> 
> 

Would it be okay with you to have the basic functionality merged first? 
Once this is done, I will send up a follow-up patch that would then 
include the allocator trait and the pool.

Thanks!

Cheers,
Abdiel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v7 2/2] rust: add dma coherent allocator abstraction.
  2025-01-08 12:26       ` Abdiel Janulgue
@ 2025-01-09  8:44         ` Andreas Hindborg
  0 siblings, 0 replies; 15+ messages in thread
From: Andreas Hindborg @ 2025-01-09  8:44 UTC (permalink / raw)
  To: Abdiel Janulgue
  Cc: rust-for-linux, Daniel Almeida, Miguel Ojeda, Alex Gaynor,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Alice Ryhl, Trevor Gross, Danilo Krummrich, Valentin Obst,
	open list, Christoph Hellwig, Marek Szyprowski, Robin Murphy,
	airlied, open list:DMA MAPPING HELPERS

"Abdiel Janulgue" <abdiel.janulgue@gmail.com> writes:

> Hi Andreas,
>
> On 19/12/2024 13:22, Andreas Hindborg wrote:
>> If you include the `Allocator` trait and the `Pool` it would make my
>> life a lot easier when rebasing nvme. For now I'm just going to use out
>> of tree patches that include the dma pool.
>>
>> If you do not have a user for dma pool, maybe just include the
>> `Allocator` trait and indicate that it is there to support other dma
>> allocators in the future?
>>
>>
>
> Would it be okay with you to have the basic functionality merged first?
> Once this is done, I will send up a follow-up patch that would then
> include the allocator trait and the pool.

Later is better than never :)


Best regards,
Andreas Hindborg



^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2025-01-09  8:45 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-12-10 22:14 [PATCH v7 0/2] Add dma coherent allocator abstraction Abdiel Janulgue
2024-12-10 22:14 ` [PATCH v7 1/2] rust: error: Add EOVERFLOW Abdiel Janulgue
2024-12-10 22:14 ` [PATCH v7 2/2] rust: add dma coherent allocator abstraction Abdiel Janulgue
2024-12-13  8:53   ` Abdiel Janulgue
2024-12-19 11:22     ` Andreas Hindborg
2025-01-08 12:26       ` Abdiel Janulgue
2025-01-09  8:44         ` Andreas Hindborg
2024-12-13 14:12   ` Daniel Almeida
2024-12-13 14:27   ` Alice Ryhl
2024-12-13 14:47     ` Daniel Almeida
2024-12-13 15:28       ` Robin Murphy
2024-12-13 19:08         ` Daniel Almeida
2024-12-16 10:23           ` Abdiel Janulgue
2024-12-16 11:00             ` Daniel Almeida
2024-12-16 10:28     ` Abdiel Janulgue

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).