public inbox for linux-fbdev@vger.kernel.org
From: "Alexandre Courbot" <acourbot@nvidia.com>
To: "Joel Fernandes" <joelagnelf@nvidia.com>
Cc: linux-kernel@vger.kernel.org, "Miguel Ojeda" <ojeda@kernel.org>,
	"Boqun Feng" <boqun@kernel.org>, "Gary Guo" <gary@garyguo.net>,
	"Björn Roy Baron" <bjorn3_gh@protonmail.com>,
	"Benno Lossin" <lossin@kernel.org>,
	"Andreas Hindborg" <a.hindborg@kernel.org>,
	"Alice Ryhl" <aliceryhl@google.com>,
	"Trevor Gross" <tmgross@umich.edu>,
	"Danilo Krummrich" <dakr@kernel.org>,
	"Dave Airlie" <airlied@redhat.com>,
	"Daniel Almeida" <daniel.almeida@collabora.com>,
	"Koen Koning" <koen.koning@linux.intel.com>,
	dri-devel@lists.freedesktop.org, nouveau@lists.freedesktop.org,
	rust-for-linux@vger.kernel.org,
	"Nikola Djukic" <ndjukic@nvidia.com>,
	"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
	"Maxime Ripard" <mripard@kernel.org>,
	"Thomas Zimmermann" <tzimmermann@suse.de>,
	"David Airlie" <airlied@gmail.com>,
	"Simona Vetter" <simona@ffwll.ch>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Alex Deucher" <alexander.deucher@amd.com>,
	"Christian König" <christian.koenig@amd.com>,
	"Jani Nikula" <jani.nikula@linux.intel.com>,
	"Joonas Lahtinen" <joonas.lahtinen@linux.intel.com>,
	"Rodrigo Vivi" <rodrigo.vivi@intel.com>,
	"Tvrtko Ursulin" <tursulin@ursulin.net>,
	"Huang Rui" <ray.huang@amd.com>,
	"Matthew Auld" <matthew.auld@intel.com>,
	"Matthew Brost" <matthew.brost@intel.com>,
	"Lucas De Marchi" <lucas.demarchi@intel.com>,
	"Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
	"Helge Deller" <deller@gmx.de>,
	"Alex Gaynor" <alex.gaynor@gmail.com>,
	"Boqun Feng" <boqun.feng@gmail.com>,
	"John Hubbard" <jhubbard@nvidia.com>,
	"Alistair Popple" <apopple@nvidia.com>,
	"Timur Tabi" <ttabi@nvidia.com>, "Edwin Peer" <epeer@nvidia.com>,
	"Andrea Righi" <arighi@nvidia.com>,
	"Andy Ritger" <aritger@nvidia.com>, "Zhi Wang" <zhiw@nvidia.com>,
	"Balbir Singh" <balbirs@nvidia.com>,
	"Philipp Stanner" <phasta@kernel.org>,
	"Elle Rhumsaa" <elle@weathered-steel.dev>,
	alexeyi@nvidia.com, "Eliot Courtney" <ecourtney@nvidia.com>,
	joel@joelfernandes.org, linux-doc@vger.kernel.org,
	amd-gfx@lists.freedesktop.org, intel-gfx@lists.freedesktop.org,
	intel-xe@lists.freedesktop.org, linux-fbdev@vger.kernel.org
Subject: Re: [PATCH v12.1 1/1] rust: gpu: Add GPU buddy allocator bindings
Date: Mon, 16 Mar 2026 22:12:01 +0900	[thread overview]
Message-ID: <DH48DNAQCE0Z.2EX23VD27CQVX@nvidia.com> (raw)
In-Reply-To: <20260309135338.3919996-2-joelagnelf@nvidia.com>

On Mon Mar 9, 2026 at 10:53 PM JST, Joel Fernandes wrote:
> Add safe Rust abstractions over the Linux kernel's GPU buddy
> allocator for physical memory management. The GPU buddy allocator
> implements a binary buddy system useful for GPU physical memory
> allocation. nova-core will use it for physical memory allocation.
>
> Christian Koenig mentioned he'd like to step down from reviewer role for
> GPU buddy so updated accordingly. Arun/Matthew agree on the modified entry.
>
> Cc: Nikola Djukic <ndjukic@nvidia.com>
> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>

This is getting close IMHO. But `make rustdoc` fails on the examples,
and there are still clippy warnings - please make sure to address them.

A few more comments below.

<snip>
> diff --git a/rust/kernel/gpu/buddy.rs b/rust/kernel/gpu/buddy.rs
> new file mode 100644
> index 000000000000..9027c9a7778f
> --- /dev/null
> +++ b/rust/kernel/gpu/buddy.rs
> @@ -0,0 +1,611 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! GPU buddy allocator bindings.
> +//!
> +//! C header: [`include/linux/gpu_buddy.h`](srctree/include/linux/gpu_buddy.h)
> +//!
> +//! This module provides Rust abstractions over the Linux kernel's GPU buddy
> +//! allocator, which implements a binary buddy memory allocator.
> +//!
> +//! The buddy allocator manages a contiguous address space and allocates blocks
> +//! in power-of-two sizes, useful for GPU physical memory management.
> +//!
> +//! # Examples
> +//!
> +//! Create a buddy allocator and perform a basic range allocation:
> +//!
> +//! ```
> +//! use kernel::{
> +//!     gpu::buddy::{GpuBuddy, GpuBuddyAllocMode, GpuBuddyAllocFlags, GpuBuddyParams},

nit: should also use kernel formatting style.

> +//!     prelude::*,
> +//!     ptr::Alignment,
> +//!     sizes::*, //
> +//! };
> +//!
> +//! // Create a 1GB buddy allocator with 4KB minimum chunk size.
> +//! let buddy = GpuBuddy::new(GpuBuddyParams {
> +//!     base_offset: 0,
> +//!     physical_memory_size: SZ_1G as u64,
> +//!     chunk_size: SZ_4K,

`chunk_size` is an interesting case. The C API uses a `u64`, but I think
we can reasonably consider that we won't ever need chunks larger than
4GB (or can we :O). I'm actually ok with using a `usize` for this one.

One of the first things the C code does is to throw an error if it is
not a power of 2, so maybe we can even request an `Alignment`?

I'm a bit torn as to whether we should use a `u64` to conform with the C
API, but doing so would mean we cannot use an `Alignment`...
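For illustration, here is roughly the kind of validated newtype I have in
mind - a userspace sketch with a made-up `ChunkSize` standing in for the
kernel's `Alignment` (not the actual kernel API):

```rust
// Sketch: store the chunk size as a validated power-of-two value
// instead of a raw integer, so invalid sizes are rejected at
// construction time rather than inside gpu_buddy_init().
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ChunkSize(usize);

impl ChunkSize {
    // Mirrors the C-side check: reject values that are not powers of two.
    fn new(size: usize) -> Option<Self> {
        if size.is_power_of_two() {
            Some(Self(size))
        } else {
            None
        }
    }

    fn as_usize(self) -> usize {
        self.0
    }
}
```

With something along these lines the power-of-two error surfaces when the
parameters are built, before we ever call into C.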

> +//! })?;
> +//!
> +//! assert_eq!(buddy.size(), SZ_1G as u64);
> +//! assert_eq!(buddy.chunk_size(), SZ_4K);
> +//! let initial_free = buddy.free_memory();
> +//!
> +//! // Allocate 16MB, results in a single 16MB block at offset 0.
> +//! let allocated = KBox::pin_init(
> +//!     buddy.alloc_blocks(
> +//!         GpuBuddyAllocMode::Range { start: 0, end: 0 },

This zero-sized range looks confusing for the example. I understand the
C API allows this, but I don't think we should. Is there a difference
with just using `GpuBuddyAllocMode::Simple`? If not, let's switch to
that, and reject zero-sized ranges in the same spirit as we don't allow
invalid flag combinations.
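If we go that route, the check could live in `alloc_blocks` next to the
flag validation. A rough userspace sketch (the enum mirrors the one in
this patch; the error string stands in for `EINVAL`):

```rust
// Sketch: validate the allocation mode before calling into C,
// rejecting empty or inverted ranges up front.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
enum GpuBuddyAllocMode {
    Simple,
    Range { start: u64, end: u64 },
    TopDown,
}

fn validate_mode(mode: GpuBuddyAllocMode) -> Result<(), &'static str> {
    match mode {
        // An empty (or inverted) range can never be satisfied, so
        // reject it instead of passing it through to the C allocator.
        GpuBuddyAllocMode::Range { start, end } if start >= end => Err("EINVAL"),
        _ => Ok(()),
    }
}
```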

> +//!         SZ_16M,
> +//!         Alignment::new::<SZ_16M>(),
> +//!         GpuBuddyAllocFlags::default(),
> +//!     ),
> +//!     GFP_KERNEL,
> +//! )?;
> +//! assert_eq!(buddy.free_memory(), initial_free - SZ_16M as u64);
> +//!
> +//! let block = allocated.iter().next().expect("expected one block");
> +//! assert_eq!(block.offset(), 0);
> +//! assert_eq!(block.order(), 12); // 2^12 pages = 16MB
> +//! assert_eq!(block.size(), SZ_16M);

Here we should also check that there is not a second block.
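I.e. assert that the iterator yields exactly one item. Outside the kernel
the idiom looks like this (a plain slice standing in for the block list):

```rust
// Checking that an iterator yields exactly one element: the first
// item must exist, and asking for the element at index 1 must fail.
fn has_exactly_one<T>(items: &[T]) -> bool {
    items.iter().nth(0).is_some() && items.iter().nth(1).is_none()
}
```

In the doctest that would just be an extra
`assert!(allocated.iter().nth(1).is_none());` after the existing checks.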

> +//!
> +//! // Dropping the allocation returns the memory to the buddy allocator.

s/memory/range - technically we are not returning physical memory.

> +//! drop(allocated);
> +//! assert_eq!(buddy.free_memory(), initial_free);
> +//! # Ok::<(), Error>(())
> +//! ```
> +//!
> +//! Top-down allocation allocates from the highest addresses:
> +//!
> +//! ```
> +//! # use kernel::{
> +//! #     gpu::buddy::{GpuBuddy, GpuBuddyAllocMode, GpuBuddyAllocFlags, GpuBuddyParams},
> +//! #     prelude::*,
> +//! #     ptr::Alignment,
> +//! #     sizes::*, //

`make rustdoc` fails to build:

error[E0433]: failed to resolve: use of undeclared type `GpuBuddyAllocFlags`
    --> rust/doctests_kernel_generated.rs:6182:9
     |
6182 |         GpuBuddyAllocFlags::default(),
     |         ^^^^^^^^^^^^^^^^^^ use of undeclared type `GpuBuddyAllocFlags`
     |
help: an enum with a similar name exists
     |
6182 -         GpuBuddyAllocFlags::default(),
6182 +         GpuBuddyAllocFlag::default(),
     |
help: consider importing this struct
     |
   3 + use kernel::gpu::buddy::GpuBuddyAllocFlags;
     |

error[E0433]: failed to resolve: use of undeclared type `GpuBuddyAllocFlags`
    --> rust/doctests_kernel_generated.rs:6195:9
     |
6195 |         GpuBuddyAllocFlags::default(),
     |         ^^^^^^^^^^^^^^^^^^ use of undeclared type `GpuBuddyAllocFlags`
     |
help: an enum with a similar name exists
     |
6195 -         GpuBuddyAllocFlags::default(),
6195 +         GpuBuddyAllocFlag::default(),
     |
help: consider importing this struct
     |
   3 + use kernel::gpu::buddy::GpuBuddyAllocFlags;

> +//! # };
> +//! # let buddy = GpuBuddy::new(GpuBuddyParams {
> +//! #     base_offset: 0,
> +//! #     physical_memory_size: SZ_1G as u64,
> +//! #     chunk_size: SZ_4K,
> +//! # })?;
> +//! # let initial_free = buddy.free_memory();
> +//! let topdown = KBox::pin_init(
> +//!     buddy.alloc_blocks(
> +//!         GpuBuddyAllocMode::TopDown,
> +//!         SZ_16M,
> +//!         Alignment::new::<SZ_16M>(),
> +//!         GpuBuddyAllocFlags::default(),
> +//!     ),
> +//!     GFP_KERNEL,
> +//! )?;
> +//! assert_eq!(buddy.free_memory(), initial_free - SZ_16M as u64);
> +//!
> +//! let block = topdown.iter().next().expect("expected one block");
> +//! assert_eq!(block.offset(), (SZ_1G - SZ_16M) as u64);
> +//! assert_eq!(block.order(), 12);
> +//! assert_eq!(block.size(), SZ_16M);
> +//!
> +//! // Dropping the allocation returns the memory to the buddy allocator.
> +//! drop(topdown);
> +//! assert_eq!(buddy.free_memory(), initial_free);
> +//! # Ok::<(), Error>(())
> +//! ```
> +//!
> +//! Non-contiguous allocation can fill fragmented memory by returning multiple
> +//! blocks:
> +//!
> +//! ```
> +//! # use kernel::{
> +//! #     gpu::buddy::{
> +//! #         GpuBuddy, GpuBuddyAllocFlags, GpuBuddyAllocMode, GpuBuddyParams,
> +//! #     },
> +//! #     prelude::*,
> +//! #     ptr::Alignment,
> +//! #     sizes::*, //
> +//! # };
> +//! # let buddy = GpuBuddy::new(GpuBuddyParams {
> +//! #     base_offset: 0,
> +//! #     physical_memory_size: SZ_1G as u64,
> +//! #     chunk_size: SZ_4K,
> +//! # })?;
> +//! # let initial_free = buddy.free_memory();
> +//! // Create fragmentation by allocating 4MB blocks at [0,4M) and [8M,12M).
> +//! let frag1 = KBox::pin_init(
> +//!     buddy.alloc_blocks(
> +//!         GpuBuddyAllocMode::Range { start: 0, end: SZ_4M as u64 },
> +//!         SZ_4M,
> +//!         Alignment::new::<SZ_4M>(),
> +//!         GpuBuddyAllocFlags::default(),
> +//!     ),
> +//!     GFP_KERNEL,
> +//! )?;
> +//! assert_eq!(buddy.free_memory(), initial_free - SZ_4M as u64);
> +//!
> +//! let frag2 = KBox::pin_init(
> +//!     buddy.alloc_blocks(
> +//!         GpuBuddyAllocMode::Range {
> +//!             start: SZ_8M as u64,
> +//!             end: (SZ_8M + SZ_4M) as u64,
> +//!         },
> +//!         SZ_4M,
> +//!         Alignment::new::<SZ_4M>(),
> +//!         GpuBuddyAllocFlags::default(),
> +//!     ),
> +//!     GFP_KERNEL,
> +//! )?;
> +//! assert_eq!(buddy.free_memory(), initial_free - SZ_8M as u64);
> +//!
> +//! // Allocate 8MB, this returns 2 blocks from the holes.
> +//! let fragmented = KBox::pin_init(
> +//!     buddy.alloc_blocks(
> +//!         GpuBuddyAllocMode::Range { start: 0, end: SZ_16M as u64 },
> +//!         SZ_8M,
> +//!         Alignment::new::<SZ_4M>(),
> +//!         GpuBuddyAllocFlags::default(),
> +//!     ),
> +//!     GFP_KERNEL,
> +//! )?;
> +//! assert_eq!(buddy.free_memory(), initial_free - SZ_16M as u64);
> +//!
> +//! let (mut count, mut total) = (0u32, 0usize);
> +//! for block in fragmented.iter() {
> +//!     assert_eq!(block.size(), SZ_4M);
> +//!     total += block.size();
> +//!     count += 1;
> +//! }

Note that we can avoid mutable variables with this:

//! let total_size: usize = fragmented.iter()
//!      .inspect(|block| assert_eq!(block.size(), SZ_4M))
//!      .map(|block| block.size())
//!      .sum();
//! assert_eq!(total_size, SZ_8M);
//! assert_eq!(fragmented.iter().count(), 2);

But your call as to whether this is an improvement.

> +//! assert_eq!(total, SZ_8M);
> +//! assert_eq!(count, 2);
> +//! # Ok::<(), Error>(())
> +//! ```
> +//!
> +//! Contiguous allocation fails when only fragmented space is available:
> +//!
> +//! ```
> +//! # use kernel::{
> +//! #     gpu::buddy::{
> +//! #         GpuBuddy, GpuBuddyAllocFlag, GpuBuddyAllocMode, GpuBuddyParams,
> +//! #     },
> +//! #     prelude::*,
> +//! #     ptr::Alignment,
> +//! #     sizes::*, //
> +//! # };
> +//! // Create a small 16MB buddy allocator with fragmented memory.
> +//! let small = GpuBuddy::new(GpuBuddyParams {
> +//!     base_offset: 0,
> +//!     physical_memory_size: SZ_16M as u64,
> +//!     chunk_size: SZ_4K,
> +//! })?;
> +//!
> +//! let _hole1 = KBox::pin_init(
> +//!     small.alloc_blocks(
> +//!         GpuBuddyAllocMode::Range { start: 0, end: SZ_4M as u64 },
> +//!         SZ_4M,
> +//!         Alignment::new::<SZ_4M>(),
> +//!         GpuBuddyAllocFlags::default(),
> +//!     ),
> +//!     GFP_KERNEL,
> +//! )?;
> +//!
> +//! let _hole2 = KBox::pin_init(
> +//!     small.alloc_blocks(
> +//!         GpuBuddyAllocMode::Range {
> +//!             start: SZ_8M as u64,
> +//!             end: (SZ_8M + SZ_4M) as u64,
> +//!         },
> +//!         SZ_4M,
> +//!         Alignment::new::<SZ_4M>(),
> +//!         GpuBuddyAllocFlags::default(),
> +//!     ),
> +//!     GFP_KERNEL,
> +//! )?;
> +//!
> +//! // 8MB contiguous should fail, only two non-contiguous 4MB holes exist.
> +//! let result = KBox::pin_init(
> +//!     small.alloc_blocks(
> +//!         GpuBuddyAllocMode::Simple,
> +//!         SZ_8M,
> +//!         Alignment::new::<SZ_4M>(),
> +//!         GpuBuddyAllocFlag::Contiguous.into(),
> +//!     ),
> +//!     GFP_KERNEL,
> +//! );
> +//! assert!(result.is_err());
> +//! # Ok::<(), Error>(())
> +//! ```

I think these last two examples are great both as documentation and
tests - the doc has also become much more readable!

> +
> +use crate::{
> +    bindings,
> +    clist_create,
> +    error::to_result,
> +    interop::list::CListHead,
> +    new_mutex,
> +    prelude::*,
> +    ptr::Alignment,
> +    sync::{
> +        lock::mutex::MutexGuard,
> +        Arc,
> +        Mutex, //
> +    },
> +    types::Opaque, //
> +};
> +
> +/// Allocation mode for the GPU buddy allocator.
> +///
> +/// The mode determines the primary allocation strategy. Modes are mutually
> +/// exclusive: an allocation is either simple, range-constrained, or top-down.
> +///
> +/// Orthogonal modifier flags (e.g., contiguous, clear) are specified separately
> +/// via [`GpuBuddyAllocFlags`].
> +#[derive(Copy, Clone, Debug, PartialEq, Eq)]
> +pub enum GpuBuddyAllocMode {
> +    /// Simple allocation without constraints.
> +    Simple,
> +    /// Range-based allocation between `start` and `end` addresses.
> +    Range {
> +        /// Start of the allocation range.
> +        start: u64,
> +        /// End of the allocation range.
> +        end: u64,
> +    },
> +    /// Allocate from top of address space downward.
> +    TopDown,
> +}
> +
> +impl GpuBuddyAllocMode {
> +    // Returns the C flags corresponding to the allocation mode.
> +    fn into_flags(self) -> usize {
> +        match self {
> +            Self::Simple => 0,
> +            Self::Range { .. } => bindings::GPU_BUDDY_RANGE_ALLOCATION as usize,
> +            Self::TopDown => bindings::GPU_BUDDY_TOPDOWN_ALLOCATION as usize,
> +        }
> +    }
> +
> +    // Extracts the range start/end, defaulting to (0, 0) for non-range modes.
> +    fn range(self) -> (u64, u64) {
> +        match self {
> +            Self::Range { start, end } => (start, end),
> +            _ => (0, 0),
> +        }
> +    }
> +}
> +
> +crate::impl_flags!(
> +    /// Modifier flags for GPU buddy allocation.
> +    ///
> +    /// These flags can be combined with any [`GpuBuddyAllocMode`] to control
> +    /// additional allocation behavior.
> +    #[derive(Clone, Copy, Default, PartialEq, Eq)]
> +    pub struct GpuBuddyAllocFlags(u32);
> +
> +    /// Individual modifier flag for GPU buddy allocation.
> +    #[derive(Clone, Copy, PartialEq, Eq)]
> +    pub enum GpuBuddyAllocFlag {
> +        /// Allocate physically contiguous blocks.
> +        Contiguous = bindings::GPU_BUDDY_CONTIGUOUS_ALLOCATION as u32,
> +
> +        /// Request allocation from cleared (zeroed) memory.
> +        Clear = bindings::GPU_BUDDY_CLEAR_ALLOCATION as u32,
> +
> +        /// Disable trimming of partially used blocks.
> +        TrimDisable = bindings::GPU_BUDDY_TRIM_DISABLE as u32,
> +    }
> +);
> +
> +/// Parameters for creating a GPU buddy allocator.
> +pub struct GpuBuddyParams {
> +    /// Base offset (in bytes) where the managed memory region starts.
> +    /// Allocations will be offset by this value.
> +    pub base_offset: u64,
> +    /// Total physical memory size (in bytes) managed by the allocator.
> +    pub physical_memory_size: u64,
> +    /// Minimum allocation unit / chunk size (in bytes), must be >= 4KB.
> +    pub chunk_size: usize,

As I mentioned above, let's consider if we can store this as an `Alignment`.

> +}
> +
> +/// Inner structure holding the actual buddy allocator.
> +///
> +/// # Synchronization
> +///
> +/// The C `gpu_buddy` API requires synchronization (see `include/linux/gpu_buddy.h`).
> +/// [`GpuBuddyGuard`] ensures that the lock is held for all
> +/// allocator and free operations, preventing races between concurrent allocations
> +/// and the freeing that occurs when [`AllocatedBlocks`] is dropped.

`GpuBuddyGuard` is now private, so we should avoid mentioning it in the
public documentation as it will just confuse users.

Users won't care about such implementation details - we can just say
that internal locking ensures all operations are properly synchronized.

> +///
> +/// # Invariants
> +///
> +/// The inner [`Opaque`] contains an initialized buddy allocator.
> +#[pin_data(PinnedDrop)]
> +struct GpuBuddyInner {
> +    #[pin]
> +    inner: Opaque<bindings::gpu_buddy>,
> +
> +    // TODO: Replace `Mutex<()>` with `Mutex<Opaque<..>>` once `Mutex::new()`
> +    // accepts `impl PinInit<T>`.
> +    #[pin]
> +    lock: Mutex<()>,
> +    /// Cached creation parameters (do not change after init).
> +    params: GpuBuddyParams,
> +}
> +
> +impl GpuBuddyInner {
> +    /// Create a pin-initializer for the buddy allocator.
> +    fn new(params: GpuBuddyParams) -> impl PinInit<Self, Error> {
> +        let size = params.physical_memory_size;
> +        let chunk_size = params.chunk_size;
> +
> +        // INVARIANT: `gpu_buddy_init` returns 0 on success, at which point the
> +        // `gpu_buddy` structure is initialized and ready for use with all
> +        // `gpu_buddy_*` APIs. `try_pin_init!` only completes if all fields succeed,
> +        // so the invariant holds when construction finishes.
> +        try_pin_init!(Self {
> +            inner <- Opaque::try_ffi_init(|ptr| {
> +                // SAFETY: `ptr` points to valid uninitialized memory from the pin-init
> +                // infrastructure. `gpu_buddy_init` will initialize the structure.
> +                to_result(unsafe { bindings::gpu_buddy_init(ptr, size, chunk_size as u64) })
> +            }),
> +            lock <- new_mutex!(()),
> +            params,
> +        })
> +    }
> +
> +    /// Lock the mutex and return a guard for accessing the allocator.
> +    fn lock(&self) -> GpuBuddyGuard<'_> {
> +        GpuBuddyGuard {
> +            inner: self,
> +            _guard: self.lock.lock(),
> +        }
> +    }
> +}
> +
> +#[pinned_drop]
> +impl PinnedDrop for GpuBuddyInner {
> +    fn drop(self: Pin<&mut Self>) {
> +        let guard = self.lock();
> +
> +        // SAFETY: Per the type invariant, `inner` contains an initialized
> +        // allocator. `guard` provides exclusive access.
> +        unsafe {
> +            bindings::gpu_buddy_fini(guard.as_raw());
> +        }
> +    }
> +}
> +
> +// SAFETY: GpuBuddyInner can be sent between threads.
> +unsafe impl Send for GpuBuddyInner {}
> +
> +// SAFETY: `GpuBuddyInner` is `Sync` because `GpuBuddyInner::lock`
> +// serializes all access to the C allocator, preventing data races.
> +unsafe impl Sync for GpuBuddyInner {}
> +
> +// Guard that proves the lock is held, enabling access to the allocator.
> +// The `_guard` holds the lock for the duration of this guard's lifetime.
> +struct GpuBuddyGuard<'a> {
> +    inner: &'a GpuBuddyInner,
> +    _guard: MutexGuard<'a, ()>,
> +}
> +
> +impl GpuBuddyGuard<'_> {
> +    /// Get a raw pointer to the underlying C `gpu_buddy` structure.
> +    fn as_raw(&self) -> *mut bindings::gpu_buddy {
> +        self.inner.inner.get()
> +    }
> +}
> +
> +/// GPU buddy allocator instance.
> +///
> +/// This structure wraps the C `gpu_buddy` allocator using reference counting.
> +/// The allocator is automatically cleaned up when all references are dropped.
> +///
> +/// Refer to the module-level documentation for usage examples.
> +pub struct GpuBuddy(Arc<GpuBuddyInner>);
> +
> +impl GpuBuddy {
> +    /// Create a new buddy allocator.
> +    ///
> +    /// Creates a buddy allocator that manages a contiguous address space of the given
> +    /// size, with the specified minimum allocation unit (chunk_size must be at least 4KB).
> +    pub fn new(params: GpuBuddyParams) -> Result<Self> {
> +        Ok(Self(Arc::pin_init(GpuBuddyInner::new(params), GFP_KERNEL)?))

Can be written as:

        Arc::pin_init(GpuBuddyInner::new(params), GFP_KERNEL).map(Self)

I prefer this form as it avoids the `?` and re-wrapping into `Ok` for
something that is already a `Result`.

>
> +    }
> +
> +    /// Get the base offset for allocations.
> +    pub fn base_offset(&self) -> u64 {
> +        self.0.params.base_offset
> +    }
> +
> +    /// Get the chunk size (minimum allocation unit).
> +    pub fn chunk_size(&self) -> usize {

If my suggestion above works this could return an `Alignment`.

> +        self.0.params.chunk_size
> +    }
> +
> +    /// Get the total managed size.
> +    pub fn size(&self) -> u64 {
> +        self.0.params.physical_memory_size
> +    }
> +
> +    /// Get the available (free) memory in bytes.
> +    pub fn free_memory(&self) -> u64 {
> +        let guard = self.0.lock();
> +
> +        // SAFETY: Per the type invariant, `inner` contains an initialized allocator.
> +        // `guard` provides exclusive access.
> +        unsafe { (*guard.as_raw()).avail }
> +    }
> +
> +    /// Allocate blocks from the buddy allocator.
> +    ///
> +    /// Returns a pin-initializer for [`AllocatedBlocks`].
> +    ///
> +    /// Takes `&self` instead of `&mut self` because the internal [`Mutex`] provides
> +    /// synchronization - no external `&mut` exclusivity needed.

This is another implementation detail - the fact this takes `&self` and
is not `unsafe` is already proof that synchronization is taken care of.
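The pattern itself is sound, to be clear - a minimal userspace
illustration of why `&self` suffices when a mutex serializes the
mutation (std `Mutex` standing in for the kernel one, and a bare
counter standing in for the allocator state):

```rust
use std::sync::Mutex;

// Sketch: a type whose mutating operations take `&self` because the
// inner Mutex, not `&mut` exclusivity, guarantees exclusive access.
struct Counter {
    avail: Mutex<u64>,
}

impl Counter {
    fn new(total: u64) -> Self {
        Self { avail: Mutex::new(total) }
    }

    // Mutates state through `&self`: the lock provides the exclusivity.
    fn take(&self, n: u64) -> Result<(), &'static str> {
        let mut avail = self.avail.lock().unwrap();
        if *avail < n {
            return Err("ENOMEM");
        }
        *avail -= n;
        Ok(())
    }

    fn free_memory(&self) -> u64 {
        *self.avail.lock().unwrap()
    }
}
```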

> +    pub fn alloc_blocks(
> +        &self,
> +        mode: GpuBuddyAllocMode,
> +        size: usize,

For this parameter I am pretty sure we want to conform to the C API and
use a `u64` - there is no benefit in not doing so, and buffers larger
than 4GB *are* a reality nowadays (maybe not for graphics, but this
will also be used in compute scenarios).

> +        min_block_size: Alignment,
> +        flags: GpuBuddyAllocFlags,
> +    ) -> impl PinInit<AllocatedBlocks, Error> {
> +        let buddy_arc = Arc::clone(&self.0);
> +        let (start, end) = mode.range();
> +        let mode_flags = mode.into_flags();
> +        let modifier_flags = u32::from(flags) as usize;
> +
> +        // Create pin-initializer that initializes list and allocates blocks.
> +        try_pin_init!(AllocatedBlocks {
> +            buddy: buddy_arc,
> +            list <- CListHead::new(),
> +            _: {
> +                // Lock while allocating to serialize with concurrent frees.
> +                let guard = buddy.lock();
> +
> +                // SAFETY: Per the type invariant, `inner` contains an initialized
> +                // allocator. `guard` provides exclusive access.
> +                to_result(unsafe {
> +                    bindings::gpu_buddy_alloc_blocks(
> +                        guard.as_raw(),
> +                        start,
> +                        end,
> +                        size as u64,
> +                        min_block_size.as_usize() as u64,
> +                        list.as_raw(),
> +                        mode_flags | modifier_flags,
> +                    )
> +                })?
> +            }
> +        })
> +    }
> +}
> +
> +/// Allocated blocks from the buddy allocator with automatic cleanup.
> +///
> +/// This structure owns a list of allocated blocks and ensures they are
> +/// automatically freed when dropped. Use `iter()` to iterate over all
> +/// allocated blocks.
> +///
> +/// # Invariants
> +///
> +/// - `list` is an initialized, valid list head containing allocated blocks.
> +#[pin_data(PinnedDrop)]
> +pub struct AllocatedBlocks {
> +    #[pin]
> +    list: CListHead,
> +    buddy: Arc<GpuBuddyInner>,
> +}
> +
> +impl AllocatedBlocks {
> +    /// Check if the block list is empty.
> +    pub fn is_empty(&self) -> bool {
> +        // An empty list head points to itself.
> +        !self.list.is_linked()
> +    }
> +
> +    /// Iterate over allocated blocks.
> +    ///
> +    /// Returns an iterator yielding [`AllocatedBlock`] values. Each [`AllocatedBlock`]
> +    /// borrows `self` and is only valid for the duration of that borrow.
> +    pub fn iter(&self) -> impl Iterator<Item = AllocatedBlock<'_>> + '_ {
> +        // SAFETY:
> +        // - Per the type invariant, `list` is an initialized sentinel `list_head`
> +        //   and is not concurrently modified (we hold a `&self` borrow).
> +        // - The list contains `gpu_buddy_block` items linked via
> +        //   `__bindgen_anon_1.link`.
> +        // - `Block` is `#[repr(transparent)]` over `gpu_buddy_block`.
> +        let clist = clist_create!(unsafe {
> +            self.list.as_raw(),
> +            Block,
> +            bindings::gpu_buddy_block,
> +            __bindgen_anon_1.link
> +        });
> +
> +        clist
> +            .iter()
> +            .map(|this| AllocatedBlock { this, blocks: self })
> +    }
> +}
> +
> +#[pinned_drop]
> +impl PinnedDrop for AllocatedBlocks {
> +    fn drop(self: Pin<&mut Self>) {
> +        let guard = self.buddy.lock();
> +
> +        // SAFETY:
> +        // - list is valid per the type's invariants.
> +        // - guard provides exclusive access to the allocator.
> +        unsafe {
> +            bindings::gpu_buddy_free_list(guard.as_raw(), self.list.as_raw(), 0);
> +        }
> +    }
> +}
> +
> +/// A GPU buddy block.
> +///
> +/// Transparent wrapper over C `gpu_buddy_block` structure. This type is returned
> +/// as references during iteration over [`AllocatedBlocks`].
> +///
> +/// # Invariants
> +///
> +/// The inner [`Opaque`] contains a valid, allocated `gpu_buddy_block`.
> +#[repr(transparent)]
> +struct Block(Opaque<bindings::gpu_buddy_block>);
> +
> +impl Block {
> +    /// Get a raw pointer to the underlying C block.
> +    fn as_raw(&self) -> *mut bindings::gpu_buddy_block {
> +        self.0.get()
> +    }
> +
> +    /// Get the block's raw offset in the buddy address space (without base offset).
> +    fn offset(&self) -> u64 {
> +        // SAFETY: `self.as_raw()` is valid per the type's invariants.
> +        unsafe { bindings::gpu_buddy_block_offset(self.as_raw()) }
> +    }
> +
> +    /// Get the block order.
> +    fn order(&self) -> u32 {
> +        // SAFETY: `self.as_raw()` is valid per the type's invariants.
> +        unsafe { bindings::gpu_buddy_block_order(self.as_raw()) }
> +    }

Speaking of synchronization - I only had a quick look at the C API, but
are we sure these methods can all be called without holding the lock?

> +}
> +
> +// SAFETY: `Block` is a wrapper around `gpu_buddy_block` which can be
> +// sent across threads safely.
> +unsafe impl Send for Block {}
> +
> +// SAFETY: `Block` is only accessed through shared references after
> +// allocation, and thus safe to access concurrently across threads.
> +unsafe impl Sync for Block {}
> +
> +/// A buddy block paired with its owning [`AllocatedBlocks`] context.
> +///
> +/// Unlike a raw block, which only knows its offset within the buddy address
> +/// space, an [`AllocatedBlock`] also has access to the allocator's `base_offset`
> +/// and `chunk_size`, enabling it to compute absolute offsets and byte sizes.
> +///
> +/// Returned by [`AllocatedBlocks::iter()`].
> +pub struct AllocatedBlock<'a> {
> +    this: &'a Block,
> +    blocks: &'a AllocatedBlocks,
> +}
> +
> +impl AllocatedBlock<'_> {
> +    /// Get the block's offset in the address space.
> +    ///
> +    /// Returns the absolute offset including the allocator's base offset.
> +    /// This is the actual address to use for accessing the allocated memory.
> +    pub fn offset(&self) -> u64 {
> +        self.blocks.buddy.params.base_offset + self.this.offset()
> +    }
> +
> +    /// Get the block order (size = chunk_size << order).
> +    pub fn order(&self) -> u32 {
> +        self.this.order()
> +    }
> +
> +    /// Get the block's size in bytes.
> +    pub fn size(&self) -> usize {
> +        self.blocks.buddy.params.chunk_size << self.this.order()
> +    }
> +}
> diff --git a/rust/kernel/gpu/mod.rs b/rust/kernel/gpu/mod.rs

Let's use `gpu.rs` as the file for this module.

> new file mode 100644
> index 000000000000..8f25e6367edc
> --- /dev/null
> +++ b/rust/kernel/gpu/mod.rs
> @@ -0,0 +1,5 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! GPU subsystem abstractions.
> +
> +pub mod buddy;

IMHO we should have a `#[cfg(CONFIG_GPU_BUDDY = "y")]` here for
defensiveness...

>
> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> index bb741f1e0dfd..63e3f656eb6c 100644
> --- a/rust/kernel/lib.rs
> +++ b/rust/kernel/lib.rs
> @@ -98,6 +98,8 @@
>  pub mod firmware;
>  pub mod fmt;
>  pub mod fs;
> +#[cfg(CONFIG_GPU_BUDDY = "y")]
> +pub mod gpu;

... because in the future I suspect the condition for enabling that
module will become broader. I think it's fine to keep it as-is for now
though.


Thread overview: 24+ messages
2026-03-08 18:04 [PATCH v12 0/1] Rust GPU buddy allocator bindings Joel Fernandes
2026-03-08 18:04 ` [PATCH v12 1/1] rust: gpu: Add " Joel Fernandes
2026-03-09 13:53 ` [PATCH v12.1 0/1] Rust " Joel Fernandes
2026-03-09 13:53   ` [PATCH v12.1 1/1] rust: gpu: Add " Joel Fernandes
2026-03-12 17:45     ` Danilo Krummrich
2026-03-16 17:29       ` Joel Fernandes
2026-03-16 13:12     ` Alexandre Courbot [this message]
2026-03-16 18:43       ` Joel Fernandes
2026-03-17  0:58         ` Alexandre Courbot
2026-03-17 18:49           ` Joel Fernandes
2026-03-16 18:51       ` John Hubbard
2026-03-16 21:00         ` Joel Fernandes
2026-03-16 22:08           ` John Hubbard
2026-03-17  1:02         ` Alexandre Courbot
2026-03-17  1:10           ` John Hubbard
2026-03-17  1:58             ` Alexandre Courbot
2026-03-17 18:44           ` Joel Fernandes
2026-03-17 22:03 ` [PATCH v13 0/2] Rust " Joel Fernandes
2026-03-17 22:03   ` [PATCH v13 1/2] rust: gpu: Add " Joel Fernandes
2026-03-18 18:41     ` Joel Fernandes
2026-03-19 12:49     ` Alexandre Courbot
2026-03-17 22:03   ` [PATCH v13 2/2] MAINTAINERS: gpu: buddy: Update reviewer Joel Fernandes
2026-03-18  9:15     ` Matthew Auld
2026-03-18  9:35     ` Christian König
