Re: [PATCH v4 4/5] rust: Add dma_fence abstractions

All of lore.kernel.org
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "Philipp Stanner" <phasta@kernel.org>
Cc: dri-devel@lists.freedesktop.org, ojeda@kernel.org
Subject: Re: [PATCH v4 4/5] rust: Add dma_fence abstractions
Date: Tue, 23 Jun 2026 21:06:46 +0000	[thread overview]
Message-ID: <20260623210647.4F7361F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260622135654.334961-6-phasta@kernel.org>

Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
- [Critical] Soundness / Use-After-Free due to lifetime erasure in `ForeignOwnable` implementation for `DriverFence`.
- [Critical] TOCTOU race condition in `FenceCbRegistration::new` allows uninitialized memory access.
- [High] Deadlock when calling `Fence::is_signaled()` from a registered fence callback.
- [High] `FenceCtx::drop` calls `rcu_barrier()`, sleeping in atomic context.
- [Medium] Missing `dma_fence_begin_signalling()` lockdep annotations in `DriverFence::signal` and `DriverFence::drop`.
--

commit 13ec5041cfbf69d9b4feb4a3709de7bd1157511a
Author: Philipp Stanner <phasta@kernel.org>

rust: Add dma_fence abstractions
    
This commit introduces Rust abstractions for the C dma_fence API, providing
a safe interface for drivers to create, signal, and await fences. It handles
the memory management, callback registration, and context creation.

> diff --git a/rust/kernel/dma_buf/dma_fence.rs b/rust/kernel/dma_buf/dma_fence.rs
> new file mode 100644
> index 0000000000000..a9d9a684947c5
> --- /dev/null
> +++ b/rust/kernel/dma_buf/dma_fence.rs

[ ... ]

> +    extern "C" fn get_driver_name(ptr: *mut bindings::dma_fence) -> *const c_char {
> +        // SAFETY: The C backend only invokes this callback with `ptr` pointing
> +        // to a valid, unsignaled `bindings::dma_fence`. All fences created
> +        // in this module always reside within `Fence` which always resides in
> +        // a `DriverFenceData`, thus satisfying the function's safety requirements.
> +        let fctx = unsafe { Self::from_raw_fence(ptr) };
> +
> +        fctx.driver_name.as_char_ptr()
> +    }

[Severity: Critical]
Could this result in a use-after-free?

When the C backend calls this mandatory callback, it dereferences the fctx
pointer. However, the lifetime of the DriverFence's context is erased in the
ForeignOwnable implementation. If the underlying FenceCtx is dropped
prematurely from safe Rust, could this callback access freed memory?

[ ... ]

> +#[pinned_drop]
> +impl<T: Send + Sync> PinnedDrop for FenceCtx<T> {
> +    fn drop(self: Pin<&mut Self>) {
> +        rcu_barrier();
> +    }
> +}

[Severity: High]
Does this sleep in atomic context?

FenceCtx is managed by an Arc, meaning its drop executes when the last
reference is released. If an Arc<FenceCtx> is dropped in an atomic context
like an interrupt handler or while holding a spinlock, unconditionally
calling rcu_barrier() will block and sleep, violating the atomic context
constraints.

[ ... ]

> +    pub fn new<'a>(fence: &'a Fence, callback: T) -> impl PinInit<Self, CallbackError<T>> + 'a
> +    where
> +        T: 'a,
> +    {
> +        let mut callback = Some(callback);
> +
> +        try_pin_init!(Self {
> +            cb <- Opaque::try_ffi_init(|ptr| {
> +                // SAFETY: `fence.inner.get()` is a valid, initialized `struct dma_fence`. `ptr`
> +                // points to the `struct dma_fence_cb` field within the pinned allocation, so it
> +                // remains valid until `dma_fence_remove_callback()` in `PinnedDrop` or until the
> +                // callback fires.
> +                let ret = unsafe {
> +                    to_result(bindings::dma_fence_add_callback(
> +                        fence.inner.get(),
> +                        ptr,
> +                        Some(Self::dma_fence_callback),
> +                    ))
> +                };
> +                match ret {
> +                    Ok(()) => Ok(()),
> +                    Err(e) => {
> +                        let cb_back = callback.take().expect("Success path did not run.");
> +                        if e == ENOENT {
> +                            Err(CallbackError::AlreadySignaled(cb_back))
> +                        } else {
> +                            Err(CallbackError::Other(e))
> +                        }
> +                    }
> +                }
> +            }),
> +            callback: callback.take().expect("Error path did not run."),

[Severity: Critical]
Is there a race condition here that allows uninitialized memory access?

The cb field is initialized first, which registers the callback with the C
backend via dma_fence_add_callback. Once added, the callback is live.
However, the callback field containing the user's T: FenceCb implementation
has not yet been initialized.

If a concurrent thread signals the fence in this tight window, the C subsystem
will immediately invoke Self::dma_fence_callback, which will cast the pointer
back and invoke (*reg).callback.called() on uninitialized memory.

[ ... ]

> +    pub fn is_signaled(&self) -> bool {
> +        let fence = self.as_raw();
> +        let mut fence_flags: usize = 0;
> +        let flag_ptr = &raw mut fence_flags;
> +
> +        // We shouuld not use `dma_fence_is_signaled_locked()` here, because
> +        // according to the C backend's recommendations, that function is problematic
> +        // and we should avoid calling that function with a lock held.
> +
> +        // SAFETY: `self` is valid by definition. We take the spinlock above.
> +        let ret = unsafe { bindings::dma_fence_is_signaled(fence) };
> +
> +        // To guarantee that an API caller can 100% rely on the signalling being
> +        // completed (i.e., all fence callbacks ran), we have to take the lock.
> +        //
> +        // The reason is that the C dma_fence backend currently does not carefully
> +        // synchronize the `dma_fence_is_signaled()` function with the proper
> +        // spinlock. This can lead to the function returning `true` while fence
> +        // callbacks are still being executed. This can be mitigated by guarding
> +        // the entire function with the spinlock.
> +        //
> +        // See commit c8a5d5ea3ba6a.
> +
> +        // SAFETY: `fence` is valid because `self` is valid. `flag_ptr` is
> +        // merely a pointer to an integer, which lives as long as this function.
> +        unsafe { bindings::dma_fence_lock_irqsave(fence, flag_ptr) };

[Severity: High]
Could this cause a deadlock if called from within a registered fence callback?

When the fence signals, the C backend (dma_fence_signal) acquires the fence's
spinlock and iterates over the callback list. The callback executes
FenceCb::called() synchronously while the spinlock is held.

If the user's called() implementation invokes Fence::is_signaled(), this code
will forcefully attempt to acquire the exact same spinlock, leading to an
immediate CPU deadlock since spinlocks are not recursive.

[ ... ]

> +    pub fn signal(self, res: Result) {
> +        let fence = self.as_raw();
> +        let mut fence_flags: usize = 0;
> +        let flag_ptr = &raw mut fence_flags;
> +
> +        // SAFETY: `fence` is valid because `self` is valid. `flag_ptr` is
> +        // merely a pointer to an integer, which lives as long as this function.
> +        unsafe { bindings::dma_fence_lock_irqsave(fence, flag_ptr) };
> +        // SAFETY: `fence` is valid because `self` is valid. The lock must be
> +        // held, which we acquired directly above.
> +        if !unsafe { bindings::dma_fence_test_signaled_flag(fence) } {
> +            if let Err(err) = res {
> +                // SAFETY: `fence` is valid because `self` is valid. The fence
> +                // must not have been signaled yet, which we check directly above.
> +                unsafe { bindings::dma_fence_set_error(fence, err.to_errno()) };
> +            }
> +            // SAFETY: `fence` is valid because `self` is valid. The lock must
> +            // be held, which we acquired above.
> +            unsafe { bindings::dma_fence_signal_locked(fence) };

[Severity: Medium]
Are we missing the dma_fence_begin_signalling() lockdep annotations here?

By calling bindings::dma_fence_signal_locked() directly instead of using the
standard dma_fence_signal() C function, we skip the lockdep state tracking
helpers. Could this disable lockdep's ability to detect deadlocks between
fence signaling and fence waiters?

[ ... ]

> +unsafe impl<'b, T: Send + Sync + FenceCtxOps + 'static> ForeignOwnable for DriverFence<'b, T> {
> +    type Borrowed<'a> = DriverFenceBorrow<'a, T> where Self: 'a;
> +    type BorrowedMut<'a> = DriverFenceBorrow<'a, T> where Self: 'a;
> +
> +    const FOREIGN_ALIGN: usize = core::mem::align_of::<bindings::dma_fence>();
> +
> +    fn into_foreign(self) -> *mut c_void {
> +        let fence = self;
> +
> +        let ptr = fence.as_raw();
> +
> +        // DriverFence must not drop.
> +        let _ = ManuallyDrop::new(fence);
> +
> +        ptr.cast()
> +    }

[Severity: Critical]
Does this ForeignOwnable implementation introduce a soundness issue by erasing
the lifetime 'b bound to FenceCtx?

When into_foreign() is called, it consumes the DriverFence and returns a raw
pointer, effectively erasing the lifetime 'b from the borrow checker's
perspective. This allows the borrow on FenceCtx to be released, and safe Rust
code can then drop the FenceCtx.

Since the C backend still retains the raw dma_fence pointer, any subsequent
use of mandatory callbacks (like get_driver_name) will read from the dangling
fctx pointer inside DriverFenceData, causing a use-after-free.

[ ... ]

> +impl<'a, T: Send + Sync + FenceCtxOps> Drop for DriverFence<'a, T> {
> +    fn drop(&mut self) {
> +        let fence = self.as_raw();
> +        let mut fence_flags: usize = 0;
> +        let flag_ptr = &raw mut fence_flags;
> +
> +        // SAFETY: `fence` is valid until the `call_rcu()` below. `flag_ptr` is
> +        // merely a pointer to an integer, which lives as long as this function.
> +        unsafe { bindings::dma_fence_lock_irqsave(fence, flag_ptr) };
> +
> +        // Use dma_fence_test_signaled_flag() instead of dma_fence_is_signaled_locked()
> +        // because the C backend wants to get rid of the latter.
> +
> +        // SAFETY: `fence` is valid until the `call_rcu()` below.
> +        let signaled: bool = unsafe { bindings::dma_fence_test_signaled_flag(fence) };
> +        if warn_on!(!signaled) {
> +            // SAFETY: `fence` is valid until the `call_rcu()` below. The fence
> +            // must not have been signaled yet, which we check directly above.
> +            unsafe { bindings::dma_fence_set_error(fence, ECANCELED.to_errno()) };
> +            // SAFETY: `fence` is valid until the `call_rcu()` below. The lock must
> +            // be held, which we acquired above.
> +            unsafe { bindings::dma_fence_signal_locked(fence) };

[Severity: Medium]
Similar to the signal() method above, are we skipping the required lockdep
annotations here?

This fallback drop signaling path also invokes dma_fence_signal_locked()
directly without wrapping it in dma_fence_begin_signalling() and
dma_fence_end_signalling().

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260622135654.334961-2-phasta@kernel.org?part=4

next prev parent reply	other threads:[~2026-06-23 21:06 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-22 13:56 [PATCH v4 0/5] rust / dma_buf: Add abstractions for dma_fence Philipp Stanner
2026-06-22 13:56 ` [PATCH v4 1/5] rust: types: implement ForeignOwnable for ARef<T> Philipp Stanner
2026-06-23 20:49   ` sashiko-bot
2026-06-22 13:56 ` [PATCH v4 2/5] rust: error: Add ECANCELED error code Philipp Stanner
2026-06-22 13:56 ` [PATCH v4 3/5] rust: sync: Add abstraction for rcu_barrier() Philipp Stanner
2026-06-23 20:54   ` sashiko-bot
2026-06-22 13:56 ` [PATCH v4 4/5] rust: Add dma_fence abstractions Philipp Stanner
2026-06-23 21:06   ` sashiko-bot [this message]
2026-06-22 13:56 ` [PATCH v4 5/5] MAINTAINERS: Add entry for Rust dma-buf Philipp Stanner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260623210647.4F7361F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=ojeda@kernel.org \
    --cc=phasta@kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.