* [RFC PATCH] dma-buf/dma_fence: Make races for dma_fence_is_signaled() less likely
@ 2026-06-12 10:42 Philipp Stanner
2026-06-12 10:56 ` sashiko-bot
0 siblings, 1 reply; 2+ messages in thread
From: Philipp Stanner @ 2026-06-12 10:42 UTC (permalink / raw)
To: Danilo Krummrich, Maarten Lankhorst, David Airlie, Simona Vetter,
Sumit Semwal, Christian König, Tvrtko Ursulin,
Boris Brezillon, Paul E . McKenney
Cc: dri-devel, linux-kernel, Philipp Stanner
dma_fence_is_signaled() returns whether a fence has been signaled
already. That function contains a fast path opportunistic check which is
not guarded by the lock and, according to Christian, cannot be guarded
by the lock without causing a massive performance regression.
This now means that dma_fence_is_signaled() can return true WHILE the
fence callbacks are still being executed. This is razy and has lead to
at least one bug solved in:
commit c8a5d5ea3ba6 ("nouveau: fix client work fence deletion race")
Make this race impossible, by simply setting the bit only once the
callbacks are actually completed.
Signed-off-by: Philipp Stanner <phasta@kernel.org>
---
drivers/dma-buf/dma-fence.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index c7ea1e75d38a..2416cc86ce93 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -359,8 +359,19 @@ void dma_fence_signal_timestamp_locked(struct dma_fence *fence,
dma_fence_assert_held(fence);
- if (unlikely(test_and_set_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
- &fence->flags)))
+ /*
+ * First test the bit, so we don't signal an already signaled fence again.
+ * The lock protects against multiple parties setting the bit. The bit
+ * is then set at the end of the function.
+ *
+ * The background is that there is a fast path check in
+ * dma_fence_is_signaled() which does not use lock protection and can
+ * return true *while* the fence callbacks are still executing.
+ *
+ * This fast path check supposedly cannot be guarded by the lock because
+ * of significant performance regressions.
+ */
+ if (unlikely(test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)))
return;
trace_dma_fence_signaled(fence);
@@ -384,6 +395,9 @@ void dma_fence_signal_timestamp_locked(struct dma_fence *fence,
INIT_LIST_HEAD(&cur->node);
cur->func(fence, cur);
}
+
+ // TODO: we need some barrier here, don't we?
+ set_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags);
}
EXPORT_SYMBOL(dma_fence_signal_timestamp_locked);
--
2.54.0
^ permalink raw reply related [flat|nested] 2+ messages in thread* Re: [RFC PATCH] dma-buf/dma_fence: Make races for dma_fence_is_signaled() less likely
2026-06-12 10:42 [RFC PATCH] dma-buf/dma_fence: Make races for dma_fence_is_signaled() less likely Philipp Stanner
@ 2026-06-12 10:56 ` sashiko-bot
0 siblings, 0 replies; 2+ messages in thread
From: sashiko-bot @ 2026-06-12 10:56 UTC (permalink / raw)
To: Philipp Stanner; +Cc: dri-devel
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
- [Critical] Moving the signaled bit update creates a window where `fence->ops` is cleared to NULL but the fence is not yet marked as signaled, causing a NULL pointer dereference in lockless accessors.
- [High] Missing release memory barrier before setting the signaled bit, defeating the purpose of the patch on weakly-ordered architectures and risking use-after-free.
- [High] Delaying the signaled bit breaks the fast-path lockless check in `dma_fence_is_signaled()`, leading to lock contention, recursive deadlocks, and API contract violations.
--
commit 0ba6de5d00b831f763affe10965d5c591e6f0b5f
Author: Philipp Stanner <phasta@kernel.org>
dma-buf/dma_fence: Make races for dma_fence_is_signaled() less likely
This commit addresses a race condition where dma_fence_is_signaled() can
return true while fence callbacks are still executing. It delays setting
the DMA_FENCE_FLAG_SIGNALED_BIT until after the callbacks have completed,
instead of setting it at the beginning of dma_fence_signal_timestamp_locked().
> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> --- a/drivers/dma-buf/dma-fence.c
> +++ b/drivers/dma-buf/dma-fence.c
> @@ -359,8 +359,19 @@ void dma_fence_signal_timestamp_locked(struct dma_fence *fence,
>
> dma_fence_assert_held(fence);
>
> - if (unlikely(test_and_set_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
> - &fence->flags)))
> + /*
> + * First test the bit, so we don't signal an already signaled fence again.
> + * The lock protects against multiple parties setting the bit. The bit
> + * is then set at the end of the function.
> + *
> + * The background is that there is a fast path check in
> + * dma_fence_is_signaled() which does not use lock protection and can
> + * return true *while* the fence callbacks are still executing.
> + *
> + * This fast path check supposedly cannot be guarded by the lock because
> + * of significant performance regressions.
> + */
> + if (unlikely(test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)))
> return;
>
> trace_dma_fence_signaled(fence);
[Severity: Critical]
Because the signaled bit is no longer set here, is there a risk of a NULL
pointer dereference for concurrent lockless readers?
Just below this in dma_fence_signal_timestamp_locked(), fence->ops can be
cleared:
ops = rcu_dereference_protected(fence->ops, true);
if (!ops->release && !ops->wait)
RCU_INIT_POINTER(fence->ops, NULL);
With the signaled bit still false, could lockless accessors like
dma_fence_driver_name() see the unsignaled state and evaluate
ops->get_driver_name(fence) on the NULL ops pointer?
[ ... ]
> @@ -384,6 +395,9 @@ void dma_fence_signal_timestamp_locked(struct dma_fence *fence,
> INIT_LIST_HEAD(&cur->node);
> cur->func(fence, cur);
> }
[Severity: High]
If a callback executes while the signaled bit is still false, won't this
break the lockless fast-path check in dma_fence_is_signaled()?
If dma_fence_is_signaled() or dma_fence_timestamp() is called from within
a callback, the unsignaled state would cause dma_fence_timestamp() to hit
a WARN_ON(!test_bit(...)).
Additionally, dma_fence_is_signaled() would fall back to dma_fence_signal(),
which attempts to acquire the fence->lock. Since the lock is already held
by the current thread executing the callbacks, would this result in a
recursive deadlock?
> +
> + // TODO: we need some barrier here, don't we?
> + set_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags);
[Severity: High]
Does this missing memory barrier reintroduce the use-after-free risk on
weakly-ordered architectures?
Since set_bit() does not imply a memory barrier, could the CPU reorder
this assignment to become globally visible before the memory writes performed
by the callbacks?
If a concurrent lockless reader using dma_fence_test_signaled_flag()
observes the bit as set but reads stale data, this might defeat the purpose
of the patch. Is an explicit smp_mb__before_atomic() needed here?
> }
> EXPORT_SYMBOL(dma_fence_signal_timestamp_locked);
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260612104251.2264707-2-phasta@kernel.org?part=1
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2026-06-12 10:56 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-12 10:42 [RFC PATCH] dma-buf/dma_fence: Make races for dma_fence_is_signaled() less likely Philipp Stanner
2026-06-12 10:56 ` sashiko-bot
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.