All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH] dma-buf/dma_fence: Make races for dma_fence_is_signaled() less likely
@ 2026-06-12 10:42 Philipp Stanner
  2026-06-12 10:56 ` sashiko-bot
  0 siblings, 1 reply; 2+ messages in thread
From: Philipp Stanner @ 2026-06-12 10:42 UTC (permalink / raw)
  To: Danilo Krummrich, Maarten Lankhorst, David Airlie, Simona Vetter,
	Sumit Semwal, Christian König, Tvrtko Ursulin,
	Boris Brezillon, Paul E . McKenney
  Cc: dri-devel, linux-kernel, Philipp Stanner

dma_fence_is_signaled() returns whether a fence has been signaled
already. That function contains a fast path opportunistic check which is
not guarded by the lock and, according to Christian, cannot be guarded
by the lock without causing a massive performance regression.

This now means that dma_fence_is_signaled() can return true WHILE the
fence callbacks are still being executed. This is razy and has lead to
at least one bug solved in:

commit c8a5d5ea3ba6 ("nouveau: fix client work fence deletion race")

Make this race impossible, by simply setting the bit only once the
callbacks are actually completed.

Signed-off-by: Philipp Stanner <phasta@kernel.org>
---
 drivers/dma-buf/dma-fence.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index c7ea1e75d38a..2416cc86ce93 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -359,8 +359,19 @@ void dma_fence_signal_timestamp_locked(struct dma_fence *fence,
 
 	dma_fence_assert_held(fence);
 
-	if (unlikely(test_and_set_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
-				      &fence->flags)))
+	/*
+	 * First test the bit, so we don't signal an already signaled fence again.
+	 * The lock protects against multiple parties setting the bit. The bit
+	 * is then set at the end of the function.
+	 *
+	 * The background is that there is a fast path check in
+	 * dma_fence_is_signaled() which does not use lock protection and can
+	 * return true *while* the fence callbacks are still executing.
+	 *
+	 * This fast path check supposedly cannot be guarded by the lock because
+	 * of significant performance regressions.
+	 */
+	if (unlikely(test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)))
 		return;
 
 	trace_dma_fence_signaled(fence);
@@ -384,6 +395,9 @@ void dma_fence_signal_timestamp_locked(struct dma_fence *fence,
 		INIT_LIST_HEAD(&cur->node);
 		cur->func(fence, cur);
 	}
+
+	// TODO: we need some barrier here, don't we?
+	set_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags);
 }
 EXPORT_SYMBOL(dma_fence_signal_timestamp_locked);
 
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2026-06-12 10:56 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-12 10:42 [RFC PATCH] dma-buf/dma_fence: Make races for dma_fence_is_signaled() less likely Philipp Stanner
2026-06-12 10:56 ` sashiko-bot

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.