Linux Media Controller development
 help / color / mirror / Atom feed
* [RFC PATCH] dma-fence: Fix races of fence callbacks versus destructors by locking
@ 2026-06-08 14:24 Philipp Stanner
  2026-06-08 14:43 ` sashiko-bot
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Philipp Stanner @ 2026-06-08 14:24 UTC (permalink / raw)
  To: Sumit Semwal, Christian König, Boris Brezillon, Alice Ryhl,
	Daniel Almeida, Gary Guo, Tvrtko Ursulin
  Cc: linux-media, dri-devel, linux-kernel, Philipp Stanner,
	Danilo Krummrich

The dma_fence backend_ops can access a fence. Hereby, a driver callback
will be running which likely will access driver specific data through
container_of(). If now, simultaneously, a driver signals the fence and
afterwards expects to run a driver specific destructor (using the same
data accessed through container_of()), there can be a race.

A driver very likely trusts that once it has signaled a fence, no one
will be accessing it anymore. Moreover, it might already want to free up
resources, making UAF bugs possible.

The race occurs because there are only pragmatic checks for the signaled
flag of a fence, without taking the fence lock. RCU guards exist, but
their purpose is to guard accesses through the backend_ops callbacks
against the driver (which implements the TEXT segment these callbacks
live in) from unloading.

Proper synchronization can be ensured by taking the fence lock. RCU is
still simultaneously required to guard against the unload.

Fix the races by taking the lock for all non-deprecated backend_ops
callbacks.

Conveniently, this also fixes a race where backend_ops->set_deadline()
might try to set a deadline for an already signaled fence.

Suggested-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Philipp Stanner <phasta@kernel.org>
---
We discovered this problem through our Rust abstractions, but it can
also occur in C.

The by far cleanest solution seems to be to use the fence lock. This RFC
serves to discuss whether there is anything preventing that.

(Patch so far just compile tested, to have some groundlayer for the
rough idea, to discuss it first)
---
 drivers/dma-buf/dma-fence.c | 39 ++++++++++++++++++++++++++++---------
 include/linux/dma-fence.h   | 17 ++++++++++++----
 2 files changed, 43 insertions(+), 13 deletions(-)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index c7ea1e75d38a..b74f02f3cca8 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -629,7 +629,8 @@ EXPORT_SYMBOL(dma_fence_free);
 static bool __dma_fence_enable_signaling(struct dma_fence *fence)
 {
 	const struct dma_fence_ops *ops;
-	bool was_set;
+	bool was_set, success;
+	unsigned long flags;
 
 	dma_fence_assert_held(fence);
 
@@ -644,7 +645,10 @@ static bool __dma_fence_enable_signaling(struct dma_fence *fence)
 	if (!was_set && ops && ops->enable_signaling) {
 		trace_dma_fence_enable_signal(fence);
 
-		if (!ops->enable_signaling(fence)) {
+		dma_fence_lock_irqsave(fence, flags);
+		success = ops->enable_signaling(fence);
+		dma_fence_unlock_irqrestore(fence, flags);
+		if (!success) {
 			rcu_read_unlock();
 			dma_fence_signal_locked(fence);
 			return false;
@@ -1020,11 +1024,20 @@ EXPORT_SYMBOL(dma_fence_wait_any_timeout);
 void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
 {
 	const struct dma_fence_ops *ops;
+	unsigned long flags;
 
 	rcu_read_lock();
 	ops = rcu_dereference(fence->ops);
-	if (ops && ops->set_deadline && !dma_fence_is_signaled(fence))
+	if (!ops || !ops->set_deadline) {
+		rcu_read_unlock();
+		return;
+	}
+
+	dma_fence_lock_irqsave(fence, flags);
+	if (!dma_fence_is_signaled_locked(fence))
 		ops->set_deadline(fence, deadline);
+
+	dma_fence_unlock_irqrestore(fence, flags);
 	rcu_read_unlock();
 }
 EXPORT_SYMBOL(dma_fence_set_deadline);
@@ -1166,14 +1179,18 @@ EXPORT_SYMBOL(dma_fence_init64);
  */
 const char __rcu *dma_fence_driver_name(struct dma_fence *fence)
 {
+	const char __rcu *name = "detached-driver";
 	const struct dma_fence_ops *ops;
+	unsigned long flags;
 
 	/* RCU protection is required for safe access to returned string */
 	ops = rcu_dereference(fence->ops);
+	dma_fence_lock_irqsave(fence, flags);
 	if (!dma_fence_test_signaled_flag(fence))
-		return (const char __rcu *)ops->get_driver_name(fence);
-	else
-		return (const char __rcu *)"detached-driver";
+		name = ops->get_driver_name(fence);
+	dma_fence_unlock_irqrestore(fence, flags);
+
+	return name;
 }
 EXPORT_SYMBOL(dma_fence_driver_name);
 
@@ -1199,13 +1216,17 @@ EXPORT_SYMBOL(dma_fence_driver_name);
  */
 const char __rcu *dma_fence_timeline_name(struct dma_fence *fence)
 {
+	const char __rcu *name = "signaled-timeline";
 	const struct dma_fence_ops *ops;
+	unsigned long flags;
 
 	/* RCU protection is required for safe access to returned string */
 	ops = rcu_dereference(fence->ops);
+	dma_fence_lock_irqsave(fence, flags);
 	if (!dma_fence_test_signaled_flag(fence))
-		return (const char __rcu *)ops->get_driver_name(fence);
-	else
-		return (const char __rcu *)"signaled-timeline";
+		name = ops->get_driver_name(fence);
+	dma_fence_unlock_irqrestore(fence, flags);
+
+	return name;
 }
 EXPORT_SYMBOL(dma_fence_timeline_name);
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index b52ab692b22e..b93c3f7f69fb 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -547,20 +547,29 @@ static inline bool
 dma_fence_is_signaled(struct dma_fence *fence)
 {
 	const struct dma_fence_ops *ops;
+	unsigned long flags;
+	bool signaled;
 
 	if (dma_fence_test_signaled_flag(fence))
 		return true;
 
 	rcu_read_lock();
 	ops = rcu_dereference(fence->ops);
-	if (ops && ops->signaled && ops->signaled(fence)) {
+	if (!ops || !ops->signaled) {
 		rcu_read_unlock();
-		dma_fence_signal(fence);
-		return true;
+		return false;
 	}
+
+	dma_fence_lock_irqsave(fence, flags);
+	signaled = ops->signaled(fence);
+
+	if (signaled)
+		dma_fence_signal_locked(fence);
+
+	dma_fence_unlock_irqrestore(fence, flags);
 	rcu_read_unlock();
 
-	return false;
+	return signaled;
 }
 
 /**
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH] dma-fence: Fix races of fence callbacks versus destructors by locking
  2026-06-08 14:24 [RFC PATCH] dma-fence: Fix races of fence callbacks versus destructors by locking Philipp Stanner
@ 2026-06-08 14:43 ` sashiko-bot
  2026-06-08 15:01 ` Boris Brezillon
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 18+ messages in thread
From: sashiko-bot @ 2026-06-08 14:43 UTC (permalink / raw)
  To: Philipp Stanner; +Cc: linux-media, dri-devel

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [High] Double acquisition of the fence spinlock in `__dma_fence_enable_signaling` causes an unconditional self-deadlock.
- [High] Acquiring the fence spinlock in `dma_fence_set_deadline` causes deadlocks with backend drivers.
--

commit a0ed8f208472e7d15328c05be3bdfe3b20342e3b
Author: Philipp Stanner <phasta@kernel.org>

dma-fence: Fix races of fence callbacks versus destructors by locking

This patch modifies dma-fence backend operations to take the fence lock
before executing callbacks. This aims to prevent races between a driver
signaling a fence and freeing its resources while a callback is accessing
driver-specific data.

> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> --- a/drivers/dma-buf/dma-fence.c
> +++ b/drivers/dma-buf/dma-fence.c
[ ... ]
> @@ -629,7 +629,8 @@ EXPORT_SYMBOL(dma_fence_free);
>  static bool __dma_fence_enable_signaling(struct dma_fence *fence)
>  {
[ ... ]
>  	if (!was_set && ops && ops->enable_signaling) {
>  		trace_dma_fence_enable_signal(fence);
>  
> -		if (!ops->enable_signaling(fence)) {
> +		dma_fence_lock_irqsave(fence, flags);
> +		success = ops->enable_signaling(fence);
> +		dma_fence_unlock_irqrestore(fence, flags);
> +		if (!success) {

[Severity: High]
Will this cause an unconditional deadlock?

Looking at the beginning of __dma_fence_enable_signaling(), there is a
dma_fence_assert_held(fence) check, meaning the fence spinlock is already
held when entering this function.

Since standard Linux spinlocks are not recursive, won't attempting to
acquire the same lock a second time here immediately deadlock the CPU?

[ ... ]
> @@ -1020,11 +1024,20 @@ EXPORT_SYMBOL(dma_fence_wait_any_timeout);
>  void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
>  {
[ ... ]
> +	dma_fence_lock_irqsave(fence, flags);
> +	if (!dma_fence_is_signaled_locked(fence))
>  		ops->set_deadline(fence, deadline);
> +
> +	dma_fence_unlock_irqrestore(fence, flags);
>  	rcu_read_unlock();
>  }

[Severity: High]
Could taking the lock around this callback cause deadlocks with backend
drivers?

Historically this callback was invoked unlocked, and multiple backend
implementations (such as drm_sched_fence_set_deadline_finished or
msm_fence_set_deadline) explicitly acquire the fence spinlock internally.

If the core now calls them with the lock already held, it appears they will
deadlock trying to acquire the non-recursive lock again.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260608142436.265820-2-phasta@kernel.org?part=1

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH] dma-fence: Fix races of fence callbacks versus destructors by locking
  2026-06-08 14:24 [RFC PATCH] dma-fence: Fix races of fence callbacks versus destructors by locking Philipp Stanner
  2026-06-08 14:43 ` sashiko-bot
@ 2026-06-08 15:01 ` Boris Brezillon
  2026-06-08 15:17   ` Philipp Stanner
  2026-06-08 15:07 ` Tvrtko Ursulin
  2026-06-08 15:35 ` Christian König
  3 siblings, 1 reply; 18+ messages in thread
From: Boris Brezillon @ 2026-06-08 15:01 UTC (permalink / raw)
  To: Philipp Stanner
  Cc: Sumit Semwal, Christian König, Alice Ryhl, Daniel Almeida,
	Gary Guo, Tvrtko Ursulin, linux-media, dri-devel, linux-kernel,
	Danilo Krummrich

On Mon,  8 Jun 2026 16:24:37 +0200
Philipp Stanner <phasta@kernel.org> wrote:

> @@ -1020,11 +1024,20 @@ EXPORT_SYMBOL(dma_fence_wait_any_timeout);
>  void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
>  {
>  	const struct dma_fence_ops *ops;
> +	unsigned long flags;
>  
>  	rcu_read_lock();
>  	ops = rcu_dereference(fence->ops);
> -	if (ops && ops->set_deadline && !dma_fence_is_signaled(fence))
> +	if (!ops || !ops->set_deadline) {
> +		rcu_read_unlock();
> +		return;
> +	}
> +
> +	dma_fence_lock_irqsave(fence, flags);
> +	if (!dma_fence_is_signaled_locked(fence))
>  		ops->set_deadline(fence, deadline);

You can't take the fence lock around ->set_deadline(), otherwise you'll
deadlock here [1] or here [2].

> +
> +	dma_fence_unlock_irqrestore(fence, flags);
>  	rcu_read_unlock();
>  }


[1]https://elixir.bootlin.com/linux/v7.0.11/source/drivers/dma-buf/sw_sync.c#L182
[2]https://elixir.bootlin.com/linux/v7.0.11/source/drivers/gpu/drm/msm/msm_fence.c#L139

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH] dma-fence: Fix races of fence callbacks versus destructors by locking
  2026-06-08 14:24 [RFC PATCH] dma-fence: Fix races of fence callbacks versus destructors by locking Philipp Stanner
  2026-06-08 14:43 ` sashiko-bot
  2026-06-08 15:01 ` Boris Brezillon
@ 2026-06-08 15:07 ` Tvrtko Ursulin
  2026-06-08 15:15   ` Philipp Stanner
  2026-06-08 15:35 ` Christian König
  3 siblings, 1 reply; 18+ messages in thread
From: Tvrtko Ursulin @ 2026-06-08 15:07 UTC (permalink / raw)
  To: Philipp Stanner, Sumit Semwal, Christian König,
	Boris Brezillon, Alice Ryhl, Daniel Almeida, Gary Guo
  Cc: linux-media, dri-devel, linux-kernel, Danilo Krummrich


On 08/06/2026 15:24, Philipp Stanner wrote:
> The dma_fence backend_ops can access a fence. Hereby, a driver callback
> will be running which likely will access driver specific data through
> container_of(). If now, simultaneously, a driver signals the fence and
> afterwards expects to run a driver specific destructor (using the same
> data accessed through container_of()), there can be a race.
> 
> A driver very likely trusts that once it has signaled a fence, no one
> will be accessing it anymore. Moreover, it might already want to free up
> resources, making UAF bugs possible.

Can you explain this race a bit differently? I am struggling to 
understand the scenario.

Are you talking about driver freeing the data immediately after 
signalling? That is not allowed as per the "DOC: Safe external access to 
driver provided object members", ie. divers must ensure a RCU grace 
period between signalling and freeing any data which can be reached by 
any external caller.

Regards,

Tvrtko

> The race occurs because there are only pragmatic checks for the signaled
> flag of a fence, without taking the fence lock. RCU guards exist, but
> their purpose is to guard accesses through the backend_ops callbacks
> against the driver (which implements the TEXT segment these callbacks
> live in) from unloading.
> 
> Proper synchronization can be ensured by taking the fence lock. RCU is
> still simultaneously required to guard against the unload.
> 
> Fix the races by taking the lock for all non-deprecated backend_ops
> callbacks.
> 
> Conveniently, this also fixes a race where backend_ops->set_deadline()
> might try to set a deadline for an already signaled fence.
> 
> Suggested-by: Danilo Krummrich <dakr@kernel.org>
> Signed-off-by: Philipp Stanner <phasta@kernel.org>
> ---
> We discovered this problem through our Rust abstractions, but it can
> also occur in C.
> 
> The by far cleanest solution seems to be to use the fence lock. This RFC
> serves to discuss whether there is anything preventing that.
> 
> (Patch so far just compile tested, to have some groundlayer for the
> rough idea, to discuss it first)
> ---
>   drivers/dma-buf/dma-fence.c | 39 ++++++++++++++++++++++++++++---------
>   include/linux/dma-fence.h   | 17 ++++++++++++----
>   2 files changed, 43 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> index c7ea1e75d38a..b74f02f3cca8 100644
> --- a/drivers/dma-buf/dma-fence.c
> +++ b/drivers/dma-buf/dma-fence.c
> @@ -629,7 +629,8 @@ EXPORT_SYMBOL(dma_fence_free);
>   static bool __dma_fence_enable_signaling(struct dma_fence *fence)
>   {
>   	const struct dma_fence_ops *ops;
> -	bool was_set;
> +	bool was_set, success;
> +	unsigned long flags;
>   
>   	dma_fence_assert_held(fence);
>   
> @@ -644,7 +645,10 @@ static bool __dma_fence_enable_signaling(struct dma_fence *fence)
>   	if (!was_set && ops && ops->enable_signaling) {
>   		trace_dma_fence_enable_signal(fence);
>   
> -		if (!ops->enable_signaling(fence)) {
> +		dma_fence_lock_irqsave(fence, flags);
> +		success = ops->enable_signaling(fence);
> +		dma_fence_unlock_irqrestore(fence, flags);
> +		if (!success) {
>   			rcu_read_unlock();
>   			dma_fence_signal_locked(fence);
>   			return false;
> @@ -1020,11 +1024,20 @@ EXPORT_SYMBOL(dma_fence_wait_any_timeout);
>   void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
>   {
>   	const struct dma_fence_ops *ops;
> +	unsigned long flags;
>   
>   	rcu_read_lock();
>   	ops = rcu_dereference(fence->ops);
> -	if (ops && ops->set_deadline && !dma_fence_is_signaled(fence))
> +	if (!ops || !ops->set_deadline) {
> +		rcu_read_unlock();
> +		return;
> +	}
> +
> +	dma_fence_lock_irqsave(fence, flags);
> +	if (!dma_fence_is_signaled_locked(fence))
>   		ops->set_deadline(fence, deadline);
> +
> +	dma_fence_unlock_irqrestore(fence, flags);
>   	rcu_read_unlock();
>   }
>   EXPORT_SYMBOL(dma_fence_set_deadline);
> @@ -1166,14 +1179,18 @@ EXPORT_SYMBOL(dma_fence_init64);
>    */
>   const char __rcu *dma_fence_driver_name(struct dma_fence *fence)
>   {
> +	const char __rcu *name = "detached-driver";
>   	const struct dma_fence_ops *ops;
> +	unsigned long flags;
>   
>   	/* RCU protection is required for safe access to returned string */
>   	ops = rcu_dereference(fence->ops);
> +	dma_fence_lock_irqsave(fence, flags);
>   	if (!dma_fence_test_signaled_flag(fence))
> -		return (const char __rcu *)ops->get_driver_name(fence);
> -	else
> -		return (const char __rcu *)"detached-driver";
> +		name = ops->get_driver_name(fence);
> +	dma_fence_unlock_irqrestore(fence, flags);
> +
> +	return name;
>   }
>   EXPORT_SYMBOL(dma_fence_driver_name);
>   
> @@ -1199,13 +1216,17 @@ EXPORT_SYMBOL(dma_fence_driver_name);
>    */
>   const char __rcu *dma_fence_timeline_name(struct dma_fence *fence)
>   {
> +	const char __rcu *name = "signaled-timeline";
>   	const struct dma_fence_ops *ops;
> +	unsigned long flags;
>   
>   	/* RCU protection is required for safe access to returned string */
>   	ops = rcu_dereference(fence->ops);
> +	dma_fence_lock_irqsave(fence, flags);
>   	if (!dma_fence_test_signaled_flag(fence))
> -		return (const char __rcu *)ops->get_driver_name(fence);
> -	else
> -		return (const char __rcu *)"signaled-timeline";
> +		name = ops->get_driver_name(fence);
> +	dma_fence_unlock_irqrestore(fence, flags);
> +
> +	return name;
>   }
>   EXPORT_SYMBOL(dma_fence_timeline_name);
> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
> index b52ab692b22e..b93c3f7f69fb 100644
> --- a/include/linux/dma-fence.h
> +++ b/include/linux/dma-fence.h
> @@ -547,20 +547,29 @@ static inline bool
>   dma_fence_is_signaled(struct dma_fence *fence)
>   {
>   	const struct dma_fence_ops *ops;
> +	unsigned long flags;
> +	bool signaled;
>   
>   	if (dma_fence_test_signaled_flag(fence))
>   		return true;
>   
>   	rcu_read_lock();
>   	ops = rcu_dereference(fence->ops);
> -	if (ops && ops->signaled && ops->signaled(fence)) {
> +	if (!ops || !ops->signaled) {
>   		rcu_read_unlock();
> -		dma_fence_signal(fence);
> -		return true;
> +		return false;
>   	}
> +
> +	dma_fence_lock_irqsave(fence, flags);
> +	signaled = ops->signaled(fence);
> +
> +	if (signaled)
> +		dma_fence_signal_locked(fence);
> +
> +	dma_fence_unlock_irqrestore(fence, flags);
>   	rcu_read_unlock();
>   
> -	return false;
> +	return signaled;
>   }
>   
>   /**


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH] dma-fence: Fix races of fence callbacks versus destructors by locking
  2026-06-08 15:07 ` Tvrtko Ursulin
@ 2026-06-08 15:15   ` Philipp Stanner
  0 siblings, 0 replies; 18+ messages in thread
From: Philipp Stanner @ 2026-06-08 15:15 UTC (permalink / raw)
  To: Tvrtko Ursulin, Philipp Stanner, Sumit Semwal,
	Christian König, Boris Brezillon, Alice Ryhl, Daniel Almeida,
	Gary Guo
  Cc: linux-media, dri-devel, linux-kernel, Danilo Krummrich

On Mon, 2026-06-08 at 16:07 +0100, Tvrtko Ursulin wrote:
> 
> On 08/06/2026 15:24, Philipp Stanner wrote:
> > The dma_fence backend_ops can access a fence. Hereby, a driver callback
> > will be running which likely will access driver specific data through
> > container_of(). If now, simultaneously, a driver signals the fence and
> > afterwards expects to run a driver specific destructor (using the same
> > data accessed through container_of()), there can be a race.
> > 
> > A driver very likely trusts that once it has signaled a fence, no one
> > will be accessing it anymore. Moreover, it might already want to free up
> > resources, making UAF bugs possible.
> 
> Can you explain this race a bit differently? I am struggling to 
> understand the scenario.

driver:                               fence consumer:

dma_fence_signal(f)                   backend_ops() starts running, slightly misses the signaled-bit
do_sth_with_data(container_of(f))     backend_ops() accesses container_of(f)

^ race


backend_ops callbacks must not be invoked after a fence was signaled.
Right now, that is racing.

re: free(), OK, yes, that is the documented rule. Fair. But:

> 
> Are you talking about driver freeing the data immediately after 
> signalling? That is not allowed as per the "DOC: Safe external access to 
> driver provided object members", ie. divers must ensure a RCU grace 
> period between signalling and freeing any data which can be reached by 
> any external caller.

This is the documented rule mostly because this isn't solved through
locking. If the lock were to grant full synchronization, you could
immediately do your driver cleanup after signaling, because all
potential accessors would be gone.


P.

> 
> Regards,
> 
> Tvrtko
> 
> > The race occurs because there are only pragmatic checks for the signaled
> > flag of a fence, without taking the fence lock. RCU guards exist, but
> > their purpose is to guard accesses through the backend_ops callbacks
> > against the driver (which implements the TEXT segment these callbacks
> > live in) from unloading.
> > 
> > Proper synchronization can be ensured by taking the fence lock. RCU is
> > still simultaneously required to guard against the unload.
> > 
> > Fix the races by taking the lock for all non-deprecated backend_ops
> > callbacks.
> > 
> > Conveniently, this also fixes a race where backend_ops->set_deadline()
> > might try to set a deadline for an already signaled fence.
> > 
> > Suggested-by: Danilo Krummrich <dakr@kernel.org>
> > Signed-off-by: Philipp Stanner <phasta@kernel.org>
> > ---
> > We discovered this problem through our Rust abstractions, but it can
> > also occur in C.
> > 
> > The by far cleanest solution seems to be to use the fence lock. This RFC
> > serves to discuss whether there is anything preventing that.
> > 
> > (Patch so far just compile tested, to have some groundlayer for the
> > rough idea, to discuss it first)
> > ---
> >   drivers/dma-buf/dma-fence.c | 39 ++++++++++++++++++++++++++++---------
> >   include/linux/dma-fence.h   | 17 ++++++++++++----
> >   2 files changed, 43 insertions(+), 13 deletions(-)
> > 
> > diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> > index c7ea1e75d38a..b74f02f3cca8 100644
> > --- a/drivers/dma-buf/dma-fence.c
> > +++ b/drivers/dma-buf/dma-fence.c
> > @@ -629,7 +629,8 @@ EXPORT_SYMBOL(dma_fence_free);
> >   static bool __dma_fence_enable_signaling(struct dma_fence *fence)
> >   {
> >   	const struct dma_fence_ops *ops;
> > -	bool was_set;
> > +	bool was_set, success;
> > +	unsigned long flags;
> >   
> >   	dma_fence_assert_held(fence);
> >   
> > @@ -644,7 +645,10 @@ static bool __dma_fence_enable_signaling(struct dma_fence *fence)
> >   	if (!was_set && ops && ops->enable_signaling) {
> >   		trace_dma_fence_enable_signal(fence);
> >   
> > -		if (!ops->enable_signaling(fence)) {
> > +		dma_fence_lock_irqsave(fence, flags);
> > +		success = ops->enable_signaling(fence);
> > +		dma_fence_unlock_irqrestore(fence, flags);
> > +		if (!success) {
> >   			rcu_read_unlock();
> >   			dma_fence_signal_locked(fence);
> >   			return false;
> > @@ -1020,11 +1024,20 @@ EXPORT_SYMBOL(dma_fence_wait_any_timeout);
> >   void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
> >   {
> >   	const struct dma_fence_ops *ops;
> > +	unsigned long flags;
> >   
> >   	rcu_read_lock();
> >   	ops = rcu_dereference(fence->ops);
> > -	if (ops && ops->set_deadline && !dma_fence_is_signaled(fence))
> > +	if (!ops || !ops->set_deadline) {
> > +		rcu_read_unlock();
> > +		return;
> > +	}
> > +
> > +	dma_fence_lock_irqsave(fence, flags);
> > +	if (!dma_fence_is_signaled_locked(fence))
> >   		ops->set_deadline(fence, deadline);
> > +
> > +	dma_fence_unlock_irqrestore(fence, flags);
> >   	rcu_read_unlock();
> >   }
> >   EXPORT_SYMBOL(dma_fence_set_deadline);
> > @@ -1166,14 +1179,18 @@ EXPORT_SYMBOL(dma_fence_init64);
> >    */
> >   const char __rcu *dma_fence_driver_name(struct dma_fence *fence)
> >   {
> > +	const char __rcu *name = "detached-driver";
> >   	const struct dma_fence_ops *ops;
> > +	unsigned long flags;
> >   
> >   	/* RCU protection is required for safe access to returned string */
> >   	ops = rcu_dereference(fence->ops);
> > +	dma_fence_lock_irqsave(fence, flags);
> >   	if (!dma_fence_test_signaled_flag(fence))
> > -		return (const char __rcu *)ops->get_driver_name(fence);
> > -	else
> > -		return (const char __rcu *)"detached-driver";
> > +		name = ops->get_driver_name(fence);
> > +	dma_fence_unlock_irqrestore(fence, flags);
> > +
> > +	return name;
> >   }
> >   EXPORT_SYMBOL(dma_fence_driver_name);
> >   
> > @@ -1199,13 +1216,17 @@ EXPORT_SYMBOL(dma_fence_driver_name);
> >    */
> >   const char __rcu *dma_fence_timeline_name(struct dma_fence *fence)
> >   {
> > +	const char __rcu *name = "signaled-timeline";
> >   	const struct dma_fence_ops *ops;
> > +	unsigned long flags;
> >   
> >   	/* RCU protection is required for safe access to returned string */
> >   	ops = rcu_dereference(fence->ops);
> > +	dma_fence_lock_irqsave(fence, flags);
> >   	if (!dma_fence_test_signaled_flag(fence))
> > -		return (const char __rcu *)ops->get_driver_name(fence);
> > -	else
> > -		return (const char __rcu *)"signaled-timeline";
> > +		name = ops->get_driver_name(fence);
> > +	dma_fence_unlock_irqrestore(fence, flags);
> > +
> > +	return name;
> >   }
> >   EXPORT_SYMBOL(dma_fence_timeline_name);
> > diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
> > index b52ab692b22e..b93c3f7f69fb 100644
> > --- a/include/linux/dma-fence.h
> > +++ b/include/linux/dma-fence.h
> > @@ -547,20 +547,29 @@ static inline bool
> >   dma_fence_is_signaled(struct dma_fence *fence)
> >   {
> >   	const struct dma_fence_ops *ops;
> > +	unsigned long flags;
> > +	bool signaled;
> >   
> >   	if (dma_fence_test_signaled_flag(fence))
> >   		return true;
> >   
> >   	rcu_read_lock();
> >   	ops = rcu_dereference(fence->ops);
> > -	if (ops && ops->signaled && ops->signaled(fence)) {
> > +	if (!ops || !ops->signaled) {
> >   		rcu_read_unlock();
> > -		dma_fence_signal(fence);
> > -		return true;
> > +		return false;
> >   	}
> > +
> > +	dma_fence_lock_irqsave(fence, flags);
> > +	signaled = ops->signaled(fence);
> > +
> > +	if (signaled)
> > +		dma_fence_signal_locked(fence);
> > +
> > +	dma_fence_unlock_irqrestore(fence, flags);
> >   	rcu_read_unlock();
> >   
> > -	return false;
> > +	return signaled;
> >   }
> >   
> >   /**

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH] dma-fence: Fix races of fence callbacks versus destructors by locking
  2026-06-08 15:01 ` Boris Brezillon
@ 2026-06-08 15:17   ` Philipp Stanner
  2026-06-08 15:23     ` Danilo Krummrich
  0 siblings, 1 reply; 18+ messages in thread
From: Philipp Stanner @ 2026-06-08 15:17 UTC (permalink / raw)
  To: Boris Brezillon, Philipp Stanner
  Cc: Sumit Semwal, Christian König, Alice Ryhl, Daniel Almeida,
	Gary Guo, Tvrtko Ursulin, linux-media, dri-devel, linux-kernel,
	Danilo Krummrich

On Mon, 2026-06-08 at 17:01 +0200, Boris Brezillon wrote:
> On Mon,  8 Jun 2026 16:24:37 +0200
> Philipp Stanner <phasta@kernel.org> wrote:
> 
> > @@ -1020,11 +1024,20 @@ EXPORT_SYMBOL(dma_fence_wait_any_timeout);
> >  void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
> >  {
> >  	const struct dma_fence_ops *ops;
> > +	unsigned long flags;
> >  
> >  	rcu_read_lock();
> >  	ops = rcu_dereference(fence->ops);
> > -	if (ops && ops->set_deadline && !dma_fence_is_signaled(fence))
> > +	if (!ops || !ops->set_deadline) {
> > +		rcu_read_unlock();
> > +		return;
> > +	}
> > +
> > +	dma_fence_lock_irqsave(fence, flags);
> > +	if (!dma_fence_is_signaled_locked(fence))
> >  		ops->set_deadline(fence, deadline);
> 
> You can't take the fence lock around ->set_deadline(), otherwise you'll
> deadlock here [1] or here [2].
> 
> > +
> > +	dma_fence_unlock_irqrestore(fence, flags);
> >  	rcu_read_unlock();
> >  }
> 
> 
> [1]https://elixir.bootlin.com/linux/v7.0.11/source/drivers/dma-buf/sw_sync.c#L182
> [2]https://elixir.bootlin.com/linux/v7.0.11/source/drivers/gpu/drm/msm/msm_fence.c#L139


If we'd port these (and maybe some we have overlooked) simultaneously,
they could completely drop their separate locking.

The fact that other parties were forced to take the fence lock in their
callbacks (and even 100% of the functions' code) actually proves that
this RFC is probably a good idea and callback-calls should be guarded
by the fence lock :]


P.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH] dma-fence: Fix races of fence callbacks versus destructors by locking
  2026-06-08 15:17   ` Philipp Stanner
@ 2026-06-08 15:23     ` Danilo Krummrich
  2026-06-08 15:30       ` Boris Brezillon
  2026-06-08 15:30       ` Philipp Stanner
  0 siblings, 2 replies; 18+ messages in thread
From: Danilo Krummrich @ 2026-06-08 15:23 UTC (permalink / raw)
  To: Philipp Stanner
  Cc: phasta, Boris Brezillon, Sumit Semwal, Christian König,
	Alice Ryhl, Daniel Almeida, Gary Guo, Tvrtko Ursulin, linux-media,
	dri-devel, linux-kernel

On Mon Jun 8, 2026 at 5:17 PM CEST, Philipp Stanner wrote:
> On Mon, 2026-06-08 at 17:01 +0200, Boris Brezillon wrote:
>> On Mon,  8 Jun 2026 16:24:37 +0200
>> Philipp Stanner <phasta@kernel.org> wrote:
>> 
>> > @@ -1020,11 +1024,20 @@ EXPORT_SYMBOL(dma_fence_wait_any_timeout);
>> >  void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
>> >  {
>> >  	const struct dma_fence_ops *ops;
>> > +	unsigned long flags;
>> >  
>> >  	rcu_read_lock();
>> >  	ops = rcu_dereference(fence->ops);
>> > -	if (ops && ops->set_deadline && !dma_fence_is_signaled(fence))
>> > +	if (!ops || !ops->set_deadline) {
>> > +		rcu_read_unlock();
>> > +		return;
>> > +	}
>> > +
>> > +	dma_fence_lock_irqsave(fence, flags);
>> > +	if (!dma_fence_is_signaled_locked(fence))
>> >  		ops->set_deadline(fence, deadline);
>> 
>> You can't take the fence lock around ->set_deadline(), otherwise you'll
>> deadlock here [1] or here [2].
>> 
>> > +
>> > +	dma_fence_unlock_irqrestore(fence, flags);
>> >  	rcu_read_unlock();
>> >  }
>> 
>> 
>> [1]https://elixir.bootlin.com/linux/v7.0.11/source/drivers/dma-buf/sw_sync.c#L182
>> [2]https://elixir.bootlin.com/linux/v7.0.11/source/drivers/gpu/drm/msm/msm_fence.c#L139
>
>
> If we'd port these (and maybe some we have overlooked) simultaneously,
> they could completely drop their separate locking.
>
> The fact that other parties were forced to take the fence lock in their
> callbacks (and even 100% of the functions' code) actually proves that
> this RFC is probably a good idea and callback-calls should be guarded
> by the fence lock :]

I think I looked into this recently and IIRC it indeed seems like all
implementors of set_deadline() take the lock within their callback.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH] dma-fence: Fix races of fence callbacks versus destructors by locking
  2026-06-08 15:23     ` Danilo Krummrich
@ 2026-06-08 15:30       ` Boris Brezillon
  2026-06-08 15:30       ` Philipp Stanner
  1 sibling, 0 replies; 18+ messages in thread
From: Boris Brezillon @ 2026-06-08 15:30 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Philipp Stanner, phasta, Sumit Semwal, Christian König,
	Alice Ryhl, Daniel Almeida, Gary Guo, Tvrtko Ursulin, linux-media,
	dri-devel, linux-kernel

On Mon, 08 Jun 2026 17:23:06 +0200
"Danilo Krummrich" <dakr@kernel.org> wrote:

> On Mon Jun 8, 2026 at 5:17 PM CEST, Philipp Stanner wrote:
> > On Mon, 2026-06-08 at 17:01 +0200, Boris Brezillon wrote:  
> >> On Mon,  8 Jun 2026 16:24:37 +0200
> >> Philipp Stanner <phasta@kernel.org> wrote:
> >>   
> >> > @@ -1020,11 +1024,20 @@ EXPORT_SYMBOL(dma_fence_wait_any_timeout);
> >> >  void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
> >> >  {
> >> >  	const struct dma_fence_ops *ops;
> >> > +	unsigned long flags;
> >> >  
> >> >  	rcu_read_lock();
> >> >  	ops = rcu_dereference(fence->ops);
> >> > -	if (ops && ops->set_deadline && !dma_fence_is_signaled(fence))
> >> > +	if (!ops || !ops->set_deadline) {
> >> > +		rcu_read_unlock();
> >> > +		return;
> >> > +	}
> >> > +
> >> > +	dma_fence_lock_irqsave(fence, flags);
> >> > +	if (!dma_fence_is_signaled_locked(fence))
> >> >  		ops->set_deadline(fence, deadline);  
> >> 
> >> You can't take the fence lock around ->set_deadline(), otherwise you'll
> >> deadlock here [1] or here [2].
> >>   
> >> > +
> >> > +	dma_fence_unlock_irqrestore(fence, flags);
> >> >  	rcu_read_unlock();
> >> >  }  
> >> 
> >> 
> >> [1]https://elixir.bootlin.com/linux/v7.0.11/source/drivers/dma-buf/sw_sync.c#L182
> >> [2]https://elixir.bootlin.com/linux/v7.0.11/source/drivers/gpu/drm/msm/msm_fence.c#L139  
> >
> >
> > If we'd port these (and maybe some we have overlooked) simultaneously,
> > they could completely drop their separate locking.
> >
> > The fact that other parties were forced to take the fence lock in their
> > callbacks (and even 100% of the functions' code) actually proves that
> > this RFC is probably a good idea and callback-calls should be guarded
> > by the fence lock :]  
> 
> I think I looked into this recently and IIRC it indeed seems like all
> implementors of set_deadline() take the lock within their callback.

dma-fence-{chain-array}.c don't, but that's probably okay if they are
called with the container fence lock held, because we already have a
separate lockdep-class assigned to deal with the nested-locking of
dma_fence::inline_lock in the signal path.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH] dma-fence: Fix races of fence callbacks versus destructors by locking
  2026-06-08 15:23     ` Danilo Krummrich
  2026-06-08 15:30       ` Boris Brezillon
@ 2026-06-08 15:30       ` Philipp Stanner
  2026-06-08 16:16         ` Boris Brezillon
  1 sibling, 1 reply; 18+ messages in thread
From: Philipp Stanner @ 2026-06-08 15:30 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: phasta, Boris Brezillon, Sumit Semwal, Christian König,
	Alice Ryhl, Daniel Almeida, Gary Guo, Tvrtko Ursulin, linux-media,
	dri-devel, linux-kernel

On Mon, 2026-06-08 at 17:23 +0200, Danilo Krummrich wrote:
> On Mon Jun 8, 2026 at 5:17 PM CEST, Philipp Stanner wrote:
> > On Mon, 2026-06-08 at 17:01 +0200, Boris Brezillon wrote:
> > > On Mon,  8 Jun 2026 16:24:37 +0200
> > > Philipp Stanner <phasta@kernel.org> wrote:
> > > 
> > > > @@ -1020,11 +1024,20 @@ EXPORT_SYMBOL(dma_fence_wait_any_timeout);
> > > >  void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
> > > >  {
> > > >  	const struct dma_fence_ops *ops;
> > > > +	unsigned long flags;
> > > >  
> > > >  	rcu_read_lock();
> > > >  	ops = rcu_dereference(fence->ops);
> > > > -	if (ops && ops->set_deadline && !dma_fence_is_signaled(fence))
> > > > +	if (!ops || !ops->set_deadline) {
> > > > +		rcu_read_unlock();
> > > > +		return;
> > > > +	}
> > > > +
> > > > +	dma_fence_lock_irqsave(fence, flags);
> > > > +	if (!dma_fence_is_signaled_locked(fence))
> > > >  		ops->set_deadline(fence, deadline);
> > > 
> > > You can't take the fence lock around ->set_deadline(), otherwise you'll
> > > deadlock here [1] or here [2].
> > > 
> > > > +
> > > > +	dma_fence_unlock_irqrestore(fence, flags);
> > > >  	rcu_read_unlock();
> > > >  }
> > > 
> > > 
> > > [1]https://elixir.bootlin.com/linux/v7.0.11/source/drivers/dma-buf/sw_sync.c#L182
> > > [2]https://elixir.bootlin.com/linux/v7.0.11/source/drivers/gpu/drm/msm/msm_fence.c#L139

Oh, MSM actually doesn't btw, that's a false positive. That's a
distinct spinlock on their fence context object.


But yes, before we could upstream this, we would go through all the
implementors like Danilo did, to find all the others.

P.

> > 
> > 
> > If we'd port these (and maybe some we have overlooked) simultaneously,
> > they could completely drop their separate locking.
> > 
> > The fact that other parties were forced to take the fence lock in their
> > callbacks (and even 100% of the functions' code) actually proves that
> > this RFC is probably a good idea and callback-calls should be guarded
> > by the fence lock :]
> 
> I think I looked into this recently and IIRC it indeed seems like all
> implementors of set_deadline() take the lock within their callback.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH] dma-fence: Fix races of fence callbacks versus destructors by locking
  2026-06-08 14:24 [RFC PATCH] dma-fence: Fix races of fence callbacks versus destructors by locking Philipp Stanner
                   ` (2 preceding siblings ...)
  2026-06-08 15:07 ` Tvrtko Ursulin
@ 2026-06-08 15:35 ` Christian König
  2026-06-08 15:41   ` Philipp Stanner
  3 siblings, 1 reply; 18+ messages in thread
From: Christian König @ 2026-06-08 15:35 UTC (permalink / raw)
  To: Philipp Stanner, Sumit Semwal, Boris Brezillon, Alice Ryhl,
	Daniel Almeida, Gary Guo, Tvrtko Ursulin
  Cc: linux-media, dri-devel, linux-kernel, Danilo Krummrich

On 6/8/26 16:24, Philipp Stanner wrote:
> The dma_fence backend_ops can access a fence. Hereby, a driver callback
> will be running which likely will access driver specific data through
> container_of(). If now, simultaneously, a driver signals the fence and
> afterwards expects to run a driver specific destructor (using the same
> data accessed through container_of()), there can be a race.
> 
> A driver very likely trusts that once it has signaled a fence, no one
> will be accessing it anymore. Moreover, it might already want to free up
> resources, making UAF bugs possible.
> 
> The race occurs because there are only pragmatic checks for the signaled
> flag of a fence, without taking the fence lock. RCU guards exist, but
> their purpose is to guard accesses through the backend_ops callbacks
> against the driver (which implements the TEXT segment these callbacks
> live in) from unloading.
> 
> Proper synchronization can be ensured by taking the fence lock. RCU is
> still simultaneously required to guard against the unload.
> 
> Fix the races by taking the lock for all non-deprecated backend_ops
> callbacks.

That sounds like the fundamentally wrong approach to me.

The lock protects the dma_fence signaling state and *NOT* any driver state, so it should not be used to protect any driver state.

Drivers need to make sure that they protect their driver state with separate lock and don't rely on the dma_fence lock for this. This is actually the core of why we want to deprecate the shared dma_fence spinlock.

Regards,
Christian.

> 
> Conveniently, this also fixes a race where backend_ops->set_deadline()
> might try to set a deadline for an already signaled fence.
> 
> Suggested-by: Danilo Krummrich <dakr@kernel.org>
> Signed-off-by: Philipp Stanner <phasta@kernel.org>
> ---
> We discovered this problem through our Rust abstractions, but it can
> also occur in C.
> 
> The by far cleanest solution seems to be to use the fence lock. This RFC
> serves to discuss whether there is anything preventing that.
> 
> (Patch so far just compile tested, to have some groundlayer for the
> rough idea, to discuss it first)
> ---
>  drivers/dma-buf/dma-fence.c | 39 ++++++++++++++++++++++++++++---------
>  include/linux/dma-fence.h   | 17 ++++++++++++----
>  2 files changed, 43 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> index c7ea1e75d38a..b74f02f3cca8 100644
> --- a/drivers/dma-buf/dma-fence.c
> +++ b/drivers/dma-buf/dma-fence.c
> @@ -629,7 +629,8 @@ EXPORT_SYMBOL(dma_fence_free);
>  static bool __dma_fence_enable_signaling(struct dma_fence *fence)
>  {
>  	const struct dma_fence_ops *ops;
> -	bool was_set;
> +	bool was_set, success;
> +	unsigned long flags;
>  
>  	dma_fence_assert_held(fence);
>  
> @@ -644,7 +645,10 @@ static bool __dma_fence_enable_signaling(struct dma_fence *fence)
>  	if (!was_set && ops && ops->enable_signaling) {
>  		trace_dma_fence_enable_signal(fence);
>  
> -		if (!ops->enable_signaling(fence)) {
> +		dma_fence_lock_irqsave(fence, flags);
> +		success = ops->enable_signaling(fence);
> +		dma_fence_unlock_irqrestore(fence, flags);
> +		if (!success) {
>  			rcu_read_unlock();
>  			dma_fence_signal_locked(fence);
>  			return false;
> @@ -1020,11 +1024,20 @@ EXPORT_SYMBOL(dma_fence_wait_any_timeout);
>  void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
>  {
>  	const struct dma_fence_ops *ops;
> +	unsigned long flags;
>  
>  	rcu_read_lock();
>  	ops = rcu_dereference(fence->ops);
> -	if (ops && ops->set_deadline && !dma_fence_is_signaled(fence))
> +	if (!ops || !ops->set_deadline) {
> +		rcu_read_unlock();
> +		return;
> +	}
> +
> +	dma_fence_lock_irqsave(fence, flags);
> +	if (!dma_fence_is_signaled_locked(fence))
>  		ops->set_deadline(fence, deadline);
> +
> +	dma_fence_unlock_irqrestore(fence, flags);
>  	rcu_read_unlock();
>  }
>  EXPORT_SYMBOL(dma_fence_set_deadline);
> @@ -1166,14 +1179,18 @@ EXPORT_SYMBOL(dma_fence_init64);
>   */
>  const char __rcu *dma_fence_driver_name(struct dma_fence *fence)
>  {
> +	const char __rcu *name = "detached-driver";
>  	const struct dma_fence_ops *ops;
> +	unsigned long flags;
>  
>  	/* RCU protection is required for safe access to returned string */
>  	ops = rcu_dereference(fence->ops);
> +	dma_fence_lock_irqsave(fence, flags);
>  	if (!dma_fence_test_signaled_flag(fence))
> -		return (const char __rcu *)ops->get_driver_name(fence);
> -	else
> -		return (const char __rcu *)"detached-driver";
> +		name = ops->get_driver_name(fence);
> +	dma_fence_unlock_irqrestore(fence, flags);
> +
> +	return name;
>  }
>  EXPORT_SYMBOL(dma_fence_driver_name);
>  
> @@ -1199,13 +1216,17 @@ EXPORT_SYMBOL(dma_fence_driver_name);
>   */
>  const char __rcu *dma_fence_timeline_name(struct dma_fence *fence)
>  {
> +	const char __rcu *name = "signaled-timeline";
>  	const struct dma_fence_ops *ops;
> +	unsigned long flags;
>  
>  	/* RCU protection is required for safe access to returned string */
>  	ops = rcu_dereference(fence->ops);
> +	dma_fence_lock_irqsave(fence, flags);
>  	if (!dma_fence_test_signaled_flag(fence))
> -		return (const char __rcu *)ops->get_driver_name(fence);
> -	else
> -		return (const char __rcu *)"signaled-timeline";
> +		name = ops->get_driver_name(fence);
> +	dma_fence_unlock_irqrestore(fence, flags);
> +
> +	return name;
>  }
>  EXPORT_SYMBOL(dma_fence_timeline_name);
> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
> index b52ab692b22e..b93c3f7f69fb 100644
> --- a/include/linux/dma-fence.h
> +++ b/include/linux/dma-fence.h
> @@ -547,20 +547,29 @@ static inline bool
>  dma_fence_is_signaled(struct dma_fence *fence)
>  {
>  	const struct dma_fence_ops *ops;
> +	unsigned long flags;
> +	bool signaled;
>  
>  	if (dma_fence_test_signaled_flag(fence))
>  		return true;
>  
>  	rcu_read_lock();
>  	ops = rcu_dereference(fence->ops);
> -	if (ops && ops->signaled && ops->signaled(fence)) {
> +	if (!ops || !ops->signaled) {
>  		rcu_read_unlock();
> -		dma_fence_signal(fence);
> -		return true;
> +		return false;
>  	}
> +
> +	dma_fence_lock_irqsave(fence, flags);
> +	signaled = ops->signaled(fence);
> +
> +	if (signaled)
> +		dma_fence_signal_locked(fence);
> +
> +	dma_fence_unlock_irqrestore(fence, flags);
>  	rcu_read_unlock();
>  
> -	return false;
> +	return signaled;
>  }
>  
>  /**


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH] dma-fence: Fix races of fence callbacks versus destructors by locking
  2026-06-08 15:35 ` Christian König
@ 2026-06-08 15:41   ` Philipp Stanner
  2026-06-08 17:34     ` Christian König
  0 siblings, 1 reply; 18+ messages in thread
From: Philipp Stanner @ 2026-06-08 15:41 UTC (permalink / raw)
  To: Christian König, Philipp Stanner, Sumit Semwal,
	Boris Brezillon, Alice Ryhl, Daniel Almeida, Gary Guo,
	Tvrtko Ursulin
  Cc: linux-media, dri-devel, linux-kernel, Danilo Krummrich

On Mon, 2026-06-08 at 17:35 +0200, Christian König wrote:
> On 6/8/26 16:24, Philipp Stanner wrote:
> > The dma_fence backend_ops can access a fence. Hereby, a driver callback
> > will be running which likely will access driver specific data through
> > container_of(). If now, simultaneously, a driver signals the fence and
> > afterwards expects to run a driver specific destructor (using the same
> > data accessed through container_of()), there can be a race.
> > 
> > A driver very likely trusts that once it has signaled a fence, no one
> > will be accessing it anymore. Moreover, it might already want to free up
> > resources, making UAF bugs possible.
> > 
> > The race occurs because there are only pragmatic checks for the signaled
> > flag of a fence, without taking the fence lock. RCU guards exist, but
> > their purpose is to guard accesses through the backend_ops callbacks
> > against the driver (which implements the TEXT segment these callbacks
> > live in) from unloading.
> > 
> > Proper synchronization can be ensured by taking the fence lock. RCU is
> > still simultaneously required to guard against the unload.
> > 
> > Fix the races by taking the lock for all non-deprecated backend_ops
> > callbacks.
> 
> That sounds like the fundamentally wrong approach to me.
> 
> The lock protects the dma_fence signaling state and *NOT* any driver
> state, so it should not be used to protect any driver state.
> 
> Drivers need to make sure that they protect their driver state with
> separate lock and don't rely on the dma_fence lock for this. This is
> actually the core of why we want to deprecate the shared dma_fence
> spinlock.

It's not so much about protecting data, it's about correctness:

A driver that calls

dma_fence_signal(f)

expects that after signalling, no callback will be running into the
driver again. It's a fix synchronization point.

Only the fence lock can grant such synchronization.

Positive effects would be:

1. Drivers can do their cleanup immediately, without having to wait for
a grace period

2. Drivers could be sure that their driver_fence data, allocated
together with fence and accessed through container_of(fence), is not
being accessed anymore.


I see only advantages. Safer, faster. :)

P.

> 
> Regards,
> Christian.
> 
> > 
> > Conveniently, this also fixes a race where backend_ops->set_deadline()
> > might try to set a deadline for an already signaled fence.
> > 
> > Suggested-by: Danilo Krummrich <dakr@kernel.org>
> > Signed-off-by: Philipp Stanner <phasta@kernel.org>
> > ---
> > We discovered this problem through our Rust abstractions, but it can
> > also occur in C.
> > 
> > The by far cleanest solution seems to be to use the fence lock. This RFC
> > serves to discuss whether there is anything preventing that.
> > 
> > (Patch so far just compile tested, to have some groundlayer for the
> > rough idea, to discuss it first)
> > ---
> >  drivers/dma-buf/dma-fence.c | 39 ++++++++++++++++++++++++++++---------
> >  include/linux/dma-fence.h   | 17 ++++++++++++----
> >  2 files changed, 43 insertions(+), 13 deletions(-)
> > 
> > diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> > index c7ea1e75d38a..b74f02f3cca8 100644
> > --- a/drivers/dma-buf/dma-fence.c
> > +++ b/drivers/dma-buf/dma-fence.c
> > @@ -629,7 +629,8 @@ EXPORT_SYMBOL(dma_fence_free);
> >  static bool __dma_fence_enable_signaling(struct dma_fence *fence)
> >  {
> >  	const struct dma_fence_ops *ops;
> > -	bool was_set;
> > +	bool was_set, success;
> > +	unsigned long flags;
> >  
> >  	dma_fence_assert_held(fence);
> >  
> > @@ -644,7 +645,10 @@ static bool __dma_fence_enable_signaling(struct dma_fence *fence)
> >  	if (!was_set && ops && ops->enable_signaling) {
> >  		trace_dma_fence_enable_signal(fence);
> >  
> > -		if (!ops->enable_signaling(fence)) {
> > +		dma_fence_lock_irqsave(fence, flags);
> > +		success = ops->enable_signaling(fence);
> > +		dma_fence_unlock_irqrestore(fence, flags);
> > +		if (!success) {
> >  			rcu_read_unlock();
> >  			dma_fence_signal_locked(fence);
> >  			return false;
> > @@ -1020,11 +1024,20 @@ EXPORT_SYMBOL(dma_fence_wait_any_timeout);
> >  void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
> >  {
> >  	const struct dma_fence_ops *ops;
> > +	unsigned long flags;
> >  
> >  	rcu_read_lock();
> >  	ops = rcu_dereference(fence->ops);
> > -	if (ops && ops->set_deadline && !dma_fence_is_signaled(fence))
> > +	if (!ops || !ops->set_deadline) {
> > +		rcu_read_unlock();
> > +		return;
> > +	}
> > +
> > +	dma_fence_lock_irqsave(fence, flags);
> > +	if (!dma_fence_is_signaled_locked(fence))
> >  		ops->set_deadline(fence, deadline);
> > +
> > +	dma_fence_unlock_irqrestore(fence, flags);
> >  	rcu_read_unlock();
> >  }
> >  EXPORT_SYMBOL(dma_fence_set_deadline);
> > @@ -1166,14 +1179,18 @@ EXPORT_SYMBOL(dma_fence_init64);
> >   */
> >  const char __rcu *dma_fence_driver_name(struct dma_fence *fence)
> >  {
> > +	const char __rcu *name = "detached-driver";
> >  	const struct dma_fence_ops *ops;
> > +	unsigned long flags;
> >  
> >  	/* RCU protection is required for safe access to returned string */
> >  	ops = rcu_dereference(fence->ops);
> > +	dma_fence_lock_irqsave(fence, flags);
> >  	if (!dma_fence_test_signaled_flag(fence))
> > -		return (const char __rcu *)ops->get_driver_name(fence);
> > -	else
> > -		return (const char __rcu *)"detached-driver";
> > +		name = ops->get_driver_name(fence);
> > +	dma_fence_unlock_irqrestore(fence, flags);
> > +
> > +	return name;
> >  }
> >  EXPORT_SYMBOL(dma_fence_driver_name);
> >  
> > @@ -1199,13 +1216,17 @@ EXPORT_SYMBOL(dma_fence_driver_name);
> >   */
> >  const char __rcu *dma_fence_timeline_name(struct dma_fence *fence)
> >  {
> > +	const char __rcu *name = "signaled-timeline";
> >  	const struct dma_fence_ops *ops;
> > +	unsigned long flags;
> >  
> >  	/* RCU protection is required for safe access to returned string */
> >  	ops = rcu_dereference(fence->ops);
> > +	dma_fence_lock_irqsave(fence, flags);
> >  	if (!dma_fence_test_signaled_flag(fence))
> > -		return (const char __rcu *)ops->get_driver_name(fence);
> > -	else
> > -		return (const char __rcu *)"signaled-timeline";
> > +		name = ops->get_driver_name(fence);
> > +	dma_fence_unlock_irqrestore(fence, flags);
> > +
> > +	return name;
> >  }
> >  EXPORT_SYMBOL(dma_fence_timeline_name);
> > diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
> > index b52ab692b22e..b93c3f7f69fb 100644
> > --- a/include/linux/dma-fence.h
> > +++ b/include/linux/dma-fence.h
> > @@ -547,20 +547,29 @@ static inline bool
> >  dma_fence_is_signaled(struct dma_fence *fence)
> >  {
> >  	const struct dma_fence_ops *ops;
> > +	unsigned long flags;
> > +	bool signaled;
> >  
> >  	if (dma_fence_test_signaled_flag(fence))
> >  		return true;
> >  
> >  	rcu_read_lock();
> >  	ops = rcu_dereference(fence->ops);
> > -	if (ops && ops->signaled && ops->signaled(fence)) {
> > +	if (!ops || !ops->signaled) {
> >  		rcu_read_unlock();
> > -		dma_fence_signal(fence);
> > -		return true;
> > +		return false;
> >  	}
> > +
> > +	dma_fence_lock_irqsave(fence, flags);
> > +	signaled = ops->signaled(fence);
> > +
> > +	if (signaled)
> > +		dma_fence_signal_locked(fence);
> > +
> > +	dma_fence_unlock_irqrestore(fence, flags);
> >  	rcu_read_unlock();
> >  
> > -	return false;
> > +	return signaled;
> >  }
> >  
> >  /**

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH] dma-fence: Fix races of fence callbacks versus destructors by locking
  2026-06-08 15:30       ` Philipp Stanner
@ 2026-06-08 16:16         ` Boris Brezillon
  0 siblings, 0 replies; 18+ messages in thread
From: Boris Brezillon @ 2026-06-08 16:16 UTC (permalink / raw)
  To: Philipp Stanner
  Cc: phasta, Danilo Krummrich, Sumit Semwal, Christian König,
	Alice Ryhl, Daniel Almeida, Gary Guo, Tvrtko Ursulin, linux-media,
	dri-devel, linux-kernel

On Mon, 08 Jun 2026 17:30:58 +0200
Philipp Stanner <phasta@mailbox.org> wrote:

> On Mon, 2026-06-08 at 17:23 +0200, Danilo Krummrich wrote:
> > On Mon Jun 8, 2026 at 5:17 PM CEST, Philipp Stanner wrote:  
> > > On Mon, 2026-06-08 at 17:01 +0200, Boris Brezillon wrote:  
> > > > On Mon,  8 Jun 2026 16:24:37 +0200
> > > > Philipp Stanner <phasta@kernel.org> wrote:
> > > >   
> > > > > @@ -1020,11 +1024,20 @@ EXPORT_SYMBOL(dma_fence_wait_any_timeout);
> > > > >  void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
> > > > >  {
> > > > >  	const struct dma_fence_ops *ops;
> > > > > +	unsigned long flags;
> > > > >  
> > > > >  	rcu_read_lock();
> > > > >  	ops = rcu_dereference(fence->ops);
> > > > > -	if (ops && ops->set_deadline && !dma_fence_is_signaled(fence))
> > > > > +	if (!ops || !ops->set_deadline) {
> > > > > +		rcu_read_unlock();
> > > > > +		return;
> > > > > +	}
> > > > > +
> > > > > +	dma_fence_lock_irqsave(fence, flags);
> > > > > +	if (!dma_fence_is_signaled_locked(fence))
> > > > >  		ops->set_deadline(fence, deadline);  
> > > > 
> > > > You can't take the fence lock around ->set_deadline(), otherwise you'll
> > > > deadlock here [1] or here [2].
> > > >   
> > > > > +
> > > > > +	dma_fence_unlock_irqrestore(fence, flags);
> > > > >  	rcu_read_unlock();
> > > > >  }  
> > > > 
> > > > 
> > > > [1]https://elixir.bootlin.com/linux/v7.0.11/source/drivers/dma-buf/sw_sync.c#L182
> > > > [2]https://elixir.bootlin.com/linux/v7.0.11/source/drivers/gpu/drm/msm/msm_fence.c#L139  
> 
> Oh, MSM actually doesn't btw, that's a false positive. That's a
> distinct spinlock on their fence context object.

It's not, it's the same lock they attach to all their fences coming
from this context. It's just that this lock appears to be per-context,
like is the case for basically every driver, since the inline lock was
introduced only in this release cycle.

Anyway, this is all stuff we can fix if people think it's okay to
protect dma_fence_ops calls with the fence lock. But my point remains:
each op has its own locking-rules, some are called with the fence lock
held (enable_signaling(), signaled()), others are not (set_deadline(),
get_xxx_name()), so we need to carefully audit each of those to make
sure:

- calling with the lock held in the new places is not causing a
  deadlock
- the returned data, if not a scalar, is protected by the RCU read lock
- any driver implementing ops that can be called without the lock held
  need to hold on the device data for an RCU grace period

The last bullet is probably the one I'm the most worried about, because
instead of a single rule that applies to all ops, we have various cases
based on whether some ops are implemented or not, but that's already
the case with deprecated ops like .release() or .wait(), so maybe
that's okay with the proper doc.

If I were to choose, I'd probably go for a dedicated rwlock_t to
protect dma_fence_ops, so we can:

- protect all dma_buf_ops::xx() consistently no matter the kind of op
- protect returned data (get_xxx_name()) with this lock instead of the
  RCU read lock

But the overhead of this extra lock might not be acceptable, dunno.

> 
> 
> But yes, before we could upstream this, we would go through all the
> implementors like Danilo did, to find all the others.

There's the two I pointed out, plus the array/chain containers I
mentioned, which are not problematic.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH] dma-fence: Fix races of fence callbacks versus destructors by locking
  2026-06-08 15:41   ` Philipp Stanner
@ 2026-06-08 17:34     ` Christian König
  2026-06-08 17:59       ` Danilo Krummrich
  0 siblings, 1 reply; 18+ messages in thread
From: Christian König @ 2026-06-08 17:34 UTC (permalink / raw)
  To: phasta, Sumit Semwal, Boris Brezillon, Alice Ryhl, Daniel Almeida,
	Gary Guo, Tvrtko Ursulin
  Cc: linux-media, dri-devel, linux-kernel, Danilo Krummrich

On 6/8/26 17:41, Philipp Stanner wrote:
> On Mon, 2026-06-08 at 17:35 +0200, Christian König wrote:
>> On 6/8/26 16:24, Philipp Stanner wrote:
>>> The dma_fence backend_ops can access a fence. Hereby, a driver callback
>>> will be running which likely will access driver specific data through
>>> container_of(). If now, simultaneously, a driver signals the fence and
>>> afterwards expects to run a driver specific destructor (using the same
>>> data accessed through container_of()), there can be a race.
>>>
>>> A driver very likely trusts that once it has signaled a fence, no one
>>> will be accessing it anymore. Moreover, it might already want to free up
>>> resources, making UAF bugs possible.
>>>
>>> The race occurs because there are only pragmatic checks for the signaled
>>> flag of a fence, without taking the fence lock. RCU guards exist, but
>>> their purpose is to guard accesses through the backend_ops callbacks
>>> against the driver (which implements the TEXT segment these callbacks
>>> live in) from unloading.
>>>
>>> Proper synchronization can be ensured by taking the fence lock. RCU is
>>> still simultaneously required to guard against the unload.
>>>
>>> Fix the races by taking the lock for all non-deprecated backend_ops
>>> callbacks.
>>
>> That sounds like the fundamentally wrong approach to me.
>>
>> The lock protects the dma_fence signaling state and *NOT* any driver
>> state, so it should not be used to protect any driver state.
>>
>> Drivers need to make sure that they protect their driver state with
>> separate lock and don't rely on the dma_fence lock for this. This is
>> actually the core of why we want to deprecate the shared dma_fence
>> spinlock.
> 
> It's not so much about protecting data, it's about correctness:
> 
> A driver that calls
> 
> dma_fence_signal(f)
> 
> expects that after signalling, no callback will be running into the
> driver again.

No, that is not even remotely correct.

That's why we need the RCU grace period to make sure that nobody is referencing the driver stuff any more.

DMA fence destruction has to wait for an RCU grace period for exactly the same reason as well.

If we want to cleanup I would start there. And then eventually stop calling callbacks with the fence lock held and only hold the RCU read side.

Regards,
Christian.

> It's a fix synchronization point.
> 
> Only the fence lock can grant such synchronization.
> 
> Positive effects would be:
> 
> 1. Drivers can do their cleanup immediately, without having to wait for
> a grace period
> 
> 2. Drivers could be sure that their driver_fence data, allocated
> together with fence and accessed through container_of(fence), is not
> being accessed anymore.
> 
> 
> I see only advantages. Safer, faster. :)
> 
> P.
> 
>>
>> Regards,
>> Christian.
>>
>>>
>>> Conveniently, this also fixes a race where backend_ops->set_deadline()
>>> might try to set a deadline for an already signaled fence.
>>>
>>> Suggested-by: Danilo Krummrich <dakr@kernel.org>
>>> Signed-off-by: Philipp Stanner <phasta@kernel.org>
>>> ---
>>> We discovered this problem through our Rust abstractions, but it can
>>> also occur in C.
>>>
>>> The by far cleanest solution seems to be to use the fence lock. This RFC
>>> serves to discuss whether there is anything preventing that.
>>>
>>> (Patch so far just compile tested, to have some groundlayer for the
>>> rough idea, to discuss it first)
>>> ---
>>>  drivers/dma-buf/dma-fence.c | 39 ++++++++++++++++++++++++++++---------
>>>  include/linux/dma-fence.h   | 17 ++++++++++++----
>>>  2 files changed, 43 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
>>> index c7ea1e75d38a..b74f02f3cca8 100644
>>> --- a/drivers/dma-buf/dma-fence.c
>>> +++ b/drivers/dma-buf/dma-fence.c
>>> @@ -629,7 +629,8 @@ EXPORT_SYMBOL(dma_fence_free);
>>>  static bool __dma_fence_enable_signaling(struct dma_fence *fence)
>>>  {
>>>  	const struct dma_fence_ops *ops;
>>> -	bool was_set;
>>> +	bool was_set, success;
>>> +	unsigned long flags;
>>>  
>>>  	dma_fence_assert_held(fence);
>>>  
>>> @@ -644,7 +645,10 @@ static bool __dma_fence_enable_signaling(struct dma_fence *fence)
>>>  	if (!was_set && ops && ops->enable_signaling) {
>>>  		trace_dma_fence_enable_signal(fence);
>>>  
>>> -		if (!ops->enable_signaling(fence)) {
>>> +		dma_fence_lock_irqsave(fence, flags);
>>> +		success = ops->enable_signaling(fence);
>>> +		dma_fence_unlock_irqrestore(fence, flags);
>>> +		if (!success) {
>>>  			rcu_read_unlock();
>>>  			dma_fence_signal_locked(fence);
>>>  			return false;
>>> @@ -1020,11 +1024,20 @@ EXPORT_SYMBOL(dma_fence_wait_any_timeout);
>>>  void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
>>>  {
>>>  	const struct dma_fence_ops *ops;
>>> +	unsigned long flags;
>>>  
>>>  	rcu_read_lock();
>>>  	ops = rcu_dereference(fence->ops);
>>> -	if (ops && ops->set_deadline && !dma_fence_is_signaled(fence))
>>> +	if (!ops || !ops->set_deadline) {
>>> +		rcu_read_unlock();
>>> +		return;
>>> +	}
>>> +
>>> +	dma_fence_lock_irqsave(fence, flags);
>>> +	if (!dma_fence_is_signaled_locked(fence))
>>>  		ops->set_deadline(fence, deadline);
>>> +
>>> +	dma_fence_unlock_irqrestore(fence, flags);
>>>  	rcu_read_unlock();
>>>  }
>>>  EXPORT_SYMBOL(dma_fence_set_deadline);
>>> @@ -1166,14 +1179,18 @@ EXPORT_SYMBOL(dma_fence_init64);
>>>   */
>>>  const char __rcu *dma_fence_driver_name(struct dma_fence *fence)
>>>  {
>>> +	const char __rcu *name = "detached-driver";
>>>  	const struct dma_fence_ops *ops;
>>> +	unsigned long flags;
>>>  
>>>  	/* RCU protection is required for safe access to returned string */
>>>  	ops = rcu_dereference(fence->ops);
>>> +	dma_fence_lock_irqsave(fence, flags);
>>>  	if (!dma_fence_test_signaled_flag(fence))
>>> -		return (const char __rcu *)ops->get_driver_name(fence);
>>> -	else
>>> -		return (const char __rcu *)"detached-driver";
>>> +		name = ops->get_driver_name(fence);
>>> +	dma_fence_unlock_irqrestore(fence, flags);
>>> +
>>> +	return name;
>>>  }
>>>  EXPORT_SYMBOL(dma_fence_driver_name);
>>>  
>>> @@ -1199,13 +1216,17 @@ EXPORT_SYMBOL(dma_fence_driver_name);
>>>   */
>>>  const char __rcu *dma_fence_timeline_name(struct dma_fence *fence)
>>>  {
>>> +	const char __rcu *name = "signaled-timeline";
>>>  	const struct dma_fence_ops *ops;
>>> +	unsigned long flags;
>>>  
>>>  	/* RCU protection is required for safe access to returned string */
>>>  	ops = rcu_dereference(fence->ops);
>>> +	dma_fence_lock_irqsave(fence, flags);
>>>  	if (!dma_fence_test_signaled_flag(fence))
>>> -		return (const char __rcu *)ops->get_driver_name(fence);
>>> -	else
>>> -		return (const char __rcu *)"signaled-timeline";
>>> +		name = ops->get_driver_name(fence);
>>> +	dma_fence_unlock_irqrestore(fence, flags);
>>> +
>>> +	return name;
>>>  }
>>>  EXPORT_SYMBOL(dma_fence_timeline_name);
>>> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
>>> index b52ab692b22e..b93c3f7f69fb 100644
>>> --- a/include/linux/dma-fence.h
>>> +++ b/include/linux/dma-fence.h
>>> @@ -547,20 +547,29 @@ static inline bool
>>>  dma_fence_is_signaled(struct dma_fence *fence)
>>>  {
>>>  	const struct dma_fence_ops *ops;
>>> +	unsigned long flags;
>>> +	bool signaled;
>>>  
>>>  	if (dma_fence_test_signaled_flag(fence))
>>>  		return true;
>>>  
>>>  	rcu_read_lock();
>>>  	ops = rcu_dereference(fence->ops);
>>> -	if (ops && ops->signaled && ops->signaled(fence)) {
>>> +	if (!ops || !ops->signaled) {
>>>  		rcu_read_unlock();
>>> -		dma_fence_signal(fence);
>>> -		return true;
>>> +		return false;
>>>  	}
>>> +
>>> +	dma_fence_lock_irqsave(fence, flags);
>>> +	signaled = ops->signaled(fence);
>>> +
>>> +	if (signaled)
>>> +		dma_fence_signal_locked(fence);
>>> +
>>> +	dma_fence_unlock_irqrestore(fence, flags);
>>>  	rcu_read_unlock();
>>>  
>>> -	return false;
>>> +	return signaled;
>>>  }
>>>  
>>>  /**


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH] dma-fence: Fix races of fence callbacks versus destructors by locking
  2026-06-08 17:34     ` Christian König
@ 2026-06-08 17:59       ` Danilo Krummrich
  2026-06-08 18:32         ` Christian König
  0 siblings, 1 reply; 18+ messages in thread
From: Danilo Krummrich @ 2026-06-08 17:59 UTC (permalink / raw)
  To: Christian König
  Cc: phasta, Sumit Semwal, Boris Brezillon, Alice Ryhl, Daniel Almeida,
	Gary Guo, Tvrtko Ursulin, linux-media, dri-devel, linux-kernel

On Mon Jun 8, 2026 at 7:34 PM CEST, Christian König wrote:
> That's why we need the RCU grace period to make sure that nobody is
> referencing the driver stuff any more.

Right, and that's what Philipp tries to address, the requirement to wait for an
RCU grace period is perfectly fine if it is only about freeing memory, but it
can become painful if the fence private data contains data also needs to be
destructed in some way.

IOW, if a driver signals a fence, it is lifecycle-wise reasonable to destruct
the private data that is no longer needed (remaining users only deal with struct
dma_fence) and having to wait for a full grace period adds sublety and
complication that can be avoided with the proposed approach.

That said, I'd like to ask the opposite question: What are the concerns with the
proposed approach over (pure) RCU?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH] dma-fence: Fix races of fence callbacks versus destructors by locking
  2026-06-08 17:59       ` Danilo Krummrich
@ 2026-06-08 18:32         ` Christian König
  2026-06-08 18:39           ` Danilo Krummrich
  0 siblings, 1 reply; 18+ messages in thread
From: Christian König @ 2026-06-08 18:32 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: phasta, Sumit Semwal, Boris Brezillon, Alice Ryhl, Daniel Almeida,
	Gary Guo, Tvrtko Ursulin, linux-media, dri-devel, linux-kernel

On 6/8/26 19:59, Danilo Krummrich wrote:
> On Mon Jun 8, 2026 at 7:34 PM CEST, Christian König wrote:
>> That's why we need the RCU grace period to make sure that nobody is
>> referencing the driver stuff any more.
> 
> Right, and that's what Philipp tries to address, the requirement to wait for an
> RCU grace period is perfectly fine if it is only about freeing memory, but it
> can become painful if the fence private data contains data also needs to be
> destructed in some way.

Yeah that makes sense.

> IOW, if a driver signals a fence, it is lifecycle-wise reasonable to destruct
> the private data that is no longer needed (remaining users only deal with struct
> dma_fence) and having to wait for a full grace period adds sublety and
> complication that can be avoided with the proposed approach.

Yeah, I've run into that when I tried to make the amdgpu fences independent as well.
> That said, I'd like to ask the opposite question: What are the concerns with the
> proposed approach over (pure) RCU?

Well a) locking inversions and b) performance.

For example the reason why we have the dma_fence_is_signaled() and dma_fence_is_signaled_locked() variants is because there is a measurable difference in some specific use cases for not grabbing the locks.

I personally find those micro-optimizations rather questionable, but the community agreement is that we should have them.

So my take would rather be that the dma_fence_is_signaled_locked() variant goes away and we consistently call the ops pointers without holding the dma_fence lock and the driver implementations can then optionally take it if necessary.

I think for this we would just need to replace most calls to dma_fence_is_signaled_locked() with dma_fence_test_signaled().

In the long term that would also allow cleaning up the container handling and simplifying the DRM scheduler a bit.

Regards,
Christian.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH] dma-fence: Fix races of fence callbacks versus destructors by locking
  2026-06-08 18:32         ` Christian König
@ 2026-06-08 18:39           ` Danilo Krummrich
  2026-06-08 18:47             ` Christian König
  0 siblings, 1 reply; 18+ messages in thread
From: Danilo Krummrich @ 2026-06-08 18:39 UTC (permalink / raw)
  To: Christian König
  Cc: phasta, Sumit Semwal, Boris Brezillon, Alice Ryhl, Daniel Almeida,
	Gary Guo, Tvrtko Ursulin, linux-media, dri-devel, linux-kernel

On Mon Jun 8, 2026 at 8:32 PM CEST, Christian König wrote:
> On 6/8/26 19:59, Danilo Krummrich wrote:
>> On Mon Jun 8, 2026 at 7:34 PM CEST, Christian König wrote:
>>> That's why we need the RCU grace period to make sure that nobody is
>>> referencing the driver stuff any more.
>> 
>> Right, and that's what Philipp tries to address, the requirement to wait for an
>> RCU grace period is perfectly fine if it is only about freeing memory, but it
>> can become painful if the fence private data contains data also needs to be
>> destructed in some way.
>
> Yeah that makes sense.
>
>> IOW, if a driver signals a fence, it is lifecycle-wise reasonable to destruct
>> the private data that is no longer needed (remaining users only deal with struct
>> dma_fence) and having to wait for a full grace period adds sublety and
>> complication that can be avoided with the proposed approach.
>
> Yeah, I've run into that when I tried to make the amdgpu fences independent as well.
>> That said, I'd like to ask the opposite question: What are the concerns with the
>> proposed approach over (pure) RCU?
>
> Well a) locking inversions and b) performance.
>
> For example the reason why we have the dma_fence_is_signaled() and
> dma_fence_is_signaled_locked() variants is because there is a measurable
> difference in some specific use cases for not grabbing the locks.

I checked for this as well, but couldn't find a case where
dma_fence_is_signaled() is used in a way where it would be performance critical
to avoid the lock in any way.

Note that the lock is only bypassed when the fence is signaled already (this
would be preserved) and if signaled() returns false, i.e. dma_fence_signal()
will take the lock anyways.

> I personally find those micro-optimizations rather questionable, but the
> community agreement is that we should have them.

I agree, it is rather questionable. So, I wouldn't make this the deciding factor
unless someone can present a valid case where it actually matters.

> So my take would rather be that the dma_fence_is_signaled_locked() variant
> goes away and we consistently call the ops pointers without holding the
> dma_fence lock and the driver implementations can then optionally take it if
> necessary.

How did you get to this conclusion considering that you run into what I
mentioned above as well and the fact that we seem to agree that the performance
concern is rather questionable?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH] dma-fence: Fix races of fence callbacks versus destructors by locking
  2026-06-08 18:39           ` Danilo Krummrich
@ 2026-06-08 18:47             ` Christian König
  2026-06-08 19:25               ` Danilo Krummrich
  0 siblings, 1 reply; 18+ messages in thread
From: Christian König @ 2026-06-08 18:47 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: phasta, Sumit Semwal, Boris Brezillon, Alice Ryhl, Daniel Almeida,
	Gary Guo, Tvrtko Ursulin, linux-media, dri-devel, linux-kernel

On 6/8/26 20:39, Danilo Krummrich wrote:
> On Mon Jun 8, 2026 at 8:32 PM CEST, Christian König wrote:
>> On 6/8/26 19:59, Danilo Krummrich wrote:
>>> On Mon Jun 8, 2026 at 7:34 PM CEST, Christian König wrote:
>>>> That's why we need the RCU grace period to make sure that nobody is
>>>> referencing the driver stuff any more.
>>>
>>> Right, and that's what Philipp tries to address, the requirement to wait for an
>>> RCU grace period is perfectly fine if it is only about freeing memory, but it
>>> can become painful if the fence private data contains data also needs to be
>>> destructed in some way.
>>
>> Yeah that makes sense.
>>
>>> IOW, if a driver signals a fence, it is lifecycle-wise reasonable to destruct
>>> the private data that is no longer needed (remaining users only deal with struct
>>> dma_fence) and having to wait for a full grace period adds sublety and
>>> complication that can be avoided with the proposed approach.
>>
>> Yeah, I've run into that when I tried to make the amdgpu fences independent as well.
>>> That said, I'd like to ask the opposite question: What are the concerns with the
>>> proposed approach over (pure) RCU?
>>
>> Well a) locking inversions and b) performance.
>>
>> For example the reason why we have the dma_fence_is_signaled() and
>> dma_fence_is_signaled_locked() variants is because there is a measurable
>> difference in some specific use cases for not grabbing the locks.
> 
> I checked for this as well, but couldn't find a case where
> dma_fence_is_signaled() is used in a way where it would be performance critical
> to avoid the lock in any way.
> 
> Note that the lock is only bypassed when the fence is signaled already (this
> would be preserved) and if signaled() returns false, i.e. dma_fence_signal()
> will take the lock anyways.
> 
>> I personally find those micro-optimizations rather questionable, but the
>> community agreement is that we should have them.
> 
> I agree, it is rather questionable. So, I wouldn't make this the deciding factor
> unless someone can present a valid case where it actually matters.
> 
>> So my take would rather be that the dma_fence_is_signaled_locked() variant
>> goes away and we consistently call the ops pointers without holding the
>> dma_fence lock and the driver implementations can then optionally take it if
>> necessary.
> 
> How did you get to this conclusion considering that you run into what I
> mentioned above as well and the fact that we seem to agree that the performance
> concern is rather questionable?

Quite simple, it's the cleaner approach.

Calling callbacks with locks held is rather questionable even putting the performance issue aside.

In detail calling the callbacks without holding locks allows all implementations who need it to explicitly take locks in the order they want.

If you call it with the lock held you enforce the fence lock the be the outermost lock.

Regards,
Christian.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH] dma-fence: Fix races of fence callbacks versus destructors by locking
  2026-06-08 18:47             ` Christian König
@ 2026-06-08 19:25               ` Danilo Krummrich
  0 siblings, 0 replies; 18+ messages in thread
From: Danilo Krummrich @ 2026-06-08 19:25 UTC (permalink / raw)
  To: Christian König
  Cc: phasta, Sumit Semwal, Boris Brezillon, Alice Ryhl, Daniel Almeida,
	Gary Guo, Tvrtko Ursulin, linux-media, dri-devel, linux-kernel

On Mon Jun 8, 2026 at 8:47 PM CEST, Christian König wrote:
> On 6/8/26 20:39, Danilo Krummrich wrote:
>> On Mon Jun 8, 2026 at 8:32 PM CEST, Christian König wrote:
>>> On 6/8/26 19:59, Danilo Krummrich wrote:
>>>> On Mon Jun 8, 2026 at 7:34 PM CEST, Christian König wrote:
>>>>> That's why we need the RCU grace period to make sure that nobody is
>>>>> referencing the driver stuff any more.
>>>>
>>>> Right, and that's what Philipp tries to address, the requirement to wait for an
>>>> RCU grace period is perfectly fine if it is only about freeing memory, but it
>>>> can become painful if the fence private data contains data also needs to be
>>>> destructed in some way.
>>>
>>> Yeah that makes sense.
>>>
>>>> IOW, if a driver signals a fence, it is lifecycle-wise reasonable to destruct
>>>> the private data that is no longer needed (remaining users only deal with struct
>>>> dma_fence) and having to wait for a full grace period adds sublety and
>>>> complication that can be avoided with the proposed approach.
>>>
>>> Yeah, I've run into that when I tried to make the amdgpu fences independent as well.
>>>> That said, I'd like to ask the opposite question: What are the concerns with the
>>>> proposed approach over (pure) RCU?
>>>
>>> Well a) locking inversions and b) performance.
>>>
>>> For example the reason why we have the dma_fence_is_signaled() and
>>> dma_fence_is_signaled_locked() variants is because there is a measurable
>>> difference in some specific use cases for not grabbing the locks.
>> 
>> I checked for this as well, but couldn't find a case where
>> dma_fence_is_signaled() is used in a way where it would be performance critical
>> to avoid the lock in any way.
>> 
>> Note that the lock is only bypassed when the fence is signaled already (this
>> would be preserved) and if signaled() returns false, i.e. dma_fence_signal()
>> will take the lock anyways.
>> 
>>> I personally find those micro-optimizations rather questionable, but the
>>> community agreement is that we should have them.
>> 
>> I agree, it is rather questionable. So, I wouldn't make this the deciding factor
>> unless someone can present a valid case where it actually matters.
>> 
>>> So my take would rather be that the dma_fence_is_signaled_locked() variant
>>> goes away and we consistently call the ops pointers without holding the
>>> dma_fence lock and the driver implementations can then optionally take it if
>>> necessary.
>> 
>> How did you get to this conclusion considering that you run into what I
>> mentioned above as well and the fact that we seem to agree that the performance
>> concern is rather questionable?
>
> Quite simple, it's the cleaner approach.

I would maybe agree iff the RCU read side critical section wouldn't be needed
and we wouldn't need to deal with the consequences of having to defer
everything.

And so far it seems to me that there isn't really any other reason that the
performance concern we both don't buy into.

> Calling callbacks with locks held is rather questionable even putting the
> performance issue aside.

In general, I don't think that more flexibility for drivers is automatically
always superior.

Also, before we keep calling it a performance issue, I'd really love to know
where dma_fence_is_signaled() is called in a case where it returns false and the
spinlock causes such an overhead that it actually matters.

(As mentioned above, none of the cases where it returns true would change.)

> In detail calling the callbacks without holding locks allows all
> implementations who need it to explicitly take locks in the order they want.

I don't think this is true in this case.

  1) The existence of dma_fence_is_signaled_locked() already mandates that all
     such callbacks must work properly if called with the fence lock held.

  2) The RCU read side critical section already mandates that driver must not
     sleep within the callback.

> If you call it with the lock held you enforce the fence lock the be the
> outermost lock.

That's practically already the case, due to dma_fence_is_signaled_locked().

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2026-06-08 19:25 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-08 14:24 [RFC PATCH] dma-fence: Fix races of fence callbacks versus destructors by locking Philipp Stanner
2026-06-08 14:43 ` sashiko-bot
2026-06-08 15:01 ` Boris Brezillon
2026-06-08 15:17   ` Philipp Stanner
2026-06-08 15:23     ` Danilo Krummrich
2026-06-08 15:30       ` Boris Brezillon
2026-06-08 15:30       ` Philipp Stanner
2026-06-08 16:16         ` Boris Brezillon
2026-06-08 15:07 ` Tvrtko Ursulin
2026-06-08 15:15   ` Philipp Stanner
2026-06-08 15:35 ` Christian König
2026-06-08 15:41   ` Philipp Stanner
2026-06-08 17:34     ` Christian König
2026-06-08 17:59       ` Danilo Krummrich
2026-06-08 18:32         ` Christian König
2026-06-08 18:39           ` Danilo Krummrich
2026-06-08 18:47             ` Christian König
2026-06-08 19:25               ` Danilo Krummrich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox