Linux Media Controller development
* [PATCH] drm/prime: Fix unsupervised rb_tree corruption in drm_prime_remove_buf_handle
@ 2026-05-28  8:29 w15303746062
  2026-05-28  9:14 ` Christian König
  2026-05-31  7:54 ` [PATCH] drm/prime: Fix unsupervised rb_tree corruption in drm_prime_remove_buf_handle kernel test robot
  0 siblings, 2 replies; 9+ messages in thread
From: w15303746062 @ 2026-05-28  8:29 UTC (permalink / raw)
  To: maarten.lankhorst, mripard, tzimmermann, airlied, simona,
	sumit.semwal, christian.koenig
  Cc: jeffy.chen, dri-devel, linux-kernel, linux-media, linaro-mm-sig,
	Mingyu Wang, stable

From: Mingyu Wang <25181214217@stu.xidian.edu.cn>

Syzkaller fuzzer triggered a kernel panic via a WARNING in
drm_prime_destroy_file_private() due to a non-empty prime rb_tree.

The root cause is a complete lack of synchronization in the teardown
path. While the import path (drm_gem_prime_fd_to_handle) holds the
&file_priv->prime.lock during lookup and insertion, the deletion path
(drm_prime_remove_buf_handle) traverses and mutates both the 'handles'
and 'dmabufs' rb_trees without acquiring any mutex.

When multiple threads concurrently close GEM handles or interleave import
and close operations, the pointers and balance states of the rb_tree
nodes get corrupted. As a result, certain members are erased from one
tree but remain orphaned in the other. Upon process exit, the final
sanity check triggers the WARNING.

[    448.919314][T19739] ------------[ cut here ]------------
[    448.945387][T19739] WARNING: CPU: 0 PID: 19739 at drivers/gpu/drm/drm_prime.c:223 drm_prime_destroy_file_private+0x43/0x60
...
[    449.056535][T19739] Call Trace:
[    449.056544][T19739]  <TASK>
[    449.056553][T19739]  drm_file_free.part.0+0x805/0xcf0
[    449.056652][T19739]  drm_close_helper.isra.0+0x183/0x1f0
[    449.056677][T19739]  drm_release+0x1ab/0x360
[    449.056719][T19739]  __fput+0x402/0xb50
[    449.056783][T19739]  task_work_run+0x16b/0x260
[    449.056883][T19739]  exit_to_user_mode_loop+0xf9/0x130
[    449.056931][T19739]  do_syscall_64+0x424/0xfa0
[    449.056977][T19739]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[    449.057268][T19739]  </TASK>
[    449.057295][T19739] Kernel panic - not syncing: kernel: panic_on_warn set ...

Fix this by acquiring the prime_fpriv->lock mutex around the rb_tree
lookup and erasure logic. To respect the locking rules and avoid potential
deadlocks with driver-specific memory cleanups, assign the target node to
a temporary pointer and defer the dma_buf_put() and kfree() operations
until after the mutex is safely dropped.

Fixes: ea2aa97ca37a ("drm/gem: Fix GEM handle release errors")
Cc: stable@vger.kernel.org
Signed-off-by: Mingyu Wang <25181214217@stu.xidian.edu.cn>
---
 drivers/gpu/drm/drm_prime.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 9b44c78cd77f..26319c638e0f 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -190,6 +190,9 @@ void drm_prime_remove_buf_handle(struct drm_prime_file_private *prime_fpriv,
 				 uint32_t handle)
 {
 	struct rb_node *rb;
+	struct drm_prime_member *found = NULL;
+
+	mutex_lock(&prime_fpriv->lock);
 
 	rb = prime_fpriv->handles.rb_node;
 	while (rb) {
@@ -200,8 +203,7 @@ void drm_prime_remove_buf_handle(struct drm_prime_file_private *prime_fpriv,
 			rb_erase(&member->handle_rb, &prime_fpriv->handles);
 			rb_erase(&member->dmabuf_rb, &prime_fpriv->dmabufs);
 
-			dma_buf_put(member->dma_buf);
-			kfree(member);
+			found = member;
 			break;
 		} else if (member->handle < handle) {
 			rb = rb->rb_right;
@@ -209,6 +211,13 @@ void drm_prime_remove_buf_handle(struct drm_prime_file_private *prime_fpriv,
 			rb = rb->rb_left;
 		}
 	}
+	mutex_unlock(&prime_fpriv->lock);
+
+	/* Defer resource release outside the mutex to prevent deadlocks */
+	if (found) {
+		dma_buf_put(found->dma_buf);
+		kfree(found);
+	}
 }
 
 void drm_prime_init_file_private(struct drm_prime_file_private *prime_fpriv)
-- 
2.34.1



* Re: [PATCH] drm/prime: Fix unsupervised rb_tree corruption in drm_prime_remove_buf_handle
  2026-05-28  8:29 [PATCH] drm/prime: Fix unsupervised rb_tree corruption in drm_prime_remove_buf_handle w15303746062
@ 2026-05-28  9:14 ` Christian König
  2026-05-28 12:40   ` w15303746062
  2026-05-28 13:29   ` [PATCH v2] drm/prime: fix dangling dmabuf entries after handle release w15303746062
  2026-05-31  7:54 ` [PATCH] drm/prime: Fix unsupervised rb_tree corruption in drm_prime_remove_buf_handle kernel test robot
  1 sibling, 2 replies; 9+ messages in thread
From: Christian König @ 2026-05-28  9:14 UTC (permalink / raw)
  To: w15303746062, maarten.lankhorst, mripard, tzimmermann, airlied,
	simona, sumit.semwal
  Cc: jeffy.chen, dri-devel, linux-kernel, linux-media, linaro-mm-sig,
	Mingyu Wang, stable

On 5/28/26 10:29, w15303746062@163.com wrote:
> From: Mingyu Wang <25181214217@stu.xidian.edu.cn>
> 
> Syzkaller fuzzer triggered a kernel panic via a WARNING in
> drm_prime_destroy_file_private() due to a non-empty prime rb_tree.
> 
> The root cause is a complete lack of synchronization in the teardown
> path. While the import path (drm_gem_prime_fd_to_handle) holds the
> &file_priv->prime.lock during lookup and insertion, the deletion path
> (drm_prime_remove_buf_handle) traverses and mutates both the 'handles'
> and 'dmabufs' rb_trees without acquiring any mutex.

That's simply incorrect: drm_prime_remove_buf_handle() is called with the lock already held.

See drm_gem_object_release_handle():

        mutex_lock(&file_priv->prime.lock);

        drm_prime_remove_buf_handle(&file_priv->prime, id);

        mutex_unlock(&file_priv->prime.lock);

So the patch you propose here is just nonsense.

What tree are you working on? Could it be that this is something specific to a certain version?

Regards,
Christian.

> 
> When multiple threads concurrently close GEM handles or interleave import
> and close operations, the pointers and balance states of the rb_tree
> nodes get corrupted. As a result, certain members are erased from one
> tree but remain orphaned in the other. Upon process exit, the final
> sanity check triggers the WARNING.
> 
> [    448.919314][T19739] ------------[ cut here ]------------
> [    448.945387][T19739] WARNING: CPU: 0 PID: 19739 at drivers/gpu/drm/drm_prime.c:223 drm_prime_destroy_file_private+0x43/0x60
> ...
> [    449.056535][T19739] Call Trace:
> [    449.056544][T19739]  <TASK>
> [    449.056553][T19739]  drm_file_free.part.0+0x805/0xcf0
> [    449.056652][T19739]  drm_close_helper.isra.0+0x183/0x1f0
> [    449.056677][T19739]  drm_release+0x1ab/0x360
> [    449.056719][T19739]  __fput+0x402/0xb50
> [    449.056783][T19739]  task_work_run+0x16b/0x260
> [    449.056883][T19739]  exit_to_user_mode_loop+0xf9/0x130
> [    449.056931][T19739]  do_syscall_64+0x424/0xfa0
> [    449.056977][T19739]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> [    449.057268][T19739]  </TASK>
> [    449.057295][T19739] Kernel panic - not syncing: kernel: panic_on_warn set ...
> 
> Fix this by acquiring the prime_fpriv->lock mutex around the rb_tree
> lookup and erasure logic. To respect the locking rules and avoid potential
> deadlocks with driver-specific memory cleanups, assign the target node to
> a temporary pointer and defer the dma_buf_put() and kfree() operations
> until after the mutex is safely dropped.
> 
> Fixes: ea2aa97ca37a ("drm/gem: Fix GEM handle release errors")
> Cc: stable@vger.kernel.org
> Signed-off-by: Mingyu Wang <25181214217@stu.xidian.edu.cn>
> ---
>  drivers/gpu/drm/drm_prime.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
> index 9b44c78cd77f..26319c638e0f 100644
> --- a/drivers/gpu/drm/drm_prime.c
> +++ b/drivers/gpu/drm/drm_prime.c
> @@ -190,6 +190,9 @@ void drm_prime_remove_buf_handle(struct drm_prime_file_private *prime_fpriv,
>                                  uint32_t handle)
>  {
>         struct rb_node *rb;
> +       struct drm_prime_member *found = NULL;
> +
> +       mutex_lock(&prime_fpriv->lock);
> 
>         rb = prime_fpriv->handles.rb_node;
>         while (rb) {
> @@ -200,8 +203,7 @@ void drm_prime_remove_buf_handle(struct drm_prime_file_private *prime_fpriv,
>                         rb_erase(&member->handle_rb, &prime_fpriv->handles);
>                         rb_erase(&member->dmabuf_rb, &prime_fpriv->dmabufs);
> 
> -                       dma_buf_put(member->dma_buf);
> -                       kfree(member);
> +                       found = member;
>                         break;
>                 } else if (member->handle < handle) {
>                         rb = rb->rb_right;
> @@ -209,6 +211,13 @@ void drm_prime_remove_buf_handle(struct drm_prime_file_private *prime_fpriv,
>                         rb = rb->rb_left;
>                 }
>         }
> +       mutex_unlock(&prime_fpriv->lock);
> +
> +       /* Defer resource release outside the mutex to prevent deadlocks */
> +       if (found) {
> +               dma_buf_put(found->dma_buf);
> +               kfree(found);
> +       }
>  }
> 
>  void drm_prime_init_file_private(struct drm_prime_file_private *prime_fpriv)
> --
> 2.34.1
> 



* Re:Re: [PATCH] drm/prime: Fix unsupervised rb_tree corruption in drm_prime_remove_buf_handle
  2026-05-28  9:14 ` Christian König
@ 2026-05-28 12:40   ` w15303746062
  2026-05-28 13:29   ` [PATCH v2] drm/prime: fix dangling dmabuf entries after handle release w15303746062
  1 sibling, 0 replies; 9+ messages in thread
From: w15303746062 @ 2026-05-28 12:40 UTC (permalink / raw)
  To: Christian König
  Cc: maarten.lankhorst, mripard, tzimmermann, airlied, simona,
	sumit.semwal, jeffy.chen, dri-devel, linux-kernel, linux-media,
	linaro-mm-sig, Mingyu Wang, stable

Hi Christian,

Thank you for the review and for catching this. You are absolutely right, and my analysis was flawed. Adding the lock inside `drm_prime_remove_buf_handle` would indeed cause a recursive deadlock.
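
The recursion can be reproduced in userspace. What follows is a minimal sketch, not kernel code: it assumes POSIX threads, uses a hypothetical struct prime_table in place of struct drm_prime_file_private, and creates the mutex as PTHREAD_MUTEX_ERRORCHECK so that relocking by the owner returns EDEADLK instead of hanging. Kernel mutexes are not recursive either, so there the second mutex_lock() would simply never return.

```c
#define _POSIX_C_SOURCE 200809L	/* for PTHREAD_MUTEX_ERRORCHECK under strict -std= */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct drm_prime_file_private. */
struct prime_table {
	pthread_mutex_t lock;
	int entries;
};

/* Create the lock as error-checking: a relock by the owner fails with
 * EDEADLK rather than blocking forever.
 */
static int prime_table_init(struct prime_table *t)
{
	pthread_mutexattr_t attr;
	int ret;

	assert(t != NULL);
	ret = pthread_mutexattr_init(&attr);
	if (ret)
		return ret;
	ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (!ret)
		ret = pthread_mutex_init(&t->lock, &attr);
	pthread_mutexattr_destroy(&attr);
	if (!ret)
		t->entries = 0;
	return ret;
}

/* Mirrors the v1 change: the callee takes the lock itself. */
static int remove_entry_locking(struct prime_table *t)
{
	int ret = pthread_mutex_lock(&t->lock);

	if (ret)
		return ret;	/* EDEADLK if this thread already owns the lock */
	if (t->entries > 0)
		t->entries--;
	pthread_mutex_unlock(&t->lock);
	return 0;
}

/* Mirrors the caller, which locks before calling the callee. */
static int release_handle(struct prime_table *t)
{
	int ret = pthread_mutex_lock(&t->lock);

	if (ret)
		return ret;
	ret = remove_entry_locking(t);
	pthread_mutex_unlock(&t->lock);
	return ret;
}
```

With one entry in the table, release_handle() returns EDEADLK and leaves the entry in place, while remove_entry_locking() called without the lock held succeeds.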

The syzkaller crash was originally triggered on the v6.18 kernel. When investigating, I checked the latest mainline source for `drm_prime_remove_buf_handle` itself. Since I didn't see any synchronization changes within that specific function, I incorrectly assumed the concurrency issue was still completely unhandled, failing to notice that the upstream tree properly holds the lock in the caller (`drm_gem_object_release_handle`).

I will dive deeper into the code to see if there is still any hidden race condition under the current locking scheme, or if this is strictly a legacy issue that might only require a stable-tree backport. If a fix is still warranted, I will send a v2 patch.

Thanks again for your time and guidance!

Best regards,
Mingyu Wang


* [PATCH v2] drm/prime: fix dangling dmabuf entries after handle release
  2026-05-28  9:14 ` Christian König
  2026-05-28 12:40   ` w15303746062
@ 2026-05-28 13:29   ` w15303746062
  2026-05-28 13:32     ` Christian König
  1 sibling, 1 reply; 9+ messages in thread
From: w15303746062 @ 2026-05-28 13:29 UTC (permalink / raw)
  To: maarten.lankhorst, mripard, tzimmermann, airlied, simona,
	sumit.semwal, christian.koenig
  Cc: jeffy.chen, dri-devel, linux-kernel, linux-media, linaro-mm-sig,
	Mingyu Wang

From: Mingyu Wang <25181214217@stu.xidian.edu.cn>

When a GEM handle already exists in the drm_prime_file_private, repeated
calls to DRM_IOCTL_PRIME_HANDLE_TO_FD can cause drm_prime_add_buf_handle()
to insert multiple entries with the same handle into the handles rb_tree.
Because the insertion walk moves left on equality, duplicate keys are
structurally accepted by the tree.

Later, when the handle is released via drm_gem_release() ->
drm_gem_object_release_handle() -> drm_prime_remove_buf_handle(), the
latter iterates the handles tree, removes the first matching node, and
breaks out of the loop. Any remaining duplicate nodes that share the
same handle are left orphaned in the dmabufs tree - they are no longer
reachable through the handles tree and are never freed.
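
The mechanics described above can be sketched in userspace. This is only an illustration under stated assumptions: it uses a plain unbalanced binary search tree instead of the kernel rb_tree, a single tree instead of the handles/dmabufs pair, and hypothetical names (struct member, tree_insert, tree_count, tree_remove_first). It shows what an unchecked insert and a first-match removal do to duplicate keys; it does not show that duplicates can actually be inserted in the kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for struct drm_prime_member. */
struct member {
	uint32_t handle;
	struct member *left, *right;
};

/* Insert without checking for an existing key; equal keys go left. */
static int tree_insert(struct member **link, uint32_t handle)
{
	struct member *m = calloc(1, sizeof(*m));

	if (!m)
		return -1;
	m->handle = handle;
	while (*link)
		link = handle > (*link)->handle ? &(*link)->right
						: &(*link)->left;
	*link = m;
	return 0;
}

/* Count the nodes whose key equals handle. */
static unsigned int tree_count(const struct member *n, uint32_t handle)
{
	if (!n)
		return 0;
	return (n->handle == handle) + tree_count(n->left, handle) +
	       tree_count(n->right, handle);
}

/* Unlink and free the first matching node, then stop (the "break"). */
static int tree_remove_first(struct member **link, uint32_t handle)
{
	while (*link) {
		struct member *n = *link;

		if (n->handle != handle) {
			link = handle > n->handle ? &n->right : &n->left;
			continue;
		}
		if (!n->left) {
			*link = n->right;
		} else if (!n->right) {
			*link = n->left;
		} else {
			/* Two children: splice in the in-order successor. */
			struct member **succ = &n->right;
			struct member *s;

			while ((*succ)->left)
				succ = &(*succ)->left;
			s = *succ;
			*succ = s->right;
			s->left = n->left;
			s->right = n->right;
			*link = s;
		}
		free(n);
		return 1;
	}
	return 0;
}
```

Inserting handle 5 twice yields two nodes with that key; a single tree_remove_first() call frees one of them and leaves the other behind.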

When the drm file is finally closed, drm_prime_destroy_file_private()
triggers:

	WARN_ON(!RB_EMPTY_ROOT(&prime_fpriv->dmabufs));

because the dmabufs tree is still non-empty. With panic_on_warn set,
this becomes a kernel panic:

	------------[ cut here ]------------
	WARNING: CPU: 0 PID: 19739 at drivers/gpu/drm/drm_prime.c:223 drm_prime_destroy_file_private+0x43/0x60
	...
	Kernel panic - not syncing: kernel: panic_on_warn set ...

Fix this by restarting the lookup from the root of the handles tree
after each successful removal, so that all duplicate nodes for the given
handle are erased. The caller (drm_gem_object_release_handle) already
holds prime_fpriv->lock, so this does not change the locking strategy.

Signed-off-by: Mingyu Wang <25181214217@stu.xidian.edu.cn>
---
Changes in v2:
 - Drop the unnecessary mutex_lock addition, as the caller (drm_gem_object_release_handle) already holds the lock.
 - Rewrite the commit message to accurately reflect the root cause (duplicate handle insertions) rather than an assumed lack of synchronization.
 - Restart the rb_tree lookup from the root instead of breaking the loop to ensure all orphaned duplicate nodes are thoroughly removed.

 drivers/gpu/drm/drm_prime.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 9b44c78cd77f..dc28df1c6698 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -202,7 +202,10 @@ void drm_prime_remove_buf_handle(struct drm_prime_file_private *prime_fpriv,
 
 			dma_buf_put(member->dma_buf);
 			kfree(member);
-			break;
+			/* Duplicate handles may exist; restart search from root
+			 * to guarantee removal of all matching entries.
+			 */
+			rb = prime_fpriv->handles.rb_node;
 		} else if (member->handle < handle) {
 			rb = rb->rb_right;
 		} else {
-- 
2.34.1



* Re: [PATCH v2] drm/prime: fix dangling dmabuf entries after handle release
  2026-05-28 13:29   ` [PATCH v2] drm/prime: fix dangling dmabuf entries after handle release w15303746062
@ 2026-05-28 13:32     ` Christian König
  2026-05-28 13:49       ` w15303746062
  0 siblings, 1 reply; 9+ messages in thread
From: Christian König @ 2026-05-28 13:32 UTC (permalink / raw)
  To: w15303746062, maarten.lankhorst, mripard, tzimmermann, airlied,
	simona, sumit.semwal
  Cc: jeffy.chen, dri-devel, linux-kernel, linux-media, linaro-mm-sig,
	Mingyu Wang

On 5/28/26 15:29, w15303746062@163.com wrote:
> From: Mingyu Wang <25181214217@stu.xidian.edu.cn>
> 
> When a GEM handle already exists in the drm_prime_file_private, repeated
> calls to DRM_IOCTL_PRIME_HANDLE_TO_FD can cause drm_prime_add_buf_handle()
> to insert multiple entries with the same handle into the handles rb_tree.
> Because the insertion walk moves left on equality, duplicate keys are
> structurally accepted by the tree.

That should never happen and would be a major bug.

All callers should check if a handle exists before calling drm_prime_add_buf_handle().

How do you see that a handle is added twice?

Regards,
Christian.

> 
> Later, when the handle is released via drm_gem_release() ->
> drm_gem_object_release_handle() -> drm_prime_remove_buf_handle(), the
> latter iterates the handles tree, removes the first matching node, and
> breaks out of the loop. Any remaining duplicate nodes that share the
> same handle are left orphaned in the dmabufs tree - they are no longer
> reachable through the handles tree and are never freed.
> 
> When the drm file is finally closed, drm_prime_destroy_file_private()
> triggers:
> 
>         WARN_ON(!RB_EMPTY_ROOT(&prime_fpriv->dmabufs));
> 
> because the dmabufs tree is still non-empty. With panic_on_warn set,
> this becomes a kernel panic:
> 
>         ------------[ cut here ]------------
>         WARNING: CPU: 0 PID: 19739 at drivers/gpu/drm/drm_prime.c:223 drm_prime_destroy_file_private+0x43/0x60
>         ...
>         Kernel panic - not syncing: kernel: panic_on_warn set ...
> 
> Fix this by restarting the lookup from the root of the handles tree
> after each successful removal, so that all duplicate nodes for the given
> handle are erased. The caller (drm_gem_object_release_handle) already
> holds prime_fpriv->lock, so this does not change the locking strategy.
> 
> Signed-off-by: Mingyu Wang <25181214217@stu.xidian.edu.cn>
> ---
> Changes in v2:
>  - Drop the unnecessary mutex_lock addition, as the caller (drm_gem_object_release_handle) already holds the lock.
>  - Rewrite the commit message to accurately reflect the root cause (duplicate handle insertions) rather than an assumed lack of synchronization.
>  - Restart the rb_tree lookup from the root instead of breaking the loop to ensure all orphaned duplicate nodes are thoroughly removed.
> 
>  drivers/gpu/drm/drm_prime.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
> index 9b44c78cd77f..dc28df1c6698 100644
> --- a/drivers/gpu/drm/drm_prime.c
> +++ b/drivers/gpu/drm/drm_prime.c
> @@ -202,7 +202,10 @@ void drm_prime_remove_buf_handle(struct drm_prime_file_private *prime_fpriv,
> 
>                         dma_buf_put(member->dma_buf);
>                         kfree(member);
> -                       break;
> +                       /* Duplicate handles may exist; restart search from root
> +                        * to guarantee removal of all matching entries.
> +                        */
> +                       rb = prime_fpriv->handles.rb_node;
>                 } else if (member->handle < handle) {
>                         rb = rb->rb_right;
>                 } else {
> --
> 2.34.1
> 



* Re:Re: [PATCH v2] drm/prime: fix dangling dmabuf entries after handle release
  2026-05-28 13:32     ` Christian König
@ 2026-05-28 13:49       ` w15303746062
  2026-05-29  6:45         ` Christian König
  0 siblings, 1 reply; 9+ messages in thread
From: w15303746062 @ 2026-05-28 13:49 UTC (permalink / raw)
  To: Christian König
  Cc: maarten.lankhorst, mripard, tzimmermann, airlied, simona,
	sumit.semwal, jeffy.chen, dri-devel, linux-kernel, linux-media,
	linaro-mm-sig, Mingyu Wang

Hi Christian,

Thank you for insisting on this. I've now gone through all callers
of drm_prime_add_buf_handle() in drm_prime.c.

You are absolutely right: both drm_gem_prime_fd_to_handle() and
drm_gem_prime_handle_to_dmabuf() perform the lookup under
prime_fpriv->lock before adding, so a duplicate handle should indeed
never be inserted through those paths.
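
The lookup-before-add pattern can be sketched in userspace as follows. This is a simplified illustration, not the drm_prime.c code: it assumes POSIX threads, replaces the two rb_trees with a small fixed-size array, and uses hypothetical names (struct prime_table, table_lookup_or_add, MAX_MEMBERS). The point it shows is that lookup and insertion happen inside one critical section, so a second request for the same buffer returns the existing handle instead of adding a second entry.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_MEMBERS 16	/* arbitrary capacity for the sketch */

/* Hypothetical stand-in for struct drm_prime_file_private. */
struct prime_table {
	pthread_mutex_t lock;
	size_t count;
	struct {
		const void *dma_buf;
		uint32_t handle;
	} members[MAX_MEMBERS];
};

/*
 * Return the handle already recorded for dma_buf, or record new_handle
 * if there is none. Both steps run under the same lock, so no other
 * thread can insert the same buffer between the lookup and the add.
 */
static int table_lookup_or_add(struct prime_table *t, const void *dma_buf,
			       uint32_t new_handle, uint32_t *handle_out)
{
	int ret = 0;
	size_t i;

	assert(t != NULL && handle_out != NULL);
	pthread_mutex_lock(&t->lock);

	for (i = 0; i < t->count; i++) {
		if (t->members[i].dma_buf == dma_buf) {
			*handle_out = t->members[i].handle;
			goto out;	/* already present: no second entry */
		}
	}

	if (t->count == MAX_MEMBERS) {
		ret = -1;	/* table full */
		goto out;
	}

	t->members[t->count].dma_buf = dma_buf;
	t->members[t->count].handle = new_handle;
	t->count++;
	*handle_out = new_handle;
out:
	pthread_mutex_unlock(&t->lock);
	return ret;
}
```

Calling table_lookup_or_add() twice for the same buffer with different candidate handles returns the first handle both times and leaves a single entry in the table.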

That said, the syzkaller report clearly shows that the dmabufs tree
is not empty when drm_prime_destroy_file_private() runs, which means
some entry wasn't removed. Given that the normal add/remove paths
appear correct, the trigger might be something more subtle — perhaps
a driver-specific callback that bypasses the generic helpers, or an
error path that leaves an orphan in the dmabufs tree. I haven't been
able to identify the exact trigger yet.

The proposed change to drm_prime_remove_buf_handle() (restart search
instead of break) is intended as a small robustness improvement, not
a fix for a confirmed race. In the normal case it will still execute
only once, but if the trees ever become inconsistent for any reason,
it will clean up all entries for the given handle and prevent the
WARNING.

Would you be okay with such a defensive approach, or would you prefer
that we first track down the precise trigger (e.g. with additional
WARNs or tracing)?

Thanks,
Mingyu


* Re: [PATCH v2] drm/prime: fix dangling dmabuf entries after handle release
  2026-05-28 13:49       ` w15303746062
@ 2026-05-29  6:45         ` Christian König
  2026-05-29 11:45           ` w15303746062
  0 siblings, 1 reply; 9+ messages in thread
From: Christian König @ 2026-05-29  6:45 UTC (permalink / raw)
  To: w15303746062
  Cc: maarten.lankhorst, mripard, tzimmermann, airlied, simona,
	sumit.semwal, jeffy.chen, dri-devel, linux-kernel, linux-media,
	linaro-mm-sig, Mingyu Wang

Hi Mingyu,

On 5/28/26 15:49, w15303746062 wrote:
> Hi Christian,
> 
> Thank you for insisting on this. I've now gone through all callers
> of drm_prime_add_buf_handle() in drm_prime.c.
> 
> You are absolutely right: both drm_gem_prime_fd_to_handle() and
> drm_gem_prime_handle_to_dmabuf() perform the lookup under
> prime_fpriv->lock before adding, so a duplicate handle should indeed
> never be inserted through those paths.
> 
> That said, the syzkaller report clearly shows that the dmabufs tree
> is not empty when drm_prime_destroy_file_private() runs, which means
> some entry wasn't removed. Given that the normal add/remove paths
> appear correct, the trigger might be something more subtle — perhaps
> a driver-specific callback that bypasses the generic helpers, or an
> error path that leaves an orphan in the dmabufs tree. I haven't been
> able to identify the exact trigger yet.
> 
> The proposed change to drm_prime_remove_buf_handle() (restart search
> instead of break) is intended as a small robustness improvement, not
> a fix for a confirmed race. In the normal case it will still execute
> only once, but if the trees ever become inconsistent for any reason,
> it will clean up all entries for the given handle and prevent the
> WARNING.
> 
> Would you be okay with such a defensive approach, or would you prefer
> that we first track down the precise trigger (e.g. with additional
> WARNs or tracing)?

I don't think so. As far as I can see this is not a robustness improvement but just papering over an issue.

Leaking memory is usually only a very minor problem; things like use-after-free or random memory corruption are much worse.

And such things are exactly what starts to happen when you start papering over issues.

So I would say find the root cause of what is going on here, you have certainly stumbled over something, and then we can look into how to fix that.

But just sending out random patches where a bit of simple code reading can prove them incorrect is not really helpful.

Regards,
Christian.

> 
> Thanks,
> Mingyu



* Re:Re: [PATCH v2] drm/prime: fix dangling dmabuf entries after handle release
  2026-05-29  6:45         ` Christian König
@ 2026-05-29 11:45           ` w15303746062
  0 siblings, 0 replies; 9+ messages in thread
From: w15303746062 @ 2026-05-29 11:45 UTC (permalink / raw)
  To: Christian König
  Cc: maarten.lankhorst, mripard, tzimmermann, airlied, simona,
	sumit.semwal, jeffy.chen, dri-devel, linux-kernel, linux-media,
	linaro-mm-sig, Mingyu Wang

Hi Christian,

Thank you for your guidance and patience throughout this discussion.

After further investigation, I realize that identifying the precise
root cause requires a deeper understanding of the DRM subsystem and
access to the specific syzkaller reproducer, which I currently lack.

To avoid wasting your time with incomplete patches, I'll step back
from this issue for now and continue learning the codebase. If I
manage to reproduce the problem locally or find more concrete
evidence, I'll follow up with a proper analysis.

Thank you again for the review and the valuable lessons.

Regards,
Mingyu


* Re: [PATCH] drm/prime: Fix unsupervised rb_tree corruption in drm_prime_remove_buf_handle
  2026-05-28  8:29 [PATCH] drm/prime: Fix unsupervised rb_tree corruption in drm_prime_remove_buf_handle w15303746062
  2026-05-28  9:14 ` Christian König
@ 2026-05-31  7:54 ` kernel test robot
  1 sibling, 0 replies; 9+ messages in thread
From: kernel test robot @ 2026-05-31  7:54 UTC (permalink / raw)
  To: w15303746062
  Cc: oe-lkp, lkp, dri-devel, maarten.lankhorst, mripard, tzimmermann,
	airlied, simona, sumit.semwal, christian.koenig, jeffy.chen,
	linux-kernel, linux-media, linaro-mm-sig, Mingyu Wang, stable,
	oliver.sang



Hello,

kernel test robot noticed "WARNING:possible_recursive_locking_detected" on:

commit: 60a023d26c97753be2beee6062d71ce9416725b1 ("[PATCH] drm/prime: Fix unsupervised rb_tree corruption in drm_prime_remove_buf_handle")
url: https://github.com/intel-lab-lkp/linux/commits/w15303746062-163-com/drm-prime-Fix-unsupervised-rb_tree-corruption-in-drm_prime_remove_buf_handle/20260528-163356
base: https://gitlab.freedesktop.org/drm/misc/kernel.git drm-misc-next
patch link: https://lore.kernel.org/all/20260528082912.1051262-1-w15303746062@163.com/
patch subject: [PATCH] drm/prime: Fix unsupervised rb_tree corruption in drm_prime_remove_buf_handle

in testcase: boot

config: x86_64-rhel-9.4-bpf
compiler: gcc-14
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 32G

(please refer to attached dmesg/kmsg for entire log/backtrace)



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202605310941.ddd52610-lkp@intel.com



[   86.630882][  T218] WARNING: possible recursive locking detected
[   86.631312][  T218] 7.1.0-rc2+ #1 Not tainted
[   86.631634][  T218] --------------------------------------------
[   86.632065][  T218] (udev-worker)/218 is trying to acquire lock:
[   86.632529][  T218] ffff8881ab017388 (&prime_fpriv->lock){+.+.}-{4:4}, at: drm_prime_remove_buf_handle (gpu/drm/drm_prime.c:195) drm
[   86.633434][  T218]
[   86.633434][  T218] but task is already holding lock:
[   86.633951][  T218] ffff8881ab017388 (&prime_fpriv->lock){+.+.}-{4:4}, at: drm_gem_object_release_handle (gpu/drm/drm_gem.c:377) drm
[   86.638664][  T218]
[   86.638664][  T218] other info that might help us debug this:
[   86.639229][  T218]  Possible unsafe locking scenario:
[   86.639229][  T218]
[   86.639769][  T218]        CPU0
[   86.640006][  T218]        ----
[   86.640243][  T218]   lock(&prime_fpriv->lock);
[   86.640581][  T218]   lock(&prime_fpriv->lock);
[   86.640916][  T218]
[   86.640916][  T218]  *** DEADLOCK ***
[   86.640916][  T218]
[   86.641480][  T218]  May be due to missing lock nesting notation
[   86.641480][  T218]
[   86.642060][  T218] 4 locks held by (udev-worker)/218:
[   86.642446][  T218]  #0: ffff8881022921f8 (&dev->mutex){....}-{4:4}, at: __driver_attach (linux/device.h:1040 base/dd.c:1174 base/dd.c:1294)
[   86.643133][  T218]  #1: ffff8881ab020260 (&dev->clientlist_mutex){+.+.}-{4:4}, at: drm_client_register (gpu/drm/drm_client.c:129) drm
[   86.644051][  T218]  #2: ffff8882a2b58a70 (&helper->lock){+.+.}-{4:4}, at: drm_fb_helper_initial_config (gpu/drm/drm_fb_helper.c:1717 gpu/drm/drm_fb_helper.c:1710) drm_kms_helper
[   86.644930][  T218]  #3: ffff8881ab017388 (&prime_fpriv->lock){+.+.}-{4:4}, at: drm_gem_object_release_handle (gpu/drm/drm_gem.c:377) drm
[   86.645880][  T218]
[   86.645880][  T218] stack backtrace:
[   86.646300][  T218] CPU: 1 UID: 0 PID: 218 Comm: (udev-worker) Not tainted 7.1.0-rc2+ #1 PREEMPT(full)
[   86.646307][  T218] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[   86.646311][  T218] Call Trace:
[   86.646316][  T218]  <TASK>
[   86.646321][  T218]  dump_stack_lvl (dump_stack.c:94 dump_stack.c:120)
[   86.646333][  T218]  print_deadlock_bug.cold (locking/lockdep.c:3041)
[   86.646341][  T218]  validate_chain (locking/lockdep.c:3093 locking/lockdep.c:3895)
[   86.646350][  T218]  __lock_acquire (locking/lockdep.c:5237)
[   86.646366][  T218]  lock_acquire (trace/events/lock.h:24 (discriminator 15) trace/events/lock.h:24 (discriminator 15) locking/lockdep.c:5831 (discriminator 15))
[   86.646372][  T218]  ? drm_prime_remove_buf_handle (gpu/drm/drm_prime.c:195) drm
[   86.646516][  T218]  ? rcu_is_watching (x86/include/asm/atomic.h:23 linux/atomic/atomic-arch-fallback.h:457 linux/context_tracking.h:128 rcu/tree.c:752)
[   86.646521][  T218]  ? drm_prime_remove_buf_handle (gpu/drm/drm_prime.c:195) drm
[   86.646659][  T218]  ? lock_acquire (trace/events/lock.h:24 (discriminator 21) locking/lockdep.c:5831 (discriminator 21))
[   86.646665][  T218]  __mutex_lock (locking/mutex.c:646 locking/mutex.c:820)
[   86.646675][  T218]  ? drm_prime_remove_buf_handle (gpu/drm/drm_prime.c:195) drm
[   86.646814][  T218]  ? drm_prime_remove_buf_handle (gpu/drm/drm_prime.c:195) drm
[   86.646952][  T218]  ? __mutex_lock (locking/mutex.c:656 locking/mutex.c:820)
[   86.646958][  T218]  ? drm_gem_object_release_handle (gpu/drm/drm_gem.c:377) drm
[   86.647103][  T218]  ? __pfx___mutex_lock (locking/mutex.c:914)
[   86.647109][  T218]  ? __pfx___mutex_lock (locking/mutex.c:914)
[   86.647116][  T218]  ? idr_replace (idr.c:304)
[   86.647124][  T218]  ? drm_prime_remove_buf_handle (gpu/drm/drm_prime.c:195) drm
[   86.647264][  T218] drm_prime_remove_buf_handle (gpu/drm/drm_prime.c:195) drm
[   86.647408][  T218] drm_gem_object_release_handle (gpu/drm/drm_gem.c:379) drm
[   86.647555][  T218] drm_gem_handle_delete (gpu/drm/drm_gem.c:413) drm
[   86.647700][  T218] drm_client_buffer_create_dumb (gpu/drm/drm_client.c:424) drm
[   86.647840][  T218]  ? __pfx_drm_client_buffer_create_dumb (gpu/drm/drm_client.c:267) drm
[   86.647977][  T218]  ? drm_fb_helper_single_fb_probe (gpu/drm/drm_fb_helper.c:1414 gpu/drm/drm_fb_helper.c:1445) drm_kms_helper
[   86.648030][  T218] drm_fbdev_shmem_driver_fbdev_probe (gpu/drm/drm_fbdev_shmem.c:151) drm_shmem_helper
[   86.648045][  T218]  ? __pfx_drm_fbdev_shmem_driver_fbdev_probe (gpu/drm/drm_fbdev_shmem.c:119) drm_shmem_helper
[   86.648055][  T218]  ? __kmalloc_noprof (linux/local_lock_internal.h:62 slub.c:4771 slub.c:4883 slub.c:5294 slub.c:5307)
[   86.648064][  T218]  ? rcu_is_watching (x86/include/asm/atomic.h:23 linux/atomic/atomic-arch-fallback.h:457 linux/context_tracking.h:128 rcu/tree.c:752)
[   86.648073][  T218] drm_fb_helper_single_fb_probe (gpu/drm/drm_fb_helper.c:1454) drm_kms_helper
[   86.648121][  T218]  ? __pfx_drm_fb_helper_single_fb_probe (gpu/drm/drm_fb_helper.c:1391) drm_kms_helper
[   86.648167][  T218]  ? fb_copy_cmap (video/fbdev/core/fbcmap.c:187)
[   86.648174][  T218]  ? fb_alloc_cmap_gfp (video/fbdev/core/fbcmap.c:124 (discriminator 1))
[   86.648180][  T218] __drm_fb_helper_initial_config_and_unlock (gpu/drm/drm_fb_helper.c:1635) drm_kms_helper
[   86.648230][  T218] drm_fbdev_client_hotplug (gpu/drm/clients/drm_fbdev_client.c:66) drm_client_lib
[   86.648239][  T218] drm_client_register (gpu/drm/drm_client.c:143) drm
[   86.648376][  T218] drm_fbdev_client_setup (gpu/drm/clients/drm_fbdev_client.c:168) drm_client_lib
[   86.648383][  T218] drm_client_setup (gpu/drm/clients/drm_client_setup.c:46 gpu/drm/clients/drm_client_setup.c:35) drm_client_lib
[   86.648390][  T218] bochs_pci_probe (gpu/drm/tiny/bochs.c:776 gpu/drm/tiny/bochs.c:747) bochs
[   86.648405][  T218]  ? __pfx_bochs_pci_probe (gpu/drm/tiny/bochs.c:254) bochs
[   86.648413][  T218]  local_pci_probe (pci/pci-driver.c:325)
[   86.648422][  T218]  pci_call_probe (pci/pci-driver.c:387)
[   86.648427][  T218]  ? __pfx_pci_call_probe (pci/pci-driver.c:653)
[   86.648432][  T218]  ? find_held_lock (locking/lockdep.c:5350)
[   86.648439][  T218]  ? pci_match_device (linux/spinlock.h:390 pci/pci-driver.c:156)
[   86.648444][  T218]  ? rcu_is_watching (x86/include/asm/atomic.h:23 linux/atomic/atomic-arch-fallback.h:457 linux/context_tracking.h:128 rcu/tree.c:752)
[   86.648448][  T218]  ? trace_preempt_on (trace/events/preemptirq.h:53 (discriminator 21) trace/trace_preemptirq.c:120 (discriminator 21))
[   86.648452][  T218]  ? pci_match_id (pci/pci.h:466 pci/pci.h:460 pci/pci-driver.c:110)
[   86.648459][  T218]  ? pci_match_device (pci/pci-driver.c:168)
[   86.648465][  T218]  pci_device_probe (pci/pci-driver.c:448 pci/pci-driver.c:482)
[   86.648471][  T218]  call_driver_probe (base/dd.c:631)
[   86.648477][  T218]  really_probe (base/dd.c:709)
[   86.648484][  T218]  __driver_probe_device (base/dd.c:871)
[   86.648489][  T218]  driver_probe_device (base/dd.c:901)
[   86.648495][  T218]  __driver_attach (base/dd.c:1295)
[   86.648500][  T218]  ? __pfx___driver_attach (base/dd.c:1004 (discriminator 1))
[   86.648504][  T218]  bus_for_each_dev (base/bus.c:383)
[   86.648511][  T218]  ? __pfx_bus_for_each_dev (base/bus.c:205)
[   86.648516][  T218]  ? bus_add_driver (base/bus.c:754)
[   86.648520][  T218]  ? trace_preempt_on (trace/events/preemptirq.h:53 (discriminator 21) trace/trace_preemptirq.c:120 (discriminator 21))
[   86.648526][  T218]  bus_add_driver (base/bus.c:756)
[   86.648532][  T218]  driver_register (base/driver.c:249)
[   86.648538][  T218]  ? __pfx_bochs_pci_driver_init (bochs.c:?) bochs
[   86.648548][  T218]  do_one_initcall (main.c:1392)
[   86.648555][  T218]  ? __pfx_do_one_initcall (trace/events/initcall.h:10)
[   86.648561][  T218]  ? kasan_unpoison (kasan/shadow.c:146 kasan/shadow.c:178)
[   86.648568][  T218]  ? __kasan_slab_alloc (kasan/common.c:336 kasan/common.c:366)
[   86.648575][  T218]  ? rcu_is_watching (x86/include/asm/atomic.h:23 linux/atomic/atomic-arch-fallback.h:457 linux/context_tracking.h:128 rcu/tree.c:752)
[   86.648579][  T218]  ? kasan_unpoison (kasan/shadow.c:146 kasan/shadow.c:178)
[   86.648586][  T218]  do_init_module (module/main.c:3106)
[   86.648594][  T218]  ? __pfx_do_init_module (trace/events/module.h:50 (discriminator 1))
[   86.648599][  T218]  ? load_module (module/main.c:2528 module/main.c:2523 module/main.c:3575)
[   86.648603][  T218]  ? kfree (linux/kasan.h:235 slub.c:2689 slub.c:6250 slub.c:6565)
[   86.648611][  T218]  load_module (module/main.c:3580)
[   86.648620][  T218]  ? __pfx_load_module (module/main.c:3020)
[   86.648626][  T218]  ? __pfx_kernel_read_file (??:?)
[   86.648632][  T218]  ? do_syscall_64 (linux/irq-entry-common.h:279 linux/entry-common.h:320 x86/entry/syscall_64.c:100)
[   86.648640][  T218]  init_module_from_file (module/main.c:3777)
[   86.648646][  T218]  ? __pfx_init_module_from_file (module/main.c:3634)
[   86.648656][  T218]  ? idempotent_init_module (linux/spinlock.h:390 module/main.c:3688 module/main.c:3788)
[   86.648661][  T218]  ? rcu_is_watching (x86/include/asm/atomic.h:23 linux/atomic/atomic-arch-fallback.h:457 linux/context_tracking.h:128 rcu/tree.c:752)
[   86.648664][  T218]  ? trace_preempt_on (trace/events/preemptirq.h:53 (discriminator 21) trace/trace_preemptirq.c:120 (discriminator 21))
[   86.648669][  T218]  ? preempt_count_sub (sched/core.c:5874 (discriminator 2) sched/core.c:5871 (discriminator 2) sched/core.c:5893 (discriminator 2))
[   86.648676][  T218]  idempotent_init_module (module/main.c:3789)
[   86.648682][  T218]  ? __pfx_idempotent_init_module (module/main.c:3778)
[   86.648687][  T218]  ? preempt_count_sub (sched/core.c:5874 (discriminator 2) sched/core.c:5871 (discriminator 2) sched/core.c:5893 (discriminator 2))
[   86.648696][  T218]  ? security_capable (security.c:660 (discriminator 20))
[   86.648702][  T218]  __x64_sys_finit_module (module/main.c:3815 module/main.c:3799 module/main.c:3799)
[   86.648708][  T218]  do_syscall_64 (x86/entry/syscall_64.c:63 x86/entry/syscall_64.c:94)
[   86.648713][  T218]  ? rcu_is_watching (x86/include/asm/atomic.h:23 linux/atomic/atomic-arch-fallback.h:457 linux/context_tracking.h:128 rcu/tree.c:752)
[   86.648717][  T218]  ? trace_preempt_on (trace/events/preemptirq.h:53 (discriminator 21) trace/trace_preemptirq.c:120 (discriminator 21))
[   86.648720][  T218]  ? do_syscall_64 (linux/randomize_kstack.h:58 x86/entry/syscall_64.c:92)
[   86.648725][  T218]  ? preempt_count_sub (sched/core.c:5874 (discriminator 2) sched/core.c:5871 (discriminator 2) sched/core.c:5893 (discriminator 2))
[   86.648730][  T218]  ? do_syscall_64 (linux/randomize_kstack.h:58 x86/entry/syscall_64.c:92)
[   86.648734][  T218]  ? irqentry_exit (linux/irq-entry-common.h:280 linux/irq-entry-common.h:325 entry/common.c:162)
[   86.648740][  T218]  entry_SYSCALL_64_after_hwframe (x86/entry/entry_64.S:121)
[   86.648747][  T218] RIP: 0033:0x7f3757f90779
[   86.648768][  T218] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 4f 86 0d 00 f7 d8 64 89 01 48
All code
========
   0:	ff c3                	inc    %ebx
   2:	66 2e 0f 1f 84 00 00 	cs nopw 0x0(%rax,%rax,1)
   9:	00 00 00 
   c:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  11:	48 89 f8             	mov    %rdi,%rax
  14:	48 89 f7             	mov    %rsi,%rdi
  17:	48 89 d6             	mov    %rdx,%rsi
  1a:	48 89 ca             	mov    %rcx,%rdx
  1d:	4d 89 c2             	mov    %r8,%r10
  20:	4d 89 c8             	mov    %r9,%r8
  23:	4c 8b 4c 24 08       	mov    0x8(%rsp),%r9
  28:	0f 05                	syscall
  2a:*	48 3d 01 f0 ff ff    	cmp    $0xfffffffffffff001,%rax		<-- trapping instruction
  30:	73 01                	jae    0x33
  32:	c3                   	ret
  33:	48 8b 0d 4f 86 0d 00 	mov    0xd864f(%rip),%rcx        # 0xd8689
  3a:	f7 d8                	neg    %eax
  3c:	64 89 01             	mov    %eax,%fs:(%rcx)
  3f:	48                   	rex.W



The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20260531/202605310941.ddd52610-lkp@intel.com



-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



end of thread, other threads:[~2026-05-31  7:54 UTC | newest]

Thread overview: 9+ messages
2026-05-28  8:29 [PATCH] drm/prime: Fix unsupervised rb_tree corruption in drm_prime_remove_buf_handle w15303746062
2026-05-28  9:14 ` Christian König
2026-05-28 12:40   ` w15303746062
2026-05-28 13:29   ` [PATCH v2] drm/prime: fix dangling dmabuf entries after handle release w15303746062
2026-05-28 13:32     ` Christian König
2026-05-28 13:49       ` w15303746062
2026-05-29  6:45         ` Christian König
2026-05-29 11:45           ` w15303746062
2026-05-31  7:54 ` [PATCH] drm/prime: Fix unsupervised rb_tree corruption in drm_prime_remove_buf_handle kernel test robot
