Linux Media Controller development
* [PATCH] drm/nouveau: Document weird looking bugfix
@ 2026-06-10  8:26 Philipp Stanner
  2026-06-10  9:12 ` Christian König
  2026-06-10 12:12 ` Tvrtko Ursulin
  0 siblings, 2 replies; 3+ messages in thread
From: Philipp Stanner @ 2026-06-10  8:26 UTC
  To: Lyude Paul, Danilo Krummrich, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
	Christian König
  Cc: dri-devel, nouveau, linux-kernel, linux-media, Philipp Stanner

commit c8a5d5ea3ba6 ("nouveau: fix client work fence deletion race")
fixed a race. To do so, it replaced the automatically locking
dma_fence_is_signaled() with manual locks plus
dma_fence_is_signaled_locked().

For someone browsing through the code, this reads very much like a
cleanup or rework leftover. Future contributors and / or new maintainers
not familiar with the history might be tempted to remove that bugfix.

Document the bugfix.

Signed-off-by: Philipp Stanner <phasta@kernel.org>
---
(I did not test this)
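
A minimal userspace sketch of the hazard, for illustration only. It
assumes a signaller that sets the signaled bit while holding the fence
lock and keeps using the fence until it unlocks. struct toy_fence and
all toy_* helpers are hypothetical names, not kernel API, and the
trylock form exists only so the demo can run in a single thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct dma_fence; none of these names are kernel API. */
struct toy_fence {
	atomic_flag lock;      /* toy spinlock: set means "held" */
	atomic_bool signaled;  /* models the atomic signaled bit */
};

/* Lockless check: may return true while the signaller still holds the lock. */
static bool toy_is_signaled(struct toy_fence *f)
{
	return atomic_load(&f->signaled);
}

/*
 * Non-blocking form of "lock, check, unlock", so the demo can run in a
 * single thread. Code that takes a spinlock would wait here instead.
 */
static bool toy_ready_trylock(struct toy_fence *f)
{
	bool ret;

	if (atomic_flag_test_and_set(&f->lock))
		return false;  /* signaller is still in its critical section */
	ret = atomic_load(&f->signaled);
	atomic_flag_clear(&f->lock);
	return ret;
}

/* First half of signalling (assumed ordering): take the lock, set the bit. */
static void toy_signal_begin(struct toy_fence *f)
{
	while (atomic_flag_test_and_set(&f->lock))
		;  /* spin */
	atomic_store(&f->signaled, true);
	/* the assumed signaller keeps using the fence from here on */
}

/* Second half: leave the critical section. */
static void toy_signal_end(struct toy_fence *f)
{
	atomic_flag_clear(&f->lock);
}

/* Returns 0 after checking the window in which the two checks disagree. */
static int toy_demo(void)
{
	struct toy_fence f = { .lock = ATOMIC_FLAG_INIT, .signaled = false };

	toy_signal_begin(&f);
	assert(toy_is_signaled(&f));     /* lockless check already says "signaled" */
	assert(!toy_ready_trylock(&f));  /* locked check: not safe to put yet */
	toy_signal_end(&f);
	assert(toy_ready_trylock(&f));   /* now both checks agree */
	return 0;
}
```

In this model, dropping the last reference on the strength of
toy_is_signaled() alone could free the fence while the signaller is
still between toy_signal_begin() and toy_signal_end().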
---
 drivers/gpu/drm/nouveau/nouveau_drm.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index 42a81166f3a9..519a0c164a72 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -159,6 +159,13 @@ nouveau_cli_work_ready(struct dma_fence *fence)
 	unsigned long flags;
 	bool ret = true;
 
+	/*
+	 * This is not a cleanup / rework leftover, but a bugfix to prevent a
+	 * race with someone signalling the fence. The locked
+	 * dma_fence_is_signaled() cannot be used. The dma_fence implementation
+	 * is not fully synchronized with locks, but also uses atomic bits,
+	 * which can cause the dma_fence_put() below to be executed too soon.
+	 */
 	dma_fence_lock_irqsave(fence, flags);
 	if (!dma_fence_is_signaled_locked(fence))
 		ret = false;
-- 
2.54.0



* Re: [PATCH] drm/nouveau: Document weird looking bugfix
  2026-06-10  8:26 [PATCH] drm/nouveau: Document weird looking bugfix Philipp Stanner
@ 2026-06-10  9:12 ` Christian König
  2026-06-10 12:12 ` Tvrtko Ursulin
  1 sibling, 0 replies; 3+ messages in thread
From: Christian König @ 2026-06-10  9:12 UTC
  To: Philipp Stanner, Lyude Paul, Danilo Krummrich, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Sumit Semwal
  Cc: dri-devel, nouveau, linux-kernel, linux-media

On 6/10/26 10:26, Philipp Stanner wrote:
> commit c8a5d5ea3ba6 ("nouveau: fix client work fence deletion race")
> fixed a race. To do so, it replaced the automatically locking
> dma_fence_is_signaled() with manual locks plus
> dma_fence_is_signaled_locked().
> 
> For someone browsing through the code, this reads very much like a
> cleanup or rework leftover. Future contributors and / or new maintainers
> not familiar with the history might be tempted to remove that bugfix.
> 
> Document the bugfix.
> 
> Signed-off-by: Philipp Stanner <phasta@kernel.org>

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
> (I did not test this)
> ---
>  drivers/gpu/drm/nouveau/nouveau_drm.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
> index 42a81166f3a9..519a0c164a72 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_drm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
> @@ -159,6 +159,13 @@ nouveau_cli_work_ready(struct dma_fence *fence)
>  	unsigned long flags;
>  	bool ret = true;
>  
> +	/*
> +	 * This is not a cleanup / rework leftover, but a bugfix to prevent a
> +	 * race with someone signalling the fence. The locked
> +	 * dma_fence_is_signaled() cannot be used. The dma_fence implementation
> +	 * is not fully synchronized with locks, but also uses atomic bits,
> +	 * which can cause the dma_fence_put() below to be executed too soon.
> +	 */
>  	dma_fence_lock_irqsave(fence, flags);
>  	if (!dma_fence_is_signaled_locked(fence))
>  		ret = false;



* Re: [PATCH] drm/nouveau: Document weird looking bugfix
  2026-06-10  8:26 [PATCH] drm/nouveau: Document weird looking bugfix Philipp Stanner
  2026-06-10  9:12 ` Christian König
@ 2026-06-10 12:12 ` Tvrtko Ursulin
  1 sibling, 0 replies; 3+ messages in thread
From: Tvrtko Ursulin @ 2026-06-10 12:12 UTC
  To: Philipp Stanner, Lyude Paul, Danilo Krummrich, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Sumit Semwal, Christian König
  Cc: dri-devel, nouveau, linux-kernel, linux-media


On 10/06/2026 09:26, Philipp Stanner wrote:
> commit c8a5d5ea3ba6 ("nouveau: fix client work fence deletion race")
> fixed a race. To do so, it replaced the automatically locking
> dma_fence_is_signaled() with manual locks plus
> dma_fence_is_signaled_locked().
> 
> For someone browsing through the code, this reads very much like a
> cleanup or rework leftover. Future contributors and / or new maintainers
> not familiar with the history might be tempted to remove that bugfix.
> 
> Document the bugfix.
> 
> Signed-off-by: Philipp Stanner <phasta@kernel.org>
> ---
> (I did not test this)
> ---
>   drivers/gpu/drm/nouveau/nouveau_drm.c | 7 +++++++
>   1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
> index 42a81166f3a9..519a0c164a72 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_drm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
> @@ -159,6 +159,13 @@ nouveau_cli_work_ready(struct dma_fence *fence)
>   	unsigned long flags;
>   	bool ret = true;
>   
> +	/*
> +	 * This is not a cleanup / rework leftover, but a bugfix to prevent a
> +	 * race with someone signalling the fence. The locked
> +	 * dma_fence_is_signaled() cannot be used. The dma_fence implementation
> +	 * is not fully synchronized with locks, but also uses atomic bits,
> +	 * which can cause the dma_fence_put() below to be executed too soon.
> +	 */

IMHO it would also be interesting to document why this happens from the 
nouveau point of view.

For example I see the two references held on this fence in the call
chain, but apparently neither is enough to close the race, which
suggests a third party has a pointer to this fence but holds no
reference.

I am talking about this path:

nouveau_gem_object_unmap -> nouveau_cli_work_queue

There it grabs a reference before queueing the worker. In the worker it
drops it before calling the callback that nouveau_gem_object_unmap
installed:

static void
nouveau_cli_work(struct work_struct *w)
{
	struct nouveau_cli *cli = container_of(w, typeof(*cli), work);
	struct nouveau_cli_work *work, *wtmp;
	mutex_lock(&cli->lock);
	list_for_each_entry_safe(work, wtmp, &cli->worker, head) {
		if (!work->fence || nouveau_cli_work_ready(work->fence)) {

... nouveau_cli_work_ready can drop one reference

			list_del(&work->head);
			work->func(work);

... then work->func was set to nouveau_gem_object_delete_work by 
nouveau_gem_object_unmap, which will end up calling:

nouveau_gem_object_delete -> nouveau_fence_unref

On possibly the same fence.

So if there is a path inside nouveau itself which signals the fence
without holding a reference, could it be that the problem is
self-inflicted and not due to a dma-fence quirk?

I am not entirely sure since it is not very clear. It needs someone with 
nouveau expertise to clarify.
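
The reference accounting described above can be modelled in a few lines
of plain C. This is only a sketch: struct rc_fence and the rc_* helpers
are hypothetical, and the initial count of one (a reference owned by
whoever later ends up in nouveau_fence_unref) is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical refcounted object; not the kernel's struct dma_fence. */
struct rc_fence {
	int refcount;
	bool freed;
};

static void rc_get(struct rc_fence *f)
{
	f->refcount++;
}

static void rc_put(struct rc_fence *f)
{
	assert(f->refcount > 0);
	if (--f->refcount == 0)
		f->freed = true;  /* a real implementation would free memory here */
}

/* Models nouveau_cli_work_queue: take a reference before queueing. */
static void rc_work_queue(struct rc_fence *f)
{
	rc_get(f);
}

/* Models nouveau_cli_work_ready: drop that reference once signaled. */
static bool rc_work_ready(struct rc_fence *f, bool signaled)
{
	if (signaled)
		rc_put(f);
	return signaled;
}

/* Models the delete callback that ends in nouveau_fence_unref. */
static void rc_delete_work(struct rc_fence *f)
{
	rc_put(f);
}

/* Walks the sequence from the call chain; returns the final count. */
static int rc_demo(void)
{
	struct rc_fence f = { .refcount = 1, .freed = false };

	rc_work_queue(&f);              /* two references held */
	if (rc_work_ready(&f, true))    /* one dropped */
		rc_delete_work(&f);     /* last one dropped */
	assert(f.freed);
	return f.refcount;
}
```

In this model any party that kept a bare pointer without calling
rc_get() would be left with a dangling pointer once rc_delete_work()
returns.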

Regards,

Tvrtko

>   	dma_fence_lock_irqsave(fence, flags);
>   	if (!dma_fence_is_signaled_locked(fence))
>   		ret = false;


