All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Danilo Krummrich" <dakr@kernel.org>
To: "Tvrtko Ursulin" <tvrtko.ursulin@igalia.com>
Cc: dri-devel@lists.freedesktop.org, kernel-dev@igalia.com,
	"Philipp Stanner" <phasta@kernel.org>,
	nouveau@lists.freedesktop.org,
	"Alice Ryhl" <aliceryhl@google.com>,
	"Boris Brezillon" <boris.brezillon@collabora.com>,
	"Christian König" <christian.koenig@amd.com>,
	"Gary Guo" <gary@garyguo.net>
Subject: Re: [PATCH v3] drm/nouveau: Simplify nouveau_cli_work
Date: Sun, 21 Jun 2026 17:30:06 +0200	[thread overview]
Message-ID: <DJEU27KUZRNN.3R1F8AL6UL30S@kernel.org> (raw)
In-Reply-To: <20260615092607.80917-1-tvrtko.ursulin@igalia.com>

(Cc: Alice, Boris, Christian, Dave, Gary)

On Mon Jun 15, 2026 at 11:26 AM CEST, Tvrtko Ursulin wrote:
> @@ -176,9 +160,25 @@ nouveau_cli_work(struct work_struct *w)
>  	struct nouveau_cli_work *work, *wtmp;
>  	mutex_lock(&cli->lock);
>  	list_for_each_entry_safe(work, wtmp, &cli->worker, head) {
> -		if (!work->fence || nouveau_cli_work_ready(work->fence)) {
> +		struct dma_fence *fence = work->fence;
> +
> +		if (!fence || dma_fence_is_signaled(fence)) {
> +			if (fence) {
> +				unsigned long flags;
> +
> +				/*
> +				 * Because fence can still be in the process of
> +				 * processing the callback list, and the
> +				 * callback references the work we are about to
> +				 * free, we need to sync with the callback
> +				 * processing before freeing the work.
> +				 */
> +				dma_fence_lock_irqsave(fence, flags);
> +				dma_fence_unlock_irqrestore(fence, flags);

This does not ensure that there are no more dma_fence_ops callbacks into the
driver. For instance, a concurrent dma_fence_is_signaled() might read the
signaled flag as false before we do this lock dance and then still enter
ops->signaled().

Luckily, for this specific code here in nouveau it should be enough to ensure
that all dma_fence_cb callbacks are finished, which this hack does achieve.

The reason I call it hack is that it fundamentally relies on dma_fence
implementation details and not a specified API contract.

The closest thing to an API contract for this is the RCU protection that was
recently added. So, the proper approach would be to run this work with
queue_rcu_work() and then call flush_rcu_work().

In this context it is worth noting that in this case it is even more
complicated. Since the work struct is shared per CLI the single RCU barrier in
flush_rcu_work() is not enough, i.e. there is no per fence callback grace period
guarantee.

To me this still seems to be a bit of a footgun (although we could probably
improve the nouveau code to make it a bit simpler).

That said, the patch does make things a bit clearer, so I picked it up --
thanks!

Thanks,
Danilo

> +			}
>  			list_del(&work->head);
>  			work->func(work);
> +			dma_fence_put(fence);
>  		}
>  	}
>  	mutex_unlock(&cli->lock);
> -- 
> 2.54.0


WARNING: multiple messages have this Message-ID (diff)
From: "Danilo Krummrich" <dakr@kernel.org>
To: "Tvrtko Ursulin" <tvrtko.ursulin@igalia.com>
Cc: dri-devel@lists.freedesktop.org, kernel-dev@igalia.com,
	"Philipp Stanner" <phasta@kernel.org>,
	"Lyude Paul" <lyude@redhat.com>,
	nouveau@lists.freedesktop.org,
	"Alice Ryhl" <aliceryhl@google.com>,
	"Boris Brezillon" <boris.brezillon@collabora.com>,
	"Christian König" <christian.koenig@amd.com>,
	"Dave Airlie" <airlied@gmail.com>, "Gary Guo" <gary@garyguo.net>
Subject: Re: [PATCH v3] drm/nouveau: Simplify nouveau_cli_work
Date: Sun, 21 Jun 2026 17:30:06 +0200	[thread overview]
Message-ID: <DJEU27KUZRNN.3R1F8AL6UL30S@kernel.org> (raw)
In-Reply-To: <20260615092607.80917-1-tvrtko.ursulin@igalia.com>

(Cc: Alice, Boris, Christian, Dave, Gary)

On Mon Jun 15, 2026 at 11:26 AM CEST, Tvrtko Ursulin wrote:
> @@ -176,9 +160,25 @@ nouveau_cli_work(struct work_struct *w)
>  	struct nouveau_cli_work *work, *wtmp;
>  	mutex_lock(&cli->lock);
>  	list_for_each_entry_safe(work, wtmp, &cli->worker, head) {
> -		if (!work->fence || nouveau_cli_work_ready(work->fence)) {
> +		struct dma_fence *fence = work->fence;
> +
> +		if (!fence || dma_fence_is_signaled(fence)) {
> +			if (fence) {
> +				unsigned long flags;
> +
> +				/*
> +				 * Because fence can still be in the process of
> +				 * processing the callback list, and the
> +				 * callback references the work we are about to
> +				 * free, we need to sync with the callback
> +				 * processing before freeing the work.
> +				 */
> +				dma_fence_lock_irqsave(fence, flags);
> +				dma_fence_unlock_irqrestore(fence, flags);

This does not ensure that there are no more dma_fence_ops callbacks into the
driver. For instance, a concurrent dma_fence_is_signaled() might read the
signaled flag as false before we do this lock dance and then still enter
ops->signaled().

Luckily, for this specific code here in nouveau it should be enough to ensure
that all dma_fence_cb callbacks are finished, which this hack does achieve.

The reason I call it hack is that it fundamentally relies on dma_fence
implementation details and not a specified API contract.

The closest thing to an API contract for this is the RCU protection that was
recently added. So, the proper approach would be to run this work with
queue_rcu_work() and then call flush_rcu_work().

In this context it is worth noting that in this case it is even more
complicated. Since the work struct is shared per CLI the single RCU barrier in
flush_rcu_work() is not enough, i.e. there is no per fence callback grace period
guarantee.

To me this still seems to be a bit of a footgun (although we could probably
improve the nouveau code to make it a bit simpler).

That said, the patch does make things a bit clearer, so I picked it up --
thanks!

Thanks,
Danilo

> +			}
>  			list_del(&work->head);
>  			work->func(work);
> +			dma_fence_put(fence);
>  		}
>  	}
>  	mutex_unlock(&cli->lock);
> -- 
> 2.54.0


  parent reply	other threads:[~2026-06-21 15:30 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-12 16:54 [PATCH] drm/nouveau: Simplify nouveau_cli_work Tvrtko Ursulin
2026-06-12 16:54 ` Tvrtko Ursulin
2026-06-12 17:01 ` sashiko-bot
2026-06-12 17:14 ` Philipp Stanner
2026-06-12 17:14   ` Philipp Stanner
2026-06-12 17:18   ` Tvrtko Ursulin
2026-06-12 17:18     ` Tvrtko Ursulin
2026-06-12 17:29     ` Tvrtko Ursulin
2026-06-12 17:29       ` Tvrtko Ursulin
2026-06-12 17:22 ` [PATCH v2] " Tvrtko Ursulin
2026-06-12 17:22   ` Tvrtko Ursulin
2026-06-12 17:33   ` sashiko-bot
2026-06-15  9:26 ` [PATCH v3] " Tvrtko Ursulin
2026-06-15  9:26   ` Tvrtko Ursulin
2026-06-15  9:31   ` Philipp Stanner
2026-06-15  9:31     ` Philipp Stanner
2026-06-21 15:29   ` Danilo Krummrich
2026-06-21 15:29     ` Danilo Krummrich
2026-06-21 15:30   ` Danilo Krummrich [this message]
2026-06-21 15:30     ` Danilo Krummrich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DJEU27KUZRNN.3R1F8AL6UL30S@kernel.org \
    --to=dakr@kernel.org \
    --cc=aliceryhl@google.com \
    --cc=boris.brezillon@collabora.com \
    --cc=christian.koenig@amd.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=gary@garyguo.net \
    --cc=kernel-dev@igalia.com \
    --cc=nouveau@lists.freedesktop.org \
    --cc=phasta@kernel.org \
    --cc=tvrtko.ursulin@igalia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.