public inbox for linux-media@vger.kernel.org
 help / color / mirror / Atom feed
From: "Christian König" <christian.koenig@amd.com>
To: phasta@kernel.org, Lyude Paul <lyude@redhat.com>,
	Danilo Krummrich <dakr@kernel.org>,
	David Airlie <airlied@gmail.com>, Simona Vetter <simona@ffwll.ch>,
	Sabrina Dubroca <sd@queasysnail.net>,
	Sumit Semwal <sumit.semwal@linaro.org>
Cc: dri-devel@lists.freedesktop.org, nouveau@lists.freedesktop.org,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org,
	stable@vger.kernel.org
Subject: Re: [PATCH 1/3] drm/nouveau: Prevent signaled fences in pending list
Date: Thu, 10 Apr 2025 15:16:39 +0200	[thread overview]
Message-ID: <50c9530d-e274-4f89-8620-16afe0981239@amd.com> (raw)
In-Reply-To: <c737c89c7ce9174e349c61ab4e5712eee8946f13.camel@mailbox.org>

[-- Attachment #1: Type: text/plain, Size: 3323 bytes --]

Am 10.04.25 um 15:09 schrieb Philipp Stanner:
> On Thu, 2025-04-10 at 14:58 +0200, Christian König wrote:
>> Am 10.04.25 um 11:24 schrieb Philipp Stanner:
>>> Nouveau currently relies on the assumption that dma_fences will
>>> only
>>> ever get signaled through nouveau_fence_signal(), which takes care
>>> of
>>> removing a signaled fence from the list nouveau_fence_chan.pending.
>>>
>>> This self-imposed rule is violated in nouveau_fence_done(), where
>>> dma_fence_is_signaled() (somewhat surprisingly, considering its
>>> name)
>>> can signal the fence without removing it from the list. This
>>> enables
>>> accesses to already signaled fences through the list, which is a
>>> bug.
>>>
>>> In particular, it can race with nouveau_fence_context_kill(), which
>>> would then attempt to set an error code on an already signaled
>>> fence,
>>> which is illegal.
>>>
>>> In nouveau_fence_done(), the call to nouveau_fence_update() already
>>> ensures to signal all ready fences. Thus, the signaling potentially
>>> performed by dma_fence_is_signaled() is actually not necessary.
>> Ah, I now got what you are trying to do here! But that won't help.
>>
>> The problem is it is perfectly valid for somebody external (e.g.
>> other driver, TTM etc...) to call dma_fence_is_signaled() on a
>> nouveau fence.
>>
>> This will then in turn still signal the fence and leave it on the
>> pending list and creating the problem you have.
> Good to hear – precisely that then is the use case for a dma_fence
> callback! ^_^ It guarantees that, no matter who signals a fence, no
> matter at what place, a certain action will always be performed.
>
> I can't think of any other mechanism which could guarantee that a
> signaled fence immediately gets removed from nouveau's pending list,
> other than the callbacks.
>
> But seriously, I don't think that anyone does this currently, nor do I
> think that anyone could get away with doing it without the entire
> computer burning down.

Yeah, I don't think that this is possible at the moment.

When you do stuff like that from the provider side you will always run into lifetime issues because in the signaling from interrupt case you then drop the last reference before the signaling is completed.

How about the attached (not even compile tested) patch? I think it should fix the issue.

Regards,
Christian.

>
> P.
>
>
>
>> Regards,
>> Christian.
>>
>>> Replace the call to dma_fence_is_signaled() with
>>> nouveau_fence_base_is_signaled().
>>>
>>> Cc: <stable@vger.kernel.org> # 4.10+, precise commit not to be
>>> determined
>>> Signed-off-by: Philipp Stanner <phasta@kernel.org>
>>> ---
>>>  drivers/gpu/drm/nouveau/nouveau_fence.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c
>>> b/drivers/gpu/drm/nouveau/nouveau_fence.c
>>> index 7cc84472cece..33535987d8ed 100644
>>> --- a/drivers/gpu/drm/nouveau/nouveau_fence.c
>>> +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
>>> @@ -274,7 +274,7 @@ nouveau_fence_done(struct nouveau_fence *fence)
>>>  			nvif_event_block(&fctx->event);
>>>  		spin_unlock_irqrestore(&fctx->lock, flags);
>>>  	}
>>> -	return dma_fence_is_signaled(&fence->base);
>>> +	return test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence-
>>>> base.flags);
>>>  }
>>>  
>>>  static long

[-- Attachment #2: 0001-drm-nouveau-fix-and-cleanup-fence-handling.patch --]
[-- Type: text/x-patch, Size: 2670 bytes --]

From 165df36b603b37f6f1785ce359f7cd184db62196 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Christian=20K=C3=B6nig?= <christian.koenig@amd.com>
Date: Thu, 10 Apr 2025 10:18:29 +0200
Subject: [PATCH] drm/nouveau: fix and cleanup fence handling
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The fence was not removed from the pending list when signaled from the
.signaled callback. Fix that and also remove the superflous
.enable_signaling callback.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/nouveau/nouveau_fence.c | 31 +++++++------------------
 1 file changed, 8 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
index 7cc84472cece..53c70ddef964 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -485,32 +485,18 @@ static bool nouveau_fence_is_signaled(struct dma_fence *f)
 		ret = (int)(fctx->read(chan) - fence->base.seqno) >= 0;
 	rcu_read_unlock();
 
-	return ret;
-}
-
-static bool nouveau_fence_no_signaling(struct dma_fence *f)
-{
-	struct nouveau_fence *fence = from_fence(f);
-
-	/*
-	 * caller should have a reference on the fence,
-	 * else fence could get freed here
-	 */
-	WARN_ON(kref_read(&fence->base.refcount) <= 1);
+	if (ret) {
+		/*
+		 * caller should have a reference on the fence,
+		 * else fence could get freed here
+		 */
+		WARN_ON(kref_read(&fence->base.refcount) <= 1);
 
-	/*
-	 * This needs uevents to work correctly, but dma_fence_add_callback relies on
-	 * being able to enable signaling. It will still get signaled eventually,
-	 * just not right away.
-	 */
-	if (nouveau_fence_is_signaled(f)) {
 		list_del(&fence->head);
-
 		dma_fence_put(&fence->base);
-		return false;
 	}
 
-	return true;
+	return ret;
 }
 
 static void nouveau_fence_release(struct dma_fence *f)
@@ -525,7 +511,6 @@ static void nouveau_fence_release(struct dma_fence *f)
 static const struct dma_fence_ops nouveau_fence_ops_legacy = {
 	.get_driver_name = nouveau_fence_get_get_driver_name,
 	.get_timeline_name = nouveau_fence_get_timeline_name,
-	.enable_signaling = nouveau_fence_no_signaling,
 	.signaled = nouveau_fence_is_signaled,
 	.wait = nouveau_fence_wait_legacy,
 	.release = nouveau_fence_release
@@ -540,7 +525,7 @@ static bool nouveau_fence_enable_signaling(struct dma_fence *f)
 	if (!fctx->notify_ref++)
 		nvif_event_allow(&fctx->event);
 
-	ret = nouveau_fence_no_signaling(f);
+	ret = nouveau_fence_is_signaled(f);
 	if (ret)
 		set_bit(DMA_FENCE_FLAG_USER_BITS, &fence->base.flags);
 	else if (!--fctx->notify_ref)
-- 
2.34.1


  reply	other threads:[~2025-04-10 13:16 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-10  9:24 [PATCH 0/3] drm/nouveau: Fix & improve nouveau_fence_done() Philipp Stanner
2025-04-10  9:24 ` [PATCH 1/3] drm/nouveau: Prevent signaled fences in pending list Philipp Stanner
2025-04-10 12:13   ` Christian König
2025-04-10 12:21     ` Danilo Krummrich
2025-04-10 12:42       ` Christian König
2025-04-10 12:58   ` Christian König
2025-04-10 13:09     ` Philipp Stanner
2025-04-10 13:16       ` Christian König [this message]
2025-04-10 15:36         ` Philipp Stanner
2025-04-11  9:29           ` Philipp Stanner
     [not found]             ` <81a70ba6-94b1-4bb3-a0b2-9e8890f90b33@amd.com>
2025-04-11 12:44               ` Philipp Stanner
2025-04-11 13:06                 ` Christian König
2025-04-11 14:10                   ` Philipp Stanner
2025-04-14  8:54                     ` Philipp Stanner
2025-04-14 14:27                       ` Danilo Krummrich
2025-04-15  9:56                         ` Christian König
2025-04-15 12:54                           ` Philipp Stanner
2025-04-10  9:24 ` [PATCH 2/3] drm/nouveau: Remove surplus if-branch Philipp Stanner
2025-04-10 12:15   ` Christian König
2025-04-10  9:24 ` [PATCH 3/3] drm/nouveau: Add helper to check base fence Philipp Stanner
2025-04-10  9:51 ` [PATCH 0/3] drm/nouveau: Fix & improve nouveau_fence_done() Philipp Stanner
2025-04-10 12:18   ` Christian König
2025-04-10 13:18     ` Philipp Stanner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=50c9530d-e274-4f89-8620-16afe0981239@amd.com \
    --to=christian.koenig@amd.com \
    --cc=airlied@gmail.com \
    --cc=dakr@kernel.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=linaro-mm-sig@lists.linaro.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-media@vger.kernel.org \
    --cc=lyude@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=nouveau@lists.freedesktop.org \
    --cc=phasta@kernel.org \
    --cc=sd@queasysnail.net \
    --cc=simona@ffwll.ch \
    --cc=stable@vger.kernel.org \
    --cc=sumit.semwal@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox