From: Philipp Stanner <phasta@mailbox.org>
To: "Christian König" <christian.koenig@amd.com>,
phasta@kernel.org, "Lyude Paul" <lyude@redhat.com>,
"Danilo Krummrich" <dakr@kernel.org>,
"David Airlie" <airlied@gmail.com>,
"Simona Vetter" <simona@ffwll.ch>,
"Sabrina Dubroca" <sd@queasysnail.net>,
"Sumit Semwal" <sumit.semwal@linaro.org>
Cc: dri-devel@lists.freedesktop.org, nouveau@lists.freedesktop.org,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org,
stable@vger.kernel.org
Subject: Re: [PATCH 1/3] drm/nouveau: Prevent signaled fences in pending list
Date: Fri, 11 Apr 2025 14:44:18 +0200 [thread overview]
Message-ID: <aca00cb25b813da4fd2f215829f02337f05642f3.camel@mailbox.org> (raw)
In-Reply-To: <81a70ba6-94b1-4bb3-a0b2-9e8890f90b33@amd.com>
On Fri, 2025-04-11 at 13:05 +0200, Christian König wrote:
> Am 11.04.25 um 11:29 schrieb Philipp Stanner:
>
> > [SNIP]
> >
> > It could be, however, that at the same moment
> > nouveau_fence_signal() is
> > removing that entry, holding the appropriate lock.
> >
> > So we have a race. Again.
> >
>
> Ah, yes of course. If signaled is called with or without the lock is
> actually undetermined.
>
>
> >
> > You see, fixing things in Nouveau is difficult :)
> > It gets more difficult if you want to clean it up "properly", so it
> > conforms to rules such as those from dma_fence.
> >
> > I have now provided two fixes that both work, but you are not
> > satisfied
> > with from the dma_fence-maintainer's perspective. I understand
> > that,
> > but please also understand that it's actually not my primary task
> > to
> > work on Nouveau. I just have to fix this bug to move on with my
> > scheduler work.
> >
>
> Well I'm happy with whatever solution as long as it works, but as
> far as I can see the approach with the callback simply doesn't.
>
> You just can't drop the fence reference for the list from the
> callback.
>
>
> >
> > So if you have another idea, feel free to share it. But I'd like to
> > know how we can go on here.
> >
>
> Well the fence code actually works, doesn't it? The problem is
> rather that setting the error throws a warning because it doesn't
> expect signaled fences on the pending list.
>
> Maybe we should fix that instead.
The fence code works as the author intended, but I would be happy if it
were more explicitly documented.
Regarding the WARN_ON: It occurs in dma_fence_set_error() because there
is an attempt to set an error code on a signaled fence. I don't think
that should be "fixed", it works as intended: You must not set an error
code of a fence that was already signaled.
The reason seems to be that once a fence is signaled, a third party
might evaluate the error code.
But I think this wasn't wat you meant with "fix".
In any case, there must not be signaled fences in nouveau's pending-
list. They must be removed immediately once they signal, and this must
not race.
>
>
> >
> > I'm running out of ideas. What I'm wondering if we couldn't just
> > remove
> > performance hacky fastpath functions such as
> > nouveau_fence_is_signaled() completely. It seems redundant to me.
> >
>
> That would work for me as well.
I'll test this approach. Seems a bit like the nuclear approach, but if
it works we'd at least clean up a lot of this mess.
P.
>
>
> >
> >
> > Or we might add locking to it, but IDK what was achieved with RCU
> > here.
> > In any case it's definitely bad that Nouveau has so many redundant
> > and
> > half-redundant mechanisms.
> >
>
> Yeah, agree messing with the locks even more won't help us here.
>
> Regards,
> Christian.
>
>
> >
> >
> >
> > P.
> >
> >
> > >
> > >
> > > P.
> > >
> > >
> > > >
> > > > Regards,
> > > > Christian.
> > > >
> > > >
> > > > >
> > > > > P.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > >
> > > > > > Regards,
> > > > > > Christian.
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > Replace the call to dma_fence_is_signaled() with
> > > > > > > nouveau_fence_base_is_signaled().
> > > > > > >
> > > > > > > Cc: <stable@vger.kernel.org> # 4.10+, precise commit not
> > > > > > > to
> > > > > > > be
> > > > > > > determined
> > > > > > > Signed-off-by: Philipp Stanner <phasta@kernel.org>
> > > > > > > ---
> > > > > > > drivers/gpu/drm/nouveau/nouveau_fence.c | 2 +-
> > > > > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c
> > > > > > > b/drivers/gpu/drm/nouveau/nouveau_fence.c
> > > > > > > index 7cc84472cece..33535987d8ed 100644
> > > > > > > --- a/drivers/gpu/drm/nouveau/nouveau_fence.c
> > > > > > > +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
> > > > > > > @@ -274,7 +274,7 @@ nouveau_fence_done(struct
> > > > > > > nouveau_fence
> > > > > > > *fence)
> > > > > > > nvif_event_block(&fctx->event);
> > > > > > > spin_unlock_irqrestore(&fctx->lock,
> > > > > > > flags);
> > > > > > > }
> > > > > > > - return dma_fence_is_signaled(&fence->base);
> > > > > > > + return test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
> > > > > > > &fence-
> > > > > > >
> > > > > > > >
> > > > > > > > base.flags);
> > > > > > > >
> > > > > > >
> > > > > > > }
> > > > > > >
> > > > > > > static long
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
>
next prev parent reply other threads:[~2025-04-11 12:44 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-10 9:24 [PATCH 0/3] drm/nouveau: Fix & improve nouveau_fence_done() Philipp Stanner
2025-04-10 9:24 ` [PATCH 1/3] drm/nouveau: Prevent signaled fences in pending list Philipp Stanner
2025-04-10 12:13 ` Christian König
2025-04-10 12:21 ` Danilo Krummrich
2025-04-10 12:42 ` Christian König
2025-04-10 12:58 ` Christian König
2025-04-10 13:09 ` Philipp Stanner
2025-04-10 13:16 ` Christian König
2025-04-10 15:36 ` Philipp Stanner
2025-04-11 9:29 ` Philipp Stanner
[not found] ` <81a70ba6-94b1-4bb3-a0b2-9e8890f90b33@amd.com>
2025-04-11 12:44 ` Philipp Stanner [this message]
2025-04-11 13:06 ` Christian König
2025-04-11 14:10 ` Philipp Stanner
2025-04-14 8:54 ` Philipp Stanner
2025-04-14 14:27 ` Danilo Krummrich
2025-04-15 9:56 ` Christian König
2025-04-15 12:54 ` Philipp Stanner
2025-04-10 9:24 ` [PATCH 2/3] drm/nouveau: Remove surplus if-branch Philipp Stanner
2025-04-10 12:15 ` Christian König
2025-04-10 9:24 ` [PATCH 3/3] drm/nouveau: Add helper to check base fence Philipp Stanner
2025-04-10 9:51 ` [PATCH 0/3] drm/nouveau: Fix & improve nouveau_fence_done() Philipp Stanner
2025-04-10 12:18 ` Christian König
2025-04-10 13:18 ` Philipp Stanner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aca00cb25b813da4fd2f215829f02337f05642f3.camel@mailbox.org \
--to=phasta@mailbox.org \
--cc=airlied@gmail.com \
--cc=christian.koenig@amd.com \
--cc=dakr@kernel.org \
--cc=dri-devel@lists.freedesktop.org \
--cc=linaro-mm-sig@lists.linaro.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-media@vger.kernel.org \
--cc=lyude@redhat.com \
--cc=netdev@vger.kernel.org \
--cc=nouveau@lists.freedesktop.org \
--cc=phasta@kernel.org \
--cc=sd@queasysnail.net \
--cc=simona@ffwll.ch \
--cc=stable@vger.kernel.org \
--cc=sumit.semwal@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox