From: "Timur Kristóf" <timur.kristof@gmail.com>
To: Alex Deucher <alexdeucher@gmail.com>
Cc: amd-gfx@lists.freedesktop.org, Alex Deucher <alexander.deucher@amd.com>
Subject: Re: [PATCH 4/6] drm/amdgpu: avoid a warning in timedout job handler
Date: Thu, 18 Dec 2025 09:49:25 -0600
Message-ID: <2496432.Hq7AAxBmiT@timur-max>
In-Reply-To: <CADnq5_PABibVsM+d7UtLcRwDKo+O6thYTge__X3eCFd5u3H0nQ@mail.gmail.com>
On Thursday, December 18, 2025, 9:41:41 Central Time, Alex
Deucher wrote:
> On Thu, Dec 18, 2025 at 12:21 AM Timur Kristóf <timur.kristof@gmail.com>
> wrote:
> > On Monday, December 15, 2025, 10:07:09 Central Time, Alex
> > Deucher wrote:
> > > Only set an error on the fence if the fence is not
> > > signalled. We can end up with a warning if the
> > > per queue reset path signals the fence and sets an error
> > > as part of the reset, but fails to recover.
> >
> > Can you please elaborate why this is necessary?
> > I don't entirely see the point of this patch. Why don't we want to set an
> > error on the fence when it was signalled by the per queue reset? I would
> > have thought that the next patch does that, and also fixes the warning
> > mentioned in the commit message here.
>
> If you call dma_fence_set_error() on a fence that has already signaled
> it triggers a warning. What could happen is that the queue reset sets
> the error on the fence and then signals the fence as part of the reset
> sequence. However if the queue reset ultimately fails, the fence is
> already signaled and then we try and set an error again here as we
> fall back to adapter reset, triggering the warning.
>
> Alex
I would have thought that the next patch in the series would take care of this
problem by itself. Thanks for the explanation. The patch is:
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
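
For anyone following along, here is a minimal user-space sketch of the
sequence Alex describes. It is a mock, not the kernel code: struct
mock_fence, the mock_* helpers and simulate_timeout() are hypothetical
names made up for illustration, and the -ECANCELED value used for the
queue reset path is an assumption. The helpers only imitate the relevant
contract of dma_fence_set_error() (warn if the fence has already
signaled) and dma_fence_get_status() (0 while unsignaled, otherwise the
error or 1).

```c
/* User-space mock for illustration only; NOT the kernel implementation. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct dma_fence. */
struct mock_fence {
	bool signaled;
	int error;	/* 0, or a negative errno value */
};

/* Counts how often an error was set on an already signaled fence. */
static int mock_warnings;

/* Imitates dma_fence_set_error(): warns when the fence has signaled. */
static void mock_fence_set_error(struct mock_fence *f, int error)
{
	assert(error < 0);
	if (f->signaled) {
		mock_warnings++;
		fprintf(stderr, "WARN: error set on a signaled fence\n");
	}
	f->error = error;
}

static void mock_fence_signal(struct mock_fence *f)
{
	f->signaled = true;
}

/*
 * Imitates dma_fence_get_status(): 0 if not signaled yet, the negative
 * error if signaled with an error, 1 if signaled without one.
 */
static int mock_fence_get_status(const struct mock_fence *f)
{
	if (!f->signaled)
		return 0;
	return f->error ? f->error : 1;
}

/*
 * Models a timeout where the per queue reset fails and we fall back.
 * reset_signals_fence: the reset path set an error and signaled the fence.
 * guarded: apply the check added by this patch before setting -ETIME.
 * Returns the number of warnings triggered.
 */
static int simulate_timeout(bool reset_signals_fence, bool guarded)
{
	struct mock_fence f = { .signaled = false, .error = 0 };

	mock_warnings = 0;

	if (reset_signals_fence) {
		/* Assumed error value for the reset path. */
		mock_fence_set_error(&f, -ECANCELED);
		mock_fence_signal(&f);
	}

	/* The queue reset failed, so the handler marks the job's fence. */
	if (!guarded || mock_fence_get_status(&f) == 0)
		mock_fence_set_error(&f, -ETIME);

	return mock_warnings;
}
```

In this mock, simulate_timeout(true, false) reports one warning, while
simulate_timeout(true, true) reports none. When the reset path never
signals the fence, both variants set -ETIME without a warning.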
>
> > > Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> > > ---
> > >
> > > drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 3 ++-
> > > 1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > > index 67fde99724bad..7f5d01164897f 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > > @@ -147,7 +147,8 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
> > >  		dev_err(adev->dev, "Ring %s reset failed\n", ring->sched.name);
> > >  	}
> > >
> > > -	dma_fence_set_error(&s_job->s_fence->finished, -ETIME);
> > > +	if (dma_fence_get_status(&s_job->s_fence->finished) == 0)
> > > +		dma_fence_set_error(&s_job->s_fence->finished, -ETIME);
> > >
> > >  	amdgpu_vm_put_task_info(ti);
Thread overview: 18+ messages
2025-12-15 16:07 [PATCH 1/6] drm/amdgpu: don't reemit ring contents more than once Alex Deucher
2025-12-15 16:07 ` [PATCH 2/6] drm/amdgpu: always backup and reemit fences Alex Deucher
2025-12-18 5:17 ` Timur Kristóf
2025-12-15 16:07 ` [PATCH 3/6] drm/amdgpu: use dma_fence_get_status() for adapter reset Alex Deucher
2025-12-15 16:07 ` [PATCH 4/6] drm/amdgpu: avoid a warning in timedout job handler Alex Deucher
2025-12-18 5:15 ` Timur Kristóf
2025-12-18 15:41 ` Alex Deucher
2025-12-18 15:49 ` Timur Kristóf [this message]
2025-12-15 16:07 ` [PATCH 5/6] drm/amdgpu: mark fences with errors before ring reset Alex Deucher
2025-12-15 16:07 ` [PATCH 6/6] drm/amdgpu/gfx9: Implement KGQ " Alex Deucher
2025-12-18 5:11 ` Timur Kristóf
2025-12-18 5:28 ` Timur Kristóf
2025-12-18 15:58 ` Alex Deucher
2025-12-18 17:03 ` Timur Kristóf
2025-12-18 19:02 ` Alex Deucher
2025-12-18 5:20 ` [PATCH 1/6] drm/amdgpu: don't reemit ring contents more than once Timur Kristóf
2025-12-18 15:36 ` Alex Deucher
2025-12-18 15:47 ` Timur Kristóf