From: "Christian König" <ckoenig.leichtzumerken@gmail.com>
To: "Joshua Ashton" <joshua@froggi.es>,
"Christian König" <christian.koenig@amd.com>,
amd-gfx@lists.freedesktop.org
Cc: "Friedrich Vock" <friedrich.vock@gmx.de>,
"Bas Nieuwenhuizen" <bas@basnieuwenhuizen.nl>,
"André Almeida" <andrealmeid@igalia.com>,
stable@vger.kernel.org
Subject: Re: [PATCH 3/3] drm/amdgpu: Increase soft recovery timeout to .5s
Date: Mon, 11 Mar 2024 07:46:00 +0100
Message-ID: <c1f80459-bf9c-439f-bdba-e08f13aea272@gmail.com>
In-Reply-To: <d537a460-6e6e-4bda-895c-c687be00ac29@froggi.es>
On 08.03.24 at 23:31, Joshua Ashton wrote:
> It definitely takes much longer than 10-20ms in some instances.
>
> Some of these instances can even be shown in Friedrich's hang test
> suite -- specifically when there are a lot of page faults going on.
That's exactly the part I want to avoid. The context-based recovery is
meant to break out of shaders stuck in endless loops.
When there are page faults going on, I would rather recommend a hard
reset of the GPU.
>
> The work (or parts of the work) could also be pending and not in any
> wave yet, just hanging out in the ring. There may be a better solution
> to that, but I don't know it.
Yeah, but killing any of that should never take longer than the
original submission was supposed to take.
In other words, if we assume that we should hit at least 20fps, then we
should never go over 50ms. And even at that point we have already waited
much longer than that for the shader to complete.
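To spell out the arithmetic: one frame at 20fps has a budget of 1000ms / 20 = 50ms, so a 500ms wait covers ten such frames. A minimal sketch in plain userspace C; the helper names frame_budget_ms() and frames_spanned() are made up for illustration and do not exist in amdgpu:

```c
/* Hypothetical helper: time budget of a single frame, in milliseconds.
 * Integer division, so 60fps yields 16 rather than 16.67. */
static unsigned int frame_budget_ms(unsigned int fps)
{
	return fps ? 1000u / fps : 0u;
}

/* Hypothetical helper: how many whole frame budgets fit into a timeout. */
static unsigned int frames_spanned(unsigned int timeout_ms, unsigned int fps)
{
	unsigned int budget = frame_budget_ms(fps);

	return budget ? timeout_ms / budget : 0u;
}
```

With these, frame_budget_ms(20) is 50 and frames_spanned(500, 20) is 10, while the 10-20ms range discussed earlier stays below a single frame budget.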
If you really want to raise it that high, I would rather suggest making
it configurable.
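As a rough illustration of what "configurable" could look like, here is a userspace sketch of the same poll-until-deadline pattern as the loop in the quoted patch, with the timeout passed in instead of hard-coded. Everything here is an assumption for illustration: poll_until_deadline(), struct fake_job and the callbacks are hypothetical names, and in the driver the value would more likely come from something like a module parameter than from a function argument.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock in nanoseconds (userspace stand-in for ktime_get()). */
static int64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Hypothetical helper: keep calling kick(ctx) until is_done(ctx) reports
 * completion or timeout_ms has elapsed. Returns true if the work
 * completed, including the case where it completed right at the deadline.
 */
static bool poll_until_deadline(bool (*is_done)(void *ctx),
				void (*kick)(void *ctx),
				void *ctx, unsigned int timeout_ms)
{
	int64_t deadline = now_ns() + (int64_t)timeout_ms * 1000000LL;

	while (!is_done(ctx) && now_ns() < deadline)
		kick(ctx);

	return is_done(ctx);
}

/* Toy job for demonstration: done after kicks_needed kicks, never if < 0. */
struct fake_job {
	long kicks_needed;
	unsigned long kicks;
};

static bool fake_job_done(void *ctx)
{
	struct fake_job *job = ctx;

	return job->kicks_needed >= 0 &&
	       job->kicks >= (unsigned long)job->kicks_needed;
}

static void fake_job_kick(void *ctx)
{
	struct fake_job *job = ctx;

	job->kicks++;
}
```

A job that finishes after a few kicks returns true well before the deadline, while a job that never finishes makes the call spin for the full timeout_ms and return false, which is the case where a caller would escalate to a full reset.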
Regards,
Christian.
>
> Raising it to .5s still makes sense to me.
>
> - Joshie 🐸✨
>
> On 3/8/24 08:29, Christian König wrote:
>> On 07.03.24 at 20:04, Joshua Ashton wrote:
>>> Results in much more reliable soft recovery on
>>> Steam Deck.
>>
>> Waiting 500ms for a locked up shader is way too long, I think. We could
>> increase the 10ms to something like 20ms, but I really wouldn't go
>> much over that.
>>
>> This just kills shaders which are in an endless loop. When that
>> takes longer than 10-20ms, we really have a hardware problem which
>> needs a full reset to resolve.
>>
>> Regards,
>> Christian.
>>
>>>
>>> Signed-off-by: Joshua Ashton <joshua@froggi.es>
>>>
>>> Cc: Friedrich Vock <friedrich.vock@gmx.de>
>>> Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
>>> Cc: Christian König <christian.koenig@amd.com>
>>> Cc: André Almeida <andrealmeid@igalia.com>
>>> Cc: stable@vger.kernel.org
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> index 57c94901ed0a..be99db0e077e 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> @@ -448,7 +448,7 @@ bool amdgpu_ring_soft_recovery(struct
>>> amdgpu_ring *ring, unsigned int vmid,
>>> spin_unlock_irqrestore(fence->lock, flags);
>>> atomic_inc(&ring->adev->gpu_reset_counter);
>>> - deadline = ktime_add_us(ktime_get(), 10000);
>>> + deadline = ktime_add_ms(ktime_get(), 500);
>>> while (!dma_fence_is_signaled(fence) &&
>>> ktime_to_ns(ktime_sub(deadline, ktime_get())) > 0)
>>> ring->funcs->soft_recovery(ring, vmid);
>>
>
Thread overview: 11+ messages
2024-03-07 19:04 [PATCH 1/3] drm/amdgpu: Forward soft recovery errors to userspace Joshua Ashton
2024-03-07 19:04 ` [PATCH 2/3] drm/amdgpu: Determine soft recovery deadline next to usage Joshua Ashton
2024-03-08 8:23 ` Christian König
2024-03-07 19:04 ` [PATCH 3/3] drm/amdgpu: Increase soft recovery timeout to .5s Joshua Ashton
2024-03-08 8:29 ` Christian König
2024-03-08 22:31 ` Joshua Ashton
2024-03-11 6:46 ` Christian König [this message]
2024-03-08 8:33 ` [PATCH 1/3] drm/amdgpu: Forward soft recovery errors to userspace Christian König
2024-03-09 16:27 ` Marek Olšák
2024-08-01 15:17 ` Friedrich Vock
2024-08-02 8:30 ` Christian König