From: Mario Limonciello <mario.limonciello@amd.com>
To: "Timur Kristóf" <timur.kristof@gmail.com>,
"Hamza Mahfooz" <someguy@effective-light.com>,
dri-devel@lists.freedesktop.org,
"Christian König" <christian.koenig@amd.com>
Cc: "Alex Deucher" <alexander.deucher@amd.com>,
"David Airlie" <airlied@gmail.com>,
"Simona Vetter" <simona@ffwll.ch>,
"Harry Wentland" <harry.wentland@amd.com>,
"Leo Li" <sunpeng.li@amd.com>,
"Rodrigo Siqueira" <siqueira@igalia.com>,
"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
"Maxime Ripard" <mripard@kernel.org>,
"Thomas Zimmermann" <tzimmermann@suse.de>,
"Sunil Khatri" <sunil.khatri@amd.com>,
"Ce Sun" <cesun102@amd.com>, "Lijo Lazar" <lijo.lazar@amd.com>,
"Kenneth Feng" <kenneth.feng@amd.com>,
"Ivan Lipski" <ivan.lipski@amd.com>,
"Alex Hung" <alex.hung@amd.com>,
"Tom Chung" <chiahsuan.chung@amd.com>,
"Melissa Wen" <mwen@igalia.com>,
"Michel Dänzer" <mdaenzer@redhat.com>,
"Fangzhi Zuo" <Jerry.Zuo@amd.com>,
amd-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] drm: introduce page_flip_timeout()
Date: Fri, 23 Jan 2026 16:30:12 -0600
Message-ID: <a8b972be-7265-492f-9855-cdec94a0e0dc@amd.com>
In-Reply-To: <2349754.vFx2qVVIhK@timur-hyperion>
On 1/23/2026 8:44 AM, Timur Kristóf wrote:
> On Friday, January 23, 2026 2:52:44 PM Central European Standard Time
> Christian König wrote:
>> On 1/23/26 01:05, Hamza Mahfooz wrote:
>>> There should be a mechanism for drivers to respond to flip_done
>>> time outs.
>>
>
When there is a display hang, I think that resetting the GPU IP is
really heavy-handed. I second what Alex said: why not just reset DCN
instead? I would think moving DCN into D3 and back out should be enough
if we are trying to recover.
> I am adding Harry and Mario to this email as they are more familiar with this.
>
>> I can only see two reasons why you could run into a timeout:
>>
>> 1. A dma_fence never signals.
>> How that should be handled is already well documented and doesn't require
>> any of this.
>
> Page flip timeouts have nothing to do with fence timeouts.
> A page flip timeout can occur even when all fences of all job submissions
> complete correctly and on time.
>
>>
>> 2. A coding error in the vblank or page flip handler leading to waiting
>> forever. In that case calling back into the driver doesn't help either.
>
> At the moment, a page flip timeout will leave the whole system in a hung state
> and the driver does not even attempt to recover it in any way, it just stops
> doing anything, which is unacceptable and I'm pretty surprised that it was
> left like that for so long.
>
> Note that we have approximately a hundred bug reports open on the drm/amd bug
> tracker about "random" page flip timeouts. It affects a lot of users.
Yeah, I would much rather leave a message in the log saying that this
happened and see a recovery occur than see a hang.
>
>>
>> So as far as I can see the whole approach doesn't make any sense at all.
>
> Actually this approach was proposed as a solution at XDC 2025 in Harry's
> presentation, "DRM calls driver callback to attempt recovery", see page 9 in
> this slide deck:
>
> https://indico.freedesktop.org/event/10/contributions/431/attachments/267/355/2025%20XDC%20Hackfest%20Update%20v1.2.pdf
>
> If you disagree with Harry, please make a counter-proposal.
Hamza - since you seem to have a "workload" that can run overnight and
that this series recovers from, can you try what Alex said and do a
dc_suspend() and dc_resume() on failure?
Make sure you log a message so you know whether it worked.
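For illustration only, the shape of that experiment can be sketched in plain userspace C. This is a mock, not amdgpu code: `struct mock_display`, `mock_display_suspend()`, `mock_display_resume()` and `mock_page_flip_timeout()` are hypothetical stand-ins, and the assumption that a suspend/resume cycle clears the hang is exactly what the overnight test would need to confirm. Unlike the hook in the patch, which returns void, the mock returns a bool so the outcome is easy to check.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for display block state; not a real amdgpu type. */
struct mock_display {
	bool hung;       /* true while the simulated display pipe is stuck */
	bool powered;    /* false between suspend and resume */
	int recoveries;  /* count of successful recovery attempts */
};

/* Stand-in for the suspend step: power the display block down. */
static void mock_display_suspend(struct mock_display *d)
{
	d->powered = false;
}

/*
 * Stand-in for the resume step. ASSUMPTION: powering the block back up
 * clears the hang; real hardware may not behave this way.
 */
static void mock_display_resume(struct mock_display *d)
{
	d->powered = true;
	d->hung = false;
}

/*
 * Shape of a page_flip_timeout-style hook: cycle only the display block,
 * then log the outcome so an overnight run can be audited afterwards.
 * Returns true if the display is usable after the attempt.
 */
static bool mock_page_flip_timeout(struct mock_display *d)
{
	fprintf(stderr, "flip_done timed out, cycling display block\n");
	mock_display_suspend(d);
	mock_display_resume(d);

	if (d->hung || !d->powered) {
		fprintf(stderr, "display recovery failed\n");
		return false;
	}

	d->recoveries++;
	fprintf(stderr, "display recovery succeeded (%d so far)\n",
		d->recoveries);
	return true;
}
```

The point of the sketch is the ordering: suspend, resume, then an explicit log line for both outcomes, so that a silent recovery is distinguishable from a run in which no timeout happened at all.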
>
> Thanks,
> Timur
>
>
>
>>
>>> Since, as it stands it is possible for the display
>>> to stall indefinitely, necessitating a hard reset. So, introduce
>>> a new crtc callback that is called by
>>> drm_atomic_helper_wait_for_flip_done() to give drivers a shot
>>> at recovering from page flip timeouts.
>>>
>>> Signed-off-by: Hamza Mahfooz <someguy@effective-light.com>
>>> ---
>>>
>>> drivers/gpu/drm/drm_atomic_helper.c | 6 +++++-
>>> include/drm/drm_crtc.h | 9 +++++++++
>>> 2 files changed, 14 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
>>> index 5840e9cc6f66..3a144c324b19 100644
>>> --- a/drivers/gpu/drm/drm_atomic_helper.c
>>> +++ b/drivers/gpu/drm/drm_atomic_helper.c
>>> @@ -1881,9 +1881,13 @@ void drm_atomic_helper_wait_for_flip_done(struct drm_device *dev,
>>>  			continue;
>>>
>>>  		ret = wait_for_completion_timeout(&commit->flip_done, 10 * HZ);
>>> -		if (ret == 0)
>>> +		if (!ret) {
>>>  			drm_err(dev, "[CRTC:%d:%s] flip_done timed out\n",
>>>  				crtc->base.id, crtc->name);
>>> +
>>> +			if (crtc->funcs->page_flip_timeout)
>>> +				crtc->funcs->page_flip_timeout(crtc);
>>> +		}
>>>  	}
>>>
>>>  	if (state->fake_commit)
>>>
>>> diff --git a/include/drm/drm_crtc.h b/include/drm/drm_crtc.h
>>> index 66278ffeebd6..45dc5a76e915 100644
>>> --- a/include/drm/drm_crtc.h
>>> +++ b/include/drm/drm_crtc.h
>>> @@ -609,6 +609,15 @@ struct drm_crtc_funcs {
>>>  			 uint32_t flags, uint32_t target,
>>>  			 struct drm_modeset_acquire_ctx *ctx);
>>>
>>> +	/**
>>> +	 * @page_flip_timeout:
>>> +	 *
>>> +	 * This optional hook is called if &drm_crtc_commit.flip_done times out,
>>> +	 * and can be used by drivers to attempt to recover from a page flip
>>> +	 * timeout.
>>> +	 */
>>> +	void (*page_flip_timeout)(struct drm_crtc *crtc);
>>> +
>>>  	/**
>>>  	 * @set_property:
>>>  	 *
>
>
>
>