From: "Christian König" <christian.koenig@amd.com>
To: "Timur Kristóf" <timur.kristof@gmail.com>,
"Alex Deucher" <alexdeucher@gmail.com>,
"Hamza Mahfooz" <someguy@effective-light.com>,
"Michel Dänzer" <michel.daenzer@mailbox.org>
Cc: Mario Limonciello <mario.limonciello@amd.com>,
dri-devel@lists.freedesktop.org,
Alex Deucher <alexander.deucher@amd.com>,
David Airlie <airlied@gmail.com>, Simona Vetter <simona@ffwll.ch>,
Harry Wentland <harry.wentland@amd.com>,
Leo Li <sunpeng.li@amd.com>,
Rodrigo Siqueira <siqueira@igalia.com>,
Maarten Lankhorst <maarten.lankhorst@linux.intel.com>,
Maxime Ripard <mripard@kernel.org>,
Thomas Zimmermann <tzimmermann@suse.de>,
Sunil Khatri <sunil.khatri@amd.com>, Ce Sun <cesun102@amd.com>,
Lijo Lazar <lijo.lazar@amd.com>,
Kenneth Feng <kenneth.feng@amd.com>,
Ivan Lipski <ivan.lipski@amd.com>, Alex Hung <alex.hung@amd.com>,
Tom Chung <chiahsuan.chung@amd.com>,
Melissa Wen <mwen@igalia.com>, Fangzhi Zuo <Jerry.Zuo@amd.com>,
amd-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] drm: introduce page_flip_timeout()
Date: Thu, 29 Jan 2026 13:59:00 +0100
Message-ID: <2f9bc706-02d6-4dec-a56c-53abc5d43f46@amd.com>
In-Reply-To: <2285353.hkbZ0PkbqX@timur-hyperion>
On 1/29/26 13:06, Timur Kristóf wrote:
> On Thursday, January 29, 2026 12:38:30 PM Central European Standard Time
> Christian König wrote:
>>>
>>> However, just like we do with ring timeouts, we also need to be prepared
>>> for the situation where a page flip timeout happens and we should try to
>>> recover from it. And if it isn't recoverable, fall back to GPU reset.
>>
>> No, that is clearly a bad idea.
>
> I don't see why it's "clearly" a bad idea. It's not clear to me at all, please
> clarify it for me.
GPU resets are necessary because we allow userspace to submit Turing-complete programs. Those can mess up the HW state, and we then need a reset to get back into a known working state (like the classic reset signal in electronics).
But in this case, a frozen picture on the screen means the CRTC is still working: power is there, clocks are running, hblank and vblank are happening. This doesn't look like a HW failure at all.
After the input from Michel I'm pretty sure that what we have here is just messed-up SW state: the DC/DM code has no fallback handling, so it not only misses the HW event but also blocks all further page flip requests from userspace, requests which would otherwise resolve the issue.
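To make the suspected failure mode concrete, here is a minimal userspace model of it. This is only a sketch under assumptions, not the DC/DM code: `struct flip_state`, `flip_queue()`, `flip_complete()`, `flip_check_timeout()` and the 100 ms value are all hypothetical names and numbers picked for illustration. Without the fallback function, one missed completion event leaves `pending` set forever and every later flip request is refused.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FLIP_TIMEOUT_MS 100u /* assumed value, for the sketch only */

/* Hypothetical model of per-CRTC flip bookkeeping (not an amdgpu struct). */
struct flip_state {
	bool     pending;     /* a flip was queued and has not completed yet */
	uint64_t deadline_ms; /* when the pending flip is considered lost */
	unsigned timeouts;    /* flips completed by the fallback path */
};

/* Queue a flip; refuse while one is pending, like a driver returning -EBUSY. */
static bool flip_queue(struct flip_state *s, uint64_t now_ms)
{
	if (s->pending)
		return false;
	s->pending = true;
	s->deadline_ms = now_ms + FLIP_TIMEOUT_MS;
	return true;
}

/* Normal path: the HW completion event arrived. */
static void flip_complete(struct flip_state *s)
{
	assert(s->pending); /* completing a flip that was never queued is a model bug */
	s->pending = false;
}

/*
 * Fallback path: if the event was missed, complete the flip in software
 * once the deadline has passed, so later flip requests are accepted again.
 * Returns true when a timeout was handled.
 */
static bool flip_check_timeout(struct flip_state *s, uint64_t now_ms)
{
	if (!s->pending || now_ms < s->deadline_ms)
		return false;
	s->pending = false;
	s->timeouts++;
	return true;
}
```

In this model, a flip queued at t=0 blocks a second request at t=10, and only after `flip_check_timeout()` runs at or after t=100 does the next `flip_queue()` succeed. No reset of any kind is involved, only the software state is repaired.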
>> CRTCs are fixed function devices; that a GPU
>> reset helps here is just pure coincidence.
>
> Currently, the driver doesn't handle page flip timeouts at all, which means
> that if it happens, there is 0% chance of recovering from it.
Yeah, and I completely agree that this is the absolute worst thing we can do.
> If the GPU reset improves that chance to non-zero, it's already an
> improvement, and already more than what AMD did to address this problem for
> the past few years. I just find it incredibly disrespectful towards the
> community that AMD proposes a solution that they neglect to implement, then
> when somebody from the community steps up to implement it, it's rejected.
Well, I only heard about this problem a few days ago.
>> What we can certainly do is to improve the error handling, e.g. that the
>> system doesn't sit there forever after a page flip timeout.
>
> Sure.
>
>>
>> Let's maybe try a completely different approach. We force a page flip timeout,
>> and see if the system can handle that or not.
>>
>> E.g. on every 300th page flip we just fail to signal and see if things still work
>> after the timeout.
>
> How do you propose to do that?
I need to dig a bit into the DAL/DC code and see how the signaling path actually works.
I'm going to give that a try tomorrow.
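Roughly, the injection side could look like the sketch below. It is a standalone userspace illustration under assumptions, not a patch: `struct flip_fault_injector`, `flip_should_signal()` and `count_signaled()` are hypothetical names, and where such a check would hook into the real signaling path is exactly what still needs to be figured out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DROP_EVERY_N 300u /* interval from the proposal quoted above */

static_assert(DROP_EVERY_N > 0, "interval is used as a modulus");

/* Hypothetical injector state; not an existing DRM or amdgpu structure. */
struct flip_fault_injector {
	uint32_t count;   /* flip completions seen so far */
	uint32_t dropped; /* completions deliberately suppressed */
};

/*
 * Returns true when the completion event should be delivered and false
 * when it should be swallowed to provoke a page flip timeout.
 */
static bool flip_should_signal(struct flip_fault_injector *fi)
{
	fi->count++;
	if (fi->count % DROP_EVERY_N == 0) {
		fi->dropped++;
		return false;
	}
	return true;
}

/* Helper for experiments: run a number of flips, count delivered events. */
static uint32_t count_signaled(struct flip_fault_injector *fi, uint32_t flips)
{
	uint32_t delivered = 0;

	for (uint32_t i = 0; i < flips; i++)
		if (flip_should_signal(fi))
			delivered++;
	return delivered;
}
```

The interesting part is then not the injector itself but what happens afterwards: whether userspace gets unblocked after the timeout and the next flip goes through.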
Regards,
Christian.