From: Hamza Mahfooz <someguy@effective-light.com>
To: Mario Limonciello <mario.limonciello@amd.com>
Cc: "Timur Kristóf" <timur.kristof@gmail.com>,
dri-devel@lists.freedesktop.org,
"Christian König" <christian.koenig@amd.com>,
"Alex Deucher" <alexander.deucher@amd.com>,
"David Airlie" <airlied@gmail.com>,
"Simona Vetter" <simona@ffwll.ch>,
"Harry Wentland" <harry.wentland@amd.com>,
"Leo Li" <sunpeng.li@amd.com>,
"Rodrigo Siqueira" <siqueira@igalia.com>,
"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
"Maxime Ripard" <mripard@kernel.org>,
"Thomas Zimmermann" <tzimmermann@suse.de>,
"Sunil Khatri" <sunil.khatri@amd.com>,
"Ce Sun" <cesun102@amd.com>, "Lijo Lazar" <lijo.lazar@amd.com>,
"Kenneth Feng" <kenneth.feng@amd.com>,
"Ivan Lipski" <ivan.lipski@amd.com>,
"Alex Hung" <alex.hung@amd.com>,
"Tom Chung" <chiahsuan.chung@amd.com>,
"Melissa Wen" <mwen@igalia.com>,
"Michel Dänzer" <mdaenzer@redhat.com>,
"Fangzhi Zuo" <Jerry.Zuo@amd.com>,
amd-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] drm: introduce page_flip_timeout()
Date: Sat, 24 Jan 2026 14:49:15 -0500
Message-ID: <aXUiO9EyYgy8dcW8@hal-station>
In-Reply-To: <d327fc1a-0db4-4fcc-aed6-ded53fa28b62@amd.com>
On Sat, Jan 24, 2026 at 12:43:02PM -0600, Mario Limonciello wrote:
>
>
> On 1/24/2026 12:32 PM, Hamza Mahfooz wrote:
> > On Fri, Jan 23, 2026 at 04:30:12PM -0600, Mario Limonciello wrote:
> > > Hamza - since you seem to have a "workload" that can run overnight and this
> > > series recovers, can you try what Alex said and do a dc_suspend() and
> > > dc_resume() for failure?
> > >
> > > Make sure you log a message so you can know it worked.
> >
> > Sure, I'll try something along the lines of:
>
> Generally speaking that looks good, but I'll leave a few comments.
>
> >
> > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c
> > index 492457c86393..bc7abd00f5f4 100644
> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c
> > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c
> > @@ -579,11 +579,28 @@ amdgpu_dm_atomic_crtc_get_property(struct drm_crtc *crtc,
> > }
> > #endif
> >
> > -static void amdgpu_dm_crtc_handle_timeout(struct drm_crtc *crtc)
> > +static void amdgpu_dm_crtc_handle_timeout(struct drm_crtc *crtc,
> > + struct drm_crtc_commit *commit)
> > {
> > struct amdgpu_device *adev = drm_to_adev(crtc->dev);
> > struct amdgpu_reset_context reset_ctx;
> > + struct amdgpu_ip_block *ip_block;
> >
> > + ip_block = amdgpu_device_ip_get_ip_block(adev, AMD_IP_BLOCK_TYPE_DCE);
> > + if (!ip_block)
> > + goto full_reset;
> > +
> > + ip_block->version->funcs->suspend(ip_block);
> > + ip_block->version->funcs->resume(ip_block);
>
> Both of these can fail. Especially considering the page flip timeout could
> be a DCN hang, I think you should check return code for both of them
> sequentially and jump to the full reset condition if either fails.
>
> > +
> > + if (drm_crtc_commit_wait(commit)) {
> > + drm_err(crtc->dev, "suspend-resume failed!\n");
> > + goto full_reset;
> > + }
> > +
>
> At least to prove "this worked" you should log a message "right here" that
> the reset occurred and you recovered. That "might not" be in the final
> version, but I think it's worth having for now.
I have included all of the suggestions in my test run. Fingers crossed
that I don't have to wait too long for a repro.
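
For reference, the control flow with those suggestions folded in can be
modeled roughly as below. This is a standalone userspace sketch under
stated assumptions, not the patch under test: struct display_ops,
handle_flip_timeout() and the stub callbacks are hypothetical stand-ins
for the IP block suspend/resume hooks, drm_crtc_commit_wait() and the
full GPU reset path.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the driver hooks; not the real amdgpu API. */
struct display_ops {
	int (*suspend)(void *ctx);     /* models funcs->suspend(ip_block) */
	int (*resume)(void *ctx);      /* models funcs->resume(ip_block) */
	int (*wait_flip)(void *ctx);   /* models drm_crtc_commit_wait(commit) */
	void (*full_reset)(void *ctx); /* models the full GPU reset fallback */
};

enum recovery { RECOVERED_LIGHT, RECOVERED_FULL };

/*
 * Try the lightweight suspend/resume cycle first. Each step is checked
 * in order, and any failure falls through to the full reset.
 */
static enum recovery handle_flip_timeout(const struct display_ops *ops,
					 void *ctx)
{
	int r;

	r = ops->suspend(ctx);
	if (r) {
		fprintf(stderr, "suspend failed (%d)\n", r);
		goto full_reset;
	}

	r = ops->resume(ctx);
	if (r) {
		fprintf(stderr, "resume failed (%d)\n", r);
		goto full_reset;
	}

	r = ops->wait_flip(ctx);
	if (r) {
		fprintf(stderr, "suspend-resume failed (%d)\n", r);
		goto full_reset;
	}

	/* Temporary message to prove that the lightweight path worked. */
	fprintf(stderr, "page flip timeout: recovered via suspend-resume\n");
	return RECOVERED_LIGHT;

full_reset:
	ops->full_reset(ctx);
	return RECOVERED_FULL;
}

/* Illustrative stubs, so that the fallback logic can be exercised. */
static int stub_ok(void *ctx) { (void)ctx; return 0; }
static int stub_fail(void *ctx) { (void)ctx; return -5; }
static void stub_count_reset(void *ctx) { ++*(int *)ctx; }
```

With every stub returning 0 the handler logs the recovery message and
returns without touching the full reset path. Swapping any one of the
callbacks for stub_fail sends it to the full_reset label instead.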
>
> > + return;
> > +
> > +full_reset:
> > if (amdgpu_device_should_recover_gpu(adev)) {
> > memset(&reset_ctx, 0, sizeof(reset_ctx));
> >
> > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> > index 7175294ccb57..b38c4ee2fc95 100644
> > --- a/drivers/gpu/drm/drm_atomic_helper.c
> > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > @@ -1961,7 +1961,7 @@ void drm_atomic_helper_wait_for_flip_done(struct drm_device *dev,
> > crtc->base.id, crtc->name);
> >
> > if (crtc->funcs->page_flip_timeout)
> > - crtc->funcs->page_flip_timeout(crtc);
> > + crtc->funcs->page_flip_timeout(crtc, commit);
> > }
> > }
> >
> > diff --git a/include/drm/drm_crtc.h b/include/drm/drm_crtc.h
> > index 45dc5a76e915..47a34a05f6de 100644
> > --- a/include/drm/drm_crtc.h
> > +++ b/include/drm/drm_crtc.h
> > @@ -616,7 +616,8 @@ struct drm_crtc_funcs {
> > * and can be used by drivers to attempt to recover from a page flip
> > * timeout.
> > */
> > - void (*page_flip_timeout)(struct drm_crtc *crtc);
> > + void (*page_flip_timeout)(struct drm_crtc *crtc,
> > + struct drm_crtc_commit *commit);
> >
> > /**
> > * @set_property:
>