AMD-GFX Archive on lore.kernel.org
From: "Timur Kristóf" <timur.kristof@gmail.com>
To: amd-gfx@lists.freedesktop.org, Leo Li <sunpeng.li@amd.com>
Cc: Harry.Wentland@amd.com, Aurabindo.Pillai@amd.com,
	mario.limonciello@amd.com, wiagn233@outlook.com,
	sysdadmin@m1k.cloud
Subject: Re: [PATCH] drm/amd/display: Use vline2 interrupt on DCN instead of vstartup
Date: Thu, 07 May 2026 00:03:03 +0200	[thread overview]
Message-ID: <2381512.vFx2qVVIhK@timur-hyperion> (raw)
In-Reply-To: <d988ded3-92e5-4900-b6d1-887a96891852@amd.com>

On Wednesday, May 6, 2026 10:00:12 PM Central European Summer Time Leo Li 
wrote:
> On 2026-05-04 16:54, Timur Kristóf wrote:
> > On Monday, May 4, 2026 8:36:49 PM Central European Summer Time
> > sunpeng.li@amd.com wrote:
> >> From: Leo Li <sunpeng.li@amd.com>
> >> 
> >> [Why]
> >> 
> >> VStartup is an OTG event that fires when the pixel pipeline prepares for
> >> pixel scanout of the next frame. It was previously used to deliver
> >> vblank events for commits that do not trigger a fb address update, and
> >> hence a pflip interrupt (hw cursor updates, for example).
> >> 
> >> The issue with vstartup is that HW can mask the interrupt in cases where
> >> idle optimizations are enabled or when a HW lock is active. This could
> >> explain the range of flip_done timeouts frequently seen in the wild.
> > 
> > Can you help me understand how that could happen with vstartup?
> > Specifically, what is a "HW lock" and when is it active?
> 
> Hi Timur,
> 
> I should've prefaced this patch to say that this is a theoretical fix. I
> haven't been able to reproduce the timeout issues myself, and this patch
> came out of internal discussions with folks more familiar with the HW. I
> don't think this will fix *all* cases of flip_done timeouts, but it may
> address some of them.

I see.
Yeah, I've only very rarely seen that issue myself. Seems that the bug avoids 
driver devs, but it's very popular among end users.

> 
> (But timeouts aside, we *should* transition to vline since it's more
> reliable than vstartup.)

I agree.

> 
> To answer your questions: depending on the DCN generation, there can be a
> few things that affect vstartup firing:
> 
> * DPG - DCN can Dynamically Power Gate parts of the display pipe when a
>   self-refresh capable eDP is connected. DPG is engaged when there are
>   enough static frames (detected through drm_vblank_off). Once gated, even
>   though the OTG (output timing generator) is still enabled, vstartup is
>   masked. vline is unaffected.
> 
> * GSL - Driver can use the Global Sync Lock to block HW from latching onto
>   double-buffered registers during programming, to prevent HW from latching
>   onto a partially programmed state. This will mask vstartup, but vline is
>   unaffected. See dcn20_pipe_control_lock().
> 
> * MALL - A DCN-accessible cache introduced in DCN32+ DGPUs that can store fb
>   data to allow for longer DRAM sleep. When scanning out from MALL, vstartup
>   is masked; vline is unaffected.
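
For illustration only, the three conditions listed above can be modelled as a
small predicate. This is a sketch and not driver code: struct otg_state and
both function names are hypothetical, and the sketch assumes the masking is
unconditional whenever one of the conditions holds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of display-pipe state; not a real amdgpu struct. */
struct otg_state {
	bool otg_enabled;  /* output timing generator is running */
	bool dpg_gated;    /* pipe is dynamically power gated (self-refresh eDP) */
	bool gsl_locked;   /* Global Sync Lock held during register programming */
	bool mall_scanout; /* scanning out from MALL */
};

/* vstartup is delivered only if the OTG runs and no masking condition holds. */
static bool vstartup_fires(const struct otg_state *s)
{
	assert(s != NULL);
	return s->otg_enabled && !s->dpg_gated && !s->gsl_locked &&
	       !s->mall_scanout;
}

/* Per the description above, vline depends only on the OTG being active. */
static bool vline_fires(const struct otg_state *s)
{
	assert(s != NULL);
	return s->otg_enabled;
}
```

In this model, a state with otg_enabled and mall_scanout both set makes
vstartup_fires() return false while vline_fires() still returns true, which is
the gap the patch is trying to close.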

Thanks for the explanation.
Just one more question: does DCN always mask the VSTARTUP interrupt under 
those conditions or is that configurable?

> 
> > Many users have experienced flip_done timeouts while playing games.
> > In that scenario, would any idle optimization be enabled or is there a "HW
> > lock"?
> 
> If the game stops submitting frames for ~15 refresh cycles, it's possible
> that PSR kicks in. Though I know there are plenty of reporters running on
> external displays without PSR support. If it's DGPUs, it's very likely due to MALL.
> A reporter I was debugging with said disabling MALL showed good results[1].
> If it's an APU with an external monitor, then that's less clear.
> 
> A lot of the reporters seem to be running Phoenix (DCN314), with a common
> symptom of DMUB timing out[2]. If a self-refresh panel is involved, then I'm
> curious if this vline2 patch would help. Hamza's recent patch[3] that
> enables various levels of reset may help to mitigate, but it doesn't fix
> the root-cause. I'm planning a branch with this patch and [3], along with
> debug dumps on flip_done timeouts for reporters to try.
> 

That's very nice to hear. I'm crossing my fingers that it works out.

> [1] https://lore.kernel.org/amd-gfx/e415c38b-4102-40e4-a195-0256caf34802@m1k.cloud/
> [2] https://gitlab.freedesktop.org/drm/amd/-/work_items/4831
> [3] https://lore.kernel.org/lkml/20260505182105.420525-2-someguy@effective-light.com/
> 
> >> DCN hardware provides 3 generic OTG interrupts that can be programmed
> >> to fire on a specific line. Vline 0 and 1 are currently reserved, with
> >> vline2 available to use for event delivery. These interrupts cannot be
> >> masked, as long as the OTG is active.
> >> 
> >> [How]
> >> 
> >> Switch to vline2 for vblank handling. Today, DC will program the
> >> vline2 position at vupdate -- the point at which HW latches to
> >> double-buffered registers.
> >> 
> >> Since all the vline interrupt types share the same interrupt src_id,
> >> refactor the existing vline0 infrastructure to allow for all the vline0,
> >> 1, and 2 types.
> >> 
> >> Since this is intended to replace vstartup for DCN, use the same handler
> >> logic, but be careful to leave DCE on vstartup.
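
As a rough sketch of the refactor described above (one interrupt src_id shared
by vline0, 1 and 2), a handler table indexed by OTG instance and vline type
might look like the following. All names here are hypothetical, MAX_OTG is an
assumed value, and how the vline type is actually decoded from the interrupt
is not stated in the patch description, so type_id is simply passed in.

```c
#include <assert.h>
#include <stddef.h>

enum vline_type { VLINE0, VLINE1, VLINE2, VLINE_TYPE_COUNT };

#define MAX_OTG 6 /* assumption: number of OTG instances */

typedef void (*vline_handler_t)(unsigned int otg_inst);

/* One slot per (OTG instance, vline type); NULL means no handler. */
static vline_handler_t handlers[MAX_OTG][VLINE_TYPE_COUNT];

/* Example handler: counts vblank events that would be delivered. */
static unsigned int vblank_events;

static void deliver_vblank_event(unsigned int otg_inst)
{
	(void)otg_inst;
	vblank_events++;
}

/* Register a handler for one vline type on one OTG; returns 0 or -1. */
static int register_vline_handler(unsigned int otg, enum vline_type type,
				  vline_handler_t fn)
{
	assert(fn != NULL);
	if (otg >= MAX_OTG || (unsigned int)type >= VLINE_TYPE_COUNT)
		return -1;
	handlers[otg][type] = fn;
	return 0;
}

/*
 * All vline types arrive on the same src_id, so the dispatcher selects the
 * handler by type. Returns 1 if a handler ran, 0 if none is registered,
 * -1 on an out-of-range index.
 */
static int dispatch_vline(unsigned int otg, unsigned int type_id)
{
	if (otg >= MAX_OTG || type_id >= VLINE_TYPE_COUNT)
		return -1;
	if (handlers[otg][type_id] == NULL)
		return 0;
	handlers[otg][type_id](otg);
	return 1;
}
```

With deliver_vblank_event registered for VLINE2 on OTG 0, a dispatch for
VLINE2 runs the handler, while a dispatch for VLINE0 on the same OTG is a
no-op because that slot is left empty.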
> > 
> > Why not also switch DCE?
> > Does DCE not have the vline interrupts or does it not have the same issue
> > with the vstartup interrupt?
> 
> I didn't want to touch DCE since I don't have information on how these
> interrupts behave on them, and I didn't want to regress anything. Would need
> to do some digging to find out.
> 

Do we have any reports of these page flip timeouts on DCE?
Maybe it's better to leave DCE well enough alone if the issue doesn't exist 
there. (I have never seen one, but that doesn't mean it doesn't exist.)

Best regards,
Timur
