Intel-XE Archive on lore.kernel.org
* [BUG] xe: Lunar Lake hard freeze on USB-C DP-altmode (MSI MPG 272URX), TC1 PHY pageflip deadlock, no kernel error before lockup
@ 2026-04-19  0:57 Marc
  2026-04-28  8:10 ` Jani Nikula
  0 siblings, 1 reply; 2+ messages in thread
From: Marc @ 2026-04-19  0:57 UTC (permalink / raw)
  To: intel-xe; +Cc: dri-devel


  Hi,


  Reporting a reproducible hard freeze on Lunar Lake when driving an external
  DP 2.1 UHBR20 monitor (MSI MPG 272URX QD-OLED, 4K@240Hz) over USB-C DP
  alternate mode. Happy to run debug kernels, collect traces, or bisect.



  === Hardware ===

  - Lenovo Yoga Slim 7 14ILL10 (83JX), Intel Core Ultra / Lunar Lake
  - GPU: 0000:00:02.0, device ID 8086:64a0, display version 20.00 B0
  - Monitor: MSI MPG 272URX QD-OLED, 3840x2160@240Hz, DP 2.1 UHBR20 + DSC
  - Connection: native DP-altmode via USB-C (no dock, no hub); laptop has
    no native DP connector. All historic errors land on CONNECTOR DP-1 /
    ENCODER DDI TC1 / PHY TC1.
  - Laptop also has native HDMI-A-1 (TMDS, capped 4K60), unaffected.



  === Software ===

  - Bazzite (Fedora 43 rpm-ostree), kernel 6.17.7-ba29.fc43.x86_64
  - xe 1.1.0, DMC xe2lpd_dmc.bin v2.29, GuC lnl_guc_70.bin 70.58.0,
    HuC lnl_huc.bin 9.4.13, GSC lnl_gsc_1.bin 104.0.5.1429
  - Compositor: KDE Plasma / kwin_wayland
  - Current kernel cmdline mitigations (see below):
      xe.psr_safest_params=Y xe.enable_psr2_sel_fetch=N xe.enable_sagv=0
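
  To confirm which xe.* parameters a given boot actually used, the command
  line can be parsed directly. This is a minimal sketch, not part of the
  original setup; the function name xe_params is made up for illustration.

```python
def xe_params(cmdline: str) -> dict[str, str]:
    """Return the xe.* module parameters found on a kernel command line.

    Hypothetical helper: keys are returned without the "xe." prefix.
    """
    params = {}
    for token in cmdline.split():
        # Only key=value tokens that target the xe module are of interest.
        if token.startswith("xe.") and "=" in token:
            key, value = token.split("=", 1)
            params[key[len("xe."):]] = value
    return params
```

  Feeding it the contents of /proc/cmdline on the affected machine should
  list the three mitigations above.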



  === Two distinct failure signatures observed ===



  Bug A — UHBR20 link training (no mitigation applied):

    xe 0000:00:02.0: [drm] drm_WARN_ON("UHBR20 not supported for the platform")
    intel_dp_retrain_link / intel_dp_link_check
    pipe state doesn't match: port_clock expected 2000000 found 162000
    GT0/GT1: TLB invalidation fence timeout
    i2c_designware + Bluetooth hci0 tx timeout cascade
    → whole-platform stall within seconds of connect.



  Variants seen across boots:
    - "Interlane align timeout"
    - "Can't reduce link training parameters after failure"
    - "Failed to start 128b/132b TPS2 CDS"
    - "Timeout waiting for DDI BUF A to get active"
    - "CPU pipe B FIFO underrun"



  Workaround (avoids Bug A at runtime): for each connected DP connector, write
    /sys/kernel/debug/dri/.../DP-1/i915_dp_force_link_rate = 810000  (HBR3)
    /sys/kernel/debug/dri/.../DP-1/i915_dp_force_link_retrain = 1
  via a udev rule on hot-plug and a systemd unit at boot. Capping the link at
  HBR3 eliminates the UHBR20 warning and the link-training cascade.
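
  The two debugfs writes can be wrapped in a small helper that a udev rule
  or systemd unit calls. This is a sketch only: cap_link_rate is a
  hypothetical name, the connector directory must be supplied by the caller
  (its full path is elided above), and writing to debugfs needs root.

```python
from pathlib import Path

HBR3_KHZ = 810000  # value used in the workaround above


def cap_link_rate(connector_dir: Path, rate_khz: int = HBR3_KHZ) -> None:
    """Force a link rate on one DP connector, then request a retrain.

    connector_dir is the connector's debugfs directory (the DP-1
    directory in the workaround); it is an input, not something this
    sketch tries to discover.
    """
    # Order matters: set the forced rate first, then trigger retraining.
    (connector_dir / "i915_dp_force_link_rate").write_text(f"{rate_khz}\n")
    (connector_dir / "i915_dp_force_link_retrain").write_text("1\n")
```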



  Bug B — silent pageflip deadlock at HBR3 (after Bug A mitigated):

    After 10 minutes to 6 hours of use, the monitor freezes mid-frame with
    the image intact (scanout still running, flip queue wedged). The only
    kernel-adjacent signal recovered from the journal:
      kwin_wayland: Pageflip timed out! This is a bug in the xe kernel driver
    Sometimes followed by:
      GT0/GT1: TLB invalidation fence timeout
    Then the platform is completely locked: no SysRq, no SSH, no journal
    flush. Kernel dies before anything useful is written.



  Across ~20 captured reboots the errors are consistent: every one lands
  on CONNECTOR DP-1 / ENCODER DDI TC1 / PHY TC1.
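
  Sorting saved journals into the two signatures can be scripted. The
  sketch below matches only the message fragments quoted in this report;
  the exact wording in a real kernel log may differ, and the names
  classify, BUG_A, BUG_B and TLB_TIMEOUT are made up for illustration.

```python
# Message fragments copied from this report (not from kernel sources).
BUG_A = (
    "UHBR20 not supported for the platform",
    "Interlane align timeout",
    "Can't reduce link training parameters after failure",
    "Failed to start 128b/132b TPS2 CDS",
    "Timeout waiting for DDI BUF A to get active",
    "CPU pipe B FIFO underrun",
)
BUG_B = ("Pageflip timed out!",)
TLB_TIMEOUT = ("TLB invalidation fence timeout",)  # seen with both bugs


def classify(log_text: str) -> dict[str, int]:
    """Count journal lines that match each signature group."""
    counts = {"bug_a": 0, "bug_b": 0, "tlb_timeout": 0}
    for line in log_text.splitlines():
        if any(s in line for s in BUG_A):
            counts["bug_a"] += 1
        if any(s in line for s in BUG_B):
            counts["bug_b"] += 1
        if any(s in line for s in TLB_TIMEOUT):
            counts["tlb_timeout"] += 1
    return counts
```

  A boot with bug_a == 0 and bug_b > 0 would match the HBR3-capped case
  described under Bug B.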



  === Hypothesis ===
  A TC1 PHY power-state transition deadlock on the USB-C DP-altmode port
  during link maintenance (recalibration / SSC / power-state change
  mid-stream). The pageflip-stuck-with-image-intact symptom argues against
  a DSC or link-bandwidth failure (which would blank or corrupt the image)
  and is consistent with a PHY-level deadlock while the scanout engine is
  still running.



  === What we ruled out ===

  - DMC: xe2lpd_dmc.bin v2.29 loads clean every boot.
  - GuC: lnl_guc_70.bin 70.58.0 clean, no CT-dead, no GPU reset, no GuC
    hang in any boot log.
  - PSR: external monitor reports
    "Sink support: PSR = no, Panel Replay = no" at
    /sys/kernel/debug/dri/.../DP-1/i915_psr_status. No PSR running on
    this link. xe.psr_safest_params=Y and xe.enable_psr2_sel_fetch=N are
    no-ops here. Kept only because they're harmless on eDP.
  - DSC: tried forcing DSC on, worsened stability. Removed. Symptom
    (frozen image intact) is inconsistent with DSC failure.
  - MCE / OOM / thermal / NMI / RCU stall / NVMe / PCIe AER: all clean.
  - Windows on same hardware + same monitor + same USB-C cable: no
    freezes, full 4K@240 UHBR20 works indefinitely. Hardware and cable
    are not at fault.
  - AnyDesk / userspace daemons: installed after most freezes; not causal.
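
  The PSR check in the list above can be automated by parsing the
  "Sink support:" line of i915_psr_status. A minimal sketch, assuming the
  line keeps the "name = yes/no" layout quoted above; sink_support is a
  hypothetical helper name.

```python
def sink_support(status_text: str) -> dict[str, bool]:
    """Parse the 'Sink support:' line into {feature name: supported}.

    Returns an empty dict if no such line is present.
    """
    for line in status_text.splitlines():
        if line.strip().startswith("Sink support:"):
            fields = line.split(":", 1)[1]
            result = {}
            for item in fields.split(","):
                name, _, value = item.partition("=")
                # Treat anything starting with "yes" as supported, so a
                # trailing annotation after "yes" does not break the parse.
                result[name.strip()] = value.strip().lower().startswith("yes")
            return result
    return {}
```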



  === Monitor OSD ===

  MSI MPG 272URX does not expose a "DP 1.4 / 2.1" toggle in OSD
  (confirmed on firmware shipping to end users), so host-side fallback
  to DP 1.4 is the only available workaround. The HBR3 cap via debugfs
  works at runtime; the module-param route (xe.* params) is read-only
  (0444) at runtime and only settable at boot.
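
  For context on what the HBR3 fallback costs, here is a rough bandwidth
  estimate. Everything in it is an assumption layered on the report: four
  lanes, 24 bits per pixel, active pixels only (blanking ignored, so the
  real requirement is higher), and only channel-coding overhead counted.
  It says nothing about which mode or compression the driver picks.

```python
def payload_gbps(lane_rate_gbps: float, lanes: int, efficiency: float) -> float:
    """Link payload rate after channel coding, in Gbit/s."""
    return lane_rate_gbps * lanes * efficiency


def active_pixel_gbps(width: int, height: int, refresh_hz: int, bpp: int) -> float:
    """Uncompressed data rate of the visible pixels only, in Gbit/s."""
    return width * height * refresh_hz * bpp / 1e9


# Assumed: 4 lanes. HBR3 uses 8b/10b coding, UHBR20 uses 128b/132b.
HBR3 = payload_gbps(8.1, 4, 8 / 10)
UHBR20 = payload_gbps(20.0, 4, 128 / 132)

# Assumed: 24 bits per pixel for the 3840x2160@240Hz mode.
NEEDED = active_pixel_gbps(3840, 2160, 240, 24)

print(f"HBR3 x4:   {HBR3:.2f} Gbit/s")
print(f"UHBR20 x4: {UHBR20:.2f} Gbit/s")
print(f"4K@240:    {NEEDED:.2f} Gbit/s (active pixels only)")
```

  Under these assumptions the uncompressed mode fits UHBR20 but not HBR3,
  so the capped link presumably runs with DSC or a reduced mode.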



  === What I can provide ===

  - Full dmesg from a boot with reproduction.
  - Boots with drm.debug=0x1e if useful.
  - Willing to bisect between 6.16/6.17/6.18 on the TC PHY / xe display
    paths.
  - Willing to test patches against 6.17 or mainline.
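
  For reference, the drm.debug=0x1e mask offered above can be decoded into
  its DRM debug categories. A small sketch; the bit values follow the
  kernel's drm_debug_category enum, and decode_drm_debug is a hypothetical
  helper name.

```python
# Bit values of the DRM debug categories (DRM_UT_* in the kernel).
DRM_DEBUG_BITS = {
    0x001: "CORE",
    0x002: "DRIVER",
    0x004: "KMS",
    0x008: "PRIME",
    0x010: "ATOMIC",
    0x020: "VBL",
    0x040: "STATE",
    0x080: "LEASE",
    0x100: "DP",
    0x200: "DRMRES",
}


def decode_drm_debug(mask: int) -> list[str]:
    """Return the category names enabled by a drm.debug bitmask."""
    return [name for bit, name in DRM_DEBUG_BITS.items() if mask & bit]


print(decode_drm_debug(0x1E))
```

  A mask of 0x1e enables DRIVER, KMS, PRIME and ATOMIC; adding 0x100 would
  also enable the DP category.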



  Thanks for any pointers — happy to file at

  https://gitlab.freedesktop.org/drm/xe/kernel/-/issues if preferred.



  -- Marc


* Re: [BUG] xe: Lunar Lake hard freeze on USB-C DP-altmode (MSI MPG 272URX), TC1 PHY pageflip deadlock, no kernel error before lockup
From: Jani Nikula @ 2026-04-28  8:10 UTC (permalink / raw)
  To: Marc, intel-xe; +Cc: dri-devel

On Sun, 19 Apr 2026, Marc <marclecussano@gmail.com> wrote:
>   Thanks for any pointers — happy to file at
>
>   https://gitlab.freedesktop.org/drm/xe/kernel/-/issues if preferred.

Yes, please. See [1].

BR,
Jani.


[1] https://drm.pages.freedesktop.org/intel-docs/how-to-file-i915-bugs.html


-- 
Jani Nikula, Intel
