From: Matthew Brost <matthew.brost@intel.com>
To: "Marek Marczykowski-Górecki" <marmarek@invisiblethingslab.com>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>,
xen-devel <xen-devel@lists.xenproject.org>,
intel-xe@lists.freedesktop.org, jani.nikula@intel.com
Subject: Re: Graphical glitches (not refreshing?) with Linux's xe driver + Xen 4.19
Date: Wed, 17 Jun 2026 16:25:14 -0700
Message-ID: <ajMs2lkXmTppifK7@gsse-cloud1.jf.intel.com>
In-Reply-To: <ajMD0Jsml3ytlWOY@mail-itl>
On Wed, Jun 17, 2026 at 10:30:08PM +0200, Marek Marczykowski-Górecki wrote:
> On Mon, Mar 02, 2026 at 12:19:04PM +0100, Marek Marczykowski-Górecki wrote:
> > On Tue, Feb 24, 2026 at 04:58:25PM +0100, Marek Marczykowski-Górecki wrote:
> > > On Fri, Feb 13, 2026 at 02:23:06AM +0100, Marek Marczykowski-Górecki wrote:
> > > > On Thu, Feb 12, 2026 at 04:11:50PM +0100, Roger Pau Monné wrote:
> > > > > On Tue, Feb 10, 2026 at 07:06:20PM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > Hi,
> > > > > >
> > > > > > Recently I started testing compatibility with Intel Lunar Lake. This is
> > > > > > the first one that uses "xe" instead of "i915" Linux driver for iGPU.
> > > > > > I test it with Qubes OS 4.3, which uses Xen 4.19.4 and PV dom0 running
> > > > > > Linux 6.17.9 in this test.
> > > > >
> > > > > Not sure it's going to help a lot, but does using a PVH dom0 make any
> > > > > difference?
> > > >
> > > > Ok, now with the correct Xen version, it's better with PVH dom0. At
> > > > least on the login screen and few applications (from both dom0 and domU)
> > > > I don't see the glitches anymore. I can't do a full test, because PCI
> > > > passthrough doesn't seem to work with PVH dom0 on Xen 4.19 - and I need
> > > > it to start most VMs.
> > > >
> > > > So, if the above test is representative, it's only about PV dom0.
> > >
> > > Some further observations:
> > >
> > > 1. My initial impression that Xen 4.17.6 is not affected is false.
> > > Apparently I got lucky and didn't wait long enough for glitches to
> > > appear. Unfortunately this means I have no way to bisect this...
> > >
> > > 1a. Updated test procedure - either:
> > > - start Qubes OS in full (including default system domUs) and try to
> > > open an app in one of them (for example file manager or pdf viewer)
> > > - start Linux up to lightdm login page, log in, log out, click on a
> > > few lightdm menus (session type selector, poweroff menu etc)
> > >
> > > The second version works even if toolstack version in dom0 doesn't match
> > > Xen version. If no glitches are observed after doing either of those
> > > procedures, assume it's good.
> > >
> > > 2. Xen staging is affected too. As well as Xen staging-4.19 without
> > > any qubes patches.
> > >
> > > 3. After enabling CONFIG_DEBUG in Xen, the xe.ko fails to load firmware:
> > >
> > > xe 0000:00:02.0: [drm] Tile0: GT0: Using GuC firmware from xe/lnl_guc_70.bin version 70.53.0
> > > xe 0000:00:02.0: [drm] *ERROR* Tile0: GT0: load failed: status = 0x40000056, time = 0ms, freq = 1850MHz (req 1850MHz), done = -1
> > > xe 0000:00:02.0: [drm] *ERROR* Tile0: GT0: load failed: status: Reset = 0, BootROM = 0x2B, UKernel = 0x00, MIA = 0x00, Auth = 0x01
> > > xe 0000:00:02.0: [drm] *ERROR* Tile0: GT0: firmware production part check failure
> > > xe 0000:00:02.0: [drm] *ERROR* Tile0: GT0: Failed to initialize uC (-EPROTO)
> > > xe 0000:00:02.0: probe with driver xe failed with error -71
> > >
> > > CONFIG_DEBUG is the only change between "xe.ko loads fine but there are
> > > glitches later on" and "xe.ko fails to load at all". Full console logs:
> > > https://gist.github.com/marmarek/47b5e62a2cdbae6678c2aecc5283cd3f, there
> > > are 3 files:
> > > - CONFIG_DEBUG=n
> > > - CONFIG_DEBUG=y
> > > - CONFIG_DEBUG=y + iommu=debug
> > >
> > > 4. Updating to Linux 7.0-rc1 doesn't help, for example:
> > > https://openqa.qubes-os.org/tests/168119#step/desktop_linux_manager_create_qube/11
> > >
> > > Generally, it does feel like a bug in xe.ko, but I can't exclude some issue
> > > on Xen side too (especially given point 3 above).
> >
> > After waiting some time (Linux 6.19.5 this time), Xen CONFIG_DEBUG=n, I get some timeout messages:
> >
> > [ 8.122120] xe 0000:00:02.0: [drm] [ENCODER:204:DDI A/PHY A] failed to retrieve link info, disabling eDP
> > [ 8.148476] xe 0000:00:02.0: [drm] Tile0: GT0: Using GuC firmware from xe/lnl_guc_70.bin version 70.53.0
> > [ 8.803845] xe 0000:00:02.0: [drm] Tile0: GT0: ccs1 fused off
> > [ 8.804208] xe 0000:00:02.0: [drm] Tile0: GT0: ccs2 fused off
> > [ 8.804556] xe 0000:00:02.0: [drm] Tile0: GT0: ccs3 fused off
> > [ 8.822426] xe 0000:00:02.0: [drm] Tile0: GT1: Using GuC firmware from xe/lnl_guc_70.bin version 70.53.0
> > [ 8.827140] xe 0000:00:02.0: [drm] Tile0: GT1: Using HuC firmware from xe/lnl_huc.bin version 9.4.13
> > [ 8.829478] xe 0000:00:02.0: [drm] Tile0: GT1: Using GSC firmware from xe/lnl_gsc_1.bin version 104.0.5.1429
> > [ 8.852923] xe 0000:00:02.0: [drm] Tile0: GT1: vcs1 fused off
> > [ 8.853513] xe 0000:00:02.0: [drm] Tile0: GT1: vcs2 fused off
> > [ 8.854090] xe 0000:00:02.0: [drm] Tile0: GT1: vcs3 fused off
> > [ 8.854706] xe 0000:00:02.0: [drm] Tile0: GT1: vcs4 fused off
> > [ 8.855310] xe 0000:00:02.0: [drm] Tile0: GT1: vcs5 fused off
> > [ 8.855904] xe 0000:00:02.0: [drm] Tile0: GT1: vcs6 fused off
> > [ 8.856495] xe 0000:00:02.0: [drm] Tile0: GT1: vcs7 fused off
> > [ 8.857079] xe 0000:00:02.0: [drm] Tile0: GT1: vecs1 fused off
> > [ 8.857675] xe 0000:00:02.0: [drm] Tile0: GT1: vecs2 fused off
> > [ 8.858272] xe 0000:00:02.0: [drm] Tile0: GT1: vecs3 fused off
> > [ 8.975881] xe 0000:00:02.0: [drm] Registered 3 planes with drm panic
> > [ 8.976586] [drm] Initialized xe 1.1.0 for 0000:00:02.0 on minor 0
> > [ 8.980882] ACPI: video: Video Device [GFX0] (multi-head: yes rom: no post: no)
> > [ 9.033754] xe 0000:00:02.0: [drm] Tile0: GT1: found GSC cv104.1.0
> > ...
> > [ 1218.319232] xe 0000:00:02.0: [drm] Tile0: GT0: Engine reset: engine_class=rcs, logical_mask: 0x1, guc_id=3
> > [ 1218.319890] xe 0000:00:02.0: [drm] Tile0: GT0: Timedout job: seqno=9883, lrc_seqno=9883, guc_id=3, flags=0x0 in Xorg [3245]
> > [ 1218.320736] xe 0000:00:02.0: [drm] Xe device coredump has been created
> > [ 1218.321140] xe 0000:00:02.0: [drm] Check your /sys/class/drm/card0/device/devcoredump/data
> > [ 1222.285626] xe 0000:00:02.0: [drm] *ERROR* [CRTC:88:pipe A] flip_done timed out
> > [ 1232.525685] xe 0000:00:02.0: [drm] *ERROR* flip_done timed out
> > [ 1232.526280] xe 0000:00:02.0: [drm] *ERROR* [CRTC:88:pipe A] commit wait timed out
> > [ 1242.765717] xe 0000:00:02.0: [drm] *ERROR* [CRTC:88:pipe A] flip_done timed out
> > [ 1253.005696] xe 0000:00:02.0: [drm] *ERROR* flip_done timed out
> > [ 1253.006248] xe 0000:00:02.0: [drm] *ERROR* [CRTC:88:pipe A] commit wait timed out
> > [ 1263.245599] xe 0000:00:02.0: [drm] *ERROR* [CRTC:88:pipe A] flip_done timed out
> >
> > The glitches appear much earlier, though.
> > Would content of /sys/class/drm/card0/device/devcoredump/data be useful
> > for debugging this?
Yes, it would. A hanging job can be caused by a bug anywhere in the
stack (hardware, KMD, UMD, or the application), but the devcoredump
would give us some hints.
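For reference, a minimal sketch of copying the dump into a regular file
so it can be attached to a report. The source path is the one printed in
the dmesg above; the function name and the output file name are
hypothetical, and reading the sysfs file will likely need root.

```python
import shutil
from pathlib import Path

# Path as printed by the xe driver in the log above; card0 is an
# assumption that holds only if the iGPU is the first DRM device.
DEFAULT_SRC = Path("/sys/class/drm/card0/device/devcoredump/data")


def save_devcoredump(src=DEFAULT_SRC, dst="xe-devcoredump.txt"):
    """Copy the devcoredump to a regular file and return its size in bytes."""
    src, dst = Path(src), Path(dst)
    # Stream the copy in binary mode; the dump can be large.
    with src.open("rb") as fin, dst.open("wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst.stat().st_size
```

Calling save_devcoredump() with no arguments right after the "Xe device
coredump has been created" message should be enough; the resulting file
can then be compressed and attached.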
> >
> > Full log at https://openqa.qubes-os.org/tests/168813/file/serial0.txt
> > (warning, almost 200MB of those errors...)
>
> The issue still happens with Linux 7.0.12. Current log (quite similar to
> the previous one):
> https://openqa.qubes-os.org/tests/184602/logfile?filename=serial0.txt
Hmm, the 'not started' messages in the dmesg are a bit concerning as
this really shouldn't be possible to trigger even if user space is doing
something wrong.
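Since the serial logs are large, a condensed view of which xe messages
repeat and how often would make them easier to read. A rough sketch,
assuming the log lines look like the ones quoted above; the function
name and the default device string are illustrative only:

```python
import re
from collections import Counter

# Collapse decimal and hex numbers so that lines differing only in
# seqno, guc_id, CRTC id, etc. land in the same bucket.
NUMBER = re.compile(r"\b(?:0x[0-9a-fA-F]+|\d+)\b")


def summarize_xe_messages(lines, needle="xe 0000:00:02.0: [drm]"):
    """Count xe driver messages, grouped by text with numbers masked."""
    counts = Counter()
    for line in lines:
        if needle not in line:
            continue
        # Keep only the part after the device prefix; this also drops
        # the kernel timestamp in front of it.
        message = line.split(needle, 1)[1].strip()
        counts[NUMBER.sub("N", message)] += 1
    return counts
```

To use it on a saved log, open the file with errors="replace", pass the
file object as lines, and print counts.most_common(20).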
Can you file a GitLab issue against Xe here:
https://gitlab.freedesktop.org/drm/xe/kernel/issues/new

TBH, I have no idea if running Xen / Qubes OS + Xe is something anyone
at Intel has tried out, so please include instructions on how to
reproduce, and we will see if someone on the engineering team can take a
look. If there are issues in the Xe KMD, we will try to get them fixed.

Matt
>
> Not long after GPU errors, nvme driver fails due to full swiotlb.
>
> Any ideas?
>
> --
> Best Regards,
> Marek Marczykowski-Górecki
> Invisible Things Lab