Linux IOMMU Development
* i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment"
@ 2022-10-18  3:52 Marek Marczykowski-Górecki
  2022-10-18  8:24 ` Christoph Hellwig
  0 siblings, 1 reply; 9+ messages in thread
From: Marek Marczykowski-Górecki @ 2022-10-18  3:52 UTC
  To: Christoph Hellwig, Konrad Rzeszutek Wilk, Anshuman Khandual
  Cc: Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	regressions, xen-devel, iommu

Hi,

Since 5.19, I have been observing severe glitches (mostly horizontal black
stripes, but not exclusively) when using the IGD in a Xen PV dom0. After a
short time Xorg crashes, and dmesg contains messages like this:

    i915 0000:00:02.0: [drm] GPU HANG: ecode 7:1:01fffbfe, in Xorg [5337]
    i915 0000:00:02.0: [drm] Resetting rcs0 for stopped heartbeat on rcs0
    i915 0000:00:02.0: [drm] Xorg[5337] context reset due to GPU hang
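
For triage across several machines, the "GPU HANG" lines above can be
picked out of dmesg output mechanically. The following is a minimal sketch
in Python, assuming the message format stays exactly as quoted above; the
function name parse_gpu_hang and the regular expression are hypothetical
and are not part of any i915 tooling.

```python
import re

# Matches the "GPU HANG" line format quoted in this report.
# Assumption: the format is as shown above; other kernels may differ.
HANG_RE = re.compile(
    r"i915 (?P<pci>[0-9a-f:.]+): \[drm\] GPU HANG: "
    r"ecode (?P<ecode>[0-9a-f:]+), in (?P<comm>\S+) \[(?P<pid>\d+)\]"
)


def parse_gpu_hang(line):
    """Return the fields of a GPU HANG line as a dict, or None if no match."""
    m = HANG_RE.search(line)
    if m is None:
        return None
    return {
        "pci": m["pci"],        # PCI address of the GPU
        "ecode": m["ecode"],    # error code as printed by the driver
        "comm": m["comm"],      # name of the affected process
        "pid": int(m["pid"]),   # PID of the affected process
    }
```

Feeding it each line of a saved dmesg log and keeping the non-None
results gives one record per reported hang.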

The issue can be observed on several different hardware generations (at
least Ivy Bridge, Tiger Lake and Kaby Lake). It doesn't always happen
immediately; sometimes I need to start several VMs first.
An example of what it looks like:
https://openqa.qubes-os.org/tests/48187#step/qui_widgets_notifications/8

More screenshots and logs are linked at https://github.com/QubesOS/qubes-issues/issues/7813

I managed to git bisect the issue and ended up with this as the first
bad commit:

    commit a2daa27c0c6137481226aee5b3136e453c642929
    Author: Christoph Hellwig <hch@lst.de>
    Date:   Mon Feb 14 11:44:42 2022 +0100

        swiotlb: simplify swiotlb_max_segment
        
        Remove the bogus Xen override that was usually larger than the actual
        size and just calculate the value on demand.  Note that
        swiotlb_max_segment still doesn't make sense as an interface and should
        eventually be removed.
        
        Signed-off-by: Christoph Hellwig <hch@lst.de>
        Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
        Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
        Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

I tried reverting just this commit on top of 6.0.x, but the context has
changed significantly in subsequent commits, so after trying to revert
it together with 3 or 4 more commits I gave up.

One detail that may be important: the system heavily uses cross-VM shared
memory (gntdev) to map window contents from VMs. This is Qubes OS, and
it uses Xen 4.14.


-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab


Thread overview: 9+ messages
2022-10-18  3:52 i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment" Marek Marczykowski-Górecki
2022-10-18  8:24 ` Christoph Hellwig
2022-10-18  8:57   ` Jan Beulich
2022-10-18 11:02     ` Christoph Hellwig
2022-10-18 14:21       ` Jan Beulich
2022-10-18 14:33         ` Christoph Hellwig
2022-10-18 14:53           ` Juergen Gross
2022-10-18 14:55             ` Christoph Hellwig
2022-10-18 12:01   ` Marek Marczykowski-Górecki
