public inbox for kvm@vger.kernel.org
From: Peter Xu <peterx@redhat.com>
To: Athul Krishna <athul.krishna.kr@protonmail.com>
Cc: Bjorn Helgaas <helgaas@kernel.org>,
	"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Linux PCI <linux-pci@vger.kernel.org>,
	"regressions@lists.linux.dev" <regressions@lists.linux.dev>
Subject: Re: [bugzilla-daemon@kernel.org: [Bug 219619] New: vfio-pci: screen graphics artifacts after 6.12 kernel upgrade]
Date: Mon, 23 Dec 2024 11:59:06 -0500
Message-ID: <Z2mW2k8GfP7S0c5M@x1n>
In-Reply-To: <Hb6kvXlGizYbogNWGJcvhY3LsKeRwROtpRluHKsGqRcmZl68J35nP60YdzW1KSoPl5RO_dCxuL5x9mM13jPBbU414DEZE_0rUwDNvzuzyb8=@protonmail.com>

On Mon, Dec 23, 2024 at 07:37:46AM +0000, Athul Krishna wrote:
> Can confirm. Reverting f9e54c3a2f5b from v6.13-rc1 fixed the problem.

I suppose Alex will have more thoughts on this, probably after the holidays.
Before that, one quick question.

> 
> -------- Original Message --------
> On 23/12/24 04:06, Bjorn Helgaas <helgaas@kernel.org> wrote:
> 
> >  Forwarding since not everybody follows bugzilla.  Apparently bisected
> >  to f9e54c3a2f5b ("vfio/pci: implement huge_fault support").
> >  
> >  Athul, f9e54c3a2f5b appears to revert cleanly from v6.13-rc1.  Can you
> >  verify that reverting it is enough to avoid these artifacts?
> >  
> >  #regzbot introduced: f9e54c3a2f5b ("vfio/pci: implement huge_fault support")
> >  
> >  ----- Forwarded message from bugzilla-daemon@kernel.org -----
> >  
> >  Date: Sat, 21 Dec 2024 10:10:02 +0000
> >  From: bugzilla-daemon@kernel.org
> >  To: bjorn@helgaas.com
> >  Subject: [Bug 219619] New: vfio-pci: screen graphics artifacts after 6.12 kernel upgrade
> >  Message-ID: <bug-219619-41252@https.bugzilla.kernel.org/>
> >  
> >  https://bugzilla.kernel.org/show_bug.cgi?id=219619
> >  
> >              Bug ID: 219619
> >             Summary: vfio-pci: screen graphics artifacts after 6.12 kernel
> >                      upgrade
> >             Product: Drivers
> >             Version: 2.5
> >            Hardware: AMD
> >                  OS: Linux
> >              Status: NEW
> >            Severity: normal
> >            Priority: P3
> >           Component: PCI
> >            Assignee: drivers_pci@kernel-bugs.osdl.org
> >            Reporter: athul.krishna.kr@protonmail.com
> >          Regression: No
> >  
> >  Created attachment 307382
> >    --> https://bugzilla.kernel.org/attachment.cgi?id=307382&action=edit
> >  dmesg

vfio-pci 0000:03:00.0: vfio_bar_restore: reset recovery - restoring BARs
pcieport 0000:00:01.1: AER: Multiple Uncorrectable (Non-Fatal) error message received from 0000:03:00.1
vfio-pci 0000:03:00.0: PCIe Bus Error: severity=Uncorrectable (Non-Fatal), type=Transaction Layer, (Requester ID)
vfio-pci 0000:03:00.0:   device [1002:73ef] error status/mask=00100000/00000000
vfio-pci 0000:03:00.0:    [20] UnsupReq               (First)
vfio-pci 0000:03:00.0: AER:   TLP Header: 60001004 000000ff 0000007d fe7eb000
vfio-pci 0000:03:00.1: PCIe Bus Error: severity=Uncorrectable (Non-Fatal), type=Transaction Layer, (Requester ID)
vfio-pci 0000:03:00.1:   device [1002:ab28] error status/mask=00100000/00000000
vfio-pci 0000:03:00.1:    [20] UnsupReq               (First)
vfio-pci 0000:03:00.1: AER:   TLP Header: 60001004 000000ff 0000007d fe7eb000
vfio-pci 0000:03:00.1: AER:   Error of this Agent is reported first
pcieport 0000:02:00.0: AER: broadcast error_detected message
pcieport 0000:02:00.0: AER: broadcast mmio_enabled message
pcieport 0000:02:00.0: AER: broadcast resume message
pcieport 0000:02:00.0: AER: device recovery successful
pcieport 0000:02:00.0: AER: broadcast error_detected message
pcieport 0000:02:00.0: AER: broadcast mmio_enabled message
pcieport 0000:02:00.0: AER: broadcast resume message
pcieport 0000:02:00.0: AER: device recovery successful
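
For readers following the log above, here is a minimal sketch of how the "error status" value and the "TLP Header" DWORDs can be decoded by hand. It assumes the standard PCIe memory-request header layout; the function names are hypothetical and this is not kernel code.

```python
# Illustrative decoder for the AER lines in the log above (not kernel code).
# Field positions assume the PCIe memory-request TLP header layout.

def set_bits(status):
    """Return the bit positions that are set in a 32-bit status word."""
    return [bit for bit in range(32) if (status >> bit) & 1]


def decode_mem_tlp_header(dwords):
    """Decode the four header DWORDs printed after 'TLP Header:'."""
    dw0, dw1, dw2, dw3 = dwords
    fmt = (dw0 >> 29) & 0x7        # bit 0: 4-DWORD header, bit 1: carries data
    tlp_type = (dw0 >> 24) & 0x1F  # 0b00000 is a memory request
    info = {
        "fmt": fmt,
        "type": tlp_type,
        "length_dw": dw0 & 0x3FF,            # payload length in DWORDs
        "requester_id": (dw1 >> 16) & 0xFFFF,
        "tag": (dw1 >> 8) & 0xFF,
        "last_be": (dw1 >> 4) & 0xF,         # byte enables, last DWORD
        "first_be": dw1 & 0xF,               # byte enables, first DWORD
    }
    if fmt & 0x1:
        # 4-DWORD header: DW2 holds address bits 63:32, DW3 bits 31:2.
        info["address"] = (dw2 << 32) | (dw3 & ~0x3)
    else:
        # 3-DWORD header: DW2 holds a 32-bit address.
        info["address"] = dw2 & ~0x3
    info["is_write"] = bool(fmt & 0x2) and tlp_type == 0
    return info


header = [int(word, 16) for word in "60001004 000000ff 0000007d fe7eb000".split()]
decoded = decode_mem_tlp_header(header)
print(set_bits(0x00100000))
print(hex(decoded["address"]), decoded["length_dw"], decoded["is_write"])
```

Under these assumptions, the status word has only bit 20 set, matching the "[20] UnsupReq" line, and the header decodes as a 4-DWORD memory write to a 64-bit address. Whether that address falls inside one of the device's BARs is not something the log itself shows.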

> >  
> >  Device: Asus Zephyrus GA402RJ
> >  CPU: Ryzen 7 6800HS
> >  GPU: RX 6700S
> >  Kernel: 6.13.0-rc3-g8faabc041a00
> >  
> >  Problem:
> >  Launching games or gpu bench-marking tools in qemu windows 11 vm will cause
> >  screen artifacts, ultimately qemu will pause with unrecoverable error.

Is there more information on what setup can reproduce it?

For example, does it only happen with Windows guests?  Does the GPU
vendor/model matter?

> >  
> >  Commit:
> >  f9e54c3a2f5b79ecc57c7bc7d0d3521e461a2101 is the first bad commit
> >  commit f9e54c3a2f5b79ecc57c7bc7d0d3521e461a2101
> >  Author: Alex Williamson <alex.williamson@redhat.com>
> >  Date:   Mon Aug 26 16:43:53 2024 -0400
> >  
> >      vfio/pci: implement huge_fault support

Personally, I have no clue yet how this commit could cause it.  I was
initially worried about implicit cache mode changes on the mappings, but I
don't think this specific change involves any.

This commit mainly does two things: (1) it allows 2M/1G mappings for BARs
instead of always using small 4K ones, and (2) it always faults lazily
rather than "installing everything in the first fault".  Maybe one of the
two has some impact here.
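
To make point (1) concrete, the following is a rough model of when a fault could be served with a 2M or 1G mapping instead of a 4K one. It is a sketch only: it assumes x86-64 with 4K base pages (orders 9 and 18 for 2M and 1G), and the names `can_map` and `best_mapping` are hypothetical rather than taken from the vfio-pci source. The conditions modeled are that the naturally aligned range must lie inside the VMA and that the backing PFN must be aligned to the mapping size.

```python
# Simplified model of huge-mapping eligibility for a BAR mmap (not kernel code).
PAGE_SHIFT = 12
PAGE_SIZE = 1 << PAGE_SHIFT
ORDERS = {"4K": 0, "2M": 9, "1G": 18}  # assumption: x86-64, 4K base pages


def can_map(order, fault_addr, vma_start, vma_end, vma_pfn):
    """Check whether a mapping of PAGE_SIZE << order can cover fault_addr.

    vma_pfn is the physical frame number backing vma_start.
    """
    size = PAGE_SIZE << order
    addr = fault_addr & ~(size - 1)  # align the fault address down
    if addr < vma_start or addr + size > vma_end:
        return False                 # aligned range leaves the VMA
    pfn = vma_pfn + ((addr - vma_start) >> PAGE_SHIFT)
    return pfn & ((1 << order) - 1) == 0  # physical side must be aligned too


def best_mapping(fault_addr, vma_start, vma_end, vma_pfn):
    """Try the largest size first and fall back to smaller ones."""
    for name in ("1G", "2M", "4K"):
        if can_map(ORDERS[name], fault_addr, vma_start, vma_end, vma_pfn):
            return name
    return None


# Example: a 16M BAR whose physical base is 2M-aligned (made-up addresses).
bar_pfn = 0x7DFC000
aligned = 0x7F0000000000
print(best_mapping(aligned + 0x201000, aligned, aligned + (16 << 20), bar_pfn))
# The same BAR mapped at a virtual address that is off by one page.
skewed = aligned + PAGE_SIZE
print(best_mapping(skewed + 0x201000, skewed, skewed + (16 << 20), bar_pfn))
```

In this model a 16M BAR never qualifies for a 1G mapping, and a virtual address that is misaligned relative to the physical base forces the fallback to 4K pages.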

IIUC the basic paths were covered and should work, so I wonder what is
special about this case.  It might be related to the questions above about
which setups reproduce it.

Thanks,

-- 
Peter Xu


Thread overview: 14+ messages
2024-12-22 22:36 [bugzilla-daemon@kernel.org: [Bug 219619] New: vfio-pci: screen graphics artifacts after 6.12 kernel upgrade] Bjorn Helgaas
2024-12-23  7:37 ` Athul Krishna
2024-12-23 16:59   ` Peter Xu [this message]
2024-12-23 18:15     ` Alex Williamson
2024-12-24 18:06       ` Athul Krishna
2024-12-30 21:03     ` Precific
2024-12-31  1:27       ` Alex Williamson
2024-12-31 15:44         ` Precific
2024-12-31 16:07           ` Alex Williamson
2025-01-01  3:10             ` Precific
2025-01-02 16:39             ` Peter Xu
2025-01-02 17:04               ` Alex Williamson
2025-01-02 18:38                 ` Alex Williamson
2025-02-25 17:59                   ` Bjorn Helgaas
