kvm.vger.kernel.org archive mirror
From: bugzilla-daemon@kernel.org
To: kvm@vger.kernel.org
Subject: [Bug 220740] Host crash when do PF passthrough to KVM guest with some devices
Date: Tue, 09 Dec 2025 02:54:35 +0000
Message-ID: <bug-220740-28872-jH0XOJ9a6f@https.bugzilla.kernel.org/>
In-Reply-To: <bug-220740-28872@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=220740

--- Comment #8 from kevin.tian@intel.com ---
> From: bugzilla-daemon@kernel.org <bugzilla-daemon@kernel.org>
> Sent: Wednesday, November 5, 2025 8:03 AM
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=220740
> 
> --- Comment #5 from Alex Williamson (alex.l.williamson@gmail.com) ---
> I have an X710, but not a system that can reproduce the issue.
> 
> Also I need to correct my previous statement after untangling the headers.
> This commit did introduce 8-byte access support for archs including x86_64
> where they don't otherwise define ioread/write64 support.  This access
> uses readq/writeq, where previously we'd use pairs of readl/writel.  The
> expectation is that we're more closely matching the access by the guest.
> 
> I'm curious how we're getting into this code for an X710 though, mine shows
> BARs as:
> 
> 03:00.0 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 01)
>         Region 0: Memory at 380000000000 (64-bit, prefetchable) [size=8M]
>         Region 3: Memory at 380001800000 (64-bit, prefetchable) [size=32K]
> 
> Those would typically be mapped directly into the KVM address space and not
> fault through QEMU to trigger access through this code.

We have verified that this problem is caused by 8-byte accesses to the ROM BAR:

    Expansion ROM at 93480000 [disabled] [size=512K]

Every qword access to that range triggers dozens of PCI AER-related
prints. QEMU issues 64K reads in total, so the resulting flood of prints
leaves the host unresponsive.

There is indeed no access to BAR0/BAR3 in this path.

Disabling "PCIE Error Enabling" in the BIOS only suppresses the prints,
which hides the issue rather than fixing it.

Updating to the latest X710 firmware didn't help, and we didn't find an
explicit erratum describing this dword limitation.

It is difficult to identify every device that suffers from this issue, so
a safer and simpler way is to universally disable 8-byte access to the
ROM BAR, e.g. as below:

diff --git a/drivers/vfio/pci/nvgrace-gpu/main.c b/drivers/vfio/pci/nvgrace-gpu/main.c
index e346392b72f6..9b39184f76b7 100644
--- a/drivers/vfio/pci/nvgrace-gpu/main.c
+++ b/drivers/vfio/pci/nvgrace-gpu/main.c
@@ -491,7 +491,7 @@ nvgrace_gpu_map_and_read(struct nvgrace_gpu_pci_core_device *nvdev,
                ret = vfio_pci_core_do_io_rw(&nvdev->core_device, false,
                                             nvdev->resmem.ioaddr,
                                             buf, offset, mem_count,
-                                            0, 0, false);
+                                            0, 0, false, true);
        }

        return ret;
@@ -609,7 +609,7 @@ nvgrace_gpu_map_and_write(struct nvgrace_gpu_pci_core_device *nvdev,
                ret = vfio_pci_core_do_io_rw(&nvdev->core_device, false,
                                             nvdev->resmem.ioaddr,
                                             (char __user *)buf, pos, mem_count,
-                                            0, 0, true);
+                                            0, 0, true, true);
        }

        return ret;
diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c
index 6192788c8ba3..3467151a632d 100644
--- a/drivers/vfio/pci/vfio_pci_rdwr.c
+++ b/drivers/vfio/pci/vfio_pci_rdwr.c
@@ -135,7 +135,7 @@ VFIO_IORDWR(64)
 ssize_t vfio_pci_core_do_io_rw(struct vfio_pci_core_device *vdev, bool test_mem,
                               void __iomem *io, char __user *buf,
                               loff_t off, size_t count, size_t x_start,
-                              size_t x_end, bool iswrite)
+                              size_t x_end, bool iswrite, bool allow_qword)
 {
        ssize_t done = 0;
        int ret;
@@ -150,7 +150,7 @@ ssize_t vfio_pci_core_do_io_rw(struct vfio_pci_core_device *vdev, bool test_mem,
                else
                        fillable = 0;

-               if (fillable >= 8 && !(off % 8)) {
+               if (allow_qword && fillable >= 8 && !(off % 8)) {
                        ret = vfio_pci_iordwr64(vdev, iswrite, test_mem,
                                                io, buf, off, &filled);
                        if (ret)
@@ -234,6 +234,7 @@ ssize_t vfio_pci_bar_rw(struct vfio_pci_core_device *vdev, char __user *buf,
        void __iomem *io;
        struct resource *res = &vdev->pdev->resource[bar];
        ssize_t done;
+       bool allow_qword = true;

        if (pci_resource_start(pdev, bar))
                end = pci_resource_len(pdev, bar);
@@ -262,6 +263,15 @@ ssize_t vfio_pci_bar_rw(struct vfio_pci_core_device *vdev, char __user *buf,
                if (!io)
                        return -ENOMEM;
                x_end = end;
+
+               /*
+                * Certain devices (e.g. Intel X710) don't support 8-byte access
+                * to the ROM bar. Otherwise PCI AER errors might be triggered.
+                *
+                * Disable qword access to the ROM bar universally, which had been
+                * working reliably for years before 8-byte access was enabled.
+                */
+               allow_qword = false;
        } else {
                int ret = vfio_pci_core_setup_barmap(vdev, bar);
                if (ret) {
@@ -278,7 +288,7 @@ ssize_t vfio_pci_bar_rw(struct vfio_pci_core_device *vdev, char __user *buf,
        }

        done = vfio_pci_core_do_io_rw(vdev, res->flags & IORESOURCE_MEM, io, buf, pos,
-                                     count, x_start, x_end, iswrite);
+                                     count, x_start, x_end, iswrite, allow_qword);

        if (done >= 0)
                *ppos += done;
@@ -352,7 +362,7 @@ ssize_t vfio_pci_vga_rw(struct vfio_pci_core_device *vdev, char __user *buf,
         * to the memory enable bit in the command register.
         */
        done = vfio_pci_core_do_io_rw(vdev, false, iomem, buf, off, count,
-                                     0, 0, iswrite);
+                                     0, 0, iswrite, true);

        vga_put(vdev->pdev, rsrc);

diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index f541044e42a2..3a75b76eaed3 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -133,7 +133,7 @@ pci_ers_result_t vfio_pci_core_aer_err_detected(struct pci_dev *pdev,
 ssize_t vfio_pci_core_do_io_rw(struct vfio_pci_core_device *vdev, bool test_mem,
                               void __iomem *io, char __user *buf,
                               loff_t off, size_t count, size_t x_start,
-                              size_t x_end, bool iswrite);
+                              size_t x_end, bool iswrite, bool allow_qword);
 bool vfio_pci_core_range_intersect_range(loff_t buf_start, size_t buf_cnt,
                                         loff_t reg_start, size_t reg_cnt,
                                         loff_t *buf_offset,
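
For illustration only, here is a rough user-space model of how the
access width would be chosen once allow_qword is in place. It is a
sketch, not the kernel code: struct width_count and split_access() are
hypothetical names made up for this example, the excluded-range
(x_start/x_end) handling is left out, and it assumes the narrower
fallbacks mirror the "fits and is naturally aligned" test visible in the
qword branch of the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tally of accesses issued at each width. */
struct width_count {
	size_t qword;
	size_t dword;
	size_t word;
	size_t byte;
};

/*
 * Simplified model of the width selection: use the widest access that
 * both fits in the remaining count and is naturally aligned at off,
 * and never pick 8 bytes when allow_qword is false.
 */
static struct width_count split_access(size_t off, size_t count,
				       bool allow_qword)
{
	struct width_count wc = { 0, 0, 0, 0 };

	while (count) {
		size_t filled;

		if (allow_qword && count >= 8 && !(off % 8)) {
			filled = 8;
			wc.qword++;
		} else if (count >= 4 && !(off % 4)) {
			filled = 4;
			wc.dword++;
		} else if (count >= 2 && !(off % 2)) {
			filled = 2;
			wc.word++;
		} else {
			filled = 1;
			wc.byte++;
		}

		off += filled;
		count -= filled;
	}

	return wc;
}
```

Under this model, reading a 512K ROM from offset 0 takes 65536 qword
accesses when allow_qword is true, and 131072 dword accesses (with no
qword access at all) when it is false, assuming the whole ROM is read
in aligned chunks.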

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.
