kvm.vger.kernel.org archive mirror
* [Bug 220740] New: Host crash when do PF passthrough to KVM guest with some devices
@ 2025-11-03  9:12 bugzilla-daemon
From: bugzilla-daemon @ 2025-11-03  9:12 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=220740

            Bug ID: 220740
           Summary: Host crash when do PF passthrough to KVM guest with
                    some devices
           Product: Virtualization
           Version: unspecified
          Hardware: Intel
                OS: Linux
            Status: NEW
          Severity: high
          Priority: P3
         Component: kvm
          Assignee: virtualization_kvm@kernel-bugs.osdl.org
          Reporter: farrah.chen@intel.com
        Regression: No

Environment:

Host Kernel: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
v6.18.0-rc4

Guest kernel: 6.17-rc7

QEMU: https://gitlab.com/qemu-project/qemu.git master 37ad0e48e9fd58b17

Bug detail description: 

When passing a PF through to a KVM guest with certain devices, the guest fails
to boot and the host crashes.

Not all devices trigger this issue. So far, I have found that an Intel X710 NIC
(almost every time) and an Nvidia A10 GPU (randomly) reproduce it.
VF passthrough does not reproduce this issue.

Reproduce steps: 

Add "intel_iommu=on" to the host kernel cmdline to enable VT-d
Check for VT-d in dmesg
[root@gnr ~]# dmesg|grep "Virtualization Technology"
[   27.313975] DMAR: Intel(R) Virtualization Technology for Directed I/O
Check BDF of X710
[root@gnr ~]# lspci|grep "X710"
b8:00.0 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 01)
...
Bind X710 to vfio-pci driver
[root@gnr ~]# modprobe vfio-pci
[root@gnr ~]# echo 0000:b8:00.0 > /sys/bus/pci/devices/0000\:b8\:00.0/driver/unbind

[root@gnr ~]# lspci -n -s b8:00.0
b8:00.0 0200: 8086:1572 (rev 01)
[root@gnr ~]# echo 8086 1572 > /sys/bus/pci/drivers/vfio-pci/new_id
[root@gnr ~]# lspci -k -s b8:00.0
b8:00.0 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 01)
        Subsystem: Intel Corporation Ethernet Converged Network Adapter X710-2
        Kernel driver in use: vfio-pci
        Kernel modules: i40e

Boot guest with b8:00.0 assigned
/home/qemu/build/qemu-system-x86_64 \
    -name legacy,debug-threads=on \
    -accel kvm \
    -cpu host \
    -smp 16 \
    -m 16G \
    -drive file=/home/centos9.qcow2,if=none,id=virtio-disk0 \
    -device virtio-blk-pci,drive=virtio-disk0 \
    -vnc :1 \
    -monitor telnet:127.0.0.1:45455,nowait,server \
    -device vfio-pci,host=b8:00.0 \
    -serial stdio
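
The bind steps above can also be written as one small helper. This is only a sketch under stated assumptions, not what was run for this report: it uses the sysfs driver_override and drivers_probe files instead of new_id, so only the single function given by its BDF is claimed by vfio-pci (new_id matches every device with ID 8086:1572). The function name bind_to_vfio and the SYSFS_PCI variable are made up for illustration; SYSFS_PCI exists only so the logic can be dry-run against a scratch directory. It assumes "modprobe vfio-pci" has already been done.

```shell
#!/bin/sh
# Sketch: bind one PCI function to vfio-pci via driver_override.
# SYSFS_PCI is a hypothetical override hook for dry runs; on a real
# host it defaults to /sys/bus/pci.
bind_to_vfio() {
    bdf=$1                                   # full address, e.g. 0000:b8:00.0
    root=${SYSFS_PCI:-/sys/bus/pci}
    dev=$root/devices/$bdf

    [ -d "$dev" ] || { echo "no such device: $bdf" >&2; return 1; }

    # Detach the current driver (i40e in this report), if one is bound.
    if [ -e "$dev/driver/unbind" ]; then
        echo "$bdf" > "$dev/driver/unbind"
    fi

    # Restrict driver matching for this device to vfio-pci,
    # then ask the PCI core to probe the device again.
    echo vfio-pci > "$dev/driver_override"
    echo "$bdf" > "$root/drivers_probe"
}
```

Usage on the host would be "bind_to_vfio 0000:b8:00.0" as root, followed by "lspci -k -s b8:00.0" to confirm that the driver in use is vfio-pci.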

Error log: 

The VM fails to boot and produces no output.
The host crashes with the errors below on its serial console.

gnr login: [  120.259677] i40e 0000:b8:00.0: i40e_ptp_stop: removed PHC on ens26f0np0

[  136.778544] vfio-pci 0000:b8:00.0: resetting

[  136.891303] vfio-pci 0000:b8:00.0: reset done

[  136.896389] vfio-pci 0000:b8:00.0: Masking broken INTx support

[  136.940637] vfio-pci 0000:b8:00.0: resetting

[  137.051298] vfio-pci 0000:b8:00.0: reset done

[IEH] error found at IEH(S:0x1 B:0xFE D:0x2 F:0x0) Sev: IEH CORRECT ERROR

[IEH] ErrorStatus 0x10, MaxBitIdx 0x1D

IEH CORRECT ERROR

[IEH] BitIdx 0x4, ShareIdx 0x0

[IEH] error device is (S:0x1 B:0xB7 D:0x0 F:0x4) BitIdx 0x4, ShareIdx 0x0

[IEH] error found at IEH(S:0x1 B:0xB7 D:0x0 F:0x4) Sev: IEH CORRECT ERROR

[IEH] ErrorStatus 0x4, MaxBitIdx 0x11

IEH CORRECT ERROR

[IEH] BitIdx 0x2, ShareIdx 0x0

[IEH] error device is (S:0x1 B:0xB7 D:0x2 F:0x0) BitIdx 0x2, ShareIdx 0x0

[Device Error] error on skt:0x1 Bus:0xB7 Device:0x2 func:0x0

PcieRootPortErrorHandler MailBox->PcieInitPar.SerrEmuTestEn = 0x0

PcieRootPortMultiErrorsHandler RP Error handler.

ERROR: C00000002:V03071008 I0 515DFD4E-2D7E-40D1-8C22-8AD3CD224325 7C7C9818

WHEA: Detected PCIe Error

 --Logging Corrected Error to WHEA

WHEA: Sending OS notification via SCI. Success

ERROR: C00000002:V03071008 I0 515DFD4E-2D7E-40D1-8C22-8AD3CD224325 7C7C9818

WHEA: Detected PCIe Error

 --Logging Corrected Error to WHEA

WHEA: Sending OS notification via SCI. Success
...
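
To look up the device named in the "[Device Error]" line with lspci, its Bus/Device/func fields can be converted to the usual bus:dev.fn form. The helper below is a rough sketch; fw_line_to_bdf is a made-up name, and the pattern assumes the exact field layout seen in the firmware output above. It does not interpret the skt field.

```shell
#!/bin/sh
# Sketch: turn a firmware "[Device Error]" line into an lspci-style
# bus:dev.fn string. fw_line_to_bdf is a hypothetical helper name.
fw_line_to_bdf() {
    # Pull out the hex digits that follow Bus:0x, Device:0x and func:0x.
    bus=$(printf '%s\n' "$1" | sed -n 's/.*Bus:0x\([0-9A-Fa-f]*\).*/\1/p')
    dev=$(printf '%s\n' "$1" | sed -n 's/.*Device:0x\([0-9A-Fa-f]*\).*/\1/p')
    fn=$(printf '%s\n' "$1" | sed -n 's/.*func:0x\([0-9A-Fa-f]*\).*/\1/p')

    # Fail if any field is missing instead of printing a partial address.
    [ -n "$bus" ] && [ -n "$dev" ] && [ -n "$fn" ] || return 1

    # Format as lowercase, zero-padded hex, the way lspci prints it.
    printf '%02x:%02x.%x\n' "0x$bus" "0x$dev" "0x$fn"
}
```

For example, the result for the line above can be passed to "lspci -vv -s" to inspect that device on the host.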

