kvm.vger.kernel.org archive mirror
* [Bug 220631] New: kernel crash in kvm_intel module on Xeon Silver 4314s
@ 2025-10-06  8:44 bugzilla-daemon
  2025-10-06  8:46 ` [Bug 220631] " bugzilla-daemon
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: bugzilla-daemon @ 2025-10-06  8:44 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=220631

               URL: https://bugzilla.proxmox.com/show_bug.cgi?id=6767
            Bug ID: 220631
           Summary: kernel crash in kvm_intel module on Xeon Silver 4314s
           Product: Virtualization
           Version: unspecified
          Hardware: Intel
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P3
         Component: kvm
          Assignee: virtualization_kvm@kernel-bugs.osdl.org
          Reporter: f.gruenbichler@proxmox.com
        Regression: No

Created attachment 308754
  --> https://bugzilla.kernel.org/attachment.cgi?id=308754&action=edit
Qemu commandline

one of our users reports a crash in kvm_intel with Windows guests and live
migration. the source and target hosts are identical in hardware, kernel
version, and BIOS version:

- Supermicro SYS-120C-TN10R servers with Xeon Silver 4314s
- microcode       : 0xd000404
- 6.14.8-2-pve (based on Ubuntu 6.14.0-26.26 and upstream stable 6.14.
- guest OS: Windows 11 24H2

the QEMU command line and dmesg output are attached, and the user is CCed on
this bug. unfortunately, we don't have the hardware to reproduce and bisect
ourselves.

this is likely a regression: the user reports running into the issue only
since upgrading to PVE 9, which means switching from kernel 6.8.x to 6.14.x.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.


* [Bug 220631] kernel crash in kvm_intel module on Xeon Silver 4314s
From: bugzilla-daemon @ 2025-10-06  8:46 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=220631

--- Comment #1 from Fabian Grünbichler (f.gruenbichler@proxmox.com) ---
Created attachment 308755
  --> https://bugzilla.kernel.org/attachment.cgi?id=308755&action=edit
dmesg with kvm_intel.dump_invalid_vmcs=1


* [Bug 220631] kernel crash in kvm_intel module on Xeon Silver 4314s
From: bugzilla-daemon @ 2025-10-06 11:45 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=220631

Artem S. Tashkinov (aros@gmx.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO

--- Comment #2 from Artem S. Tashkinov (aros@gmx.com) ---
Vendor kernels are not supported here and neither is 6.14 because it's not an
LTS kernel.

Is this reproducible under 6.17.1?


* [Bug 220631] kernel crash in kvm_intel module on Xeon Silver 4314s
From: bugzilla-daemon @ 2025-10-06 12:05 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=220631

--- Comment #3 from Fabian Grünbichler (f.gruenbichler@proxmox.com) ---
I asked the original reporter to register here; hopefully they can attempt to
reproduce using a newer kernel, e.g. using

https://kernel.ubuntu.com/mainline/v6.17.1/ once that one has built, or
https://kernel.ubuntu.com/mainline/v6.17/amd64/


* [Bug 220631] kernel crash in kvm_intel module on Xeon Silver 4314s
From: bugzilla-daemon @ 2025-10-06 14:26 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=220631

Sean Christopherson (seanjc@google.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |seanjc@google.com

--- Comment #4 from Sean Christopherson (seanjc@google.com) ---
This is due to a known CPU/ucode bug where some Intel CPUs generate spurious
#VEs.


[  +0.000151] kvm: #VE 65dbc7f8
[  +0.000154] kvm: #VE 65dbc7f8, spte[4] = 0x8010000418c52807, spte[3] = 0x801000045b676807, spte[2] = 0x86100023d0000bf7
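
For triage, lines like the ones quoted above can be pulled out of a dmesg
capture mechanically. The following is a minimal Python sketch, assuming the
message layout matches the two lines quoted in this thread (a hex value after
"#VE", optionally followed by "spte[N] = 0x..." pairs). The function name
parse_ve_line is hypothetical, and other kernel versions may print a
different layout.

```python
import re

# Patterns derived only from the two log lines quoted in this thread;
# the layout on other kernel versions is an assumption.
VE_RE = re.compile(r"kvm: #VE ([0-9a-fA-F]+)")
SPTE_RE = re.compile(r"spte\[(\d+)\] = (0x[0-9a-fA-F]+)")


def parse_ve_line(line):
    """Return (value, {level: spte}) for a 'kvm: #VE' line, or None."""
    match = VE_RE.search(line)
    if match is None:
        return None
    value = int(match.group(1), 16)
    # Map each page-table level to its SPTE value; empty if none are printed.
    sptes = {int(level): int(raw, 16) for level, raw in SPTE_RE.findall(line)}
    return value, sptes


if __name__ == "__main__":
    sample = (
        "[  +0.000154] kvm: #VE 65dbc7f8, spte[4] = 0x8010000418c52807, "
        "spte[3] = 0x801000045b676807, spte[2] = 0x86100023d0000bf7"
    )
    value, sptes = parse_ve_line(sample)
    print(hex(value))
    for level in sorted(sptes, reverse=True):
        print(level, hex(sptes[level]))
```

The sketch works on one logical line at a time, so a capture that was
line-wrapped by a mail client needs its continuation lines joined first.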

The "fix" is to disable CONFIG_KVM_INTEL_PROVE_VE.  KVM_INTEL_PROVE_VE isn't
intended for production use (it's akin to PROVE_LOCKING), and the help text
even calls out that some CPUs generate spurious #VEs.

config KVM_INTEL_PROVE_VE
        bool "Check that guests do not receive #VE exceptions"
        depends on KVM_INTEL && EXPERT
        help
          Checks that KVM's page table management code will not incorrectly
          let guests receive a virtualization exception.  Virtualization
          exceptions will be trapped by the hypervisor rather than injected
          in the guest.

          Note: some CPUs appear to generate spurious EPT Violations #VEs
          that trigger KVM's WARN, in particular with eptad=0 and/or nested
          virtualization.

          If unsure, say N.
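
To see whether a given kernel build has the option enabled, its config file
can be inspected. The following is a minimal Python sketch, assuming the
standard .config line formats ("CONFIG_X=y" and "# CONFIG_X is not set") and
assuming the config is installed at /boot/config-<release>, which is common on
Debian-based distributions but not guaranteed. The function name option_state
is hypothetical.

```python
import platform
import re

OPTION = "CONFIG_KVM_INTEL_PROVE_VE"


def option_state(config_text, option=OPTION):
    """Return the option's value ('y', 'm', ...), 'n' if explicitly
    unset, or None if the option does not appear at all."""
    set_re = re.compile(r"^" + re.escape(option) + r"=(.*)$")
    unset_line = "# " + option + " is not set"
    for line in config_text.splitlines():
        line = line.strip()
        if line == unset_line:
            return "n"
        match = set_re.match(line)
        if match:
            return match.group(1)
    return None


if __name__ == "__main__":
    # Assumed location of the running kernel's config file.
    path = "/boot/config-" + platform.release()
    try:
        with open(path) as handle:
            print(OPTION, "->", option_state(handle.read()))
    except OSError as err:
        print("could not read", path, err)
```

A result of None means the option is absent from the file, for example
because the kernel predates it or its dependencies (KVM_INTEL && EXPERT) are
not met.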


* [Bug 220631] kernel crash in kvm_intel module on Xeon Silver 4314s
From: bugzilla-daemon @ 2025-10-07  1:21 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=220631

Kevin Boyd (dev@boyd-family.org) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dev@boyd-family.org

--- Comment #5 from Kevin Boyd (dev@boyd-family.org) ---
Original poster here. Happy to try a different kernel, but it seems like the
previous comment suggests that the CONFIG_KVM_INTEL_PROVE_VE Kconfig option
should be set to “N” to quiet the microcode bug.


* [Bug 220631] kernel crash in kvm_intel module on Xeon Silver 4314s
From: bugzilla-daemon @ 2025-10-08  6:23 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=220631

--- Comment #6 from Fabian Grünbichler (f.gruenbichler@proxmox.com) ---
thanks Sean!

disabling the Kconfig option (at least in non-debug builds) seems like a good
way forward. I'll forward this to the Ubuntu folks as well (we inherited the Y
from their kernel config).

@Kevin: I'll ping you once a kernel with that option disabled is built, so you
can confirm that the issue no longer reproduces and we can close this!
