kvm.vger.kernel.org archive mirror
* [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
@ 2024-07-06 11:20 bugzilla-daemon
  2024-08-03 14:59 ` [Bug 219009] " bugzilla-daemon
                   ` (51 more replies)
  0 siblings, 52 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-07-06 11:20 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

            Bug ID: 219009
           Summary: Random host reboots on Ryzen 7000/8000 using nested
                    VMs (vls suspected)
           Product: Virtualization
           Version: unspecified
          Hardware: AMD
                OS: Linux
            Status: NEW
          Severity: high
          Priority: P3
         Component: kvm
          Assignee: virtualization_kvm@kernel-bugs.osdl.org
          Reporter: zaltys@natrix.lt
        Regression: No

Running nested VMs on AMD Ryzen 7000/8000 (Zen 4) CPUs results in random host
reboots.

There is no kernel panic, no log entries, and no relevant output on the serial
console. It is as if the platform is simply hard reset. The time to reproduce
seems to vary from system to system and can depend on the workload and even the
specific CPU model.

I can reproduce it with kernel 6.9.7 and QEMU 9.0 on a Ryzen 7950X3D in under
one hour by using KVM -> Windows 10/11 with Hyper-V services on, or KVM ->
Windows 10/11 with 3 VBox VMs (also Win11) running. Other people have
repeatedly reproduced it on Ryzen 7700, 7600 and 8700GE, including KVM -> KVM
-> Linux.[1] I have also seen Hetzner (a company offering Ryzen-based dedicated
servers) customers complaining about similar random reboots.

I tried looking up errata for Ryzen 7000/8000, but could not find any
published, so I decided to check the errata for EPYC 9004 [2], which is the
same Zen 4 architecture as Ryzen 7000/8000. It lists nesting-related erratum
#1495 (on page 49), which mentions that using Virtualized VMLOAD/VMSAVE can
result in an MCE and/or system reset.
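
To see whether a given host advertises the feature the erratum refers to, one
can look at the CPU flags. The following is only a minimal sketch: it assumes
the kernel exposes Virtualized VMLOAD/VMSAVE in /proc/cpuinfo under the flag
name "v_vmsave_vmload" (alongside "svm"), and the helper names are made up for
illustration.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_virtual_vmload_vmsave(cpuinfo_text: str) -> bool:
    """True if both SVM and (assumed flag name) v_vmsave_vmload are listed."""
    flags = cpu_flags(cpuinfo_text)
    return "svm" in flags and "v_vmsave_vmload" in flags


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        advertised = has_virtual_vmload_vmsave(path.read_text())
        print("svm + v_vmsave_vmload advertised:", advertised)
    else:
        print("/proc/cpuinfo not found (not a Linux host?)")
```

A host that advertises the flag is not necessarily affected; this only shows
whether the hardware feature is present at all.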

Based on the erratum mentioned above, I reconfigured my system with
kvm_amd.vls=0, and for me the random reboots with nested virtualization
stopped. The same was reported by several people from [1].
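
For anyone wanting to confirm that the setting actually took effect after a
reboot, here is a small sketch that reads the current kvm_amd parameters. It
assumes the module exposes them under /sys/module/kvm_amd/parameters/ (the
usual sysfs location for module parameters); the function name is hypothetical
and the values are printed raw, exactly as the kernel reports them.

```python
from pathlib import Path


def read_module_params(names, base=Path("/sys/module/kvm_amd/parameters")):
    """Return {name: raw value or None} for each parameter file under base."""
    result = {}
    for name in names:
        try:
            result[name] = (Path(base) / name).read_text().strip()
        except OSError:
            # Module not loaded, parameter missing, or not readable.
            result[name] = None
    return result


if __name__ == "__main__":
    for name, value in read_module_params(["nested", "vls"]).items():
        shown = value if value is not None else "unavailable"
        print(f"kvm_amd.{name} = {shown}")
```

If "vls" still reads as enabled, the kernel command line or modprobe option
was probably not picked up.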

Somebody from AMD must be asked to confirm whether this really is a Ryzen
7000/8000 hardware bug, and whether there is a better fix than disabling VLS,
as it has a performance hit. If disabling it is the only fix, then
kvm_amd.vls=0 must be the default for Ryzen 7000/8000.

[1]
https://www.reddit.com/r/Proxmox/comments/1cym3pl/nested_virtualization_crashing_ryzen_7000_series/
[2]
https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/revision-guides/57095-PUB_1_01.pdf

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.


* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
@ 2024-08-03 14:59 ` bugzilla-daemon
  2024-08-23  7:36 ` bugzilla-daemon
                   ` (50 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-08-03 14:59 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

h4ck3r (michal.litwinczuk@op.pl) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |michal.litwinczuk@op.pl

--- Comment #1 from h4ck3r (michal.litwinczuk@op.pl) ---
To all Zen4 users!!!

I'm now experimenting with different settings and found a way to enable nested
virtualization with the host CPU type and no performance penalty.

Disabling ASPM in the BIOS does fix it for a Win11 guest (Proxmox, Hyper-V
enabled). Further testing is required, so anyone having this issue can
contribute by testing other guest operating systems.

For now it's the best solution until a fix from AMD arrives.
Unless it's irreparable with just a microcode update...


* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
  2024-08-03 14:59 ` [Bug 219009] " bugzilla-daemon
@ 2024-08-23  7:36 ` bugzilla-daemon
  2024-08-23  7:37 ` bugzilla-daemon
                   ` (49 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-08-23  7:36 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

Ben Hirlston (ozonehelix@gmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ozonehelix@gmail.com

--- Comment #2 from Ben Hirlston (ozonehelix@gmail.com) ---
I can confirm on my Ryzen 9 7900X30 system disabling ASPM helps. Kind of
relieved this is a bug. I was having this issue on my Ryzen 7 5700G system, but
to a lesser extent, and it got worse when I upgraded to the Ryzen 7000 series.


* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
  2024-08-03 14:59 ` [Bug 219009] " bugzilla-daemon
  2024-08-23  7:36 ` bugzilla-daemon
@ 2024-08-23  7:37 ` bugzilla-daemon
  2024-08-23 20:45 ` bugzilla-daemon
                   ` (48 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-08-23  7:37 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #3 from Ben Hirlston (ozonehelix@gmail.com) ---
(In reply to Ben Hirlston from comment #2)
> I can confirm on my Ryzen 9 7900X30 system disabling aspm helps. kind of
> relieved this is a bug. I was having this issue on my Ryzen 7 5700G system
> but to a lessor extent and it got worse when I upgraded to Ryzen 7000 series

There is a typo; I meant 7900X3D, sorry about that.


* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (2 preceding siblings ...)
  2024-08-23  7:37 ` bugzilla-daemon
@ 2024-08-23 20:45 ` bugzilla-daemon
  2024-08-23 20:49 ` bugzilla-daemon
                   ` (47 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-08-23 20:45 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #4 from Ben Hirlston (ozonehelix@gmail.com) ---
(In reply to Ben Hirlston from comment #3)
> (In reply to Ben Hirlston from comment #2)
> > I can confirm on my Ryzen 9 7900X30 system disabling aspm helps. kind of
> > relieved this is a bug. I was having this issue on my Ryzen 7 5700G system
> > but to a lessor extent and it got worse when I upgraded to Ryzen 7000
> series
> 
> there is a typo I meant 7900X3D sorry about that

No, I had to set kvm_amd.vls=0. The hit to performance wasn't as bad as I
thought, but disabling ASPM didn't help; I was wrong.


* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (3 preceding siblings ...)
  2024-08-23 20:45 ` bugzilla-daemon
@ 2024-08-23 20:49 ` bugzilla-daemon
  2024-08-23 21:08 ` bugzilla-daemon
                   ` (46 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-08-23 20:49 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

Sagnik Sasmal (sagnik@sagnik.me) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sagnik@sagnik.me

--- Comment #5 from Sagnik Sasmal (sagnik@sagnik.me) ---
I can confirm the same problem occurring as well on 7950X and 7950X3D CPUs.


* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (4 preceding siblings ...)
  2024-08-23 20:49 ` bugzilla-daemon
@ 2024-08-23 21:08 ` bugzilla-daemon
  2024-08-25 11:44 ` bugzilla-daemon
                   ` (45 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-08-23 21:08 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #6 from Ben Hirlston (ozonehelix@gmail.com) ---
Do we know if Ryzen 9000 has this issue? I know I had this issue on Ryzen 5000,
but to a lesser extent.


* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (5 preceding siblings ...)
  2024-08-23 21:08 ` bugzilla-daemon
@ 2024-08-25 11:44 ` bugzilla-daemon
  2024-08-25 11:45 ` bugzilla-daemon
                   ` (44 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-08-25 11:44 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #7 from h4ck3r (michal.litwinczuk@op.pl) ---
Update: disabling ASPM does not work, though it decreased the chance of a
random reboot. The issue might be related to the memory addressing method used
in Zen 4.


* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (6 preceding siblings ...)
  2024-08-25 11:44 ` bugzilla-daemon
@ 2024-08-25 11:45 ` bugzilla-daemon
  2024-08-26  0:07 ` bugzilla-daemon
                   ` (43 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-08-25 11:45 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #8 from h4ck3r (michal.litwinczuk@op.pl) ---
(In reply to Ben Hirlston from comment #6)
> do we know if Ryzen 9000 has this issue? I know I had this issue on Ryzen
> 5000 but to a lessor extent

Could you elaborate on what was happening with the 5000?
(reboots, MCE, something else)


* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (7 preceding siblings ...)
  2024-08-25 11:45 ` bugzilla-daemon
@ 2024-08-26  0:07 ` bugzilla-daemon
  2024-08-26  0:08 ` bugzilla-daemon
                   ` (42 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-08-26  0:07 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #9 from Ben Hirlston (ozonehelix@gmail.com) ---
(In reply to h4ck3r from comment #8)
> (In reply to Ben Hirlston from comment #6)
> > do we know if Ryzen 9000 has this issue? I know I had this issue on Ryzen
> > 5000 but to a lessor extent
> 
> Could you elaborate on what was happening with 5000?
> (reboots, mce, something other)

I would be using my virtual machine to run Windows 11 and doing something
intensive that was using VLS, and the machine would reset just like on the
7000, but it happened far less often.


* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (8 preceding siblings ...)
  2024-08-26  0:07 ` bugzilla-daemon
@ 2024-08-26  0:08 ` bugzilla-daemon
  2024-08-27 18:16 ` bugzilla-daemon
                   ` (41 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-08-26  0:08 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #10 from Ben Hirlston (ozonehelix@gmail.com) ---
(In reply to Ben Hirlston from comment #9)
> (In reply to h4ck3r from comment #8)
> > (In reply to Ben Hirlston from comment #6)
> > > do we know if Ryzen 9000 has this issue? I know I had this issue on Ryzen
> > > 5000 but to a lessor extent
> > 
> > Could you elaborate on what was happening with 5000?
> > (reboots, mce, something other)
> 
> I would be using my Virtual Machine to Windows Windows 11 and would be doing
> something intensive that was using vls and the machine would reset just like
> 7000 but it happened way less often

I no longer have that machine, so I can't test it anymore, but I remember it
happening on a biweekly to monthly basis.


* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (9 preceding siblings ...)
  2024-08-26  0:08 ` bugzilla-daemon
@ 2024-08-27 18:16 ` bugzilla-daemon
  2024-08-27 18:19 ` bugzilla-daemon
                   ` (40 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-08-27 18:16 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #11 from h4ck3r (michal.litwinczuk@op.pl) ---
I'm afraid it might be an overlooked issue that propagated to Zen 5.
If someone has Zen 5, let us know if nested virtualization also breaks on it.


* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (10 preceding siblings ...)
  2024-08-27 18:16 ` bugzilla-daemon
@ 2024-08-27 18:19 ` bugzilla-daemon
  2024-08-31  0:10 ` bugzilla-daemon
                   ` (39 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-08-27 18:19 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #12 from h4ck3r (michal.litwinczuk@op.pl) ---
The perspective of kernel and QEMU devs on this topic is also appreciated.


* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (11 preceding siblings ...)
  2024-08-27 18:19 ` bugzilla-daemon
@ 2024-08-31  0:10 ` bugzilla-daemon
  2024-08-31 11:51 ` bugzilla-daemon
                   ` (38 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-08-31  0:10 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

blake (blake@volian.org) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |blake@volian.org

--- Comment #13 from blake (blake@volian.org) ---
I recently experienced this. I built a proxmox cluster with 7950x. Every node
that I tested on would hard reset with no logs when a VM was doing nested
virtualization.

Our CI testing uses VMs, and putting the CI in a VM itself makes it pretty easy
to reproduce, just takes some time.

Setting kvm_amd.vls=0 seems to have resolved the issue, we had zero node resets
today and I was trying to force them.

Kernel is Proxmox's 6.8.12-1-pve.

Thanks,
Blake


* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (12 preceding siblings ...)
  2024-08-31  0:10 ` bugzilla-daemon
@ 2024-08-31 11:51 ` bugzilla-daemon
  2024-08-31 18:58 ` bugzilla-daemon
                   ` (37 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-08-31 11:51 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #14 from h4ck3r (michal.litwinczuk@op.pl) ---
(In reply to blake from comment #13)
> I recently experienced this. I built a proxmox cluster with 7950x. Every
> node that I tested on would hard reset with no logs when a VM was doing
> nested virtualization.
> 
> Our CI testing uses VMs, and putting the CI in a VM itself makes it pretty
> easy to reproduce, just takes some time.
> 
> Setting kvm_amd.vls=0 seems to have resolved the issue, we had zero node
> resets today and I was trying to force them.
> 
> Kernel is Proxmox's 6.8.12-1-pve.
> 
> Thanks,
> Blake

Disabling vls does help with crashes, but carries too much of a performance
penalty. In my case it would cap GPU utilization at 40% max (Proxmox, Win10/11,
Hyper-V enabled).


* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (13 preceding siblings ...)
  2024-08-31 11:51 ` bugzilla-daemon
@ 2024-08-31 18:58 ` bugzilla-daemon
  2024-08-31 21:51 ` bugzilla-daemon
                   ` (36 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-08-31 18:58 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #15 from Ben Hirlston (ozonehelix@gmail.com) ---
the kernel I am running for reference is 6.10.6-zen1-1-zen and I have a Ryzen 9
7900X3D


* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (14 preceding siblings ...)
  2024-08-31 18:58 ` bugzilla-daemon
@ 2024-08-31 21:51 ` bugzilla-daemon
  2024-08-31 22:54 ` bugzilla-daemon
                   ` (35 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-08-31 21:51 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #16 from blake (blake@volian.org) ---
At least for me, I haven't noticed any performance hit. These systems are all
headless though.

On Sat, Aug 31, 2024, at 6:51 AM, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=219009
> 
> --- Comment #14 from h4ck3r (michal.litwinczuk@op.pl) ---
> (In reply to blake from comment #13)
> > I recently experienced this. I built a proxmox cluster with 7950x. Every
> > node that I tested on would hard reset with no logs when a VM was doing
> > nested virtualization.
> > 
> > Our CI testing uses VMs, and putting the CI in a VM itself makes it pretty
> > easy to reproduce, just takes some time.
> > 
> > Setting kvm_amd.vls=0 seems to have resolved the issue, we had zero node
> > resets today and I was trying to force them.
> > 
> > Kernel is Proxmox's 6.8.12-1-pve.
> > 
> > Thanks,
> > Blake
> 
> Disabling vls does help with crashes, but has too much performance penalty.
> In my case it would lock gpu utilization at 40% max. (proxmox win10/11 hyperv
> enabled)


* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (15 preceding siblings ...)
  2024-08-31 21:51 ` bugzilla-daemon
@ 2024-08-31 22:54 ` bugzilla-daemon
  2024-10-02 22:52 ` bugzilla-daemon
                   ` (34 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-08-31 22:54 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #17 from Ben Hirlston (ozonehelix@gmail.com) ---
(In reply to blake from comment #16)
> At least for me, I haven't noticed any performance hit. These systems are
> all headless though.
> 
> On Sat, Aug 31, 2024, at 6:51 AM, bugzilla-daemon@kernel.org wrote:
> > https://bugzilla.kernel.org/show_bug.cgi?id=219009
> > 
> > --- Comment #14 from h4ck3r (michal.litwinczuk@op.pl) ---
> > (In reply to blake from comment #13)
> > > I recently experienced this. I built a proxmox cluster with 7950x. Every
> > > node that I tested on would hard reset with no logs when a VM was doing
> > > nested virtualization.
> > > 
> > > Our CI testing uses VMs, and putting the CI in a VM itself makes it
> pretty
> > > easy to reproduce, just takes some time.
> > > 
> > > Setting kvm_amd.vls=0 seems to have resolved the issue, we had zero node
> > > resets today and I was trying to force them.
> > > 
> > > Kernel is Proxmox's 6.8.12-1-pve.
> > > 
> > > Thanks,
> > > Blake
> > 
> > Disabling vls does help with crashes, but has too much performance penalty.
> > In my case it would lock gpu utilization at 40% max. (proxmox win10/11
> hyperv
> > enabled)

The performance hit for me hasn't been that bad, but disabling vls is the same
as disabling svm for the VM; it's more like telling the VM that the CPU is
incapable of nested virtualization. That is what I observed.


* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (16 preceding siblings ...)
  2024-08-31 22:54 ` bugzilla-daemon
@ 2024-10-02 22:52 ` bugzilla-daemon
  2024-10-02 22:53 ` bugzilla-daemon
                   ` (33 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-10-02 22:52 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #18 from Ben Hirlston (ozonehelix@gmail.com) ---
I was wondering if this issue might be related to this thread on Gentoo's bug
tracker:
https://bugs.gentoo.org/808990


* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (17 preceding siblings ...)
  2024-10-02 22:52 ` bugzilla-daemon
@ 2024-10-02 22:53 ` bugzilla-daemon
  2024-10-02 22:53 ` bugzilla-daemon
                   ` (32 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-10-02 22:53 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #19 from Ben Hirlston (ozonehelix@gmail.com) ---
My curiosity is whether this new bug was caused by the fixes for the CVEs
CVE-2021-3653 and CVE-2021-3656.


* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (18 preceding siblings ...)
  2024-10-02 22:53 ` bugzilla-daemon
@ 2024-10-02 22:53 ` bugzilla-daemon
  2024-10-03 15:03 ` bugzilla-daemon
                   ` (31 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-10-02 22:53 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #20 from Ben Hirlston (ozonehelix@gmail.com) ---
I am wondering if the fixes for those CVEs are related to this bug.


* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (19 preceding siblings ...)
  2024-10-02 22:53 ` bugzilla-daemon
@ 2024-10-03 15:03 ` bugzilla-daemon
  2024-10-03 15:05 ` bugzilla-daemon
                   ` (30 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-10-03 15:03 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

mlevitsk@redhat.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mlevitsk@redhat.com

--- Comment #21 from mlevitsk@redhat.com ---
Nope, these are not related, but do check if disabling AVIC helps (set
enable_avic parameter of kvm_amd to 0)


* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (20 preceding siblings ...)
  2024-10-03 15:03 ` bugzilla-daemon
@ 2024-10-03 15:05 ` bugzilla-daemon
  2024-10-03 15:11 ` [Bug 219009] New: " Maxim Levitsky
                   ` (29 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-10-03 15:05 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #22 from mlevitsk@redhat.com ---
I mean avic=0
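
For testers who prefer a persistent setting over editing the kernel command
line, a modprobe.d "options" line is one way to apply it. The sketch below
only builds that line; the helper name is hypothetical, and writing the result
to a file such as /etc/modprobe.d/kvm_amd.conf (file name chosen here purely
for illustration) and reloading kvm_amd or rebooting is left to the reader.

```python
def modprobe_options_line(module: str, params: dict[str, int]) -> str:
    """Build a modprobe.d 'options' line, with parameters in sorted order."""
    if not params:
        raise ValueError("at least one parameter is required")
    settings = " ".join(
        f"{name}={value}" for name, value in sorted(params.items())
    )
    return f"options {module} {settings}"


if __name__ == "__main__":
    # Test one knob at a time so the result stays attributable.
    print(modprobe_options_line("kvm_amd", {"avic": 0}))
    print(modprobe_options_line("kvm_amd", {"vls": 0}))
```

Changing only one parameter per test run keeps it clear which setting, if any,
makes the resets go away.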


* Re: [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (21 preceding siblings ...)
  2024-10-03 15:05 ` bugzilla-daemon
@ 2024-10-03 15:11 ` Maxim Levitsky
  2024-10-03 15:11 ` [Bug 219009] " bugzilla-daemon
                   ` (28 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: Maxim Levitsky @ 2024-10-03 15:11 UTC (permalink / raw)
  To: bugzilla-daemon, kvm
  Cc: Suthikulpanit, Suravee, Tom Lendacky, Sean Christopherson

On Sat, 2024-07-06 at 11:20 +0000, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=219009
> 
>             Bug ID: 219009
>            Summary: Random host reboots on Ryzen 7000/8000 using nested
>                     VMs (vls suspected)
>            Product: Virtualization
>            Version: unspecified
>           Hardware: AMD
>                 OS: Linux
>             Status: NEW
>           Severity: high
>           Priority: P3
>          Component: kvm
>           Assignee: virtualization_kvm@kernel-bugs.osdl.org
>           Reporter: zaltys@natrix.lt
>         Regression: No
> 
> Running nested VMs on AMD Ryzen 7000/8000 (ZEN4) CPUs results in random host's
> reboots.
> 
> There is no kernel panic, no log entries, no relevant output to serial console.
> It is as if platform is simply hard reset. It seems time to reproduce it varies
> from system to system and can be dependent on workload and even specific CPU
> model.
> 
> I can reproduce it with kernel 6.9.7 and qemu 9.0 on Ryzen 7950X3D under one
> hour by using KVM -> Windows 10/11 with Hyper-V services on or KVM -> Windows
> 10/11 with 3 VBox VMs (also Win11) running. Others people had it repeatedly
> reproduced on Ryzen 7700,7600 and 8700GE, including KVM -> KVM -> Linux.[1] I
> also have seen Hetzner (company offering Ryzen based dedicated servers)
> customers complaining about similiar random reboots.
> 
> I tried looking up errata for Ryzen 7000/8000, but could not find one
> published, so I decided to check errata for EPYC 9004 [2], which is also Zen4
> arch as Ryzen 7000/8000. It has nesting related bug #1495 (on page 49), which
> mentions using Virtualized VMLOAD/VMSAVE can result in MCE and/or system reset. 
> 
> Based on that errata mentioned above, I reconfigured my system with
> kvm_amd.vls=0 and for me random reboots with nested virtualization stopped.
> Same was reported by several people from [1].
> 
> Somebody from AMD must be asked to confirm if it is really Ryzen 7000/8000
> hardware bug, and if there is a better fix than disabling VLS as it has
> performance hit. If disabling it is the only fix, then kvm_amd.vls=0 must be
> default for Ryzen 7000/8000.
> 
> [1]
> https://www.reddit.com/r/Proxmox/comments/1cym3pl/nested_virtualization_crashing_ryzen_7000_series/
> [2]
> https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/revision-guides/57095-PUB_1_01.pdf
> 

Hi!

Can someone from AMD take a look at this bug:

From the bug report it appears that recent Zen4 CPUs have an erratum in their virtual VMLOAD/VMSAVE implementation,
which causes random host reboots (#MC?) when nesting is used, which is IMHO quite a serious issue.


Thanks,
Best regards,
       Maxim Levitsky


^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (22 preceding siblings ...)
  2024-10-03 15:11 ` [Bug 219009] New: " Maxim Levitsky
@ 2024-10-03 15:11 ` bugzilla-daemon
  2024-10-03 17:13 ` bugzilla-daemon
                   ` (27 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-10-03 15:11 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #23 from mlevitsk@redhat.com ---
On Sat, 2024-07-06 at 11:20 +0000, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=219009
> 
>             Bug ID: 219009
>            Summary: Random host reboots on Ryzen 7000/8000 using nested
>                     VMs (vls suspected)
>            Product: Virtualization
>            Version: unspecified
>           Hardware: AMD
>                 OS: Linux
>             Status: NEW
>           Severity: high
>           Priority: P3
>          Component: kvm
>           Assignee: virtualization_kvm@kernel-bugs.osdl.org
>           Reporter: zaltys@natrix.lt
>         Regression: No
> 
> Running nested VMs on AMD Ryzen 7000/8000 (ZEN4) CPUs results in random
> host's
> reboots.
> 
> There is no kernel panic, no log entries, no relevant output to serial
> console.
> It is as if platform is simply hard reset. It seems time to reproduce it
> varies
> from system to system and can be dependent on workload and even specific CPU
> model.
> 
> I can reproduce it with kernel 6.9.7 and qemu 9.0 on Ryzen 7950X3D under one
> hour by using KVM -> Windows 10/11 with Hyper-V services on or KVM -> Windows
> 10/11 with 3 VBox VMs (also Win11) running. Others people had it repeatedly
> reproduced on Ryzen 7700,7600 and 8700GE, including KVM -> KVM -> Linux.[1] I
> also have seen Hetzner (company offering Ryzen based dedicated servers)
> customers complaining about similiar random reboots.
> 
> I tried looking up errata for Ryzen 7000/8000, but could not find one
> published, so I decided to check errata for EPYC 9004 [2], which is also Zen4
> arch as Ryzen 7000/8000. It has nesting related bug #1495 (on page 49), which
> mentions using Virtualized VMLOAD/VMSAVE can result in MCE and/or system
> reset. 
> 
> Based on that errata mentioned above, I reconfigured my system with
> kvm_amd.vls=0 and for me random reboots with nested virtualization stopped.
> Same was reported by several people from [1].
> 
> Somebody from AMD must be asked to confirm if it is really Ryzen 7000/8000
> hardware bug, and if there is a better fix than disabling VLS as it has
> performance hit. If disabling it is the only fix, then kvm_amd.vls=0 must be
> default for Ryzen 7000/8000.
> 
> [1]
>
> https://www.reddit.com/r/Proxmox/comments/1cym3pl/nested_virtualization_crashing_ryzen_7000_series/
> [2]
>
> https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/revision-guides/57095-PUB_1_01.pdf
> 

Hi!

Can someone from AMD take a look at this bug:

From the bug report it appears that recent Zen4 CPUs have an erratum in their
virtual VMLOAD/VMSAVE implementation,
which causes random host reboots (#MC?) when nesting is used, which is IMHO
quite a serious issue.


Thanks,
Best regards,
       Maxim Levitsky

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (23 preceding siblings ...)
  2024-10-03 15:11 ` [Bug 219009] " bugzilla-daemon
@ 2024-10-03 17:13 ` bugzilla-daemon
  2024-10-08 17:32 ` bugzilla-daemon
                   ` (26 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-10-03 17:13 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #24 from Žilvinas Žaltiena (zaltys@natrix.lt) ---
Disabling avic does not help. I and some other people tried that a few months
ago.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (24 preceding siblings ...)
  2024-10-03 17:13 ` bugzilla-daemon
@ 2024-10-08 17:32 ` bugzilla-daemon
  2024-10-08 17:43 ` bugzilla-daemon
                   ` (25 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-10-08 17:32 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #25 from h4ck3r (michal.litwinczuk@op.pl) ---
I recently talked to a person who insisted they never had issues with their
guests running on the host CPU type without disabling vls.
From what I asked, it seems that all guests were Linux-based and lacked PCI
passthrough (newest Proxmox kernel as of the time of this post).

Further testing is required.

I'm going to spin up a Linux guest with approximately half of the host memory
(no ballooning) and without any external device attached, to see if stability
can be achieved that way.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (25 preceding siblings ...)
  2024-10-08 17:32 ` bugzilla-daemon
@ 2024-10-08 17:43 ` bugzilla-daemon
  2024-10-08 17:53 ` bugzilla-daemon
                   ` (24 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-10-08 17:43 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #26 from mlevitsk@redhat.com ---
But the question is - did they use nested virtualization on Linux actively and
with vls enabled?


The use case which causes the reboots as I understand is Hyperv enabled
Windows, in which case pretty much the whole Windows is running as a nested VM,
nested to the Hyperv hypervisor.

Once I get my hands on a client Zen4 machine (I only have Zen2 at home), I will
also try to reproduce this, but no promises on when this will happen.

Meanwhile I really hope that someone from AMD can take a look at this, and
either confirm that this is or will be fixed with a microcode patch or confirm
that we have to disable vls on the affected CPUs.

Best regards,
       Maxim Levitsky

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (26 preceding siblings ...)
  2024-10-08 17:43 ` bugzilla-daemon
@ 2024-10-08 17:53 ` bugzilla-daemon
  2024-10-08 18:26 ` bugzilla-daemon
                   ` (23 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-10-08 17:53 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #27 from Žilvinas Žaltiena (zaltys@natrix.lt) ---
(In reply to blake from comment #13)
> I recently experienced this. I built a proxmox cluster with 7950x. Every
> node that I tested on would hard reset with no logs when a VM was doing
> nested virtualization.
> 
> Our CI testing uses VMs, and putting the CI in a VM itself makes it pretty
> easy to reproduce, just takes some time.

Blake, what OS is used in your VMs ?

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (27 preceding siblings ...)
  2024-10-08 17:53 ` bugzilla-daemon
@ 2024-10-08 18:26 ` bugzilla-daemon
  2024-10-08 19:05 ` bugzilla-daemon
                   ` (22 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-10-08 18:26 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #28 from blake (blake@volian.org) ---
All of ours are running Debian Bookworm, stock kernel
linux-image-6.1.0-26-amd64. And then hosts are 6.8.12-2-pve

On Tue, Oct 8, 2024, at 12:53 PM, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=219009
> 
> --- Comment #27 from Žilvinas Žaltiena (zaltys@natrix.lt) ---
> (In reply to blake from comment #13)
> > I recently experienced this. I built a proxmox cluster with 7950x. Every
> > node that I tested on would hard reset with no logs when a VM was doing
> > nested virtualization.
> > 
> > Our CI testing uses VMs, and putting the CI in a VM itself makes it pretty
> > easy to reproduce, just takes some time.
> 
> Blake, what OS is used in your VMs ?

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (28 preceding siblings ...)
  2024-10-08 18:26 ` bugzilla-daemon
@ 2024-10-08 19:05 ` bugzilla-daemon
  2024-10-08 19:11 ` bugzilla-daemon
                   ` (21 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-10-08 19:05 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #29 from h4ck3r (michal.litwinczuk@op.pl) ---
(In reply to mlevitsk from comment #26)
> But the question is - did they use nested virtualization on Linux actively
> and with vls enabled?
> 
> 
> The use case which causes the reboots as I understand is Hyperv enabled
> Windows, in which case pretty much the whole Windows is running as a nested
> VM, nested to the Hyperv hypervisor.
> 
> Once I get my hands on a client Zen4 machine (I only have Zen2 at home), I
> will also try to reproduce this but not promises when this will happen. 
> 
> Meanwhile I really hope that someone from AMD can take a look a this, and
> either confirm that this is or will be fixed with a microcode patch or
> confirm that we have to disable vls on the affected CPUs.
> 
> Best regards,
>        Maxim Levitsky

Not really - most of them are microservice-type guests.
That would also mean there is less chance of corruption, since they occupied
less host memory.
Also, Windows somehow uses nested virt even if Hyper-V is not installed
(the installer does not, but a freshly booted guest crashed my node).

I'm afraid it might be an unresolvable issue, even with microcode.
At least, most things point to an issue similar to the memory leaks on their
iGPUs.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (29 preceding siblings ...)
  2024-10-08 19:05 ` bugzilla-daemon
@ 2024-10-08 19:11 ` bugzilla-daemon
  2024-10-08 21:35 ` bugzilla-daemon
                   ` (20 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-10-08 19:11 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #30 from h4ck3r (michal.litwinczuk@op.pl) ---
I've also seen some memory-related issues after overloading guest RAM.
For example, after the Windows guest had to write to swap, I saw PCIe
stutters which persisted even after freeing memory (v4 CPU type).
The same thing happened when the guest ran for too long.
Though that might be more of a vfio issue than anything else.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (30 preceding siblings ...)
  2024-10-08 19:11 ` bugzilla-daemon
@ 2024-10-08 21:35 ` bugzilla-daemon
  2024-10-16 13:33 ` bugzilla-daemon
                   ` (19 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-10-08 21:35 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #31 from Ben Hirlston (ozonehelix@gmail.com) ---
In my case I am on Arch Linux, running the Zen kernel (6.11.2-zen1-1-zen). I
need to disable vls for my VM to remain stable; if I don't, my system will
reboot after a random amount of time. Disabling vls stops that, but it makes
nested virtualization stop working, so I can't run WSL 2 or Hyper-V correctly
in my Windows 11 guest.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (31 preceding siblings ...)
  2024-10-08 21:35 ` bugzilla-daemon
@ 2024-10-16 13:33 ` bugzilla-daemon
  2024-10-16 18:04 ` bugzilla-daemon
                   ` (18 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-10-16 13:33 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #32 from mlevitsk@redhat.com ---
This is very interesting - disabling VLS should not have any effect other than
performance loss - KVM emulates VMLOAD/VMSAVE in this case, instead of the
hardware doing so.

@Ben Hirlston, can you double check that WSL2 works with VLS, and that without
VLS it doesn't?
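
When comparing runs with and without VLS, it helps to confirm which state the
host is actually in. Below is a minimal Python sketch, assuming the kvm_amd
module is loaded and exposes its parameters under
/sys/module/kvm_amd/parameters (the helper names are made up for illustration,
and the value may be printed as 1/0 or Y/N depending on the kernel):

```python
from pathlib import Path

# Assumed sysfs location of the kvm_amd parameters; only present while the
# module is loaded.
PARAM_DIR = Path("/sys/module/kvm_amd/parameters")


def parse_param(raw: str) -> bool:
    """Interpret a module parameter value printed as 1/0 or Y/N."""
    value = raw.strip()
    if value in ("1", "Y", "y"):
        return True
    if value in ("0", "N", "n"):
        return False
    raise ValueError(f"unexpected parameter value: {value!r}")


def read_param(name: str, param_dir: Path = PARAM_DIR):
    """Return the parameter as a bool, or None if it cannot be read."""
    try:
        return parse_param((param_dir / name).read_text())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    # "nested" and "vls" are the two parameters discussed in this thread.
    for name in ("nested", "vls"):
        print(f"kvm_amd.{name} = {read_param(name)}")
```

A result of None only means the value could not be read (for example, the
module is not loaded); it says nothing about the hardware.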

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (32 preceding siblings ...)
  2024-10-16 13:33 ` bugzilla-daemon
@ 2024-10-16 18:04 ` bugzilla-daemon
  2024-10-18  9:53 ` bugzilla-daemon
                   ` (17 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-10-16 18:04 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #33 from Ben Hirlston (ozonehelix@gmail.com) ---
I want to say it doesn't. If I ever got it to launch, it was probably
defaulting to WSL1 as a fallback.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (33 preceding siblings ...)
  2024-10-16 18:04 ` bugzilla-daemon
@ 2024-10-18  9:53 ` bugzilla-daemon
  2024-10-18 19:03 ` bugzilla-daemon
                   ` (16 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-10-18  9:53 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

isdennu (kernel@isdennu.ru) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kernel@isdennu.ru

--- Comment #34 from isdennu (kernel@isdennu.ru) ---
(In reply to h4ck3r from comment #29)
> (In reply to mlevitsk from comment #26)
> > But the question is - did they use nested virtualization on Linux actively
> > and with vls enabled?
> > 
> > 
> > The use case which causes the reboots as I understand is Hyperv enabled
> > Windows, in which case pretty much the whole Windows is running as a nested
> > VM, nested to the Hyperv hypervisor.
> > 
> > Once I get my hands on a client Zen4 machine (I only have Zen2 at home), I
> > will also try to reproduce this but not promises when this will happen. 
> > 
> > Meanwhile I really hope that someone from AMD can take a look a this, and
> > either confirm that this is or will be fixed with a microcode patch or
> > confirm that we have to disable vls on the affected CPUs.
> > 
> > Best regards,
> >        Maxim Levitsky
> 
> Not really - most of them are microservice type ones.
> That would also mean there is less chance of corruption since they occupied
> less host memory.
> And windows uses nested virt even if hyperv is not installed somehow.
> (installation does not, but freshly booted guest crashed my node)
> 
> Im afraid it might be unresolvable issue, even with microcode.
> At least most things point to similar issue as memory leaks on their igpus.

I have a 7900X on which I use the iGPU, plus an RX 7800 XT as dGPU (used for AI
work), on an ASUS ProArt X670E-CREATOR motherboard. I actively use
virtualization.
With the iGPU enabled, my host kept rebooting unpredictably even when I
disabled VLS, and even when I disabled nested virtualization completely. That
is, no advice from this thread helped me.
So I disabled the iGPU in the BIOS and re-enabled nested virtualization and VLS.
With this configuration the reboots continued. But after disabling VLS the
reboots disappeared, and now the PC works without fail.
I assume that the VLS problem and the use of the iGPU may be somehow related.

It would be very interesting to know if this problem is present on Ryzen 9000.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (34 preceding siblings ...)
  2024-10-18  9:53 ` bugzilla-daemon
@ 2024-10-18 19:03 ` bugzilla-daemon
  2024-10-21  9:43 ` bugzilla-daemon
                   ` (15 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-10-18 19:03 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #35 from Ben Hirlston (ozonehelix@gmail.com) ---
Yeah, I'm not using the iGPU in my 7900X3D at all. I have a 7800 XT and an RTX
3060 that is set up to run with the vfio-pci driver on my MSI MPG X670E CARBON
WIFI motherboard, and disabling VLS made my random reboots stop. So to hear that
having an iGPU enabled caused them to continue is interesting.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (35 preceding siblings ...)
  2024-10-18 19:03 ` bugzilla-daemon
@ 2024-10-21  9:43 ` bugzilla-daemon
  2024-10-24 14:37 ` bugzilla-daemon
                   ` (14 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-10-21  9:43 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #36 from h4ck3r (michal.litwinczuk@op.pl) ---
WSL and Hyper-V should still work with vls disabled on the host CPU type. They
will run much slower, though.

Since there is a memory leak on AM5 iGPUs, they might contribute to reboots.
That said, I have the iGPU enabled in the BIOS (neither connected nor used) and
never saw a reboot with vls disabled.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (36 preceding siblings ...)
  2024-10-21  9:43 ` bugzilla-daemon
@ 2024-10-24 14:37 ` bugzilla-daemon
  2024-11-05 17:22 ` bugzilla-daemon
                   ` (13 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-10-24 14:37 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

Simon Labrecque (simon@wegel.ca) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |simon@wegel.ca

--- Comment #37 from Simon Labrecque (simon@wegel.ca) ---
Wow, this bug sent me down a weird path :P I've been using a Windows VM daily
for the past 6 months without problems. Then suddenly, 2 days ago (I didn't
update the kernel or anything else), my PC started randomly rebooting under
load. After reading about problems with AMD CPUs (specifically the 7950X), I
bought a PSU and swapped it. Same problem. Then I swapped the CPU (for a 7600),
then the motherboard, then the memory... same. I could reproduce the problem
100% of the time, within 3 minutes, by putting the VM under load. AFAIK, there
was no MCE logged.

I finally stumbled on this thread, set kvm_amd.vls=0, and that indeed fully
fixed the problem.
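
For anyone unsure whether the parameter was really applied at boot: the sketch
below (Python, with a made-up helper name) looks for kvm_amd.vls on a kernel
command line such as the contents of /proc/cmdline. It assumes the parameter
was passed on the command line; if it was set through a modprobe configuration
line such as "options kvm_amd vls=0" instead, this check will not see it.

```python
from pathlib import Path


def cmdline_param(cmdline: str, name: str):
    """Return the last value given for `name` on a kernel command line, or None.

    Dashes and underscores are treated as equivalent in the parameter name,
    so "kvm-amd.vls" and "kvm_amd.vls" both match.
    """
    wanted = name.replace("-", "_")
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if sep and key.replace("-", "_") == wanted:
            value = val  # keep scanning: a later occurrence overrides this one
    return value


if __name__ == "__main__":
    path = Path("/proc/cmdline")
    if path.exists():
        print("kvm_amd.vls on cmdline:", cmdline_param(path.read_text(), "kvm_amd.vls"))
    else:
        print("/proc/cmdline not available on this system")
```

This only shows what was requested at boot, not what the module ended up using.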

I don't know how or why Hyper-V was suddenly enabled in my Windows VM, but
that's obviously what happened. When it did, it triggered this bug quite
violently.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (37 preceding siblings ...)
  2024-10-24 14:37 ` bugzilla-daemon
@ 2024-11-05 17:22 ` bugzilla-daemon
  2024-11-18 16:22 ` bugzilla-daemon
                   ` (12 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-11-05 17:22 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

Mario Limonciello (AMD) (mario.limonciello@amd.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |CODE_FIX

--- Comment #38 from Mario Limonciello (AMD) (mario.limonciello@amd.com) ---
Thanks everyone for your feedback and testing.

The following change will go into 6.12 and be backported to the stable kernels
to fix this issue.  It has essentially the same effect as kvm_amd.vls=0.

https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=x86/urgent&id=a5ca1dc46a6b610dd4627d8b633d6c84f9724ef0
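
One way to see whether a running kernel still advertises the feature is to look
at the CPU flags. The Python sketch below assumes that the feature shows up in
/proc/cpuinfo under the flag name "v_vmsave_vmload" and that a kernel carrying
the fix no longer lists it on affected CPUs; both are assumptions worth
verifying against your own kernel. The function names are illustrative only.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set:
    """Collect the feature flags of the first CPU listed in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def vls_advertised(cpuinfo_text: str, flag: str = "v_vmsave_vmload") -> bool:
    """True if the (assumed) virtualized VMLOAD/VMSAVE flag is listed."""
    return flag in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        print("virtualized VMLOAD/VMSAVE advertised:", vls_advertised(path.read_text()))
    else:
        print("/proc/cpuinfo not available on this system")
```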

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (38 preceding siblings ...)
  2024-11-05 17:22 ` bugzilla-daemon
@ 2024-11-18 16:22 ` bugzilla-daemon
  2024-11-18 16:48 ` bugzilla-daemon
                   ` (11 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-11-18 16:22 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

Sean (animusnull@fastmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |animusnull@fastmail.com

--- Comment #39 from Sean (animusnull@fastmail.com) ---
I've been encountering random reboots with a 7950X as well, oddly without
nested virtualization. I just came across this via a Phoronix article and tried
the kvm_amd.vls=0 argument. I've hit my third reboot today after trying to
amend the flag. I'm going to try bringing in the patch today.

I have two virtual machines I use. Neither is using nested virtualization:
- Windows 11 with VFIO and PCIe passthrough
- Linux guest with SPICE and OpenGL via virgl

Even with just the Linux guest it crashes.

My main distribution is NixOS with 6.6, but I've tried Fedora, Arch and Ubuntu
with 6.6, 6.8, 6.10 and 6.11. I've disabled power saving and tried a number of
different tweaks.

I've also swapped memory, motherboard, disks and power supply, updated the
BIOS, etc. The system started showing instability after I bumped the kernel to
6.x and updated to the BIOS released after the voltage issues with the Zen 4
CPUs. I was on 5.1X for a while due to bugs in GPU initialization while using
vfio.

There is a note above about the iGPU, which I'm using and experiencing a number
of issues with. I won't go into that; I expect it's due to 6.6, which I'm
locked to right now because of ZFS.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (39 preceding siblings ...)
  2024-11-18 16:22 ` bugzilla-daemon
@ 2024-11-18 16:48 ` bugzilla-daemon
  2024-11-20 19:36 ` bugzilla-daemon
                   ` (10 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-11-18 16:48 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #40 from Mario Limonciello (AMD) (mario.limonciello@amd.com) ---
It sounds like you have a separate stability issue; it should be filed as its
own bug.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (40 preceding siblings ...)
  2024-11-18 16:48 ` bugzilla-daemon
@ 2024-11-20 19:36 ` bugzilla-daemon
  2024-11-20 19:37 ` bugzilla-daemon
                   ` (9 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-11-20 19:36 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #41 from Ben Hirlston (ozonehelix@gmail.com) ---
I am testing Linux Zen 6.12 with the vls flag disabled, and so far I haven't
encountered a crash with WSL2 and Windows 7 in VirtualBox in my Windows
KVM/QEMU VM. I guess we will see if that changes with time.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (41 preceding siblings ...)
  2024-11-20 19:36 ` bugzilla-daemon
@ 2024-11-20 19:37 ` bugzilla-daemon
  2024-11-20 19:39 ` bugzilla-daemon
                   ` (8 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-11-20 19:37 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #42 from Ben Hirlston (ozonehelix@gmail.com) ---
To test this, I fired up my KVM/QEMU Windows 11 VM and installed VirtualBox and
WSL2 in it, to see if running virtual machines inside KVM/QEMU would trigger
the crash. I've been testing for 20 minutes and have yet to see a crash.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (42 preceding siblings ...)
  2024-11-20 19:37 ` bugzilla-daemon
@ 2024-11-20 19:39 ` bugzilla-daemon
  2024-11-23  1:06 ` bugzilla-daemon
                   ` (7 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-11-20 19:39 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #43 from Ben Hirlston (ozonehelix@gmail.com) ---
I obtained Linux Zen 6.12 by going to the extra-testing repo on archlinux.org
and running sudo pacman -U linux-zen-version linux-zen-headers-version to
upgrade to those packages.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (43 preceding siblings ...)
  2024-11-20 19:39 ` bugzilla-daemon
@ 2024-11-23  1:06 ` bugzilla-daemon
  2025-02-21  2:00 ` bugzilla-daemon
                   ` (6 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2024-11-23  1:06 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

Ed Tomlinson (edtoml@gmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |edtoml@gmail.com

--- Comment #44 from Ed Tomlinson (edtoml@gmail.com) ---
I applied the patch for this erratum on Saturday; seven days later, no resets.
I am not using VMs.  I do run Arch, and their kernel is built with
virtualization support and warns "kernel: Booting paravirtualized kernel on
bare hardware".  I wonder if VMLOAD/VMSAVE gets used somewhere in the kernel.
The closest I come to running a VM is using an AppImage.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (44 preceding siblings ...)
  2024-11-23  1:06 ` bugzilla-daemon
@ 2025-02-21  2:00 ` bugzilla-daemon
  2025-02-21 19:45 ` bugzilla-daemon
                   ` (5 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2025-02-21  2:00 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

Christian Haefeli (chaefeli@angband.ch) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |chaefeli@angband.ch

--- Comment #45 from Christian Haefeli (chaefeli@angband.ch) ---
(In reply to Mario Limonciello (AMD) from comment #38)
> Thanks everyone for your feedback and testing.
> 
> The following change will go into 6.12 and back to the stable kernels to fix
> this issue.  It is essentially doing the same effect that kvm_amd.vls=0 did.
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=x86/urgent&id=a5ca1dc46a6b610dd4627d8b633d6c84f9724ef0

Hello
I would prefer the more flexible solution via kvm_amd module parameter.
I am not having this issue with an EPYC 4244P but am now losing performance
due to this hard-coded workaround. Does my CPU also qualify as a 'Zen4 Client
SoC'?


Regards
Christian

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (45 preceding siblings ...)
  2025-02-21  2:00 ` bugzilla-daemon
@ 2025-02-21 19:45 ` bugzilla-daemon
  2025-02-26  1:08 ` bugzilla-daemon
                   ` (4 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2025-02-21 19:45 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #46 from Mario Limonciello (AMD) (mario.limonciello@amd.com) ---
> I would prefer the more flexible solution via kvm_amd module parameter.
> I am not having this issue

The failure rate is low, but it's statistically significant.
Feel free to revert it on your local kernel tree if you're not hitting the
issue.

> due to this hard coded workaround.

FWIW If you have a fixed BIOS, you will have the same result.

> Does my CPU also qualify as a 'Zen4 Client SoC? 

Yes, it also affects any CPUs leveraged from client CPUs.
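
To check which group a given host falls into, one option is to compare the
family and model numbers from /proc/cpuinfo against the ranges the kernel
change matches. The Python sketch below is only illustrative. The family
(0x19) and the model ranges (0x18-0x1f and 0x60-0x7f) are an assumption about
what the referenced commit checks; they are not stated in this thread and
should be verified against the commit itself. The function names are
hypothetical.

```python
from typing import Optional, Tuple

# ASSUMPTION: family/model ranges believed to be matched by the workaround.
# They are not taken from this thread; verify against the kernel commit.
ASSUMED_FAMILY = 0x19
ASSUMED_MODEL_RANGES = ((0x18, 0x1F), (0x60, 0x7F))


def parse_family_model(cpuinfo_text: str) -> Optional[Tuple[int, int]]:
    """Return (family, model) of the first CPU in /proc/cpuinfo-style text."""
    family = model = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key = key.strip()
        if key == "cpu family" and family is None:
            family = int(value)
        elif key == "model" and model is None:
            model = int(value)
        if family is not None and model is not None:
            return family, model
    return None


def matches_assumed_workaround(family: int, model: int) -> bool:
    """True if the CPU falls in the assumed Zen4 client family/model ranges."""
    return family == ASSUMED_FAMILY and any(
        low <= model <= high for low, high in ASSUMED_MODEL_RANGES
    )


def main() -> None:
    try:
        with open("/proc/cpuinfo") as handle:
            parsed = parse_family_model(handle.read())
    except OSError:
        parsed = None
    if parsed is None:
        print("could not determine CPU family/model")
        return
    family, model = parsed
    print(f"family {family:#x}, model {model:#x}:",
          "in assumed workaround range" if matches_assumed_workaround(family, model)
          else "outside assumed workaround range")


if __name__ == "__main__":
    main()
```

A match only means the kernel would apply the workaround under these assumed
ranges; it says nothing about whether a particular chip would actually hit the
resets.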

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (46 preceding siblings ...)
  2025-02-21 19:45 ` bugzilla-daemon
@ 2025-02-26  1:08 ` bugzilla-daemon
  2025-02-26  1:10 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2025-02-26  1:08 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #47 from Christian Haefeli (chaefeli@angband.ch) ---
(In reply to Mario Limonciello (AMD) from comment #46)
> > I would prefer the more flexible solution via kvm_amd module parameter.
> > I am not having this issue
> 
> The failure rate is low, but it's statistically significant.
> Feel free to revert it on your local kernel tree if you're not hitting the
> issue.

I do not have the time to maintain a local kernel tree... IMHO this is an
extreme measure in relation to a low failure rate. It looks like an after-sales
castration to me.
> 
> > due to this hard coded workaround.
> 
> FWIW If you have a fixed BIOS, you will have the same result.

So you are saying that from a specific AGESA version onward, these instructions
are being hidden from the OS? If yes, is this also the case for Zen5-based
Client SoC SKUs?

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (47 preceding siblings ...)
  2025-02-26  1:08 ` bugzilla-daemon
@ 2025-02-26  1:10 ` bugzilla-daemon
  2025-02-26  9:50 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2025-02-26  1:10 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #48 from Ben Hirlston (ozonehelix@gmail.com) ---
I haven't had issues with this since 6.11; 6.12 fixed this for me. For context,
I have a Ryzen 9 7900X3D.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (48 preceding siblings ...)
  2025-02-26  1:10 ` bugzilla-daemon
@ 2025-02-26  9:50 ` bugzilla-daemon
  2025-02-27 13:26 ` bugzilla-daemon
  2025-03-06 22:19 ` bugzilla-daemon
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2025-02-26  9:50 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #49 from Christian Haefeli (chaefeli@angband.ch) ---
(In reply to Ben Hirlston from comment #48)
> I haven't had issues with this since 6.11 6.12 fixed this for me. for
> context I have an Ryzen 9 7900X3D

Yes, you most likely own an affected CPU. This in-kernel change fixes the
stability issue you encountered, but at the cost of performance. For people
like me, whose CPUs are not affected, it is even worse: we lose performance
for nothing. IMHO the proper solution would be for AMD to offer a CPU swap to
affected customers.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (49 preceding siblings ...)
  2025-02-26  9:50 ` bugzilla-daemon
@ 2025-02-27 13:26 ` bugzilla-daemon
  2025-03-06 22:19 ` bugzilla-daemon
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2025-02-27 13:26 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #50 from h4ck3r (michal.litwinczuk@op.pl) ---
(In reply to Christian Haefeli from comment #49)

They should at least announce the issue globally.
I bought a new CPU specifically for nested virtualization workloads.
It turns out that the "patch" prevents Windows from using nested
virtualization; it complains about missing features.

A bigger issue: as far as I know, Windows does not have the patch applied.
This means that a Windows host with Hyper-V enabled can crash at any time.

Back to Linux: in my testing, disabling vls carries a performance penalty of
around 10-20%.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Bug 219009] Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected)
  2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
                   ` (50 preceding siblings ...)
  2025-02-27 13:26 ` bugzilla-daemon
@ 2025-03-06 22:19 ` bugzilla-daemon
  51 siblings, 0 replies; 53+ messages in thread
From: bugzilla-daemon @ 2025-03-06 22:19 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=219009

--- Comment #51 from h4ck3r (h4ck3re404@gmail.com) ---
I finally had time to test the differences more clearly.

The CPUs I used were a 7700X and a 9700X.

The guest OS is Windows (Linux didn't have most of the issues in the first
place).

The "no nested virt" warning with WSL still persists (probably Windows being
Windows, because Zen5-only guests don't have it).

The PCIe "slowdown" is gone on Zen5 (no performance penalty on the GPU).

No issues with NVMe drive passthrough (on Zen4 it would behave like a broken
HDD, with huge latency and increased temperatures for some reason, probably
IRQ handling).

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 53+ messages in thread

end of thread, other threads:[~2025-03-06 22:19 UTC | newest]

Thread overview: 53+ messages
2024-07-06 11:20 [Bug 219009] New: Random host reboots on Ryzen 7000/8000 using nested VMs (vls suspected) bugzilla-daemon
2024-08-03 14:59 ` [Bug 219009] " bugzilla-daemon
2024-08-23  7:36 ` bugzilla-daemon
2024-08-23  7:37 ` bugzilla-daemon
2024-08-23 20:45 ` bugzilla-daemon
2024-08-23 20:49 ` bugzilla-daemon
2024-08-23 21:08 ` bugzilla-daemon
2024-08-25 11:44 ` bugzilla-daemon
2024-08-25 11:45 ` bugzilla-daemon
2024-08-26  0:07 ` bugzilla-daemon
2024-08-26  0:08 ` bugzilla-daemon
2024-08-27 18:16 ` bugzilla-daemon
2024-08-27 18:19 ` bugzilla-daemon
2024-08-31  0:10 ` bugzilla-daemon
2024-08-31 11:51 ` bugzilla-daemon
2024-08-31 18:58 ` bugzilla-daemon
2024-08-31 21:51 ` bugzilla-daemon
2024-08-31 22:54 ` bugzilla-daemon
2024-10-02 22:52 ` bugzilla-daemon
2024-10-02 22:53 ` bugzilla-daemon
2024-10-02 22:53 ` bugzilla-daemon
2024-10-03 15:03 ` bugzilla-daemon
2024-10-03 15:05 ` bugzilla-daemon
2024-10-03 15:11 ` [Bug 219009] New: " Maxim Levitsky
2024-10-03 15:11 ` [Bug 219009] " bugzilla-daemon
2024-10-03 17:13 ` bugzilla-daemon
2024-10-08 17:32 ` bugzilla-daemon
2024-10-08 17:43 ` bugzilla-daemon
2024-10-08 17:53 ` bugzilla-daemon
2024-10-08 18:26 ` bugzilla-daemon
2024-10-08 19:05 ` bugzilla-daemon
2024-10-08 19:11 ` bugzilla-daemon
2024-10-08 21:35 ` bugzilla-daemon
2024-10-16 13:33 ` bugzilla-daemon
2024-10-16 18:04 ` bugzilla-daemon
2024-10-18  9:53 ` bugzilla-daemon
2024-10-18 19:03 ` bugzilla-daemon
2024-10-21  9:43 ` bugzilla-daemon
2024-10-24 14:37 ` bugzilla-daemon
2024-11-05 17:22 ` bugzilla-daemon
2024-11-18 16:22 ` bugzilla-daemon
2024-11-18 16:48 ` bugzilla-daemon
2024-11-20 19:36 ` bugzilla-daemon
2024-11-20 19:37 ` bugzilla-daemon
2024-11-20 19:39 ` bugzilla-daemon
2024-11-23  1:06 ` bugzilla-daemon
2025-02-21  2:00 ` bugzilla-daemon
2025-02-21 19:45 ` bugzilla-daemon
2025-02-26  1:08 ` bugzilla-daemon
2025-02-26  1:10 ` bugzilla-daemon
2025-02-26  9:50 ` bugzilla-daemon
2025-02-27 13:26 ` bugzilla-daemon
2025-03-06 22:19 ` bugzilla-daemon
