* [PATCH v2] KVM: x86: Unconditionally recompute CR8 intercept on PPR update
From: Carlos López @ 2026-06-10 21:45 UTC
  To: kvm, seanjc, pbonzini
  Cc: osteffen, Carlos López, Stefano Garzarella, stable,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), H. Peter Anvin,
	Roman Kagan, open list:X86 ARCHITECTURE (32-BIT AND 64-BIT)

The TPR_THRESHOLD field in the VMCS is used by VMX to induce VM exits
when the guest's virtual TPR falls under the specified threshold,
allowing KVM to inject previously masked interrupts.

KVM handles these VM exits in handle_tpr_below_threshold().
Commit eb90f3417a0c ("KVM: vmx: speed up TPR below threshold vmexits")
optimized this function by calling apic_update_ppr() instead of raising
KVM_REQ_EVENT. apic_update_ppr() then raises KVM_REQ_EVENT if there is
a pending, deliverable interrupt.

However, if there are no new interrupts pending, apic_update_ppr() does
not issue the request. Thus, kvm_lapic_update_cr8_intercept() and
vmx_update_cr8_intercept() are not called before VM entry, which results
in a high, stale TPR_THRESHOLD. This is problematic due to the following
sentence in 28.2.1.1 "VM-Execution Control Fields" in the SDM:

  The following check is performed if the “use TPR shadow” VM-execution
  control is 1 and the “virtualize APIC accesses” and “virtual-interrupt
  delivery” VM-execution controls are both 0: the value of bits 3:0 of
  the TPR threshold VM-execution control field should not be greater
  than the value of bits 7:4 of VTPR.

This error condition is typically not observed when KVM runs on a bare
metal system because modern processors support APICv, which enables
virtual-interrupt delivery, and which KVM uses when possible. This
causes the processor to no longer generate TPR-below-threshold exits
and to no longer check TPR_THRESHOLD on entry. However, when running
on older platforms, or under nested virtualization on a hypervisor that
does not support virtual-interrupt delivery and enforces this check
(like Hyper-V), this can cause a VM entry failure with hardware error
0x7, as seen in [1].
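
For illustration only, the consistency check quoted above can be sketched
in C roughly as follows. This is a minimal model, not KVM or SDM code: the
function name and its parameters are hypothetical and stand in for the
corresponding VMCS fields and VM-execution controls.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the VM-entry check on the TPR threshold.
 * Returns true if the check passes (or does not apply), false if
 * VM entry would fail because of it.
 */
static bool tpr_threshold_check_ok(uint32_t tpr_threshold, uint8_t vtpr,
				   bool use_tpr_shadow,
				   bool virt_apic_accesses,
				   bool virt_intr_delivery)
{
	/* The check applies only with TPR shadow on and both other controls off. */
	if (!use_tpr_shadow || virt_apic_accesses || virt_intr_delivery)
		return true;

	/* Bits 3:0 of the threshold must not exceed bits 7:4 of VTPR. */
	return (tpr_threshold & 0xf) <= ((vtpr >> 4) & 0xf);
}
```

In this model, a stale threshold of 8 with VTPR 0x30 fails the check
unless virtual-interrupt delivery is enabled, in which case the check is
skipped.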

Call kvm_lapic_update_cr8_intercept() if apic_update_ppr() does not
find a deliverable interrupt (and thus does not raise KVM_REQ_EVENT).
Remove calls to kvm_lapic_update_cr8_intercept() on paths that end up in
apic_update_ppr(), as they now become redundant. This ensures that any
path that updates the guest's PPR also figures out if KVM needs to wait
for a TPR change (using TPR_THRESHOLD on VMX or CR8 intercepts on SVM).
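
As a rough illustration of the change described above, here is a
simplified, self-contained C model. Everything in it is an assumption made
for the sketch: struct toy_vcpu, the toy_* helpers and the with_fix flag
are hypothetical, priorities are reduced to 4-bit classes, and the
threshold rule is a simplification of what the CR8 intercept update does.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified vCPU state; not a KVM structure. */
struct toy_vcpu {
	int tpr;           /* guest TPR priority class (VTPR bits 7:4) */
	int isr_class;     /* class of highest in-service IRQ, -1 if none */
	int irr_class;     /* class of highest pending IRQ, -1 if none */
	int ppr;           /* last computed processor priority class */
	int tpr_threshold; /* modeled TPR_THRESHOLD bits 3:0 */
	bool req_event;    /* modeled KVM_REQ_EVENT */
};

/* Modeled threshold update: wait for a TPR drop only if an IRQ is masked by TPR. */
static void toy_update_cr8_intercept(struct toy_vcpu *v)
{
	if (v->irr_class == -1 || v->tpr < v->irr_class)
		v->tpr_threshold = 0;
	else
		v->tpr_threshold = v->irr_class;
}

/* Modeled PPR update; with_fix selects the behavior proposed by this patch. */
static void toy_update_ppr(struct toy_vcpu *v, bool with_fix)
{
	int old_ppr = v->ppr;

	v->ppr = v->tpr > v->isr_class ? v->tpr : v->isr_class;

	if (v->ppr < old_ppr && v->irr_class > v->ppr)
		v->req_event = true;		/* deliverable IRQ: full event processing */
	else if (with_fix)
		toy_update_cr8_intercept(v);	/* otherwise refresh the threshold here */
}
```

With tpr = 3, isr_class = 6, irr_class = 5, ppr = 8 and a previous
threshold of 5, the model without the fix leaves the threshold at 5, which
is above the TPR class of 3. With the fix, the threshold is recomputed to 0.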

Link: https://github.com/coconut-svsm/svsm/issues/1081 [1]
Tested-by: Stefano Garzarella <sgarzare@redhat.com>
Cc: stable@vger.kernel.org
Fixes: eb90f3417a0c ("KVM: vmx: speed up TPR below threshold vmexits")
Signed-off-by: Carlos López <clopez@suse.de>
---
v2:
* Call kvm_lapic_update_cr8_intercept() from apic_update_ppr() instead
  of issuing KVM_REQ_EVENT from handle_tpr_below_threshold().
 arch/x86/kvm/lapic.c | 2 ++
 arch/x86/kvm/x86.c   | 5 +----
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 9d2df8623f6d..f6a289d01a26 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -980,6 +980,8 @@ static void apic_update_ppr(struct kvm_lapic *apic)
 	if (__apic_update_ppr(apic, &ppr) &&
 	    apic_has_interrupt_for_ppr(apic, ppr) != -1)
 		kvm_make_request(KVM_REQ_EVENT, apic->vcpu);
+	else
+		kvm_lapic_update_cr8_intercept(apic->vcpu);
 }
 
 void kvm_apic_update_ppr(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index cf122b8c3210..6662c8e973f2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5317,7 +5317,6 @@ static int kvm_vcpu_ioctl_set_lapic(struct kvm_vcpu *vcpu,
 	r = kvm_apic_set_state(vcpu, s);
 	if (r)
 		return r;
-	kvm_lapic_update_cr8_intercept(vcpu);
 
 	return 0;
 }
@@ -12418,8 +12417,6 @@ static int __set_sregs_common(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs,
 	kvm_register_mark_dirty(vcpu, VCPU_REG_CR3);
 	kvm_x86_call(post_set_cr3)(vcpu, sregs->cr3);
 
-	kvm_set_cr8(vcpu, sregs->cr8);
-
 	*mmu_reset_needed |= vcpu->arch.efer != sregs->efer;
 	kvm_x86_call(set_efer)(vcpu, sregs->efer);
 
@@ -12448,7 +12445,7 @@ static int __set_sregs_common(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs,
 	kvm_set_segment(vcpu, &sregs->tr, VCPU_SREG_TR);
 	kvm_set_segment(vcpu, &sregs->ldt, VCPU_SREG_LDTR);
 
-	kvm_lapic_update_cr8_intercept(vcpu);
+	kvm_set_cr8(vcpu, sregs->cr8);
 
 	/* Older userspace won't unhalt the vcpu on reset. */
 	if (kvm_vcpu_is_bsp(vcpu) && kvm_rip_read(vcpu) == 0xfff0 &&

base-commit: c1f7303302927f9cbf4efedf70f0512cde168c65
-- 
2.51.0



* [syzbot ci] Re: KVM: x86: Unconditionally recompute CR8 intercept on PPR update
From: syzbot ci @ 2026-06-11 16:15 UTC
  To: bp, clopez, dave.hansen, hpa, kvm, linux-kernel, mingo, osteffen,
	pbonzini, rkagan, seanjc, sgarzare, stable, tglx, x86
  Cc: syzbot, syzkaller-bugs

syzbot ci has tested the following series

[v2] KVM: x86: Unconditionally recompute CR8 intercept on PPR update
https://lore.kernel.org/all/20260610214523.2905255-2-clopez@suse.de
* [PATCH v2] KVM: x86: Unconditionally recompute CR8 intercept on PPR update

and found the following issue:
WARNING in vmx_update_cr8_intercept

Full report is available here:
https://ci.syzbot.org/series/d94aebb2-8082-4777-ab08-5c3a0d680bed

***

WARNING in vmx_update_cr8_intercept

tree:      linux-next
URL:       https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next
base:      c1f7303302927f9cbf4efedf70f0512cde168c65
arch:      amd64
compiler:  Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
config:    https://ci.syzbot.org/builds/8f417377-50a6-450a-8ce0-a83de33b8c6d/config
syz repro: https://ci.syzbot.org/findings/62a660e6-a9b1-42ca-9cf0-7aadd2f5d292/syz_repro

------------[ cut here ]------------
debug_locks && !(lock_is_held(&(&vcpu->mutex)->dep_map) || !refcount_read(&vcpu->kvm->users_count))
WARNING: arch/x86/kvm/vmx/nested.h:61 at get_vmcs12 arch/x86/kvm/vmx/nested.h:60 [inline], CPU#0: syz.2.19/5879
WARNING: arch/x86/kvm/vmx/nested.h:61 at vmx_update_cr8_intercept+0x3de/0x4e0 arch/x86/kvm/vmx/vmx.c:6879, CPU#0: syz.2.19/5879
Modules linked in:
CPU: 0 UID: 0 PID: 5879 Comm: syz.2.19 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
RIP: 0010:get_vmcs12 arch/x86/kvm/vmx/nested.h:60 [inline]
RIP: 0010:vmx_update_cr8_intercept+0x3de/0x4e0 arch/x86/kvm/vmx/vmx.c:6879
Code: 0b 90 e9 f1 fe ff ff e8 30 12 69 00 90 0f 0b 90 e9 59 fe ff ff e8 22 12 69 00 e8 ad 86 d6 ff e9 ca fe ff ff e8 13 12 69 00 90 <0f> 0b 90 e9 fc fc ff ff e8 05 12 69 00 e8 90 86 d6 ff eb a7 48 c7
RSP: 0018:ffffc9000271f758 EFLAGS: 00010293
RAX: ffffffff815d048d RBX: ffff888113380000 RCX: ffff8881142b8000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
RBP: 00000000ffffffff R08: ffff8881114d9703 R09: 1ffff1102229b2e0
R10: dffffc0000000000 R11: ffffed102229b2e1 R12: 0000000000000000
R13: dffffc0000000000 R14: ffff888116d0bca0 R15: 0000000000000001
FS:  00007ff34d6b86c0(0000) GS:ffff88818dc9b000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffc75e78ae8 CR3: 000000010c609000 CR4: 0000000000352ef0
Call Trace:
 <TASK>
 apic_update_ppr arch/x86/kvm/lapic.c:984 [inline]
 kvm_lapic_reset+0x1c24/0x2980 arch/x86/kvm/lapic.c:3023
 kvm_vcpu_reset+0x44c/0x1bf0 arch/x86/kvm/x86.c:12986
 kvm_arch_vcpu_create+0x746/0x8b0 arch/x86/kvm/x86.c:12847
 kvm_vm_ioctl_create_vcpu+0x428/0x930 virt/kvm/kvm_main.c:4201
 kvm_vm_ioctl+0x893/0xd50 virt/kvm/kvm_main.c:5159
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:597 [inline]
 __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:583
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ff34c79ce59
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ff34d6b8028 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007ff34ca15fa0 RCX: 00007ff34c79ce59
RDX: 0000000000000002 RSI: 000000000000ae41 RDI: 0000000000000004
RBP: 00007ff34c832d6f R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ff34ca16038 R14: 00007ff34ca15fa0 R15: 00007ffe4bc28aa8
 </TASK>


***

If these findings have caused you to resend the series or submit a
separate fix, please add the following tag to your commit message:
  Tested-by: syzbot@syzkaller.appspotmail.com

---
This report is generated by a bot. It may contain errors.
syzbot ci engineers can be reached at syzkaller@googlegroups.com.

To test a patch for this bug, please reply with `#syz test`
(should be on a separate line).

The patch should be attached to the email.
Note: arguments like custom git repos and branches are not supported.


* Re: [syzbot ci] Re: KVM: x86: Unconditionally recompute CR8 intercept on PPR update
From: Sean Christopherson @ 2026-06-11 17:20 UTC
  To: syzbot ci
  Cc: bp, clopez, dave.hansen, hpa, kvm, linux-kernel, mingo, osteffen,
	pbonzini, rkagan, sgarzare, stable, tglx, x86, syzbot,
	syzkaller-bugs

On Thu, Jun 11, 2026, syzbot ci wrote:
> syzbot ci has tested the following series
> 
> [v2] KVM: x86: Unconditionally recompute CR8 intercept on PPR update
> https://lore.kernel.org/all/20260610214523.2905255-2-clopez@suse.de
> * [PATCH v2] KVM: x86: Unconditionally recompute CR8 intercept on PPR update
> 
> and found the following issue:
> WARNING in vmx_update_cr8_intercept

...

> ------------[ cut here ]------------
> debug_locks && !(lock_is_held(&(&vcpu->mutex)->dep_map) || !refcount_read(&vcpu->kvm->users_count))
> WARNING: arch/x86/kvm/vmx/nested.h:61 at get_vmcs12 arch/x86/kvm/vmx/nested.h:60 [inline], CPU#0: syz.2.19/5879
> WARNING: arch/x86/kvm/vmx/nested.h:61 at vmx_update_cr8_intercept+0x3de/0x4e0 arch/x86/kvm/vmx/vmx.c:6879, CPU#0: syz.2.19/5879
> Modules linked in:
> CPU: 0 UID: 0 PID: 5879 Comm: syz.2.19 Not tainted syzkaller #0 PREEMPT(full) 
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> RIP: 0010:get_vmcs12 arch/x86/kvm/vmx/nested.h:60 [inline]
> RIP: 0010:vmx_update_cr8_intercept+0x3de/0x4e0 arch/x86/kvm/vmx/vmx.c:6879
>  apic_update_ppr arch/x86/kvm/lapic.c:984 [inline]
>  kvm_lapic_reset+0x1c24/0x2980 arch/x86/kvm/lapic.c:3023
>  kvm_vcpu_reset+0x44c/0x1bf0 arch/x86/kvm/x86.c:12986
>  kvm_arch_vcpu_create+0x746/0x8b0 arch/x86/kvm/x86.c:12847
>  kvm_vm_ioctl_create_vcpu+0x428/0x930 virt/kvm/kvm_main.c:4201
>  kvm_vm_ioctl+0x893/0xd50 virt/kvm/kvm_main.c:5159
>  vfs_ioctl fs/ioctl.c:51 [inline]
>  __do_sys_ioctl fs/ioctl.c:597 [inline]
>  __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:583
>  do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>  do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
>  entry_SYSCALL_64_after_hwframe+0x77/0x7f

This is "fine", the assertion just wants to make sure KVM isn't accessing vmcs12
without holding vcpu->mutex, otherwise any queries are inherently unstable.
It's just that vCPU creation runs without taking vcpu->mutex, because the vCPU
is otherwise unreachable.

I'm pretty sure we can squash the WARN by grabbing vmcs12 if and only if the vCPU
is actually in guest mode.

diff --git arch/x86/kvm/vmx/vmx.c arch/x86/kvm/vmx/vmx.c
index c548f22375ad..332fbcd924f2 100644
--- arch/x86/kvm/vmx/vmx.c
+++ arch/x86/kvm/vmx/vmx.c
@@ -6876,11 +6876,10 @@ int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
 
 void vmx_update_cr8_intercept(struct kvm_vcpu *vcpu, int tpr, int irr)
 {
-       struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
        int tpr_threshold;
 
        if (is_guest_mode(vcpu) &&
-               nested_cpu_has(vmcs12, CPU_BASED_TPR_SHADOW))
+           nested_cpu_has(get_vmcs12(vcpu), CPU_BASED_TPR_SHADOW))
                return;
 
        guard(vmx_vmcs01)(vcpu);
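
To illustrate why moving the lookup into the condition helps, here is a
small standalone C sketch. The names (lookups, toy_nested_has_tpr_shadow,
toy_skip_eager, toy_skip_lazy) are hypothetical stand-ins, not KVM
functions; the point is only that && short-circuits, so the guarded
accessor is never called when the first operand is false.

```c
#include <stdbool.h>

static int lookups; /* counts calls to the guarded accessor */

/* Hypothetical stand-in for an accessor that asserts locking rules. */
static bool toy_nested_has_tpr_shadow(void)
{
	lookups++;
	return true;
}

/* Eager form: the accessor runs even when not in guest mode. */
static bool toy_skip_eager(bool in_guest_mode)
{
	bool has_shadow = toy_nested_has_tpr_shadow();

	return in_guest_mode && has_shadow;
}

/* Lazy form: && short-circuits, so the accessor runs only in guest mode. */
static bool toy_skip_lazy(bool in_guest_mode)
{
	return in_guest_mode && toy_nested_has_tpr_shadow();
}
```

Both forms return the same value; they differ only in whether the
accessor, and any assertion inside it, is reached outside guest mode.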


Longer term, I'll work on figuring out how to handle this in get_vmcs12(), because
to_hv_vcpu() has to solve the same fundamental problem:

https://lore.kernel.org/all/aeqRzanSaa9P_EPg@google.com
