* [PATCH v2 0/5] KVM: SVM: Fix x2AVIC MSR interception issues
@ 2026-05-06 18:47 Sean Christopherson
2026-05-06 18:47 ` [PATCH v2 1/5] KVM: SVM: Disable x2AVIC RDMSR interception for MSRs KVM actually supports Sean Christopherson
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: Sean Christopherson @ 2026-05-06 18:47 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Naveen N Rao
Fix a variety of bugs in SVM's handling of x2APIC MSR passthrough for x2AVIC,
where KVM disables interception for MSR accesses that aren't accelerated by
hardware (pointless and suboptimal), and also does NOT disable interception
for practically any of the "range of vectors" MSRs, i.e. IRR, ISR, and TMR.
Found by inspection when reviewing a TDX patch to fix a bug where KVM botched
the "range of vectors"[*] (I was curious how other KVM code handled the ranges;
wasn't expecting this...).
Note, I tagged all of this for stable, but I could be convinced these fixes
shouldn't be sent to LTS trees. Patch 3 in particular doesn't truly fix
anything, though I definitely don't like relying on poorly documented behavior.
Note #2, the diff stats are misleading due to the hacks, the "real" stats are:
arch/x86/kvm/svm/avic.c | 51 ++++++++++++++++-----------------------------------
1 file changed, 16 insertions(+), 35 deletions(-)
[*] https://lore.kernel.org/all/20260318190111.1041924-1-dmaluka@chromium.org
v2:
- Actually iterate over the mask of readable regs. [Naveen]
- Rewrite the changelog for patch 3 to more accurately capture what happens,
and to avoid conflating "unaccelerated" with "fault-like". [Naveen]
- Massage the changelog for patch 1 to describe the observed behavior of
DFR and ICR2.
- Test the #VMEXIT (or not) behavior with hacks (patches 4 and 5).
v1: https://lore.kernel.org/all/20260409222449.2013847-1-seanjc@google.com
Sean Christopherson (5):
KVM: SVM: Disable x2AVIC RDMSR interception for MSRs KVM actually
supports
KVM: SVM: Always intercept RDMSR for TMCCT (current APIC timer count)
KVM: SVM: Only disable x2AVIC WRMSR interception for MSRs that are
accelerated
*** DO NOT MERGE *** KVM: x86: Hack in a stat to track guest-induced
exits (for testing)
*** DO NOT MERGE *** KVM: selftests: Add hacky test to verify x2APIC
MSR interception
arch/x86/include/asm/kvm_host.h | 2 +
arch/x86/kvm/svm/avic.c | 51 ++--
arch/x86/kvm/svm/svm.c | 81 +++++++
arch/x86/kvm/vmx/vmx.c | 79 +++++++
arch/x86/kvm/x86.c | 2 +
.../testing/selftests/kvm/include/x86/apic.h | 84 ++++++-
.../selftests/kvm/x86/fix_hypercall_test.c | 2 +-
.../selftests/kvm/x86/xapic_ipi_test.c | 4 +-
.../selftests/kvm/x86/xapic_state_test.c | 217 ++++++++++++++++++
9 files changed, 476 insertions(+), 46 deletions(-)
base-commit: 6d35786de28116ecf78797a62b84e6bf3c45aa5a
--
2.54.0.545.g6539524ca2-goog
* [PATCH v2 1/5] KVM: SVM: Disable x2AVIC RDMSR interception for MSRs KVM actually supports
2026-05-06 18:47 [PATCH v2 0/5] KVM: SVM: Fix x2AVIC MSR interception issues Sean Christopherson
@ 2026-05-06 18:47 ` Sean Christopherson
2026-05-06 18:47 ` [PATCH v2 2/5] KVM: SVM: Always intercept RDMSR for TMCCT (current APIC timer count) Sean Christopherson
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Sean Christopherson @ 2026-05-06 18:47 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Naveen N Rao
Fix multiple (classes of) bugs with one stone by using KVM's mask of
readable local APIC registers to determine which x2APIC MSRs to pass
through (or not) when toggling x2AVIC on/off. The existing hand-coded
list of MSRs is wrong on multiple fronts:
- ARBPRI isn't supported by x2APIC, but its unaccelerated AVIC intercept
is fault-like; disabling interception is nonsensical and suboptimal as
the access generates a #VMEXIT that requires decoding the instruction.
- DFR and ICR2 aren't supported by x2APIC and so don't need their
intercepts disabled for performance reasons. While the #GP due to
x2APIC being enabled has higher priority than the trap-like #VMEXIT,
disabling interception of unsupported MSRs is confusing and unnecessary.
- RRR is completely unsupported.
- AVIC currently fails to pass through the "range of vectors" registers,
IRR, ISR, and TMR, as e.g. X2APIC_MSR(APIC_IRR) only affects IRR0, and
thus only disables intercept for vectors 31:0 (which are the *least*
interesting registers).
Fixes: 4d1d7942e36a ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode")
Cc: stable@vger.kernel.org
Cc: Naveen N Rao (AMD) <naveen@kernel.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/svm/avic.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index adf211860949..4f203e503e8e 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -122,6 +122,9 @@ static u32 x2avic_max_physical_id;
static void avic_set_x2apic_msr_interception(struct vcpu_svm *svm,
bool intercept)
{
+ struct kvm_vcpu *vcpu = &svm->vcpu;
+ u64 x2apic_readable_mask;
+
static const u32 x2avic_passthrough_msrs[] = {
X2APIC_MSR(APIC_ID),
X2APIC_MSR(APIC_LVR),
@@ -162,9 +165,16 @@ static void avic_set_x2apic_msr_interception(struct vcpu_svm *svm,
if (!x2avic_enabled)
return;
+ x2apic_readable_mask = kvm_lapic_readable_reg_mask(vcpu->arch.apic);
+
+ for_each_set_bit(i, (unsigned long *)&x2apic_readable_mask,
+ BITS_PER_TYPE(x2apic_readable_mask))
+ svm_set_intercept_for_msr(vcpu, APIC_BASE_MSR + i,
+ MSR_TYPE_R, intercept);
+
for (i = 0; i < ARRAY_SIZE(x2avic_passthrough_msrs); i++)
- svm_set_intercept_for_msr(&svm->vcpu, x2avic_passthrough_msrs[i],
- MSR_TYPE_RW, intercept);
+ svm_set_intercept_for_msr(vcpu, x2avic_passthrough_msrs[i],
+ MSR_TYPE_W, intercept);
svm->x2avic_msrs_intercepted = intercept;
}
--
2.54.0.545.g6539524ca2-goog
* [PATCH v2 2/5] KVM: SVM: Always intercept RDMSR for TMCCT (current APIC timer count)
2026-05-06 18:47 [PATCH v2 0/5] KVM: SVM: Fix x2AVIC MSR interception issues Sean Christopherson
2026-05-06 18:47 ` [PATCH v2 1/5] KVM: SVM: Disable x2AVIC RDMSR interception for MSRs KVM actually supports Sean Christopherson
@ 2026-05-06 18:47 ` Sean Christopherson
2026-05-06 18:47 ` [PATCH v2 3/5] KVM: SVM: Only disable x2AVIC WRMSR interception for MSRs that are accelerated Sean Christopherson
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Sean Christopherson @ 2026-05-06 18:47 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Naveen N Rao
Explicitly intercept RDMSR for TMCCT, a.k.a. the current APIC timer count,
when x2AVIC is enabled, as TMCCT reads aren't accelerated by hardware.
Disabling interception is suboptimal as the RDMSR generates an
AVIC_UNACCELERATED_ACCESS fault #VMEXIT, which forces KVM to decode the
instruction to figure out what the guest was trying to access.
Note, the only reason this isn't a fatal bug is that the AVIC architecture
had the foresight to guard against buggy hypervisors. E.g. if hardware
simply read from the virtual APIC page, the guest would get garbage.
Fixes: 4d1d7942e36a ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode")
Cc: stable@vger.kernel.org
Cc: Naveen N Rao (AMD) <naveen@kernel.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/svm/avic.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 4f203e503e8e..d693c9ff9f18 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -172,6 +172,9 @@ static void avic_set_x2apic_msr_interception(struct vcpu_svm *svm,
svm_set_intercept_for_msr(vcpu, APIC_BASE_MSR + i,
MSR_TYPE_R, intercept);
+ if (!intercept)
+ svm_enable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_TMCCT), MSR_TYPE_R);
+
for (i = 0; i < ARRAY_SIZE(x2avic_passthrough_msrs); i++)
svm_set_intercept_for_msr(vcpu, x2avic_passthrough_msrs[i],
MSR_TYPE_W, intercept);
--
2.54.0.545.g6539524ca2-goog
* [PATCH v2 3/5] KVM: SVM: Only disable x2AVIC WRMSR interception for MSRs that are accelerated
2026-05-06 18:47 [PATCH v2 0/5] KVM: SVM: Fix x2AVIC MSR interception issues Sean Christopherson
2026-05-06 18:47 ` [PATCH v2 1/5] KVM: SVM: Disable x2AVIC RDMSR interception for MSRs KVM actually supports Sean Christopherson
2026-05-06 18:47 ` [PATCH v2 2/5] KVM: SVM: Always intercept RDMSR for TMCCT (current APIC timer count) Sean Christopherson
@ 2026-05-06 18:47 ` Sean Christopherson
2026-05-06 18:47 ` [PATCH v2 4/5] *** DO NOT MERGE *** KVM: x86: Hack in a stat to track guest-induced exits (for testing) Sean Christopherson
2026-05-06 18:47 ` [PATCH v2 5/5] *** DO NOT MERGE *** KVM: selftests: Add hacky test to verify x2APIC MSR interception Sean Christopherson
4 siblings, 0 replies; 6+ messages in thread
From: Sean Christopherson @ 2026-05-06 18:47 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Naveen N Rao
When x2AVIC is enabled, disable WRMSR interception only for MSRs that are
actually accelerated by hardware. Disabling interception for MSRs that
aren't accelerated is functionally "fine", and in some cases a weird "win"
for performance, but only for cases that should never be triggered by a
well-behaved VM (writes to read-only registers; the #GP will typically
occur in the guest without taking a #VMEXIT, even for fault-like exits).
But overall, disabling interception for MSRs that aren't accelerated is at
best confusing and unintuitive, and at worst introduces avoidable risk, as
the effective guest-visible behavior depends on the whims of the CPU (the
behavior of x2APIC MSR writes on at least Zen4 doesn't match the behavior
documented in the table in "15.29.3.1 Virtual APIC Register Accesses" of
the APM).
Note, the set of MSRs that are passed through for write is identical to
VMX's set when IPI virtualization is enabled. This is not a coincidence,
and is another motivating factor for cleaning up the intercepts, as x2AVIC
is functionally equivalent to APICv+IPIv.
Fixes: 4d1d7942e36a ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode")
Cc: stable@vger.kernel.org
Cc: Naveen N Rao (AMD) <naveen@kernel.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/svm/avic.c | 40 ++++------------------------------------
1 file changed, 4 insertions(+), 36 deletions(-)
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index d693c9ff9f18..c5d46c0d2403 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -124,39 +124,6 @@ static void avic_set_x2apic_msr_interception(struct vcpu_svm *svm,
{
struct kvm_vcpu *vcpu = &svm->vcpu;
u64 x2apic_readable_mask;
-
- static const u32 x2avic_passthrough_msrs[] = {
- X2APIC_MSR(APIC_ID),
- X2APIC_MSR(APIC_LVR),
- X2APIC_MSR(APIC_TASKPRI),
- X2APIC_MSR(APIC_ARBPRI),
- X2APIC_MSR(APIC_PROCPRI),
- X2APIC_MSR(APIC_EOI),
- X2APIC_MSR(APIC_RRR),
- X2APIC_MSR(APIC_LDR),
- X2APIC_MSR(APIC_DFR),
- X2APIC_MSR(APIC_SPIV),
- X2APIC_MSR(APIC_ISR),
- X2APIC_MSR(APIC_TMR),
- X2APIC_MSR(APIC_IRR),
- X2APIC_MSR(APIC_ESR),
- X2APIC_MSR(APIC_ICR),
- X2APIC_MSR(APIC_ICR2),
-
- /*
- * Note! Always intercept LVTT, as TSC-deadline timer mode
- * isn't virtualized by hardware, and the CPU will generate a
- * #GP instead of a #VMEXIT.
- */
- X2APIC_MSR(APIC_LVTTHMR),
- X2APIC_MSR(APIC_LVTPC),
- X2APIC_MSR(APIC_LVT0),
- X2APIC_MSR(APIC_LVT1),
- X2APIC_MSR(APIC_LVTERR),
- X2APIC_MSR(APIC_TMICT),
- X2APIC_MSR(APIC_TMCCT),
- X2APIC_MSR(APIC_TDCR),
- };
int i;
if (intercept == svm->x2avic_msrs_intercepted)
@@ -175,9 +142,10 @@ static void avic_set_x2apic_msr_interception(struct vcpu_svm *svm,
if (!intercept)
svm_enable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_TMCCT), MSR_TYPE_R);
- for (i = 0; i < ARRAY_SIZE(x2avic_passthrough_msrs); i++)
- svm_set_intercept_for_msr(vcpu, x2avic_passthrough_msrs[i],
- MSR_TYPE_W, intercept);
+ svm_set_intercept_for_msr(vcpu, X2APIC_MSR(APIC_TASKPRI), MSR_TYPE_W, intercept);
+ svm_set_intercept_for_msr(vcpu, X2APIC_MSR(APIC_EOI), MSR_TYPE_W, intercept);
+ svm_set_intercept_for_msr(vcpu, X2APIC_MSR(APIC_SELF_IPI), MSR_TYPE_W, intercept);
+ svm_set_intercept_for_msr(vcpu, X2APIC_MSR(APIC_ICR), MSR_TYPE_W, intercept);
svm->x2avic_msrs_intercepted = intercept;
}
--
2.54.0.545.g6539524ca2-goog
* [PATCH v2 4/5] *** DO NOT MERGE *** KVM: x86: Hack in a stat to track guest-induced exits (for testing)
2026-05-06 18:47 [PATCH v2 0/5] KVM: SVM: Fix x2AVIC MSR interception issues Sean Christopherson
` (2 preceding siblings ...)
2026-05-06 18:47 ` [PATCH v2 3/5] KVM: SVM: Only disable x2AVIC WRMSR interception for MSRs that are accelerated Sean Christopherson
@ 2026-05-06 18:47 ` Sean Christopherson
2026-05-06 18:47 ` [PATCH v2 5/5] *** DO NOT MERGE *** KVM: selftests: Add hacky test to verify x2APIC MSR interception Sean Christopherson
4 siblings, 0 replies; 6+ messages in thread
From: Sean Christopherson @ 2026-05-06 18:47 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Naveen N Rao
Not-signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/include/asm/kvm_host.h | 2 +
arch/x86/kvm/svm/svm.c | 81 +++++++++++++++++++++++++++++++++
arch/x86/kvm/vmx/vmx.c | 79 ++++++++++++++++++++++++++++++++
arch/x86/kvm/x86.c | 2 +
4 files changed, 164 insertions(+)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c470e40a00aa..bff534bd00dc 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1703,6 +1703,8 @@ struct kvm_vcpu_stat {
u64 invlpg;
u64 exits;
+ u64 guest_induced_exits;
+ u64 msr_exits;
u64 io_exits;
u64 mmio_exits;
u64 signal_exits;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index e7fdd7a9c280..7886bd1ad8f2 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4378,6 +4378,81 @@ static fastpath_t svm_exit_handlers_fastpath(struct kvm_vcpu *vcpu)
return EXIT_FASTPATH_NONE;
}
+static bool is_guest_induced_exit(u64 exit_code)
+{
+ switch (exit_code) {
+ case SVM_EXIT_READ_CR0:
+ case SVM_EXIT_READ_CR3:
+ case SVM_EXIT_READ_CR4:
+ case SVM_EXIT_READ_CR8:
+ case SVM_EXIT_CR0_SEL_WRITE:
+ case SVM_EXIT_WRITE_CR0:
+ case SVM_EXIT_WRITE_CR3:
+ case SVM_EXIT_WRITE_CR4:
+ case SVM_EXIT_WRITE_CR8:
+ case SVM_EXIT_READ_DR0:
+ case SVM_EXIT_READ_DR1:
+ case SVM_EXIT_READ_DR2:
+ case SVM_EXIT_READ_DR3:
+ case SVM_EXIT_READ_DR4:
+ case SVM_EXIT_READ_DR5:
+ case SVM_EXIT_READ_DR6:
+ case SVM_EXIT_READ_DR7:
+ case SVM_EXIT_WRITE_DR0:
+ case SVM_EXIT_WRITE_DR1:
+ case SVM_EXIT_WRITE_DR2:
+ case SVM_EXIT_WRITE_DR3:
+ case SVM_EXIT_WRITE_DR4:
+ case SVM_EXIT_WRITE_DR5:
+ case SVM_EXIT_WRITE_DR6:
+ case SVM_EXIT_WRITE_DR7:
+ case SVM_EXIT_EXCP_BASE + DB_VECTOR:
+ case SVM_EXIT_EXCP_BASE + BP_VECTOR:
+ case SVM_EXIT_EXCP_BASE + UD_VECTOR:
+ case SVM_EXIT_EXCP_BASE + PF_VECTOR:
+ case SVM_EXIT_EXCP_BASE + AC_VECTOR:
+ case SVM_EXIT_EXCP_BASE + GP_VECTOR:
+ case SVM_EXIT_RDPMC:
+ case SVM_EXIT_CPUID:
+ case SVM_EXIT_IRET:
+ case SVM_EXIT_INVD:
+ case SVM_EXIT_PAUSE:
+ case SVM_EXIT_HLT:
+ case SVM_EXIT_INVLPG:
+ case SVM_EXIT_INVLPGA:
+ case SVM_EXIT_IOIO:
+ case SVM_EXIT_MSR:
+ case SVM_EXIT_TASK_SWITCH:
+ case SVM_EXIT_SHUTDOWN:
+ case SVM_EXIT_VMRUN:
+ case SVM_EXIT_VMMCALL:
+ case SVM_EXIT_VMLOAD:
+ case SVM_EXIT_VMSAVE:
+ case SVM_EXIT_STGI:
+ case SVM_EXIT_CLGI:
+ case SVM_EXIT_SKINIT:
+ case SVM_EXIT_RDTSCP:
+ case SVM_EXIT_WBINVD:
+ case SVM_EXIT_MONITOR:
+ case SVM_EXIT_MWAIT:
+ case SVM_EXIT_XSETBV:
+ case SVM_EXIT_RDPRU:
+ case SVM_EXIT_EFER_WRITE_TRAP:
+ case SVM_EXIT_CR0_WRITE_TRAP:
+ case SVM_EXIT_CR4_WRITE_TRAP:
+ case SVM_EXIT_CR8_WRITE_TRAP:
+ case SVM_EXIT_INVPCID:
+ case SVM_EXIT_IDLE_HLT:
+ case SVM_EXIT_RSM:
+ case SVM_EXIT_AVIC_INCOMPLETE_IPI:
+ case SVM_EXIT_AVIC_UNACCELERATED_ACCESS:
+ return true;
+ default:
+ break;
+ }
+ return false;
+}
+
static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_intercepted)
{
struct svm_cpu_data *sd = per_cpu_ptr(&svm_data, vcpu->cpu);
@@ -4573,6 +4648,12 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu, u64 run_flags)
if (is_guest_mode(vcpu))
svm->nested.ctl.next_rip = svm->vmcb->control.next_rip;
+ if (is_guest_induced_exit(svm->vmcb->control.exit_code))
+ ++vcpu->stat.guest_induced_exits;
+
+ if (svm->vmcb->control.exit_code == SVM_EXIT_MSR)
+ ++vcpu->stat.msr_exits;
+
return svm_exit_handlers_fastpath(vcpu);
}
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 5c2c33a5f7dc..859f4bc01445 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7478,6 +7478,79 @@ noinstr void vmx_handle_nmi(struct kvm_vcpu *vcpu)
kvm_after_interrupt(vcpu);
}
+static bool is_guest_induced_exit(struct kvm_vcpu *vcpu)
+{
+ switch (vmx_get_exit_reason(vcpu).basic) {
+ case EXIT_REASON_EXCEPTION_NMI:
+ if (is_nmi(vmx_get_intr_info(vcpu)))
+ return false;
+ return true;
+ case EXIT_REASON_TRIPLE_FAULT:
+ case EXIT_REASON_IO_INSTRUCTION:
+ case EXIT_REASON_CR_ACCESS:
+ case EXIT_REASON_DR_ACCESS:
+ case EXIT_REASON_CPUID:
+ case EXIT_REASON_MSR_READ:
+ case EXIT_REASON_MSR_WRITE:
+ case EXIT_REASON_HLT:
+ case EXIT_REASON_INVD:
+ case EXIT_REASON_INVLPG:
+ case EXIT_REASON_RDPMC:
+ case EXIT_REASON_VMCALL:
+ case EXIT_REASON_VMCLEAR:
+ case EXIT_REASON_VMLAUNCH:
+ case EXIT_REASON_VMPTRLD:
+ case EXIT_REASON_VMPTRST:
+ case EXIT_REASON_VMREAD:
+ case EXIT_REASON_VMRESUME:
+ case EXIT_REASON_VMWRITE:
+ case EXIT_REASON_VMOFF:
+ case EXIT_REASON_VMON:
+ case EXIT_REASON_TPR_BELOW_THRESHOLD:
+ case EXIT_REASON_APIC_ACCESS:
+ case EXIT_REASON_APIC_WRITE:
+ case EXIT_REASON_EOI_INDUCED:
+ case EXIT_REASON_WBINVD:
+ case EXIT_REASON_XSETBV:
+ case EXIT_REASON_TASK_SWITCH:
+ case EXIT_REASON_MCE_DURING_VMENTRY:
+ case EXIT_REASON_GDTR_IDTR:
+ case EXIT_REASON_LDTR_TR:
+ case EXIT_REASON_PAUSE_INSTRUCTION:
+ case EXIT_REASON_MWAIT_INSTRUCTION:
+ case EXIT_REASON_MONITOR_INSTRUCTION:
+ case EXIT_REASON_INVEPT:
+ case EXIT_REASON_INVVPID:
+ case EXIT_REASON_RDRAND:
+ case EXIT_REASON_RDSEED:
+ case EXIT_REASON_INVPCID:
+ case EXIT_REASON_VMFUNC:
+ case EXIT_REASON_ENCLS:
+ case EXIT_REASON_SEAMCALL:
+ case EXIT_REASON_TDCALL:
+ case EXIT_REASON_MSR_READ_IMM:
+ case EXIT_REASON_MSR_WRITE_IMM:
+ return true;
+ default:
+ break;
+ }
+ return false;
+}
+
+static bool is_msr_exit(struct kvm_vcpu *vcpu)
+{
+ switch (vmx_get_exit_reason(vcpu).basic) {
+ case EXIT_REASON_MSR_READ:
+ case EXIT_REASON_MSR_WRITE:
+ case EXIT_REASON_MSR_READ_IMM:
+ case EXIT_REASON_MSR_WRITE_IMM:
+ return true;
+ default:
+ break;
+ }
+ return false;
+}
+
static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
unsigned int flags)
{
@@ -7667,6 +7740,12 @@ fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu, u64 run_flags)
vmx_recover_nmi_blocking(vmx);
vmx_complete_interrupts(vmx);
+ if (is_guest_induced_exit(vcpu))
+ ++vcpu->stat.guest_induced_exits;
+
+ if (is_msr_exit(vcpu))
+ ++vcpu->stat.msr_exits;
+
return vmx_exit_handlers_fastpath(vcpu, force_immediate_exit);
}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0a1b63c63d1a..dc69b8cebe0b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -283,6 +283,8 @@ const struct kvm_stats_desc kvm_vcpu_stats_desc[] = {
STATS_DESC_COUNTER(VCPU, tlb_flush),
STATS_DESC_COUNTER(VCPU, invlpg),
STATS_DESC_COUNTER(VCPU, exits),
+ STATS_DESC_COUNTER(VCPU, guest_induced_exits),
+ STATS_DESC_COUNTER(VCPU, msr_exits),
STATS_DESC_COUNTER(VCPU, io_exits),
STATS_DESC_COUNTER(VCPU, mmio_exits),
STATS_DESC_COUNTER(VCPU, signal_exits),
--
2.54.0.545.g6539524ca2-goog
* [PATCH v2 5/5] *** DO NOT MERGE *** KVM: selftests: Add hacky test to verify x2APIC MSR interception
2026-05-06 18:47 [PATCH v2 0/5] KVM: SVM: Fix x2AVIC MSR interception issues Sean Christopherson
` (3 preceding siblings ...)
2026-05-06 18:47 ` [PATCH v2 4/5] *** DO NOT MERGE *** KVM: x86: Hack in a stat to track guest-induced exits (for testing) Sean Christopherson
@ 2026-05-06 18:47 ` Sean Christopherson
4 siblings, 0 replies; 6+ messages in thread
From: Sean Christopherson @ 2026-05-06 18:47 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Naveen N Rao
Not-signed-off-by: Sean Christopherson <seanjc@google.com>
---
.../testing/selftests/kvm/include/x86/apic.h | 84 ++++++-
.../selftests/kvm/x86/fix_hypercall_test.c | 2 +-
.../selftests/kvm/x86/xapic_ipi_test.c | 4 +-
.../selftests/kvm/x86/xapic_state_test.c | 217 ++++++++++++++++++
4 files changed, 296 insertions(+), 11 deletions(-)
diff --git a/tools/testing/selftests/kvm/include/x86/apic.h b/tools/testing/selftests/kvm/include/x86/apic.h
index 31887bdc3d6c..3c3e362f2b98 100644
--- a/tools/testing/selftests/kvm/include/x86/apic.h
+++ b/tools/testing/selftests/kvm/include/x86/apic.h
@@ -23,21 +23,64 @@
#define APIC_BASE_MSR 0x800
#define X2APIC_ENABLE (1UL << 10)
+
+#define APIC_DELIVERY_MODE_FIXED 0
+#define APIC_DELIVERY_MODE_LOWESTPRIO 1
+#define APIC_DELIVERY_MODE_SMI 2
+#define APIC_DELIVERY_MODE_NMI 4
+#define APIC_DELIVERY_MODE_INIT 5
+#define APIC_DELIVERY_MODE_EXTINT 7
+
#define APIC_ID 0x20
+
#define APIC_LVR 0x30
-#define GET_APIC_ID_FIELD(x) (((x) >> 24) & 0xFF)
+#define APIC_LVR_MASK 0xFF00FF
+#define APIC_LVR_DIRECTED_EOI (1 << 24)
+#define GET_APIC_VERSION(x) ((x) & 0xFFu)
+#define GET_APIC_MAXLVT(x) (((x) >> 16) & 0xFFu)
+#ifdef CONFIG_X86_32
+# define APIC_INTEGRATED(x) ((x) & 0xF0u)
+#else
+# define APIC_INTEGRATED(x) (1)
+#endif
+#define APIC_XAPIC(x) ((x) >= 0x14)
+#define APIC_EXT_SPACE(x) ((x) & 0x80000000)
#define APIC_TASKPRI 0x80
+#define APIC_TPRI_MASK 0xFFu
+#define APIC_ARBPRI 0x90
+#define APIC_ARBPRI_MASK 0xFFu
#define APIC_PROCPRI 0xA0
-#define GET_APIC_PRI(x) (((x) & GENMASK(7, 4)) >> 4)
-#define SET_APIC_PRI(x, y) (((x) & ~GENMASK(7, 4)) | (y << 4))
+#define GET_APIC_PRI(x) (((x) & GENMASK(7, 4)) >> 4)
+#define SET_APIC_PRI(x, y) (((x) & ~GENMASK(7, 4)) | (y << 4))
#define APIC_EOI 0xB0
+#define APIC_EOI_ACK 0x0 /* Docs say 0 for future compat. */
+#define APIC_RRR 0xC0
+#define APIC_LDR 0xD0
+#define APIC_LDR_MASK (0xFFu << 24)
+#define GET_APIC_LOGICAL_ID(x) (((x) >> 24) & 0xFFu)
+#define SET_APIC_LOGICAL_ID(x) (((x) << 24))
+#define APIC_ALL_CPUS 0xFFu
+#define APIC_DFR 0xE0
+#define APIC_DFR_CLUSTER 0x0FFFFFFFul
+#define APIC_DFR_FLAT 0xFFFFFFFFul
#define APIC_SPIV 0xF0
+#define APIC_SPIV_DIRECTED_EOI (1 << 12)
#define APIC_SPIV_FOCUS_DISABLED (1 << 9)
#define APIC_SPIV_APIC_ENABLED (1 << 8)
#define APIC_ISR 0x100
-#define APIC_IRR 0x200
+#define APIC_ISR_NR 0x8 /* Number of 32 bit ISR registers. */
+#define APIC_TMR 0x180
+#define APIC_IRR 0x200
+#define APIC_ESR 0x280
+#define APIC_ESR_SEND_CS 0x00001
+#define APIC_ESR_RECV_CS 0x00002
+#define APIC_ESR_SEND_ACC 0x00004
+#define APIC_ESR_RECV_ACC 0x00008
+#define APIC_ESR_SENDILL 0x00020
+#define APIC_ESR_RECVILL 0x00040
+#define APIC_ESR_ILLREGA 0x00080
+#define APIC_LVTCMCI 0x2f0
#define APIC_ICR 0x300
-#define APIC_LVTCMCI 0x2f0
#define APIC_DEST_SELF 0x40000
#define APIC_DEST_ALLINC 0x80000
#define APIC_DEST_ALLBUT 0xC0000
@@ -61,16 +104,41 @@
#define APIC_DM_EXTINT 0x00700
#define APIC_VECTOR_MASK 0x000FF
#define APIC_ICR2 0x310
-#define SET_APIC_DEST_FIELD(x) ((x) << 24)
-#define APIC_LVTT 0x320
+#define GET_XAPIC_DEST_FIELD(x) (((x) >> 24) & 0xFF)
+#define SET_XAPIC_DEST_FIELD(x) ((x) << 24)
+#define APIC_LVTT 0x320
+#define APIC_LVTTHMR 0x330
+#define APIC_LVTPC 0x340
+#define APIC_LVT0 0x350
#define APIC_LVT_TIMER_ONESHOT (0 << 17)
#define APIC_LVT_TIMER_PERIODIC (1 << 17)
#define APIC_LVT_TIMER_TSCDEADLINE (2 << 17)
#define APIC_LVT_MASKED (1 << 16)
+#define APIC_LVT_LEVEL_TRIGGER (1 << 15)
+#define APIC_LVT_REMOTE_IRR (1 << 14)
+#define APIC_INPUT_POLARITY (1 << 13)
+#define APIC_SEND_PENDING (1 << 12)
+#define APIC_MODE_MASK 0x700
+#define GET_APIC_DELIVERY_MODE(x) (((x) >> 8) & 0x7)
+#define SET_APIC_DELIVERY_MODE(x, y) (((x) & ~0x700) | ((y) << 8))
+#define APIC_MODE_FIXED 0x0
+#define APIC_MODE_NMI 0x4
+#define APIC_MODE_EXTINT 0x7
+#define APIC_LVT1 0x360
+#define APIC_LVTERR 0x370
#define APIC_TMICT 0x380
#define APIC_TMCCT 0x390
#define APIC_TDCR 0x3E0
-#define APIC_SELF_IPI 0x3F0
+#define APIC_SELF_IPI 0x3F0
+#define APIC_TDR_DIV_TMBASE (1 << 2)
+#define APIC_TDR_DIV_1 0xB
+#define APIC_TDR_DIV_2 0x0
+#define APIC_TDR_DIV_4 0x1
+#define APIC_TDR_DIV_8 0x2
+#define APIC_TDR_DIV_16 0x3
+#define APIC_TDR_DIV_32 0x8
+#define APIC_TDR_DIV_64 0x9
+#define APIC_TDR_DIV_128 0xA
#define APIC_VECTOR_TO_BIT_NUMBER(v) ((unsigned int)(v) % 32)
#define APIC_VECTOR_TO_REG_OFFSET(v) ((unsigned int)(v) / 32 * 0x10)
diff --git a/tools/testing/selftests/kvm/x86/fix_hypercall_test.c b/tools/testing/selftests/kvm/x86/fix_hypercall_test.c
index 753a0e730ea8..ad61da99ee4c 100644
--- a/tools/testing/selftests/kvm/x86/fix_hypercall_test.c
+++ b/tools/testing/selftests/kvm/x86/fix_hypercall_test.c
@@ -63,7 +63,7 @@ static void guest_main(void)
memcpy(hypercall_insn, other_hypercall_insn, HYPERCALL_INSN_SIZE);
- ret = do_sched_yield(GET_APIC_ID_FIELD(xapic_read_reg(APIC_ID)));
+ ret = do_sched_yield(GET_XAPIC_DEST_FIELD(xapic_read_reg(APIC_ID)));
/*
* If the quirk is disabled, verify that guest_ud_handler() "returned"
diff --git a/tools/testing/selftests/kvm/x86/xapic_ipi_test.c b/tools/testing/selftests/kvm/x86/xapic_ipi_test.c
index 39ce9a9369f5..75b87f850abc 100644
--- a/tools/testing/selftests/kvm/x86/xapic_ipi_test.c
+++ b/tools/testing/selftests/kvm/x86/xapic_ipi_test.c
@@ -91,7 +91,7 @@ static void halter_guest_code(struct test_data_page *data)
verify_apic_base_addr();
xapic_enable();
- data->halter_apic_id = GET_APIC_ID_FIELD(xapic_read_reg(APIC_ID));
+ data->halter_apic_id = GET_XAPIC_DEST_FIELD(xapic_read_reg(APIC_ID));
data->halter_lvr = xapic_read_reg(APIC_LVR);
/*
@@ -147,7 +147,7 @@ static void sender_guest_code(struct test_data_page *data)
* set data->halter_apic_id.
*/
icr_val = (APIC_DEST_PHYSICAL | APIC_DM_FIXED | IPI_VECTOR);
- icr2_val = SET_APIC_DEST_FIELD(data->halter_apic_id);
+ icr2_val = SET_XAPIC_DEST_FIELD(data->halter_apic_id);
data->icr = icr_val;
data->icr2 = icr2_val;
diff --git a/tools/testing/selftests/kvm/x86/xapic_state_test.c b/tools/testing/selftests/kvm/x86/xapic_state_test.c
index 637bb90c1d93..3c7c6a5485e4 100644
--- a/tools/testing/selftests/kvm/x86/xapic_state_test.c
+++ b/tools/testing/selftests/kvm/x86/xapic_state_test.c
@@ -222,6 +222,221 @@ static void test_x2apic_id(void)
kvm_vm_free(vm);
}
+#define X2APIC_MSR(r) (0x800 + ((r) >> 4))
+
+static bool is_x2apic_mode = true;
+
+static bool is_ro_only_reg(int reg)
+{
+ switch (reg) {
+ case APIC_ID:
+ case APIC_LVR:
+ case APIC_PROCPRI:
+ case APIC_LDR:
+ case APIC_ARBPRI:
+ case APIC_ISR:
+ case APIC_TMR:
+ case APIC_IRR:
+ case APIC_TMCCT:
+ return true;
+ }
+ return false;
+}
+
+static bool is_xapic_only_reg(int reg)
+{
+ return reg == APIC_ARBPRI || reg == APIC_DFR || reg == APIC_ICR2;
+}
+
+static bool is_accelerated_reg(int reg, bool write)
+{
+ if (!write)
+ return reg != APIC_TMCCT;
+
+ switch (reg) {
+ case APIC_TASKPRI:
+ case APIC_EOI:
+ case APIC_SELF_IPI:
+ case APIC_ICR:
+ return true;
+ default:
+ break;
+ }
+ return false;
+}
+
+static void x2apic_msr_guest_code(void)
+{
+ const u32 xapic_regs[] = {
+ APIC_ID,
+ APIC_LVR,
+ APIC_TASKPRI,
+ APIC_PROCPRI,
+ APIC_LDR,
+ APIC_SPIV,
+ APIC_ISR,
+ APIC_TMR,
+ APIC_IRR,
+ APIC_ESR,
+ APIC_LVTT,
+ APIC_LVTTHMR,
+ APIC_LVTPC,
+ APIC_LVT0,
+ APIC_LVT1,
+ APIC_LVTERR,
+ APIC_TMICT,
+ APIC_TMCCT,
+ APIC_TDCR,
+
+ // APIC_EOI,
+ // APIC_SELF_IPI,
+ // APIC_ICR,
+
+ APIC_ARBPRI,
+ APIC_DFR,
+ APIC_ICR2,
+ };
+ int i, j;
+ u64 val;
+ u32 msr;
+ u8 vec;
+
+ cli();
+
+ if (is_x2apic_mode)
+ x2apic_enable();
+
+ GUEST_SYNC(0xbeef);
+
+ for (i = 0; i < ARRAY_SIZE(xapic_regs); i++) {
+ int nr_regs;
+ u8 rd, wr;
+
+ if (!is_x2apic_mode || is_xapic_only_reg(xapic_regs[i])) {
+ rd = wr = GP_VECTOR;
+ } else {
+ rd = 0;
+ wr = is_ro_only_reg(xapic_regs[i]) ? GP_VECTOR : 0;
+ }
+
+ if (xapic_regs[i] == APIC_IRR ||
+ xapic_regs[i] == APIC_ISR ||
+ xapic_regs[i] == APIC_TMR)
+ nr_regs = APIC_ISR_NR;
+ else
+ nr_regs = 1;
+
+ for (j = 0; j < nr_regs; j++) {
+ msr = X2APIC_MSR(xapic_regs[i] + j * 0x10);
+
+ vec = rdmsr_safe(msr, &val);
+ __GUEST_ASSERT(vec == rd,
+ "Wanted %s on RDMSR(%x), got %s",
+ ex_str(rd), msr, ex_str(vec));
+ GUEST_SYNC3(xapic_regs[i], false, vec);
+
+ vec = wrmsr_safe(msr, 0);
+ __GUEST_ASSERT(vec == wr,
+ "Wanted %s on WRMSR(%x), got %s",
+ ex_str(wr), msr, ex_str(vec));
+
+ GUEST_SYNC3(xapic_regs[i], true, vec);
+ }
+ }
+ GUEST_DONE();
+}
+
+static void test_x2apic_msr_intercepts(void)
+{
+ u64 last_guest, last_io, last_msr, guest_exits, io_exits, msr_exits;
+ struct kvm_vcpu *vcpu;
+ struct kvm_vm *vm;
+ struct ucall uc;
+
+ vm = vm_create_with_one_vcpu(&vcpu, x2apic_msr_guest_code);
+
+ TEST_ASSERT_EQ(vcpu_get_stat(vcpu, guest_induced_exits), 0);
+
+ vcpu_run(vcpu);
+ TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_IO);
+
+ TEST_ASSERT_EQ(get_ucall(vcpu, NULL), UCALL_SYNC);
+
+ for ( ;; ) {
+ last_guest = vcpu_get_stat(vcpu, guest_induced_exits);
+ last_io = vcpu_get_stat(vcpu, io_exits);
+ last_msr = vcpu_get_stat(vcpu, msr_exits);
+
+ vcpu_run(vcpu);
+ TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_IO);
+
+ guest_exits = vcpu_get_stat(vcpu, guest_induced_exits);
+ io_exits = vcpu_get_stat(vcpu, io_exits);
+ msr_exits = vcpu_get_stat(vcpu, msr_exits);
+
+ TEST_ASSERT_EQ(io_exits, last_io + 1);
+
+ switch (get_ucall(vcpu, &uc)) {
+ case UCALL_SYNC: {
+ int vector = uc.args[2];
+ bool write = uc.args[1];
+ u32 reg = uc.args[0];
+
+ printf("reg = %x, write = %u, fault = %u\n", reg, write, vector);
+ if (vector || !is_accelerated_reg(reg, write)) {
+ TEST_ASSERT_EQ(msr_exits, last_msr + 1);
+ TEST_ASSERT_EQ(guest_exits - last_guest,
+ io_exits - last_io + msr_exits - last_msr);
+ } else {
+ TEST_ASSERT_EQ(msr_exits, last_msr);
+ TEST_ASSERT_EQ(guest_exits - last_guest,
+ io_exits - last_io);
+ }
+ // printf("On msr = %x\n", msr);
+ break;
+ }
+ case UCALL_DONE:
+ goto test_xapic;
+ case UCALL_ABORT:
+ REPORT_GUEST_ASSERT(uc);
+ default:
+ TEST_FAIL("Unknown ucall %lu", uc.cmd);
+ }
+ }
+
+test_xapic:
+ kvm_vm_free(vm);
+
+ vm = vm_create_with_one_vcpu(&vcpu, x2apic_msr_guest_code);
+ vcpu_clear_cpuid_feature(vcpu, X86_FEATURE_X2APIC);
+ is_x2apic_mode = false;
+ sync_global_to_guest(vm, is_x2apic_mode);
+
+ for ( ;; ) {
+ vcpu_run(vcpu);
+ TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_IO);
+
+ // vcpu_get_stat(vcpu, io_exits);
+ // vcpu_get_stat(vcpu, msr_exits);
+ // vcpu_get_stat(vcpu, guest_induced_exits);
+
+ switch (get_ucall(vcpu, &uc)) {
+ case UCALL_SYNC:
+ // u32 msr = uc.args[1];
+
+ // printf("On msr = %x\n", msr);
+ break;
+ case UCALL_DONE:
+ goto done;
+ case UCALL_ABORT:
+ REPORT_GUEST_ASSERT(uc);
+ default:
+ TEST_FAIL("Unknown ucall %lu", uc.cmd);
+ }
+ }
+done:
+	return;
+}
+
int main(int argc, char *argv[])
{
struct xapic_vcpu x = {
@@ -230,6 +445,8 @@ int main(int argc, char *argv[])
};
struct kvm_vm *vm;
+ test_x2apic_msr_intercepts();
+
vm = vm_create_with_one_vcpu(&x.vcpu, x2apic_guest_code);
test_icr(&x);
kvm_vm_free(vm);
--
2.54.0.545.g6539524ca2-goog