public inbox for linux-kernel@vger.kernel.org
* [PATCH v2 0/5] KVM: SVM: Fix x2AVIC MSR interception issues
@ 2026-05-06 18:47 Sean Christopherson
  2026-05-06 18:47 ` [PATCH v2 1/5] KVM: SVM: Disable x2AVIC RDMSR interception for MSRs KVM actually supports Sean Christopherson
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Sean Christopherson @ 2026-05-06 18:47 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Naveen N Rao

Fix a variety of bugs in SVM's handling of x2APIC MSR passthrough for x2AVIC,
where KVM disables interception for MSR accesses that aren't accelerated by
hardware (pointless and suboptimal), and also does NOT disable interception
for practically any of the "range of vectors" MSRs, i.e. IRR, ISR, and TMR.

Found by inspection when reviewing a TDX patch to fix a bug where KVM botched
the "range of vectors"[*] (I was curious how other KVM code handled the ranges;
wasn't expecting this...).

Note, I tagged all of this for stable, but I could be convinced these fixes
shouldn't be sent to LTS trees.  Patch 3 in particular doesn't truly fix
anything, though I definitely don't like relying on poorly documented behavior.

Note #2, the diff stats are misleading due to the hacks, the "real" stats are:

  arch/x86/kvm/svm/avic.c | 51 ++++++++++++++++-----------------------------------
  1 file changed, 16 insertions(+), 35 deletions(-)

[*] https://lore.kernel.org/all/20260318190111.1041924-1-dmaluka@chromium.org

v2:
 - Actually iterate over the mask of readable regs. [Naveen]
 - Rewrite the changelog for patch 3 to more accurately capture what happens,
   and to avoid conflating "unaccelerated" with "fault-like". [Naveen]
 - Massage the changelog for patch 1 to describe the observed behavior of
   DFR and ICR2.
 - Test the #VMEXIT (or not) behavior with hacks (patches 4 and 5).

v1: https://lore.kernel.org/all/20260409222449.2013847-1-seanjc@google.com

Sean Christopherson (5):
  KVM: SVM: Disable x2AVIC RDMSR interception for MSRs KVM actually
    supports
  KVM: SVM: Always intercept RDMSR for TMCCT (current APIC timer count)
  KVM: SVM: Only disable x2AVIC WRMSR interception for MSRs that are
    accelerated
  *** DO NOT MERGE *** KVM: x86: Hack in a stat to track guest-induced
    exits (for testing)
  *** DO NOT MERGE *** KVM: selftests: Add hacky test to verify x2APIC
    MSR interception

 arch/x86/include/asm/kvm_host.h               |   2 +
 arch/x86/kvm/svm/avic.c                       |  51 ++--
 arch/x86/kvm/svm/svm.c                        |  81 +++++++
 arch/x86/kvm/vmx/vmx.c                        |  79 +++++++
 arch/x86/kvm/x86.c                            |   2 +
 .../testing/selftests/kvm/include/x86/apic.h  |  84 ++++++-
 .../selftests/kvm/x86/fix_hypercall_test.c    |   2 +-
 .../selftests/kvm/x86/xapic_ipi_test.c        |   4 +-
 .../selftests/kvm/x86/xapic_state_test.c      | 217 ++++++++++++++++++
 9 files changed, 476 insertions(+), 46 deletions(-)


base-commit: 6d35786de28116ecf78797a62b84e6bf3c45aa5a
-- 
2.54.0.545.g6539524ca2-goog


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2 1/5] KVM: SVM: Disable x2AVIC RDMSR interception for MSRs KVM actually supports
  2026-05-06 18:47 [PATCH v2 0/5] KVM: SVM: Fix x2AVIC MSR interception issues Sean Christopherson
@ 2026-05-06 18:47 ` Sean Christopherson
  2026-05-06 18:47 ` [PATCH v2 2/5] KVM: SVM: Always intercept RDMSR for TMCCT (current APIC timer count) Sean Christopherson
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Sean Christopherson @ 2026-05-06 18:47 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Naveen N Rao

Fix multiple (classes of) bugs with one stone by using KVM's mask of
readable local APIC registers to determine which x2APIC MSRs to pass
through (or not) when toggling x2AVIC on/off.  The existing hand-coded
list of MSRs is wrong on multiple fronts:

 - ARBPRI isn't supported by x2APIC, but its unaccelerated AVIC intercept
   is fault-like; disabling interception is nonsensical and suboptimal as
   the access generates a #VMEXIT that requires decoding the instruction.

 - DFR and ICR2 aren't supported by x2APIC and so don't need their
   intercepts disabled for performance reasons.  While the #GP due to
   x2APIC being enabled has higher priority than the trap-like #VMEXIT,
   disabling interception of unsupported MSRs is confusing and unnecessary.

 - RRR is completely unsupported.

 - AVIC currently fails to pass through the "range of vectors" registers,
   IRR, ISR, and TMR, as e.g. X2APIC_MSR(APIC_IRR) only affects IRR0, and
   thus only disables intercept for vectors 31:0 (which are the *least*
   interesting registers).

Fixes: 4d1d7942e36a ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode")
Cc: stable@vger.kernel.org
Cc: Naveen N Rao (AMD) <naveen@kernel.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/avic.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index adf211860949..4f203e503e8e 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -122,6 +122,9 @@ static u32 x2avic_max_physical_id;
 static void avic_set_x2apic_msr_interception(struct vcpu_svm *svm,
 					     bool intercept)
 {
+	struct kvm_vcpu *vcpu = &svm->vcpu;
+	u64 x2apic_readable_mask;
+
 	static const u32 x2avic_passthrough_msrs[] = {
 		X2APIC_MSR(APIC_ID),
 		X2APIC_MSR(APIC_LVR),
@@ -162,9 +165,16 @@ static void avic_set_x2apic_msr_interception(struct vcpu_svm *svm,
 	if (!x2avic_enabled)
 		return;
 
+	x2apic_readable_mask = kvm_lapic_readable_reg_mask(vcpu->arch.apic);
+
+	for_each_set_bit(i, (unsigned long *)&x2apic_readable_mask,
+			 BITS_PER_TYPE(x2apic_readable_mask))
+		svm_set_intercept_for_msr(vcpu, APIC_BASE_MSR + i,
+					  MSR_TYPE_R, intercept);
+
 	for (i = 0; i < ARRAY_SIZE(x2avic_passthrough_msrs); i++)
-		svm_set_intercept_for_msr(&svm->vcpu, x2avic_passthrough_msrs[i],
-					  MSR_TYPE_RW, intercept);
+		svm_set_intercept_for_msr(vcpu, x2avic_passthrough_msrs[i],
+					  MSR_TYPE_W, intercept);
 
 	svm->x2avic_msrs_intercepted = intercept;
 }
-- 
2.54.0.545.g6539524ca2-goog



* [PATCH v2 2/5] KVM: SVM: Always intercept RDMSR for TMCCT (current APIC timer count)
  2026-05-06 18:47 [PATCH v2 0/5] KVM: SVM: Fix x2AVIC MSR interception issues Sean Christopherson
  2026-05-06 18:47 ` [PATCH v2 1/5] KVM: SVM: Disable x2AVIC RDMSR interception for MSRs KVM actually supports Sean Christopherson
@ 2026-05-06 18:47 ` Sean Christopherson
  2026-05-06 18:47 ` [PATCH v2 3/5] KVM: SVM: Only disable x2AVIC WRMSR interception for MSRs that are accelerated Sean Christopherson
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Sean Christopherson @ 2026-05-06 18:47 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Naveen N Rao

Explicitly intercept RDMSR for TMCCT, a.k.a. the current APIC timer count,
when x2AVIC is enabled, as TMCCT reads aren't accelerated by hardware.
Disabling interception is suboptimal as the RDMSR generates an
AVIC_UNACCELERATED_ACCESS fault #VMEXIT, which forces KVM to decode the
instruction to figure out what the guest was trying to access.

Note, the only reason this isn't a fatal bug is that the AVIC architecture
had the foresight to guard against buggy hypervisors.  E.g. if hardware
simply read from the virtual APIC page, the guest would get garbage.

Fixes: 4d1d7942e36a ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode")
Cc: stable@vger.kernel.org
Cc: Naveen N Rao (AMD) <naveen@kernel.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/avic.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 4f203e503e8e..d693c9ff9f18 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -172,6 +172,9 @@ static void avic_set_x2apic_msr_interception(struct vcpu_svm *svm,
 		svm_set_intercept_for_msr(vcpu, APIC_BASE_MSR + i,
 					  MSR_TYPE_R, intercept);
 
+	if (!intercept)
+		svm_enable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_TMCCT), MSR_TYPE_R);
+
 	for (i = 0; i < ARRAY_SIZE(x2avic_passthrough_msrs); i++)
 		svm_set_intercept_for_msr(vcpu, x2avic_passthrough_msrs[i],
 					  MSR_TYPE_W, intercept);
-- 
2.54.0.545.g6539524ca2-goog



* [PATCH v2 3/5] KVM: SVM: Only disable x2AVIC WRMSR interception for MSRs that are accelerated
  2026-05-06 18:47 [PATCH v2 0/5] KVM: SVM: Fix x2AVIC MSR interception issues Sean Christopherson
  2026-05-06 18:47 ` [PATCH v2 1/5] KVM: SVM: Disable x2AVIC RDMSR interception for MSRs KVM actually supports Sean Christopherson
  2026-05-06 18:47 ` [PATCH v2 2/5] KVM: SVM: Always intercept RDMSR for TMCCT (current APIC timer count) Sean Christopherson
@ 2026-05-06 18:47 ` Sean Christopherson
  2026-05-06 18:47 ` [PATCH v2 4/5] *** DO NOT MERGE *** KVM: x86: Hack in a stat to track guest-induced exits (for testing) Sean Christopherson
  2026-05-06 18:47 ` [PATCH v2 5/5] *** DO NOT MERGE *** KVM: selftests: Add hacky test to verify x2APIC MSR interception Sean Christopherson
  4 siblings, 0 replies; 6+ messages in thread
From: Sean Christopherson @ 2026-05-06 18:47 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Naveen N Rao

When x2AVIC is enabled, disable WRMSR interception only for MSRs that are
actually accelerated by hardware.  Disabling interception for MSRs that
aren't accelerated is functionally "fine", and in some cases a weird "win"
for performance, but only for cases that should never be triggered by a
well-behaved VM (writes to read-only registers; the #GP will typically
occur in the guest without taking a #VMEXIT, even for fault-like exits).

But overall, disabling interception for MSRs that aren't accelerated is at
best confusing and unintuitive, and at worst introduces avoidable risk, as
the effective guest-visible behavior depends on the whims of the CPU (the
behavior of x2APIC MSR writes on at least Zen4 doesn't match the behavior
documented in the table in "15.29.3.1 Virtual APIC Register Accesses" of
the APM).

Note, the set of MSRs that are passed through for write is identical to
VMX's set when IPI virtualization is enabled.  This is not a coincidence,
and is another motivating factor for cleaning up the intercepts, as x2AVIC
is functionally equivalent to APICv+IPIv.

Fixes: 4d1d7942e36a ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode")
Cc: stable@vger.kernel.org
Cc: Naveen N Rao (AMD) <naveen@kernel.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/avic.c | 40 ++++------------------------------------
 1 file changed, 4 insertions(+), 36 deletions(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index d693c9ff9f18..c5d46c0d2403 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -124,39 +124,6 @@ static void avic_set_x2apic_msr_interception(struct vcpu_svm *svm,
 {
 	struct kvm_vcpu *vcpu = &svm->vcpu;
 	u64 x2apic_readable_mask;
-
-	static const u32 x2avic_passthrough_msrs[] = {
-		X2APIC_MSR(APIC_ID),
-		X2APIC_MSR(APIC_LVR),
-		X2APIC_MSR(APIC_TASKPRI),
-		X2APIC_MSR(APIC_ARBPRI),
-		X2APIC_MSR(APIC_PROCPRI),
-		X2APIC_MSR(APIC_EOI),
-		X2APIC_MSR(APIC_RRR),
-		X2APIC_MSR(APIC_LDR),
-		X2APIC_MSR(APIC_DFR),
-		X2APIC_MSR(APIC_SPIV),
-		X2APIC_MSR(APIC_ISR),
-		X2APIC_MSR(APIC_TMR),
-		X2APIC_MSR(APIC_IRR),
-		X2APIC_MSR(APIC_ESR),
-		X2APIC_MSR(APIC_ICR),
-		X2APIC_MSR(APIC_ICR2),
-
-		/*
-		 * Note!  Always intercept LVTT, as TSC-deadline timer mode
-		 * isn't virtualized by hardware, and the CPU will generate a
-		 * #GP instead of a #VMEXIT.
-		 */
-		X2APIC_MSR(APIC_LVTTHMR),
-		X2APIC_MSR(APIC_LVTPC),
-		X2APIC_MSR(APIC_LVT0),
-		X2APIC_MSR(APIC_LVT1),
-		X2APIC_MSR(APIC_LVTERR),
-		X2APIC_MSR(APIC_TMICT),
-		X2APIC_MSR(APIC_TMCCT),
-		X2APIC_MSR(APIC_TDCR),
-	};
 	int i;
 
 	if (intercept == svm->x2avic_msrs_intercepted)
@@ -175,9 +142,10 @@ static void avic_set_x2apic_msr_interception(struct vcpu_svm *svm,
 	if (!intercept)
 		svm_enable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_TMCCT), MSR_TYPE_R);
 
-	for (i = 0; i < ARRAY_SIZE(x2avic_passthrough_msrs); i++)
-		svm_set_intercept_for_msr(vcpu, x2avic_passthrough_msrs[i],
-					  MSR_TYPE_W, intercept);
+	svm_set_intercept_for_msr(vcpu, X2APIC_MSR(APIC_TASKPRI), MSR_TYPE_W, intercept);
+	svm_set_intercept_for_msr(vcpu, X2APIC_MSR(APIC_EOI), MSR_TYPE_W, intercept);
+	svm_set_intercept_for_msr(vcpu, X2APIC_MSR(APIC_SELF_IPI), MSR_TYPE_W, intercept);
+	svm_set_intercept_for_msr(vcpu, X2APIC_MSR(APIC_ICR), MSR_TYPE_W, intercept);
 
 	svm->x2avic_msrs_intercepted = intercept;
 }
-- 
2.54.0.545.g6539524ca2-goog



* [PATCH v2 4/5] *** DO NOT MERGE *** KVM: x86: Hack in a stat to track guest-induced exits (for testing)
  2026-05-06 18:47 [PATCH v2 0/5] KVM: SVM: Fix x2AVIC MSR interception issues Sean Christopherson
                   ` (2 preceding siblings ...)
  2026-05-06 18:47 ` [PATCH v2 3/5] KVM: SVM: Only disable x2AVIC WRMSR interception for MSRs that are accelerated Sean Christopherson
@ 2026-05-06 18:47 ` Sean Christopherson
  2026-05-06 18:47 ` [PATCH v2 5/5] *** DO NOT MERGE *** KVM: selftests: Add hacky test to verify x2APIC MSR interception Sean Christopherson
  4 siblings, 0 replies; 6+ messages in thread
From: Sean Christopherson @ 2026-05-06 18:47 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Naveen N Rao

Not-signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/kvm_host.h |  2 +
 arch/x86/kvm/svm/svm.c          | 81 +++++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx/vmx.c          | 79 ++++++++++++++++++++++++++++++++
 arch/x86/kvm/x86.c              |  2 +
 4 files changed, 164 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c470e40a00aa..bff534bd00dc 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1703,6 +1703,8 @@ struct kvm_vcpu_stat {
 	u64 invlpg;
 
 	u64 exits;
+	u64 guest_induced_exits;
+	u64 msr_exits;
 	u64 io_exits;
 	u64 mmio_exits;
 	u64 signal_exits;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index e7fdd7a9c280..7886bd1ad8f2 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4378,6 +4378,81 @@ static fastpath_t svm_exit_handlers_fastpath(struct kvm_vcpu *vcpu)
 	return EXIT_FASTPATH_NONE;
 }
 
+static bool is_guest_induced_exit(u64 exit_code)
+{
+	switch (exit_code) {
+	case SVM_EXIT_READ_CR0:
+	case SVM_EXIT_READ_CR3:
+	case SVM_EXIT_READ_CR4:
+	case SVM_EXIT_READ_CR8:
+	case SVM_EXIT_CR0_SEL_WRITE:
+	case SVM_EXIT_WRITE_CR0:
+	case SVM_EXIT_WRITE_CR3:
+	case SVM_EXIT_WRITE_CR4:
+	case SVM_EXIT_WRITE_CR8:
+	case SVM_EXIT_READ_DR0:
+	case SVM_EXIT_READ_DR1:
+	case SVM_EXIT_READ_DR2:
+	case SVM_EXIT_READ_DR3:
+	case SVM_EXIT_READ_DR4:
+	case SVM_EXIT_READ_DR5:
+	case SVM_EXIT_READ_DR6:
+	case SVM_EXIT_READ_DR7:
+	case SVM_EXIT_WRITE_DR0:
+	case SVM_EXIT_WRITE_DR1:
+	case SVM_EXIT_WRITE_DR2:
+	case SVM_EXIT_WRITE_DR3:
+	case SVM_EXIT_WRITE_DR4:
+	case SVM_EXIT_WRITE_DR5:
+	case SVM_EXIT_WRITE_DR6:
+	case SVM_EXIT_WRITE_DR7:
+	case SVM_EXIT_EXCP_BASE + DB_VECTOR:
+	case SVM_EXIT_EXCP_BASE + BP_VECTOR:
+	case SVM_EXIT_EXCP_BASE + UD_VECTOR:
+	case SVM_EXIT_EXCP_BASE + PF_VECTOR:
+	case SVM_EXIT_EXCP_BASE + AC_VECTOR:
+	case SVM_EXIT_EXCP_BASE + GP_VECTOR:
+	case SVM_EXIT_RDPMC:
+	case SVM_EXIT_CPUID:
+	case SVM_EXIT_IRET:
+	case SVM_EXIT_INVD:
+	case SVM_EXIT_PAUSE:
+	case SVM_EXIT_HLT:
+	case SVM_EXIT_INVLPG:
+	case SVM_EXIT_INVLPGA:
+	case SVM_EXIT_IOIO:
+	case SVM_EXIT_MSR:
+	case SVM_EXIT_TASK_SWITCH:
+	case SVM_EXIT_SHUTDOWN:
+	case SVM_EXIT_VMRUN:
+	case SVM_EXIT_VMMCALL:
+	case SVM_EXIT_VMLOAD:
+	case SVM_EXIT_VMSAVE:
+	case SVM_EXIT_STGI:
+	case SVM_EXIT_CLGI:
+	case SVM_EXIT_SKINIT:
+	case SVM_EXIT_RDTSCP:
+	case SVM_EXIT_WBINVD:
+	case SVM_EXIT_MONITOR:
+	case SVM_EXIT_MWAIT:
+	case SVM_EXIT_XSETBV:
+	case SVM_EXIT_RDPRU:
+	case SVM_EXIT_EFER_WRITE_TRAP:
+	case SVM_EXIT_CR0_WRITE_TRAP:
+	case SVM_EXIT_CR4_WRITE_TRAP:
+	case SVM_EXIT_CR8_WRITE_TRAP:
+	case SVM_EXIT_INVPCID:
+	case SVM_EXIT_IDLE_HLT:
+	case SVM_EXIT_RSM:
+	case SVM_EXIT_AVIC_INCOMPLETE_IPI:
+	case SVM_EXIT_AVIC_UNACCELERATED_ACCESS:
+		return true;
+	default:
+		break;
+	}
+	return false;
+}
+
 static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_intercepted)
 {
 	struct svm_cpu_data *sd = per_cpu_ptr(&svm_data, vcpu->cpu);
@@ -4573,6 +4648,12 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu, u64 run_flags)
 	if (is_guest_mode(vcpu))
 		svm->nested.ctl.next_rip = svm->vmcb->control.next_rip;
 
+	if (is_guest_induced_exit(svm->vmcb->control.exit_code))
+		++vcpu->stat.guest_induced_exits;
+
+	if (svm->vmcb->control.exit_code == SVM_EXIT_MSR)
+		++vcpu->stat.msr_exits;
+
 	return svm_exit_handlers_fastpath(vcpu);
 }
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 5c2c33a5f7dc..859f4bc01445 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7478,6 +7478,79 @@ noinstr void vmx_handle_nmi(struct kvm_vcpu *vcpu)
 	kvm_after_interrupt(vcpu);
 }
 
+static bool is_guest_induced_exit(struct kvm_vcpu *vcpu)
+{
+	switch (vmx_get_exit_reason(vcpu).basic) {
+	case EXIT_REASON_EXCEPTION_NMI:
+		if (is_nmi(vmx_get_intr_info(vcpu)))
+			return false;
+		return true;
+	case EXIT_REASON_TRIPLE_FAULT:
+	case EXIT_REASON_IO_INSTRUCTION:
+	case EXIT_REASON_CR_ACCESS:
+	case EXIT_REASON_DR_ACCESS:
+	case EXIT_REASON_CPUID:
+	case EXIT_REASON_MSR_READ:
+	case EXIT_REASON_MSR_WRITE:
+	case EXIT_REASON_HLT:
+	case EXIT_REASON_INVD:
+	case EXIT_REASON_INVLPG:
+	case EXIT_REASON_RDPMC:
+	case EXIT_REASON_VMCALL:
+	case EXIT_REASON_VMCLEAR:
+	case EXIT_REASON_VMLAUNCH:
+	case EXIT_REASON_VMPTRLD:
+	case EXIT_REASON_VMPTRST:
+	case EXIT_REASON_VMREAD:
+	case EXIT_REASON_VMRESUME:
+	case EXIT_REASON_VMWRITE:
+	case EXIT_REASON_VMOFF:
+	case EXIT_REASON_VMON:
+	case EXIT_REASON_TPR_BELOW_THRESHOLD:
+	case EXIT_REASON_APIC_ACCESS:
+	case EXIT_REASON_APIC_WRITE:
+	case EXIT_REASON_EOI_INDUCED:
+	case EXIT_REASON_WBINVD:
+	case EXIT_REASON_XSETBV:
+	case EXIT_REASON_TASK_SWITCH:
+	case EXIT_REASON_MCE_DURING_VMENTRY:
+	case EXIT_REASON_GDTR_IDTR:
+	case EXIT_REASON_LDTR_TR:
+	case EXIT_REASON_PAUSE_INSTRUCTION:
+	case EXIT_REASON_MWAIT_INSTRUCTION:
+	case EXIT_REASON_MONITOR_INSTRUCTION:
+	case EXIT_REASON_INVEPT:
+	case EXIT_REASON_INVVPID:
+	case EXIT_REASON_RDRAND:
+	case EXIT_REASON_RDSEED:
+	case EXIT_REASON_INVPCID:
+	case EXIT_REASON_VMFUNC:
+	case EXIT_REASON_ENCLS:
+	case EXIT_REASON_SEAMCALL:
+	case EXIT_REASON_TDCALL:
+	case EXIT_REASON_MSR_READ_IMM:
+	case EXIT_REASON_MSR_WRITE_IMM:
+		return true;
+	default:
+		break;
+	}
+	return false;
+}
+
+static bool is_msr_exit(struct kvm_vcpu *vcpu)
+{
+	switch (vmx_get_exit_reason(vcpu).basic) {
+	case EXIT_REASON_MSR_READ:
+	case EXIT_REASON_MSR_WRITE:
+	case EXIT_REASON_MSR_READ_IMM:
+	case EXIT_REASON_MSR_WRITE_IMM:
+		return true;
+	default:
+		break;
+	}
+	return false;
+}
+
 static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
 					unsigned int flags)
 {
@@ -7667,6 +7740,12 @@ fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu, u64 run_flags)
 	vmx_recover_nmi_blocking(vmx);
 	vmx_complete_interrupts(vmx);
 
+	if (is_guest_induced_exit(vcpu))
+		++vcpu->stat.guest_induced_exits;
+
+	if (is_msr_exit(vcpu))
+		++vcpu->stat.msr_exits;
+
 	return vmx_exit_handlers_fastpath(vcpu, force_immediate_exit);
 }
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0a1b63c63d1a..dc69b8cebe0b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -283,6 +283,8 @@ const struct kvm_stats_desc kvm_vcpu_stats_desc[] = {
 	STATS_DESC_COUNTER(VCPU, tlb_flush),
 	STATS_DESC_COUNTER(VCPU, invlpg),
 	STATS_DESC_COUNTER(VCPU, exits),
+	STATS_DESC_COUNTER(VCPU, guest_induced_exits),
+	STATS_DESC_COUNTER(VCPU, msr_exits),
 	STATS_DESC_COUNTER(VCPU, io_exits),
 	STATS_DESC_COUNTER(VCPU, mmio_exits),
 	STATS_DESC_COUNTER(VCPU, signal_exits),
-- 
2.54.0.545.g6539524ca2-goog



* [PATCH v2 5/5] *** DO NOT MERGE *** KVM: selftests: Add hacky test to verify x2APIC MSR interception
  2026-05-06 18:47 [PATCH v2 0/5] KVM: SVM: Fix x2AVIC MSR interception issues Sean Christopherson
                   ` (3 preceding siblings ...)
  2026-05-06 18:47 ` [PATCH v2 4/5] *** DO NOT MERGE *** KVM: x86: Hack in a stat to track guest-induced exits (for testing) Sean Christopherson
@ 2026-05-06 18:47 ` Sean Christopherson
  4 siblings, 0 replies; 6+ messages in thread
From: Sean Christopherson @ 2026-05-06 18:47 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Naveen N Rao

Not-signed-off-by: Sean Christopherson <seanjc@google.com>
---
 .../testing/selftests/kvm/include/x86/apic.h  |  84 ++++++-
 .../selftests/kvm/x86/fix_hypercall_test.c    |   2 +-
 .../selftests/kvm/x86/xapic_ipi_test.c        |   4 +-
 .../selftests/kvm/x86/xapic_state_test.c      | 217 ++++++++++++++++++
 4 files changed, 296 insertions(+), 11 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86/apic.h b/tools/testing/selftests/kvm/include/x86/apic.h
index 31887bdc3d6c..3c3e362f2b98 100644
--- a/tools/testing/selftests/kvm/include/x86/apic.h
+++ b/tools/testing/selftests/kvm/include/x86/apic.h
@@ -23,21 +23,64 @@
 
 #define APIC_BASE_MSR	0x800
 #define X2APIC_ENABLE	(1UL << 10)
+
+#define APIC_DELIVERY_MODE_FIXED	0
+#define APIC_DELIVERY_MODE_LOWESTPRIO	1
+#define APIC_DELIVERY_MODE_SMI		2
+#define APIC_DELIVERY_MODE_NMI		4
+#define APIC_DELIVERY_MODE_INIT		5
+#define APIC_DELIVERY_MODE_EXTINT	7
+
 #define	APIC_ID		0x20
+
 #define	APIC_LVR	0x30
-#define		GET_APIC_ID_FIELD(x)	(((x) >> 24) & 0xFF)
+#define		APIC_LVR_MASK		0xFF00FF
+#define		APIC_LVR_DIRECTED_EOI	(1 << 24)
+#define		GET_APIC_VERSION(x)	((x) & 0xFFu)
+#define		GET_APIC_MAXLVT(x)	(((x) >> 16) & 0xFFu)
+#ifdef CONFIG_X86_32
+#  define	APIC_INTEGRATED(x)	((x) & 0xF0u)
+#else
+#  define	APIC_INTEGRATED(x)	(1)
+#endif
+#define		APIC_XAPIC(x)		((x) >= 0x14)
+#define		APIC_EXT_SPACE(x)	((x) & 0x80000000)
 #define	APIC_TASKPRI	0x80
+#define		APIC_TPRI_MASK		0xFFu
+#define	APIC_ARBPRI	0x90
+#define		APIC_ARBPRI_MASK	0xFFu
 #define	APIC_PROCPRI	0xA0
-#define	GET_APIC_PRI(x) (((x) & GENMASK(7, 4)) >> 4)
-#define	SET_APIC_PRI(x, y) (((x) & ~GENMASK(7, 4)) | (y << 4))
+#define		GET_APIC_PRI(x) (((x) & GENMASK(7, 4)) >> 4)
+#define		SET_APIC_PRI(x, y) (((x) & ~GENMASK(7, 4)) | (y << 4))
 #define	APIC_EOI	0xB0
+#define		APIC_EOI_ACK		0x0 /* Docs say 0 for future compat. */
+#define	APIC_RRR	0xC0
+#define	APIC_LDR	0xD0
+#define		APIC_LDR_MASK		(0xFFu << 24)
+#define		GET_APIC_LOGICAL_ID(x)	(((x) >> 24) & 0xFFu)
+#define		SET_APIC_LOGICAL_ID(x)	(((x) << 24))
+#define		APIC_ALL_CPUS		0xFFu
+#define	APIC_DFR	0xE0
+#define		APIC_DFR_CLUSTER		0x0FFFFFFFul
+#define		APIC_DFR_FLAT			0xFFFFFFFFul
 #define	APIC_SPIV	0xF0
+#define		APIC_SPIV_DIRECTED_EOI		(1 << 12)
 #define		APIC_SPIV_FOCUS_DISABLED	(1 << 9)
 #define		APIC_SPIV_APIC_ENABLED		(1 << 8)
 #define	APIC_ISR	0x100
-#define APIC_IRR	0x200
+#define	APIC_ISR_NR     0x8     /* Number of 32 bit ISR registers. */
+#define	APIC_TMR	0x180
+#define	APIC_IRR	0x200
+#define	APIC_ESR	0x280
+#define		APIC_ESR_SEND_CS	0x00001
+#define		APIC_ESR_RECV_CS	0x00002
+#define		APIC_ESR_SEND_ACC	0x00004
+#define		APIC_ESR_RECV_ACC	0x00008
+#define		APIC_ESR_SENDILL	0x00020
+#define		APIC_ESR_RECVILL	0x00040
+#define		APIC_ESR_ILLREGA	0x00080
+#define 	APIC_LVTCMCI	0x2f0
 #define	APIC_ICR	0x300
-#define	APIC_LVTCMCI	0x2f0
 #define		APIC_DEST_SELF		0x40000
 #define		APIC_DEST_ALLINC	0x80000
 #define		APIC_DEST_ALLBUT	0xC0000
@@ -61,16 +104,41 @@
 #define		APIC_DM_EXTINT		0x00700
 #define		APIC_VECTOR_MASK	0x000FF
 #define	APIC_ICR2	0x310
-#define		SET_APIC_DEST_FIELD(x)	((x) << 24)
-#define APIC_LVTT	0x320
+#define		GET_XAPIC_DEST_FIELD(x)	(((x) >> 24) & 0xFF)
+#define		SET_XAPIC_DEST_FIELD(x)	((x) << 24)
+#define	APIC_LVTT	0x320
+#define	APIC_LVTTHMR	0x330
+#define	APIC_LVTPC	0x340
+#define	APIC_LVT0	0x350
 #define		APIC_LVT_TIMER_ONESHOT		(0 << 17)
 #define		APIC_LVT_TIMER_PERIODIC		(1 << 17)
 #define		APIC_LVT_TIMER_TSCDEADLINE	(2 << 17)
 #define		APIC_LVT_MASKED			(1 << 16)
+#define		APIC_LVT_LEVEL_TRIGGER		(1 << 15)
+#define		APIC_LVT_REMOTE_IRR		(1 << 14)
+#define		APIC_INPUT_POLARITY		(1 << 13)
+#define		APIC_SEND_PENDING		(1 << 12)
+#define		APIC_MODE_MASK			0x700
+#define		GET_APIC_DELIVERY_MODE(x)	(((x) >> 8) & 0x7)
+#define		SET_APIC_DELIVERY_MODE(x, y)	(((x) & ~0x700) | ((y) << 8))
+#define			APIC_MODE_FIXED		0x0
+#define			APIC_MODE_NMI		0x4
+#define			APIC_MODE_EXTINT	0x7
+#define	APIC_LVT1	0x360
+#define	APIC_LVTERR	0x370
 #define	APIC_TMICT	0x380
 #define	APIC_TMCCT	0x390
 #define	APIC_TDCR	0x3E0
-#define	APIC_SELF_IPI	0x3F0
+#define APIC_SELF_IPI	0x3F0
+#define		APIC_TDR_DIV_TMBASE	(1 << 2)
+#define		APIC_TDR_DIV_1		0xB
+#define		APIC_TDR_DIV_2		0x0
+#define		APIC_TDR_DIV_4		0x1
+#define		APIC_TDR_DIV_8		0x2
+#define		APIC_TDR_DIV_16		0x3
+#define		APIC_TDR_DIV_32		0x8
+#define		APIC_TDR_DIV_64		0x9
+#define		APIC_TDR_DIV_128	0xA
 
 #define APIC_VECTOR_TO_BIT_NUMBER(v) ((unsigned int)(v) % 32)
 #define APIC_VECTOR_TO_REG_OFFSET(v) ((unsigned int)(v) / 32 * 0x10)
diff --git a/tools/testing/selftests/kvm/x86/fix_hypercall_test.c b/tools/testing/selftests/kvm/x86/fix_hypercall_test.c
index 753a0e730ea8..ad61da99ee4c 100644
--- a/tools/testing/selftests/kvm/x86/fix_hypercall_test.c
+++ b/tools/testing/selftests/kvm/x86/fix_hypercall_test.c
@@ -63,7 +63,7 @@ static void guest_main(void)
 
 	memcpy(hypercall_insn, other_hypercall_insn, HYPERCALL_INSN_SIZE);
 
-	ret = do_sched_yield(GET_APIC_ID_FIELD(xapic_read_reg(APIC_ID)));
+	ret = do_sched_yield(GET_XAPIC_DEST_FIELD(xapic_read_reg(APIC_ID)));
 
 	/*
 	 * If the quirk is disabled, verify that guest_ud_handler() "returned"
diff --git a/tools/testing/selftests/kvm/x86/xapic_ipi_test.c b/tools/testing/selftests/kvm/x86/xapic_ipi_test.c
index 39ce9a9369f5..75b87f850abc 100644
--- a/tools/testing/selftests/kvm/x86/xapic_ipi_test.c
+++ b/tools/testing/selftests/kvm/x86/xapic_ipi_test.c
@@ -91,7 +91,7 @@ static void halter_guest_code(struct test_data_page *data)
 	verify_apic_base_addr();
 	xapic_enable();
 
-	data->halter_apic_id = GET_APIC_ID_FIELD(xapic_read_reg(APIC_ID));
+	data->halter_apic_id = GET_XAPIC_DEST_FIELD(xapic_read_reg(APIC_ID));
 	data->halter_lvr = xapic_read_reg(APIC_LVR);
 
 	/*
@@ -147,7 +147,7 @@ static void sender_guest_code(struct test_data_page *data)
 	 * set data->halter_apic_id.
 	 */
 	icr_val = (APIC_DEST_PHYSICAL | APIC_DM_FIXED | IPI_VECTOR);
-	icr2_val = SET_APIC_DEST_FIELD(data->halter_apic_id);
+	icr2_val = SET_XAPIC_DEST_FIELD(data->halter_apic_id);
 	data->icr = icr_val;
 	data->icr2 = icr2_val;
 
diff --git a/tools/testing/selftests/kvm/x86/xapic_state_test.c b/tools/testing/selftests/kvm/x86/xapic_state_test.c
index 637bb90c1d93..3c7c6a5485e4 100644
--- a/tools/testing/selftests/kvm/x86/xapic_state_test.c
+++ b/tools/testing/selftests/kvm/x86/xapic_state_test.c
@@ -222,6 +222,221 @@ static void test_x2apic_id(void)
 	kvm_vm_free(vm);
 }
 
+#define X2APIC_MSR(r) (0x800 + ((r) >> 4))
+
+static bool is_x2apic_mode = true;
+
+static bool is_ro_only_reg(int reg)
+{
+	switch (reg) {
+	case APIC_ID:
+	case APIC_LVR:
+	case APIC_PROCPRI:
+	case APIC_LDR:
+	case APIC_ARBPRI:
+	case APIC_ISR:
+	case APIC_TMR:
+	case APIC_IRR:
+	case APIC_TMCCT:
+		return true;
+	}
+	return false;
+}
+
+static bool is_xapic_only_reg(int reg)
+{
+	return reg == APIC_ARBPRI || reg == APIC_DFR || reg == APIC_ICR2;
+}
+
+static bool is_accelerated_reg(int reg, bool write)
+{
+	if (!write)
+		return reg != APIC_TMCCT;
+
+	switch (reg) {
+	case APIC_TASKPRI:
+	case APIC_EOI:
+	case APIC_SELF_IPI:
+	case APIC_ICR:
+		return true;
+	default:
+		break;
+	}
+	return false;
+}
+
+static void x2apic_msr_guest_code(void)
+{
+	const u32 xapic_regs[] = {
+		APIC_ID,
+		APIC_LVR,
+		APIC_TASKPRI,
+		APIC_PROCPRI,
+		APIC_LDR,
+		APIC_SPIV,
+		APIC_ISR,
+		APIC_TMR,
+		APIC_IRR,
+		APIC_ESR,
+		APIC_LVTT,
+		APIC_LVTTHMR,
+		APIC_LVTPC,
+		APIC_LVT0,
+		APIC_LVT1,
+		APIC_LVTERR,
+		APIC_TMICT,
+		APIC_TMCCT,
+		APIC_TDCR,
+
+		// APIC_EOI,
+		// APIC_SELF_IPI,
+		// APIC_ICR,
+
+		APIC_ARBPRI,
+		APIC_DFR,
+		APIC_ICR2,
+	};
+	int i, j;
+	u64 val;
+	u32 msr;
+	u8 vec;
+
+	cli();
+
+	if (is_x2apic_mode)
+		x2apic_enable();
+
+	GUEST_SYNC(0xbeef);
+
+	for (i = 0; i < ARRAY_SIZE(xapic_regs); i++) {
+		int nr_regs;
+		u8 rd, wr;
+
+		if (!is_x2apic_mode || is_xapic_only_reg(xapic_regs[i])) {
+			rd = wr = GP_VECTOR;
+		} else {
+			rd = 0;
+			wr = is_ro_only_reg(xapic_regs[i]) ? GP_VECTOR : 0;
+		}
+
+		if (xapic_regs[i] == APIC_IRR ||
+		    xapic_regs[i] == APIC_ISR ||
+		    xapic_regs[i] == APIC_TMR)
+			nr_regs = APIC_ISR_NR;
+		else
+			nr_regs = 1;
+
+		for (j = 0; j < nr_regs; j++) {
+			msr = X2APIC_MSR(xapic_regs[i] + j * 0x10);
+
+			vec = rdmsr_safe(msr, &val);
+			__GUEST_ASSERT(vec == rd,
+				       "Wanted %s on RDMSR(%x), got %s",
+				       ex_str(rd), msr, ex_str(vec));
+			GUEST_SYNC3(xapic_regs[i], false, vec);
+
+			vec = wrmsr_safe(msr, 0);
+			__GUEST_ASSERT(vec == wr,
+				       "Wanted %s on WRMSR(%x), got %s",
+				       ex_str(wr), msr, ex_str(vec));
+
+			GUEST_SYNC3(xapic_regs[i], true, vec);
+		}
+	}
+	GUEST_DONE();
+}
+
+static void test_x2apic_msr_intercepts(void)
+{
+	u64 last_guest, last_io, last_msr, guest_exits, io_exits, msr_exits;
+	struct kvm_vcpu *vcpu;
+	struct kvm_vm *vm;
+	struct ucall uc;
+
+	vm = vm_create_with_one_vcpu(&vcpu, x2apic_msr_guest_code);
+
+	TEST_ASSERT_EQ(vcpu_get_stat(vcpu, guest_induced_exits), 0);
+
+	vcpu_run(vcpu);
+	TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_IO);
+
+	TEST_ASSERT_EQ(get_ucall(vcpu, NULL), UCALL_SYNC);
+
+	for ( ;; ) {
+		last_guest = vcpu_get_stat(vcpu, guest_induced_exits);
+		last_io = vcpu_get_stat(vcpu, io_exits);
+		last_msr = vcpu_get_stat(vcpu, msr_exits);
+
+		vcpu_run(vcpu);
+		TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_IO);
+
+		guest_exits = vcpu_get_stat(vcpu, guest_induced_exits);
+		io_exits = vcpu_get_stat(vcpu, io_exits);
+		msr_exits = vcpu_get_stat(vcpu, msr_exits);
+
+		TEST_ASSERT_EQ(io_exits, last_io + 1);
+
+		switch (get_ucall(vcpu, &uc)) {
+		case UCALL_SYNC: {
+			int vector = uc.args[2];
+			bool write = uc.args[1];
+			u32 reg = uc.args[0];
+
+			printf("reg = %x, write = %u, fault = %u\n", reg, write, vector);
+			if (vector || !is_accelerated_reg(reg, write)) {
+				TEST_ASSERT_EQ(msr_exits, last_msr + 1);
+				TEST_ASSERT_EQ(guest_exits - last_guest,
+					       io_exits - last_io + msr_exits - last_msr);
+			} else {
+				TEST_ASSERT_EQ(msr_exits, last_msr);
+				TEST_ASSERT_EQ(guest_exits - last_guest,
+					       io_exits - last_io);
+			}
+			// printf("On msr = %x\n", msr);
+			break;
+		}
+		case UCALL_DONE:
+			goto test_xapic;
+		case UCALL_ABORT:
+			REPORT_GUEST_ASSERT(uc);
+		default:
+			TEST_FAIL("Unknown ucall %lu", uc.cmd);
+		}
+	}
+
+test_xapic:
+	kvm_vm_free(vm);
+
+	vm = vm_create_with_one_vcpu(&vcpu, x2apic_msr_guest_code);
+	vcpu_clear_cpuid_feature(vcpu, X86_FEATURE_X2APIC);
+	is_x2apic_mode = false;
+	sync_global_to_guest(vm, is_x2apic_mode);
+
+	for ( ;; ) {
+		vcpu_run(vcpu);
+		TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_IO);
+
+		// vcpu_get_stat(vcpu, io_exits);
+		// vcpu_get_stat(vcpu, msr_exits);
+		// vcpu_get_stat(vcpu, guest_induced_exits);
+
+		switch (get_ucall(vcpu, &uc)) {
+		case UCALL_SYNC:
+			// u32 msr = uc.args[1];
+
+			// printf("On msr = %x\n", msr);
+			break;
+		case UCALL_DONE:
+			goto done;
+		case UCALL_ABORT:
+			REPORT_GUEST_ASSERT(uc);
+		default:
+			TEST_FAIL("Unknown ucall %lu", uc.cmd);
+		}
+	}
+done:;
+}
+
 int main(int argc, char *argv[])
 {
 	struct xapic_vcpu x = {
@@ -230,6 +445,8 @@ int main(int argc, char *argv[])
 	};
 	struct kvm_vm *vm;
 
+	test_x2apic_msr_intercepts();
+
 	vm = vm_create_with_one_vcpu(&x.vcpu, x2apic_guest_code);
 	test_icr(&x);
 	kvm_vm_free(vm);
-- 
2.54.0.545.g6539524ca2-goog


