From: Sean Christopherson <seanjc@google.com>
To: Naveen N Rao <naveen@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 2/5] KVM: SVM: Always intercept RDMSR for TMCCT (current APIC timer count)
Date: Thu, 7 May 2026 08:44:06 -0700 [thread overview]
Message-ID: <afyzRvFqb19Kodej@google.com> (raw)
In-Reply-To: <afyfFRkUIHuodvG2@blrnaveerao1>
On Thu, May 07, 2026, Naveen N Rao wrote:
> On Wed, May 06, 2026 at 11:47:43AM -0700, Sean Christopherson wrote:
> > Explicitly intercept RDMSR for TMMCT, a.k.a. the current APIC timer count,
> > when x2AVIC is enabled, as TMMCT reads aren't accelerated by hardware.
>
> s/TMMCT/TMCCT for the above two lines.
>
> > Disabling interception is suboptimal as the RDMSR generates an
> > AVIC_UNACCELERATED_ACCESS fault #VMEXIT, which forces KVM to decode the
> > instruction to figure out what the guest was trying to access.
> >
> > Note, the only reason this isn't a fatal bug is that the AVIC architecture
> > had the foresight to guard against buggy hypervisors. E.g. if hardware
> > simply read from the virtual APIC page, the guest would get garbage.
> >
> > Fixes: 4d1d7942e36a ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode")
> > Cc: stable@vger.kernel.org
> > Cc: Naveen N Rao (AMD) <naveen@kernel.org>
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> > arch/x86/kvm/svm/avic.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> > index 4f203e503e8e..d693c9ff9f18 100644
> > --- a/arch/x86/kvm/svm/avic.c
> > +++ b/arch/x86/kvm/svm/avic.c
> > @@ -172,6 +172,9 @@ static void avic_set_x2apic_msr_interception(struct vcpu_svm *svm,
> > svm_set_intercept_for_msr(vcpu, APIC_BASE_MSR + i,
> > MSR_TYPE_R, intercept);
> >
> > + if (!intercept)
> > + svm_enable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_TMCCT), MSR_TYPE_R);
> > +
>
> Nit: I'm thinking it might be better to roll this into the previous
> loop. That way, all MSR_TYPE_R intercepts are setup in one place and we
> don't need to parse the if (!intercept) condition..
>
> Something like this?
>
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index c5d46c0d2403..f292cba45e07 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -136,11 +136,9 @@ static void avic_set_x2apic_msr_interception(struct vcpu_svm *svm,
>
> for_each_set_bit(i, (unsigned long *)&x2apic_readable_mask,
> BITS_PER_TYPE(x2apic_readable_mask))
> - svm_set_intercept_for_msr(vcpu, APIC_BASE_MSR + i,
> - MSR_TYPE_R, intercept);
> -
> - if (!intercept)
> - svm_enable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_TMCCT), MSR_TYPE_R);
> + if (APIC_BASE_MSR + i != X2APIC_MSR(APIC_TMCCT))
> + svm_set_intercept_for_msr(vcpu, APIC_BASE_MSR + i,
> + MSR_TYPE_R, intercept);
Hmm, I don't love burying the check in the for-loop, as it makes it a bit hard to
see that KVM *never* toggles the intercept for TMCCT. And in the unlikely scenario
we need to exempt more MSRs, this will be annoying to maintain.
What about adjusting the mask straightaway?
diff --git arch/x86/kvm/lapic.h arch/x86/kvm/lapic.h
index 274885af4ebc..2842a7b70d74 100644
--- arch/x86/kvm/lapic.h
+++ arch/x86/kvm/lapic.h
@@ -156,6 +156,8 @@ int kvm_hv_vapic_msr_read(struct kvm_vcpu *vcpu, u32 msr, u64 *data);
int kvm_lapic_set_pv_eoi(struct kvm_vcpu *vcpu, u64 data, unsigned long len);
void kvm_lapic_exit(void);
+#define APIC_REG_MASK(reg) (1ull << ((reg) >> 4))
+
u64 kvm_lapic_readable_reg_mask(struct kvm_lapic *apic);
static inline void kvm_lapic_set_irr(int vec, struct kvm_lapic *apic)
diff --git arch/x86/kvm/svm/avic.c arch/x86/kvm/svm/avic.c
index c5d46c0d2403..e51f468477a4 100644
--- arch/x86/kvm/svm/avic.c
+++ arch/x86/kvm/svm/avic.c
@@ -132,16 +132,14 @@ static void avic_set_x2apic_msr_interception(struct vcpu_svm *svm,
if (!x2avic_enabled)
return;
- x2apic_readable_mask = kvm_lapic_readable_reg_mask(vcpu->arch.apic);
+ x2apic_readable_mask = kvm_lapic_readable_reg_mask(vcpu->arch.apic) &
+ ~APIC_REG_MASK(APIC_TMCCT);
for_each_set_bit(i, (unsigned long *)&x2apic_readable_mask,
BITS_PER_TYPE(x2apic_readable_mask))
svm_set_intercept_for_msr(vcpu, APIC_BASE_MSR + i,
MSR_TYPE_R, intercept);
- if (!intercept)
- svm_enable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_TMCCT), MSR_TYPE_R);
-
svm_set_intercept_for_msr(vcpu, X2APIC_MSR(APIC_TASKPRI), MSR_TYPE_W, intercept);
svm_set_intercept_for_msr(vcpu, X2APIC_MSR(APIC_EOI), MSR_TYPE_W, intercept);
svm_set_intercept_for_msr(vcpu, X2APIC_MSR(APIC_SELF_IPI), MSR_TYPE_W, intercept);
Oh! Actually, even better! This is a great opportunity to dedup Intel vs. AMD
(and we can/should do the same for writes).
diff --git arch/x86/kvm/lapic.c arch/x86/kvm/lapic.c
index 4078e624ca66..ac57d6dbe032 100644
--- arch/x86/kvm/lapic.c
+++ arch/x86/kvm/lapic.c
@@ -1730,7 +1730,7 @@ static inline struct kvm_lapic *to_lapic(struct kvm_io_device *dev)
#define APIC_REGS_MASK(first, count) \
(APIC_REG_MASK(first) * ((1ull << (count)) - 1))
-u64 kvm_lapic_readable_reg_mask(struct kvm_lapic *apic)
+static u64 kvm_lapic_readable_reg_mask(struct kvm_lapic *apic)
{
/* Leave bits '0' for reserved and write-only registers. */
u64 valid_reg_mask =
@@ -1766,7 +1766,22 @@ u64 kvm_lapic_readable_reg_mask(struct kvm_lapic *apic)
return valid_reg_mask;
}
-EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_lapic_readable_reg_mask);
+
+u64 kvm_x2apic_disable_intercept_reg_mask(struct kvm_vcpu *vcpu)
+{
+ if (WARN_ON_ONCE(!lapic_in_kernel(vcpu)))
+ return 0;
+
+ /*
+ * TMMCT, a.k.a. the current APIC timer count, reads aren't accelerated
+ * by hardware (Intel or AMD), and handling the fault-like APIC-access
+ * VM-Exit is more expensive than handling a WRMSR VM-Exit (because the
+ * APIC-access path requires slow emulation of the code stream).
+ */
+ return kvm_lapic_readable_reg_mask(vcpu->arch.apic) &
+ ~APIC_REG_MASK(APIC_TMCCT);
+}
+EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_x2apic_disable_intercept_reg_mask);
static int kvm_lapic_reg_read(struct kvm_lapic *apic, u32 offset, int len,
void *data)
diff --git arch/x86/kvm/lapic.h arch/x86/kvm/lapic.h
index 274885af4ebc..7b61167e1b5c 100644
--- arch/x86/kvm/lapic.h
+++ arch/x86/kvm/lapic.h
@@ -156,7 +156,7 @@ int kvm_hv_vapic_msr_read(struct kvm_vcpu *vcpu, u32 msr, u64 *data);
int kvm_lapic_set_pv_eoi(struct kvm_vcpu *vcpu, u64 data, unsigned long len);
void kvm_lapic_exit(void);
-u64 kvm_lapic_readable_reg_mask(struct kvm_lapic *apic);
+u64 kvm_x2apic_disable_intercept_reg_mask(struct kvm_vcpu *vcpu);
static inline void kvm_lapic_set_irr(int vec, struct kvm_lapic *apic)
{
diff --git arch/x86/kvm/svm/avic.c arch/x86/kvm/svm/avic.c
index c5d46c0d2403..1532bfff5686 100644
--- arch/x86/kvm/svm/avic.c
+++ arch/x86/kvm/svm/avic.c
@@ -132,16 +132,13 @@ static void avic_set_x2apic_msr_interception(struct vcpu_svm *svm,
if (!x2avic_enabled)
return;
- x2apic_readable_mask = kvm_lapic_readable_reg_mask(vcpu->arch.apic);
+ x2apic_readable_mask = kvm_x2apic_disable_intercept_reg_mask(vcpu);
for_each_set_bit(i, (unsigned long *)&x2apic_readable_mask,
BITS_PER_TYPE(x2apic_readable_mask))
svm_set_intercept_for_msr(vcpu, APIC_BASE_MSR + i,
MSR_TYPE_R, intercept);
- if (!intercept)
- svm_enable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_TMCCT), MSR_TYPE_R);
-
svm_set_intercept_for_msr(vcpu, X2APIC_MSR(APIC_TASKPRI), MSR_TYPE_W, intercept);
svm_set_intercept_for_msr(vcpu, X2APIC_MSR(APIC_EOI), MSR_TYPE_W, intercept);
svm_set_intercept_for_msr(vcpu, X2APIC_MSR(APIC_SELF_IPI), MSR_TYPE_W, intercept);
diff --git arch/x86/kvm/vmx/vmx.c arch/x86/kvm/vmx/vmx.c
index 859f4bc01445..40dff50e800a 100644
--- arch/x86/kvm/vmx/vmx.c
+++ arch/x86/kvm/vmx/vmx.c
@@ -4138,7 +4138,7 @@ static void vmx_update_msr_bitmap_x2apic(struct kvm_vcpu *vcpu)
* mode, only the current timer count needs on-demand emulation by KVM.
*/
if (mode & MSR_BITMAP_MODE_X2APIC_APICV)
- msr_bitmap[read_idx] = ~kvm_lapic_readable_reg_mask(vcpu->arch.apic);
+ msr_bitmap[read_idx] = ~kvm_x2apic_disable_intercept_reg_mask(vcpu);
else
msr_bitmap[read_idx] = ~0ull;
msr_bitmap[write_idx] = ~0ull;
@@ -4151,7 +4151,6 @@ static void vmx_update_msr_bitmap_x2apic(struct kvm_vcpu *vcpu)
!(mode & MSR_BITMAP_MODE_X2APIC));
if (mode & MSR_BITMAP_MODE_X2APIC_APICV) {
- vmx_enable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_TMCCT), MSR_TYPE_RW);
vmx_disable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_EOI), MSR_TYPE_W);
vmx_disable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_SELF_IPI), MSR_TYPE_W);
if (enable_ipiv)
next prev parent reply other threads:[~2026-05-07 15:44 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-06 18:47 [PATCH v2 0/5] KVM: SVM: Fix x2AVIC MSR interception issues Sean Christopherson
2026-05-06 18:47 ` [PATCH v2 1/5] KVM: SVM: Disable x2AVIC RDMSR interception for MSRs KVM actually supports Sean Christopherson
2026-05-07 13:56 ` Naveen N Rao
2026-05-07 14:27 ` Sean Christopherson
2026-05-08 16:35 ` Naveen N Rao
2026-05-06 18:47 ` [PATCH v2 2/5] KVM: SVM: Always intercept RDMSR for TMCCT (current APIC timer count) Sean Christopherson
2026-05-07 14:19 ` Naveen N Rao
2026-05-07 15:44 ` Sean Christopherson [this message]
2026-05-07 18:26 ` Sean Christopherson
2026-05-08 16:41 ` Naveen N Rao
2026-05-08 16:56 ` Sean Christopherson
2026-05-06 18:47 ` [PATCH v2 3/5] KVM: SVM: Only disable x2AVIC WRMSR interception for MSRs that are accelerated Sean Christopherson
2026-05-08 16:59 ` Naveen N Rao
2026-05-06 18:47 ` [PATCH v2 4/5] *** DO NOT MERGE *** KVM: x86: Hack in a stat to track guest-induced exits (for testing) Sean Christopherson
2026-05-08 17:14 ` Naveen N Rao
2026-05-08 17:49 ` Sean Christopherson
2026-05-09 5:08 ` Naveen N Rao
2026-05-06 18:47 ` [PATCH v2 5/5] *** DO NOT MERGE *** KVM: selftests: Add hacky test to verify x2APIC MSR interception Sean Christopherson
2026-05-09 5:10 ` [PATCH v2 0/5] KVM: SVM: Fix x2AVIC MSR interception issues Naveen N Rao
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=afyzRvFqb19Kodej@google.com \
--to=seanjc@google.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=naveen@kernel.org \
--cc=pbonzini@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox