The Linux Kernel Mailing List
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Naveen N Rao <naveen@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 2/5] KVM: SVM: Always intercept RDMSR for TMCCT (current APIC timer count)
Date: Thu, 7 May 2026 08:44:06 -0700	[thread overview]
Message-ID: <afyzRvFqb19Kodej@google.com> (raw)
In-Reply-To: <afyfFRkUIHuodvG2@blrnaveerao1>

On Thu, May 07, 2026, Naveen N Rao wrote:
> On Wed, May 06, 2026 at 11:47:43AM -0700, Sean Christopherson wrote:
> > Explicitly intercept RDMSR for TMMCT, a.k.a. the current APIC timer count,
> > when x2AVIC is enabled, as TMMCT reads aren't accelerated by hardware.
> 
> s/TMMCT/TMCCT for the above two lines.
> 
> > Disabling interception is suboptimal as the RDMSR generates an
> > AVIC_UNACCELERATED_ACCESS fault #VMEXIT, which forces KVM to decode the
> > instruction to figure out what the guest was trying to access.
> > 
> > Note, the only reason this isn't a fatal bug is that the AVIC architecture
> > had the foresight to guard against buggy hypervisors.  E.g. if hardware
> > simply read from the virtual APIC page, the guest would get garbage.
> > 
> > Fixes: 4d1d7942e36a ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode")
> > Cc: stable@vger.kernel.org
> > Cc: Naveen N Rao (AMD) <naveen@kernel.org>
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> >  arch/x86/kvm/svm/avic.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> > index 4f203e503e8e..d693c9ff9f18 100644
> > --- a/arch/x86/kvm/svm/avic.c
> > +++ b/arch/x86/kvm/svm/avic.c
> > @@ -172,6 +172,9 @@ static void avic_set_x2apic_msr_interception(struct vcpu_svm *svm,
> >  		svm_set_intercept_for_msr(vcpu, APIC_BASE_MSR + i,
> >  					  MSR_TYPE_R, intercept);
> >  
> > +	if (!intercept)
> > +		svm_enable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_TMCCT), MSR_TYPE_R);
> > +
> 
> Nit: I'm thinking it might be better to roll this into the previous 
> loop. That way, all MSR_TYPE_R intercepts are setup in one place and we 
> don't need to parse the if (!intercept) condition..
> 
> Something like this?
> 
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index c5d46c0d2403..f292cba45e07 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -136,11 +136,9 @@ static void avic_set_x2apic_msr_interception(struct vcpu_svm *svm,
> 
>         for_each_set_bit(i, (unsigned long *)&x2apic_readable_mask,
>                          BITS_PER_TYPE(x2apic_readable_mask))
> -               svm_set_intercept_for_msr(vcpu, APIC_BASE_MSR + i,
> -                                         MSR_TYPE_R, intercept);
> -
> -       if (!intercept)
> -               svm_enable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_TMCCT), MSR_TYPE_R);
> +               if (APIC_BASE_MSR + i != X2APIC_MSR(APIC_TMCCT))
> +                       svm_set_intercept_for_msr(vcpu, APIC_BASE_MSR + i,
> +                                                 MSR_TYPE_R, intercept);

Hmm, I don't love burying the check in the for-loop, as it makes it a bit hard to
see that KVM *never* toggles the intercept for TMCCT.  And in the unlikely scenario
we need to exempt more MSRs, this will be annoying to maintain.

What about adjusting the mask straightaway?

diff --git arch/x86/kvm/lapic.h arch/x86/kvm/lapic.h
index 274885af4ebc..2842a7b70d74 100644
--- arch/x86/kvm/lapic.h
+++ arch/x86/kvm/lapic.h
@@ -156,6 +156,8 @@ int kvm_hv_vapic_msr_read(struct kvm_vcpu *vcpu, u32 msr, u64 *data);
 int kvm_lapic_set_pv_eoi(struct kvm_vcpu *vcpu, u64 data, unsigned long len);
 void kvm_lapic_exit(void);
 
+#define APIC_REG_MASK(reg)     (1ull << ((reg) >> 4))
+
 u64 kvm_lapic_readable_reg_mask(struct kvm_lapic *apic);
 
 static inline void kvm_lapic_set_irr(int vec, struct kvm_lapic *apic)
diff --git arch/x86/kvm/svm/avic.c arch/x86/kvm/svm/avic.c
index c5d46c0d2403..e51f468477a4 100644
--- arch/x86/kvm/svm/avic.c
+++ arch/x86/kvm/svm/avic.c
@@ -132,16 +132,14 @@ static void avic_set_x2apic_msr_interception(struct vcpu_svm *svm,
        if (!x2avic_enabled)
                return;
 
-       x2apic_readable_mask = kvm_lapic_readable_reg_mask(vcpu->arch.apic);
+       x2apic_readable_mask = kvm_lapic_readable_reg_mask(vcpu->arch.apic) &
+                              ~APIC_REG_MASK(APIC_TMCCT);
 
        for_each_set_bit(i, (unsigned long *)&x2apic_readable_mask,
                         BITS_PER_TYPE(x2apic_readable_mask))
                svm_set_intercept_for_msr(vcpu, APIC_BASE_MSR + i,
                                          MSR_TYPE_R, intercept);
 
-       if (!intercept)
-               svm_enable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_TMCCT), MSR_TYPE_R);
-
        svm_set_intercept_for_msr(vcpu, X2APIC_MSR(APIC_TASKPRI), MSR_TYPE_W, intercept);
        svm_set_intercept_for_msr(vcpu, X2APIC_MSR(APIC_EOI), MSR_TYPE_W, intercept);
        svm_set_intercept_for_msr(vcpu, X2APIC_MSR(APIC_SELF_IPI), MSR_TYPE_W, intercept);


Oh!  Actually, even better!  This is a great opportunity to dedup Intel vs. AMD
(and we can/should do the same for writes).

diff --git arch/x86/kvm/lapic.c arch/x86/kvm/lapic.c
index 4078e624ca66..ac57d6dbe032 100644
--- arch/x86/kvm/lapic.c
+++ arch/x86/kvm/lapic.c
@@ -1730,7 +1730,7 @@ static inline struct kvm_lapic *to_lapic(struct kvm_io_device *dev)
 #define APIC_REGS_MASK(first, count) \
        (APIC_REG_MASK(first) * ((1ull << (count)) - 1))
 
-u64 kvm_lapic_readable_reg_mask(struct kvm_lapic *apic)
+static u64 kvm_lapic_readable_reg_mask(struct kvm_lapic *apic)
 {
        /* Leave bits '0' for reserved and write-only registers. */
        u64 valid_reg_mask =
@@ -1766,7 +1766,22 @@ u64 kvm_lapic_readable_reg_mask(struct kvm_lapic *apic)
 
        return valid_reg_mask;
 }
-EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_lapic_readable_reg_mask);
+
+u64 kvm_x2apic_disable_intercept_reg_mask(struct kvm_vcpu *vcpu)
+{
+       if (WARN_ON_ONCE(!lapic_in_kernel(vcpu)))
+               return 0;
+
+       /*
+        * TMMCT, a.k.a. the current APIC timer count, reads aren't accelerated
+        * by hardware (Intel or AMD), and handling the fault-like APIC-access
+        * VM-Exit is more expensive than handling a WRMSR VM-Exit (because the
+        * APIC-access path requires slow emulation of the code stream).
+        */
+       return kvm_lapic_readable_reg_mask(vcpu->arch.apic) &
+              ~APIC_REG_MASK(APIC_TMCCT);
+}
+EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_x2apic_disable_intercept_reg_mask);
 
 static int kvm_lapic_reg_read(struct kvm_lapic *apic, u32 offset, int len,
                              void *data)
diff --git arch/x86/kvm/lapic.h arch/x86/kvm/lapic.h
index 274885af4ebc..7b61167e1b5c 100644
--- arch/x86/kvm/lapic.h
+++ arch/x86/kvm/lapic.h
@@ -156,7 +156,7 @@ int kvm_hv_vapic_msr_read(struct kvm_vcpu *vcpu, u32 msr, u64 *data);
 int kvm_lapic_set_pv_eoi(struct kvm_vcpu *vcpu, u64 data, unsigned long len);
 void kvm_lapic_exit(void);
 
-u64 kvm_lapic_readable_reg_mask(struct kvm_lapic *apic);
+u64 kvm_x2apic_disable_intercept_reg_mask(struct kvm_vcpu *vcpu);
 
 static inline void kvm_lapic_set_irr(int vec, struct kvm_lapic *apic)
 {
diff --git arch/x86/kvm/svm/avic.c arch/x86/kvm/svm/avic.c
index c5d46c0d2403..1532bfff5686 100644
--- arch/x86/kvm/svm/avic.c
+++ arch/x86/kvm/svm/avic.c
@@ -132,16 +132,13 @@ static void avic_set_x2apic_msr_interception(struct vcpu_svm *svm,
        if (!x2avic_enabled)
                return;
 
-       x2apic_readable_mask = kvm_lapic_readable_reg_mask(vcpu->arch.apic);
+       x2apic_readable_mask = kvm_x2apic_disable_intercept_reg_mask(vcpu);
 
        for_each_set_bit(i, (unsigned long *)&x2apic_readable_mask,
                         BITS_PER_TYPE(x2apic_readable_mask))
                svm_set_intercept_for_msr(vcpu, APIC_BASE_MSR + i,
                                          MSR_TYPE_R, intercept);
 
-       if (!intercept)
-               svm_enable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_TMCCT), MSR_TYPE_R);
-
        svm_set_intercept_for_msr(vcpu, X2APIC_MSR(APIC_TASKPRI), MSR_TYPE_W, intercept);
        svm_set_intercept_for_msr(vcpu, X2APIC_MSR(APIC_EOI), MSR_TYPE_W, intercept);
        svm_set_intercept_for_msr(vcpu, X2APIC_MSR(APIC_SELF_IPI), MSR_TYPE_W, intercept);
diff --git arch/x86/kvm/vmx/vmx.c arch/x86/kvm/vmx/vmx.c
index 859f4bc01445..40dff50e800a 100644
--- arch/x86/kvm/vmx/vmx.c
+++ arch/x86/kvm/vmx/vmx.c
@@ -4138,7 +4138,7 @@ static void vmx_update_msr_bitmap_x2apic(struct kvm_vcpu *vcpu)
         * mode, only the current timer count needs on-demand emulation by KVM.
         */
        if (mode & MSR_BITMAP_MODE_X2APIC_APICV)
-               msr_bitmap[read_idx] = ~kvm_lapic_readable_reg_mask(vcpu->arch.apic);
+               msr_bitmap[read_idx] = ~kvm_x2apic_disable_intercept_reg_mask(vcpu);
        else
                msr_bitmap[read_idx] = ~0ull;
        msr_bitmap[write_idx] = ~0ull;
@@ -4151,7 +4151,6 @@ static void vmx_update_msr_bitmap_x2apic(struct kvm_vcpu *vcpu)
                                  !(mode & MSR_BITMAP_MODE_X2APIC));
 
        if (mode & MSR_BITMAP_MODE_X2APIC_APICV) {
-               vmx_enable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_TMCCT), MSR_TYPE_RW);
                vmx_disable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_EOI), MSR_TYPE_W);
                vmx_disable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_SELF_IPI), MSR_TYPE_W);
                if (enable_ipiv)

  reply	other threads:[~2026-05-07 15:44 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-06 18:47 [PATCH v2 0/5] KVM: SVM: Fix x2AVIC MSR interception issues Sean Christopherson
2026-05-06 18:47 ` [PATCH v2 1/5] KVM: SVM: Disable x2AVIC RDMSR interception for MSRs KVM actually supports Sean Christopherson
2026-05-07 13:56   ` Naveen N Rao
2026-05-07 14:27     ` Sean Christopherson
2026-05-08 16:35       ` Naveen N Rao
2026-05-06 18:47 ` [PATCH v2 2/5] KVM: SVM: Always intercept RDMSR for TMCCT (current APIC timer count) Sean Christopherson
2026-05-07 14:19   ` Naveen N Rao
2026-05-07 15:44     ` Sean Christopherson [this message]
2026-05-07 18:26       ` Sean Christopherson
2026-05-08 16:41         ` Naveen N Rao
2026-05-08 16:56           ` Sean Christopherson
2026-05-06 18:47 ` [PATCH v2 3/5] KVM: SVM: Only disable x2AVIC WRMSR interception for MSRs that are accelerated Sean Christopherson
2026-05-08 16:59   ` Naveen N Rao
2026-05-06 18:47 ` [PATCH v2 4/5] *** DO NOT MERGE *** KVM: x86: Hack in a stat to track guest-induced exits (for testing) Sean Christopherson
2026-05-08 17:14   ` Naveen N Rao
2026-05-08 17:49     ` Sean Christopherson
2026-05-09  5:08       ` Naveen N Rao
2026-05-06 18:47 ` [PATCH v2 5/5] *** DO NOT MERGE *** KVM: selftests: Add hacky test to verify x2APIC MSR interception Sean Christopherson
2026-05-09  5:10 ` [PATCH v2 0/5] KVM: SVM: Fix x2AVIC MSR interception issues Naveen N Rao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=afyzRvFqb19Kodej@google.com \
    --to=seanjc@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=naveen@kernel.org \
    --cc=pbonzini@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox