From: Mingwei Zhang <mizhang@google.com>
To: "Mi, Dapeng" <dapeng1.mi@linux.intel.com>
Cc: Jim Mattson <jmattson@google.com>,
	kvm@vger.kernel.org, Sean Christopherson <seanjc@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>, Like Xu <likexu@tencent.com>,
	Roman Kagan <rkagan@amazon.de>, Kan Liang <kan.liang@intel.com>,
	Dapeng1 Mi <dapeng1.mi@intel.com>
Subject: Re: [PATCH 1/2] KVM: x86: Synthesize at most one PMI per VM-exit
Date: Wed, 6 Sep 2023 20:54:43 +0000	[thread overview]
Message-ID: <ZPjnExoXOsqlfagD@google.com> (raw)
In-Reply-To: <8b964afb-4b8e-b8fb-542c-c76487340705@linux.intel.com>

On Wed, Sep 06, 2023, Mi, Dapeng wrote:
> On 9/2/2023 2:56 AM, Jim Mattson wrote:
> > When the irq_work callback, kvm_pmi_trigger_fn(), is invoked during a
> > VM-exit that also invokes __kvm_perf_overflow() as a result of
> > instruction emulation, kvm_pmu_deliver_pmi() will be called twice
> > before the next VM-entry.
> 
> 
> Do we have a way to reproduce this spurious NMI error constantly?
> 
I think I shared this with you in another thread. I also shared the event
list and command here:

https://lore.kernel.org/all/ZKCD30QrE5g9XGIh@google.com/

To observe the spurious NMIs, continuously watch the NMI/PMI counts in
/proc/interrupts and check whether they reach a huge number within 2
minutes while running the above command in an 8-vcpu QEMU VM on an Intel
SPR machine. "Huge" here means more than 10K.
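If it helps, the check can be scripted. This is a rough sketch (not from the thread): it sums the per-CPU NMI counts from /proc/interrupts text; sample the real file twice, two minutes apart, and diff the totals. The excerpt below is a made-up example of the file's layout.

```python
# Sum the per-CPU NMI counts from /proc/interrupts-style output.
# A captured excerpt is parsed here for illustration; on a real host,
# read /proc/interrupts directly and diff two samples over time.

def nmi_count(interrupts_text):
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "NMI:":
            # Per-CPU counts follow the "NMI:" label; the trailing
            # description ("Non-maskable interrupts") is not numeric.
            return sum(int(f) for f in fields[1:] if f.isdigit())
    return 0

sample = """\
           CPU0       CPU1
  NMI:       12         34   Non-maskable interrupts
  PMI:       12         34   Performance monitoring interrupts
"""
print(nmi_count(sample))  # -> 46
```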

In addition, you may observe the following warning from kernel dmesg:

[3939579.462832] Uhhuh. NMI received for unknown reason 30 on CPU 43.
[3939579.462836] Dazed and confused, but trying to continue

> 
> > 
> > That shouldn't be a problem. The local APIC is supposed to
> > automatically set the mask flag in LVTPC when it handles a PMI, so the
> > second PMI should be inhibited. However, KVM's local APIC emulation
> > fails to set the mask flag in LVTPC when it handles a PMI, so two PMIs
> > are delivered via the local APIC. In the common case, where LVTPC is
> > configured to deliver an NMI, the first NMI is vectored through the
> > guest IDT, and the second one is held pending. When the NMI handler
> > returns, the second NMI is vectored through the IDT. For Linux guests,
> > this results in the "dazed and confused" spurious NMI message.
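To make the masking behavior described above concrete, here is a toy model (an illustration only, not KVM or LAPIC code; only the mask bit position, bit 16 of the LVT entry, is taken from the hardware layout):

```python
# Toy model of LVTPC masking on PMI delivery. Real hardware sets the
# mask flag in LVTPC when it delivers a PMI, inhibiting a second PMI
# until the handler unmasks it; the KVM bug is that emulation skipped
# this step. Illustration only -- not the actual implementation.
APIC_LVT_MASKED = 1 << 16

class ToyApic:
    def __init__(self):
        self.lvtpc = 0          # LVTPC register: NMI delivery, unmasked
        self.delivered = 0      # PMIs actually vectored to the guest

    def handle_pmi(self, buggy=False):
        if self.lvtpc & APIC_LVT_MASKED:
            return              # a masked LVTPC inhibits delivery
        self.delivered += 1
        if not buggy:
            self.lvtpc |= APIC_LVT_MASKED  # hardware sets the mask flag

# Correct LAPIC behavior: two back-to-back PMIs yield one delivery.
apic = ToyApic()
apic.handle_pmi()
apic.handle_pmi()
assert apic.delivered == 1

# The bug: the mask flag is never set, so both PMIs are delivered.
buggy = ToyApic()
buggy.handle_pmi(buggy=True)
buggy.handle_pmi(buggy=True)
assert buggy.delivered == 2
```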
> > 
> > Though the obvious fix is to set the mask flag in LVTPC when handling
> > a PMI, KVM's logic around synthesizing a PMI is unnecessarily
> > convoluted.
> > 
> > Remove the irq_work callback for synthesizing a PMI, and all of the
> > logic for invoking it. Instead, to prevent a vcpu from leaving C0 with
> > a PMI pending, add a check for KVM_REQ_PMI to kvm_vcpu_has_events().
> > 
> > Fixes: 9cd803d496e7 ("KVM: x86: Update vPMCs when retiring instructions")
> > Signed-off-by: Jim Mattson <jmattson@google.com>
> > ---
> >   arch/x86/include/asm/kvm_host.h |  1 -
> >   arch/x86/kvm/pmu.c              | 27 +--------------------------
> >   arch/x86/kvm/x86.c              |  3 +++
> >   3 files changed, 4 insertions(+), 27 deletions(-)
> > 
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index 3bc146dfd38d..f6b9e3ae08bf 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -528,7 +528,6 @@ struct kvm_pmu {
> >   	u64 raw_event_mask;
> >   	struct kvm_pmc gp_counters[KVM_INTEL_PMC_MAX_GENERIC];
> >   	struct kvm_pmc fixed_counters[KVM_PMC_MAX_FIXED];
> > -	struct irq_work irq_work;
> >   	/*
> >   	 * Overlay the bitmap with a 64-bit atomic so that all bits can be
> > diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> > index bf653df86112..0c117cd24077 100644
> > --- a/arch/x86/kvm/pmu.c
> > +++ b/arch/x86/kvm/pmu.c
> > @@ -93,14 +93,6 @@ void kvm_pmu_ops_update(const struct kvm_pmu_ops *pmu_ops)
> >   #undef __KVM_X86_PMU_OP
> >   }
> > -static void kvm_pmi_trigger_fn(struct irq_work *irq_work)
> > -{
> > -	struct kvm_pmu *pmu = container_of(irq_work, struct kvm_pmu, irq_work);
> > -	struct kvm_vcpu *vcpu = pmu_to_vcpu(pmu);
> > -
> > -	kvm_pmu_deliver_pmi(vcpu);
> > -}
> > -
> >   static inline void __kvm_perf_overflow(struct kvm_pmc *pmc, bool in_pmi)
> >   {
> >   	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
> > @@ -124,20 +116,7 @@ static inline void __kvm_perf_overflow(struct kvm_pmc *pmc, bool in_pmi)
> >   		__set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
> >   	}
> > -	if (!pmc->intr || skip_pmi)
> > -		return;
> > -
> > -	/*
> > -	 * Inject PMI. If vcpu was in a guest mode during NMI PMI
> > -	 * can be ejected on a guest mode re-entry. Otherwise we can't
> > -	 * be sure that vcpu wasn't executing hlt instruction at the
> > -	 * time of vmexit and is not going to re-enter guest mode until
> > -	 * woken up. So we should wake it, but this is impossible from
> > -	 * NMI context. Do it from irq work instead.
> > -	 */
> > -	if (in_pmi && !kvm_handling_nmi_from_guest(pmc->vcpu))
> > -		irq_work_queue(&pmc_to_pmu(pmc)->irq_work);
> > -	else
> > +	if (pmc->intr && !skip_pmi)
> >   		kvm_make_request(KVM_REQ_PMI, pmc->vcpu);
> >   }
> > @@ -677,9 +656,6 @@ void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
> >   void kvm_pmu_reset(struct kvm_vcpu *vcpu)
> >   {
> > -	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
> > -
> > -	irq_work_sync(&pmu->irq_work);
> >   	static_call(kvm_x86_pmu_reset)(vcpu);
> >   }
> > @@ -689,7 +665,6 @@ void kvm_pmu_init(struct kvm_vcpu *vcpu)
> >   	memset(pmu, 0, sizeof(*pmu));
> >   	static_call(kvm_x86_pmu_init)(vcpu);
> > -	init_irq_work(&pmu->irq_work, kvm_pmi_trigger_fn);
> >   	pmu->event_count = 0;
> >   	pmu->need_cleanup = false;
> >   	kvm_pmu_refresh(vcpu);
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index c381770bcbf1..0732c09fbd2d 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -12841,6 +12841,9 @@ static inline bool kvm_vcpu_has_events(struct kvm_vcpu *vcpu)
> >   		return true;
> >   #endif
> > +	if (kvm_test_request(KVM_REQ_PMI, vcpu))
> > +		return true;
> > +
> >   	if (kvm_arch_interrupt_allowed(vcpu) &&
> >   	    (kvm_cpu_has_interrupt(vcpu) ||
> >   	    kvm_guest_apic_has_interrupt(vcpu)))
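One note on the hunk above: kvm_test_request() only peeks at the pending bit, whereas kvm_check_request() tests and clears it, so kvm_vcpu_has_events() can report a pending PMI without consuming the request. A toy sketch of that distinction (illustration only; the real kernel uses atomic bitops on vcpu->requests, and the bit value here is arbitrary):

```python
# Toy model of KVM's vcpu request bits as used in the hunk above.
# kvm_test_request() peeks; kvm_check_request() tests and clears.
KVM_REQ_PMI = 1 << 11  # arbitrary bit chosen for this sketch

class ToyVcpu:
    def __init__(self):
        self.requests = 0

def kvm_make_request(req, vcpu):
    vcpu.requests |= req

def kvm_test_request(req, vcpu):
    return bool(vcpu.requests & req)

def kvm_check_request(req, vcpu):
    pending = bool(vcpu.requests & req)
    vcpu.requests &= ~req
    return pending

vcpu = ToyVcpu()
kvm_make_request(KVM_REQ_PMI, vcpu)

# has_events() may be polled repeatedly, so it must not consume the bit:
assert kvm_test_request(KVM_REQ_PMI, vcpu)
assert kvm_test_request(KVM_REQ_PMI, vcpu)   # still pending

# The entry path consumes it exactly once when it injects the PMI:
assert kvm_check_request(KVM_REQ_PMI, vcpu)
assert not kvm_test_request(KVM_REQ_PMI, vcpu)
```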
