From: Sean Christopherson <seanjc@google.com>
To: Bernhard Kauer <bk@alpico.io>
Cc: kvm@vger.kernel.org, Chao Gao <chao.gao@intel.com>,
Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [PATCH v2] KVM: x86: Drop the kvm_has_noapic_vcpu optimization
Date: Mon, 9 Dec 2024 17:40:48 -0800 [thread overview]
Message-ID: <Z1ecILHBlpkiAThl@google.com> (raw)
In-Reply-To: <Z1eXyv2VVsFiw_0i@google.com>
+Paolo, I'm pretty sure he still doesn't subscribe to kvm@ :-)
On Mon, Dec 09, 2024, Sean Christopherson wrote:
> On Mon, Oct 21, 2024, Bernhard Kauer wrote:
> > It used a static key to avoid loading the lapic pointer from
> > the vcpu->arch structure. However, in the common case the load
> > is from a hot cacheline and the CPU should be able to perfectly
> > predict it. Thus there is no upside of this premature optimization.
> >
> > The downside is that code patching including an IPI to all CPUs
> > is required whenever the first VM without an lapic is created or
> > the last is destroyed.
> >
> > Signed-off-by: Bernhard Kauer <bk@alpico.io>
> > ---
> >
> > V1->V2: remove spillover from other patch and fix style
> >
> > arch/x86/kvm/lapic.c | 10 ++--------
> > arch/x86/kvm/lapic.h | 6 +-----
> > arch/x86/kvm/x86.c | 6 ------
> > 3 files changed, 3 insertions(+), 19 deletions(-)
> >
> > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > index 2098dc689088..287a43fae041 100644
> > --- a/arch/x86/kvm/lapic.c
> > +++ b/arch/x86/kvm/lapic.c
> > @@ -135,8 +135,6 @@ static inline int __apic_test_and_clear_vector(int vec, void *bitmap)
> > return __test_and_clear_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
> > }
> >
> > -__read_mostly DEFINE_STATIC_KEY_FALSE(kvm_has_noapic_vcpu);
> > -EXPORT_SYMBOL_GPL(kvm_has_noapic_vcpu);
> >
> > __read_mostly DEFINE_STATIC_KEY_DEFERRED_FALSE(apic_hw_disabled, HZ);
> > __read_mostly DEFINE_STATIC_KEY_DEFERRED_FALSE(apic_sw_disabled, HZ);
>
> I'm on the fence, slightly leaning towards removing all three of these static keys.
>
> If we remove kvm_has_noapic_vcpu to avoid the text patching, then we should
> definitely drop apic_sw_disabled, as vCPUs are practically guaranteed to toggle
> the S/W enable bit, e.g. it starts out '0' at RESET. And if we drop apic_sw_disabled,
> then keeping apic_hw_disabled seems rather pointless.
>
> Removing all three keys is measurable, but the impact is so tiny that I have a
> hard time believing anyone would notice in practice.
>
> To measure, I tweaked KVM to handle CPUID exits in the fastpath and then ran the
> KVM-Unit-Test CPUID microbenchmark (with some minor modifications). Handling
> CPUID in the fastpath makes the kvm_lapic_enabled() call in the innermost run loop
> stick out (that helpers checks all three keys/conditions).
>
> for (;;) {
> /*
> * Assert that vCPU vs. VM APICv state is consistent. An APICv
> * update must kick and wait for all vCPUs before toggling the
> * per-VM state, and responding vCPUs must wait for the update
> * to complete before servicing KVM_REQ_APICV_UPDATE.
> */
> WARN_ON_ONCE((kvm_vcpu_apicv_activated(vcpu) != kvm_vcpu_apicv_active(vcpu)) &&
> (kvm_get_apic_mode(vcpu) != LAPIC_MODE_DISABLED));
>
> exit_fastpath = kvm_x86_call(vcpu_run)(vcpu,
> req_immediate_exit);
> if (likely(exit_fastpath != EXIT_FASTPATH_REENTER_GUEST))
> break;
>
> if (kvm_lapic_enabled(vcpu))
> kvm_x86_call(sync_pir_to_irr)(vcpu);
>
> if (unlikely(kvm_vcpu_exit_request(vcpu))) {
> exit_fastpath = EXIT_FASTPATH_EXIT_HANDLED;
> break;
> }
>
> /* Note, VM-Exits that go down the "slow" path are accounted below. */
> ++vcpu->stat.exits;
> }
>
> With a single vCPU pinned to a single pCPU, the average latency for a CPUID exit
> goes from 1018 => 1027 cycles, plus or minus a few. With 8 vCPUs, no pinning
> (mostly laziness), the average latency goes from 1034 => 1053.
>
> Other flows that check multiple vCPUs, e.g. kvm_irq_delivery_to_apic(), might be
> more affected? The optimized APIC map should help for common cases, but KVM does
> still check if APICs are enabled multiple times when delivering interrupts. And
> that's really my only hesitation: there are checks *everywhere* in KVM.
>
> On the other hand, we lose gobs and gobs of cycles with far less thought. E.g.
> with mitigations on, the latency for a single vCPU jumps all the way to 1600+ cycles.
>
> And while the diff stats are quite nice, the relevant code is low maintenance.
>
> arch/x86/kvm/lapic.c | 41 ++---------------------------------------
> arch/x86/kvm/lapic.h | 19 +++----------------
> arch/x86/kvm/x86.c | 4 +---
> 3 files changed, 6 insertions(+), 58 deletions(-)
>
> Paolo or anyone else... thoughts?
next prev parent reply other threads:[~2024-12-10 1:40 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-21 10:22 [PATCH v2] KVM: x86: Drop the kvm_has_noapic_vcpu optimization Bernhard Kauer
2024-12-10 1:22 ` Sean Christopherson
2024-12-10 1:40 ` Sean Christopherson [this message]
2024-12-10 8:16 ` Bernhard Kauer
2024-12-11 17:16 ` Sean Christopherson
2024-12-12 10:19 ` Bernhard Kauer
2024-12-12 15:16 ` Sean Christopherson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z1ecILHBlpkiAThl@google.com \
--to=seanjc@google.com \
--cc=bk@alpico.io \
--cc=chao.gao@intel.com \
--cc=kvm@vger.kernel.org \
--cc=pbonzini@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox