From: Sean Christopherson <seanjc@google.com>
To: Bernhard Kauer <bk@alpico.io>
Cc: kvm@vger.kernel.org, Chao Gao <chao.gao@intel.com>,
Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [PATCH v2] KVM: x86: Drop the kvm_has_noapic_vcpu optimization
Date: Mon, 9 Dec 2024 17:40:48 -0800 [thread overview]
Message-ID: <Z1ecILHBlpkiAThl@google.com> (raw)
In-Reply-To: <Z1eXyv2VVsFiw_0i@google.com>
+Paolo, I'm pretty sure he still doesn't subscribe to kvm@ :-)
On Mon, Dec 09, 2024, Sean Christopherson wrote:
> On Mon, Oct 21, 2024, Bernhard Kauer wrote:
> > It used a static key to avoid loading the lapic pointer from
> > the vcpu->arch structure. However, in the common case the load
> > is from a hot cacheline and the CPU should be able to perfectly
> > predict it. Thus there is no upside of this premature optimization.
> >
> > The downside is that code patching including an IPI to all CPUs
> > is required whenever the first VM without an lapic is created or
> > the last is destroyed.
> >
> > Signed-off-by: Bernhard Kauer <bk@alpico.io>
> > ---
> >
> > V1->V2: remove spillover from other patch and fix style
> >
> > arch/x86/kvm/lapic.c | 10 ++--------
> > arch/x86/kvm/lapic.h | 6 +-----
> > arch/x86/kvm/x86.c | 6 ------
> > 3 files changed, 3 insertions(+), 19 deletions(-)
> >
> > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > index 2098dc689088..287a43fae041 100644
> > --- a/arch/x86/kvm/lapic.c
> > +++ b/arch/x86/kvm/lapic.c
> > @@ -135,8 +135,6 @@ static inline int __apic_test_and_clear_vector(int vec, void *bitmap)
> > return __test_and_clear_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
> > }
> >
> > -__read_mostly DEFINE_STATIC_KEY_FALSE(kvm_has_noapic_vcpu);
> > -EXPORT_SYMBOL_GPL(kvm_has_noapic_vcpu);
> >
> > __read_mostly DEFINE_STATIC_KEY_DEFERRED_FALSE(apic_hw_disabled, HZ);
> > __read_mostly DEFINE_STATIC_KEY_DEFERRED_FALSE(apic_sw_disabled, HZ);
>
> I'm on the fence, slightly leaning towards removing all three of these static keys.
>
> If we remove kvm_has_noapic_vcpu to avoid the text patching, then we should
> definitely drop apic_sw_disabled, as vCPUs are practically guaranteed to toggle
> the S/W enable bit, e.g. it starts out '0' at RESET. And if we drop apic_sw_disabled,
> then keeping apic_hw_disabled seems rather pointless.
>
> Removing all three keys is measurable, but the impact is so tiny that I have a
> hard time believing anyone would notice in practice.
>
> To measure, I tweaked KVM to handle CPUID exits in the fastpath and then ran the
> KVM-Unit-Test CPUID microbenchmark (with some minor modifications). Handling
> CPUID in the fastpath makes the kvm_lapic_enabled() call in the innermost run loop
> stick out (that helpers checks all three keys/conditions).
>
> for (;;) {
> /*
> * Assert that vCPU vs. VM APICv state is consistent. An APICv
> * update must kick and wait for all vCPUs before toggling the
> * per-VM state, and responding vCPUs must wait for the update
> * to complete before servicing KVM_REQ_APICV_UPDATE.
> */
> WARN_ON_ONCE((kvm_vcpu_apicv_activated(vcpu) != kvm_vcpu_apicv_active(vcpu)) &&
> (kvm_get_apic_mode(vcpu) != LAPIC_MODE_DISABLED));
>
> exit_fastpath = kvm_x86_call(vcpu_run)(vcpu,
> req_immediate_exit);
> if (likely(exit_fastpath != EXIT_FASTPATH_REENTER_GUEST))
> break;
>
> if (kvm_lapic_enabled(vcpu))
> kvm_x86_call(sync_pir_to_irr)(vcpu);
>
> if (unlikely(kvm_vcpu_exit_request(vcpu))) {
> exit_fastpath = EXIT_FASTPATH_EXIT_HANDLED;
> break;
> }
>
> /* Note, VM-Exits that go down the "slow" path are accounted below. */
> ++vcpu->stat.exits;
> }
>
> With a single vCPU pinned to a single pCPU, the average latency for a CPUID exit
> goes from 1018 => 1027 cycles, plus or minus a few. With 8 vCPUs, no pinning
> (mostly laziness), the average latency goes from 1034 => 1053.
>
> Other flows that check multiple vCPUs, e.g. kvm_irq_delivery_to_apic(), might be
> more affected? The optimized APIC map should help for common cases, but KVM does
> still check if APICs are enabled multiple times when delivering interrupts. And
> that's really my only hesitation: there are checks *everywhere* in KVM.
>
> On the other hand, we lose gobs and gobs of cycles with far less thought. E.g.
> with mitigations on, the latency for a single vCPU jumps all the way to 1600+ cycles.
>
> And while the diff stats are quite nice, the relevant code is low maintenance.
>
> arch/x86/kvm/lapic.c | 41 ++---------------------------------------
> arch/x86/kvm/lapic.h | 19 +++----------------
> arch/x86/kvm/x86.c | 4 +---
> 3 files changed, 6 insertions(+), 58 deletions(-)
>
> Paolo or anyone else... thoughts?
next prev parent reply other threads:[~2024-12-10 1:40 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-21 10:22 [PATCH v2] KVM: x86: Drop the kvm_has_noapic_vcpu optimization Bernhard Kauer
2024-12-10 1:22 ` Sean Christopherson
2024-12-10 1:40 ` Sean Christopherson [this message]
2024-12-10 8:16 ` Bernhard Kauer
2024-12-11 17:16 ` Sean Christopherson
2024-12-12 10:19 ` Bernhard Kauer
2024-12-12 15:16 ` Sean Christopherson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z1ecILHBlpkiAThl@google.com \
--to=seanjc@google.com \
--cc=bk@alpico.io \
--cc=chao.gao@intel.com \
--cc=kvm@vger.kernel.org \
--cc=pbonzini@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.