public inbox for kvm@vger.kernel.org
From: Sean Christopherson <seanjc@google.com>
To: David Woodhouse <dwmw2@infradead.org>
Cc: kvm@vger.kernel.org, Ben Gardon <bgardon@google.com>,
	Bartosz Szczepanek <bsz@amazon.de>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [RFC] KVM: x86: Don't wipe TDP MMU when guest sets %cr4
Date: Tue, 10 Oct 2023 16:25:21 -0700
Message-ID: <ZSXdYcMUds-DrHAd@google.com>
In-Reply-To: <b46ee4de968733a69117458e9f8f9d2a6682376f.camel@infradead.org>

On Tue, Oct 10, 2023, David Woodhouse wrote:
> If I understand things correctly, the point of the TDP MMU is to use
> page tables such as EPT for GPA → HPA translations, but let the
> virtualization support in the CPU handle all of the *virtual*
> addressing and page tables, including the non-root mode %cr3/%cr4.
> 
> I have a guest which loves to flip the SMEP bit on and off in %cr4 all
> the time. The guest is actually Xen, in its 'PV shim' mode which
> enables it to support a single PV guest, while running in a true
> hardware virtual machine:
> https://lists.xenproject.org/archives/html/xen-devel/2018-01/msg00497.html
> 
> The performance is *awful*, since as far as I can tell, on every flip
> KVM flushes the entire EPT. I understand why that might be necessary
> for the mode where KVM is building up a set of shadow page tables to
> directly map GVA → HPA and be loaded into %cr3 of a CPU that doesn't
> support native EPT translations. But I don't understand why the TDP MMU
> would need to do it. Surely we don't have to change anything in the EPT
> just because the stuff in the non-root-mode %cr3/%cr4 changes?
> 
> So I tried this, and it went faster and nothing appears to have blown
> up.
> 
> Am I missing something? Is this stupidly wrong?

Heh, you're in luck, because regardless of what your darn pronoun "this" refers
to, the answer is yes, "this" is stupidly wrong.

The below is stupidly wrong.  KVM needs to at least reconfigure the guest's paging
metadata that is used to translate GVAs to GPAs during emulation.
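
As a rough illustration of why that metadata matters, here is a minimal sketch
of a software permission check that depends on a cached copy of CR4.SMEP.  The
names (toy_guest_paging, toy_refresh_paging, toy_fetch_allowed) are
hypothetical and are not KVM's; only the SMEP rule itself (a supervisor-mode
instruction fetch from a user-accessible page faults when CR4.SMEP=1) and the
bit position come from the architecture.

```c
#include <stdbool.h>

#define TOY_CR4_SMEP (1UL << 20)	/* CR4.SMEP is bit 20 */

/* Hypothetical, heavily simplified stand-in for cached guest paging state. */
struct toy_guest_paging {
	bool cr4_smep;	/* snapshot of the guest's CR4.SMEP */
};

/* Re-derive the cached state after the guest writes CR4. */
static void toy_refresh_paging(struct toy_guest_paging *p, unsigned long cr4)
{
	p->cr4_smep = (cr4 & TOY_CR4_SMEP) != 0;
}

/*
 * Would an instruction fetch be permitted by SMEP?  Only the SMEP rule is
 * modeled here; NX, present bits, etc. are ignored.
 */
static bool toy_fetch_allowed(const struct toy_guest_paging *p,
			      bool page_is_user, bool cpl_is_user)
{
	if (!cpl_is_user && page_is_user && p->cr4_smep)
		return false;	/* supervisor fetch from a user page */
	return true;
}
```

If the snapshot is not refreshed when the guest flips SMEP, an emulated
supervisor fetch from a user page gets the wrong answer in one direction or
the other, which is the kind of breakage skipping the post-CR4 update invites.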

But the TDP MMU behavior *was* also stupidly wrong.  The reason two vCPUs suck
less is that KVM would zap SPTEs (EPT roots) if and only if *both* vCPUs
unloaded their roots at the same time.
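
A toy model of that old behavior, assuming nothing more than a refcounted root
that is torn down when the last user drops it.  The struct and function names
are hypothetical and this is not the KVM code, just a sketch of the effect
described above.

```c
/* Hypothetical model of a refcounted TDP MMU root. */
struct toy_root {
	int refcount;	/* number of vCPUs currently using this root */
	int zaps;	/* how many times the whole root was torn down */
};

/* A vCPU starts using the root. */
static void toy_load_root(struct toy_root *root)
{
	root->refcount++;
}

/* A vCPU drops the root; the last user to leave triggers a full zap. */
static void toy_unload_root(struct toy_root *root)
{
	if (--root->refcount == 0)
		root->zaps++;
}
```

With a single vCPU, every load/unload cycle drops the count to zero and zaps
everything.  With a second, idle vCPU still holding a reference, the same
cycle on the busy vCPU never reaches zero, so nothing is zapped until both
vCPUs let go.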

Commit edbdb43fc96b ("KVM: x86: Preserve TDP MMU roots until they are explicitly
invalidated") should fix the behavior you're seeing.

And if we want to try and make SMEP blazing fast on Intel, we can probably let
the guest write it directly, i.e. give SMEP the same treatment as CR0.WP.  See
commits cf9f4c0eb169 ("KVM: x86/mmu: Refresh CR0.WP prior to checking for emulated
permission faults") and fb509f76acc8 ("KVM: VMX: Make CR0.WP a guest owned bit").
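
For reference, a minimal sketch of what "guest owned" buys on VMX, based on
the architectural guest/host mask rule: a MOV to CR4 exits only if the new
value differs from the read shadow in a bit the host owns.  The helper name
and parameters are hypothetical; whether making SMEP guest-owned is actually
safe for KVM is exactly the open question, this only shows the exit condition.

```c
#include <stdbool.h>

#define TOY_CR4_SMEP (1UL << 20)	/* CR4.SMEP is bit 20 */

/*
 * Hypothetical helper: would a guest write of new_val to CR4 cause a VM exit?
 * host_owned_mask has a 1 for every bit the host intercepts.
 */
static bool toy_cr4_write_exits(unsigned long host_owned_mask,
				unsigned long read_shadow,
				unsigned long new_val)
{
	return ((new_val ^ read_shadow) & host_owned_mask) != 0;
}
```

With SMEP in the host-owned mask, every flip differs from the read shadow and
exits.  With SMEP cleared from the mask, the same write completes in the guest
without an exit.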

Oh, and if your userspace is doing something silly like constantly creating and
deleting memslots, see commit 0df9dab891ff ("KVM: x86/mmu: Stop zapping invalidated
TDP MMU roots asynchronously").

> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1072,7 +1074,8 @@ int kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
>         if (kvm_x86_ops.set_cr4(vcpu, cr4))
>                 return 1;
>  
> -       kvm_post_set_cr4(vcpu, old_cr4, cr4);
> +       if (!vcpu->kvm->arch.tdp_mmu_enabled)
> +               kvm_post_set_cr4(vcpu, old_cr4, cr4);
>  
>         if ((cr4 ^ old_cr4) & (X86_CR4_OSXSAVE | X86_CR4_PKE))
>                 kvm_update_cpuid_runtime(vcpu);
> 
> 
> Also... if I have *two* vCPUs it doesn't go quite as slowly while Xen
> starts Grub and then Grub boots a Linux kernel. Until the kernel brings
> up its second vCPU and *then* it starts going really slowly again. Is
> that because the TDP roots are refcounted, and that idle vCPU holds
> onto the unused one and prevents it from being completely thrown away?
> Until the vCPU stops being idle and starts flipping SMEP on/off on
> Linux←→Xen transitions too?
> 
> In practice, there's not a lot of point in Xen using SMEP when it's
> purely acting as a library for its *one* guest, living in an HVM
> container. The above patch speeds things up but telling Xen not to use
> SMEP at all makes things go *much* faster. But if I'm not being
> *entirely* stupid, there may be some generic improvements for
> KVM+TDPMMU here somewhere so it's worth making a fool of myself by
> asking...?
