public inbox for kvm@vger.kernel.org
* [RFC] KVM: x86: Don't wipe TDP MMU when guest sets %cr4
@ 2023-10-10 21:36 David Woodhouse
  2023-10-10 22:27 ` Jim Mattson
  2023-10-10 23:25 ` Sean Christopherson
  0 siblings, 2 replies; 7+ messages in thread
From: David Woodhouse @ 2023-10-10 21:36 UTC
  To: kvm, Ben Gardon, Szczepanek, Bartosz
  Cc: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin

If I understand things correctly, the point of the TDP MMU is to use
page tables such as EPT for GPA → HPA translations, but let the
virtualization support in the CPU handle all of the *virtual*
addressing and page tables, including the non-root mode %cr3/%cr4.
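To make that split concrete, here is a toy model of the two translation stages as I understand them. This is only an illustrative sketch, not KVM code: every name in it (toy_vm, toy_guest_pte, toy_ept_translate, toy_supervisor_fetch, TOY_CR4_SMEP) is made up, and real page tables are multi-level rather than flat arrays. The point it shows is that the SMEP check consumes the guest's own PTE and %cr4, while the GPA → HPA lookup takes no %cr4 input at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGES 4u
#define TOY_CR4_SMEP (1u << 20) /* same bit position as CR4.SMEP */

/* Stage 1: guest-owned page table entry, GVA page -> GPA page. */
struct toy_guest_pte {
    uint32_t gpa_page;
    bool user; /* U/S bit: page is accessible from user mode */
};

struct toy_vm {
    struct toy_guest_pte guest_pt[TOY_PAGES]; /* owned by the guest */
    uint32_t ept[TOY_PAGES];                  /* owned by the hypervisor */
    uint32_t cr4;                             /* guest (non-root) %cr4 */
};

/* Stage 2: GPA page -> HPA page. Note that CR4 is never consulted. */
static uint32_t toy_ept_translate(const struct toy_vm *vm, uint32_t gpa_page)
{
    assert(gpa_page < TOY_PAGES);
    return vm->ept[gpa_page];
}

/*
 * Supervisor-mode instruction fetch from a guest virtual page.
 * Returns false where the guest would take a page fault (SMEP violation),
 * otherwise stores the host physical page in *hpa_page and returns true.
 */
static bool toy_supervisor_fetch(const struct toy_vm *vm, uint32_t gva_page,
                                 uint32_t *hpa_page)
{
    const struct toy_guest_pte *pte;

    assert(gva_page < TOY_PAGES);
    pte = &vm->guest_pt[gva_page];

    /* SMEP: supervisor code may not execute from user-accessible pages. */
    if ((vm->cr4 & TOY_CR4_SMEP) && pte->user)
        return false;

    *hpa_page = toy_ept_translate(vm, pte->gpa_page);
    return true;
}
```

In this model, toggling TOY_CR4_SMEP changes whether toy_supervisor_fetch() faults on a user page, but the contents of ept[] and the result of toy_ept_translate() stay the same.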

I have a guest which loves to flip the SMEP bit on and off in %cr4 all
the time. The guest is actually Xen, in its 'PV shim' mode which
enables it to support a single PV guest, while running in a true
hardware virtual machine:
https://lists.xenproject.org/archives/html/xen-devel/2018-01/msg00497.html

The performance is *awful*, since as far as I can tell, on every flip
KVM flushes the entire EPT. I understand why that might be necessary
for the mode where KVM is building up a set of shadow page tables to
directly map GVA → HPA and be loaded into %cr3 of a CPU that doesn't
support native EPT translations. But I don't understand why the TDP MMU
would need to do it. Surely we don't have to change anything in the EPT
just because the stuff in the non-root-mode %cr3/%cr4 changes?

So I tried this, and it went faster and nothing appears to have blown
up.

Am I missing something? Is this stupidly wrong?

--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1072,7 +1074,8 @@ int kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
        if (kvm_x86_ops.set_cr4(vcpu, cr4))
                return 1;
 
-       kvm_post_set_cr4(vcpu, old_cr4, cr4);
+       if (!vcpu->kvm->arch.tdp_mmu_enabled)
+               kvm_post_set_cr4(vcpu, old_cr4, cr4);
 
        if ((cr4 ^ old_cr4) & (X86_CR4_OSXSAVE | X86_CR4_PKE))
                kvm_update_cpuid_runtime(vcpu);


Also, with *two* vCPUs it doesn't go quite as slowly while Xen starts
GRUB and GRUB boots a Linux kernel, until the kernel brings up its
second vCPU, at which point it starts going really slowly again. Is
that because the TDP roots are refcounted, and the idle vCPU holds onto
the otherwise unused root and prevents it from being completely thrown
away, until that vCPU stops being idle and starts flipping SMEP on/off
on Linux←→Xen transitions too?
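The hypothesis in that question can be sketched with a toy refcount. This models only my guess, not KVM's actual root handling, and the names (toy_root, toy_root_get, toy_root_put) are hypothetical. The idea is that a shared root is torn down only when the last vCPU drops its reference, so an idle vCPU that still holds one would keep the root alive.

```c
#include <assert.h>
#include <stdbool.h>

/* A shared paging root, referenced by every vCPU that has it loaded. */
struct toy_root {
    int refcount;
    bool freed; /* set when the whole paging structure would be torn down */
};

/* A vCPU starts using the root. */
static void toy_root_get(struct toy_root *root)
{
    assert(!root->freed);
    root->refcount++;
}

/*
 * A vCPU stops using the root.
 * Returns true if this dropped the last reference and the root was freed.
 */
static bool toy_root_put(struct toy_root *root)
{
    assert(root->refcount > 0);
    if (--root->refcount == 0)
        root->freed = true;
    return root->freed;
}
```

With two references taken, the first toy_root_put() returns false and the root survives. Only the second put frees it, which would match things getting slow again once the second vCPU starts dropping its reference too.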

In practice, there's not a lot of point in Xen using SMEP when it's
purely acting as a library for its *one* guest, living in an HVM
container. The above patch speeds things up but telling Xen not to use
SMEP at all makes things go *much* faster. But if I'm not being
*entirely* stupid, there may be some generic improvements for
KVM+TDPMMU here somewhere so it's worth making a fool of myself by
asking...?

Thread overview: 7+ messages
2023-10-10 21:36 [RFC] KVM: x86: Don't wipe TDP MMU when guest sets %cr4 David Woodhouse
2023-10-10 22:27 ` Jim Mattson
2023-10-10 23:25 ` Sean Christopherson
2023-10-11  8:20   ` David Woodhouse
2023-10-11 16:43     ` Jim Mattson
2023-10-11 17:06     ` Paolo Bonzini
2023-10-14  0:12     ` Sean Christopherson
