From: Avi Kivity <avi@redhat.com>
To: "Mao, Junjie" <junjie.mao@intel.com>
Cc: "'kvm@vger.kernel.org'" <kvm@vger.kernel.org>
Subject: Re: [PATCH] KVM: x86: Implement PCID/INVPCID for guests with EPT
Date: Thu, 10 May 2012 14:48:21 +0300
Message-ID: <4FABAB05.1080001@redhat.com>
In-Reply-To: <EF5A1D57CFBD5A4BA5EB3ED985B6DC6E064239@SHSMSX101.ccr.corp.intel.com>

On 05/10/2012 03:32 AM, Mao, Junjie wrote:
> This patch handles PCID/INVPCID for guests.
>
> Process-context identifiers (PCIDs) are a facility by which a logical processor may cache information for multiple linear-address spaces so that the processor may retain cached information when software switches to a different linear-address space. Refer to section 4.10.1 in IA32 Intel Software Developer's Manual Volume 3A for details.
>
> For guests with EPT, the PCID feature is enabled and INVPCID behaves as running natively.
> For guests without EPT, the PCID feature is disabled and INVPCID triggers #UD.
>
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 74c9edf..bb9a707 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -52,7 +52,7 @@
> #define CR4_RESERVED_BITS \
> (~(unsigned long)(X86_CR4_VME | X86_CR4_PVI | X86_CR4_TSD | X86_CR4_DE\
> | X86_CR4_PSE | X86_CR4_PAE | X86_CR4_MCE \
> - | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR \
> + | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
> | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_RDWRGSFS \
> | X86_CR4_OSXMMEXCPT | X86_CR4_VMXE))
We should hide CR4.PCIDE from nested VMX until that code is prepared to
handle it.
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index d2bd719..ba00789 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -413,6 +413,7 @@ struct vcpu_vmx {
> u32 exit_reason;
>
> bool rdtscp_enabled;
> + bool invpcid_enabled;
>
> /* Support for a guest hypervisor (nested VMX) */
> struct nested_vmx nested;
> @@ -839,6 +840,12 @@ static inline bool cpu_has_vmx_rdtscp(void)
> SECONDARY_EXEC_RDTSCP;
> }
>
> +static bool vmx_pcid_supported(void)
> +{
> + /* Enable PCID for non-ept guests may cause performance regression */
Why is that?
> + return enable_ept && (boot_cpu_data.x86_capability[4] & bit(X86_FEATURE_PCID));
> +}
> +
> /*
> * Swap MSR entry in host/guest MSR entry array.
> */
> @@ -4337,8 +4352,14 @@ static int handle_set_cr0(struct kvm_vcpu *vcpu, unsigned long val)
> return 1;
> vmcs_writel(CR0_READ_SHADOW, val);
> return 0;
> - } else
> + } else {
> + unsigned long old_cr0 = kvm_read_cr0(vcpu);
> + if ((old_cr0 & X86_CR0_PG) && !(val & X86_CR0_PG) &&
> + (kvm_read_cr4(vcpu) & X86_CR4_PCIDE))
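Use kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE) here; it's slightly faster than
kvm_read_cr4() because it only refreshes the cached CR4 when the requested
bits are guest-owned. Also move this check to x86.c.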
> + return 1;
> +
> return kvm_set_cr0(vcpu, val);
> + }
> }
>
> static int handle_set_cr4(struct kvm_vcpu *vcpu, unsigned long val)
> @@ -4349,8 +4370,26 @@ static int handle_set_cr4(struct kvm_vcpu *vcpu, unsigned long val)
> return 1;
> vmcs_writel(CR4_READ_SHADOW, val);
> return 0;
> - } else
> - return kvm_set_cr4(vcpu, val);
> + } else {
> + unsigned long old_cr4 = kvm_read_cr4(vcpu);
> + int ret = 1;
> +
> + if ((val & X86_CR4_PCIDE) && !(old_cr4 & X86_CR4_PCIDE)) {
> + if (!guest_cpuid_has_pcid(vcpu))
> + return ret;
> +
> + /* PCID can not be enabled when cr3[11:0]!=000H or EFER.LMA=0 */
> + if ((kvm_read_cr3(vcpu) & X86_CR3_PCID_MASK) || !is_long_mode(vcpu))
> + return ret;
> + }
> +
> + ret = kvm_set_cr4(vcpu, val);
> +
> + if (!ret && (!(val & X86_CR4_PCIDE) && (old_cr4 & X86_CR4_PCIDE)))
> + kvm_mmu_reset_context(vcpu);
> +
> + return ret;
> + }
Move to x86.c please.
> }
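To make the rules in the two hunks above easier to follow, here is a minimal
standalone sketch of the checks that would move to x86.c. It is a model, not
KVM code: struct guest_state and the three helper functions are hypothetical
names, and only the PCID-related conditions from the patch are modeled (the
other validity checks done by kvm_set_cr0()/kvm_set_cr4() are left out). The
bit positions are the architectural ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural bit positions; names mirror the kernel's. */
#define X86_CR0_PG        (1ULL << 31)
#define X86_CR4_PCIDE     (1ULL << 17)
#define X86_CR3_PCID_MASK 0xfffULL

/* Hypothetical snapshot of the guest state the checks consult. */
struct guest_state {
	uint64_t cr0, cr3, cr4;
	bool long_mode;       /* EFER.LMA */
	bool cpuid_has_pcid;  /* PCID exposed in the guest's CPUID */
};

/* True if the CR0 write must fault: clearing PG while CR4.PCIDE is set. */
bool cr0_write_faults(const struct guest_state *g, uint64_t new_cr0)
{
	bool clears_pg = (g->cr0 & X86_CR0_PG) && !(new_cr0 & X86_CR0_PG);

	return clears_pg && (g->cr4 & X86_CR4_PCIDE);
}

/* True if the CR4 write must fault: an illegal 0 -> 1 change of PCIDE. */
bool cr4_write_faults(const struct guest_state *g, uint64_t new_cr4)
{
	bool sets_pcide = (new_cr4 & X86_CR4_PCIDE) &&
			  !(g->cr4 & X86_CR4_PCIDE);

	if (!sets_pcide)
		return false;
	if (!g->cpuid_has_pcid)
		return true;
	/* PCID cannot be enabled when CR3[11:0] != 0 or EFER.LMA = 0. */
	return (g->cr3 & X86_CR3_PCID_MASK) || !g->long_mode;
}

/* True if the write is a 1 -> 0 change of PCIDE (the patch resets the MMU). */
bool cr4_write_needs_mmu_reset(const struct guest_state *g, uint64_t new_cr4)
{
	return (g->cr4 & X86_CR4_PCIDE) && !(new_cr4 & X86_CR4_PCIDE);
}

/* Usage example for the model above. */
void pcid_checks_example(void)
{
	struct guest_state g = {
		.cr0 = X86_CR0_PG, .cr3 = 0, .cr4 = 0,
		.long_mode = true, .cpuid_has_pcid = true,
	};

	assert(!cr4_write_faults(&g, X86_CR4_PCIDE)); /* legal enable */
	g.cr3 = 0x5;                                  /* non-zero PCID field */
	assert(cr4_write_faults(&g, X86_CR4_PCIDE));

	g.cr3 = 0;
	g.cr4 = X86_CR4_PCIDE;                        /* PCID now enabled */
	assert(cr0_write_faults(&g, 0));              /* cannot clear PG */
	assert(cr4_write_needs_mmu_reset(&g, 0));     /* disabling PCIDE */
}
```

Keeping the checks as pure predicates over the register values is one way to
make them independent of VMX, which is the point of moving them to x86.c.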
>
> /* called to set cr0 as approriate for clts instruction exit. */
> @@ -6420,6 +6459,23 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
> }
> }
> }
> +
> + vmx->invpcid_enabled = false;
> + if (vmx_pcid_supported()) {
> + exec_control = vmcs_read32(SECONDARY_VM_EXEC_CONTROL);
> + if (exec_control & SECONDARY_EXEC_ENABLE_INVPCID) {
> + best = kvm_find_cpuid_entry(vcpu, 0x1, 0);
> + if (best && (best->ecx & bit(X86_FEATURE_PCID)))
> + vmx->invpcid_enabled = true;
> + else {
> + exec_control &= ~SECONDARY_EXEC_ENABLE_INVPCID;
> + vmcs_write32(SECONDARY_VM_EXEC_CONTROL,
> + exec_control);
> + best = kvm_find_cpuid_entry(vcpu, 0x7, 0);
> + best->ecx &= ~bit(X86_FEATURE_INVPCID);
> + }
> + }
> + }
> }
>
>
If we enter a nested guest (which is running without PCID), we need to
either handle INVPCID exits (and inject a #UD) or disable INVPCID in the
exec controls. The first option is faster since it doesn't involve VMWRITEs.
If we do that, we don't need this code at all (since the same handling will
work for non-nested guests as well).
--
error compiling committee.c: too many arguments to function