From: Avi Kivity <avi@redhat.com>
To: "Mao, Junjie" <junjie.mao@intel.com>
Cc: "'kvm@vger.kernel.org'" <kvm@vger.kernel.org>
Subject: Re: [PATCH] KVM: x86: Implement PCID/INVPCID for guests with EPT
Date: Thu, 10 May 2012 14:48:21 +0300
Message-ID: <4FABAB05.1080001@redhat.com>
In-Reply-To: <EF5A1D57CFBD5A4BA5EB3ED985B6DC6E064239@SHSMSX101.ccr.corp.intel.com>

On 05/10/2012 03:32 AM, Mao, Junjie wrote:
> This patch handles PCID/INVPCID for guests.
>
> Process-context identifiers (PCIDs) are a facility by which a logical processor may cache information for multiple linear-address spaces so that the processor may retain cached information when software switches to a different linear-address space. Refer to section 4.10.1 in IA32 Intel Software Developer's Manual Volume 3A for details.
>
> For guests with EPT, the PCID feature is enabled and INVPCID behaves as it does when running natively.
> For guests without EPT, the PCID feature is disabled and INVPCID triggers #UD.
>
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 74c9edf..bb9a707 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -52,7 +52,7 @@
>  #define CR4_RESERVED_BITS \
>  	(~(unsigned long)(X86_CR4_VME | X86_CR4_PVI | X86_CR4_TSD | X86_CR4_DE\
>  			  | X86_CR4_PSE | X86_CR4_PAE | X86_CR4_MCE \
> -			  | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR \
> +			  | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
>  			  | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_RDWRGSFS \
>  			  | X86_CR4_OSXMMEXCPT | X86_CR4_VMXE))

We should hide CR4.PCIDE from nested VMX until that code is prepared to
handle it.

> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index d2bd719..ba00789 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -413,6 +413,7 @@ struct vcpu_vmx {
>  	u32 exit_reason;
>
>  	bool rdtscp_enabled;
> +	bool invpcid_enabled;
>
>  	/* Support for a guest hypervisor (nested VMX) */
>  	struct nested_vmx nested;
> @@ -839,6 +840,12 @@ static inline bool cpu_has_vmx_rdtscp(void)
>  		SECONDARY_EXEC_RDTSCP;
>  }
>
> +static bool vmx_pcid_supported(void)
> +{
> +	/* Enable PCID for non-ept guests may cause performance regression */

Why is that?

> +	return enable_ept && (boot_cpu_data.x86_capability[4] & bit(X86_FEATURE_PCID));
> +}
> +
>  /*
>   * Swap MSR entry in host/guest MSR entry array.
>   */
> @@ -4337,8 +4352,14 @@ static int handle_set_cr0(struct kvm_vcpu *vcpu, unsigned long val)
>  			return 1;
>  		vmcs_writel(CR0_READ_SHADOW, val);
>  		return 0;
> -	} else
> +	} else {
> +		unsigned long old_cr0 = kvm_read_cr0(vcpu);
> +		if ((old_cr0 & X86_CR0_PG) && !(val & X86_CR0_PG) &&
> +		    (kvm_read_cr4(vcpu) & X86_CR4_PCIDE))

Use kvm_read_cr4_bits(); it's slightly faster. Also, move this check to
x86.c.

> +			return 1;
> +
>  		return kvm_set_cr0(vcpu, val);
> +	}
>  }
>
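
To make the two comments on the CR0 hunk concrete, here is a minimal
user-space sketch, not the KVM code. struct vcpu_model, model_set_cr0()
and model_read_cr4_bits() are hypothetical stand-ins. The masked read is
written in the spirit of kvm_read_cr4_bits(), which presumably helps
because the caller asks only for the bit it needs. The check itself (clearing
CR0.PG while CR4.PCIDE is set must fail) is the one from the patch, just
placed in a vendor-neutral setter.

```c
#include <assert.h>
#include <stdbool.h>

/* Architectural bit positions (Intel SDM). */
#define X86_CR0_PG    (1UL << 31)
#define X86_CR4_PCIDE (1UL << 17)

/* Hypothetical stand-in for struct kvm_vcpu: only cached CR0/CR4. */
struct vcpu_model {
	unsigned long cr0;
	unsigned long cr4;
};

/* Masked read: return only the CR4 bits the caller asked for. */
static unsigned long model_read_cr4_bits(const struct vcpu_model *v,
					 unsigned long mask)
{
	return v->cr4 & mask;
}

/*
 * Returns 1 when the write must be refused (the caller would inject #GP)
 * and 0 on success, following the return convention used in the patch.
 */
int model_set_cr0(struct vcpu_model *v, unsigned long val)
{
	bool clearing_pg = (v->cr0 & X86_CR0_PG) && !(val & X86_CR0_PG);

	/* Paging cannot be turned off while PCIDs are enabled. */
	if (clearing_pg && model_read_cr4_bits(v, X86_CR4_PCIDE))
		return 1;

	v->cr0 = val;
	return 0;
}

/* Small usage example: the same write fails or succeeds depending on PCIDE. */
void cr0_pg_example(void)
{
	struct vcpu_model v = { X86_CR0_PG, X86_CR4_PCIDE };

	assert(model_set_cr0(&v, 0) == 1);	/* refused, CR0 unchanged */
	v.cr4 = 0;
	assert(model_set_cr0(&v, 0) == 0);	/* allowed once PCIDE is clear */
}
```

Keeping the check in the common setter means every CR0 write path gets
it, not only the VMX exit handler.
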
> static int handle_set_cr4(struct kvm_vcpu *vcpu, unsigned long val)
> @@ -4349,8 +4370,26 @@ static int handle_set_cr4(struct kvm_vcpu *vcpu, unsigned long val)
>  			return 1;
>  		vmcs_writel(CR4_READ_SHADOW, val);
>  		return 0;
> -	} else
> -		return kvm_set_cr4(vcpu, val);
> +	} else {
> +		unsigned long old_cr4 = kvm_read_cr4(vcpu);
> +		int ret = 1;
> +
> +		if ((val & X86_CR4_PCIDE) && !(old_cr4 & X86_CR4_PCIDE)) {
> +			if (!guest_cpuid_has_pcid(vcpu))
> +				return ret;
> +
> +			/* PCID can not be enabled when cr3[11:0]!=000H or EFER.LMA=0 */
> +			if ((kvm_read_cr3(vcpu) & X86_CR3_PCID_MASK) || !is_long_mode(vcpu))
> +				return ret;
> +		}
> +
> +		ret = kvm_set_cr4(vcpu, val);
> +
> +		if (!ret && (!(val & X86_CR4_PCIDE) && (old_cr4 & X86_CR4_PCIDE)))
> +			kvm_mmu_reset_context(vcpu);
> +
> +		return ret;
> +	}

Please move this to x86.c as well.

> }
>
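
As a rough illustration of what the CR4 hunk checks, and of what would
move into the common setter, here is a self-contained model. Everything
named *_model is hypothetical; the mmu_resets counter merely stands in
for the kvm_mmu_reset_context() call in the patch. The rules are the
ones stated in the quoted code: CR4.PCIDE may only go from 0 to 1 when
the guest CPUID advertises PCID, CR3[11:0] is zero and the vCPU is in
long mode, and clearing PCIDE resets the MMU context.

```c
#include <assert.h>
#include <stdbool.h>

#define X86_CR4_PCIDE     (1UL << 17)
#define X86_CR3_PCID_MASK 0xfffUL	/* CR3[11:0] holds the PCID */

/* Hypothetical stand-in for the vCPU state the check depends on. */
struct vcpu_cr4_model {
	unsigned long cr3;
	unsigned long cr4;
	bool long_mode;		/* models EFER.LMA */
	bool cpuid_has_pcid;	/* models guest CPUID.01H:ECX.PCID */
	int mmu_resets;		/* counts modelled MMU context resets */
};

/* Returns 1 when the write must be refused (#GP), 0 on success. */
int model_set_cr4(struct vcpu_cr4_model *v, unsigned long val)
{
	unsigned long old_cr4 = v->cr4;

	/* 0 -> 1 transition of PCIDE: validate the preconditions. */
	if ((val & X86_CR4_PCIDE) && !(old_cr4 & X86_CR4_PCIDE)) {
		if (!v->cpuid_has_pcid)
			return 1;
		if ((v->cr3 & X86_CR3_PCID_MASK) || !v->long_mode)
			return 1;
	}

	v->cr4 = val;

	/* 1 -> 0 transition of PCIDE: the patch resets the MMU context. */
	if (!(val & X86_CR4_PCIDE) && (old_cr4 & X86_CR4_PCIDE))
		v->mmu_resets++;

	return 0;
}

/* Small usage example: enable PCID, then disable it again. */
void cr4_pcide_example(void)
{
	struct vcpu_cr4_model v = { 0x1000, 0, true, true, 0 };

	assert(model_set_cr4(&v, X86_CR4_PCIDE) == 0);
	assert(model_set_cr4(&v, 0) == 0);
	assert(v.mmu_resets == 1);
}
```
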
> /* called to set cr0 as approriate for clts instruction exit. */
> @@ -6420,6 +6459,23 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
>  			}
>  		}
>  	}
> +
> +	vmx->invpcid_enabled = false;
> +	if (vmx_pcid_supported()) {
> +		exec_control = vmcs_read32(SECONDARY_VM_EXEC_CONTROL);
> +		if (exec_control & SECONDARY_EXEC_ENABLE_INVPCID) {
> +			best = kvm_find_cpuid_entry(vcpu, 0x1, 0);
> +			if (best && (best->ecx & bit(X86_FEATURE_PCID)))
> +				vmx->invpcid_enabled = true;
> +			else {
> +				exec_control &= ~SECONDARY_EXEC_ENABLE_INVPCID;
> +				vmcs_write32(SECONDARY_VM_EXEC_CONTROL,
> +					     exec_control);
> +				best = kvm_find_cpuid_entry(vcpu, 0x7, 0);
> +				best->ecx &= ~bit(X86_FEATURE_INVPCID);
> +			}
> +		}
> +	}
>  }
>
>

If we enter a nested guest (which is running without PCID), we need
either to handle INVPCID exits (and inject a #UD) or to disable INVPCID
in the exec controls. The first option is faster since it doesn't
involve VMWRITEs. If we do that, we don't need this code, since the
exit handling will work for non-nested guests as well.
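
The first option could look roughly like the sketch below. This is a
user-space model under stated assumptions, not KVM code:
struct invpcid_vcpu_model and model_handle_invpcid() are hypothetical
names, and the sketch simply assumes that the INVPCID exit reaches the
hypervisor. The only point it illustrates is that one exit handler can
cover both the nested case and a guest whose CPUID lacks PCID, without
touching the exec controls.

```c
#include <assert.h>
#include <stdbool.h>

#define UD_VECTOR 6	/* #UD, the invalid-opcode exception */

/* Hypothetical stand-in for the vCPU state the handler looks at. */
struct invpcid_vcpu_model {
	bool guest_cpuid_has_pcid;	/* guest was offered PCID */
	bool in_nested_guest;		/* currently running a nested guest */
	int queued_vector;		/* -1 when no exception is queued */
};

/*
 * Returns 1 when the exit was handled by queueing #UD and the guest can
 * be resumed. Returns 0 for the case this sketch does not cover (a guest
 * that legitimately has PCID and would need the instruction carried out).
 */
int model_handle_invpcid(struct invpcid_vcpu_model *v)
{
	if (v->in_nested_guest || !v->guest_cpuid_has_pcid) {
		v->queued_vector = UD_VECTOR;
		return 1;
	}
	return 0;
}

/* Small usage example: a nested guest gets #UD even if L1 has PCID. */
void invpcid_example(void)
{
	struct invpcid_vcpu_model v = { true, true, -1 };

	assert(model_handle_invpcid(&v) == 1);
	assert(v.queued_vector == UD_VECTOR);
}
```
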
--
error compiling committee.c: too many arguments to function