* [PATCH 0/3] Expose HW APIC virtualization support to HVM guests
@ 2014-03-06 18:31 Boris Ostrovsky
2014-03-06 18:31 ` [PATCH 1/3] x86/hvm: Revert 80ecb40362365ba77e68fc609de8bd3b7208ae19 Boris Ostrovsky
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Boris Ostrovsky @ 2014-03-06 18:31 UTC
To: jbeulich, keir, jun.nakajima, eddie.dong; +Cc: boris.ostrovsky, xen-devel
HVM guests running on HW that supports HW APIC virtualization features
(APIC-register virtualization, virtual interrupt delivery, etc.) may
want to use the APIC instead of hvm_pirqs. Since we are not guaranteed to
have these features on VMX (for example, there is a boot option to
turn them off) and there is no such support on SVM, we need to tell the
guest when its APIC accesses are virtualized by HW and are therefore
likely to be cheap.
CPUID seems to be a good way to provide this info to the guest.
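As an illustration of the guest side of this proposal, here is a minimal sketch in C of the check a guest could make. It assumes the leaf layout proposed in patches 2/3 and 3/3 of this series (leaf base + 4, EAX bit 0), which may not match any final interface. `struct cpuid_regs` and `xen_guest_may_prefer_apic()` are hypothetical names introduced only for this example; the caller is expected to have executed CPUID for `base` and `base + 4` and stored the results.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* "XenVMMXenVMM" signature reported in EBX/ECX/EDX of the base leaf. */
#define XEN_CPUID_SIGNATURE_EBX 0x566e6558u /* "XenV" */
#define XEN_CPUID_SIGNATURE_ECX 0x65584d4du /* "MMXe" */
#define XEN_CPUID_SIGNATURE_EDX 0x4d4d566eu /* "nVMM" */

/* Bit proposed in patch 3/3 (EAX of leaf base + 4). */
#define XEN_HVM_CPUID_APIC_ACCESS_VIRT (1u << 0)

/* Hypothetical container for the four registers of one CPUID result. */
struct cpuid_regs {
    uint32_t eax, ebx, ecx, edx;
};

/*
 * base      - 0x40000000, or 0x40000100 when Viridian leaves are in use
 * base_leaf - result of CPUID(base)
 * hvm_leaf  - result of CPUID(base + 4)
 *
 * Returns true only if the hypervisor identifies itself as Xen, reports
 * at least leaf base + 4, and sets the APIC virtualization bit there.
 */
bool xen_guest_may_prefer_apic(uint32_t base,
                               const struct cpuid_regs *base_leaf,
                               const struct cpuid_regs *hvm_leaf)
{
    assert(base_leaf != NULL && hvm_leaf != NULL);

    if (base_leaf->ebx != XEN_CPUID_SIGNATURE_EBX ||
        base_leaf->ecx != XEN_CPUID_SIGNATURE_ECX ||
        base_leaf->edx != XEN_CPUID_SIGNATURE_EDX)
        return false;

    /* EAX of the base leaf holds the largest supported leaf. */
    if (base_leaf->eax < base + 4)
        return false;

    return (hvm_leaf->eax & XEN_HVM_CPUID_APIC_ACCESS_VIRT) != 0;
}
```

A guest that gets `false` here (older hypervisor, SVM host, or APIC virtualization disabled at boot) would simply keep using hvm_pirqs.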
Having a guest switch to the APIC noticeably reduces the number of
VMEXITs. For example, with a pass-through NIC, netperf sees roughly 43%
fewer (71269 vs. 124782). Here are results for
'xentrace -e 0x00083fff -c 2 -D -T 2' (the guest here essentially turned
off XENFEAT_hvm_pirqs, but we may want to use the APIC for MSI
interrupts only and leave pirqs for GSIs):
[root@ovs105 virt]# cat orig |xentrace_format ~/xen/tools/xentrace/formats | awk '{print $5}' | sort | uniq -c
94 cpu_change
13944 HLT
26341 INJ_VIRQ
12054 INTR
30784 INTR_WINDOW
10126 TRAP
124783 VMENTRY
124782 VMEXIT
59217 VMMCALL
35 wrap_buffer
[root@ovs105 virt]# cat apicv |xentrace_format ~/xen/tools/xentrace/formats | awk '{print $5}' | sort | uniq -c
49 cpu_change
16157 HLT
31 INJ_VIRQ
10652 INTR
38 INTR_WINDOW
10 NPF
10286 TRAP
71269 VMENTRY
71269 VMEXIT
34129 VMMCALL
15 wrap_buffer
The difference is even larger when the guest is busy.
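To make the comparison between the two traces concrete, the relative drop can be computed directly from the counts listed above. The helper below is only an arithmetic sketch; `percent_reduction()` is a name made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Percentage by which `after` is lower than `before`. */
double percent_reduction(uint64_t before, uint64_t after)
{
    assert(before > 0);
    return 100.0 * ((double)before - (double)after) / (double)before;
}
```

With the numbers from the traces, `percent_reduction(124782, 71269)` gives roughly 43% for VMEXIT, while INTR_WINDOW (30784 to 38) and INJ_VIRQ (26341 to 31) both drop by more than 99%.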
These results are in line with what has been reported for KVM. For example:
http://events.linuxfoundation.org/sites/events/files/cojp13_natapov.pdf
http://www.linuxplumbersconf.org/2012/wp-content/uploads/2012/09/2012-lpc-virt-intel-vt-feat-nakajima.pdf
I am also not sure whether (cpu_has_vmx_apic_reg_virt &&
cpu_has_vmx_virtualize_x2apic_mode) is sufficient to declare full HW
APIC support to a guest. The tests show ~95K VMEXITs when virtual
interrupt delivery and posted interrupts are turned off (versus ~125K
in the original run above), so there still appears to be some benefit.
I suppose we could use another CPUID bit for these two (although I am
not particularly eager to do this).
Boris Ostrovsky (3):
x86/hvm: Revert 80ecb40362365ba77e68fc609de8bd3b7208ae19
x86/hvm: Add HVM-specific hypervisor CPUID leaf
x86/hvm: Indicate availability of HW support of APIC virtualization
to HVM guests
xen/arch/x86/hvm/hvm.c | 17 +++++++++++++++++
xen/arch/x86/hvm/vmx/vmx.c | 10 ++++++++++
xen/arch/x86/traps.c | 15 ++++++---------
xen/include/asm-x86/hvm/hvm.h | 7 +++++++
xen/include/public/arch-x86/cpuid.h | 8 ++++++++
5 files changed, 48 insertions(+), 9 deletions(-)
* [PATCH 1/3] x86/hvm: Revert 80ecb40362365ba77e68fc609de8bd3b7208ae19
From: Boris Ostrovsky @ 2014-03-06 18:31 UTC
To: jbeulich, keir, jun.nakajima, eddie.dong; +Cc: boris.ostrovsky, xen-devel
The Solaris bug that commit 80ecb40362365ba77e68fc609de8bd3b7208ae19 addressed
has been fixed and backported to earlier releases.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
xen/arch/x86/traps.c | 11 ++---------
1 files changed, 2 insertions(+), 9 deletions(-)
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index c462317..d8f83a0 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -677,23 +677,16 @@ int cpuid_hypervisor_leaves( uint32_t idx, uint32_t sub_idx,
struct domain *d = current->domain;
/* Optionally shift out of the way of Viridian architectural leaves. */
uint32_t base = is_viridian_domain(d) ? 0x40000100 : 0x40000000;
- uint32_t limit;
idx -= base;
- /*
- * Some Solaris PV drivers fail if max > base + 2. Help them out by
- * hiding the PVRDTSCP leaf if PVRDTSCP is disabled.
- */
- limit = (d->arch.tsc_mode < TSC_MODE_PVRDTSCP) ? 2 : 3;
-
- if ( idx > limit )
+ if ( idx > 3 )
return 0;
switch ( idx )
{
case 0:
- *eax = base + limit; /* Largest leaf */
+ *eax = base + 3; /* Largest leaf */
*ebx = XEN_CPUID_SIGNATURE_EBX;
*ecx = XEN_CPUID_SIGNATURE_ECX;
*edx = XEN_CPUID_SIGNATURE_EDX;
--
1.7.1
* [PATCH 2/3] x86/hvm: Add HVM-specific hypervisor CPUID leaf
From: Boris Ostrovsky @ 2014-03-06 18:31 UTC
To: jbeulich, keir, jun.nakajima, eddie.dong; +Cc: boris.ostrovsky, xen-devel
CPUID leaf 0x40000004 is for HVM-specific features.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
xen/arch/x86/hvm/hvm.c | 17 +++++++++++++++++
xen/arch/x86/traps.c | 8 ++++++--
xen/include/asm-x86/hvm/hvm.h | 7 +++++++
3 files changed, 30 insertions(+), 2 deletions(-)
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 9e85c13..71783de 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2978,6 +2978,23 @@ unsigned long copy_from_user_hvm(void *to, const void *from, unsigned len)
return rc ? len : 0; /* fake a copy_from_user() return code */
}
+void hvm_hypervisor_cpuid_leaf(uint32_t idx, uint32_t sub_idx,
+ uint32_t *eax, uint32_t *ebx,
+ uint32_t *ecx, uint32_t *edx)
+{
+ if ( idx != 4 )
+ return;
+
+ *eax = *ebx = *ecx = *edx = 0;
+ if ( !has_hvm_container_domain(current->domain) ||
+ !hvm_funcs.hypervisor_cpuid_leaf )
+ return;
+
+ hvm_funcs.hypervisor_cpuid_leaf(idx, sub_idx, eax, ebx, ecx, edx);
+
+ return;
+}
+
void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
unsigned int *ecx, unsigned int *edx)
{
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index d8f83a0..13b422b 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -680,13 +680,13 @@ int cpuid_hypervisor_leaves( uint32_t idx, uint32_t sub_idx,
idx -= base;
- if ( idx > 3 )
+ if ( idx > 4 )
return 0;
switch ( idx )
{
case 0:
- *eax = base + 3; /* Largest leaf */
+ *eax = base + 4; /* Largest leaf */
*ebx = XEN_CPUID_SIGNATURE_EBX;
*ecx = XEN_CPUID_SIGNATURE_ECX;
*edx = XEN_CPUID_SIGNATURE_EDX;
@@ -715,6 +715,10 @@ int cpuid_hypervisor_leaves( uint32_t idx, uint32_t sub_idx,
cpuid_time_leaf( sub_idx, eax, ebx, ecx, edx );
break;
+ case 4:
+ hvm_hypervisor_cpuid_leaf(idx, sub_idx, eax, ebx, ecx, edx);
+ break;
+
default:
BUG();
}
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index dcc3483..e6b1399 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -200,6 +200,10 @@ struct hvm_function_table {
paddr_t *L1_gpa, unsigned int *page_order,
uint8_t *p2m_acc, bool_t access_r,
bool_t access_w, bool_t access_x);
+
+ void (*hypervisor_cpuid_leaf)(uint32_t idx, uint32_t sub_idx,
+ uint32_t *eax, uint32_t *ebx,
+ uint32_t *ecx, uint32_t *edx);
};
extern struct hvm_function_table hvm_funcs;
@@ -336,6 +340,9 @@ static inline unsigned long hvm_get_shadow_gs_base(struct vcpu *v)
#define is_viridian_domain(_d) \
(is_hvm_domain(_d) && ((_d)->arch.hvm_domain.params[HVM_PARAM_VIRIDIAN]))
+void hvm_hypervisor_cpuid_leaf(uint32_t idx, uint32_t sub_idx,
+ uint32_t *eax, uint32_t *ebx,
+ uint32_t *ecx, uint32_t *edx);
void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
unsigned int *ecx, unsigned int *edx);
void hvm_migrate_timers(struct vcpu *v);
--
1.7.1
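The dispatch that this patch adds can be modelled outside the hypervisor. The sketch below is a simplified, standalone model of the logic in hvm_hypervisor_cpuid_leaf(), not Xen code: `hvm_leaf_model()`, `leaf_hook_t`, the `is_hvm` flag (standing in for has_hvm_container_domain()) and `example_hook()` are all names assumed for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same shape as the hypervisor_cpuid_leaf hook added to hvm_function_table. */
typedef void (*leaf_hook_t)(uint32_t idx, uint32_t sub_idx,
                            uint32_t *eax, uint32_t *ebx,
                            uint32_t *ecx, uint32_t *edx);

/*
 * Model of the common-code handler: only leaf index 4 is handled, the
 * registers are zeroed first, and the vendor hook (if any) may then set
 * feature bits. A non-HVM domain or a missing hook leaves all zeros.
 */
void hvm_leaf_model(bool is_hvm, leaf_hook_t hook,
                    uint32_t idx, uint32_t sub_idx,
                    uint32_t *eax, uint32_t *ebx,
                    uint32_t *ecx, uint32_t *edx)
{
    assert(eax != NULL && ebx != NULL && ecx != NULL && edx != NULL);

    if (idx != 4)
        return;

    *eax = *ebx = *ecx = *edx = 0;

    if (!is_hvm || hook == NULL)
        return;

    hook(idx, sub_idx, eax, ebx, ecx, edx);
}

/* Hypothetical vendor hook that advertises a single feature in EAX bit 0. */
void example_hook(uint32_t idx, uint32_t sub_idx,
                  uint32_t *eax, uint32_t *ebx,
                  uint32_t *ecx, uint32_t *edx)
{
    (void)idx; (void)sub_idx; (void)ebx; (void)ecx; (void)edx;
    *eax |= 1u << 0;
}
```

Because the registers are cleared before the hook runs, a vendor that never installs a hook (SVM in this series) reports an all-zero leaf rather than stale register contents.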
* [PATCH 3/3] x86/hvm: Indicate availability of HW support of APIC virtualization to HVM guests
From: Boris Ostrovsky @ 2014-03-06 18:31 UTC
To: jbeulich, keir, jun.nakajima, eddie.dong; +Cc: boris.ostrovsky, xen-devel
Set a bit in the HVM-specific hypervisor CPUID leaf to indicate that the
HW provides (and the hypervisor enables) support for APIC virtualization.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
xen/arch/x86/hvm/vmx/vmx.c | 10 ++++++++++
xen/include/public/arch-x86/cpuid.h | 8 ++++++++
2 files changed, 18 insertions(+), 0 deletions(-)
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 8395e86..b68805b 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -57,6 +57,7 @@
#include <asm/apic.h>
#include <asm/hvm/nestedhvm.h>
#include <asm/event.h>
+#include <public/arch-x86/cpuid.h>
enum handler_return { HNDL_done, HNDL_unhandled, HNDL_exception_raised };
@@ -1646,6 +1647,14 @@ static void vmx_handle_eoi(u8 vector)
__vmwrite(GUEST_INTR_STATUS, status);
}
+void vmx_hypervisor_cpuid_leaf(uint32_t idx, uint32_t sub_idx,
+ uint32_t *eax, uint32_t *ebx,
+ uint32_t *ecx, uint32_t *edx)
+{
+ if ( cpu_has_vmx_apic_reg_virt && cpu_has_vmx_virtualize_x2apic_mode )
+ *eax |= XEN_HVM_CPUID_APIC_ACCESS_VIRT;
+}
+
static struct hvm_function_table __initdata vmx_function_table = {
.name = "VMX",
.cpu_up_prepare = vmx_cpu_up_prepare,
@@ -1703,6 +1712,7 @@ static struct hvm_function_table __initdata vmx_function_table = {
.sync_pir_to_irr = vmx_sync_pir_to_irr,
.handle_eoi = vmx_handle_eoi,
.nhvm_hap_walk_L1_p2m = nvmx_hap_walk_L1_p2m,
+ .hypervisor_cpuid_leaf= vmx_hypervisor_cpuid_leaf,
};
const struct hvm_function_table * __init start_vmx(void)
diff --git a/xen/include/public/arch-x86/cpuid.h b/xen/include/public/arch-x86/cpuid.h
index d9bd627..dbbe746 100644
--- a/xen/include/public/arch-x86/cpuid.h
+++ b/xen/include/public/arch-x86/cpuid.h
@@ -65,4 +65,12 @@
#define _XEN_CPUID_FEAT1_MMU_PT_UPDATE_PRESERVE_AD 0
#define XEN_CPUID_FEAT1_MMU_PT_UPDATE_PRESERVE_AD (1u<<0)
+/*
+ * Leaf 5 (0x40000004)
+ * HVM-specific features
+ */
+
+/* EAX Features */
+#define XEN_HVM_CPUID_APIC_ACCESS_VIRT (1u << 0) /* Virtualized APIC registers */
+
#endif /* __XEN_PUBLIC_ARCH_X86_CPUID_H__ */
--
1.7.1
* Re: [PATCH 1/3] x86/hvm: Revert 80ecb40362365ba77e68fc609de8bd3b7208ae19
From: Jan Beulich @ 2014-03-07 10:46 UTC
To: Boris Ostrovsky; +Cc: keir, eddie.dong, jun.nakajima, xen-devel
>>> On 06.03.14 at 19:31, Boris Ostrovsky <boris.ostrovsky@oracle.com> wrote:
> The Solaris bug that commit 80ecb40362365ba77e68fc609de8bd3b7208ae19
> addressed
> has been fixed and backported to earlier releases.
I don't think this is sufficient justification for the revert: Suppose
someone's still running an un-patched Solaris guest (for a cloud
provider this may even be unknowingly) and then updates the
hypervisor (i.e. by migrating the guest to an updated host). I
can see why you want this reverted for the subsequent patches,
but I'm afraid a different solution will need to be found (if nothing
else, via explicit guest config option).
Jan
* Re: [PATCH 1/3] x86/hvm: Revert 80ecb40362365ba77e68fc609de8bd3b7208ae19
From: Boris Ostrovsky @ 2014-03-07 14:37 UTC
To: Jan Beulich; +Cc: keir, eddie.dong, jun.nakajima, xen-devel
On 03/07/2014 05:46 AM, Jan Beulich wrote:
>>>> On 06.03.14 at 19:31, Boris Ostrovsky <boris.ostrovsky@oracle.com> wrote:
>> The Solaris bug that commit 80ecb40362365ba77e68fc609de8bd3b7208ae19
>> addressed
>> has been fixed and backported to earlier releases.
> I don't think this is sufficient justification for the revert: Suppose
> someone's still running an un-patched Solaris guest (for a cloud
> provider this may even be unknowingly) and then updates the
> hypervisor (i.e. by migrating the guest to an updated host). I
> can see why you want this reverted for the subsequent patches,
> but I'm afraid a different solution will need to be found (if nothing
> else, via explicit guest config option).
I did check with our Solaris folks and we (Oracle) don't recommend
running versions of Solaris that are susceptible to this issue (not
because of this bug specifically but mostly because those releases are
too old and presumably there are other problems that needed to be
addressed).
Having said that, we could provide a way to reduce the number of leaves
by using the 'cpuid' option in the guest's xl configuration file
(xl.cfg). In fact, supporting this from the configuration file should be
a useful feature anyway.
-boris
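One way to picture the compromise discussed here is a per-domain cap on the highest hypervisor leaf that is reported. The sketch below is purely hypothetical: no such option exists in the patches in this thread, and `cfg_cap`, `reported_max_leaf()` and `leaf_visible()` are names invented for the example. It mirrors the `idx -= base; if ( idx > limit )` pattern from cpuid_hypervisor_leaves().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * hv_max_offset - highest leaf offset the hypervisor implements (4 after
 *                 patch 2/3 of this series)
 * cfg_cap       - hypothetical per-domain cap taken from the guest config
 */
static uint32_t effective_limit(uint32_t hv_max_offset, uint32_t cfg_cap)
{
    return cfg_cap < hv_max_offset ? cfg_cap : hv_max_offset;
}

/* Value that would be returned in EAX of the base leaf ("largest leaf"). */
uint32_t reported_max_leaf(uint32_t base, uint32_t hv_max_offset,
                           uint32_t cfg_cap)
{
    return base + effective_limit(hv_max_offset, cfg_cap);
}

/* Whether a CPUID query for `leaf` would be answered for this domain. */
bool leaf_visible(uint32_t leaf, uint32_t base, uint32_t hv_max_offset,
                  uint32_t cfg_cap)
{
    /* Unsigned wrap makes leaves below `base` compare as out of range. */
    uint32_t idx = leaf - base;

    return idx <= effective_limit(hv_max_offset, cfg_cap);
}
```

Under these assumptions, a domain configured with a cap of 2 would see a largest leaf of base + 2, which is what the Solaris PV drivers mentioned in the reverted comment expect, while an uncapped domain would see base + 4.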