* [PATCH] x86/hvm: accelerate I/O intercept handling
@ 2010-03-31 8:47 Andre Przywara
2010-03-31 9:26 ` Cui, Dexuan
0 siblings, 1 reply; 4+ messages in thread
From: Andre Przywara @ 2010-03-31 8:47 UTC (permalink / raw)
To: Keir Fraser, xen-devel
[-- Attachment #1: Type: text/plain, Size: 1463 bytes --]
Hi,
currently we go through the emulator every time a HVM guest does an I/O
port access (in/out). This is unnecessary most of the times, as both VMX
and SVM provide all the necessary information already in the VMCS/VMCB.
String instructions are not covered by this shortcut, but they are quite
rare and we would need to access the guest memory anyway.
This patch decodes the information from VMCB/VMCS and calls a simple
handle_mmio wrapper. In handle_mmio() itself the emulation part will
simply be skipped, this approach avoids code duplication.
Since the vendor specific part is quite trivial, I implemented both the
VMX and SVM part, please check the VMX part for sanity.
I boot-tested both versions and ran some simple benchmarks.
A micro benchmark (hammering an I/O port in a tight loop) shows a
significant performance improvement (down to 66% of the time needed to
handle the intercept on an AMD K8, measured in the guest with TSC).
Even with reading a 1GB file from an emulated IDE harddisk (Dom0 cached)
I could get a 4-5% improvement.
Some guest code (e.g. the TCP stack in some Windows version) exercises
the PM-Timer I/O port (0x1F48) very often (multiple 10,000 times per
second), these workloads also benefit with up to 5% improvement from
this patch.
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Regards,
Andre.
--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 448-3567-12
[-- Attachment #2: ioio_exit.patch --]
[-- Type: text/x-patch, Size: 8905 bytes --]
diff -r ebd84be3420a xen/arch/x86/hvm/emulate.c
--- a/xen/arch/x86/hvm/emulate.c Tue Mar 30 18:31:39 2010 +0100
+++ b/xen/arch/x86/hvm/emulate.c Wed Mar 31 10:17:30 2010 +0200
@@ -48,7 +48,7 @@
trace_var(event, 0/*!cycles*/, size, buffer);
}
-static int hvmemul_do_io(
+int hvmemul_do_io(
int is_mmio, paddr_t addr, unsigned long *reps, int size,
paddr_t ram_gpa, int dir, int df, void *p_data)
{
diff -r ebd84be3420a xen/arch/x86/hvm/io.c
--- a/xen/arch/x86/hvm/io.c Tue Mar 30 18:31:39 2010 +0100
+++ b/xen/arch/x86/hvm/io.c Wed Mar 31 10:17:30 2010 +0200
@@ -171,10 +171,22 @@
struct hvm_emulate_ctxt ctxt;
struct vcpu *curr = current;
int rc;
+ unsigned long data, reps = 1;
- hvm_emulate_prepare(&ctxt, guest_cpu_user_regs());
-
- rc = hvm_emulate_one(&ctxt);
+ if ( curr->arch.hvm_vcpu.io_size == 0 ) {
+ hvm_emulate_prepare(&ctxt, guest_cpu_user_regs());
+ rc = hvm_emulate_one(&ctxt);
+ } else {
+ if ( curr->arch.hvm_vcpu.io_dir == 0 )
+ data = guest_cpu_user_regs()->eax;
+ rc = hvmemul_do_io(0, curr->arch.hvm_vcpu.io_port, &reps,
+ curr->arch.hvm_vcpu.io_size, 0,
+ curr->arch.hvm_vcpu.io_dir, 0, &data);
+ if ( curr->arch.hvm_vcpu.io_dir == 1 && rc == X86EMUL_OKAY ) {
+ memcpy(&(guest_cpu_user_regs()->eax),
+ &data, curr->arch.hvm_vcpu.io_size);
+ }
+ }
if ( curr->arch.hvm_vcpu.io_state == HVMIO_awaiting_completion )
curr->arch.hvm_vcpu.io_state = HVMIO_handle_mmio_awaiting_completion;
@@ -184,14 +196,21 @@
switch ( rc )
{
case X86EMUL_UNHANDLEABLE:
- gdprintk(XENLOG_WARNING,
- "MMIO emulation failed @ %04x:%lx: "
- "%02x %02x %02x %02x %02x %02x\n",
- hvmemul_get_seg_reg(x86_seg_cs, &ctxt)->sel,
- ctxt.insn_buf_eip,
- ctxt.insn_buf[0], ctxt.insn_buf[1],
- ctxt.insn_buf[2], ctxt.insn_buf[3],
- ctxt.insn_buf[4], ctxt.insn_buf[5]);
+ if ( curr->arch.hvm_vcpu.io_size == 0 )
+ gdprintk(XENLOG_WARNING,
+ "MMIO emulation failed @ %04x:%lx: "
+ "%02x %02x %02x %02x %02x %02x\n",
+ hvmemul_get_seg_reg(x86_seg_cs, &ctxt)->sel,
+ ctxt.insn_buf_eip,
+ ctxt.insn_buf[0], ctxt.insn_buf[1],
+ ctxt.insn_buf[2], ctxt.insn_buf[3],
+ ctxt.insn_buf[4], ctxt.insn_buf[5]);
+ else
+ gdprintk(XENLOG_WARNING,
+ "I/O emulation failed: %s 0x%04x, %i bytes, data=%08lx\n",
+ curr->arch.hvm_vcpu.io_dir ? "in" : "out",
+ curr->arch.hvm_vcpu.io_port,
+ curr->arch.hvm_vcpu.io_size, data);
return 0;
case X86EMUL_EXCEPTION:
if ( ctxt.exn_pending )
@@ -201,9 +220,15 @@
break;
}
- hvm_emulate_writeback(&ctxt);
+ if ( curr->arch.hvm_vcpu.io_size == 0 )
+ hvm_emulate_writeback(&ctxt);
+ else
+ curr->arch.hvm_vcpu.io_size = 0;
- return 1;
+ if (rc == X86EMUL_RETRY)
+ return rc;
+ else
+ return 1;
}
int handle_mmio_with_translation(unsigned long gva, unsigned long gpfn)
@@ -213,6 +238,14 @@
return handle_mmio();
}
+int handle_mmio_decoded(uint16_t port, int size, int dir)
+{
+ current->arch.hvm_vcpu.io_port = port;
+ current->arch.hvm_vcpu.io_size = size;
+ current->arch.hvm_vcpu.io_dir = dir;
+ return handle_mmio();
+}
+
void hvm_io_assist(void)
{
struct vcpu *curr = current;
diff -r ebd84be3420a xen/arch/x86/hvm/svm/emulate.c
--- a/xen/arch/x86/hvm/svm/emulate.c Tue Mar 30 18:31:39 2010 +0100
+++ b/xen/arch/x86/hvm/svm/emulate.c Wed Mar 31 10:17:30 2010 +0200
@@ -149,6 +149,9 @@
if ( (inst_len = svm_nextrip_insn_length(v)) != 0 )
return inst_len;
+ if ( vmcb->exitcode == VMEXIT_IOIO )
+ return vmcb->exitinfo2 - vmcb->rip;
+
/* Fetch up to the next page break; we'll fetch from the next page
* later if we have to. */
fetch_addr = svm_rip2pointer(v);
diff -r ebd84be3420a xen/arch/x86/hvm/svm/svm.c
--- a/xen/arch/x86/hvm/svm/svm.c Tue Mar 30 18:31:39 2010 +0100
+++ b/xen/arch/x86/hvm/svm/svm.c Wed Mar 31 10:17:30 2010 +0200
@@ -42,6 +42,7 @@
#include <asm/hvm/hvm.h>
#include <asm/hvm/support.h>
#include <asm/hvm/io.h>
+#include <asm/hvm/emulate.h>
#include <asm/hvm/svm/asid.h>
#include <asm/hvm/svm/svm.h>
#include <asm/hvm/svm/vmcb.h>
@@ -1244,6 +1245,23 @@
__update_guest_eip(regs, inst_len);
}
+static int svm_vmexit_do_io(struct vmcb_struct *vmcb,
+ struct cpu_user_regs *regs)
+{
+ uint16_t port;
+ int bytes, dir;
+ int rc;
+
+ port = (vmcb->exitinfo1 >> 16) & 0xFFFF;
+ bytes = ((vmcb->exitinfo1 >> 4) & 0x07);
+ dir = (vmcb->exitinfo1 & 1) ? IOREQ_READ : IOREQ_WRITE;
+
+ rc = handle_mmio_decoded(port, bytes, dir);
+ if ( rc != X86EMUL_RETRY )
+ __update_guest_eip(regs, vmcb->exitinfo2 - vmcb->rip);
+ return rc;
+}
+
static void svm_invlpg_intercept(unsigned long vaddr)
{
struct vcpu *curr = current;
@@ -1450,11 +1468,17 @@
svm_vmexit_do_hlt(vmcb, regs);
break;
+ case VMEXIT_IOIO:
+ if ( ( vmcb->exitinfo1 & ( 1 << 2 ) ) == 0 ) {
+ if ( !svm_vmexit_do_io(vmcb, regs) )
+ hvm_inject_exception(TRAP_gp_fault, 0, 0);
+ break;
+ }
+ /* fallthrough to emulation if a string instruction */
case VMEXIT_CR0_READ ... VMEXIT_CR15_READ:
case VMEXIT_CR0_WRITE ... VMEXIT_CR15_WRITE:
case VMEXIT_INVLPG:
case VMEXIT_INVLPGA:
- case VMEXIT_IOIO:
if ( !handle_mmio() )
hvm_inject_exception(TRAP_gp_fault, 0, 0);
break;
diff -r ebd84be3420a xen/arch/x86/hvm/vmx/vmx.c
--- a/xen/arch/x86/hvm/vmx/vmx.c Tue Mar 30 18:31:39 2010 +0100
+++ b/xen/arch/x86/hvm/vmx/vmx.c Wed Mar 31 10:17:30 2010 +0200
@@ -1721,6 +1721,25 @@
return 1;
}
+static int vmx_io_intercept(unsigned long exit_qualification,
+ struct cpu_user_regs *regs)
+{
+ uint16_t port;
+ int bytes, dir;
+ int rc;
+ int inst_len;
+
+ port = (exit_qualification >> 16) & 0xFFFF;
+ bytes = (exit_qualification & 0x07) + 1;
+ dir = (exit_qualification & 0x08) ? IOREQ_READ : IOREQ_WRITE;
+
+ inst_len = __get_instruction_length();
+ rc = handle_mmio_decoded(port, bytes, dir);
+ if ( rc != X86EMUL_RETRY)
+ __update_guest_eip(inst_len);
+ return rc;
+}
+
static const struct lbr_info {
u32 base, count;
} p4_lbr[] = {
@@ -2589,8 +2608,14 @@
break;
case EXIT_REASON_IO_INSTRUCTION:
- if ( !handle_mmio() )
- vmx_inject_hw_exception(TRAP_gp_fault, 0);
+ exit_qualification = __vmread(EXIT_QUALIFICATION);
+ if (exit_qualification & 0x10) {
+ if ( !handle_mmio() )
+ vmx_inject_hw_exception(TRAP_gp_fault, 0);
+ } else {
+ if ( !vmx_io_intercept(exit_qualification, regs) )
+ vmx_inject_hw_exception(TRAP_gp_fault, 0);
+ }
break;
case EXIT_REASON_INVD:
diff -r ebd84be3420a xen/include/asm-x86/hvm/emulate.h
--- a/xen/include/asm-x86/hvm/emulate.h Tue Mar 30 18:31:39 2010 +0100
+++ b/xen/include/asm-x86/hvm/emulate.h Wed Mar 31 10:17:30 2010 +0200
@@ -46,4 +46,8 @@
enum x86_segment seg,
struct hvm_emulate_ctxt *hvmemul_ctxt);
+int hvmemul_do_io(
+ int is_mmio, paddr_t addr, unsigned long *reps, int size,
+ paddr_t ram_gpa, int dir, int df, void *p_data);
+
#endif /* __ASM_X86_HVM_EMULATE_H__ */
diff -r ebd84be3420a xen/include/asm-x86/hvm/io.h
--- a/xen/include/asm-x86/hvm/io.h Tue Mar 30 18:31:39 2010 +0100
+++ b/xen/include/asm-x86/hvm/io.h Wed Mar 31 10:17:30 2010 +0200
@@ -100,6 +100,7 @@
void send_invalidate_req(void);
int handle_mmio(void);
int handle_mmio_with_translation(unsigned long gva, unsigned long gpfn);
+int handle_mmio_decoded(uint16_t port, int size, int dir);
void hvm_interrupt_post(struct vcpu *v, int vector, int type);
void hvm_io_assist(void);
void hvm_dpci_eoi(struct domain *d, unsigned int guest_irq,
diff -r ebd84be3420a xen/include/asm-x86/hvm/vcpu.h
--- a/xen/include/asm-x86/hvm/vcpu.h Tue Mar 30 18:31:39 2010 +0100
+++ b/xen/include/asm-x86/hvm/vcpu.h Wed Mar 31 10:17:30 2010 +0200
@@ -103,6 +103,10 @@
*/
unsigned long mmio_gva;
unsigned long mmio_gpfn;
+ uint16_t io_port;
+ int io_size;
+ unsigned io_dir;
+
/* Callback into x86_emulate when emulating FPU/MMX/XMM instructions. */
void (*fpu_exception_callback)(void *, struct cpu_user_regs *);
void *fpu_exception_callback_arg;
[-- Attachment #3: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 4+ messages in thread
* RE: [PATCH] x86/hvm: accelerate I/O intercept handling
2010-03-31 8:47 [PATCH] x86/hvm: accelerate I/O intercept handling Andre Przywara
@ 2010-03-31 9:26 ` Cui, Dexuan
2010-03-31 10:08 ` Andre Przywara
0 siblings, 1 reply; 4+ messages in thread
From: Cui, Dexuan @ 2010-03-31 9:26 UTC (permalink / raw)
To: Andre Przywara, Keir Fraser, xen-devel
[-- Attachment #1: Type: text/plain, Size: 2244 bytes --]
Actually in 15425:6e934c799051, VMX once enabled the feature, but IIRC, later Keir removed that in some cleanup patches and let the EXIT_REASON_IO_INSTRUCTION handler invoke handle_mmio() also -- I don't remember how people commented the slowness of IN/OUT emulation caused the change...
One thing is: IIRC, old Intel CPUs don't supply the info, but your patch doesn't check that... so your patch can break the CPUs. Please refer to 15425.
Thanks,
--- Dexuan
-----Original Message-----
From: xen-devel-bounces@lists.xensource.com [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Andre Przywara
Sent: 2010年3月31日 16:48
To: Keir Fraser; xen-devel
Subject: [Xen-devel] [PATCH] x86/hvm: accelerate I/O intercept handling
Hi,
currently we go through the emulator every time a HVM guest does an I/O
port access (in/out). This is unnecessary most of the times, as both VMX
and SVM provide all the necessary information already in the VMCS/VMCB.
String instructions are not covered by this shortcut, but they are quite
rare and we would need to access the guest memory anyway.
This patch decodes the information from VMCB/VMCS and calls a simple
handle_mmio wrapper. In handle_mmio() itself the emulation part will
simply be skipped, this approach avoids code duplication.
Since the vendor specific part is quite trivial, I implemented both the
VMX and SVM part, please check the VMX part for sanity.
I boot-tested both versions and ran some simple benchmarks.
A micro benchmark (hammering an I/O port in a tight loop) shows a
significant performance improvement (down to 66% of the time needed to
handle the intercept on an AMD K8, measured in the guest with TSC).
Even with reading a 1GB file from an emulated IDE harddisk (Dom0 cached)
I could get a 4-5% improvement.
Some guest code (e.g. the TCP stack in some Windows version) exercises
the PM-Timer I/O port (0x1F48) very often (multiple 10,000 times per
second), these workloads also benefit with up to 5% improvement from
this patch.
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Regards,
Andre.
--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 448-3567-12
[-- Attachment #2: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] x86/hvm: accelerate I/O intercept handling
2010-03-31 9:26 ` Cui, Dexuan
@ 2010-03-31 10:08 ` Andre Przywara
2010-04-01 5:15 ` Cui, Dexuan
0 siblings, 1 reply; 4+ messages in thread
From: Andre Przywara @ 2010-03-31 10:08 UTC (permalink / raw)
To: Cui, Dexuan; +Cc: xen-devel, Keir Fraser
Cui, Dexuan wrote:
> Actually in 15425:6e934c799051, VMX once enabled the feature, but IIRC,
> later Keir removed that in some cleanup patches and let the
> EXIT_REASON_IO_INSTRUCTION handler invoke handle_mmio() also -- I don't
> remember how people commented the slowness of IN/OUT emulation caused
> the change...
If I got this correctly, 15425 introduces only the OUTS/INS segment
decoding feature, something that SVM does not have (yet ;-)
As I said in my mail, I doubt the usefulness of the shortcut in this
case, since it would require to access the guest memory and is used very
rarely. So for the sake of a saner implementation I would avoid
utilizing this feature, it simplifies the code much and avoids code
duplication. That's why my code just implements non-string instructions.
> One thing is: IIRC, old Intel CPUs don't supply the info, but your
> patch doesn't check that... so your patch can break the CPUs.
> Please refer to 15425.
But this is only true for the segment decoding part, right? Appendix G
of the Intel 3B manual only speaks of _this_ feature protected by bit 54
of IA32_VMX_BASIC MSR. Is the information shown in table 23-5 in section
23.2.1 not valid on all CPUs? If not, where can I check the support for
this?
Regards,
Andre.
--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 448-3567-12
^ permalink raw reply [flat|nested] 4+ messages in thread
* RE: [PATCH] x86/hvm: accelerate I/O intercept handling
2010-03-31 10:08 ` Andre Przywara
@ 2010-04-01 5:15 ` Cui, Dexuan
0 siblings, 0 replies; 4+ messages in thread
From: Cui, Dexuan @ 2010-04-01 5:15 UTC (permalink / raw)
To: Andre Przywara; +Cc: xen-devel, Keir Fraser
Andre Przywara wrote:
> Cui, Dexuan wrote:
>> Actually in 15425:6e934c799051, VMX once enabled the feature, but
>> IIRC, later Keir removed that in some cleanup patches and let the
>> EXIT_REASON_IO_INSTRUCTION handler invoke handle_mmio() also -- I
>> don't remember how people commented the slowness of IN/OUT emulation
>> caused the change...
>
> If I got this correctly, 15425 introduces only the OUTS/INS segment
Oh... sorry, my memory was bad...
> decoding feature, something that SVM does not have (yet ;-)
> As I said in my mail, I doubt the usefulness of the shortcut in this
> case, since it would require to access the guest memory and is used
> very rarely. So for the sake of a saner implementation I would avoid
> utilizing this feature, it simplifies the code much and avoids code
> duplication. That's why my code just implements non-string
> instructions.
>
>> One thing is: IIRC, old Intel CPUs don't supply the info, but your
> > patch doesn't check that... so your patch can break the CPUs.
> > Please refer to 15425.
> But this is only true for the segment decoding part, right? Appendix G
> of the Intel 3B manual only speaks of _this_ feature protected by bit
> 54 of IA32_VMX_BASIC MSR. Is the information shown in table 23-5 in
> section
> 23.2.1 not valid on all CPUs? If not, where can I check the support
> for this?
I mistook IN/OUT for INS/OUTS here. Sorry.
The patch looks good to me.
Thanks,
-- Dexuan
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2010-04-01 5:15 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-03-31 8:47 [PATCH] x86/hvm: accelerate I/O intercept handling Andre Przywara
2010-03-31 9:26 ` Cui, Dexuan
2010-03-31 10:08 ` Andre Przywara
2010-04-01 5:15 ` Cui, Dexuan
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.