* [PATCH v3 0/7] Streamline arithmetic instruction emulation
@ 2013-01-04 14:18 Avi Kivity
2013-01-04 14:18 ` [PATCH v3 1/7] KVM: x86 emulator: framework for streamlining arithmetic opcodes Avi Kivity
` (8 more replies)
0 siblings, 9 replies; 10+ messages in thread
From: Avi Kivity @ 2013-01-04 14:18 UTC
To: Marcelo Tosatti, Gleb Natapov; +Cc: kvm
The current arithmetic instruction emulation is fairly clumsy: after
decode, each instruction gets a switch (size), and for every size
we fetch the operands, prepare flags, emulate the instruction, then store
back the flags and operands.
This patchset simplifies things by moving everything into common code
except the instruction itself. All the pre- and post- processing is
coded just once. The per-instruction code looks like:
add %bl, %al
ret
add %bx, %ax
ret
add %ebx, %eax
ret
add %rbx, %rax
ret
The savings in size, for the ten instructions converted in this patchset,
are fairly large:
text data bss dec hex filename
63724 0 0 63724 f8ec arch/x86/kvm/emulate.o.before
61268 0 0 61268 ef54 arch/x86/kvm/emulate.o.after
- a saving of 2456 bytes, or around 2500.
v3: fix reversed operand order in 2-operand macro
v2: rebased
Avi Kivity (7):
KVM: x86 emulator: framework for streamlining arithmetic opcodes
KVM: x86 emulator: Support for declaring single operand fastops
KVM: x86 emulator: introduce NoWrite flag
KVM: x86 emulator: mark CMP, CMPS, SCAS, TEST as NoWrite
KVM: x86 emulator: convert NOT, NEG to fastop
KVM: x86 emulator: add macros for defining 2-operand fastop emulation
KVM: x86 emulator: convert basic ALU ops to fastop
arch/x86/kvm/emulate.c | 215 +++++++++++++++++++++++++++----------------------
1 file changed, 120 insertions(+), 95 deletions(-)
--
1.8.0.1
* [PATCH v3 1/7] KVM: x86 emulator: framework for streamlining arithmetic opcodes
2013-01-04 14:18 [PATCH v3 0/7] Streamline arithmetic instruction emulation Avi Kivity
@ 2013-01-04 14:18 ` Avi Kivity
2013-01-04 14:18 ` [PATCH v3 2/7] KVM: x86 emulator: Support for declaring single operand fastops Avi Kivity
` (7 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Avi Kivity @ 2013-01-04 14:18 UTC
To: Marcelo Tosatti, Gleb Natapov; +Cc: kvm
We emulate arithmetic opcodes by executing a "similar" instruction (same
operation, different operands) on the CPU. This ensures accurate emulation,
especially with respect to eflags. However, the prologue and epilogue around
the opcode are fairly long, consisting of a switch (for the operand size) and
code to load and save the operands. This is repeated for every opcode.
This patch introduces an alternative way to emulate arithmetic opcodes.
Instead of the above, we have four (three on i386) functions consisting
of just the opcode and a ret; one for each operand size. For example:
.align 8
em_notb:
not %al
ret
.align 8
em_notw:
not %ax
ret
.align 8
em_notl:
not %eax
ret
.align 8
em_notq:
not %rax
ret
The prologue and epilogue are shared across all opcodes. Note the functions
use a special calling convention; notably eflags is an input/output parameter
and is not clobbered. Rather than dispatching to the four functions through a
jump table, each function is aligned to a constant size (8 bytes) so its
address can be calculated from the operand size.
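To illustrate the address calculation described above, here is a minimal user-space sketch. It is not the kernel code: `fastop_offset()` and `nr_fastop()` are hypothetical helpers, and `__builtin_ctzl()` (a GCC/Clang builtin) is assumed as a stand-in for the kernel's `__ffs()`.

```c
#include <stddef.h>

/* Assumption: mirrors FASTOP_SIZE from the patch; each stub occupies 8 bytes. */
#define FASTOP_SIZE 8

/*
 * Hypothetical helper: byte offset of the stub for a given operand size.
 * __builtin_ctzl() stands in for __ffs(); both give the index of the
 * lowest set bit, so operand sizes 1, 2, 4, 8 map to indexes 0, 1, 2, 3.
 */
static size_t fastop_offset(unsigned long operand_bytes)
{
    return (size_t)__builtin_ctzl(operand_bytes) * FASTOP_SIZE;
}

/*
 * Hypothetical helper: number of stubs per opcode for a given word size,
 * following NR_FASTOP = ilog2(sizeof(ulong)) + 1 from the patch.
 */
static unsigned int nr_fastop(size_t word_bytes)
{
    return (unsigned int)__builtin_ctzl(word_bytes) + 1;
}
```

With these definitions, a 4-byte operand lands 16 bytes past the first stub, and an 8-byte word gives four stubs per opcode (three for a 4-byte word), which matches the "four (three on i386)" wording above.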
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
---
arch/x86/kvm/emulate.c | 41 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 41 insertions(+)
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 53c5ad6..dd71567 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -149,6 +149,7 @@
#define Aligned ((u64)1 << 41) /* Explicitly aligned (e.g. MOVDQA) */
#define Unaligned ((u64)1 << 42) /* Explicitly unaligned (e.g. MOVDQU) */
#define Avx ((u64)1 << 43) /* Advanced Vector Extensions */
+#define Fastop ((u64)1 << 44) /* Use opcode::u.fastop */
#define X2(x...) x, x
#define X3(x...) X2(x), x
@@ -159,6 +160,27 @@
#define X8(x...) X4(x), X4(x)
#define X16(x...) X8(x), X8(x)
+#define NR_FASTOP (ilog2(sizeof(ulong)) + 1)
+#define FASTOP_SIZE 8
+
+/*
+ * fastop functions have a special calling convention:
+ *
+ * dst: [rdx]:rax (in/out)
+ * src: rbx (in/out)
+ * src2: rcx (in)
+ * flags: rflags (in/out)
+ *
+ * Moreover, they are all exactly FASTOP_SIZE bytes long, so functions for
+ * different operand sizes can be reached by calculation, rather than a jump
+ * table (which would be bigger than the code).
+ *
+ * fastop functions are declared as taking a never-defined fastop parameter,
+ * so they can't be called from C directly.
+ */
+
+struct fastop;
+
struct opcode {
u64 flags : 56;
u64 intercept : 8;
@@ -168,6 +190,7 @@ struct opcode {
const struct group_dual *gdual;
const struct gprefix *gprefix;
const struct escape *esc;
+ void (*fastop)(struct fastop *fake);
} u;
int (*check_perm)(struct x86_emulate_ctxt *ctxt);
};
@@ -3646,6 +3669,7 @@ static int check_perm_out(struct x86_emulate_ctxt *ctxt)
#define GD(_f, _g) { .flags = ((_f) | GroupDual | ModRM), .u.gdual = (_g) }
#define E(_f, _e) { .flags = ((_f) | Escape | ModRM), .u.esc = (_e) }
#define I(_f, _e) { .flags = (_f), .u.execute = (_e) }
+#define F(_f, _e) { .flags = (_f) | Fastop, .u.fastop = (_e) }
#define II(_f, _e, _i) \
{ .flags = (_f), .u.execute = (_e), .intercept = x86_intercept_##_i }
#define IIP(_f, _e, _i, _p) \
@@ -4502,6 +4526,16 @@ static void fetch_possible_mmx_operand(struct x86_emulate_ctxt *ctxt,
read_mmx_reg(ctxt, &op->mm_val, op->addr.mm);
}
+static int fastop(struct x86_emulate_ctxt *ctxt, void (*fop)(struct fastop *))
+{
+ ulong flags = (ctxt->eflags & EFLAGS_MASK) | X86_EFLAGS_IF;
+ fop += __ffs(ctxt->dst.bytes) * FASTOP_SIZE;
+ asm("push %[flags]; popf; call *%[fastop]; pushf; pop %[flags]\n"
+ : "+a"(ctxt->dst.val), "+b"(ctxt->src.val), [flags]"+D"(flags)
+ : "c"(ctxt->src2.val), [fastop]"S"(fop));
+ ctxt->eflags = (ctxt->eflags & ~EFLAGS_MASK) | (flags & EFLAGS_MASK);
+ return X86EMUL_CONTINUE;
+}
int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
{
@@ -4631,6 +4665,13 @@ special_insn:
}
if (ctxt->execute) {
+ if (ctxt->d & Fastop) {
+ void (*fop)(struct fastop *) = (void *)ctxt->execute;
+ rc = fastop(ctxt, fop);
+ if (rc != X86EMUL_CONTINUE)
+ goto done;
+ goto writeback;
+ }
rc = ctxt->execute(ctxt);
if (rc != X86EMUL_CONTINUE)
goto done;
--
1.8.0.1
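The flag handling in fastop() above can be sketched in plain C. This is only an illustration under stated assumptions: `ARITH_FLAGS_MASK` is assumed to cover the six arithmetic flags (CF, PF, AF, ZF, SF, OF) in the way EFLAGS_MASK appears to be used in the patch, and `merge_flags()` is a hypothetical name.

```c
/* Bit positions of the x86 arithmetic flags in EFLAGS. */
#define FLAG_CF (1UL << 0)
#define FLAG_PF (1UL << 2)
#define FLAG_AF (1UL << 4)
#define FLAG_ZF (1UL << 6)
#define FLAG_SF (1UL << 7)
#define FLAG_OF (1UL << 11)

/* Assumption: the set of flags that an arithmetic stub may modify. */
#define ARITH_FLAGS_MASK \
    (FLAG_CF | FLAG_PF | FLAG_AF | FLAG_ZF | FLAG_SF | FLAG_OF)

/*
 * Hypothetical helper: keep every non-arithmetic bit of the guest's eflags
 * and take the arithmetic bits from the flags the host CPU produced.
 */
static unsigned long merge_flags(unsigned long guest_eflags,
                                 unsigned long host_flags)
{
    return (guest_eflags & ~ARITH_FLAGS_MASK) |
           (host_flags & ARITH_FLAGS_MASK);
}
```

For example, a guest value of 0x202 merged with host flags 0x41 (CF and ZF set) yields 0x243: the guest's IF bit survives and only the arithmetic bits come from the host.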
* [PATCH v3 2/7] KVM: x86 emulator: Support for declaring single operand fastops
2013-01-04 14:18 [PATCH v3 0/7] Streamline arithmetic instruction emulation Avi Kivity
2013-01-04 14:18 ` [PATCH v3 1/7] KVM: x86 emulator: framework for streamlining arithmetic opcodes Avi Kivity
@ 2013-01-04 14:18 ` Avi Kivity
2013-01-04 14:18 ` [PATCH v3 3/7] KVM: x86 emulator: introduce NoWrite flag Avi Kivity
` (6 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Avi Kivity @ 2013-01-04 14:18 UTC
To: Marcelo Tosatti, Gleb Natapov; +Cc: kvm
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
---
arch/x86/kvm/emulate.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index dd71567..42c53c8 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -24,6 +24,7 @@
#include "kvm_cache_regs.h"
#include <linux/module.h>
#include <asm/kvm_emulate.h>
+#include <linux/stringify.h>
#include "x86.h"
#include "tss.h"
@@ -439,6 +440,30 @@ static void invalidate_registers(struct x86_emulate_ctxt *ctxt)
} \
} while (0)
+#define FOP_ALIGN ".align " __stringify(FASTOP_SIZE) " \n\t"
+#define FOP_RET "ret \n\t"
+
+#define FOP_START(op) \
+ extern void em_##op(struct fastop *fake); \
+ asm(".pushsection .text, \"ax\" \n\t" \
+ ".global em_" #op " \n\t" \
+ FOP_ALIGN \
+ "em_" #op ": \n\t"
+
+#define FOP_END \
+ ".popsection")
+
+#define FOP1E(op, dst) \
+ FOP_ALIGN #op " %" #dst " \n\t" FOP_RET
+
+#define FASTOP1(op) \
+ FOP_START(op) \
+ FOP1E(op##b, al) \
+ FOP1E(op##w, ax) \
+ FOP1E(op##l, eax) \
+ ON64(FOP1E(op##q, rax)) \
+ FOP_END
+
#define __emulate_1op_rax_rdx(ctxt, _op, _suffix, _ex) \
do { \
unsigned long _tmp; \
--
1.8.0.1
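Since this patch carries no description, a user-space sketch may help show what the macros generate. It reuses the string-pasting idea from FOP1E, but builds an ordinary C string instead of a top-level asm() statement. `FASTOP1_TEXT`, `fastop1_not_text()` and `count_occurrences()` are hypothetical names; the section directives from FOP_START are omitted and a 64-bit build is assumed, so the `q` variant is always present.

```c
#include <string.h>

#define FASTOP_SIZE 8
#define STR_(x) #x
#define STR(x) STR_(x)   /* stand-in for the kernel's __stringify() */

#define FOP_ALIGN ".align " STR(FASTOP_SIZE) " \n\t"
#define FOP_RET "ret \n\t"
#define FOP1E(op, dst) FOP_ALIGN #op " %" #dst " \n\t" FOP_RET

/* Assumption: 64-bit build, so no ON64() wrapper around the q variant. */
#define FASTOP1_TEXT(op)     \
    "em_" #op ": \n\t"       \
    FOP1E(op##b, al)         \
    FOP1E(op##w, ax)         \
    FOP1E(op##l, eax)        \
    FOP1E(op##q, rax)

/* The assembly text that a FASTOP1(not)-style macro would emit. */
static const char *fastop1_not_text(void)
{
    return FASTOP1_TEXT(not);
}

/* Count non-overlapping occurrences of a non-empty needle in text. */
static int count_occurrences(const char *text, const char *needle)
{
    int n = 0;
    size_t len = strlen(needle);

    for (const char *p = strstr(text, needle); p; p = strstr(p + len, needle))
        n++;
    return n;
}
```

Printing `fastop1_not_text()` shows one label followed by four aligned stubs (`notb %al`, `notw %ax`, `notl %eax`, `notq %rax`), each ending in `ret`.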
* [PATCH v3 3/7] KVM: x86 emulator: introduce NoWrite flag
2013-01-04 14:18 [PATCH v3 0/7] Streamline arithmetic instruction emulation Avi Kivity
2013-01-04 14:18 ` [PATCH v3 1/7] KVM: x86 emulator: framework for streamlining arithmetic opcodes Avi Kivity
2013-01-04 14:18 ` [PATCH v3 2/7] KVM: x86 emulator: Support for declaring single operand fastops Avi Kivity
@ 2013-01-04 14:18 ` Avi Kivity
2013-01-04 14:18 ` [PATCH v3 4/7] KVM: x86 emulator: mark CMP, CMPS, SCAS, TEST as NoWrite Avi Kivity
` (5 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Avi Kivity @ 2013-01-04 14:18 UTC
To: Marcelo Tosatti, Gleb Natapov; +Cc: kvm
Instead of disabling writeback via OP_NONE, just specify NoWrite.
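The difference between the two approaches can be sketched with a toy model. Everything here except the NoWrite bit position is a hypothetical stand-in: `struct toy_ctxt` and `toy_writeback()` are not the emulator's real types or functions.

```c
#include <stdint.h>

#define NoWrite ((uint64_t)1 << 45)   /* same bit as in the patch */

/* Hypothetical, much reduced operand types. */
enum op_type { OP_REG, OP_MEM, OP_NONE };

/* Hypothetical, much reduced emulation context. */
struct toy_ctxt {
    uint64_t d;              /* decode flags of the current opcode */
    enum op_type dst_type;   /* destination operand type */
    unsigned long dst_val;   /* result computed by the handler */
    unsigned long reg;       /* pretend destination register */
};

/* Returns 1 if the result was written back, 0 if writeback was skipped. */
static int toy_writeback(struct toy_ctxt *ctxt)
{
    /* The decode flag decides; the handler no longer edits dst_type. */
    if (ctxt->d & NoWrite)
        return 0;

    if (ctxt->dst_type == OP_REG) {
        ctxt->reg = ctxt->dst_val;
        return 1;
    }
    return 0;
}
```

The point of the flag is that an opcode table entry can declare "no writeback" up front, so a handler such as CMP or TEST does not need its own C code just to set the destination type to OP_NONE.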
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
---
arch/x86/kvm/emulate.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 42c53c8..fe113fb 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -151,6 +151,7 @@
#define Unaligned ((u64)1 << 42) /* Explicitly unaligned (e.g. MOVDQU) */
#define Avx ((u64)1 << 43) /* Advanced Vector Extensions */
#define Fastop ((u64)1 << 44) /* Use opcode::u.fastop */
+#define NoWrite ((u64)1 << 45) /* No writeback */
#define X2(x...) x, x
#define X3(x...) X2(x), x
@@ -1633,6 +1634,9 @@ static int writeback(struct x86_emulate_ctxt *ctxt)
{
int rc;
+ if (ctxt->d & NoWrite)
+ return X86EMUL_CONTINUE;
+
switch (ctxt->dst.type) {
case OP_REG:
write_register_operand(&ctxt->dst);
--
1.8.0.1
* [PATCH v3 4/7] KVM: x86 emulator: mark CMP, CMPS, SCAS, TEST as NoWrite
2013-01-04 14:18 [PATCH v3 0/7] Streamline arithmetic instruction emulation Avi Kivity
` (2 preceding siblings ...)
2013-01-04 14:18 ` [PATCH v3 3/7] KVM: x86 emulator: introduce NoWrite flag Avi Kivity
@ 2013-01-04 14:18 ` Avi Kivity
2013-01-04 14:18 ` [PATCH v3 5/7] KVM: x86 emulator: convert NOT, NEG to fastop Avi Kivity
` (4 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Avi Kivity @ 2013-01-04 14:18 UTC
To: Marcelo Tosatti, Gleb Natapov; +Cc: kvm
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
---
arch/x86/kvm/emulate.c | 20 ++++++++------------
1 file changed, 8 insertions(+), 12 deletions(-)
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index fe113fb..2af0c44 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3069,16 +3069,12 @@ static int em_xor(struct x86_emulate_ctxt *ctxt)
static int em_cmp(struct x86_emulate_ctxt *ctxt)
{
emulate_2op_SrcV(ctxt, "cmp");
- /* Disable writeback. */
- ctxt->dst.type = OP_NONE;
return X86EMUL_CONTINUE;
}
static int em_test(struct x86_emulate_ctxt *ctxt)
{
emulate_2op_SrcV(ctxt, "test");
- /* Disable writeback. */
- ctxt->dst.type = OP_NONE;
return X86EMUL_CONTINUE;
}
@@ -3747,7 +3743,7 @@ static const struct opcode group1[] = {
I(Lock | PageTable, em_and),
I(Lock, em_sub),
I(Lock, em_xor),
- I(0, em_cmp),
+ I(NoWrite, em_cmp),
};
static const struct opcode group1A[] = {
@@ -3755,8 +3751,8 @@ static const struct opcode group1A[] = {
};
static const struct opcode group3[] = {
- I(DstMem | SrcImm, em_test),
- I(DstMem | SrcImm, em_test),
+ I(DstMem | SrcImm | NoWrite, em_test),
+ I(DstMem | SrcImm | NoWrite, em_test),
I(DstMem | SrcNone | Lock, em_not),
I(DstMem | SrcNone | Lock, em_neg),
I(SrcMem, em_mul_ex),
@@ -3920,7 +3916,7 @@ static const struct opcode opcode_table[256] = {
/* 0x30 - 0x37 */
I6ALU(Lock, em_xor), N, N,
/* 0x38 - 0x3F */
- I6ALU(0, em_cmp), N, N,
+ I6ALU(NoWrite, em_cmp), N, N,
/* 0x40 - 0x4F */
X16(D(DstReg)),
/* 0x50 - 0x57 */
@@ -3946,7 +3942,7 @@ static const struct opcode opcode_table[256] = {
G(DstMem | SrcImm, group1),
G(ByteOp | DstMem | SrcImm | No64, group1),
G(DstMem | SrcImmByte, group1),
- I2bv(DstMem | SrcReg | ModRM, em_test),
+ I2bv(DstMem | SrcReg | ModRM | NoWrite, em_test),
I2bv(DstMem | SrcReg | ModRM | Lock | PageTable, em_xchg),
/* 0x88 - 0x8F */
I2bv(DstMem | SrcReg | ModRM | Mov | PageTable, em_mov),
@@ -3966,12 +3962,12 @@ static const struct opcode opcode_table[256] = {
I2bv(DstAcc | SrcMem | Mov | MemAbs, em_mov),
I2bv(DstMem | SrcAcc | Mov | MemAbs | PageTable, em_mov),
I2bv(SrcSI | DstDI | Mov | String, em_mov),
- I2bv(SrcSI | DstDI | String, em_cmp),
+ I2bv(SrcSI | DstDI | String | NoWrite, em_cmp),
/* 0xA8 - 0xAF */
- I2bv(DstAcc | SrcImm, em_test),
+ I2bv(DstAcc | SrcImm | NoWrite, em_test),
I2bv(SrcAcc | DstDI | Mov | String, em_mov),
I2bv(SrcSI | DstAcc | Mov | String, em_mov),
- I2bv(SrcAcc | DstDI | String, em_cmp),
+ I2bv(SrcAcc | DstDI | String | NoWrite, em_cmp),
/* 0xB0 - 0xB7 */
X8(I(ByteOp | DstReg | SrcImm | Mov, em_mov)),
/* 0xB8 - 0xBF */
--
1.8.0.1
* [PATCH v3 5/7] KVM: x86 emulator: convert NOT, NEG to fastop
2013-01-04 14:18 [PATCH v3 0/7] Streamline arithmetic instruction emulation Avi Kivity
` (3 preceding siblings ...)
2013-01-04 14:18 ` [PATCH v3 4/7] KVM: x86 emulator: mark CMP, CMPS, SCAS, TEST as NoWrite Avi Kivity
@ 2013-01-04 14:18 ` Avi Kivity
2013-01-04 14:18 ` [PATCH v3 6/7] KVM: x86 emulator: add macros for defining 2-operand fastop emulation Avi Kivity
` (3 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Avi Kivity @ 2013-01-04 14:18 UTC
To: Marcelo Tosatti, Gleb Natapov; +Cc: kvm
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
---
arch/x86/kvm/emulate.c | 17 ++++-------------
1 file changed, 4 insertions(+), 13 deletions(-)
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 2af0c44..09dbdc5 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -2050,17 +2050,8 @@ static int em_grp2(struct x86_emulate_ctxt *ctxt)
return X86EMUL_CONTINUE;
}
-static int em_not(struct x86_emulate_ctxt *ctxt)
-{
- ctxt->dst.val = ~ctxt->dst.val;
- return X86EMUL_CONTINUE;
-}
-
-static int em_neg(struct x86_emulate_ctxt *ctxt)
-{
- emulate_1op(ctxt, "neg");
- return X86EMUL_CONTINUE;
-}
+FASTOP1(not);
+FASTOP1(neg);
static int em_mul_ex(struct x86_emulate_ctxt *ctxt)
{
@@ -3753,8 +3744,8 @@ static const struct opcode group1A[] = {
static const struct opcode group3[] = {
I(DstMem | SrcImm | NoWrite, em_test),
I(DstMem | SrcImm | NoWrite, em_test),
- I(DstMem | SrcNone | Lock, em_not),
- I(DstMem | SrcNone | Lock, em_neg),
+ F(DstMem | SrcNone | Lock, em_not),
+ F(DstMem | SrcNone | Lock, em_neg),
I(SrcMem, em_mul_ex),
I(SrcMem, em_imul_ex),
I(SrcMem, em_div_ex),
--
1.8.0.1
* [PATCH v3 6/7] KVM: x86 emulator: add macros for defining 2-operand fastop emulation
2013-01-04 14:18 [PATCH v3 0/7] Streamline arithmetic instruction emulation Avi Kivity
` (4 preceding siblings ...)
2013-01-04 14:18 ` [PATCH v3 5/7] KVM: x86 emulator: convert NOT, NEG to fastop Avi Kivity
@ 2013-01-04 14:18 ` Avi Kivity
2013-01-04 14:18 ` [PATCH v3 7/7] KVM: x86 emulator: convert basic ALU ops to fastop Avi Kivity
` (2 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Avi Kivity @ 2013-01-04 14:18 UTC
To: Marcelo Tosatti, Gleb Natapov; +Cc: kvm
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
---
arch/x86/kvm/emulate.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 09dbdc5..3b5d4dd 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -465,6 +465,17 @@ static void invalidate_registers(struct x86_emulate_ctxt *ctxt)
ON64(FOP1E(op##q, rax)) \
FOP_END
+#define FOP2E(op, dst, src) \
+ FOP_ALIGN #op " %" #src ", %" #dst " \n\t" FOP_RET
+
+#define FASTOP2(op) \
+ FOP_START(op) \
+ FOP2E(op##b, al, bl) \
+ FOP2E(op##w, ax, bx) \
+ FOP2E(op##l, eax, ebx) \
+ ON64(FOP2E(op##q, rax, rbx)) \
+ FOP_END
+
#define __emulate_1op_rax_rdx(ctxt, _op, _suffix, _ex) \
do { \
unsigned long _tmp; \
@@ -3696,6 +3707,7 @@ static int check_perm_out(struct x86_emulate_ctxt *ctxt)
#define D2bv(_f) D((_f) | ByteOp), D(_f)
#define D2bvIP(_f, _i, _p) DIP((_f) | ByteOp, _i, _p), DIP(_f, _i, _p)
#define I2bv(_f, _e) I((_f) | ByteOp, _e), I(_f, _e)
+#define F2bv(_f, _e) F((_f) | ByteOp, _e), F(_f, _e)
#define I2bvIP(_f, _e, _i, _p) \
IIP((_f) | ByteOp, _e, _i, _p), IIP(_f, _e, _i, _p)
--
1.8.0.1
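The operand order in FOP2E is the point of the v3 fix mentioned in the cover letter, so a small user-space sketch of the generated text may be useful. `FOP2E_TEXT`, `add_byte_text()` and `source_comes_first()` are hypothetical names; the alignment and `ret` parts of the real macro are left out.

```c
#include <string.h>

/*
 * Same pasting as FOP2E in the patch, reduced to the instruction itself.
 * The macro takes (op, dst, src) but emits AT&T syntax, where the source
 * operand is written first and the destination second.
 */
#define FOP2E_TEXT(op, dst, src) #op " %" #src ", %" #dst

/* Text for the byte-sized ADD stub: dst is al, src is bl. */
static const char *add_byte_text(void)
{
    return FOP2E_TEXT(addb, al, bl);
}

/* Returns 1 when src appears before dst in the generated text. */
static int source_comes_first(const char *text, const char *src,
                              const char *dst)
{
    const char *s = strstr(text, src);
    const char *d = strstr(text, dst);

    return s != NULL && d != NULL && s < d;
}
```

The result is `addb %bl, %al`, which adds the source (rbx family) into the destination (rax family), consistent with the calling convention comment in patch 1/7.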
* [PATCH v3 7/7] KVM: x86 emulator: convert basic ALU ops to fastop
2013-01-04 14:18 [PATCH v3 0/7] Streamline arithmetic instruction emulation Avi Kivity
` (5 preceding siblings ...)
2013-01-04 14:18 ` [PATCH v3 6/7] KVM: x86 emulator: add macros for defining 2-operand fastop emulation Avi Kivity
@ 2013-01-04 14:18 ` Avi Kivity
2013-01-08 11:38 ` [PATCH v3 0/7] Streamline arithmetic instruction emulation Gleb Natapov
2013-01-10 17:33 ` Marcelo Tosatti
8 siblings, 0 replies; 10+ messages in thread
From: Avi Kivity @ 2013-01-04 14:18 UTC
To: Marcelo Tosatti, Gleb Natapov; +Cc: kvm
Opcodes:
TEST
CMP
ADD
ADC
SUB
SBB
XOR
OR
AND
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
---
arch/x86/kvm/emulate.c | 112 +++++++++++++++----------------------------------
1 file changed, 34 insertions(+), 78 deletions(-)
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 3b5d4dd..619a33d 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3026,59 +3026,15 @@ static int em_ret_near_imm(struct x86_emulate_ctxt *ctxt)
return X86EMUL_CONTINUE;
}
-static int em_add(struct x86_emulate_ctxt *ctxt)
-{
- emulate_2op_SrcV(ctxt, "add");
- return X86EMUL_CONTINUE;
-}
-
-static int em_or(struct x86_emulate_ctxt *ctxt)
-{
- emulate_2op_SrcV(ctxt, "or");
- return X86EMUL_CONTINUE;
-}
-
-static int em_adc(struct x86_emulate_ctxt *ctxt)
-{
- emulate_2op_SrcV(ctxt, "adc");
- return X86EMUL_CONTINUE;
-}
-
-static int em_sbb(struct x86_emulate_ctxt *ctxt)
-{
- emulate_2op_SrcV(ctxt, "sbb");
- return X86EMUL_CONTINUE;
-}
-
-static int em_and(struct x86_emulate_ctxt *ctxt)
-{
- emulate_2op_SrcV(ctxt, "and");
- return X86EMUL_CONTINUE;
-}
-
-static int em_sub(struct x86_emulate_ctxt *ctxt)
-{
- emulate_2op_SrcV(ctxt, "sub");
- return X86EMUL_CONTINUE;
-}
-
-static int em_xor(struct x86_emulate_ctxt *ctxt)
-{
- emulate_2op_SrcV(ctxt, "xor");
- return X86EMUL_CONTINUE;
-}
-
-static int em_cmp(struct x86_emulate_ctxt *ctxt)
-{
- emulate_2op_SrcV(ctxt, "cmp");
- return X86EMUL_CONTINUE;
-}
-
-static int em_test(struct x86_emulate_ctxt *ctxt)
-{
- emulate_2op_SrcV(ctxt, "test");
- return X86EMUL_CONTINUE;
-}
+FASTOP2(add);
+FASTOP2(or);
+FASTOP2(adc);
+FASTOP2(sbb);
+FASTOP2(and);
+FASTOP2(sub);
+FASTOP2(xor);
+FASTOP2(cmp);
+FASTOP2(test);
static int em_xchg(struct x86_emulate_ctxt *ctxt)
{
@@ -3711,9 +3667,9 @@ static int check_perm_out(struct x86_emulate_ctxt *ctxt)
#define I2bvIP(_f, _e, _i, _p) \
IIP((_f) | ByteOp, _e, _i, _p), IIP(_f, _e, _i, _p)
-#define I6ALU(_f, _e) I2bv((_f) | DstMem | SrcReg | ModRM, _e), \
- I2bv(((_f) | DstReg | SrcMem | ModRM) & ~Lock, _e), \
- I2bv(((_f) & ~Lock) | DstAcc | SrcImm, _e)
+#define F6ALU(_f, _e) F2bv((_f) | DstMem | SrcReg | ModRM, _e), \
+ F2bv(((_f) | DstReg | SrcMem | ModRM) & ~Lock, _e), \
+ F2bv(((_f) & ~Lock) | DstAcc | SrcImm, _e)
static const struct opcode group7_rm1[] = {
DI(SrcNone | Priv, monitor),
@@ -3739,14 +3695,14 @@ static const struct opcode group7_rm7[] = {
};
static const struct opcode group1[] = {
- I(Lock, em_add),
- I(Lock | PageTable, em_or),
- I(Lock, em_adc),
- I(Lock, em_sbb),
- I(Lock | PageTable, em_and),
- I(Lock, em_sub),
- I(Lock, em_xor),
- I(NoWrite, em_cmp),
+ F(Lock, em_add),
+ F(Lock | PageTable, em_or),
+ F(Lock, em_adc),
+ F(Lock, em_sbb),
+ F(Lock | PageTable, em_and),
+ F(Lock, em_sub),
+ F(Lock, em_xor),
+ F(NoWrite, em_cmp),
};
static const struct opcode group1A[] = {
@@ -3754,8 +3710,8 @@ static const struct opcode group1A[] = {
};
static const struct opcode group3[] = {
- I(DstMem | SrcImm | NoWrite, em_test),
- I(DstMem | SrcImm | NoWrite, em_test),
+ F(DstMem | SrcImm | NoWrite, em_test),
+ F(DstMem | SrcImm | NoWrite, em_test),
F(DstMem | SrcNone | Lock, em_not),
F(DstMem | SrcNone | Lock, em_neg),
I(SrcMem, em_mul_ex),
@@ -3897,29 +3853,29 @@ static const struct escape escape_dd = { {
static const struct opcode opcode_table[256] = {
/* 0x00 - 0x07 */
- I6ALU(Lock, em_add),
+ F6ALU(Lock, em_add),
I(ImplicitOps | Stack | No64 | Src2ES, em_push_sreg),
I(ImplicitOps | Stack | No64 | Src2ES, em_pop_sreg),
/* 0x08 - 0x0F */
- I6ALU(Lock | PageTable, em_or),
+ F6ALU(Lock | PageTable, em_or),
I(ImplicitOps | Stack | No64 | Src2CS, em_push_sreg),
N,
/* 0x10 - 0x17 */
- I6ALU(Lock, em_adc),
+ F6ALU(Lock, em_adc),
I(ImplicitOps | Stack | No64 | Src2SS, em_push_sreg),
I(ImplicitOps | Stack | No64 | Src2SS, em_pop_sreg),
/* 0x18 - 0x1F */
- I6ALU(Lock, em_sbb),
+ F6ALU(Lock, em_sbb),
I(ImplicitOps | Stack | No64 | Src2DS, em_push_sreg),
I(ImplicitOps | Stack | No64 | Src2DS, em_pop_sreg),
/* 0x20 - 0x27 */
- I6ALU(Lock | PageTable, em_and), N, N,
+ F6ALU(Lock | PageTable, em_and), N, N,
/* 0x28 - 0x2F */
- I6ALU(Lock, em_sub), N, I(ByteOp | DstAcc | No64, em_das),
+ F6ALU(Lock, em_sub), N, I(ByteOp | DstAcc | No64, em_das),
/* 0x30 - 0x37 */
- I6ALU(Lock, em_xor), N, N,
+ F6ALU(Lock, em_xor), N, N,
/* 0x38 - 0x3F */
- I6ALU(NoWrite, em_cmp), N, N,
+ F6ALU(NoWrite, em_cmp), N, N,
/* 0x40 - 0x4F */
X16(D(DstReg)),
/* 0x50 - 0x57 */
@@ -3945,7 +3901,7 @@ static const struct opcode opcode_table[256] = {
G(DstMem | SrcImm, group1),
G(ByteOp | DstMem | SrcImm | No64, group1),
G(DstMem | SrcImmByte, group1),
- I2bv(DstMem | SrcReg | ModRM | NoWrite, em_test),
+ F2bv(DstMem | SrcReg | ModRM | NoWrite, em_test),
I2bv(DstMem | SrcReg | ModRM | Lock | PageTable, em_xchg),
/* 0x88 - 0x8F */
I2bv(DstMem | SrcReg | ModRM | Mov | PageTable, em_mov),
@@ -3965,12 +3921,12 @@ static const struct opcode opcode_table[256] = {
I2bv(DstAcc | SrcMem | Mov | MemAbs, em_mov),
I2bv(DstMem | SrcAcc | Mov | MemAbs | PageTable, em_mov),
I2bv(SrcSI | DstDI | Mov | String, em_mov),
- I2bv(SrcSI | DstDI | String | NoWrite, em_cmp),
+ F2bv(SrcSI | DstDI | String | NoWrite, em_cmp),
/* 0xA8 - 0xAF */
- I2bv(DstAcc | SrcImm | NoWrite, em_test),
+ F2bv(DstAcc | SrcImm | NoWrite, em_test),
I2bv(SrcAcc | DstDI | Mov | String, em_mov),
I2bv(SrcSI | DstAcc | Mov | String, em_mov),
- I2bv(SrcAcc | DstDI | String | NoWrite, em_cmp),
+ F2bv(SrcAcc | DstDI | String | NoWrite, em_cmp),
/* 0xB0 - 0xB7 */
X8(I(ByteOp | DstReg | SrcImm | Mov, em_mov)),
/* 0xB8 - 0xBF */
--
1.8.0.1
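As a reading aid for F6ALU, the sketch below expands the same flag expressions into an array of six values. The bit assignments are hypothetical and chosen only for illustration (the kernel's values differ), and `F2BV_FLAGS` and `F6ALU_FLAGS` are made-up names that keep only the flags part of each table entry.

```c
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only. */
#define ByteOp ((uint64_t)1 << 0)
#define DstReg ((uint64_t)1 << 1)
#define DstMem ((uint64_t)1 << 2)
#define DstAcc ((uint64_t)1 << 3)
#define SrcReg ((uint64_t)1 << 4)
#define SrcMem ((uint64_t)1 << 5)
#define SrcImm ((uint64_t)1 << 6)
#define ModRM  ((uint64_t)1 << 7)
#define Lock   ((uint64_t)1 << 8)
#define Fastop ((uint64_t)1 << 9)

/* Flags of the two entries F2bv produces: byte-sized, then full-sized. */
#define F2BV_FLAGS(f) ((f) | ByteOp | Fastop), ((f) | Fastop)

/* Flags of the six entries F6ALU produces, in table order. */
#define F6ALU_FLAGS(f)                                  \
    F2BV_FLAGS((f) | DstMem | SrcReg | ModRM),          \
    F2BV_FLAGS(((f) | DstReg | SrcMem | ModRM) & ~Lock), \
    F2BV_FLAGS(((f) & ~Lock) | DstAcc | SrcImm)

/* What F6ALU(Lock, em_add) would put in the flags fields. */
static const uint64_t add_flags[6] = { F6ALU_FLAGS(Lock) };
```

Only the first pair, with a memory destination, keeps Lock; the register and accumulator forms have it masked off, since a LOCK prefix is only valid with a memory destination.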
* Re: [PATCH v3 0/7] Streamline arithmetic instruction emulation
2013-01-04 14:18 [PATCH v3 0/7] Streamline arithmetic instruction emulation Avi Kivity
` (6 preceding siblings ...)
2013-01-04 14:18 ` [PATCH v3 7/7] KVM: x86 emulator: convert basic ALU ops to fastop Avi Kivity
@ 2013-01-08 11:38 ` Gleb Natapov
2013-01-10 17:33 ` Marcelo Tosatti
8 siblings, 0 replies; 10+ messages in thread
From: Gleb Natapov @ 2013-01-08 11:38 UTC
To: Avi Kivity; +Cc: Marcelo Tosatti, kvm
On Fri, Jan 04, 2013 at 04:18:47PM +0200, Avi Kivity wrote:
> The current arithmetic instruction emulation is fairly clumsy: after
> decode, each instruction gets a switch (size), and for every size
> we fetch the operands, prepare flags, emulate the instruction, then store
> back the flags and operands.
>
> This patchset simplifies things by moving everything into common code
> except the instruction itself. All the pre- and post- processing is
> coded just once. The per-instruction code looks like:
>
> add %bl, %al
> ret
>
> add %bx, %ax
> ret
>
> add %ebx, %eax
> ret
>
> add %rbx, %rax
> ret
>
> The savings in size, for the ten instructions converted in this patchset,
> are fairly large:
>
> text data bss dec hex filename
> 63724 0 0 63724 f8ec arch/x86/kvm/emulate.o.before
> 61268 0 0 61268 ef54 arch/x86/kvm/emulate.o.after
>
> - around 2500 bytes.
>
> v3: fix reversed operand order in 2-operand macro
>
> v2: rebased
>
Acked-by: Gleb Natapov <gleb@redhat.com>
--
Gleb.
* Re: [PATCH v3 0/7] Streamline arithmetic instruction emulation
2013-01-04 14:18 [PATCH v3 0/7] Streamline arithmetic instruction emulation Avi Kivity
` (7 preceding siblings ...)
2013-01-08 11:38 ` [PATCH v3 0/7] Streamline arithmetic instruction emulation Gleb Natapov
@ 2013-01-10 17:33 ` Marcelo Tosatti
8 siblings, 0 replies; 10+ messages in thread
From: Marcelo Tosatti @ 2013-01-10 17:33 UTC
To: Avi Kivity; +Cc: Gleb Natapov, kvm
On Fri, Jan 04, 2013 at 04:18:47PM +0200, Avi Kivity wrote:
> The current arithmetic instruction emulation is fairly clumsy: after
> decode, each instruction gets a switch (size), and for every size
> we fetch the operands, prepare flags, emulate the instruction, then store
> back the flags and operands.
>
> This patchset simplifies things by moving everything into common code
> except the instruction itself. All the pre- and post- processing is
> coded just once. The per-instruction code looks like:
>
> add %bl, %al
> ret
>
> add %bx, %ax
> ret
>
> add %ebx, %eax
> ret
>
> add %rbx, %rax
> ret
>
> The savings in size, for the ten instructions converted in this patchset,
> are fairly large:
>
> text data bss dec hex filename
> 63724 0 0 63724 f8ec arch/x86/kvm/emulate.o.before
> 61268 0 0 61268 ef54 arch/x86/kvm/emulate.o.after
>
> - around 2500 bytes.
>
> v3: fix reversed operand order in 2-operand macro
>
> v2: rebased
>
> Avi Kivity (7):
> KVM: x86 emulator: framework for streamlining arithmetic opcodes
> KVM: x86 emulator: Support for declaring single operand fastops
> KVM: x86 emulator: introduce NoWrite flag
> KVM: x86 emulator: mark CMP, CMPS, SCAS, TEST as NoWrite
> KVM: x86 emulator: convert NOT, NEG to fastop
> KVM: x86 emulator: add macros for defining 2-operand fastop emulation
> KVM: x86 emulator: convert basic ALU ops to fastop
>
> arch/x86/kvm/emulate.c | 215 +++++++++++++++++++++++++++----------------------
> 1 file changed, 120 insertions(+), 95 deletions(-)
Applied, thanks.