* [PATCH 0/3] x86emul: further misc small adjustments
@ 2016-08-16 9:30 Jan Beulich
2016-08-16 9:32 ` [PATCH 1/3] x86emul: use DstEax also for {, I}{MUL, DIV} Jan Beulich
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Jan Beulich @ 2016-08-16 9:30 UTC (permalink / raw)
To: xen-devel; +Cc: Andrew Cooper
Along the lines of and on top of
https://lists.xenproject.org/archives/html/xen-devel/2016-08/msg01852.html
here are three more.
1: use DstEax also for {,I}{MUL,DIV}
2: don't open code EFLAGS handling for 2-operand IMUL
3: re-order main 2-byte opcode switch() statement
Signed-off-by: Jan Beulich <jbeulich@suse.com>
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/3] x86emul: use DstEax also for {, I}{MUL, DIV}
2016-08-16 9:30 [PATCH 0/3] x86emul: further misc small adjustments Jan Beulich
@ 2016-08-16 9:32 ` Jan Beulich
2016-08-16 14:08 ` Andrew Cooper
2016-08-16 9:33 ` [PATCH 2/3] x86emul: don't open code EFLAGS handling for 2-operand IMUL Jan Beulich
2016-08-16 9:34 ` [PATCH 3/3] x86emul: re-order main 2-byte opcode switch() statement Jan Beulich
2 siblings, 1 reply; 12+ messages in thread
From: Jan Beulich @ 2016-08-16 9:32 UTC (permalink / raw)
To: xen-devel; +Cc: Andrew Cooper
[-- Attachment #1: Type: text/plain, Size: 2248 bytes --]
Just like said in commit c0bc0adf24 ("x86emul: use DstEax where
possible"): While it avoids just a few instructions, we should
nevertheless make use of generic code as much as possible.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1738,7 +1738,7 @@ x86_emulate(
case 5: /* imul */
case 6: /* div */
case 7: /* idiv */
- d = (d & (ByteOp | ModRM)) | DstImplicit | SrcMem;
+ d = (d & (ByteOp | ModRM)) | DstEax | SrcMem;
break;
}
break;
@@ -3554,11 +3554,8 @@ x86_emulate(
emulate_1op("neg", dst, _regs.eflags);
break;
case 4: /* mul */
- dst.type = OP_REG;
- dst.reg = (unsigned long *)&_regs.eax;
- dst.val = *dst.reg;
_regs.eflags &= ~(EFLG_OF|EFLG_CF);
- switch ( dst.bytes = src.bytes )
+ switch ( dst.bytes )
{
case 1:
dst.val = (uint8_t)dst.val;
@@ -3594,10 +3591,6 @@ x86_emulate(
}
break;
case 5: /* imul */
- dst.type = OP_REG;
- dst.reg = (unsigned long *)&_regs.eax;
- dst.val = *dst.reg;
- dst.bytes = src.bytes;
imul:
_regs.eflags &= ~(EFLG_OF|EFLG_CF);
switch ( dst.bytes )
@@ -3639,9 +3632,7 @@ x86_emulate(
}
break;
case 6: /* div */
- dst.type = OP_REG;
- dst.reg = (unsigned long *)&_regs.eax;
- switch ( dst.bytes = src.bytes )
+ switch ( src.bytes )
{
case 1:
u[0] = (uint16_t)_regs.eax;
@@ -3686,9 +3677,7 @@ x86_emulate(
}
break;
case 7: /* idiv */
- dst.type = OP_REG;
- dst.reg = (unsigned long *)&_regs.eax;
- switch ( dst.bytes = src.bytes )
+ switch ( src.bytes )
{
case 1:
u[0] = (int16_t)_regs.eax;
[-- Attachment #2: x86emul-use-DstEax-mul-div.patch --]
[-- Type: text/plain, Size: 2288 bytes --]
x86emul: use DstEax also for {,I}{MUL,DIV}
Just like said in commit c0bc0adf24 ("x86emul: use DstEax where
possible"): While it avoids just a few instructions, we should
nevertheless make use of generic code as much as possible.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1738,7 +1738,7 @@ x86_emulate(
case 5: /* imul */
case 6: /* div */
case 7: /* idiv */
- d = (d & (ByteOp | ModRM)) | DstImplicit | SrcMem;
+ d = (d & (ByteOp | ModRM)) | DstEax | SrcMem;
break;
}
break;
@@ -3554,11 +3554,8 @@ x86_emulate(
emulate_1op("neg", dst, _regs.eflags);
break;
case 4: /* mul */
- dst.type = OP_REG;
- dst.reg = (unsigned long *)&_regs.eax;
- dst.val = *dst.reg;
_regs.eflags &= ~(EFLG_OF|EFLG_CF);
- switch ( dst.bytes = src.bytes )
+ switch ( dst.bytes )
{
case 1:
dst.val = (uint8_t)dst.val;
@@ -3594,10 +3591,6 @@ x86_emulate(
}
break;
case 5: /* imul */
- dst.type = OP_REG;
- dst.reg = (unsigned long *)&_regs.eax;
- dst.val = *dst.reg;
- dst.bytes = src.bytes;
imul:
_regs.eflags &= ~(EFLG_OF|EFLG_CF);
switch ( dst.bytes )
@@ -3639,9 +3632,7 @@ x86_emulate(
}
break;
case 6: /* div */
- dst.type = OP_REG;
- dst.reg = (unsigned long *)&_regs.eax;
- switch ( dst.bytes = src.bytes )
+ switch ( src.bytes )
{
case 1:
u[0] = (uint16_t)_regs.eax;
@@ -3686,9 +3677,7 @@ x86_emulate(
}
break;
case 7: /* idiv */
- dst.type = OP_REG;
- dst.reg = (unsigned long *)&_regs.eax;
- switch ( dst.bytes = src.bytes )
+ switch ( src.bytes )
{
case 1:
u[0] = (int16_t)_regs.eax;
[-- Attachment #3: Type: text/plain, Size: 127 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 2/3] x86emul: don't open code EFLAGS handling for 2-operand IMUL
2016-08-16 9:30 [PATCH 0/3] x86emul: further misc small adjustments Jan Beulich
2016-08-16 9:32 ` [PATCH 1/3] x86emul: use DstEax also for {, I}{MUL, DIV} Jan Beulich
@ 2016-08-16 9:33 ` Jan Beulich
2016-08-16 14:20 ` Andrew Cooper
2016-08-16 9:34 ` [PATCH 3/3] x86emul: re-order main 2-byte opcode switch() statement Jan Beulich
2 siblings, 1 reply; 12+ messages in thread
From: Jan Beulich @ 2016-08-16 9:33 UTC (permalink / raw)
To: xen-devel; +Cc: Andrew Cooper
[-- Attachment #1: Type: text/plain, Size: 7363 bytes --]
Slightly extending the emulate_2op*() macro machinery makes it usable
for IMUL r,r/m too, which has the benefit of smaller source code and
the EFLAGS output being guaranteed to match actual hardware behavior.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -467,28 +467,29 @@ typedef union {
"orl %"_LO32 _tmp",%"_LO32 _sav"; "
/* Raw emulation: instruction has two explicit operands. */
-#define __emulate_2op_nobyte(_op,_src,_dst,_eflags,_wx,_wy,_lx,_ly,_qx,_qy)\
+#define __emulate_2op_nobyte(_op,_src,_dst,_eflags, wsx,wsy,wdx,wdy, \
+ lsx,lsy,ldx,ldy, qsx,qsy,qdx,qdy) \
do{ unsigned long _tmp; \
switch ( (_dst).bytes ) \
{ \
case 2: \
asm volatile ( \
_PRE_EFLAGS("0","4","2") \
- _op"w %"_wx"3,%1; " \
+ _op"w %"wsx"3,%"wdx"1; " \
_POST_EFLAGS("0","4","2") \
- : "+g" (_eflags), "+m" ((_dst).val), "=&r" (_tmp) \
- : _wy ((_src).val), "i" (EFLAGS_MASK) ); \
+ : "+g" (_eflags), "+" wdy ((_dst).val), "=&r" (_tmp) \
+ : wsy ((_src).val), "i" (EFLAGS_MASK) ); \
break; \
case 4: \
asm volatile ( \
_PRE_EFLAGS("0","4","2") \
- _op"l %"_lx"3,%1; " \
+ _op"l %"lsx"3,%"ldx"1; " \
_POST_EFLAGS("0","4","2") \
- : "+g" (_eflags), "+m" ((_dst).val), "=&r" (_tmp) \
- : _ly ((_src).val), "i" (EFLAGS_MASK) ); \
+ : "+g" (_eflags), "+" ldy ((_dst).val), "=&r" (_tmp) \
+ : lsy ((_src).val), "i" (EFLAGS_MASK) ); \
break; \
case 8: \
- __emulate_2op_8byte(_op, _src, _dst, _eflags, _qx, _qy); \
+ __emulate_2op_8byte(_op, _src, _dst, _eflags, qsx, qsy, qdx, qdy); \
break; \
} \
} while (0)
@@ -505,7 +506,8 @@ do{ unsigned long _tmp;
: _by ((_src).val), "i" (EFLAGS_MASK) ); \
break; \
default: \
- __emulate_2op_nobyte(_op,_src,_dst,_eflags,_wx,_wy,_lx,_ly,_qx,_qy);\
+ __emulate_2op_nobyte(_op,_src,_dst,_eflags, _wx,_wy,"","m", \
+ _lx,_ly,"","m", _qx,_qy,"","m"); \
break; \
} \
} while (0)
@@ -519,8 +521,12 @@ do{ unsigned long _tmp;
"b", "q", "w", "r", _LO32, "r", "", "r")
/* Source operand is word, long or quad sized. */
#define emulate_2op_SrcV_nobyte(_op, _src, _dst, _eflags) \
- __emulate_2op_nobyte(_op, _src, _dst, _eflags, \
- "w", "r", _LO32, "r", "", "r")
+ __emulate_2op_nobyte(_op, _src, _dst, _eflags, "w", "r", "", "m", \
+ _LO32, "r", "", "m", "", "r", "", "m")
+/* Operands are word, long or quad sized and source may be in memory. */
+#define emulate_2op_SrcV_srcmem(_op, _src, _dst, _eflags) \
+ __emulate_2op_nobyte(_op, _src, _dst, _eflags, "", "m", "w", "r", \
+ "", "m", _LO32, "r", "", "m", "", "r")
/* Instruction has only one explicit operand (no source operand). */
#define emulate_1op(_op,_dst,_eflags) \
@@ -559,13 +565,13 @@ do{ unsigned long _tmp;
/* Emulate an instruction with quadword operands (x86/64 only). */
#if defined(__x86_64__)
-#define __emulate_2op_8byte(_op, _src, _dst, _eflags, _qx, _qy) \
+#define __emulate_2op_8byte(_op, _src, _dst, _eflags, qsx, qsy, qdx, qdy) \
do{ asm volatile ( \
_PRE_EFLAGS("0","4","2") \
- _op"q %"_qx"3,%1; " \
+ _op"q %"qsx"3,%"qdx"1; " \
_POST_EFLAGS("0","4","2") \
- : "+g" (_eflags), "+m" ((_dst).val), "=&r" (_tmp) \
- : _qy ((_src).val), "i" (EFLAGS_MASK) ); \
+ : "+g" (_eflags), "+" qdy ((_dst).val), "=&r" (_tmp) \
+ : qsy ((_src).val), "i" (EFLAGS_MASK) ); \
} while (0)
#define __emulate_1op_8byte(_op, _dst, _eflags) \
do{ asm volatile ( \
@@ -576,7 +582,7 @@ do{ asm volatile (
: "i" (EFLAGS_MASK) ); \
} while (0)
#elif defined(__i386__)
-#define __emulate_2op_8byte(_op, _src, _dst, _eflags, _qx, _qy)
+#define __emulate_2op_8byte(_op, _src, _dst, _eflags, qsx, qsy, qdx, qdy)
#define __emulate_1op_8byte(_op, _dst, _eflags)
#endif /* __i386__ */
@@ -4575,31 +4581,7 @@ x86_emulate(
break;
case 0xaf: /* imul */
- _regs.eflags &= ~(EFLG_OF|EFLG_CF);
- switch ( dst.bytes )
- {
- case 2:
- dst.val = ((uint32_t)(int16_t)src.val *
- (uint32_t)(int16_t)dst.val);
- if ( (int16_t)dst.val != (uint32_t)dst.val )
- _regs.eflags |= EFLG_OF|EFLG_CF;
- break;
-#ifdef __x86_64__
- case 4:
- dst.val = ((uint64_t)(int32_t)src.val *
- (uint64_t)(int32_t)dst.val);
- if ( (int32_t)dst.val != dst.val )
- _regs.eflags |= EFLG_OF|EFLG_CF;
- break;
-#endif
- default: {
- unsigned long m[2] = { src.val, dst.val };
- if ( imul_dbl(m) )
- _regs.eflags |= EFLG_OF|EFLG_CF;
- dst.val = m[0];
- break;
- }
- }
+ emulate_2op_SrcV_srcmem("imul", src, dst, _regs.eflags);
break;
case 0xb2: /* lss */
[-- Attachment #2: x86emul-imul-2op.patch --]
[-- Type: text/plain, Size: 7422 bytes --]
x86emul: don't open code EFLAGS handling for 2-operand IMUL
Slightly extending the emulate_2op*() macro machinery makes it usable
for IMUL r,r/m too, which has the benefit of smaller source code and
the EFLAGS output being guaranteed to match actual hardware behavior.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -467,28 +467,29 @@ typedef union {
"orl %"_LO32 _tmp",%"_LO32 _sav"; "
/* Raw emulation: instruction has two explicit operands. */
-#define __emulate_2op_nobyte(_op,_src,_dst,_eflags,_wx,_wy,_lx,_ly,_qx,_qy)\
+#define __emulate_2op_nobyte(_op,_src,_dst,_eflags, wsx,wsy,wdx,wdy, \
+ lsx,lsy,ldx,ldy, qsx,qsy,qdx,qdy) \
do{ unsigned long _tmp; \
switch ( (_dst).bytes ) \
{ \
case 2: \
asm volatile ( \
_PRE_EFLAGS("0","4","2") \
- _op"w %"_wx"3,%1; " \
+ _op"w %"wsx"3,%"wdx"1; " \
_POST_EFLAGS("0","4","2") \
- : "+g" (_eflags), "+m" ((_dst).val), "=&r" (_tmp) \
- : _wy ((_src).val), "i" (EFLAGS_MASK) ); \
+ : "+g" (_eflags), "+" wdy ((_dst).val), "=&r" (_tmp) \
+ : wsy ((_src).val), "i" (EFLAGS_MASK) ); \
break; \
case 4: \
asm volatile ( \
_PRE_EFLAGS("0","4","2") \
- _op"l %"_lx"3,%1; " \
+ _op"l %"lsx"3,%"ldx"1; " \
_POST_EFLAGS("0","4","2") \
- : "+g" (_eflags), "+m" ((_dst).val), "=&r" (_tmp) \
- : _ly ((_src).val), "i" (EFLAGS_MASK) ); \
+ : "+g" (_eflags), "+" ldy ((_dst).val), "=&r" (_tmp) \
+ : lsy ((_src).val), "i" (EFLAGS_MASK) ); \
break; \
case 8: \
- __emulate_2op_8byte(_op, _src, _dst, _eflags, _qx, _qy); \
+ __emulate_2op_8byte(_op, _src, _dst, _eflags, qsx, qsy, qdx, qdy); \
break; \
} \
} while (0)
@@ -505,7 +506,8 @@ do{ unsigned long _tmp;
: _by ((_src).val), "i" (EFLAGS_MASK) ); \
break; \
default: \
- __emulate_2op_nobyte(_op,_src,_dst,_eflags,_wx,_wy,_lx,_ly,_qx,_qy);\
+ __emulate_2op_nobyte(_op,_src,_dst,_eflags, _wx,_wy,"","m", \
+ _lx,_ly,"","m", _qx,_qy,"","m"); \
break; \
} \
} while (0)
@@ -519,8 +521,12 @@ do{ unsigned long _tmp;
"b", "q", "w", "r", _LO32, "r", "", "r")
/* Source operand is word, long or quad sized. */
#define emulate_2op_SrcV_nobyte(_op, _src, _dst, _eflags) \
- __emulate_2op_nobyte(_op, _src, _dst, _eflags, \
- "w", "r", _LO32, "r", "", "r")
+ __emulate_2op_nobyte(_op, _src, _dst, _eflags, "w", "r", "", "m", \
+ _LO32, "r", "", "m", "", "r", "", "m")
+/* Operands are word, long or quad sized and source may be in memory. */
+#define emulate_2op_SrcV_srcmem(_op, _src, _dst, _eflags) \
+ __emulate_2op_nobyte(_op, _src, _dst, _eflags, "", "m", "w", "r", \
+ "", "m", _LO32, "r", "", "m", "", "r")
/* Instruction has only one explicit operand (no source operand). */
#define emulate_1op(_op,_dst,_eflags) \
@@ -559,13 +565,13 @@ do{ unsigned long _tmp;
/* Emulate an instruction with quadword operands (x86/64 only). */
#if defined(__x86_64__)
-#define __emulate_2op_8byte(_op, _src, _dst, _eflags, _qx, _qy) \
+#define __emulate_2op_8byte(_op, _src, _dst, _eflags, qsx, qsy, qdx, qdy) \
do{ asm volatile ( \
_PRE_EFLAGS("0","4","2") \
- _op"q %"_qx"3,%1; " \
+ _op"q %"qsx"3,%"qdx"1; " \
_POST_EFLAGS("0","4","2") \
- : "+g" (_eflags), "+m" ((_dst).val), "=&r" (_tmp) \
- : _qy ((_src).val), "i" (EFLAGS_MASK) ); \
+ : "+g" (_eflags), "+" qdy ((_dst).val), "=&r" (_tmp) \
+ : qsy ((_src).val), "i" (EFLAGS_MASK) ); \
} while (0)
#define __emulate_1op_8byte(_op, _dst, _eflags) \
do{ asm volatile ( \
@@ -576,7 +582,7 @@ do{ asm volatile (
: "i" (EFLAGS_MASK) ); \
} while (0)
#elif defined(__i386__)
-#define __emulate_2op_8byte(_op, _src, _dst, _eflags, _qx, _qy)
+#define __emulate_2op_8byte(_op, _src, _dst, _eflags, qsx, qsy, qdx, qdy)
#define __emulate_1op_8byte(_op, _dst, _eflags)
#endif /* __i386__ */
@@ -4575,31 +4581,7 @@ x86_emulate(
break;
case 0xaf: /* imul */
- _regs.eflags &= ~(EFLG_OF|EFLG_CF);
- switch ( dst.bytes )
- {
- case 2:
- dst.val = ((uint32_t)(int16_t)src.val *
- (uint32_t)(int16_t)dst.val);
- if ( (int16_t)dst.val != (uint32_t)dst.val )
- _regs.eflags |= EFLG_OF|EFLG_CF;
- break;
-#ifdef __x86_64__
- case 4:
- dst.val = ((uint64_t)(int32_t)src.val *
- (uint64_t)(int32_t)dst.val);
- if ( (int32_t)dst.val != dst.val )
- _regs.eflags |= EFLG_OF|EFLG_CF;
- break;
-#endif
- default: {
- unsigned long m[2] = { src.val, dst.val };
- if ( imul_dbl(m) )
- _regs.eflags |= EFLG_OF|EFLG_CF;
- dst.val = m[0];
- break;
- }
- }
+ emulate_2op_SrcV_srcmem("imul", src, dst, _regs.eflags);
break;
case 0xb2: /* lss */
[-- Attachment #3: Type: text/plain, Size: 127 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 3/3] x86emul: re-order main 2-byte opcode switch() statement
2016-08-16 9:30 [PATCH 0/3] x86emul: further misc small adjustments Jan Beulich
2016-08-16 9:32 ` [PATCH 1/3] x86emul: use DstEax also for {, I}{MUL, DIV} Jan Beulich
2016-08-16 9:33 ` [PATCH 2/3] x86emul: don't open code EFLAGS handling for 2-operand IMUL Jan Beulich
@ 2016-08-16 9:34 ` Jan Beulich
2016-08-16 14:21 ` Andrew Cooper
2 siblings, 1 reply; 12+ messages in thread
From: Jan Beulich @ 2016-08-16 9:34 UTC (permalink / raw)
To: xen-devel; +Cc: Andrew Cooper
[-- Attachment #1: Type: text/plain, Size: 4297 bytes --]
This was meant to be numerically sorted (with reasonable exceptions),
but we've manage to diverge from that.
No functional change, only code movement.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -4488,33 +4488,6 @@ x86_emulate(
break;
}
- case 0xa8: /* push %%gs */
- src.val = x86_seg_gs;
- goto push_seg;
-
- case 0xa9: /* pop %%gs */
- src.val = x86_seg_gs;
- goto pop_seg;
-
- case 0xb0 ... 0xb1: /* cmpxchg */
- /* Save real source value, then compare EAX against destination. */
- src.orig_val = src.val;
- src.val = _regs.eax;
- /* cmp: %%eax - dst ==> dst and src swapped for macro invocation */
- emulate_2op_SrcV("cmp", dst, src, _regs.eflags);
- if ( _regs.eflags & EFLG_ZF )
- {
- /* Success: write back to memory. */
- dst.val = src.orig_val;
- }
- else
- {
- /* Failure: write the value we saw to EAX. */
- dst.type = OP_REG;
- dst.reg = (unsigned long *)&_regs.eax;
- }
- break;
-
case 0xa3: bt: /* bt */
emulate_2op_SrcV_nobyte("bt", src, dst, _regs.eflags);
dst.type = OP_NONE;
@@ -4557,9 +4530,13 @@ x86_emulate(
break;
}
- case 0xb3: btr: /* btr */
- emulate_2op_SrcV_nobyte("btr", src, dst, _regs.eflags);
- break;
+ case 0xa8: /* push %%gs */
+ src.val = x86_seg_gs;
+ goto push_seg;
+
+ case 0xa9: /* pop %%gs */
+ src.val = x86_seg_gs;
+ goto pop_seg;
case 0xab: bts: /* bts */
emulate_2op_SrcV_nobyte("bts", src, dst, _regs.eflags);
@@ -4584,10 +4561,33 @@ x86_emulate(
emulate_2op_SrcV_srcmem("imul", src, dst, _regs.eflags);
break;
+ case 0xb0 ... 0xb1: /* cmpxchg */
+ /* Save real source value, then compare EAX against destination. */
+ src.orig_val = src.val;
+ src.val = _regs.eax;
+ /* cmp: %%eax - dst ==> dst and src swapped for macro invocation */
+ emulate_2op_SrcV("cmp", dst, src, _regs.eflags);
+ if ( _regs.eflags & EFLG_ZF )
+ {
+ /* Success: write back to memory. */
+ dst.val = src.orig_val;
+ }
+ else
+ {
+ /* Failure: write the value we saw to EAX. */
+ dst.type = OP_REG;
+ dst.reg = (unsigned long *)&_regs.eax;
+ }
+ break;
+
case 0xb2: /* lss */
dst.val = x86_seg_ss;
goto les;
+ case 0xb3: btr: /* btr */
+ emulate_2op_SrcV_nobyte("btr", src, dst, _regs.eflags);
+ break;
+
case 0xb4: /* lfs */
dst.val = x86_seg_fs;
goto les;
@@ -4603,6 +4603,25 @@ x86_emulate(
dst.val = (uint8_t)src.val;
break;
+ case 0xb7: /* movzx rm16,r{16,32,64} */
+ dst.val = (uint16_t)src.val;
+ break;
+
+ case 0xba: /* Grp8 */
+ switch ( modrm_reg & 7 )
+ {
+ case 4: goto bt;
+ case 5: goto bts;
+ case 6: goto btr;
+ case 7: goto btc;
+ default: generate_exception_if(1, EXC_UD, -1);
+ }
+ break;
+
+ case 0xbb: btc: /* btc */
+ emulate_2op_SrcV_nobyte("btc", src, dst, _regs.eflags);
+ break;
+
case 0xbc: /* bsf or tzcnt */ {
bool_t zf;
@@ -4671,25 +4690,6 @@ x86_emulate(
break;
}
- case 0xb7: /* movzx rm16,r{16,32,64} */
- dst.val = (uint16_t)src.val;
- break;
-
- case 0xbb: btc: /* btc */
- emulate_2op_SrcV_nobyte("btc", src, dst, _regs.eflags);
- break;
-
- case 0xba: /* Grp8 */
- switch ( modrm_reg & 7 )
- {
- case 4: goto bt;
- case 5: goto bts;
- case 6: goto btr;
- case 7: goto btc;
- default: generate_exception_if(1, EXC_UD, -1);
- }
- break;
-
case 0xbe: /* movsx rm8,r{16,32,64} */
/* Recompute DstReg as we may have decoded AH/BH/CH/DH. */
dst.reg = decode_register(modrm_reg, &_regs, 0);
[-- Attachment #2: x86emul-twobyte-reorder.patch --]
[-- Type: text/plain, Size: 4352 bytes --]
x86emul: re-order main 2-byte opcode switch() statement
This was meant to be numerically sorted (with reasonable exceptions),
but we've manage to diverge from that.
No functional change, only code movement.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -4488,33 +4488,6 @@ x86_emulate(
break;
}
- case 0xa8: /* push %%gs */
- src.val = x86_seg_gs;
- goto push_seg;
-
- case 0xa9: /* pop %%gs */
- src.val = x86_seg_gs;
- goto pop_seg;
-
- case 0xb0 ... 0xb1: /* cmpxchg */
- /* Save real source value, then compare EAX against destination. */
- src.orig_val = src.val;
- src.val = _regs.eax;
- /* cmp: %%eax - dst ==> dst and src swapped for macro invocation */
- emulate_2op_SrcV("cmp", dst, src, _regs.eflags);
- if ( _regs.eflags & EFLG_ZF )
- {
- /* Success: write back to memory. */
- dst.val = src.orig_val;
- }
- else
- {
- /* Failure: write the value we saw to EAX. */
- dst.type = OP_REG;
- dst.reg = (unsigned long *)&_regs.eax;
- }
- break;
-
case 0xa3: bt: /* bt */
emulate_2op_SrcV_nobyte("bt", src, dst, _regs.eflags);
dst.type = OP_NONE;
@@ -4557,9 +4530,13 @@ x86_emulate(
break;
}
- case 0xb3: btr: /* btr */
- emulate_2op_SrcV_nobyte("btr", src, dst, _regs.eflags);
- break;
+ case 0xa8: /* push %%gs */
+ src.val = x86_seg_gs;
+ goto push_seg;
+
+ case 0xa9: /* pop %%gs */
+ src.val = x86_seg_gs;
+ goto pop_seg;
case 0xab: bts: /* bts */
emulate_2op_SrcV_nobyte("bts", src, dst, _regs.eflags);
@@ -4584,10 +4561,33 @@ x86_emulate(
emulate_2op_SrcV_srcmem("imul", src, dst, _regs.eflags);
break;
+ case 0xb0 ... 0xb1: /* cmpxchg */
+ /* Save real source value, then compare EAX against destination. */
+ src.orig_val = src.val;
+ src.val = _regs.eax;
+ /* cmp: %%eax - dst ==> dst and src swapped for macro invocation */
+ emulate_2op_SrcV("cmp", dst, src, _regs.eflags);
+ if ( _regs.eflags & EFLG_ZF )
+ {
+ /* Success: write back to memory. */
+ dst.val = src.orig_val;
+ }
+ else
+ {
+ /* Failure: write the value we saw to EAX. */
+ dst.type = OP_REG;
+ dst.reg = (unsigned long *)&_regs.eax;
+ }
+ break;
+
case 0xb2: /* lss */
dst.val = x86_seg_ss;
goto les;
+ case 0xb3: btr: /* btr */
+ emulate_2op_SrcV_nobyte("btr", src, dst, _regs.eflags);
+ break;
+
case 0xb4: /* lfs */
dst.val = x86_seg_fs;
goto les;
@@ -4603,6 +4603,25 @@ x86_emulate(
dst.val = (uint8_t)src.val;
break;
+ case 0xb7: /* movzx rm16,r{16,32,64} */
+ dst.val = (uint16_t)src.val;
+ break;
+
+ case 0xba: /* Grp8 */
+ switch ( modrm_reg & 7 )
+ {
+ case 4: goto bt;
+ case 5: goto bts;
+ case 6: goto btr;
+ case 7: goto btc;
+ default: generate_exception_if(1, EXC_UD, -1);
+ }
+ break;
+
+ case 0xbb: btc: /* btc */
+ emulate_2op_SrcV_nobyte("btc", src, dst, _regs.eflags);
+ break;
+
case 0xbc: /* bsf or tzcnt */ {
bool_t zf;
@@ -4671,25 +4690,6 @@ x86_emulate(
break;
}
- case 0xb7: /* movzx rm16,r{16,32,64} */
- dst.val = (uint16_t)src.val;
- break;
-
- case 0xbb: btc: /* btc */
- emulate_2op_SrcV_nobyte("btc", src, dst, _regs.eflags);
- break;
-
- case 0xba: /* Grp8 */
- switch ( modrm_reg & 7 )
- {
- case 4: goto bt;
- case 5: goto bts;
- case 6: goto btr;
- case 7: goto btc;
- default: generate_exception_if(1, EXC_UD, -1);
- }
- break;
-
case 0xbe: /* movsx rm8,r{16,32,64} */
/* Recompute DstReg as we may have decoded AH/BH/CH/DH. */
dst.reg = decode_register(modrm_reg, &_regs, 0);
[-- Attachment #3: Type: text/plain, Size: 127 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/3] x86emul: use DstEax also for {, I}{MUL, DIV}
2016-08-16 9:32 ` [PATCH 1/3] x86emul: use DstEax also for {, I}{MUL, DIV} Jan Beulich
@ 2016-08-16 14:08 ` Andrew Cooper
2016-08-16 14:57 ` Jan Beulich
0 siblings, 1 reply; 12+ messages in thread
From: Andrew Cooper @ 2016-08-16 14:08 UTC (permalink / raw)
To: Jan Beulich, xen-devel
On 16/08/16 10:32, Jan Beulich wrote:
> Just like said in commit c0bc0adf24 ("x86emul: use DstEax where
> possible"): While it avoids just a few instructions, we should
> nevertheless make use of generic code as much as possible.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
This does reduce the amount of code, but it isn't strictly true. The
mul and div instructions are DstEaxEdx, as are a number of other
instructions.
We shouldn't end up with special casing the eax part because we have an
easy literal for it, but leaving the edx hard coded because that is
easier to express in the current code.
~Andrew
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/3] x86emul: don't open code EFLAGS handling for 2-operand IMUL
2016-08-16 9:33 ` [PATCH 2/3] x86emul: don't open code EFLAGS handling for 2-operand IMUL Jan Beulich
@ 2016-08-16 14:20 ` Andrew Cooper
0 siblings, 0 replies; 12+ messages in thread
From: Andrew Cooper @ 2016-08-16 14:20 UTC (permalink / raw)
To: Jan Beulich, xen-devel
[-- Attachment #1.1: Type: text/plain, Size: 464 bytes --]
On 16/08/16 10:33, Jan Beulich wrote:
> Slightly extending the emulate_2op*() macro machinery makes it usable
> for IMUL r,r/m too, which has the benefit of smaller source code and
> the EFLAGS output being guaranteed to match actual hardware behavior.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> (I think -
reviewing this just re-enforces my view of how horribly opaque this bit
of the emulator is)
[-- Attachment #1.2: Type: text/html, Size: 1023 bytes --]
[-- Attachment #2: Type: text/plain, Size: 127 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 3/3] x86emul: re-order main 2-byte opcode switch() statement
2016-08-16 9:34 ` [PATCH 3/3] x86emul: re-order main 2-byte opcode switch() statement Jan Beulich
@ 2016-08-16 14:21 ` Andrew Cooper
0 siblings, 0 replies; 12+ messages in thread
From: Andrew Cooper @ 2016-08-16 14:21 UTC (permalink / raw)
To: Jan Beulich, xen-devel
[-- Attachment #1.1: Type: text/plain, Size: 304 bytes --]
On 16/08/16 10:34, Jan Beulich wrote:
> This was meant to be numerically sorted (with reasonable exceptions),
> but we've manage to diverge from that.
>
> No functional change, only code movement.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
[-- Attachment #1.2: Type: text/html, Size: 855 bytes --]
[-- Attachment #2: Type: text/plain, Size: 127 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/3] x86emul: use DstEax also for {, I}{MUL, DIV}
2016-08-16 14:08 ` Andrew Cooper
@ 2016-08-16 14:57 ` Jan Beulich
2016-08-16 15:11 ` Andrew Cooper
0 siblings, 1 reply; 12+ messages in thread
From: Jan Beulich @ 2016-08-16 14:57 UTC (permalink / raw)
To: Andrew Cooper; +Cc: xen-devel
>>> On 16.08.16 at 16:08, <andrew.cooper3@citrix.com> wrote:
> On 16/08/16 10:32, Jan Beulich wrote:
>> Just like said in commit c0bc0adf24 ("x86emul: use DstEax where
>> possible"): While it avoids just a few instructions, we should
>> nevertheless make use of generic code as much as possible.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> This does reduce the amount of code, but it isn't strictly true. The
> mul and div instructions are DstEaxEdx, as are a number of other
> instructions.
>
> We shouldn't end up with special casing the eax part because we have an
> easy literal for it, but leaving the edx hard coded because that is
> easier to express in the current code.
I think the code reduction is nevertheless worth it, and reduction
here can only help readability imo. Would you be okay if I added
a comment to the place where the DstEax gets set here? (Note
that DstEdxEax wouldn't be true for 8-bit operations, so I'd rather
not use this as another alias or even a completely new operand
kind description. And please also remember that the tables don't
express all operands in all cases anyway - just consider
SHLD/SHRD.)
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/3] x86emul: use DstEax also for {, I}{MUL, DIV}
2016-08-16 14:57 ` Jan Beulich
@ 2016-08-16 15:11 ` Andrew Cooper
2016-08-16 15:23 ` Jan Beulich
0 siblings, 1 reply; 12+ messages in thread
From: Andrew Cooper @ 2016-08-16 15:11 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel
On 16/08/16 15:57, Jan Beulich wrote:
>>>> On 16.08.16 at 16:08, <andrew.cooper3@citrix.com> wrote:
>> On 16/08/16 10:32, Jan Beulich wrote:
>>> Just like said in commit c0bc0adf24 ("x86emul: use DstEax where
>>> possible"): While it avoids just a few instructions, we should
>>> nevertheless make use of generic code as much as possible.
>>>
>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> This does reduce the amount of code, but it isn't strictly true. The
>> mul and div instructions are DstEaxEdx, as are a number of other
>> instructions.
>>
>> We shouldn't end up with special casing the eax part because we have an
>> easy literal for it, but leaving the edx hard coded because that is
>> easier to express in the current code.
> I think the code reduction is nevertheless worth it, and reduction
> here can only help readability imo. Would you be okay if I added
> a comment to the place where the DstEax gets set here? (Note
> that DstEdxEax wouldn't be true for 8-bit operations, so I'd rather
> not use this as another alias or even a completely new operand
> kind description. And please also remember that the tables don't
> express all operands in all cases anyway - just consider
> SHLD/SHRD.)
The other option would be to use DstNone and explicitly fill in
_regs.eax, which avoids all the code to play with dst, and matches how
rdtsc/rdmsr/wrmsr currently work.
~Andrew
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/3] x86emul: use DstEax also for {, I}{MUL, DIV}
2016-08-16 15:11 ` Andrew Cooper
@ 2016-08-16 15:23 ` Jan Beulich
2016-08-16 16:07 ` Jan Beulich
0 siblings, 1 reply; 12+ messages in thread
From: Jan Beulich @ 2016-08-16 15:23 UTC (permalink / raw)
To: Andrew Cooper; +Cc: xen-devel
>>> On 16.08.16 at 17:11, <andrew.cooper3@citrix.com> wrote:
> On 16/08/16 15:57, Jan Beulich wrote:
>>>>> On 16.08.16 at 16:08, <andrew.cooper3@citrix.com> wrote:
>>> On 16/08/16 10:32, Jan Beulich wrote:
>>>> Just like said in commit c0bc0adf24 ("x86emul: use DstEax where
>>>> possible"): While it avoids just a few instructions, we should
>>>> nevertheless make use of generic code as much as possible.
>>>>
>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>> This does reduce the amount of code, but it isn't strictly true. The
>>> mul and div instructions are DstEaxEdx, as are a number of other
>>> instructions.
>>>
>>> We shouldn't end up with special casing the eax part because we have an
>>> easy literal for it, but leaving the edx hard coded because that is
>>> easier to express in the current code.
>> I think the code reduction is nevertheless worth it, and reduction
>> here can only help readability imo. Would you be okay if I added
>> a comment to the place where the DstEax gets set here? (Note
>> that DstEdxEax wouldn't be true for 8-bit operations, so I'd rather
>> not use this as another alias or even a completely new operand
>> kind description. And please also remember that the tables don't
>> express all operands in all cases anyway - just consider
>> SHLD/SHRD.)
>
> The other option would be to use DstNone and explicitly fill in
> _regs.eax, which avoids all the code to play with dst, and matches how
> rdtsc/rdmsr/wrmsr currently work.
Well, that would be more code, but not less of a lie. Or maybe, if
it we stayed with DstImplicit (as it is without this patch) instead of
making it DstNone. Let me see how that ends up looking.
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/3] x86emul: use DstEax also for {, I}{MUL, DIV}
2016-08-16 15:23 ` Jan Beulich
@ 2016-08-16 16:07 ` Jan Beulich
2016-08-16 16:14 ` Andrew Cooper
0 siblings, 1 reply; 12+ messages in thread
From: Jan Beulich @ 2016-08-16 16:07 UTC (permalink / raw)
To: Andrew Cooper; +Cc: xen-devel
>>> On 16.08.16 at 17:23, <JBeulich@suse.com> wrote:
>>>> On 16.08.16 at 17:11, <andrew.cooper3@citrix.com> wrote:
>> On 16/08/16 15:57, Jan Beulich wrote:
>>>>>> On 16.08.16 at 16:08, <andrew.cooper3@citrix.com> wrote:
>>>> On 16/08/16 10:32, Jan Beulich wrote:
>>>>> Just like said in commit c0bc0adf24 ("x86emul: use DstEax where
>>>>> possible"): While it avoids just a few instructions, we should
>>>>> nevertheless make use of generic code as much as possible.
>>>>>
>>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>>> This does reduce the amount of code, but it isn't strictly true. The
>>>> mul and div instructions are DstEaxEdx, as are a number of other
>>>> instructions.
>>>>
>>>> We shouldn't end up with special casing the eax part because we have an
>>>> easy literal for it, but leaving the edx hard coded because that is
>>>> easier to express in the current code.
>>> I think the code reduction is nevertheless worth it, and reduction
>>> here can only help readability imo. Would you be okay if I added
>>> a comment to the place where the DstEax gets set here? (Note
>>> that DstEdxEax wouldn't be true for 8-bit operations, so I'd rather
>>> not use this as another alias or even a completely new operand
>>> kind description. And please also remember that the tables don't
>>> express all operands in all cases anyway - just consider
>>> SHLD/SHRD.)
>>
>> The other option would be to use DstNone and explicitly fill in
>> _regs.eax, which avoids all the code to play with dst, and matches how
>> rdtsc/rdmsr/wrmsr currently work.
>
> Well, that would be more code, but not less of a lie. Or maybe, if
> it we stayed with DstImplicit (as it is without this patch) instead of
> making it DstNone. Let me see how that ends up looking.
Actually - no, we can't do that: The imul case has other imul cases
funneled into it (via the imul: label), and I wouldn't want the mul,
div, and idiv cases be different from the imul one. So I'd really like
to ask you to reconsider whether the patch in its current form
(perhaps with some comment added) isn't acceptable.
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/3] x86emul: use DstEax also for {, I}{MUL, DIV}
2016-08-16 16:07 ` Jan Beulich
@ 2016-08-16 16:14 ` Andrew Cooper
0 siblings, 0 replies; 12+ messages in thread
From: Andrew Cooper @ 2016-08-16 16:14 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel
On 16/08/16 17:07, Jan Beulich wrote:
>>>> On 16.08.16 at 17:23, <JBeulich@suse.com> wrote:
>>>>> On 16.08.16 at 17:11, <andrew.cooper3@citrix.com> wrote:
>>> On 16/08/16 15:57, Jan Beulich wrote:
>>>>>>> On 16.08.16 at 16:08, <andrew.cooper3@citrix.com> wrote:
>>>>> On 16/08/16 10:32, Jan Beulich wrote:
>>>>>> Just like said in commit c0bc0adf24 ("x86emul: use DstEax where
>>>>>> possible"): While it avoids just a few instructions, we should
>>>>>> nevertheless make use of generic code as much as possible.
>>>>>>
>>>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>>>> This does reduce the amount of code, but it isn't strictly true. The
>>>>> mul and div instructions are DstEaxEdx, as are a number of other
>>>>> instructions.
>>>>>
>>>>> We shouldn't end up with special casing the eax part because we have an
>>>>> easy literal for it, but leaving the edx hard coded because that is
>>>>> easier to express in the current code.
>>>> I think the code reduction is nevertheless worth it, and reduction
>>>> here can only help readability imo. Would you be okay if I added
>>>> a comment to the place where the DstEax gets set here? (Note
>>>> that DstEdxEax wouldn't be true for 8-bit operations, so I'd rather
>>>> not use this as another alias or even a completely new operand
>>>> kind description. And please also remember that the tables don't
>>>> express all operands in all cases anyway - just consider
>>>> SHLD/SHRD.)
>>> The other option would be to use DstNone and explicitly fill in
>>> _regs.eax, which avoids all the code to play with dst, and matches how
>>> rdtsc/rdmsr/wrmsr currently work.
>> Well, that would be more code, but not less of a lie. Or maybe, if
>> it we stayed with DstImplicit (as it is without this patch) instead of
>> making it DstNone. Let me see how that ends up looking.
> Actually - no, we can't do that: The imul case has other imul cases
> funneled into it (via the imul: label), and I wouldn't want the mul,
> div, and idiv cases be different from the imul one. So I'd really like
> to ask you to reconsider whether the patch in its current form
> (perhaps with some comment added) isn't acceptable.
Ok - with a comment, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2016-08-16 16:14 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-08-16 9:30 [PATCH 0/3] x86emul: further misc small adjustments Jan Beulich
2016-08-16 9:32 ` [PATCH 1/3] x86emul: use DstEax also for {, I}{MUL, DIV} Jan Beulich
2016-08-16 14:08 ` Andrew Cooper
2016-08-16 14:57 ` Jan Beulich
2016-08-16 15:11 ` Andrew Cooper
2016-08-16 15:23 ` Jan Beulich
2016-08-16 16:07 ` Jan Beulich
2016-08-16 16:14 ` Andrew Cooper
2016-08-16 9:33 ` [PATCH 2/3] x86emul: don't open code EFLAGS handling for 2-operand IMUL Jan Beulich
2016-08-16 14:20 ` Andrew Cooper
2016-08-16 9:34 ` [PATCH 3/3] x86emul: re-order main 2-byte opcode switch() statement Jan Beulich
2016-08-16 14:21 ` Andrew Cooper
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).