* [PATCH 0/3] Improvements to decode_register()
@ 2018-01-26 13:16 Andrew Cooper
2018-01-26 13:16 ` [PATCH 1/3] x86/emul: Optimise decode_register() somewhat Andrew Cooper
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Andrew Cooper @ 2018-01-26 13:16 UTC (permalink / raw)
To: Xen-devel; +Cc: Andrew Cooper
Andrew Cooper (3):
x86/emul: Optimise decode_register() somewhat
x86/hvm: Improvements to external users of decode_register()
x86/emul: Improvements to internal users of decode_register()
xen/arch/x86/hvm/hvm.c | 17 +---
xen/arch/x86/hvm/vmx/vvmx.c | 16 +---
xen/arch/x86/x86_emulate/x86_emulate.c | 149 +++++++++++++++++++--------------
xen/arch/x86/x86_emulate/x86_emulate.h | 25 +++++-
4 files changed, 113 insertions(+), 94 deletions(-)
--
2.1.4
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 1/3] x86/emul: Optimise decode_register() somewhat
2018-01-26 13:16 [PATCH 0/3] Improvements to decode_register() Andrew Cooper
@ 2018-01-26 13:16 ` Andrew Cooper
2018-01-29 11:05 ` Jan Beulich
2018-01-26 13:16 ` [PATCH 2/3] x86/hvm: Improvements to external users of decode_register() Andrew Cooper
2018-01-26 13:16 ` [PATCH 3/3] x86/emul: Improvements to internal " Andrew Cooper
2 siblings, 1 reply; 9+ messages in thread
From: Andrew Cooper @ 2018-01-26 13:16 UTC (permalink / raw)
To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich
The positions of GPRs inside struct cpu_user_regs doesn't follow any
particular order, so as compiled, decode_register() becomes a jump table to 16
blocks which calculate the appropriate offset, at a total of 207 bytes.
Instead, pre-compute the offsets at build time and use pointer arithmetic to
calculate the result. The resulting function is far more reasonable:
test %edx,%edx
lea 0xbfb97(%rip),%rax # <cpu_user_regs_high_gpr_offsets>
lea 0xbfba0(%rip),%rdx # <cpu_user_regs_gpr_offsets>
cmove %rdx,%rax
and $0xf,%edi
movzbl (%rax,%rdi,1),%eax
add %rsi,%rax
retq
and by observation, most callers in x86_emulate() inline and
constant-propagate the highbyte_regs value of 0 to drop the test, one lea and
the cmove.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
xen/arch/x86/x86_emulate/x86_emulate.c | 82 ++++++++++++++++++++++++----------
1 file changed, 58 insertions(+), 24 deletions(-)
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index ff0a003..3f5636f 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -396,6 +396,51 @@ static const struct {
/* Shift values between src and dst sizes of pmov{s,z}x{b,w,d}{w,d,q}. */
static const uint8_t pmov_convert_delta[] = { 1, 2, 3, 1, 2, 1 };
+/*
+ * Map GPRs by ModRM encoding to their offset within struct cpu_user_regs.
+ * The AH,CH,DH,BH offsets are misaligned.
+ */
+static const uint8_t cpu_user_regs_gpr_offsets[] = {
+ offsetof(struct cpu_user_regs, r(ax)),
+ offsetof(struct cpu_user_regs, r(cx)),
+ offsetof(struct cpu_user_regs, r(dx)),
+ offsetof(struct cpu_user_regs, r(bx)),
+ offsetof(struct cpu_user_regs, r(sp)),
+ offsetof(struct cpu_user_regs, r(bp)),
+ offsetof(struct cpu_user_regs, r(si)),
+ offsetof(struct cpu_user_regs, r(di)),
+#if defined(__x86_64__)
+ offsetof(struct cpu_user_regs, r8),
+ offsetof(struct cpu_user_regs, r9),
+ offsetof(struct cpu_user_regs, r10),
+ offsetof(struct cpu_user_regs, r11),
+ offsetof(struct cpu_user_regs, r12),
+ offsetof(struct cpu_user_regs, r13),
+ offsetof(struct cpu_user_regs, r14),
+ offsetof(struct cpu_user_regs, r15),
+#endif
+};
+static const uint8_t cpu_user_regs_high_gpr_offsets[] = {
+ offsetof(struct cpu_user_regs, r(ax)),
+ offsetof(struct cpu_user_regs, r(cx)),
+ offsetof(struct cpu_user_regs, r(dx)),
+ offsetof(struct cpu_user_regs, r(bx)),
+ offsetof(struct cpu_user_regs, ah),
+ offsetof(struct cpu_user_regs, ch),
+ offsetof(struct cpu_user_regs, dh),
+ offsetof(struct cpu_user_regs, bh),
+#if defined(__x86_64__)
+ offsetof(struct cpu_user_regs, r8),
+ offsetof(struct cpu_user_regs, r9),
+ offsetof(struct cpu_user_regs, r10),
+ offsetof(struct cpu_user_regs, r11),
+ offsetof(struct cpu_user_regs, r12),
+ offsetof(struct cpu_user_regs, r13),
+ offsetof(struct cpu_user_regs, r14),
+ offsetof(struct cpu_user_regs, r15),
+#endif
+};
+
static const struct {
uint8_t simd_size:5;
uint8_t to_mem:1;
@@ -1939,32 +1984,21 @@ void *
decode_register(
uint8_t modrm_reg, struct cpu_user_regs *regs, int highbyte_regs)
{
- void *p;
+ const uint8_t *offsets = highbyte_regs ? cpu_user_regs_high_gpr_offsets
+ : cpu_user_regs_gpr_offsets;
- switch ( modrm_reg )
- {
- case 0: p = ®s->r(ax); break;
- case 1: p = ®s->r(cx); break;
- case 2: p = ®s->r(dx); break;
- case 3: p = ®s->r(bx); break;
- case 4: p = (highbyte_regs ? ®s->ah : (void *)®s->r(sp)); break;
- case 5: p = (highbyte_regs ? ®s->ch : (void *)®s->r(bp)); break;
- case 6: p = (highbyte_regs ? ®s->dh : (void *)®s->r(si)); break;
- case 7: p = (highbyte_regs ? ®s->bh : (void *)®s->r(di)); break;
-#if defined(__x86_64__)
- case 8: p = ®s->r8; break;
- case 9: p = ®s->r9; break;
- case 10: p = ®s->r10; break;
- case 11: p = ®s->r11; break;
- case 12: p = ®s->r12; break;
- case 13: p = ®s->r13; break;
- case 14: p = ®s->r14; break;
- case 15: p = ®s->r15; break;
-#endif
- default: BUG(); p = NULL; break;
- }
+ /* Check that the arrays are the same size, and a power of two. */
+ BUILD_BUG_ON(ARRAY_SIZE(cpu_user_regs_gpr_offsets) !=
+ ARRAY_SIZE(cpu_user_regs_high_gpr_offsets));
+ BUILD_BUG_ON(ARRAY_SIZE(cpu_user_regs_gpr_offsets) &
+ (ARRAY_SIZE(cpu_user_regs_gpr_offsets) - 1));
+
+ ASSERT(modrm_reg < ARRAY_SIZE(cpu_user_regs_gpr_offsets));
+
+ /* For safety in release builds. Debug builds will hit the ASSERT() */
+ modrm_reg &= ARRAY_SIZE(cpu_user_regs_gpr_offsets) - 1;
- return p;
+ return (void *)regs + offsets[modrm_reg];
}
static void *decode_vex_gpr(unsigned int vex_reg, struct cpu_user_regs *regs,
--
2.1.4
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 2/3] x86/hvm: Improvements to external users of decode_register()
2018-01-26 13:16 [PATCH 0/3] Improvements to decode_register() Andrew Cooper
2018-01-26 13:16 ` [PATCH 1/3] x86/emul: Optimise decode_register() somewhat Andrew Cooper
@ 2018-01-26 13:16 ` Andrew Cooper
2018-01-29 3:11 ` Tian, Kevin
2018-01-29 11:12 ` Jan Beulich
2018-01-26 13:16 ` [PATCH 3/3] x86/emul: Improvements to internal " Andrew Cooper
2 siblings, 2 replies; 9+ messages in thread
From: Andrew Cooper @ 2018-01-26 13:16 UTC (permalink / raw)
To: Xen-devel; +Cc: Andrew Cooper, Kevin Tian, Jun Nakajima, Jan Beulich
* Rename to decode_gpr() to be more specific as to its purpose
* Drop the highbyte encoding handling, as users care, and it unlikely that
future users would care.
* Change to a static inline, returning an unsigned long pointer.
Doing so highlights that the "invalid gpr" paths in hvm_mov_{to,from}_cr()
were actually unreachable. All callers already passed in-range gprs, which
would have hit a BUG() previously.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
Future work: We should have gpr_{read,write}() helpers which suitably
cast/truncate values, because all these callsites have broken corner cases,
but that involves wiring the current operand size thought, which I haven't got
time to do right now.
---
xen/arch/x86/hvm/hvm.c | 17 ++---------------
xen/arch/x86/hvm/vmx/vvmx.c | 16 +++-------------
xen/arch/x86/x86_emulate/x86_emulate.c | 2 +-
xen/arch/x86/x86_emulate/x86_emulate.h | 25 +++++++++++++++++++++----
4 files changed, 27 insertions(+), 33 deletions(-)
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 2689046..59ef754 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2059,16 +2059,9 @@ static void hvm_set_uc_mode(struct vcpu *v, bool_t is_in_uc_mode)
int hvm_mov_to_cr(unsigned int cr, unsigned int gpr)
{
struct vcpu *curr = current;
- unsigned long val, *reg;
+ unsigned long val = *decode_gpr(guest_cpu_user_regs(), gpr);
int rc;
- if ( (reg = decode_register(gpr, guest_cpu_user_regs(), 0)) == NULL )
- {
- gdprintk(XENLOG_ERR, "invalid gpr: %u\n", gpr);
- goto exit_and_crash;
- }
-
- val = *reg;
HVMTRACE_LONG_2D(CR_WRITE, cr, TRC_PAR_LONG(val));
HVM_DBG_LOG(DBG_LEVEL_1, "CR%u, value = %lx", cr, val);
@@ -2109,13 +2102,7 @@ int hvm_mov_to_cr(unsigned int cr, unsigned int gpr)
int hvm_mov_from_cr(unsigned int cr, unsigned int gpr)
{
struct vcpu *curr = current;
- unsigned long val = 0, *reg;
-
- if ( (reg = decode_register(gpr, guest_cpu_user_regs(), 0)) == NULL )
- {
- gdprintk(XENLOG_ERR, "invalid gpr: %u\n", gpr);
- goto exit_and_crash;
- }
+ unsigned long val = 0, *reg = decode_gpr(guest_cpu_user_regs(), gpr);
switch ( cr )
{
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index 885eab3..dfe97b9 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -347,18 +347,14 @@ enum vmx_insn_errno set_vvmcs_real_safe(const struct vcpu *v, u32 encoding,
static unsigned long reg_read(struct cpu_user_regs *regs,
unsigned int index)
{
- unsigned long *pval = decode_register(index, regs, 0);
-
- return *pval;
+ return *decode_gpr(regs, index);
}
static void reg_write(struct cpu_user_regs *regs,
unsigned int index,
unsigned long value)
{
- unsigned long *pval = decode_register(index, regs, 0);
-
- *pval = value;
+ *decode_gpr(regs, index) = value;
}
static inline u32 __n2_pin_exec_control(struct vcpu *v)
@@ -2483,14 +2479,8 @@ int nvmx_n2_vmexit_handler(struct cpu_user_regs *regs,
case VMX_CONTROL_REG_ACCESS_TYPE_MOV_TO_CR:
{
unsigned long gp = VMX_CONTROL_REG_ACCESS_GPR(exit_qualification);
- unsigned long *reg;
+ val = *decode_gpr(guest_cpu_user_regs(), gp);
- if ( (reg = decode_register(gp, guest_cpu_user_regs(), 0)) == NULL )
- {
- gdprintk(XENLOG_ERR, "invalid gpr: %lx\n", gp);
- break;
- }
- val = *reg;
if ( cr == 0 )
{
u64 cr0_gh_mask = get_vvmcs(v, CR0_GUEST_HOST_MASK);
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index 3f5636f..70c38a2 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -400,7 +400,7 @@ static const uint8_t pmov_convert_delta[] = { 1, 2, 3, 1, 2, 1 };
* Map GPRs by ModRM encoding to their offset within struct cpu_user_regs.
* The AH,CH,DH,BH offsets are misaligned.
*/
-static const uint8_t cpu_user_regs_gpr_offsets[] = {
+const uint8_t cpu_user_regs_gpr_offsets[] = {
offsetof(struct cpu_user_regs, r(ax)),
offsetof(struct cpu_user_regs, r(cx)),
offsetof(struct cpu_user_regs, r(dx)),
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index 0c8c80a..5cdcfdd 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -25,6 +25,14 @@
#define MAX_INST_LEN 15
+#if defined(__i386__)
+# define X86_NR_GPRS 8
+#elif defined(__x86_64__)
+# define X86_NR_GPRS 16
+#else
+# error Unknown compilation width
+#endif
+
struct x86_emulate_ctxt;
/*
@@ -601,14 +609,23 @@ int x86_emulate_wrapper(
#define x86_emulate x86_emulate_wrapper
#endif
+/* Map GPRs by ModRM encoding to their offset within struct cpu_user_regs. */
+extern const uint8_t cpu_user_regs_gpr_offsets[X86_NR_GPRS];
+
/*
* Given the 'reg' portion of a ModRM byte, and a register block, return a
* pointer into the block that addresses the relevant register.
- * @highbyte_regs specifies whether to decode AH,CH,DH,BH.
*/
-void *
-decode_register(
- uint8_t modrm_reg, struct cpu_user_regs *regs, int highbyte_regs);
+static inline unsigned long *decode_gpr(struct cpu_user_regs *regs,
+ unsigned int modrm)
+{
+ ASSERT(modrm < ARRAY_SIZE(cpu_user_regs_gpr_offsets));
+
+ /* For safety in release builds. Debug builds will hit the ASSERT() */
+ modrm &= ARRAY_SIZE(cpu_user_regs_gpr_offsets) - 1;
+
+ return (void *)regs + cpu_user_regs_gpr_offsets[modrm];
+}
/* Unhandleable read, write or instruction fetch */
int
--
2.1.4
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 3/3] x86/emul: Improvements to internal users of decode_register()
2018-01-26 13:16 [PATCH 0/3] Improvements to decode_register() Andrew Cooper
2018-01-26 13:16 ` [PATCH 1/3] x86/emul: Optimise decode_register() somewhat Andrew Cooper
2018-01-26 13:16 ` [PATCH 2/3] x86/hvm: Improvements to external users of decode_register() Andrew Cooper
@ 2018-01-26 13:16 ` Andrew Cooper
2018-01-29 11:20 ` Jan Beulich
2 siblings, 1 reply; 9+ messages in thread
From: Andrew Cooper @ 2018-01-26 13:16 UTC (permalink / raw)
To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich
Most users of decode_register() can be replaced with decode_gpr() right away.
For the few sites which do care about possibly using the highbyte encoding,
rename decode_register() to decode_high_gpr() and make it prototype match
decode_gpr().
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
xen/arch/x86/x86_emulate/x86_emulate.c | 67 +++++++++++++++-------------------
1 file changed, 29 insertions(+), 38 deletions(-)
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index 70c38a2..04cb649 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1980,9 +1980,8 @@ load_seg(
return rc;
}
-void *
-decode_register(
- uint8_t modrm_reg, struct cpu_user_regs *regs, int highbyte_regs)
+static unsigned long *decode_high_gpr(
+ struct cpu_user_regs *regs, unsigned int modrm_reg, bool highbyte_regs)
{
const uint8_t *offsets = highbyte_regs ? cpu_user_regs_high_gpr_offsets
: cpu_user_regs_gpr_offsets;
@@ -2001,10 +2000,11 @@ decode_register(
return (void *)regs + offsets[modrm_reg];
}
-static void *decode_vex_gpr(unsigned int vex_reg, struct cpu_user_regs *regs,
- const struct x86_emulate_ctxt *ctxt)
+static unsigned long *decode_vex_gpr(
+ unsigned int vex_reg, struct cpu_user_regs *regs,
+ const struct x86_emulate_ctxt *ctxt)
{
- return decode_register(~vex_reg & (mode_64bit() ? 0xf : 7), regs, 0);
+ return decode_gpr(regs, ~vex_reg & (mode_64bit() ? 0xf : 7));
}
static bool is_aligned(enum x86_segment seg, unsigned long offs,
@@ -2813,8 +2813,7 @@ x86_decode(
sib_index = ((sib >> 3) & 7) | ((rex_prefix << 2) & 8);
sib_base = (sib & 7) | ((rex_prefix << 3) & 8);
if ( sib_index != 4 && !(d & vSIB) )
- ea.mem.off = *(long *)decode_register(sib_index,
- state->regs, 0);
+ ea.mem.off = *decode_gpr(state->regs, sib_index);
ea.mem.off <<= (sib >> 6) & 3;
if ( (modrm_mod == 0) && ((sib_base & 7) == 5) )
ea.mem.off += insn_fetch_type(int32_t);
@@ -2833,15 +2832,13 @@ x86_decode(
ea.mem.off += state->regs->r(bp);
}
else
- ea.mem.off += *(long *)decode_register(sib_base,
- state->regs, 0);
+ ea.mem.off += *decode_gpr(state->regs, sib_base);
}
else
{
generate_exception_if(d & vSIB, EXC_UD);
modrm_rm |= (rex_prefix & 1) << 3;
- ea.mem.off = *(long *)decode_register(modrm_rm,
- state->regs, 0);
+ ea.mem.off = *decode_gpr(state->regs, modrm_rm);
if ( (modrm_rm == 5) && (modrm_mod != 0) )
ea.mem.seg = x86_seg_ss;
}
@@ -3066,8 +3063,7 @@ x86_emulate(
generate_exception_if(state->not_64bit && mode_64bit(), EXC_UD);
if ( ea.type == OP_REG )
- ea.reg = decode_register(modrm_rm, &_regs,
- (d & ByteOp) && !rex_prefix);
+ ea.reg = decode_high_gpr(&_regs, modrm_rm, (d & ByteOp) && !rex_prefix);
memset(mmvalp, 0xaa /* arbitrary */, sizeof(*mmvalp));
@@ -3081,13 +3077,13 @@ x86_emulate(
src.type = OP_REG;
if ( d & ByteOp )
{
- src.reg = decode_register(modrm_reg, &_regs, (rex_prefix == 0));
+ src.reg = decode_high_gpr(&_regs, modrm_reg, (rex_prefix == 0));
src.val = *(uint8_t *)src.reg;
src.bytes = 1;
}
else
{
- src.reg = decode_register(modrm_reg, &_regs, 0);
+ src.reg = decode_gpr(&_regs, modrm_reg);
switch ( (src.bytes = op_bytes) )
{
case 2: src.val = *(uint16_t *)src.reg; break;
@@ -3157,13 +3153,13 @@ x86_emulate(
dst.type = OP_REG;
if ( d & ByteOp )
{
- dst.reg = decode_register(modrm_reg, &_regs, (rex_prefix == 0));
+ dst.reg = decode_high_gpr(&_regs, modrm_reg, (rex_prefix == 0));
dst.val = *(uint8_t *)dst.reg;
dst.bytes = 1;
}
else
{
- dst.reg = decode_register(modrm_reg, &_regs, 0);
+ dst.reg = decode_gpr(&_regs, modrm_reg);
switch ( (dst.bytes = op_bytes) )
{
case 2: dst.val = *(uint16_t *)dst.reg; break;
@@ -3353,7 +3349,7 @@ x86_emulate(
case 0x40 ... 0x4f: /* inc/dec reg */
dst.type = OP_REG;
- dst.reg = decode_register(b & 7, &_regs, 0);
+ dst.reg = decode_gpr(&_regs, b & 7);
dst.bytes = op_bytes;
dst.val = *dst.reg;
if ( b & 8 )
@@ -3363,14 +3359,12 @@ x86_emulate(
break;
case 0x50 ... 0x57: /* push reg */
- src.val = *(unsigned long *)decode_register(
- (b & 7) | ((rex_prefix & 1) << 3), &_regs, 0);
+ src.val = *decode_gpr(&_regs, (b & 7) | ((rex_prefix & 1) << 3));
goto push;
case 0x58 ... 0x5f: /* pop reg */
dst.type = OP_REG;
- dst.reg = decode_register(
- (b & 7) | ((rex_prefix & 1) << 3), &_regs, 0);
+ dst.reg = decode_gpr(&_regs, (b & 7) | ((rex_prefix & 1) << 3));
dst.bytes = op_bytes;
if ( mode_64bit() && (dst.bytes == 4) )
dst.bytes = 8;
@@ -3384,7 +3378,7 @@ x86_emulate(
ea.val = _regs.esp;
for ( i = 0; i < 8; i++ )
{
- void *reg = decode_register(i, &_regs, 0);
+ void *reg = decode_gpr(&_regs, i);
if ( (rc = ops->write(x86_seg_ss, sp_pre_dec(op_bytes),
reg != &_regs.esp ? reg : &ea.val,
@@ -3396,7 +3390,7 @@ x86_emulate(
case 0x61: /* popa */
for ( i = 0; i < 8; i++ )
{
- void *reg = decode_register(7 - i, &_regs, 0);
+ void *reg = decode_gpr(&_regs, 7 - i);
if ( (rc = read_ulong(x86_seg_ss, sp_post_inc(op_bytes),
&dst.val, op_bytes, ctxt, ops)) != 0 )
@@ -3676,8 +3670,7 @@ x86_emulate(
case 0x91 ... 0x97: /* xchg reg,%%rax */
dst.type = OP_REG;
dst.bytes = op_bytes;
- dst.reg = decode_register(
- (b & 7) | ((rex_prefix & 1) << 3), &_regs, 0);
+ dst.reg = decode_gpr(&_regs, (b & 7) | ((rex_prefix & 1) << 3));
dst.val = *dst.reg;
goto xchg;
@@ -3909,14 +3902,13 @@ x86_emulate(
}
case 0xb0 ... 0xb7: /* mov imm8,r8 */
- dst.reg = decode_register(
- (b & 7) | ((rex_prefix & 1) << 3), &_regs, (rex_prefix == 0));
+ dst.reg = decode_high_gpr(&_regs, (b & 7) | ((rex_prefix & 1) << 3),
+ (rex_prefix == 0));
dst.val = src.val;
break;
case 0xb8 ... 0xbf: /* mov imm{16,32,64},r{16,32,64} */
- dst.reg = decode_register(
- (b & 7) | ((rex_prefix & 1) << 3), &_regs, 0);
+ dst.reg = decode_gpr(&_regs, (b & 7) | ((rex_prefix & 1) << 3));
dst.val = src.val;
break;
@@ -5698,7 +5690,7 @@ x86_emulate(
opc[2] = 0xc3;
copy_REX_VEX(opc, rex_prefix, vex);
- ea.reg = decode_register(modrm_reg, &_regs, 0);
+ ea.reg = decode_gpr(&_regs, modrm_reg);
invoke_stub("", "", "=a" (*ea.reg), "+m" (fic.exn_raised)
: "c" (mmvalp), "m" (*mmvalp));
@@ -6475,7 +6467,7 @@ x86_emulate(
else
{
shift = src.val;
- src.reg = decode_register(modrm_reg, &_regs, 0);
+ src.reg = decode_gpr(&_regs, modrm_reg);
src.val = truncate_word(*src.reg, dst.bytes);
}
if ( (shift &= width - 1) == 0 )
@@ -6594,7 +6586,7 @@ x86_emulate(
fail_if(!ops->read_segment);
if ( (rc = ops->read_segment(seg, &sreg, ctxt)) != X86EMUL_OKAY )
goto done;
- dst.reg = decode_register(modrm_rm, &_regs, 0);
+ dst.reg = decode_gpr(&_regs, modrm_rm);
if ( !(modrm_reg & 2) )
{
/* rd{f,g}sbase */
@@ -6654,7 +6646,7 @@ x86_emulate(
case X86EMUL_OPC(0x0f, 0xb6): /* movzx rm8,r{16,32,64} */
/* Recompute DstReg as we may have decoded AH/BH/CH/DH. */
- dst.reg = decode_register(modrm_reg, &_regs, 0);
+ dst.reg = decode_gpr(&_regs, modrm_reg);
dst.bytes = op_bytes;
dst.val = (uint8_t)src.val;
break;
@@ -6746,7 +6738,7 @@ x86_emulate(
case X86EMUL_OPC(0x0f, 0xbe): /* movsx rm8,r{16,32,64} */
/* Recompute DstReg as we may have decoded AH/BH/CH/DH. */
- dst.reg = decode_register(modrm_reg, &_regs, 0);
+ dst.reg = decode_gpr(&_regs, modrm_reg);
dst.bytes = op_bytes;
dst.val = (int8_t)src.val;
break;
@@ -6971,8 +6963,7 @@ x86_emulate(
case X86EMUL_OPC(0x0f, 0xc8) ... X86EMUL_OPC(0x0f, 0xcf): /* bswap */
dst.type = OP_REG;
- dst.reg = decode_register(
- (b & 7) | ((rex_prefix & 1) << 3), &_regs, 0);
+ dst.reg = decode_gpr(&_regs, (b & 7) | ((rex_prefix & 1) << 3));
switch ( dst.bytes = op_bytes )
{
default: /* case 2: */
--
2.1.4
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 2/3] x86/hvm: Improvements to external users of decode_register()
2018-01-26 13:16 ` [PATCH 2/3] x86/hvm: Improvements to external users of decode_register() Andrew Cooper
@ 2018-01-29 3:11 ` Tian, Kevin
2018-01-29 11:12 ` Jan Beulich
1 sibling, 0 replies; 9+ messages in thread
From: Tian, Kevin @ 2018-01-29 3:11 UTC (permalink / raw)
To: Andrew Cooper, Xen-devel; +Cc: Nakajima, Jun, Jan Beulich
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: Friday, January 26, 2018 9:17 PM
>
> * Rename to decode_gpr() to be more specific as to its purpose
> * Drop the highbyte encoding handling, as users care, and it unlikely that
> future users would care.
> * Change to a static inline, returning an unsigned long pointer.
>
> Doing so highlights that the "invalid gpr" paths in hvm_mov_{to,from}_cr()
> were actually unreachable. All callers already passed in-range gprs, which
> would have hit a BUG() previously.
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/3] x86/emul: Optimise decode_register() somewhat
2018-01-26 13:16 ` [PATCH 1/3] x86/emul: Optimise decode_register() somewhat Andrew Cooper
@ 2018-01-29 11:05 ` Jan Beulich
2018-01-29 11:19 ` Andrew Cooper
0 siblings, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2018-01-29 11:05 UTC (permalink / raw)
To: Andrew Cooper; +Cc: Xen-devel
>>> On 26.01.18 at 14:16, <andrew.cooper3@citrix.com> wrote:
> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -396,6 +396,51 @@ static const struct {
> /* Shift values between src and dst sizes of pmov{s,z}x{b,w,d}{w,d,q}. */
> static const uint8_t pmov_convert_delta[] = { 1, 2, 3, 1, 2, 1 };
>
> +/*
> + * Map GPRs by ModRM encoding to their offset within struct cpu_user_regs.
> + * The AH,CH,DH,BH offsets are misaligned.
> + */
> +static const uint8_t cpu_user_regs_gpr_offsets[] = {
> + offsetof(struct cpu_user_regs, r(ax)),
> + offsetof(struct cpu_user_regs, r(cx)),
> + offsetof(struct cpu_user_regs, r(dx)),
> + offsetof(struct cpu_user_regs, r(bx)),
> + offsetof(struct cpu_user_regs, r(sp)),
> + offsetof(struct cpu_user_regs, r(bp)),
> + offsetof(struct cpu_user_regs, r(si)),
> + offsetof(struct cpu_user_regs, r(di)),
> +#if defined(__x86_64__)
#ifdef ?
> + offsetof(struct cpu_user_regs, r8),
> + offsetof(struct cpu_user_regs, r9),
> + offsetof(struct cpu_user_regs, r10),
> + offsetof(struct cpu_user_regs, r11),
> + offsetof(struct cpu_user_regs, r12),
> + offsetof(struct cpu_user_regs, r13),
> + offsetof(struct cpu_user_regs, r14),
> + offsetof(struct cpu_user_regs, r15),
> +#endif
> +};
> +static const uint8_t cpu_user_regs_high_gpr_offsets[] = {
> + offsetof(struct cpu_user_regs, r(ax)),
> + offsetof(struct cpu_user_regs, r(cx)),
> + offsetof(struct cpu_user_regs, r(dx)),
> + offsetof(struct cpu_user_regs, r(bx)),
al, cl, dl, bl respectively?
> + offsetof(struct cpu_user_regs, ah),
> + offsetof(struct cpu_user_regs, ch),
> + offsetof(struct cpu_user_regs, dh),
> + offsetof(struct cpu_user_regs, bh),
> +#if defined(__x86_64__)
> + offsetof(struct cpu_user_regs, r8),
> + offsetof(struct cpu_user_regs, r9),
> + offsetof(struct cpu_user_regs, r10),
> + offsetof(struct cpu_user_regs, r11),
> + offsetof(struct cpu_user_regs, r12),
> + offsetof(struct cpu_user_regs, r13),
> + offsetof(struct cpu_user_regs, r14),
> + offsetof(struct cpu_user_regs, r15),
> +#endif
These 8 entries are pointless - there's no encoding which could
result in them being used.
Not also that the name of the array isn't reflecting its contents: Only
four of the entries actually are "high byte" registers.
I also think both arrays would better have function scope only.
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 2/3] x86/hvm: Improvements to external users of decode_register()
2018-01-26 13:16 ` [PATCH 2/3] x86/hvm: Improvements to external users of decode_register() Andrew Cooper
2018-01-29 3:11 ` Tian, Kevin
@ 2018-01-29 11:12 ` Jan Beulich
1 sibling, 0 replies; 9+ messages in thread
From: Jan Beulich @ 2018-01-29 11:12 UTC (permalink / raw)
To: Andrew Cooper; +Cc: Kevin Tian, Jun Nakajima, Xen-devel
>>> On 26.01.18 at 14:16, <andrew.cooper3@citrix.com> wrote:
> * Rename to decode_gpr() to be more specific as to its purpose
> * Drop the highbyte encoding handling, as users care, and it unlikely that
"... as users don't care, and it's ..." or "... as no users care, and it's
..."?
> @@ -601,14 +609,23 @@ int x86_emulate_wrapper(
> #define x86_emulate x86_emulate_wrapper
> #endif
>
> +/* Map GPRs by ModRM encoding to their offset within struct cpu_user_regs. */
> +extern const uint8_t cpu_user_regs_gpr_offsets[X86_NR_GPRS];
Okay, this explains why you don't want this array inside the
function.
With the description suitably corrected
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/3] x86/emul: Optimise decode_register() somewhat
2018-01-29 11:05 ` Jan Beulich
@ 2018-01-29 11:19 ` Andrew Cooper
0 siblings, 0 replies; 9+ messages in thread
From: Andrew Cooper @ 2018-01-29 11:19 UTC (permalink / raw)
To: Jan Beulich; +Cc: Xen-devel
On 29/01/18 11:05, Jan Beulich wrote:
>>>> On 26.01.18 at 14:16, <andrew.cooper3@citrix.com> wrote:
>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>> @@ -396,6 +396,51 @@ static const struct {
>> /* Shift values between src and dst sizes of pmov{s,z}x{b,w,d}{w,d,q}. */
>> static const uint8_t pmov_convert_delta[] = { 1, 2, 3, 1, 2, 1 };
>>
>> +/*
>> + * Map GPRs by ModRM encoding to their offset within struct cpu_user_regs.
>> + * The AH,CH,DH,BH offsets are misaligned.
>> + */
>> +static const uint8_t cpu_user_regs_gpr_offsets[] = {
>> + offsetof(struct cpu_user_regs, r(ax)),
>> + offsetof(struct cpu_user_regs, r(cx)),
>> + offsetof(struct cpu_user_regs, r(dx)),
>> + offsetof(struct cpu_user_regs, r(bx)),
>> + offsetof(struct cpu_user_regs, r(sp)),
>> + offsetof(struct cpu_user_regs, r(bp)),
>> + offsetof(struct cpu_user_regs, r(si)),
>> + offsetof(struct cpu_user_regs, r(di)),
>> +#if defined(__x86_64__)
> #ifdef ?
Can do.
>
>> + offsetof(struct cpu_user_regs, r8),
>> + offsetof(struct cpu_user_regs, r9),
>> + offsetof(struct cpu_user_regs, r10),
>> + offsetof(struct cpu_user_regs, r11),
>> + offsetof(struct cpu_user_regs, r12),
>> + offsetof(struct cpu_user_regs, r13),
>> + offsetof(struct cpu_user_regs, r14),
>> + offsetof(struct cpu_user_regs, r15),
>> +#endif
>> +};
>> +static const uint8_t cpu_user_regs_high_gpr_offsets[] = {
>> + offsetof(struct cpu_user_regs, r(ax)),
>> + offsetof(struct cpu_user_regs, r(cx)),
>> + offsetof(struct cpu_user_regs, r(dx)),
>> + offsetof(struct cpu_user_regs, r(bx)),
> al, cl, dl, bl respectively?
Erm - one version of this patch definitely had these correct...
>
>> + offsetof(struct cpu_user_regs, ah),
>> + offsetof(struct cpu_user_regs, ch),
>> + offsetof(struct cpu_user_regs, dh),
>> + offsetof(struct cpu_user_regs, bh),
>> +#if defined(__x86_64__)
>> + offsetof(struct cpu_user_regs, r8),
>> + offsetof(struct cpu_user_regs, r9),
>> + offsetof(struct cpu_user_regs, r10),
>> + offsetof(struct cpu_user_regs, r11),
>> + offsetof(struct cpu_user_regs, r12),
>> + offsetof(struct cpu_user_regs, r13),
>> + offsetof(struct cpu_user_regs, r14),
>> + offsetof(struct cpu_user_regs, r15),
>> +#endif
> These 8 entries are pointless - there's no encoding which could
> result in them being used.
>
> Not also that the name of the array isn't reflecting its contents: Only
> four of the entries actually are "high byte" registers.
>
> I also think both arrays would better have function scope only.
You've noticed why cpu_user_regs_gpr_offsets can't be, but the high
encodings could be (in principle).
Having the arrays difference lengths would mean that we need to adjust
the boundary conditions on each side of the highbyte_reg branch. Given
the way patch 3 has turned out, this is probably fine as most codepaths
use the simpler decode_gpr() anyway. It might also be handy having an
assert catching attempts to use highbyte_reg with a rex addition.
~Andrew
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 3/3] x86/emul: Improvements to internal users of decode_register()
2018-01-26 13:16 ` [PATCH 3/3] x86/emul: Improvements to internal " Andrew Cooper
@ 2018-01-29 11:20 ` Jan Beulich
0 siblings, 0 replies; 9+ messages in thread
From: Jan Beulich @ 2018-01-29 11:20 UTC (permalink / raw)
To: Andrew Cooper; +Cc: Xen-devel
>>> On 26.01.18 at 14:16, <andrew.cooper3@citrix.com> wrote:
> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -1980,9 +1980,8 @@ load_seg(
> return rc;
> }
>
> -void *
> -decode_register(
> - uint8_t modrm_reg, struct cpu_user_regs *regs, int highbyte_regs)
> +static unsigned long *decode_high_gpr(
> + struct cpu_user_regs *regs, unsigned int modrm_reg, bool highbyte_regs)
Same remark as made earlier for the array, here regarding the
function name: You're decoding legacy byte registers, not high
byte ones. As to the return type - wasn't it you who objected
to "unsigned long *" when the pointer really just points at a
byte (which I've already reflected in my v4 of the patch which
I assume triggered your attempt to do things differently)? I.e.
either void * or uint8_t *, please.
> @@ -3066,8 +3063,7 @@ x86_emulate(
> generate_exception_if(state->not_64bit && mode_64bit(), EXC_UD);
>
> if ( ea.type == OP_REG )
> - ea.reg = decode_register(modrm_rm, &_regs,
> - (d & ByteOp) && !rex_prefix);
> + ea.reg = decode_high_gpr(&_regs, modrm_rm, (d & ByteOp) && !rex_prefix);
>
> memset(mmvalp, 0xaa /* arbitrary */, sizeof(*mmvalp));
>
> @@ -3081,13 +3077,13 @@ x86_emulate(
> src.type = OP_REG;
> if ( d & ByteOp )
> {
> - src.reg = decode_register(modrm_reg, &_regs, (rex_prefix == 0));
> + src.reg = decode_high_gpr(&_regs, modrm_reg, (rex_prefix == 0));
Along the lines of the earlier hunk "!rex_prefix"? Same in I think two
more cases further down.
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2018-01-29 11:20 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-01-26 13:16 [PATCH 0/3] Improvements to decode_register() Andrew Cooper
2018-01-26 13:16 ` [PATCH 1/3] x86/emul: Optimise decode_register() somewhat Andrew Cooper
2018-01-29 11:05 ` Jan Beulich
2018-01-29 11:19 ` Andrew Cooper
2018-01-26 13:16 ` [PATCH 2/3] x86/hvm: Improvements to external users of decode_register() Andrew Cooper
2018-01-29 3:11 ` Tian, Kevin
2018-01-29 11:12 ` Jan Beulich
2018-01-26 13:16 ` [PATCH 3/3] x86/emul: Improvements to internal " Andrew Cooper
2018-01-29 11:20 ` Jan Beulich
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).