[Qemu-devel] [PATCH v2 00/19] target-mips: misc fixes and optimizations

* [Qemu-devel] [PATCH v2 00/19] target-mips: misc fixes and optimizations
@ 2012-10-30  0:11 Aurelien Jarno
  2012-10-30  0:11 ` [Qemu-devel] [PATCH v2 01/19] target-mips: correctly restore btarget upon exception Aurelien Jarno
                   ` (19 more replies)
  0 siblings, 20 replies; 23+ messages in thread
From: Aurelien Jarno @ 2012-10-30  0:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

This patch series fixes some bugs and cleans up code in the MIPS
target, and then adds some optimizations.

Changes v1 -> v2:
 - patch 1: new patch
 - patch 2: new patch
 - patch 5: new patch to address Richard Henderson's comments
 - patch 6: update following patch 5 addition
 - patch 7: new patch to address Richard Henderson's comments
 - patch 9: new patch to address Richard Henderson's comments
 - patch 16: spare one register by reusing the output of setcond to assign
             the value 1
 - patch 17: remove the buggy (lsb > msb) case
 - patch 18: fixed indentation

--

Aurelien Jarno (19):
  target-mips: correctly restore btarget upon exception
  target-mips: do not save CPU state when using retranslation
  softfloat: implement fused multiply-add NaN propagation for MIPS
  target-mips: use the softfloat floatXX_muladd functions
  target-mips: keep softfloat exception set to 0 between instructions
  target-mips: fix FPU exceptions
  target-mips: cleanup float to int conversion helpers
  target-mips: use softfloat constants when possible
  target-mips: restore CPU state after an FPU exception
  target-mips: cleanup load/store operations
  target-mips: optimize load operations
  target-mips: simplify load/store microMIPS helpers
  target-mips: implement unaligned loads using TCG
  target-mips: don't use local temps for store conditional
  target-mips: implement movn/movz using movcond
  target-mips: optimize ddiv/ddivu/div/divu with movcond
  target-mips: use deposit instead of hardcoded version
  target-mips: fix TLBR wrt SEGMask
  target-mips: don't flush extra TLB on permissions upgrade

 fpu/softfloat-specialize.h |   27 ++
 target-mips/helper.h       |   12 +-
 target-mips/op_helper.c    |  824 ++++++++++++++++----------------------------
 target-mips/translate.c    |  391 ++++++++++-----------
 4 files changed, 520 insertions(+), 734 deletions(-)

-- 
1.7.10.4

* [Qemu-devel] [PATCH v2 01/19] target-mips: correctly restore btarget upon exception
  2012-10-30  0:11 [Qemu-devel] [PATCH v2 00/19] target-mips: misc fixes and optimizations Aurelien Jarno
@ 2012-10-30  0:11 ` Aurelien Jarno
  2012-10-30  0:11 ` [Qemu-devel] [PATCH v2 02/19] target-mips: do not save CPU state when using retranslation Aurelien Jarno
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Aurelien Jarno @ 2012-10-30  0:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

When the CPU state is restored through retranslation after an exception,
btarget should also be restored.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
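The restore path described in the commit message can be sketched outside of QEMU as follows. This is a simplified model, not the code from the patch: ToyCPU, record_insn(), restore_state(), BUF_SIZE and the HFLAG_* values are hypothetical stand-ins chosen for illustration. The idea shown is that translation records per-instruction side tables, and the restore step replays the recorded branch target only for the branch kinds the patch handles, leaving it untouched for a register branch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; the values are illustrative, not QEMU's. */
enum {
    HFLAG_B     = 1,  /* unconditional branch pending */
    HFLAG_BC    = 2,  /* conditional branch pending */
    HFLAG_BL    = 3,  /* branch-likely pending */
    HFLAG_BR    = 4,  /* register branch pending */
    HFLAG_BMASK = 7
};

#define BUF_SIZE 16

typedef struct {
    uint64_t pc;
    uint32_t hflags;
    uint64_t btarget;
} ToyCPU;

/* Side tables filled at translation time, one slot per guest instruction. */
static uint64_t rec_pc[BUF_SIZE];
static uint32_t rec_hflags[BUF_SIZE];
static uint64_t rec_btarget[BUF_SIZE];

static void record_insn(int pos, uint64_t pc, uint32_t hflags,
                        uint64_t btarget)
{
    assert(pos >= 0 && pos < BUF_SIZE);
    rec_pc[pos] = pc;
    rec_hflags[pos] = hflags & HFLAG_BMASK;
    rec_btarget[pos] = btarget;
}

/* Rebuild the CPU state for the instruction recorded at position pos. */
static void restore_state(ToyCPU *cpu, int pos)
{
    assert(pos >= 0 && pos < BUF_SIZE);
    cpu->pc = rec_pc[pos];
    cpu->hflags = (cpu->hflags & ~(uint32_t)HFLAG_BMASK) | rec_hflags[pos];

    switch (cpu->hflags & HFLAG_BMASK) {
    case HFLAG_B:
    case HFLAG_BC:
    case HFLAG_BL:
        /* The target was recorded during translation, so replay it. */
        cpu->btarget = rec_btarget[pos];
        break;
    default:
        /* No branch pending or register branch: leave btarget untouched,
         * mirroring the MIPS_HFLAG_BR case in the patch. */
        break;
    }
}
```

A caller would invoke record_insn() once per translated instruction and restore_state() with the position of the faulting instruction; bits of hflags outside the branch mask are preserved across the restore.
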
 target-mips/translate.c |   11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/target-mips/translate.c b/target-mips/translate.c
index ed55e26..3cf4ca1 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -578,6 +578,7 @@ static TCGv_i32 fpu_fcr0, fpu_fcr31;
 static TCGv_i64 fpu_f64[32];
 
 static uint32_t gen_opc_hflags[OPC_BUF_SIZE];
+static target_ulong gen_opc_btarget[OPC_BUF_SIZE];
 
 #include "gen-icount.h"
 
@@ -12859,6 +12860,7 @@ gen_intermediate_code_internal (CPUMIPSState *env, TranslationBlock *tb,
             }
             gen_opc_pc[lj] = ctx.pc;
             gen_opc_hflags[lj] = ctx.hflags & MIPS_HFLAG_BMASK;
+            gen_opc_btarget[lj] = ctx.btarget;
             gen_opc_instr_start[lj] = 1;
             gen_opc_icount[lj] = num_insns;
         }
@@ -13274,4 +13276,13 @@ void restore_state_to_opc(CPUMIPSState *env, TranslationBlock *tb, int pc_pos)
     env->active_tc.PC = gen_opc_pc[pc_pos];
     env->hflags &= ~MIPS_HFLAG_BMASK;
     env->hflags |= gen_opc_hflags[pc_pos];
+    switch (env->hflags & MIPS_HFLAG_BMASK_BASE) {
+    case MIPS_HFLAG_BR:
+        break;
+    case MIPS_HFLAG_BC:
+    case MIPS_HFLAG_BL:
+    case MIPS_HFLAG_B:
+        env->btarget = gen_opc_btarget[pc_pos];
+        break;
+    }
 }
-- 
1.7.10.4

* [Qemu-devel] [PATCH v2 02/19] target-mips: do not save CPU state when using retranslation
  2012-10-30  0:11 [Qemu-devel] [PATCH v2 00/19] target-mips: misc fixes and optimizations Aurelien Jarno
  2012-10-30  0:11 ` [Qemu-devel] [PATCH v2 01/19] target-mips: correctly restore btarget upon exception Aurelien Jarno
@ 2012-10-30  0:11 ` Aurelien Jarno
  2012-10-30  0:11 ` [Qemu-devel] [PATCH v2 03/19] softfloat: implement fused multiply-add NaN propagation for MIPS Aurelien Jarno
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Aurelien Jarno @ 2012-10-30  0:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

When the CPU state after a possible exception is going to be restored
through code retranslation, there is no need to save the CPU state
beforehand.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-mips/translate.c |   19 -------------------
 1 file changed, 19 deletions(-)

diff --git a/target-mips/translate.c b/target-mips/translate.c
index 3cf4ca1..97a63ea 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -1171,13 +1171,11 @@ static void gen_ld (CPUMIPSState *env, DisasContext *ctx, uint32_t opc,
     switch (opc) {
 #if defined(TARGET_MIPS64)
     case OPC_LWU:
-        save_cpu_state(ctx, 0);
         op_ld_lwu(t0, t0, ctx);
         gen_store_gpr(t0, rt);
         opn = "lwu";
         break;
     case OPC_LD:
-        save_cpu_state(ctx, 0);
         op_ld_ld(t0, t0, ctx);
         gen_store_gpr(t0, rt);
         opn = "ld";
@@ -1203,7 +1201,6 @@ static void gen_ld (CPUMIPSState *env, DisasContext *ctx, uint32_t opc,
         opn = "ldr";
         break;
     case OPC_LDPC:
-        save_cpu_state(ctx, 0);
         tcg_gen_movi_tl(t1, pc_relative_pc(ctx));
         gen_op_addr_add(ctx, t0, t0, t1);
         op_ld_ld(t0, t0, ctx);
@@ -1212,7 +1209,6 @@ static void gen_ld (CPUMIPSState *env, DisasContext *ctx, uint32_t opc,
         break;
 #endif
     case OPC_LWPC:
-        save_cpu_state(ctx, 0);
         tcg_gen_movi_tl(t1, pc_relative_pc(ctx));
         gen_op_addr_add(ctx, t0, t0, t1);
         op_ld_lw(t0, t0, ctx);
@@ -1220,31 +1216,26 @@ static void gen_ld (CPUMIPSState *env, DisasContext *ctx, uint32_t opc,
         opn = "lwpc";
         break;
     case OPC_LW:
-        save_cpu_state(ctx, 0);
         op_ld_lw(t0, t0, ctx);
         gen_store_gpr(t0, rt);
         opn = "lw";
         break;
     case OPC_LH:
-        save_cpu_state(ctx, 0);
         op_ld_lh(t0, t0, ctx);
         gen_store_gpr(t0, rt);
         opn = "lh";
         break;
     case OPC_LHU:
-        save_cpu_state(ctx, 0);
         op_ld_lhu(t0, t0, ctx);
         gen_store_gpr(t0, rt);
         opn = "lhu";
         break;
     case OPC_LB:
-        save_cpu_state(ctx, 0);
         op_ld_lb(t0, t0, ctx);
         gen_store_gpr(t0, rt);
         opn = "lb";
         break;
     case OPC_LBU:
-        save_cpu_state(ctx, 0);
         op_ld_lbu(t0, t0, ctx);
         gen_store_gpr(t0, rt);
         opn = "lbu";
@@ -1289,7 +1280,6 @@ static void gen_st (DisasContext *ctx, uint32_t opc, int rt,
     switch (opc) {
 #if defined(TARGET_MIPS64)
     case OPC_SD:
-        save_cpu_state(ctx, 0);
         op_st_sd(t1, t0, ctx);
         opn = "sd";
         break;
@@ -1305,17 +1295,14 @@ static void gen_st (DisasContext *ctx, uint32_t opc, int rt,
         break;
 #endif
     case OPC_SW:
-        save_cpu_state(ctx, 0);
         op_st_sw(t1, t0, ctx);
         opn = "sw";
         break;
     case OPC_SH:
-        save_cpu_state(ctx, 0);
         op_st_sh(t1, t0, ctx);
         opn = "sh";
         break;
     case OPC_SB:
-        save_cpu_state(ctx, 0);
         op_st_sb(t1, t0, ctx);
         opn = "sb";
         break;
@@ -8149,7 +8136,6 @@ static void gen_flt3_ldst (DisasContext *ctx, uint32_t opc,
     }
     /* Don't do NOP if destination is zero: we must perform the actual
        memory access. */
-    save_cpu_state(ctx, 0);
     switch (opc) {
     case OPC_LWXC1:
         check_cop1x(ctx);
@@ -10422,7 +10408,6 @@ static void gen_ldxs (DisasContext *ctx, int base, int index, int rd)
         gen_op_addr_add(ctx, t0, t1, t0);
     }
 
-    save_cpu_state(ctx, 0);
     op_ld_lw(t1, t0, ctx);
     gen_store_gpr(t1, rd);
 
@@ -10452,7 +10437,6 @@ static void gen_ldst_pair (DisasContext *ctx, uint32_t opc, int rd,
             generate_exception(ctx, EXCP_RI);
             return;
         }
-        save_cpu_state(ctx, 0);
         op_ld_lw(t1, t0, ctx);
         gen_store_gpr(t1, rd);
         tcg_gen_movi_tl(t1, 4);
@@ -10462,7 +10446,6 @@ static void gen_ldst_pair (DisasContext *ctx, uint32_t opc, int rd,
         opn = "lwp";
         break;
     case SWP:
-        save_cpu_state(ctx, 0);
         gen_load_gpr(t1, rd);
         op_st_sw(t1, t0, ctx);
         tcg_gen_movi_tl(t1, 4);
@@ -10477,7 +10460,6 @@ static void gen_ldst_pair (DisasContext *ctx, uint32_t opc, int rd,
             generate_exception(ctx, EXCP_RI);
             return;
         }
-        save_cpu_state(ctx, 0);
         op_ld_ld(t1, t0, ctx);
         gen_store_gpr(t1, rd);
         tcg_gen_movi_tl(t1, 8);
@@ -10487,7 +10469,6 @@ static void gen_ldst_pair (DisasContext *ctx, uint32_t opc, int rd,
         opn = "ldp";
         break;
     case SDP:
-        save_cpu_state(ctx, 0);
         gen_load_gpr(t1, rd);
         op_st_sd(t1, t0, ctx);
         tcg_gen_movi_tl(t1, 8);
-- 
1.7.10.4

* [Qemu-devel] [PATCH v2 03/19] softfloat: implement fused multiply-add NaN propagation for MIPS
  2012-10-30  0:11 [Qemu-devel] [PATCH v2 00/19] target-mips: misc fixes and optimizations Aurelien Jarno
  2012-10-30  0:11 ` [Qemu-devel] [PATCH v2 01/19] target-mips: correctly restore btarget upon exception Aurelien Jarno
  2012-10-30  0:11 ` [Qemu-devel] [PATCH v2 02/19] target-mips: do not save CPU state when using retranslation Aurelien Jarno
@ 2012-10-30  0:11 ` Aurelien Jarno
  2012-10-30  0:11 ` [Qemu-devel] [PATCH v2 04/19] target-mips: use the softfloat floatXX_muladd functions Aurelien Jarno
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Aurelien Jarno @ 2012-10-30  0:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, Aurelien Jarno

Add a pickNaNMulAdd function for MIPS, implementing NaN propagation
rules for MIPS fused multiply-add instructions.

Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
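The selection order implemented by the new function can be tried in isolation with the sketch below. It is a standalone model, not the softfloat code: NanClass, pick_nan_muladd() and the PICK_* names are hypothetical, and the invalid-operation flag is returned through a pointer instead of being raised in a status structure. The return value follows the same convention as the patch: 0, 1 or 2 select operand a, b or c, and 3 selects the default NaN.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the return convention used in the patch. */
enum { PICK_A = 0, PICK_B = 1, PICK_C = 2, PICK_DEFAULT = 3 };

typedef struct {
    bool is_qnan;
    bool is_snan;
} NanClass;

static int pick_nan_muladd(NanClass a, NanClass b, NanClass c,
                           bool infzero, bool *invalid)
{
    assert(invalid);

    /* inf * zero: flag an invalid operation and use the default NaN. */
    if (infzero) {
        *invalid = true;
        return PICK_DEFAULT;
    }

    /* Signaling NaNs win over quiet NaNs, in a, b, c order. */
    if (a.is_snan) {
        return PICK_A;
    } else if (b.is_snan) {
        return PICK_B;
    } else if (c.is_snan) {
        return PICK_C;
    } else if (a.is_qnan) {
        return PICK_A;
    } else if (b.is_qnan) {
        return PICK_B;
    } else {
        return PICK_C;
    }
}
```

For example, with a quiet NaN in a and a signaling NaN in b, the sketch selects b, because any signaling NaN takes priority over any quiet NaN.
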
 fpu/softfloat-specialize.h |   27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/fpu/softfloat-specialize.h b/fpu/softfloat-specialize.h
index a1d489e..518f694 100644
--- a/fpu/softfloat-specialize.h
+++ b/fpu/softfloat-specialize.h
@@ -486,6 +486,33 @@ static int pickNaNMulAdd(flag aIsQNaN, flag aIsSNaN, flag bIsQNaN, flag bIsSNaN,
         return 1;
     }
 }
+#elif defined(TARGET_MIPS)
+static int pickNaNMulAdd(flag aIsQNaN, flag aIsSNaN, flag bIsQNaN, flag bIsSNaN,
+                         flag cIsQNaN, flag cIsSNaN, flag infzero STATUS_PARAM)
+{
+    /* For MIPS, the (inf,zero,qnan) case sets InvalidOp and returns
+     * the default NaN
+     */
+    if (infzero) {
+        float_raise(float_flag_invalid STATUS_VAR);
+        return 3;
+    }
+
+    /* Prefer sNaN over qNaN, in the a, b, c order. */
+    if (aIsSNaN) {
+        return 0;
+    } else if (bIsSNaN) {
+        return 1;
+    } else if (cIsSNaN) {
+        return 2;
+    } else if (aIsQNaN) {
+        return 0;
+    } else if (bIsQNaN) {
+        return 1;
+    } else {
+        return 2;
+    }
+}
 #elif defined(TARGET_PPC)
 static int pickNaNMulAdd(flag aIsQNaN, flag aIsSNaN, flag bIsQNaN, flag bIsSNaN,
                          flag cIsQNaN, flag cIsSNaN, flag infzero STATUS_PARAM)
-- 
1.7.10.4

* [Qemu-devel] [PATCH v2 04/19] target-mips: use the softfloat floatXX_muladd functions
  2012-10-30  0:11 [Qemu-devel] [PATCH v2 00/19] target-mips: misc fixes and optimizations Aurelien Jarno
                   ` (2 preceding siblings ...)
  2012-10-30  0:11 ` [Qemu-devel] [PATCH v2 03/19] softfloat: implement fused multiply-add NaN propagation for MIPS Aurelien Jarno
@ 2012-10-30  0:11 ` Aurelien Jarno
  2012-10-30  0:11 ` [Qemu-devel] [PATCH v2 05/19] target-mips: keep softfloat exception set to 0 between instructions Aurelien Jarno
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Aurelien Jarno @ 2012-10-30  0:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

Use the new softfloat floatXX_muladd() functions to implement the madd,
msub, nmadd and nmsub instructions. At the same time, rename the helpers
after the instructions, as the only reason for the previous names was to
keep the macros simple.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
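The way the four instructions share one multiply-add primitive can be modelled with the sketch below. It only illustrates the intended sign handling: toy_muladd() and the TOY_* flag bits are hypothetical and are not softfloat's functions or flag values, and the arithmetic uses plain C doubles, so it says nothing about the single rounding or NaN handling that the real floatXX_muladd() functions provide.

```c
#include <assert.h>

/* Hypothetical flag bits for this sketch; not softfloat's actual values. */
enum {
    TOY_NEGATE_C      = 1 << 0,
    TOY_NEGATE_RESULT = 1 << 1
};

/* Compute a * b + c, optionally negating the addend and/or the result. */
static double toy_muladd(double a, double b, double c, int flags)
{
    double r;

    assert((flags & ~(TOY_NEGATE_C | TOY_NEGATE_RESULT)) == 0);
    if (flags & TOY_NEGATE_C) {
        c = -c;
    }
    /* Rounding of this line is left to the C compiler; the sketch only
     * models where the signs are applied. */
    r = a * b + c;
    if (flags & TOY_NEGATE_RESULT) {
        r = -r;
    }
    return r;
}

/* Operand order follows the helpers in the patch: fs * ft combined with fr. */
static double toy_madd(double fs, double ft, double fr)
{
    return toy_muladd(fs, ft, fr, 0);
}

static double toy_msub(double fs, double ft, double fr)
{
    return toy_muladd(fs, ft, fr, TOY_NEGATE_C);
}

static double toy_nmadd(double fs, double ft, double fr)
{
    return toy_muladd(fs, ft, fr, TOY_NEGATE_RESULT);
}

static double toy_nmsub(double fs, double ft, double fr)
{
    return toy_muladd(fs, ft, fr, TOY_NEGATE_RESULT | TOY_NEGATE_C);
}
```

With fs = 2, ft = 3 and fr = 1, the four wrappers yield 7, 5, -7 and -5 respectively, which matches fs*ft + fr, fs*ft - fr, -(fs*ft + fr) and -(fs*ft - fr).
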
 target-mips/helper.h    |    8 +--
 target-mips/op_helper.c |  137 +++++++++++++++++------------------------------
 target-mips/translate.c |   24 ++++-----
 3 files changed, 64 insertions(+), 105 deletions(-)

diff --git a/target-mips/helper.h b/target-mips/helper.h
index 43ac39f..210960f 100644
--- a/target-mips/helper.h
+++ b/target-mips/helper.h
@@ -254,10 +254,10 @@ FOP_PROTO(rsqrt2)
 DEF_HELPER_4(float_ ## op ## _s, i32, env, i32, i32, i32)  \
 DEF_HELPER_4(float_ ## op ## _d, i64, env, i64, i64, i64)  \
 DEF_HELPER_4(float_ ## op ## _ps, i64, env, i64, i64, i64)
-FOP_PROTO(muladd)
-FOP_PROTO(mulsub)
-FOP_PROTO(nmuladd)
-FOP_PROTO(nmulsub)
+FOP_PROTO(madd)
+FOP_PROTO(msub)
+FOP_PROTO(nmadd)
+FOP_PROTO(nmsub)
 #undef FOP_PROTO
 
 #define FOP_PROTO(op)                                    \
diff --git a/target-mips/op_helper.c b/target-mips/op_helper.c
index d50334f..1abed8e 100644
--- a/target-mips/op_helper.c
+++ b/target-mips/op_helper.c
@@ -3031,95 +3031,54 @@ FLOAT_BINOP(mul)
 FLOAT_BINOP(div)
 #undef FLOAT_BINOP
 
-/* ternary operations */
-#define FLOAT_TERNOP(name1, name2)                                        \
-uint64_t helper_float_ ## name1 ## name2 ## _d(CPUMIPSState *env,         \
-                                               uint64_t fdt0,             \
-                                               uint64_t fdt1,             \
-                                               uint64_t fdt2)             \
-{                                                                         \
-    fdt0 = float64_ ## name1 (fdt0, fdt1, &env->active_fpu.fp_status);          \
-    return float64_ ## name2 (fdt0, fdt2, &env->active_fpu.fp_status);          \
-}                                                                         \
-                                                                          \
-uint32_t helper_float_ ## name1 ## name2 ## _s(CPUMIPSState *env,         \
-                                               uint32_t fst0,             \
-                                               uint32_t fst1,             \
-                                               uint32_t fst2)             \
-{                                                                         \
-    fst0 = float32_ ## name1 (fst0, fst1, &env->active_fpu.fp_status);          \
-    return float32_ ## name2 (fst0, fst2, &env->active_fpu.fp_status);          \
-}                                                                         \
-                                                                          \
-uint64_t helper_float_ ## name1 ## name2 ## _ps(CPUMIPSState *env,        \
-                                                uint64_t fdt0,            \
-                                                uint64_t fdt1,            \
-                                                uint64_t fdt2)            \
-{                                                                         \
-    uint32_t fst0 = fdt0 & 0XFFFFFFFF;                                    \
-    uint32_t fsth0 = fdt0 >> 32;                                          \
-    uint32_t fst1 = fdt1 & 0XFFFFFFFF;                                    \
-    uint32_t fsth1 = fdt1 >> 32;                                          \
-    uint32_t fst2 = fdt2 & 0XFFFFFFFF;                                    \
-    uint32_t fsth2 = fdt2 >> 32;                                          \
-                                                                          \
-    fst0 = float32_ ## name1 (fst0, fst1, &env->active_fpu.fp_status);          \
-    fsth0 = float32_ ## name1 (fsth0, fsth1, &env->active_fpu.fp_status);       \
-    fst2 = float32_ ## name2 (fst0, fst2, &env->active_fpu.fp_status);          \
-    fsth2 = float32_ ## name2 (fsth0, fsth2, &env->active_fpu.fp_status);       \
-    return ((uint64_t)fsth2 << 32) | fst2;                                \
-}
-
-FLOAT_TERNOP(mul, add)
-FLOAT_TERNOP(mul, sub)
-#undef FLOAT_TERNOP
-
-/* negated ternary operations */
-#define FLOAT_NTERNOP(name1, name2)                                       \
-uint64_t helper_float_n ## name1 ## name2 ## _d(CPUMIPSState *env,        \
-                                                uint64_t fdt0,            \
-                                                uint64_t fdt1,            \
-                                                uint64_t fdt2)            \
-{                                                                         \
-    fdt0 = float64_ ## name1 (fdt0, fdt1, &env->active_fpu.fp_status);          \
-    fdt2 = float64_ ## name2 (fdt0, fdt2, &env->active_fpu.fp_status);          \
-    return float64_chs(fdt2);                                             \
-}                                                                         \
-                                                                          \
-uint32_t helper_float_n ## name1 ## name2 ## _s(CPUMIPSState *env,        \
-                                                uint32_t fst0,            \
-                                                uint32_t fst1,            \
-                                                uint32_t fst2)            \
-{                                                                         \
-    fst0 = float32_ ## name1 (fst0, fst1, &env->active_fpu.fp_status);          \
-    fst2 = float32_ ## name2 (fst0, fst2, &env->active_fpu.fp_status);          \
-    return float32_chs(fst2);                                             \
-}                                                                         \
-                                                                          \
-uint64_t helper_float_n ## name1 ## name2 ## _ps(CPUMIPSState *env,       \
-                                                 uint64_t fdt0,           \
-                                                 uint64_t fdt1,           \
-                                                 uint64_t fdt2)           \
-{                                                                         \
-    uint32_t fst0 = fdt0 & 0XFFFFFFFF;                                    \
-    uint32_t fsth0 = fdt0 >> 32;                                          \
-    uint32_t fst1 = fdt1 & 0XFFFFFFFF;                                    \
-    uint32_t fsth1 = fdt1 >> 32;                                          \
-    uint32_t fst2 = fdt2 & 0XFFFFFFFF;                                    \
-    uint32_t fsth2 = fdt2 >> 32;                                          \
-                                                                          \
-    fst0 = float32_ ## name1 (fst0, fst1, &env->active_fpu.fp_status);          \
-    fsth0 = float32_ ## name1 (fsth0, fsth1, &env->active_fpu.fp_status);       \
-    fst2 = float32_ ## name2 (fst0, fst2, &env->active_fpu.fp_status);          \
-    fsth2 = float32_ ## name2 (fsth0, fsth2, &env->active_fpu.fp_status);       \
-    fst2 = float32_chs(fst2);                                             \
-    fsth2 = float32_chs(fsth2);                                           \
-    return ((uint64_t)fsth2 << 32) | fst2;                                \
-}
-
-FLOAT_NTERNOP(mul, add)
-FLOAT_NTERNOP(mul, sub)
-#undef FLOAT_NTERNOP
+/* FMA based operations */
+#define FLOAT_FMA(name, type)                                        \
+uint64_t helper_float_ ## name ## _d(CPUMIPSState *env,              \
+                                     uint64_t fdt0, uint64_t fdt1,   \
+                                     uint64_t fdt2)                  \
+{                                                                    \
+    set_float_exception_flags(0, &env->active_fpu.fp_status);        \
+    fdt0 = float64_muladd(fdt0, fdt1, fdt2, type,                    \
+                         &env->active_fpu.fp_status);                \
+    update_fcr31(env);                                               \
+    return fdt0;                                                     \
+}                                                                    \
+                                                                     \
+uint32_t helper_float_ ## name ## _s(CPUMIPSState *env,              \
+                                     uint32_t fst0, uint32_t fst1,   \
+                                     uint32_t fst2)                  \
+{                                                                    \
+    set_float_exception_flags(0, &env->active_fpu.fp_status);        \
+    fst0 = float32_muladd(fst0, fst1, fst2, type,                    \
+                         &env->active_fpu.fp_status);                \
+    update_fcr31(env);                                               \
+    return fst0;                                                     \
+}                                                                    \
+                                                                     \
+uint64_t helper_float_ ## name ## _ps(CPUMIPSState *env,             \
+                                      uint64_t fdt0, uint64_t fdt1,  \
+                                      uint64_t fdt2)                 \
+{                                                                    \
+    uint32_t fst0 = fdt0 & 0XFFFFFFFF;                               \
+    uint32_t fsth0 = fdt0 >> 32;                                     \
+    uint32_t fst1 = fdt1 & 0XFFFFFFFF;                               \
+    uint32_t fsth1 = fdt1 >> 32;                                     \
+    uint32_t fst2 = fdt2 & 0XFFFFFFFF;                               \
+    uint32_t fsth2 = fdt2 >> 32;                                     \
+                                                                     \
+    set_float_exception_flags(0, &env->active_fpu.fp_status);        \
+    fst0 = float32_muladd(fst0, fst1, fst2, type,                    \
+                          &env->active_fpu.fp_status);               \
+    fsth0 = float32_muladd(fsth0, fsth1, fsth2, type,                \
+                           &env->active_fpu.fp_status);              \
+    update_fcr31(env);                                               \
+    return ((uint64_t)fsth0 << 32) | fst0;                           \
+}
+FLOAT_FMA(madd, 0)
+FLOAT_FMA(msub, float_muladd_negate_c)
+FLOAT_FMA(nmadd, float_muladd_negate_result)
+FLOAT_FMA(nmsub, float_muladd_negate_result | float_muladd_negate_c)
+#undef FLOAT_FMA
 
 /* MIPS specific binary operations */
 uint64_t helper_float_recip2_d(CPUMIPSState *env, uint64_t fdt0, uint64_t fdt2)
diff --git a/target-mips/translate.c b/target-mips/translate.c
index 97a63ea..732c65d 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -8275,7 +8275,7 @@ static void gen_flt3_arith (DisasContext *ctx, uint32_t opc,
             gen_load_fpr32(fp0, fs);
             gen_load_fpr32(fp1, ft);
             gen_load_fpr32(fp2, fr);
-            gen_helper_float_muladd_s(fp2, cpu_env, fp0, fp1, fp2);
+            gen_helper_float_madd_s(fp2, cpu_env, fp0, fp1, fp2);
             tcg_temp_free_i32(fp0);
             tcg_temp_free_i32(fp1);
             gen_store_fpr32(fp2, fd);
@@ -8294,7 +8294,7 @@ static void gen_flt3_arith (DisasContext *ctx, uint32_t opc,
             gen_load_fpr64(ctx, fp0, fs);
             gen_load_fpr64(ctx, fp1, ft);
             gen_load_fpr64(ctx, fp2, fr);
-            gen_helper_float_muladd_d(fp2, cpu_env, fp0, fp1, fp2);
+            gen_helper_float_madd_d(fp2, cpu_env, fp0, fp1, fp2);
             tcg_temp_free_i64(fp0);
             tcg_temp_free_i64(fp1);
             gen_store_fpr64(ctx, fp2, fd);
@@ -8312,7 +8312,7 @@ static void gen_flt3_arith (DisasContext *ctx, uint32_t opc,
             gen_load_fpr64(ctx, fp0, fs);
             gen_load_fpr64(ctx, fp1, ft);
             gen_load_fpr64(ctx, fp2, fr);
-            gen_helper_float_muladd_ps(fp2, cpu_env, fp0, fp1, fp2);
+            gen_helper_float_madd_ps(fp2, cpu_env, fp0, fp1, fp2);
             tcg_temp_free_i64(fp0);
             tcg_temp_free_i64(fp1);
             gen_store_fpr64(ctx, fp2, fd);
@@ -8330,7 +8330,7 @@ static void gen_flt3_arith (DisasContext *ctx, uint32_t opc,
             gen_load_fpr32(fp0, fs);
             gen_load_fpr32(fp1, ft);
             gen_load_fpr32(fp2, fr);
-            gen_helper_float_mulsub_s(fp2, cpu_env, fp0, fp1, fp2);
+            gen_helper_float_msub_s(fp2, cpu_env, fp0, fp1, fp2);
             tcg_temp_free_i32(fp0);
             tcg_temp_free_i32(fp1);
             gen_store_fpr32(fp2, fd);
@@ -8349,7 +8349,7 @@ static void gen_flt3_arith (DisasContext *ctx, uint32_t opc,
             gen_load_fpr64(ctx, fp0, fs);
             gen_load_fpr64(ctx, fp1, ft);
             gen_load_fpr64(ctx, fp2, fr);
-            gen_helper_float_mulsub_d(fp2, cpu_env, fp0, fp1, fp2);
+            gen_helper_float_msub_d(fp2, cpu_env, fp0, fp1, fp2);
             tcg_temp_free_i64(fp0);
             tcg_temp_free_i64(fp1);
             gen_store_fpr64(ctx, fp2, fd);
@@ -8367,7 +8367,7 @@ static void gen_flt3_arith (DisasContext *ctx, uint32_t opc,
             gen_load_fpr64(ctx, fp0, fs);
             gen_load_fpr64(ctx, fp1, ft);
             gen_load_fpr64(ctx, fp2, fr);
-            gen_helper_float_mulsub_ps(fp2, cpu_env, fp0, fp1, fp2);
+            gen_helper_float_msub_ps(fp2, cpu_env, fp0, fp1, fp2);
             tcg_temp_free_i64(fp0);
             tcg_temp_free_i64(fp1);
             gen_store_fpr64(ctx, fp2, fd);
@@ -8385,7 +8385,7 @@ static void gen_flt3_arith (DisasContext *ctx, uint32_t opc,
             gen_load_fpr32(fp0, fs);
             gen_load_fpr32(fp1, ft);
             gen_load_fpr32(fp2, fr);
-            gen_helper_float_nmuladd_s(fp2, cpu_env, fp0, fp1, fp2);
+            gen_helper_float_nmadd_s(fp2, cpu_env, fp0, fp1, fp2);
             tcg_temp_free_i32(fp0);
             tcg_temp_free_i32(fp1);
             gen_store_fpr32(fp2, fd);
@@ -8404,7 +8404,7 @@ static void gen_flt3_arith (DisasContext *ctx, uint32_t opc,
             gen_load_fpr64(ctx, fp0, fs);
             gen_load_fpr64(ctx, fp1, ft);
             gen_load_fpr64(ctx, fp2, fr);
-            gen_helper_float_nmuladd_d(fp2, cpu_env, fp0, fp1, fp2);
+            gen_helper_float_nmadd_d(fp2, cpu_env, fp0, fp1, fp2);
             tcg_temp_free_i64(fp0);
             tcg_temp_free_i64(fp1);
             gen_store_fpr64(ctx, fp2, fd);
@@ -8422,7 +8422,7 @@ static void gen_flt3_arith (DisasContext *ctx, uint32_t opc,
             gen_load_fpr64(ctx, fp0, fs);
             gen_load_fpr64(ctx, fp1, ft);
             gen_load_fpr64(ctx, fp2, fr);
-            gen_helper_float_nmuladd_ps(fp2, cpu_env, fp0, fp1, fp2);
+            gen_helper_float_nmadd_ps(fp2, cpu_env, fp0, fp1, fp2);
             tcg_temp_free_i64(fp0);
             tcg_temp_free_i64(fp1);
             gen_store_fpr64(ctx, fp2, fd);
@@ -8440,7 +8440,7 @@ static void gen_flt3_arith (DisasContext *ctx, uint32_t opc,
             gen_load_fpr32(fp0, fs);
             gen_load_fpr32(fp1, ft);
             gen_load_fpr32(fp2, fr);
-            gen_helper_float_nmulsub_s(fp2, cpu_env, fp0, fp1, fp2);
+            gen_helper_float_nmsub_s(fp2, cpu_env, fp0, fp1, fp2);
             tcg_temp_free_i32(fp0);
             tcg_temp_free_i32(fp1);
             gen_store_fpr32(fp2, fd);
@@ -8459,7 +8459,7 @@ static void gen_flt3_arith (DisasContext *ctx, uint32_t opc,
             gen_load_fpr64(ctx, fp0, fs);
             gen_load_fpr64(ctx, fp1, ft);
             gen_load_fpr64(ctx, fp2, fr);
-            gen_helper_float_nmulsub_d(fp2, cpu_env, fp0, fp1, fp2);
+            gen_helper_float_nmsub_d(fp2, cpu_env, fp0, fp1, fp2);
             tcg_temp_free_i64(fp0);
             tcg_temp_free_i64(fp1);
             gen_store_fpr64(ctx, fp2, fd);
@@ -8477,7 +8477,7 @@ static void gen_flt3_arith (DisasContext *ctx, uint32_t opc,
             gen_load_fpr64(ctx, fp0, fs);
             gen_load_fpr64(ctx, fp1, ft);
             gen_load_fpr64(ctx, fp2, fr);
-            gen_helper_float_nmulsub_ps(fp2, cpu_env, fp0, fp1, fp2);
+            gen_helper_float_nmsub_ps(fp2, cpu_env, fp0, fp1, fp2);
             tcg_temp_free_i64(fp0);
             tcg_temp_free_i64(fp1);
             gen_store_fpr64(ctx, fp2, fd);
-- 
1.7.10.4

* [Qemu-devel] [PATCH v2 05/19] target-mips: keep softfloat exception set to 0 between instructions
  2012-10-30  0:11 [Qemu-devel] [PATCH v2 00/19] target-mips: misc fixes and optimizations Aurelien Jarno
                   ` (3 preceding siblings ...)
  2012-10-30  0:11 ` [Qemu-devel] [PATCH v2 04/19] target-mips: use the softfloat floatXX_muladd functions Aurelien Jarno
@ 2012-10-30  0:11 ` Aurelien Jarno
  2012-10-30  0:11 ` [Qemu-devel] [PATCH v2 06/19] target-mips: fix FPU exceptions Aurelien Jarno
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Aurelien Jarno @ 2012-10-30  0:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

Instead of clearing the softfloat exception flags before each floating
point instruction, reset them to 0 in update_fcr31() when an exception
is detected.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
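The invariant this change relies on can be shown with a small model. ToyFPU, toy_update_fcr31() and the TOY_* bits are hypothetical and much simpler than the real FCSR handling; in particular the real helper raises an exception and does not return, whereas the sketch only records that a trap would have been taken. The point illustrated is that the accumulated flags are cleared whenever they are consumed, so each helper starts with them at 0 without clearing them itself.

```c
#include <assert.h>

/* Hypothetical exception bits for this sketch. */
enum {
    TOY_INEXACT  = 1 << 0,
    TOY_OVERFLOW = 1 << 1
};

typedef struct {
    unsigned sf_flags;  /* flags accumulated by the arithmetic library */
    unsigned cause;     /* cause bits of the last instruction */
    unsigned enable;    /* exceptions that trap */
    unsigned sticky;    /* accumulated, non-trapping exception flags */
    int trapped;        /* set where the real code would raise EXCP_FPE */
} ToyFPU;

static void toy_update_fcr31(ToyFPU *fpu)
{
    unsigned tmp;

    assert(fpu);
    tmp = fpu->sf_flags;
    fpu->cause = tmp;

    if (tmp) {
        /* Consume the flags so the next instruction starts from 0. */
        fpu->sf_flags = 0;

        if (fpu->enable & tmp) {
            fpu->trapped = 1;
        } else {
            fpu->sticky |= tmp;
        }
    }
}
```

An instruction that raises no exception leaves sf_flags at 0 and skips the whole block, which is the common case the change optimizes for.
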
 target-mips/op_helper.c |   73 +++++++----------------------------------------
 1 file changed, 10 insertions(+), 63 deletions(-)

diff --git a/target-mips/op_helper.c b/target-mips/op_helper.c
index 1abed8e..8204499 100644
--- a/target-mips/op_helper.c
+++ b/target-mips/op_helper.c
@@ -2445,10 +2445,16 @@ static inline void update_fcr31(CPUMIPSState *env)
     int tmp = ieee_ex_to_mips(get_float_exception_flags(&env->active_fpu.fp_status));
 
     SET_FP_CAUSE(env->active_fpu.fcr31, tmp);
-    if (GET_FP_ENABLE(env->active_fpu.fcr31) & tmp)
-        helper_raise_exception(env, EXCP_FPE);
-    else
-        UPDATE_FP_FLAGS(env->active_fpu.fcr31, tmp);
+
+    if (tmp) {
+        set_float_exception_flags(0, &env->active_fpu.fp_status);
+
+        if (GET_FP_ENABLE(env->active_fpu.fcr31) & tmp) {
+            helper_raise_exception(env, EXCP_FPE);
+        } else {
+            UPDATE_FP_FLAGS(env->active_fpu.fcr31, tmp);
+        }
+    }
 }
 
 /* Float support.
@@ -2471,7 +2477,6 @@ uint64_t helper_float_cvtd_s(CPUMIPSState *env, uint32_t fst0)
 {
     uint64_t fdt2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     fdt2 = float32_to_float64(fst0, &env->active_fpu.fp_status);
     update_fcr31(env);
     return fdt2;
@@ -2481,7 +2486,6 @@ uint64_t helper_float_cvtd_w(CPUMIPSState *env, uint32_t wt0)
 {
     uint64_t fdt2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     fdt2 = int32_to_float64(wt0, &env->active_fpu.fp_status);
     update_fcr31(env);
     return fdt2;
@@ -2491,7 +2495,6 @@ uint64_t helper_float_cvtd_l(CPUMIPSState *env, uint64_t dt0)
 {
     uint64_t fdt2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     fdt2 = int64_to_float64(dt0, &env->active_fpu.fp_status);
     update_fcr31(env);
     return fdt2;
@@ -2501,7 +2504,6 @@ uint64_t helper_float_cvtl_d(CPUMIPSState *env, uint64_t fdt0)
 {
     uint64_t dt2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     dt2 = float64_to_int64(fdt0, &env->active_fpu.fp_status);
     update_fcr31(env);
     if (GET_FP_CAUSE(env->active_fpu.fcr31) & (FP_OVERFLOW | FP_INVALID))
@@ -2513,7 +2515,6 @@ uint64_t helper_float_cvtl_s(CPUMIPSState *env, uint32_t fst0)
 {
     uint64_t dt2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     dt2 = float32_to_int64(fst0, &env->active_fpu.fp_status);
     update_fcr31(env);
     if (GET_FP_CAUSE(env->active_fpu.fcr31) & (FP_OVERFLOW | FP_INVALID))
@@ -2526,7 +2527,6 @@ uint64_t helper_float_cvtps_pw(CPUMIPSState *env, uint64_t dt0)
     uint32_t fst2;
     uint32_t fsth2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     fst2 = int32_to_float32(dt0 & 0XFFFFFFFF, &env->active_fpu.fp_status);
     fsth2 = int32_to_float32(dt0 >> 32, &env->active_fpu.fp_status);
     update_fcr31(env);
@@ -2538,7 +2538,6 @@ uint64_t helper_float_cvtpw_ps(CPUMIPSState *env, uint64_t fdt0)
     uint32_t wt2;
     uint32_t wth2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     wt2 = float32_to_int32(fdt0 & 0XFFFFFFFF, &env->active_fpu.fp_status);
     wth2 = float32_to_int32(fdt0 >> 32, &env->active_fpu.fp_status);
     update_fcr31(env);
@@ -2553,7 +2552,6 @@ uint32_t helper_float_cvts_d(CPUMIPSState *env, uint64_t fdt0)
 {
     uint32_t fst2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     fst2 = float64_to_float32(fdt0, &env->active_fpu.fp_status);
     update_fcr31(env);
     return fst2;
@@ -2563,7 +2561,6 @@ uint32_t helper_float_cvts_w(CPUMIPSState *env, uint32_t wt0)
 {
     uint32_t fst2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     fst2 = int32_to_float32(wt0, &env->active_fpu.fp_status);
     update_fcr31(env);
     return fst2;
@@ -2573,7 +2570,6 @@ uint32_t helper_float_cvts_l(CPUMIPSState *env, uint64_t dt0)
 {
     uint32_t fst2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     fst2 = int64_to_float32(dt0, &env->active_fpu.fp_status);
     update_fcr31(env);
     return fst2;
@@ -2583,7 +2579,6 @@ uint32_t helper_float_cvts_pl(CPUMIPSState *env, uint32_t wt0)
 {
     uint32_t wt2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     wt2 = wt0;
     update_fcr31(env);
     return wt2;
@@ -2593,7 +2588,6 @@ uint32_t helper_float_cvts_pu(CPUMIPSState *env, uint32_t wth0)
 {
     uint32_t wt2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     wt2 = wth0;
     update_fcr31(env);
     return wt2;
@@ -2603,7 +2597,6 @@ uint32_t helper_float_cvtw_s(CPUMIPSState *env, uint32_t fst0)
 {
     uint32_t wt2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     wt2 = float32_to_int32(fst0, &env->active_fpu.fp_status);
     update_fcr31(env);
     if (GET_FP_CAUSE(env->active_fpu.fcr31) & (FP_OVERFLOW | FP_INVALID))
@@ -2615,7 +2608,6 @@ uint32_t helper_float_cvtw_d(CPUMIPSState *env, uint64_t fdt0)
 {
     uint32_t wt2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     wt2 = float64_to_int32(fdt0, &env->active_fpu.fp_status);
     update_fcr31(env);
     if (GET_FP_CAUSE(env->active_fpu.fcr31) & (FP_OVERFLOW | FP_INVALID))
@@ -2627,7 +2619,6 @@ uint64_t helper_float_roundl_d(CPUMIPSState *env, uint64_t fdt0)
 {
     uint64_t dt2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     set_float_rounding_mode(float_round_nearest_even, &env->active_fpu.fp_status);
     dt2 = float64_to_int64(fdt0, &env->active_fpu.fp_status);
     RESTORE_ROUNDING_MODE;
@@ -2641,7 +2632,6 @@ uint64_t helper_float_roundl_s(CPUMIPSState *env, uint32_t fst0)
 {
     uint64_t dt2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     set_float_rounding_mode(float_round_nearest_even, &env->active_fpu.fp_status);
     dt2 = float32_to_int64(fst0, &env->active_fpu.fp_status);
     RESTORE_ROUNDING_MODE;
@@ -2655,7 +2645,6 @@ uint32_t helper_float_roundw_d(CPUMIPSState *env, uint64_t fdt0)
 {
     uint32_t wt2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     set_float_rounding_mode(float_round_nearest_even, &env->active_fpu.fp_status);
     wt2 = float64_to_int32(fdt0, &env->active_fpu.fp_status);
     RESTORE_ROUNDING_MODE;
@@ -2669,7 +2658,6 @@ uint32_t helper_float_roundw_s(CPUMIPSState *env, uint32_t fst0)
 {
     uint32_t wt2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     set_float_rounding_mode(float_round_nearest_even, &env->active_fpu.fp_status);
     wt2 = float32_to_int32(fst0, &env->active_fpu.fp_status);
     RESTORE_ROUNDING_MODE;
@@ -2683,7 +2671,6 @@ uint64_t helper_float_truncl_d(CPUMIPSState *env, uint64_t fdt0)
 {
     uint64_t dt2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     dt2 = float64_to_int64_round_to_zero(fdt0, &env->active_fpu.fp_status);
     update_fcr31(env);
     if (GET_FP_CAUSE(env->active_fpu.fcr31) & (FP_OVERFLOW | FP_INVALID))
@@ -2695,7 +2682,6 @@ uint64_t helper_float_truncl_s(CPUMIPSState *env, uint32_t fst0)
 {
     uint64_t dt2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     dt2 = float32_to_int64_round_to_zero(fst0, &env->active_fpu.fp_status);
     update_fcr31(env);
     if (GET_FP_CAUSE(env->active_fpu.fcr31) & (FP_OVERFLOW | FP_INVALID))
@@ -2707,7 +2693,6 @@ uint32_t helper_float_truncw_d(CPUMIPSState *env, uint64_t fdt0)
 {
     uint32_t wt2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     wt2 = float64_to_int32_round_to_zero(fdt0, &env->active_fpu.fp_status);
     update_fcr31(env);
     if (GET_FP_CAUSE(env->active_fpu.fcr31) & (FP_OVERFLOW | FP_INVALID))
@@ -2719,7 +2704,6 @@ uint32_t helper_float_truncw_s(CPUMIPSState *env, uint32_t fst0)
 {
     uint32_t wt2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     wt2 = float32_to_int32_round_to_zero(fst0, &env->active_fpu.fp_status);
     update_fcr31(env);
     if (GET_FP_CAUSE(env->active_fpu.fcr31) & (FP_OVERFLOW | FP_INVALID))
@@ -2731,7 +2715,6 @@ uint64_t helper_float_ceill_d(CPUMIPSState *env, uint64_t fdt0)
 {
     uint64_t dt2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     set_float_rounding_mode(float_round_up, &env->active_fpu.fp_status);
     dt2 = float64_to_int64(fdt0, &env->active_fpu.fp_status);
     RESTORE_ROUNDING_MODE;
@@ -2745,7 +2728,6 @@ uint64_t helper_float_ceill_s(CPUMIPSState *env, uint32_t fst0)
 {
     uint64_t dt2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     set_float_rounding_mode(float_round_up, &env->active_fpu.fp_status);
     dt2 = float32_to_int64(fst0, &env->active_fpu.fp_status);
     RESTORE_ROUNDING_MODE;
@@ -2759,7 +2741,6 @@ uint32_t helper_float_ceilw_d(CPUMIPSState *env, uint64_t fdt0)
 {
     uint32_t wt2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     set_float_rounding_mode(float_round_up, &env->active_fpu.fp_status);
     wt2 = float64_to_int32(fdt0, &env->active_fpu.fp_status);
     RESTORE_ROUNDING_MODE;
@@ -2773,7 +2754,6 @@ uint32_t helper_float_ceilw_s(CPUMIPSState *env, uint32_t fst0)
 {
     uint32_t wt2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     set_float_rounding_mode(float_round_up, &env->active_fpu.fp_status);
     wt2 = float32_to_int32(fst0, &env->active_fpu.fp_status);
     RESTORE_ROUNDING_MODE;
@@ -2787,7 +2767,6 @@ uint64_t helper_float_floorl_d(CPUMIPSState *env, uint64_t fdt0)
 {
     uint64_t dt2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     set_float_rounding_mode(float_round_down, &env->active_fpu.fp_status);
     dt2 = float64_to_int64(fdt0, &env->active_fpu.fp_status);
     RESTORE_ROUNDING_MODE;
@@ -2801,7 +2780,6 @@ uint64_t helper_float_floorl_s(CPUMIPSState *env, uint32_t fst0)
 {
     uint64_t dt2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     set_float_rounding_mode(float_round_down, &env->active_fpu.fp_status);
     dt2 = float32_to_int64(fst0, &env->active_fpu.fp_status);
     RESTORE_ROUNDING_MODE;
@@ -2815,7 +2793,6 @@ uint32_t helper_float_floorw_d(CPUMIPSState *env, uint64_t fdt0)
 {
     uint32_t wt2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     set_float_rounding_mode(float_round_down, &env->active_fpu.fp_status);
     wt2 = float64_to_int32(fdt0, &env->active_fpu.fp_status);
     RESTORE_ROUNDING_MODE;
@@ -2829,7 +2806,6 @@ uint32_t helper_float_floorw_s(CPUMIPSState *env, uint32_t fst0)
 {
     uint32_t wt2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     set_float_rounding_mode(float_round_down, &env->active_fpu.fp_status);
     wt2 = float32_to_int32(fst0, &env->active_fpu.fp_status);
     RESTORE_ROUNDING_MODE;
@@ -2867,7 +2843,6 @@ uint64_t helper_float_recip_d(CPUMIPSState *env, uint64_t fdt0)
 {
     uint64_t fdt2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     fdt2 = float64_div(FLOAT_ONE64, fdt0, &env->active_fpu.fp_status);
     update_fcr31(env);
     return fdt2;
@@ -2877,7 +2852,6 @@ uint32_t helper_float_recip_s(CPUMIPSState *env, uint32_t fst0)
 {
     uint32_t fst2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     fst2 = float32_div(FLOAT_ONE32, fst0, &env->active_fpu.fp_status);
     update_fcr31(env);
     return fst2;
@@ -2887,7 +2861,6 @@ uint64_t helper_float_rsqrt_d(CPUMIPSState *env, uint64_t fdt0)
 {
     uint64_t fdt2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     fdt2 = float64_sqrt(fdt0, &env->active_fpu.fp_status);
     fdt2 = float64_div(FLOAT_ONE64, fdt2, &env->active_fpu.fp_status);
     update_fcr31(env);
@@ -2898,7 +2871,6 @@ uint32_t helper_float_rsqrt_s(CPUMIPSState *env, uint32_t fst0)
 {
     uint32_t fst2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     fst2 = float32_sqrt(fst0, &env->active_fpu.fp_status);
     fst2 = float32_div(FLOAT_ONE32, fst2, &env->active_fpu.fp_status);
     update_fcr31(env);
@@ -2909,7 +2881,6 @@ uint64_t helper_float_recip1_d(CPUMIPSState *env, uint64_t fdt0)
 {
     uint64_t fdt2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     fdt2 = float64_div(FLOAT_ONE64, fdt0, &env->active_fpu.fp_status);
     update_fcr31(env);
     return fdt2;
@@ -2919,7 +2890,6 @@ uint32_t helper_float_recip1_s(CPUMIPSState *env, uint32_t fst0)
 {
     uint32_t fst2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     fst2 = float32_div(FLOAT_ONE32, fst0, &env->active_fpu.fp_status);
     update_fcr31(env);
     return fst2;
@@ -2930,7 +2900,6 @@ uint64_t helper_float_recip1_ps(CPUMIPSState *env, uint64_t fdt0)
     uint32_t fst2;
     uint32_t fsth2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     fst2 = float32_div(FLOAT_ONE32, fdt0 & 0XFFFFFFFF, &env->active_fpu.fp_status);
     fsth2 = float32_div(FLOAT_ONE32, fdt0 >> 32, &env->active_fpu.fp_status);
     update_fcr31(env);
@@ -2941,7 +2910,6 @@ uint64_t helper_float_rsqrt1_d(CPUMIPSState *env, uint64_t fdt0)
 {
     uint64_t fdt2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     fdt2 = float64_sqrt(fdt0, &env->active_fpu.fp_status);
     fdt2 = float64_div(FLOAT_ONE64, fdt2, &env->active_fpu.fp_status);
     update_fcr31(env);
@@ -2952,7 +2920,6 @@ uint32_t helper_float_rsqrt1_s(CPUMIPSState *env, uint32_t fst0)
 {
     uint32_t fst2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     fst2 = float32_sqrt(fst0, &env->active_fpu.fp_status);
     fst2 = float32_div(FLOAT_ONE32, fst2, &env->active_fpu.fp_status);
     update_fcr31(env);
@@ -2964,7 +2931,6 @@ uint64_t helper_float_rsqrt1_ps(CPUMIPSState *env, uint64_t fdt0)
     uint32_t fst2;
     uint32_t fsth2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     fst2 = float32_sqrt(fdt0 & 0XFFFFFFFF, &env->active_fpu.fp_status);
     fsth2 = float32_sqrt(fdt0 >> 32, &env->active_fpu.fp_status);
     fst2 = float32_div(FLOAT_ONE32, fst2, &env->active_fpu.fp_status);
@@ -2982,7 +2948,6 @@ uint64_t helper_float_ ## name ## _d(CPUMIPSState *env,            \
 {                                                                  \
     uint64_t dt2;                                                  \
                                                                    \
-    set_float_exception_flags(0, &env->active_fpu.fp_status);            \
     dt2 = float64_ ## name (fdt0, fdt1, &env->active_fpu.fp_status);     \
     update_fcr31(env);                                             \
     if (GET_FP_CAUSE(env->active_fpu.fcr31) & FP_INVALID)                \
@@ -2995,7 +2960,6 @@ uint32_t helper_float_ ## name ## _s(CPUMIPSState *env,            \
 {                                                                  \
     uint32_t wt2;                                                  \
                                                                    \
-    set_float_exception_flags(0, &env->active_fpu.fp_status);            \
     wt2 = float32_ ## name (fst0, fst1, &env->active_fpu.fp_status);     \
     update_fcr31(env);                                             \
     if (GET_FP_CAUSE(env->active_fpu.fcr31) & FP_INVALID)                \
@@ -3014,7 +2978,6 @@ uint64_t helper_float_ ## name ## _ps(CPUMIPSState *env,           \
     uint32_t wt2;                                                  \
     uint32_t wth2;                                                 \
                                                                    \
-    set_float_exception_flags(0, &env->active_fpu.fp_status);            \
     wt2 = float32_ ## name (fst0, fst1, &env->active_fpu.fp_status);     \
     wth2 = float32_ ## name (fsth0, fsth1, &env->active_fpu.fp_status);  \
     update_fcr31(env);                                             \
@@ -3037,7 +3000,6 @@ uint64_t helper_float_ ## name ## _d(CPUMIPSState *env,              \
                                      uint64_t fdt0, uint64_t fdt1,   \
                                      uint64_t fdt2)                  \
 {                                                                    \
-    set_float_exception_flags(0, &env->active_fpu.fp_status);        \
     fdt0 = float64_muladd(fdt0, fdt1, fdt2, type,                    \
                          &env->active_fpu.fp_status);                \
     update_fcr31(env);                                               \
@@ -3048,7 +3010,6 @@ uint32_t helper_float_ ## name ## _s(CPUMIPSState *env,              \
                                      uint32_t fst0, uint32_t fst1,   \
                                      uint32_t fst2)                  \
 {                                                                    \
-    set_float_exception_flags(0, &env->active_fpu.fp_status);        \
     fst0 = float32_muladd(fst0, fst1, fst2, type,                    \
                          &env->active_fpu.fp_status);                \
     update_fcr31(env);                                               \
@@ -3066,7 +3027,6 @@ uint64_t helper_float_ ## name ## _ps(CPUMIPSState *env,             \
     uint32_t fst2 = fdt2 & 0XFFFFFFFF;                               \
     uint32_t fsth2 = fdt2 >> 32;                                     \
                                                                      \
-    set_float_exception_flags(0, &env->active_fpu.fp_status);        \
     fst0 = float32_muladd(fst0, fst1, fst2, type,                    \
                           &env->active_fpu.fp_status);               \
     fsth0 = float32_muladd(fsth0, fsth1, fsth2, type,                \
@@ -3083,7 +3043,6 @@ FLOAT_FMA(nmsub, float_muladd_negate_result | float_muladd_negate_c)
 /* MIPS specific binary operations */
 uint64_t helper_float_recip2_d(CPUMIPSState *env, uint64_t fdt0, uint64_t fdt2)
 {
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     fdt2 = float64_mul(fdt0, fdt2, &env->active_fpu.fp_status);
     fdt2 = float64_chs(float64_sub(fdt2, FLOAT_ONE64, &env->active_fpu.fp_status));
     update_fcr31(env);
@@ -3092,7 +3051,6 @@ uint64_t helper_float_recip2_d(CPUMIPSState *env, uint64_t fdt0, uint64_t fdt2)
 
 uint32_t helper_float_recip2_s(CPUMIPSState *env, uint32_t fst0, uint32_t fst2)
 {
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     fst2 = float32_mul(fst0, fst2, &env->active_fpu.fp_status);
     fst2 = float32_chs(float32_sub(fst2, FLOAT_ONE32, &env->active_fpu.fp_status));
     update_fcr31(env);
@@ -3106,7 +3064,6 @@ uint64_t helper_float_recip2_ps(CPUMIPSState *env, uint64_t fdt0, uint64_t fdt2)
     uint32_t fst2 = fdt2 & 0XFFFFFFFF;
     uint32_t fsth2 = fdt2 >> 32;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     fst2 = float32_mul(fst0, fst2, &env->active_fpu.fp_status);
     fsth2 = float32_mul(fsth0, fsth2, &env->active_fpu.fp_status);
     fst2 = float32_chs(float32_sub(fst2, FLOAT_ONE32, &env->active_fpu.fp_status));
@@ -3117,7 +3074,6 @@ uint64_t helper_float_recip2_ps(CPUMIPSState *env, uint64_t fdt0, uint64_t fdt2)
 
 uint64_t helper_float_rsqrt2_d(CPUMIPSState *env, uint64_t fdt0, uint64_t fdt2)
 {
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     fdt2 = float64_mul(fdt0, fdt2, &env->active_fpu.fp_status);
     fdt2 = float64_sub(fdt2, FLOAT_ONE64, &env->active_fpu.fp_status);
     fdt2 = float64_chs(float64_div(fdt2, FLOAT_TWO64, &env->active_fpu.fp_status));
@@ -3127,7 +3083,6 @@ uint64_t helper_float_rsqrt2_d(CPUMIPSState *env, uint64_t fdt0, uint64_t fdt2)
 
 uint32_t helper_float_rsqrt2_s(CPUMIPSState *env, uint32_t fst0, uint32_t fst2)
 {
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     fst2 = float32_mul(fst0, fst2, &env->active_fpu.fp_status);
     fst2 = float32_sub(fst2, FLOAT_ONE32, &env->active_fpu.fp_status);
     fst2 = float32_chs(float32_div(fst2, FLOAT_TWO32, &env->active_fpu.fp_status));
@@ -3142,7 +3097,6 @@ uint64_t helper_float_rsqrt2_ps(CPUMIPSState *env, uint64_t fdt0, uint64_t fdt2)
     uint32_t fst2 = fdt2 & 0XFFFFFFFF;
     uint32_t fsth2 = fdt2 >> 32;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     fst2 = float32_mul(fst0, fst2, &env->active_fpu.fp_status);
     fsth2 = float32_mul(fsth0, fsth2, &env->active_fpu.fp_status);
     fst2 = float32_sub(fst2, FLOAT_ONE32, &env->active_fpu.fp_status);
@@ -3162,7 +3116,6 @@ uint64_t helper_float_addr_ps(CPUMIPSState *env, uint64_t fdt0, uint64_t fdt1)
     uint32_t fst2;
     uint32_t fsth2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     fst2 = float32_add (fst0, fsth0, &env->active_fpu.fp_status);
     fsth2 = float32_add (fst1, fsth1, &env->active_fpu.fp_status);
     update_fcr31(env);
@@ -3178,7 +3131,6 @@ uint64_t helper_float_mulr_ps(CPUMIPSState *env, uint64_t fdt0, uint64_t fdt1)
     uint32_t fst2;
     uint32_t fsth2;
 
-    set_float_exception_flags(0, &env->active_fpu.fp_status);
     fst2 = float32_mul (fst0, fsth0, &env->active_fpu.fp_status);
     fsth2 = float32_mul (fst1, fsth1, &env->active_fpu.fp_status);
     update_fcr31(env);
@@ -3191,7 +3143,6 @@ void helper_cmp_d_ ## op(CPUMIPSState *env, uint64_t fdt0,     \
                          uint64_t fdt1, int cc)                \
 {                                                              \
     int c;                                                     \
-    set_float_exception_flags(0, &env->active_fpu.fp_status);  \
     c = cond;                                                  \
     update_fcr31(env);                                         \
     if (c)                                                     \
@@ -3203,7 +3154,6 @@ void helper_cmpabs_d_ ## op(CPUMIPSState *env, uint64_t fdt0,  \
                             uint64_t fdt1, int cc)             \
 {                                                              \
     int c;                                                     \
-    set_float_exception_flags(0, &env->active_fpu.fp_status);  \
     fdt0 = float64_abs(fdt0);                                  \
     fdt1 = float64_abs(fdt1);                                  \
     c = cond;                                                  \
@@ -3240,7 +3190,6 @@ void helper_cmp_s_ ## op(CPUMIPSState *env, uint32_t fst0,     \
                          uint32_t fst1, int cc)                \
 {                                                              \
     int c;                                                     \
-    set_float_exception_flags(0, &env->active_fpu.fp_status);  \
     c = cond;                                                  \
     update_fcr31(env);                                         \
     if (c)                                                     \
@@ -3252,7 +3201,6 @@ void helper_cmpabs_s_ ## op(CPUMIPSState *env, uint32_t fst0,  \
                             uint32_t fst1, int cc)             \
 {                                                              \
     int c;                                                     \
-    set_float_exception_flags(0, &env->active_fpu.fp_status);  \
     fst0 = float32_abs(fst0);                                  \
     fst1 = float32_abs(fst1);                                  \
     c = cond;                                                  \
@@ -3290,7 +3238,6 @@ void helper_cmp_ps_ ## op(CPUMIPSState *env, uint64_t fdt0,     \
 {                                                               \
     uint32_t fst0, fsth0, fst1, fsth1;                          \
     int ch, cl;                                                 \
-    set_float_exception_flags(0, &env->active_fpu.fp_status);   \
     fst0 = fdt0 & 0XFFFFFFFF;                                   \
     fsth0 = fdt0 >> 32;                                         \
     fst1 = fdt1 & 0XFFFFFFFF;                                   \
-- 
1.7.10.4


* [Qemu-devel] [PATCH v2 06/19] target-mips: fix FPU exceptions
  2012-10-30  0:11 [Qemu-devel] [PATCH v2 00/19] target-mips: misc fixes and optimizations Aurelien Jarno
                   ` (4 preceding siblings ...)
  2012-10-30  0:11 ` [Qemu-devel] [PATCH v2 05/19] target-mips: keep softfloat exception set to 0 between instructions Aurelien Jarno
@ 2012-10-30  0:11 ` Aurelien Jarno
  2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 07/19] target-mips: cleanup float to int conversion helpers Aurelien Jarno
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Aurelien Jarno @ 2012-10-30  0:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

For each FPU instruction that can trigger an FPU exception, call
update_fcr31() afterwards.

Remove the manual NaN assignment for float to float operations, as
softfloat already takes care of that. For float to int operations,
however, the result still has to be replaced by the MIPS-specific value.
In the cvtpw_ps case, the two registers have to be handled separately to
guarantee a correct final value in both of them.
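
To make the cvtpw_ps point concrete, here is a small standalone sketch.
It is a toy model under stated assumptions, not the QEMU helper:
toy_to_int32() and toy_cvtpw_ps() are hypothetical names, the
conversion simply truncates, and TOY_DEFAULT32 is a placeholder for the
value the real code takes from FLOAT_SNAN32.

```c
#include <assert.h>
#include <stdint.h>

enum { TOY_FLAG_INVALID = 1 };     /* arbitrary toy flag bit */
#define TOY_DEFAULT32 0x7fffffffu  /* placeholder for the MIPS default result */

/* Toy conversion: flags NaN and out-of-range inputs, truncates otherwise. */
static uint32_t toy_to_int32(float v, int *flags)
{
    if (v != v || v >= 2147483648.0f || v < -2147483648.0f) {
        *flags |= TOY_FLAG_INVALID;
        return 0;
    }
    return (uint32_t)(int32_t)v;
}

/* Convert both halves, tracking the exception flags of each half separately. */
static uint64_t toy_cvtpw_ps(float lo, float hi, int *flags)
{
    int excp = 0, excph = 0;
    uint32_t wt2, wth2;

    assert(flags);

    wt2 = toy_to_int32(lo, &excp);
    if (excp & TOY_FLAG_INVALID) {
        wt2 = TOY_DEFAULT32;       /* only the low half is replaced */
    }

    wth2 = toy_to_int32(hi, &excph);
    if (excph & TOY_FLAG_INVALID) {
        wth2 = TOY_DEFAULT32;      /* only the high half is replaced */
    }

    *flags |= excp | excph;        /* report the union of both halves */
    return ((uint64_t)wth2 << 32) | wt2;
}
```

With a single shared flag check, a failure in one half would also
overwrite the valid result of the other half; keeping excp and excph
apart avoids that, and OR-ing them at the end still reports every
exception that occurred.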

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-mips/op_helper.c |   32 +++++++++++++++++++-------------
 1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/target-mips/op_helper.c b/target-mips/op_helper.c
index 8204499..7981ea2 100644
--- a/target-mips/op_helper.c
+++ b/target-mips/op_helper.c
@@ -2465,12 +2465,16 @@ static inline void update_fcr31(CPUMIPSState *env)
 /* unary operations, modifying fp status  */
 uint64_t helper_float_sqrt_d(CPUMIPSState *env, uint64_t fdt0)
 {
-    return float64_sqrt(fdt0, &env->active_fpu.fp_status);
+    fdt0 = float64_sqrt(fdt0, &env->active_fpu.fp_status);
+    update_fcr31(env);
+    return fdt0;
 }
 
 uint32_t helper_float_sqrt_s(CPUMIPSState *env, uint32_t fst0)
 {
-    return float32_sqrt(fst0, &env->active_fpu.fp_status);
+    fst0 = float32_sqrt(fst0, &env->active_fpu.fp_status);
+    update_fcr31(env);
+    return fst0;
 }
 
 uint64_t helper_float_cvtd_s(CPUMIPSState *env, uint32_t fst0)
@@ -2537,14 +2541,24 @@ uint64_t helper_float_cvtpw_ps(CPUMIPSState *env, uint64_t fdt0)
 {
     uint32_t wt2;
     uint32_t wth2;
+    int excp, excph;
 
     wt2 = float32_to_int32(fdt0 & 0XFFFFFFFF, &env->active_fpu.fp_status);
-    wth2 = float32_to_int32(fdt0 >> 32, &env->active_fpu.fp_status);
-    update_fcr31(env);
-    if (GET_FP_CAUSE(env->active_fpu.fcr31) & (FP_OVERFLOW | FP_INVALID)) {
+    excp = get_float_exception_flags(&env->active_fpu.fp_status);
+    if (excp & (float_flag_overflow | float_flag_invalid)) {
         wt2 = FLOAT_SNAN32;
+    }
+
+    set_float_exception_flags(0, &env->active_fpu.fp_status);
+    wth2 = float32_to_int32(fdt0 >> 32, &env->active_fpu.fp_status);
+    excph = get_float_exception_flags(&env->active_fpu.fp_status);
+    if (excph & (float_flag_overflow | float_flag_invalid)) {
         wth2 = FLOAT_SNAN32;
     }
+
+    set_float_exception_flags(excp | excph, &env->active_fpu.fp_status);
+    update_fcr31(env);
+
     return ((uint64_t)wth2 << 32) | wt2;
 }
 
@@ -2950,8 +2964,6 @@ uint64_t helper_float_ ## name ## _d(CPUMIPSState *env,            \
                                                                    \
     dt2 = float64_ ## name (fdt0, fdt1, &env->active_fpu.fp_status);     \
     update_fcr31(env);                                             \
-    if (GET_FP_CAUSE(env->active_fpu.fcr31) & FP_INVALID)                \
-        dt2 = FLOAT_QNAN64;                                        \
     return dt2;                                                    \
 }                                                                  \
                                                                    \
@@ -2962,8 +2974,6 @@ uint32_t helper_float_ ## name ## _s(CPUMIPSState *env,            \
                                                                    \
     wt2 = float32_ ## name (fst0, fst1, &env->active_fpu.fp_status);     \
     update_fcr31(env);                                             \
-    if (GET_FP_CAUSE(env->active_fpu.fcr31) & FP_INVALID)                \
-        wt2 = FLOAT_QNAN32;                                        \
     return wt2;                                                    \
 }                                                                  \
                                                                    \
@@ -2981,10 +2991,6 @@ uint64_t helper_float_ ## name ## _ps(CPUMIPSState *env,           \
     wt2 = float32_ ## name (fst0, fst1, &env->active_fpu.fp_status);     \
     wth2 = float32_ ## name (fsth0, fsth1, &env->active_fpu.fp_status);  \
     update_fcr31(env);                                             \
-    if (GET_FP_CAUSE(env->active_fpu.fcr31) & FP_INVALID) {              \
-        wt2 = FLOAT_QNAN32;                                        \
-        wth2 = FLOAT_QNAN32;                                       \
-    }                                                              \
     return ((uint64_t)wth2 << 32) | wt2;                           \
 }
 
-- 
1.7.10.4


* [Qemu-devel] [PATCH v2 07/19] target-mips: cleanup float to int conversion helpers
  2012-10-30  0:11 [Qemu-devel] [PATCH v2 00/19] target-mips: misc fixes and optimizations Aurelien Jarno
                   ` (5 preceding siblings ...)
  2012-10-30  0:11 ` [Qemu-devel] [PATCH v2 06/19] target-mips: fix FPU exceptions Aurelien Jarno
@ 2012-10-30  0:12 ` Aurelien Jarno
  2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 08/19] target-mips: use softfloat constants when possible Aurelien Jarno
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Aurelien Jarno @ 2012-10-30  0:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

Instead of accessing the flags from the floating point control
register after updating it, read the softfloat flags.

This is just code cleanup and should not change the behaviour.
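
A minimal sketch of the resulting pattern, assuming (as in the earlier
patch of this series) that the update step resets the softfloat flags
to 0. All names are hypothetical and the flag values are arbitrary;
TOY_DEFAULT64 is a placeholder for the value the real code takes from
FLOAT_SNAN64.

```c
#include <assert.h>
#include <stdint.h>

enum { TOY_INVALID = 1, TOY_OVERFLOW = 2 };   /* arbitrary toy flag bits */
#define TOY_DEFAULT64 0x7fffffffffffffffULL   /* placeholder default result */

/* Hypothetical stand-in for the FPU state; not a QEMU type. */
typedef struct {
    int sf_flags;  /* models the softfloat exception flags */
    int cause;     /* models the FCR31 Cause field */
} ToyFpu;

/* Toy update step: copies the flags into the cause field, then clears them. */
static void toy_update_fcr31(ToyFpu *f)
{
    f->cause = f->sf_flags;
    f->sf_flags = 0;
}

/* Tail of a float-to-int helper: test the softfloat flags, then update. */
static uint64_t toy_finish_conversion(ToyFpu *f, uint64_t raw)
{
    assert(f);

    /* Read the flags while they are still set, i.e. before the update. */
    if (f->sf_flags & (TOY_INVALID | TOY_OVERFLOW)) {
        raw = TOY_DEFAULT64;
    }
    toy_update_fcr31(f);
    return raw;
}
```

In this model the order matters: once the update step has run, the
softfloat flags read as 0, so the test has to come first.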

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-mips/op_helper.c |  118 +++++++++++++++++++++++++++++++----------------
 1 file changed, 79 insertions(+), 39 deletions(-)

diff --git a/target-mips/op_helper.c b/target-mips/op_helper.c
index 7981ea2..d3a317b 100644
--- a/target-mips/op_helper.c
+++ b/target-mips/op_helper.c
@@ -2509,9 +2509,11 @@ uint64_t helper_float_cvtl_d(CPUMIPSState *env, uint64_t fdt0)
     uint64_t dt2;
 
     dt2 = float64_to_int64(fdt0, &env->active_fpu.fp_status);
-    update_fcr31(env);
-    if (GET_FP_CAUSE(env->active_fpu.fcr31) & (FP_OVERFLOW | FP_INVALID))
+    if (get_float_exception_flags(&env->active_fpu.fp_status)
+        & (float_flag_invalid | float_flag_overflow)) {
         dt2 = FLOAT_SNAN64;
+    }
+    update_fcr31(env);
     return dt2;
 }
 
@@ -2520,9 +2522,11 @@ uint64_t helper_float_cvtl_s(CPUMIPSState *env, uint32_t fst0)
     uint64_t dt2;
 
     dt2 = float32_to_int64(fst0, &env->active_fpu.fp_status);
-    update_fcr31(env);
-    if (GET_FP_CAUSE(env->active_fpu.fcr31) & (FP_OVERFLOW | FP_INVALID))
+    if (get_float_exception_flags(&env->active_fpu.fp_status)
+        & (float_flag_invalid | float_flag_overflow)) {
         dt2 = FLOAT_SNAN64;
+    }
+    update_fcr31(env);
     return dt2;
 }
 
@@ -2613,8 +2617,10 @@ uint32_t helper_float_cvtw_s(CPUMIPSState *env, uint32_t fst0)
 
     wt2 = float32_to_int32(fst0, &env->active_fpu.fp_status);
     update_fcr31(env);
-    if (GET_FP_CAUSE(env->active_fpu.fcr31) & (FP_OVERFLOW | FP_INVALID))
+    if (get_float_exception_flags(&env->active_fpu.fp_status)
+        & (float_flag_invalid | float_flag_overflow)) {
         wt2 = FLOAT_SNAN32;
+    }
     return wt2;
 }
 
@@ -2623,9 +2629,11 @@ uint32_t helper_float_cvtw_d(CPUMIPSState *env, uint64_t fdt0)
     uint32_t wt2;
 
     wt2 = float64_to_int32(fdt0, &env->active_fpu.fp_status);
-    update_fcr31(env);
-    if (GET_FP_CAUSE(env->active_fpu.fcr31) & (FP_OVERFLOW | FP_INVALID))
+    if (get_float_exception_flags(&env->active_fpu.fp_status)
+        & (float_flag_invalid | float_flag_overflow)) {
         wt2 = FLOAT_SNAN32;
+    }
+    update_fcr31(env);
     return wt2;
 }
 
@@ -2636,9 +2644,11 @@ uint64_t helper_float_roundl_d(CPUMIPSState *env, uint64_t fdt0)
     set_float_rounding_mode(float_round_nearest_even, &env->active_fpu.fp_status);
     dt2 = float64_to_int64(fdt0, &env->active_fpu.fp_status);
     RESTORE_ROUNDING_MODE;
-    update_fcr31(env);
-    if (GET_FP_CAUSE(env->active_fpu.fcr31) & (FP_OVERFLOW | FP_INVALID))
+    if (get_float_exception_flags(&env->active_fpu.fp_status)
+        & (float_flag_invalid | float_flag_overflow)) {
         dt2 = FLOAT_SNAN64;
+    }
+    update_fcr31(env);
     return dt2;
 }
 
@@ -2649,9 +2659,11 @@ uint64_t helper_float_roundl_s(CPUMIPSState *env, uint32_t fst0)
     set_float_rounding_mode(float_round_nearest_even, &env->active_fpu.fp_status);
     dt2 = float32_to_int64(fst0, &env->active_fpu.fp_status);
     RESTORE_ROUNDING_MODE;
-    update_fcr31(env);
-    if (GET_FP_CAUSE(env->active_fpu.fcr31) & (FP_OVERFLOW | FP_INVALID))
+    if (get_float_exception_flags(&env->active_fpu.fp_status)
+        & (float_flag_invalid | float_flag_overflow)) {
         dt2 = FLOAT_SNAN64;
+    }
+    update_fcr31(env);
     return dt2;
 }
 
@@ -2662,9 +2674,11 @@ uint32_t helper_float_roundw_d(CPUMIPSState *env, uint64_t fdt0)
     set_float_rounding_mode(float_round_nearest_even, &env->active_fpu.fp_status);
     wt2 = float64_to_int32(fdt0, &env->active_fpu.fp_status);
     RESTORE_ROUNDING_MODE;
-    update_fcr31(env);
-    if (GET_FP_CAUSE(env->active_fpu.fcr31) & (FP_OVERFLOW | FP_INVALID))
+    if (get_float_exception_flags(&env->active_fpu.fp_status)
+        & (float_flag_invalid | float_flag_overflow)) {
         wt2 = FLOAT_SNAN32;
+    }
+    update_fcr31(env);
     return wt2;
 }
 
@@ -2675,9 +2689,11 @@ uint32_t helper_float_roundw_s(CPUMIPSState *env, uint32_t fst0)
     set_float_rounding_mode(float_round_nearest_even, &env->active_fpu.fp_status);
     wt2 = float32_to_int32(fst0, &env->active_fpu.fp_status);
     RESTORE_ROUNDING_MODE;
-    update_fcr31(env);
-    if (GET_FP_CAUSE(env->active_fpu.fcr31) & (FP_OVERFLOW | FP_INVALID))
+    if (get_float_exception_flags(&env->active_fpu.fp_status)
+        & (float_flag_invalid | float_flag_overflow)) {
         wt2 = FLOAT_SNAN32;
+    }
+    update_fcr31(env);
     return wt2;
 }
 
@@ -2686,9 +2702,11 @@ uint64_t helper_float_truncl_d(CPUMIPSState *env, uint64_t fdt0)
     uint64_t dt2;
 
     dt2 = float64_to_int64_round_to_zero(fdt0, &env->active_fpu.fp_status);
-    update_fcr31(env);
-    if (GET_FP_CAUSE(env->active_fpu.fcr31) & (FP_OVERFLOW | FP_INVALID))
+    if (get_float_exception_flags(&env->active_fpu.fp_status)
+        & (float_flag_invalid | float_flag_overflow)) {
         dt2 = FLOAT_SNAN64;
+    }
+    update_fcr31(env);
     return dt2;
 }
 
@@ -2697,9 +2715,11 @@ uint64_t helper_float_truncl_s(CPUMIPSState *env, uint32_t fst0)
     uint64_t dt2;
 
     dt2 = float32_to_int64_round_to_zero(fst0, &env->active_fpu.fp_status);
-    update_fcr31(env);
-    if (GET_FP_CAUSE(env->active_fpu.fcr31) & (FP_OVERFLOW | FP_INVALID))
+    if (get_float_exception_flags(&env->active_fpu.fp_status)
+        & (float_flag_invalid | float_flag_overflow)) {
         dt2 = FLOAT_SNAN64;
+    }
+    update_fcr31(env);
     return dt2;
 }
 
@@ -2708,9 +2728,11 @@ uint32_t helper_float_truncw_d(CPUMIPSState *env, uint64_t fdt0)
     uint32_t wt2;
 
     wt2 = float64_to_int32_round_to_zero(fdt0, &env->active_fpu.fp_status);
-    update_fcr31(env);
-    if (GET_FP_CAUSE(env->active_fpu.fcr31) & (FP_OVERFLOW | FP_INVALID))
+    if (get_float_exception_flags(&env->active_fpu.fp_status)
+        & (float_flag_invalid | float_flag_overflow)) {
         wt2 = FLOAT_SNAN32;
+    }
+    update_fcr31(env);
     return wt2;
 }
 
@@ -2719,9 +2741,11 @@ uint32_t helper_float_truncw_s(CPUMIPSState *env, uint32_t fst0)
     uint32_t wt2;
 
     wt2 = float32_to_int32_round_to_zero(fst0, &env->active_fpu.fp_status);
-    update_fcr31(env);
-    if (GET_FP_CAUSE(env->active_fpu.fcr31) & (FP_OVERFLOW | FP_INVALID))
+    if (get_float_exception_flags(&env->active_fpu.fp_status)
+        & (float_flag_invalid | float_flag_overflow)) {
         wt2 = FLOAT_SNAN32;
+    }
+    update_fcr31(env);
     return wt2;
 }
 
@@ -2732,9 +2756,11 @@ uint64_t helper_float_ceill_d(CPUMIPSState *env, uint64_t fdt0)
     set_float_rounding_mode(float_round_up, &env->active_fpu.fp_status);
     dt2 = float64_to_int64(fdt0, &env->active_fpu.fp_status);
     RESTORE_ROUNDING_MODE;
-    update_fcr31(env);
-    if (GET_FP_CAUSE(env->active_fpu.fcr31) & (FP_OVERFLOW | FP_INVALID))
+    if (get_float_exception_flags(&env->active_fpu.fp_status)
+        & (float_flag_invalid | float_flag_overflow)) {
         dt2 = FLOAT_SNAN64;
+    }
+    update_fcr31(env);
     return dt2;
 }
 
@@ -2745,9 +2771,11 @@ uint64_t helper_float_ceill_s(CPUMIPSState *env, uint32_t fst0)
     set_float_rounding_mode(float_round_up, &env->active_fpu.fp_status);
     dt2 = float32_to_int64(fst0, &env->active_fpu.fp_status);
     RESTORE_ROUNDING_MODE;
-    update_fcr31(env);
-    if (GET_FP_CAUSE(env->active_fpu.fcr31) & (FP_OVERFLOW | FP_INVALID))
+    if (get_float_exception_flags(&env->active_fpu.fp_status)
+        & (float_flag_invalid | float_flag_overflow)) {
         dt2 = FLOAT_SNAN64;
+    }
+    update_fcr31(env);
     return dt2;
 }
 
@@ -2758,9 +2786,11 @@ uint32_t helper_float_ceilw_d(CPUMIPSState *env, uint64_t fdt0)
     set_float_rounding_mode(float_round_up, &env->active_fpu.fp_status);
     wt2 = float64_to_int32(fdt0, &env->active_fpu.fp_status);
     RESTORE_ROUNDING_MODE;
-    update_fcr31(env);
-    if (GET_FP_CAUSE(env->active_fpu.fcr31) & (FP_OVERFLOW | FP_INVALID))
+    if (get_float_exception_flags(&env->active_fpu.fp_status)
+        & (float_flag_invalid | float_flag_overflow)) {
         wt2 = FLOAT_SNAN32;
+    }
+    update_fcr31(env);
     return wt2;
 }
 
@@ -2771,9 +2801,11 @@ uint32_t helper_float_ceilw_s(CPUMIPSState *env, uint32_t fst0)
     set_float_rounding_mode(float_round_up, &env->active_fpu.fp_status);
     wt2 = float32_to_int32(fst0, &env->active_fpu.fp_status);
     RESTORE_ROUNDING_MODE;
-    update_fcr31(env);
-    if (GET_FP_CAUSE(env->active_fpu.fcr31) & (FP_OVERFLOW | FP_INVALID))
+    if (get_float_exception_flags(&env->active_fpu.fp_status)
+        & (float_flag_invalid | float_flag_overflow)) {
         wt2 = FLOAT_SNAN32;
+    }
+    update_fcr31(env);
     return wt2;
 }
 
@@ -2784,9 +2816,11 @@ uint64_t helper_float_floorl_d(CPUMIPSState *env, uint64_t fdt0)
     set_float_rounding_mode(float_round_down, &env->active_fpu.fp_status);
     dt2 = float64_to_int64(fdt0, &env->active_fpu.fp_status);
     RESTORE_ROUNDING_MODE;
-    update_fcr31(env);
-    if (GET_FP_CAUSE(env->active_fpu.fcr31) & (FP_OVERFLOW | FP_INVALID))
+    if (get_float_exception_flags(&env->active_fpu.fp_status)
+        & (float_flag_invalid | float_flag_overflow)) {
         dt2 = FLOAT_SNAN64;
+    }
+    update_fcr31(env);
     return dt2;
 }
 
@@ -2797,9 +2831,11 @@ uint64_t helper_float_floorl_s(CPUMIPSState *env, uint32_t fst0)
     set_float_rounding_mode(float_round_down, &env->active_fpu.fp_status);
     dt2 = float32_to_int64(fst0, &env->active_fpu.fp_status);
     RESTORE_ROUNDING_MODE;
-    update_fcr31(env);
-    if (GET_FP_CAUSE(env->active_fpu.fcr31) & (FP_OVERFLOW | FP_INVALID))
+    if (get_float_exception_flags(&env->active_fpu.fp_status)
+        & (float_flag_invalid | float_flag_overflow)) {
         dt2 = FLOAT_SNAN64;
+    }
+    update_fcr31(env);
     return dt2;
 }
 
@@ -2810,9 +2846,11 @@ uint32_t helper_float_floorw_d(CPUMIPSState *env, uint64_t fdt0)
     set_float_rounding_mode(float_round_down, &env->active_fpu.fp_status);
     wt2 = float64_to_int32(fdt0, &env->active_fpu.fp_status);
     RESTORE_ROUNDING_MODE;
-    update_fcr31(env);
-    if (GET_FP_CAUSE(env->active_fpu.fcr31) & (FP_OVERFLOW | FP_INVALID))
+    if (get_float_exception_flags(&env->active_fpu.fp_status)
+        & (float_flag_invalid | float_flag_overflow)) {
         wt2 = FLOAT_SNAN32;
+    }
+    update_fcr31(env);
     return wt2;
 }
 
@@ -2823,9 +2861,11 @@ uint32_t helper_float_floorw_s(CPUMIPSState *env, uint32_t fst0)
     set_float_rounding_mode(float_round_down, &env->active_fpu.fp_status);
     wt2 = float32_to_int32(fst0, &env->active_fpu.fp_status);
     RESTORE_ROUNDING_MODE;
-    update_fcr31(env);
-    if (GET_FP_CAUSE(env->active_fpu.fcr31) & (FP_OVERFLOW | FP_INVALID))
+    if (get_float_exception_flags(&env->active_fpu.fp_status)
+        & (float_flag_invalid | float_flag_overflow)) {
         wt2 = FLOAT_SNAN32;
+    }
+    update_fcr31(env);
     return wt2;
 }
 
-- 
1.7.10.4

* [Qemu-devel] [PATCH v2 08/19] target-mips: use softfloat constants when possible
  2012-10-30  0:11 [Qemu-devel] [PATCH v2 00/19] target-mips: misc fixes and optimizations Aurelien Jarno
                   ` (6 preceding siblings ...)
  2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 07/19] target-mips: cleanup float to int conversion helpers Aurelien Jarno
@ 2012-10-30  0:12 ` Aurelien Jarno
  2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 09/19] target-mips: restore CPU state after an FPU exception Aurelien Jarno
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Aurelien Jarno @ 2012-10-30  0:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

softfloat already defines a few of these constants; use them instead of
redefining them in target-mips.

Rename FLOAT_SNAN32 and FLOAT_SNAN64 to FP_TO_INT32_OVERFLOW and
FP_TO_INT64_OVERFLOW: although the values are the same, the two are
technically different (and are defined differently in the MIPS ISA).

Remove the unused constants.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
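Note on the rename: a minimal sketch, not part of the patch, of why the
old name was misleading. Read as a single-precision bit pattern,
0x7fffffff is a NaN, which is presumably where the FLOAT_SNAN32 name came
from; as the result of a float-to-int conversion it is simply the largest
positive int32. bits_to_float() and demo_truncw_s() are hypothetical
stand-ins written for this note (they are not QEMU or softfloat
functions), and the sketch assumes the host float is IEEE 754 binary32.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

#define FP_TO_INT32_OVERFLOW 0x7fffffff  /* value taken from the patch */

/* Reinterpret a 32-bit pattern as a float (assumes IEEE 754 binary32). */
static float bits_to_float(uint32_t bits)
{
    float f;

    assert(sizeof(f) == sizeof(bits));
    memcpy(&f, &bits, sizeof(f));
    return f;
}

/* Hypothetical stand-in for a truncating float -> int32 helper: NaN and
 * out-of-range inputs yield the overflow constant instead of invoking
 * undefined behaviour in the cast. */
static int32_t demo_truncw_s(float f)
{
    if (isnan(f) || f >= 2147483648.0f || f < -2147483648.0f) {
        return FP_TO_INT32_OVERFLOW;
    }
    return (int32_t)f;  /* in range: the cast truncates toward zero */
}
```

With this sketch, demo_truncw_s(1e20f) and demo_truncw_s(NAN) both return
INT32_MAX, while isnan(bits_to_float(FP_TO_INT32_OVERFLOW)) is true: the
same 32 bits, two different meanings.
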
 target-mips/op_helper.c |   92 +++++++++++++++++++++++------------------------
 1 file changed, 44 insertions(+), 48 deletions(-)

diff --git a/target-mips/op_helper.c b/target-mips/op_helper.c
index d3a317b..2f9ec5d 100644
--- a/target-mips/op_helper.c
+++ b/target-mips/op_helper.c
@@ -2332,14 +2332,10 @@ void cpu_unassigned_access(CPUMIPSState *env, hwaddr addr,
 
 /* Complex FPU operations which may need stack space. */
 
-#define FLOAT_ONE32 make_float32(0x3f8 << 20)
-#define FLOAT_ONE64 make_float64(0x3ffULL << 52)
 #define FLOAT_TWO32 make_float32(1 << 30)
 #define FLOAT_TWO64 make_float64(1ULL << 62)
-#define FLOAT_QNAN32 0x7fbfffff
-#define FLOAT_QNAN64 0x7ff7ffffffffffffULL
-#define FLOAT_SNAN32 0x7fffffff
-#define FLOAT_SNAN64 0x7fffffffffffffffULL
+#define FP_TO_INT32_OVERFLOW 0x7fffffff
+#define FP_TO_INT64_OVERFLOW 0x7fffffffffffffffULL
 
 /* convert MIPS rounding mode in FCR31 to IEEE library */
 static unsigned int ieee_rm[] = {
@@ -2511,7 +2507,7 @@ uint64_t helper_float_cvtl_d(CPUMIPSState *env, uint64_t fdt0)
     dt2 = float64_to_int64(fdt0, &env->active_fpu.fp_status);
     if (get_float_exception_flags(&env->active_fpu.fp_status)
         & (float_flag_invalid | float_flag_overflow)) {
-        dt2 = FLOAT_SNAN64;
+        dt2 = FP_TO_INT64_OVERFLOW;
     }
     update_fcr31(env);
     return dt2;
@@ -2524,7 +2520,7 @@ uint64_t helper_float_cvtl_s(CPUMIPSState *env, uint32_t fst0)
     dt2 = float32_to_int64(fst0, &env->active_fpu.fp_status);
     if (get_float_exception_flags(&env->active_fpu.fp_status)
         & (float_flag_invalid | float_flag_overflow)) {
-        dt2 = FLOAT_SNAN64;
+        dt2 = FP_TO_INT64_OVERFLOW;
     }
     update_fcr31(env);
     return dt2;
@@ -2550,14 +2546,14 @@ uint64_t helper_float_cvtpw_ps(CPUMIPSState *env, uint64_t fdt0)
     wt2 = float32_to_int32(fdt0 & 0XFFFFFFFF, &env->active_fpu.fp_status);
     excp = get_float_exception_flags(&env->active_fpu.fp_status);
     if (excp & (float_flag_overflow | float_flag_invalid)) {
-        wt2 = FLOAT_SNAN32;
+        wt2 = FP_TO_INT32_OVERFLOW;
     }
 
     set_float_exception_flags(0, &env->active_fpu.fp_status);
     wth2 = float32_to_int32(fdt0 >> 32, &env->active_fpu.fp_status);
     excph = get_float_exception_flags(&env->active_fpu.fp_status);
     if (excph & (float_flag_overflow | float_flag_invalid)) {
-        wth2 = FLOAT_SNAN32;
+        wth2 = FP_TO_INT32_OVERFLOW;
     }
 
     set_float_exception_flags(excp | excph, &env->active_fpu.fp_status);
@@ -2619,7 +2615,7 @@ uint32_t helper_float_cvtw_s(CPUMIPSState *env, uint32_t fst0)
     update_fcr31(env);
     if (get_float_exception_flags(&env->active_fpu.fp_status)
         & (float_flag_invalid | float_flag_overflow)) {
-        wt2 = FLOAT_SNAN32;
+        wt2 = FP_TO_INT32_OVERFLOW;
     }
     return wt2;
 }
@@ -2631,7 +2627,7 @@ uint32_t helper_float_cvtw_d(CPUMIPSState *env, uint64_t fdt0)
     wt2 = float64_to_int32(fdt0, &env->active_fpu.fp_status);
     if (get_float_exception_flags(&env->active_fpu.fp_status)
         & (float_flag_invalid | float_flag_overflow)) {
-        wt2 = FLOAT_SNAN32;
+        wt2 = FP_TO_INT32_OVERFLOW;
     }
     update_fcr31(env);
     return wt2;
@@ -2646,7 +2642,7 @@ uint64_t helper_float_roundl_d(CPUMIPSState *env, uint64_t fdt0)
     RESTORE_ROUNDING_MODE;
     if (get_float_exception_flags(&env->active_fpu.fp_status)
         & (float_flag_invalid | float_flag_overflow)) {
-        dt2 = FLOAT_SNAN64;
+        dt2 = FP_TO_INT64_OVERFLOW;
     }
     update_fcr31(env);
     return dt2;
@@ -2661,7 +2657,7 @@ uint64_t helper_float_roundl_s(CPUMIPSState *env, uint32_t fst0)
     RESTORE_ROUNDING_MODE;
     if (get_float_exception_flags(&env->active_fpu.fp_status)
         & (float_flag_invalid | float_flag_overflow)) {
-        dt2 = FLOAT_SNAN64;
+        dt2 = FP_TO_INT64_OVERFLOW;
     }
     update_fcr31(env);
     return dt2;
@@ -2676,7 +2672,7 @@ uint32_t helper_float_roundw_d(CPUMIPSState *env, uint64_t fdt0)
     RESTORE_ROUNDING_MODE;
     if (get_float_exception_flags(&env->active_fpu.fp_status)
         & (float_flag_invalid | float_flag_overflow)) {
-        wt2 = FLOAT_SNAN32;
+        wt2 = FP_TO_INT32_OVERFLOW;
     }
     update_fcr31(env);
     return wt2;
@@ -2691,7 +2687,7 @@ uint32_t helper_float_roundw_s(CPUMIPSState *env, uint32_t fst0)
     RESTORE_ROUNDING_MODE;
     if (get_float_exception_flags(&env->active_fpu.fp_status)
         & (float_flag_invalid | float_flag_overflow)) {
-        wt2 = FLOAT_SNAN32;
+        wt2 = FP_TO_INT32_OVERFLOW;
     }
     update_fcr31(env);
     return wt2;
@@ -2704,7 +2700,7 @@ uint64_t helper_float_truncl_d(CPUMIPSState *env, uint64_t fdt0)
     dt2 = float64_to_int64_round_to_zero(fdt0, &env->active_fpu.fp_status);
     if (get_float_exception_flags(&env->active_fpu.fp_status)
         & (float_flag_invalid | float_flag_overflow)) {
-        dt2 = FLOAT_SNAN64;
+        dt2 = FP_TO_INT64_OVERFLOW;
     }
     update_fcr31(env);
     return dt2;
@@ -2717,7 +2713,7 @@ uint64_t helper_float_truncl_s(CPUMIPSState *env, uint32_t fst0)
     dt2 = float32_to_int64_round_to_zero(fst0, &env->active_fpu.fp_status);
     if (get_float_exception_flags(&env->active_fpu.fp_status)
         & (float_flag_invalid | float_flag_overflow)) {
-        dt2 = FLOAT_SNAN64;
+        dt2 = FP_TO_INT64_OVERFLOW;
     }
     update_fcr31(env);
     return dt2;
@@ -2730,7 +2726,7 @@ uint32_t helper_float_truncw_d(CPUMIPSState *env, uint64_t fdt0)
     wt2 = float64_to_int32_round_to_zero(fdt0, &env->active_fpu.fp_status);
     if (get_float_exception_flags(&env->active_fpu.fp_status)
         & (float_flag_invalid | float_flag_overflow)) {
-        wt2 = FLOAT_SNAN32;
+        wt2 = FP_TO_INT32_OVERFLOW;
     }
     update_fcr31(env);
     return wt2;
@@ -2743,7 +2739,7 @@ uint32_t helper_float_truncw_s(CPUMIPSState *env, uint32_t fst0)
     wt2 = float32_to_int32_round_to_zero(fst0, &env->active_fpu.fp_status);
     if (get_float_exception_flags(&env->active_fpu.fp_status)
         & (float_flag_invalid | float_flag_overflow)) {
-        wt2 = FLOAT_SNAN32;
+        wt2 = FP_TO_INT32_OVERFLOW;
     }
     update_fcr31(env);
     return wt2;
@@ -2758,7 +2754,7 @@ uint64_t helper_float_ceill_d(CPUMIPSState *env, uint64_t fdt0)
     RESTORE_ROUNDING_MODE;
     if (get_float_exception_flags(&env->active_fpu.fp_status)
         & (float_flag_invalid | float_flag_overflow)) {
-        dt2 = FLOAT_SNAN64;
+        dt2 = FP_TO_INT64_OVERFLOW;
     }
     update_fcr31(env);
     return dt2;
@@ -2773,7 +2769,7 @@ uint64_t helper_float_ceill_s(CPUMIPSState *env, uint32_t fst0)
     RESTORE_ROUNDING_MODE;
     if (get_float_exception_flags(&env->active_fpu.fp_status)
         & (float_flag_invalid | float_flag_overflow)) {
-        dt2 = FLOAT_SNAN64;
+        dt2 = FP_TO_INT64_OVERFLOW;
     }
     update_fcr31(env);
     return dt2;
@@ -2788,7 +2784,7 @@ uint32_t helper_float_ceilw_d(CPUMIPSState *env, uint64_t fdt0)
     RESTORE_ROUNDING_MODE;
     if (get_float_exception_flags(&env->active_fpu.fp_status)
         & (float_flag_invalid | float_flag_overflow)) {
-        wt2 = FLOAT_SNAN32;
+        wt2 = FP_TO_INT32_OVERFLOW;
     }
     update_fcr31(env);
     return wt2;
@@ -2803,7 +2799,7 @@ uint32_t helper_float_ceilw_s(CPUMIPSState *env, uint32_t fst0)
     RESTORE_ROUNDING_MODE;
     if (get_float_exception_flags(&env->active_fpu.fp_status)
         & (float_flag_invalid | float_flag_overflow)) {
-        wt2 = FLOAT_SNAN32;
+        wt2 = FP_TO_INT32_OVERFLOW;
     }
     update_fcr31(env);
     return wt2;
@@ -2818,7 +2814,7 @@ uint64_t helper_float_floorl_d(CPUMIPSState *env, uint64_t fdt0)
     RESTORE_ROUNDING_MODE;
     if (get_float_exception_flags(&env->active_fpu.fp_status)
         & (float_flag_invalid | float_flag_overflow)) {
-        dt2 = FLOAT_SNAN64;
+        dt2 = FP_TO_INT64_OVERFLOW;
     }
     update_fcr31(env);
     return dt2;
@@ -2833,7 +2829,7 @@ uint64_t helper_float_floorl_s(CPUMIPSState *env, uint32_t fst0)
     RESTORE_ROUNDING_MODE;
     if (get_float_exception_flags(&env->active_fpu.fp_status)
         & (float_flag_invalid | float_flag_overflow)) {
-        dt2 = FLOAT_SNAN64;
+        dt2 = FP_TO_INT64_OVERFLOW;
     }
     update_fcr31(env);
     return dt2;
@@ -2848,7 +2844,7 @@ uint32_t helper_float_floorw_d(CPUMIPSState *env, uint64_t fdt0)
     RESTORE_ROUNDING_MODE;
     if (get_float_exception_flags(&env->active_fpu.fp_status)
         & (float_flag_invalid | float_flag_overflow)) {
-        wt2 = FLOAT_SNAN32;
+        wt2 = FP_TO_INT32_OVERFLOW;
     }
     update_fcr31(env);
     return wt2;
@@ -2863,7 +2859,7 @@ uint32_t helper_float_floorw_s(CPUMIPSState *env, uint32_t fst0)
     RESTORE_ROUNDING_MODE;
     if (get_float_exception_flags(&env->active_fpu.fp_status)
         & (float_flag_invalid | float_flag_overflow)) {
-        wt2 = FLOAT_SNAN32;
+        wt2 = FP_TO_INT32_OVERFLOW;
     }
     update_fcr31(env);
     return wt2;
@@ -2897,7 +2893,7 @@ uint64_t helper_float_recip_d(CPUMIPSState *env, uint64_t fdt0)
 {
     uint64_t fdt2;
 
-    fdt2 = float64_div(FLOAT_ONE64, fdt0, &env->active_fpu.fp_status);
+    fdt2 = float64_div(float64_one, fdt0, &env->active_fpu.fp_status);
     update_fcr31(env);
     return fdt2;
 }
@@ -2906,7 +2902,7 @@ uint32_t helper_float_recip_s(CPUMIPSState *env, uint32_t fst0)
 {
     uint32_t fst2;
 
-    fst2 = float32_div(FLOAT_ONE32, fst0, &env->active_fpu.fp_status);
+    fst2 = float32_div(float32_one, fst0, &env->active_fpu.fp_status);
     update_fcr31(env);
     return fst2;
 }
@@ -2916,7 +2912,7 @@ uint64_t helper_float_rsqrt_d(CPUMIPSState *env, uint64_t fdt0)
     uint64_t fdt2;
 
     fdt2 = float64_sqrt(fdt0, &env->active_fpu.fp_status);
-    fdt2 = float64_div(FLOAT_ONE64, fdt2, &env->active_fpu.fp_status);
+    fdt2 = float64_div(float64_one, fdt2, &env->active_fpu.fp_status);
     update_fcr31(env);
     return fdt2;
 }
@@ -2926,7 +2922,7 @@ uint32_t helper_float_rsqrt_s(CPUMIPSState *env, uint32_t fst0)
     uint32_t fst2;
 
     fst2 = float32_sqrt(fst0, &env->active_fpu.fp_status);
-    fst2 = float32_div(FLOAT_ONE32, fst2, &env->active_fpu.fp_status);
+    fst2 = float32_div(float32_one, fst2, &env->active_fpu.fp_status);
     update_fcr31(env);
     return fst2;
 }
@@ -2935,7 +2931,7 @@ uint64_t helper_float_recip1_d(CPUMIPSState *env, uint64_t fdt0)
 {
     uint64_t fdt2;
 
-    fdt2 = float64_div(FLOAT_ONE64, fdt0, &env->active_fpu.fp_status);
+    fdt2 = float64_div(float64_one, fdt0, &env->active_fpu.fp_status);
     update_fcr31(env);
     return fdt2;
 }
@@ -2944,7 +2940,7 @@ uint32_t helper_float_recip1_s(CPUMIPSState *env, uint32_t fst0)
 {
     uint32_t fst2;
 
-    fst2 = float32_div(FLOAT_ONE32, fst0, &env->active_fpu.fp_status);
+    fst2 = float32_div(float32_one, fst0, &env->active_fpu.fp_status);
     update_fcr31(env);
     return fst2;
 }
@@ -2954,8 +2950,8 @@ uint64_t helper_float_recip1_ps(CPUMIPSState *env, uint64_t fdt0)
     uint32_t fst2;
     uint32_t fsth2;
 
-    fst2 = float32_div(FLOAT_ONE32, fdt0 & 0XFFFFFFFF, &env->active_fpu.fp_status);
-    fsth2 = float32_div(FLOAT_ONE32, fdt0 >> 32, &env->active_fpu.fp_status);
+    fst2 = float32_div(float32_one, fdt0 & 0XFFFFFFFF, &env->active_fpu.fp_status);
+    fsth2 = float32_div(float32_one, fdt0 >> 32, &env->active_fpu.fp_status);
     update_fcr31(env);
     return ((uint64_t)fsth2 << 32) | fst2;
 }
@@ -2965,7 +2961,7 @@ uint64_t helper_float_rsqrt1_d(CPUMIPSState *env, uint64_t fdt0)
     uint64_t fdt2;
 
     fdt2 = float64_sqrt(fdt0, &env->active_fpu.fp_status);
-    fdt2 = float64_div(FLOAT_ONE64, fdt2, &env->active_fpu.fp_status);
+    fdt2 = float64_div(float64_one, fdt2, &env->active_fpu.fp_status);
     update_fcr31(env);
     return fdt2;
 }
@@ -2975,7 +2971,7 @@ uint32_t helper_float_rsqrt1_s(CPUMIPSState *env, uint32_t fst0)
     uint32_t fst2;
 
     fst2 = float32_sqrt(fst0, &env->active_fpu.fp_status);
-    fst2 = float32_div(FLOAT_ONE32, fst2, &env->active_fpu.fp_status);
+    fst2 = float32_div(float32_one, fst2, &env->active_fpu.fp_status);
     update_fcr31(env);
     return fst2;
 }
@@ -2987,8 +2983,8 @@ uint64_t helper_float_rsqrt1_ps(CPUMIPSState *env, uint64_t fdt0)
 
     fst2 = float32_sqrt(fdt0 & 0XFFFFFFFF, &env->active_fpu.fp_status);
     fsth2 = float32_sqrt(fdt0 >> 32, &env->active_fpu.fp_status);
-    fst2 = float32_div(FLOAT_ONE32, fst2, &env->active_fpu.fp_status);
-    fsth2 = float32_div(FLOAT_ONE32, fsth2, &env->active_fpu.fp_status);
+    fst2 = float32_div(float32_one, fst2, &env->active_fpu.fp_status);
+    fsth2 = float32_div(float32_one, fsth2, &env->active_fpu.fp_status);
     update_fcr31(env);
     return ((uint64_t)fsth2 << 32) | fst2;
 }
@@ -3090,7 +3086,7 @@ FLOAT_FMA(nmsub, float_muladd_negate_result | float_muladd_negate_c)
 uint64_t helper_float_recip2_d(CPUMIPSState *env, uint64_t fdt0, uint64_t fdt2)
 {
     fdt2 = float64_mul(fdt0, fdt2, &env->active_fpu.fp_status);
-    fdt2 = float64_chs(float64_sub(fdt2, FLOAT_ONE64, &env->active_fpu.fp_status));
+    fdt2 = float64_chs(float64_sub(fdt2, float64_one, &env->active_fpu.fp_status));
     update_fcr31(env);
     return fdt2;
 }
@@ -3098,7 +3094,7 @@ uint64_t helper_float_recip2_d(CPUMIPSState *env, uint64_t fdt0, uint64_t fdt2)
 uint32_t helper_float_recip2_s(CPUMIPSState *env, uint32_t fst0, uint32_t fst2)
 {
     fst2 = float32_mul(fst0, fst2, &env->active_fpu.fp_status);
-    fst2 = float32_chs(float32_sub(fst2, FLOAT_ONE32, &env->active_fpu.fp_status));
+    fst2 = float32_chs(float32_sub(fst2, float32_one, &env->active_fpu.fp_status));
     update_fcr31(env);
     return fst2;
 }
@@ -3112,8 +3108,8 @@ uint64_t helper_float_recip2_ps(CPUMIPSState *env, uint64_t fdt0, uint64_t fdt2)
 
     fst2 = float32_mul(fst0, fst2, &env->active_fpu.fp_status);
     fsth2 = float32_mul(fsth0, fsth2, &env->active_fpu.fp_status);
-    fst2 = float32_chs(float32_sub(fst2, FLOAT_ONE32, &env->active_fpu.fp_status));
-    fsth2 = float32_chs(float32_sub(fsth2, FLOAT_ONE32, &env->active_fpu.fp_status));
+    fst2 = float32_chs(float32_sub(fst2, float32_one, &env->active_fpu.fp_status));
+    fsth2 = float32_chs(float32_sub(fsth2, float32_one, &env->active_fpu.fp_status));
     update_fcr31(env);
     return ((uint64_t)fsth2 << 32) | fst2;
 }
@@ -3121,7 +3117,7 @@ uint64_t helper_float_recip2_ps(CPUMIPSState *env, uint64_t fdt0, uint64_t fdt2)
 uint64_t helper_float_rsqrt2_d(CPUMIPSState *env, uint64_t fdt0, uint64_t fdt2)
 {
     fdt2 = float64_mul(fdt0, fdt2, &env->active_fpu.fp_status);
-    fdt2 = float64_sub(fdt2, FLOAT_ONE64, &env->active_fpu.fp_status);
+    fdt2 = float64_sub(fdt2, float64_one, &env->active_fpu.fp_status);
     fdt2 = float64_chs(float64_div(fdt2, FLOAT_TWO64, &env->active_fpu.fp_status));
     update_fcr31(env);
     return fdt2;
@@ -3130,7 +3126,7 @@ uint64_t helper_float_rsqrt2_d(CPUMIPSState *env, uint64_t fdt0, uint64_t fdt2)
 uint32_t helper_float_rsqrt2_s(CPUMIPSState *env, uint32_t fst0, uint32_t fst2)
 {
     fst2 = float32_mul(fst0, fst2, &env->active_fpu.fp_status);
-    fst2 = float32_sub(fst2, FLOAT_ONE32, &env->active_fpu.fp_status);
+    fst2 = float32_sub(fst2, float32_one, &env->active_fpu.fp_status);
     fst2 = float32_chs(float32_div(fst2, FLOAT_TWO32, &env->active_fpu.fp_status));
     update_fcr31(env);
     return fst2;
@@ -3145,8 +3141,8 @@ uint64_t helper_float_rsqrt2_ps(CPUMIPSState *env, uint64_t fdt0, uint64_t fdt2)
 
     fst2 = float32_mul(fst0, fst2, &env->active_fpu.fp_status);
     fsth2 = float32_mul(fsth0, fsth2, &env->active_fpu.fp_status);
-    fst2 = float32_sub(fst2, FLOAT_ONE32, &env->active_fpu.fp_status);
-    fsth2 = float32_sub(fsth2, FLOAT_ONE32, &env->active_fpu.fp_status);
+    fst2 = float32_sub(fst2, float32_one, &env->active_fpu.fp_status);
+    fsth2 = float32_sub(fsth2, float32_one, &env->active_fpu.fp_status);
     fst2 = float32_chs(float32_div(fst2, FLOAT_TWO32, &env->active_fpu.fp_status));
     fsth2 = float32_chs(float32_div(fsth2, FLOAT_TWO32, &env->active_fpu.fp_status));
     update_fcr31(env);
-- 
1.7.10.4

* [Qemu-devel] [PATCH v2 09/19] target-mips: restore CPU state after an FPU exception
  2012-10-30  0:11 [Qemu-devel] [PATCH v2 00/19] target-mips: misc fixes and optimizations Aurelien Jarno
                   ` (7 preceding siblings ...)
  2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 08/19] target-mips: use softfloat constants when possible Aurelien Jarno
@ 2012-10-30  0:12 ` Aurelien Jarno
  2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 10/19] target-mips: cleanup load/store operations Aurelien Jarno
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Aurelien Jarno @ 2012-10-30  0:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

Rework the *raise_exception*() functions so that they can be called from
other helpers, passing the return address as an argument.

Use the do_raise_exception() function in update_fcr31() to correctly
restore the CPU state after an FPU exception.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
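Note on the pattern: a toy model, not part of the patch, of the control
flow described above. A raise function records the exception, restores
the guest state only when it is given a non-zero host return address, and
then leaves the execution loop. Everything here is hypothetical and
heavily simplified: DemoCPU, the demo_* functions, the address ranges and
DEMO_EXCP_FPE are made up for illustration, and longjmp() merely stands
in for cpu_loop_exit().

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_EXCP_FPE 15  /* arbitrary number for this sketch */

typedef struct DemoCPU {
    int exception_index;
    int error_code;
    uint32_t guest_pc;     /* guest PC recovered from the host address */
    int state_restored;    /* set when a restore actually happened */
    jmp_buf loop_env;      /* models the main execution loop */
} DemoCPU;

/* Toy replacement for the tb_find_pc() + cpu_restore_state() pair:
 * pretend translated code occupies host addresses [0x1000, 0x2000) and
 * that every 16 host bytes map to one 4-byte guest instruction. */
static int demo_restore_state(DemoCPU *cpu, uintptr_t host_pc)
{
    if (host_pc < 0x1000 || host_pc >= 0x2000) {
        return 0;  /* not inside translated code: nothing to restore */
    }
    cpu->guest_pc = 0x80000000u + (uint32_t)((host_pc - 0x1000) / 16) * 4;
    cpu->state_restored = 1;
    return 1;
}

/* Mirrors the shape of do_raise_exception_err(): a zero host_pc means
 * the caller already saved the state and no restore is attempted. */
static _Noreturn void demo_raise_exception_err(DemoCPU *cpu, int exception,
                                               int error_code,
                                               uintptr_t host_pc)
{
    cpu->exception_index = exception;
    cpu->error_code = error_code;
    if (host_pc) {
        demo_restore_state(cpu, host_pc);
    }
    longjmp(cpu->loop_env, 1);
}

/* Run one failing "helper" and report which exception it raised. */
static int demo_run(DemoCPU *cpu, uintptr_t host_pc)
{
    cpu->exception_index = -1;
    cpu->state_restored = 0;
    if (setjmp(cpu->loop_env) == 0) {
        demo_raise_exception_err(cpu, DEMO_EXCP_FPE, 0, host_pc);
    }
    return cpu->exception_index;
}
```

Calling demo_run() with a host address inside the pretend code range
restores the guest PC before the loop is left; calling it with 0 skips
the restore, which corresponds to the helper_raise_exception*() wrappers
in the patch passing 0 as the pc argument.
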
 target-mips/op_helper.c |  185 ++++++++++++++++++++++++-----------------------
 1 file changed, 95 insertions(+), 90 deletions(-)

diff --git a/target-mips/op_helper.c b/target-mips/op_helper.c
index 2f9ec5d..a7509ca 100644
--- a/target-mips/op_helper.c
+++ b/target-mips/op_helper.c
@@ -33,34 +33,49 @@ static inline void cpu_mips_tlb_flush (CPUMIPSState *env, int flush_global);
 /*****************************************************************************/
 /* Exceptions processing helpers */
 
-void helper_raise_exception_err(CPUMIPSState *env, uint32_t exception,
-                                int error_code)
+static inline void QEMU_NORETURN do_raise_exception_err(CPUMIPSState *env,
+                                                        uint32_t exception,
+                                                        int error_code,
+                                                        uintptr_t pc)
 {
+    TranslationBlock *tb;
 #if 1
     if (exception < 0x100)
         qemu_log("%s: %d %d\n", __func__, exception, error_code);
 #endif
     env->exception_index = exception;
     env->error_code = error_code;
+
+    if (pc) {
+        /* now we have a real cpu fault */
+        tb = tb_find_pc(pc);
+        if (tb) {
+            /* the PC is inside the translated code. It means that we have
+               a virtual CPU fault */
+            cpu_restore_state(tb, env, pc);
+        }
+    }
+
     cpu_loop_exit(env);
 }
 
-void helper_raise_exception(CPUMIPSState *env, uint32_t exception)
+static inline void QEMU_NORETURN do_raise_exception(CPUMIPSState *env,
+                                                    uint32_t exception,
+                                                    uintptr_t pc)
 {
-    helper_raise_exception_err(env, exception, 0);
+    do_raise_exception_err(env, exception, 0, pc);
 }
 
-#if !defined(CONFIG_USER_ONLY)
-static void do_restore_state(CPUMIPSState *env, uintptr_t pc)
+void helper_raise_exception_err(CPUMIPSState *env, uint32_t exception,
+                                int error_code)
 {
-    TranslationBlock *tb;
+    do_raise_exception_err(env, exception, error_code, 0);
+}
 
-    tb = tb_find_pc (pc);
-    if (tb) {
-        cpu_restore_state(tb, env, pc);
-    }
+void helper_raise_exception(CPUMIPSState *env, uint32_t exception)
+{
+    do_raise_exception(env, exception, 0);
 }
-#endif
 
 #if defined(CONFIG_USER_ONLY)
 #define HELPER_LD(name, insn, type)                                     \
@@ -2295,28 +2310,18 @@ static void do_unaligned_access(CPUMIPSState *env, target_ulong addr,
                                 int is_write, int is_user, uintptr_t retaddr)
 {
     env->CP0_BadVAddr = addr;
-    do_restore_state(env, retaddr);
-    helper_raise_exception(env, (is_write == 1) ? EXCP_AdES : EXCP_AdEL);
+    do_raise_exception(env, (is_write == 1) ? EXCP_AdES : EXCP_AdEL, retaddr);
 }
 
 void tlb_fill(CPUMIPSState *env, target_ulong addr, int is_write, int mmu_idx,
               uintptr_t retaddr)
 {
-    TranslationBlock *tb;
     int ret;
 
     ret = cpu_mips_handle_mmu_fault(env, addr, is_write, mmu_idx);
     if (ret) {
-        if (retaddr) {
-            /* now we have a real cpu fault */
-            tb = tb_find_pc(retaddr);
-            if (tb) {
-                /* the PC is inside the translated code. It means that we have
-                   a virtual CPU fault */
-                cpu_restore_state(tb, env, retaddr);
-            }
-        }
-        helper_raise_exception_err(env, env->exception_index, env->error_code);
+        do_raise_exception_err(env, env->exception_index,
+                               env->error_code, retaddr);
     }
 }
 
@@ -2410,7 +2415,7 @@ void helper_ctc1(CPUMIPSState *env, target_ulong arg1, uint32_t reg)
     RESTORE_FLUSH_MODE;
     set_float_exception_flags(0, &env->active_fpu.fp_status);
     if ((GET_FP_ENABLE(env->active_fpu.fcr31) | 0x20) & GET_FP_CAUSE(env->active_fpu.fcr31))
-        helper_raise_exception(env, EXCP_FPE);
+        do_raise_exception(env, EXCP_FPE, GETPC());
 }
 
 static inline int ieee_ex_to_mips(int xcpt)
@@ -2436,7 +2441,7 @@ static inline int ieee_ex_to_mips(int xcpt)
     return ret;
 }
 
-static inline void update_fcr31(CPUMIPSState *env)
+static inline void update_fcr31(CPUMIPSState *env, uintptr_t pc)
 {
     int tmp = ieee_ex_to_mips(get_float_exception_flags(&env->active_fpu.fp_status));
 
@@ -2446,7 +2451,7 @@ static inline void update_fcr31(CPUMIPSState *env)
         set_float_exception_flags(0, &env->active_fpu.fp_status);
 
         if (GET_FP_ENABLE(env->active_fpu.fcr31) & tmp) {
-            helper_raise_exception(env, EXCP_FPE);
+            do_raise_exception(env, EXCP_FPE, pc);
         } else {
             UPDATE_FP_FLAGS(env->active_fpu.fcr31, tmp);
         }
@@ -2462,14 +2467,14 @@ static inline void update_fcr31(CPUMIPSState *env)
 uint64_t helper_float_sqrt_d(CPUMIPSState *env, uint64_t fdt0)
 {
     fdt0 = float64_sqrt(fdt0, &env->active_fpu.fp_status);
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return fdt0;
 }
 
 uint32_t helper_float_sqrt_s(CPUMIPSState *env, uint32_t fst0)
 {
     fst0 = float32_sqrt(fst0, &env->active_fpu.fp_status);
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return fst0;
 }
 
@@ -2478,7 +2483,7 @@ uint64_t helper_float_cvtd_s(CPUMIPSState *env, uint32_t fst0)
     uint64_t fdt2;
 
     fdt2 = float32_to_float64(fst0, &env->active_fpu.fp_status);
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return fdt2;
 }
 
@@ -2487,7 +2492,7 @@ uint64_t helper_float_cvtd_w(CPUMIPSState *env, uint32_t wt0)
     uint64_t fdt2;
 
     fdt2 = int32_to_float64(wt0, &env->active_fpu.fp_status);
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return fdt2;
 }
 
@@ -2496,7 +2501,7 @@ uint64_t helper_float_cvtd_l(CPUMIPSState *env, uint64_t dt0)
     uint64_t fdt2;
 
     fdt2 = int64_to_float64(dt0, &env->active_fpu.fp_status);
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return fdt2;
 }
 
@@ -2509,7 +2514,7 @@ uint64_t helper_float_cvtl_d(CPUMIPSState *env, uint64_t fdt0)
         & (float_flag_invalid | float_flag_overflow)) {
         dt2 = FP_TO_INT64_OVERFLOW;
     }
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return dt2;
 }
 
@@ -2522,7 +2527,7 @@ uint64_t helper_float_cvtl_s(CPUMIPSState *env, uint32_t fst0)
         & (float_flag_invalid | float_flag_overflow)) {
         dt2 = FP_TO_INT64_OVERFLOW;
     }
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return dt2;
 }
 
@@ -2533,7 +2538,7 @@ uint64_t helper_float_cvtps_pw(CPUMIPSState *env, uint64_t dt0)
 
     fst2 = int32_to_float32(dt0 & 0XFFFFFFFF, &env->active_fpu.fp_status);
     fsth2 = int32_to_float32(dt0 >> 32, &env->active_fpu.fp_status);
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return ((uint64_t)fsth2 << 32) | fst2;
 }
 
@@ -2557,7 +2562,7 @@ uint64_t helper_float_cvtpw_ps(CPUMIPSState *env, uint64_t fdt0)
     }
 
     set_float_exception_flags(excp | excph, &env->active_fpu.fp_status);
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
 
     return ((uint64_t)wth2 << 32) | wt2;
 }
@@ -2567,7 +2572,7 @@ uint32_t helper_float_cvts_d(CPUMIPSState *env, uint64_t fdt0)
     uint32_t fst2;
 
     fst2 = float64_to_float32(fdt0, &env->active_fpu.fp_status);
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return fst2;
 }
 
@@ -2576,7 +2581,7 @@ uint32_t helper_float_cvts_w(CPUMIPSState *env, uint32_t wt0)
     uint32_t fst2;
 
     fst2 = int32_to_float32(wt0, &env->active_fpu.fp_status);
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return fst2;
 }
 
@@ -2585,7 +2590,7 @@ uint32_t helper_float_cvts_l(CPUMIPSState *env, uint64_t dt0)
     uint32_t fst2;
 
     fst2 = int64_to_float32(dt0, &env->active_fpu.fp_status);
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return fst2;
 }
 
@@ -2594,7 +2599,7 @@ uint32_t helper_float_cvts_pl(CPUMIPSState *env, uint32_t wt0)
     uint32_t wt2;
 
     wt2 = wt0;
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return wt2;
 }
 
@@ -2603,7 +2608,7 @@ uint32_t helper_float_cvts_pu(CPUMIPSState *env, uint32_t wth0)
     uint32_t wt2;
 
     wt2 = wth0;
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return wt2;
 }
 
@@ -2612,7 +2617,7 @@ uint32_t helper_float_cvtw_s(CPUMIPSState *env, uint32_t fst0)
     uint32_t wt2;
 
     wt2 = float32_to_int32(fst0, &env->active_fpu.fp_status);
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     if (get_float_exception_flags(&env->active_fpu.fp_status)
         & (float_flag_invalid | float_flag_overflow)) {
         wt2 = FP_TO_INT32_OVERFLOW;
@@ -2629,7 +2634,7 @@ uint32_t helper_float_cvtw_d(CPUMIPSState *env, uint64_t fdt0)
         & (float_flag_invalid | float_flag_overflow)) {
         wt2 = FP_TO_INT32_OVERFLOW;
     }
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return wt2;
 }
 
@@ -2644,7 +2649,7 @@ uint64_t helper_float_roundl_d(CPUMIPSState *env, uint64_t fdt0)
         & (float_flag_invalid | float_flag_overflow)) {
         dt2 = FP_TO_INT64_OVERFLOW;
     }
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return dt2;
 }
 
@@ -2659,7 +2664,7 @@ uint64_t helper_float_roundl_s(CPUMIPSState *env, uint32_t fst0)
         & (float_flag_invalid | float_flag_overflow)) {
         dt2 = FP_TO_INT64_OVERFLOW;
     }
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return dt2;
 }
 
@@ -2674,7 +2679,7 @@ uint32_t helper_float_roundw_d(CPUMIPSState *env, uint64_t fdt0)
         & (float_flag_invalid | float_flag_overflow)) {
         wt2 = FP_TO_INT32_OVERFLOW;
     }
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return wt2;
 }
 
@@ -2689,7 +2694,7 @@ uint32_t helper_float_roundw_s(CPUMIPSState *env, uint32_t fst0)
         & (float_flag_invalid | float_flag_overflow)) {
         wt2 = FP_TO_INT32_OVERFLOW;
     }
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return wt2;
 }
 
@@ -2702,7 +2707,7 @@ uint64_t helper_float_truncl_d(CPUMIPSState *env, uint64_t fdt0)
         & (float_flag_invalid | float_flag_overflow)) {
         dt2 = FP_TO_INT64_OVERFLOW;
     }
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return dt2;
 }
 
@@ -2715,7 +2720,7 @@ uint64_t helper_float_truncl_s(CPUMIPSState *env, uint32_t fst0)
         & (float_flag_invalid | float_flag_overflow)) {
         dt2 = FP_TO_INT64_OVERFLOW;
     }
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return dt2;
 }
 
@@ -2728,7 +2733,7 @@ uint32_t helper_float_truncw_d(CPUMIPSState *env, uint64_t fdt0)
         & (float_flag_invalid | float_flag_overflow)) {
         wt2 = FP_TO_INT32_OVERFLOW;
     }
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return wt2;
 }
 
@@ -2741,7 +2746,7 @@ uint32_t helper_float_truncw_s(CPUMIPSState *env, uint32_t fst0)
         & (float_flag_invalid | float_flag_overflow)) {
         wt2 = FP_TO_INT32_OVERFLOW;
     }
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return wt2;
 }
 
@@ -2756,7 +2761,7 @@ uint64_t helper_float_ceill_d(CPUMIPSState *env, uint64_t fdt0)
         & (float_flag_invalid | float_flag_overflow)) {
         dt2 = FP_TO_INT64_OVERFLOW;
     }
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return dt2;
 }
 
@@ -2771,7 +2776,7 @@ uint64_t helper_float_ceill_s(CPUMIPSState *env, uint32_t fst0)
         & (float_flag_invalid | float_flag_overflow)) {
         dt2 = FP_TO_INT64_OVERFLOW;
     }
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return dt2;
 }
 
@@ -2786,7 +2791,7 @@ uint32_t helper_float_ceilw_d(CPUMIPSState *env, uint64_t fdt0)
         & (float_flag_invalid | float_flag_overflow)) {
         wt2 = FP_TO_INT32_OVERFLOW;
     }
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return wt2;
 }
 
@@ -2801,7 +2806,7 @@ uint32_t helper_float_ceilw_s(CPUMIPSState *env, uint32_t fst0)
         & (float_flag_invalid | float_flag_overflow)) {
         wt2 = FP_TO_INT32_OVERFLOW;
     }
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return wt2;
 }
 
@@ -2816,7 +2821,7 @@ uint64_t helper_float_floorl_d(CPUMIPSState *env, uint64_t fdt0)
         & (float_flag_invalid | float_flag_overflow)) {
         dt2 = FP_TO_INT64_OVERFLOW;
     }
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return dt2;
 }
 
@@ -2831,7 +2836,7 @@ uint64_t helper_float_floorl_s(CPUMIPSState *env, uint32_t fst0)
         & (float_flag_invalid | float_flag_overflow)) {
         dt2 = FP_TO_INT64_OVERFLOW;
     }
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return dt2;
 }
 
@@ -2846,7 +2851,7 @@ uint32_t helper_float_floorw_d(CPUMIPSState *env, uint64_t fdt0)
         & (float_flag_invalid | float_flag_overflow)) {
         wt2 = FP_TO_INT32_OVERFLOW;
     }
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return wt2;
 }
 
@@ -2861,7 +2866,7 @@ uint32_t helper_float_floorw_s(CPUMIPSState *env, uint32_t fst0)
         & (float_flag_invalid | float_flag_overflow)) {
         wt2 = FP_TO_INT32_OVERFLOW;
     }
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return wt2;
 }
 
@@ -2894,7 +2899,7 @@ uint64_t helper_float_recip_d(CPUMIPSState *env, uint64_t fdt0)
     uint64_t fdt2;
 
     fdt2 = float64_div(float64_one, fdt0, &env->active_fpu.fp_status);
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return fdt2;
 }
 
@@ -2903,7 +2908,7 @@ uint32_t helper_float_recip_s(CPUMIPSState *env, uint32_t fst0)
     uint32_t fst2;
 
     fst2 = float32_div(float32_one, fst0, &env->active_fpu.fp_status);
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return fst2;
 }
 
@@ -2913,7 +2918,7 @@ uint64_t helper_float_rsqrt_d(CPUMIPSState *env, uint64_t fdt0)
 
     fdt2 = float64_sqrt(fdt0, &env->active_fpu.fp_status);
     fdt2 = float64_div(float64_one, fdt2, &env->active_fpu.fp_status);
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return fdt2;
 }
 
@@ -2923,7 +2928,7 @@ uint32_t helper_float_rsqrt_s(CPUMIPSState *env, uint32_t fst0)
 
     fst2 = float32_sqrt(fst0, &env->active_fpu.fp_status);
     fst2 = float32_div(float32_one, fst2, &env->active_fpu.fp_status);
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return fst2;
 }
 
@@ -2932,7 +2937,7 @@ uint64_t helper_float_recip1_d(CPUMIPSState *env, uint64_t fdt0)
     uint64_t fdt2;
 
     fdt2 = float64_div(float64_one, fdt0, &env->active_fpu.fp_status);
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return fdt2;
 }
 
@@ -2941,7 +2946,7 @@ uint32_t helper_float_recip1_s(CPUMIPSState *env, uint32_t fst0)
     uint32_t fst2;
 
     fst2 = float32_div(float32_one, fst0, &env->active_fpu.fp_status);
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return fst2;
 }
 
@@ -2952,7 +2957,7 @@ uint64_t helper_float_recip1_ps(CPUMIPSState *env, uint64_t fdt0)
 
     fst2 = float32_div(float32_one, fdt0 & 0XFFFFFFFF, &env->active_fpu.fp_status);
     fsth2 = float32_div(float32_one, fdt0 >> 32, &env->active_fpu.fp_status);
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return ((uint64_t)fsth2 << 32) | fst2;
 }
 
@@ -2962,7 +2967,7 @@ uint64_t helper_float_rsqrt1_d(CPUMIPSState *env, uint64_t fdt0)
 
     fdt2 = float64_sqrt(fdt0, &env->active_fpu.fp_status);
     fdt2 = float64_div(float64_one, fdt2, &env->active_fpu.fp_status);
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return fdt2;
 }
 
@@ -2972,7 +2977,7 @@ uint32_t helper_float_rsqrt1_s(CPUMIPSState *env, uint32_t fst0)
 
     fst2 = float32_sqrt(fst0, &env->active_fpu.fp_status);
     fst2 = float32_div(float32_one, fst2, &env->active_fpu.fp_status);
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return fst2;
 }
 
@@ -2985,7 +2990,7 @@ uint64_t helper_float_rsqrt1_ps(CPUMIPSState *env, uint64_t fdt0)
     fsth2 = float32_sqrt(fdt0 >> 32, &env->active_fpu.fp_status);
     fst2 = float32_div(float32_one, fst2, &env->active_fpu.fp_status);
     fsth2 = float32_div(float32_one, fsth2, &env->active_fpu.fp_status);
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return ((uint64_t)fsth2 << 32) | fst2;
 }
 
@@ -2999,7 +3004,7 @@ uint64_t helper_float_ ## name ## _d(CPUMIPSState *env,            \
     uint64_t dt2;                                                  \
                                                                    \
     dt2 = float64_ ## name (fdt0, fdt1, &env->active_fpu.fp_status);     \
-    update_fcr31(env);                                             \
+    update_fcr31(env, GETPC());                                    \
     return dt2;                                                    \
 }                                                                  \
                                                                    \
@@ -3009,7 +3014,7 @@ uint32_t helper_float_ ## name ## _s(CPUMIPSState *env,            \
     uint32_t wt2;                                                  \
                                                                    \
     wt2 = float32_ ## name (fst0, fst1, &env->active_fpu.fp_status);     \
-    update_fcr31(env);                                             \
+    update_fcr31(env, GETPC());                                    \
     return wt2;                                                    \
 }                                                                  \
                                                                    \
@@ -3026,7 +3031,7 @@ uint64_t helper_float_ ## name ## _ps(CPUMIPSState *env,           \
                                                                    \
     wt2 = float32_ ## name (fst0, fst1, &env->active_fpu.fp_status);     \
     wth2 = float32_ ## name (fsth0, fsth1, &env->active_fpu.fp_status);  \
-    update_fcr31(env);                                             \
+    update_fcr31(env, GETPC());                                    \
     return ((uint64_t)wth2 << 32) | wt2;                           \
 }
 
@@ -3044,7 +3049,7 @@ uint64_t helper_float_ ## name ## _d(CPUMIPSState *env,              \
 {                                                                    \
     fdt0 = float64_muladd(fdt0, fdt1, fdt2, type,                    \
                          &env->active_fpu.fp_status);                \
-    update_fcr31(env);                                               \
+    update_fcr31(env, GETPC());                                      \
     return fdt0;                                                     \
 }                                                                    \
                                                                      \
@@ -3054,7 +3059,7 @@ uint32_t helper_float_ ## name ## _s(CPUMIPSState *env,              \
 {                                                                    \
     fst0 = float32_muladd(fst0, fst1, fst2, type,                    \
                          &env->active_fpu.fp_status);                \
-    update_fcr31(env);                                               \
+    update_fcr31(env, GETPC());                                      \
     return fst0;                                                     \
 }                                                                    \
                                                                      \
@@ -3073,7 +3078,7 @@ uint64_t helper_float_ ## name ## _ps(CPUMIPSState *env,             \
                           &env->active_fpu.fp_status);               \
     fsth0 = float32_muladd(fsth0, fsth1, fsth2, type,                \
                            &env->active_fpu.fp_status);              \
-    update_fcr31(env);                                               \
+    update_fcr31(env, GETPC());                                      \
     return ((uint64_t)fsth0 << 32) | fst0;                           \
 }
 FLOAT_FMA(madd, 0)
@@ -3087,7 +3092,7 @@ uint64_t helper_float_recip2_d(CPUMIPSState *env, uint64_t fdt0, uint64_t fdt2)
 {
     fdt2 = float64_mul(fdt0, fdt2, &env->active_fpu.fp_status);
     fdt2 = float64_chs(float64_sub(fdt2, float64_one, &env->active_fpu.fp_status));
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return fdt2;
 }
 
@@ -3095,7 +3100,7 @@ uint32_t helper_float_recip2_s(CPUMIPSState *env, uint32_t fst0, uint32_t fst2)
 {
     fst2 = float32_mul(fst0, fst2, &env->active_fpu.fp_status);
     fst2 = float32_chs(float32_sub(fst2, float32_one, &env->active_fpu.fp_status));
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return fst2;
 }
 
@@ -3110,7 +3115,7 @@ uint64_t helper_float_recip2_ps(CPUMIPSState *env, uint64_t fdt0, uint64_t fdt2)
     fsth2 = float32_mul(fsth0, fsth2, &env->active_fpu.fp_status);
     fst2 = float32_chs(float32_sub(fst2, float32_one, &env->active_fpu.fp_status));
     fsth2 = float32_chs(float32_sub(fsth2, float32_one, &env->active_fpu.fp_status));
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return ((uint64_t)fsth2 << 32) | fst2;
 }
 
@@ -3119,7 +3124,7 @@ uint64_t helper_float_rsqrt2_d(CPUMIPSState *env, uint64_t fdt0, uint64_t fdt2)
     fdt2 = float64_mul(fdt0, fdt2, &env->active_fpu.fp_status);
     fdt2 = float64_sub(fdt2, float64_one, &env->active_fpu.fp_status);
     fdt2 = float64_chs(float64_div(fdt2, FLOAT_TWO64, &env->active_fpu.fp_status));
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return fdt2;
 }
 
@@ -3128,7 +3133,7 @@ uint32_t helper_float_rsqrt2_s(CPUMIPSState *env, uint32_t fst0, uint32_t fst2)
     fst2 = float32_mul(fst0, fst2, &env->active_fpu.fp_status);
     fst2 = float32_sub(fst2, float32_one, &env->active_fpu.fp_status);
     fst2 = float32_chs(float32_div(fst2, FLOAT_TWO32, &env->active_fpu.fp_status));
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return fst2;
 }
 
@@ -3145,7 +3150,7 @@ uint64_t helper_float_rsqrt2_ps(CPUMIPSState *env, uint64_t fdt0, uint64_t fdt2)
     fsth2 = float32_sub(fsth2, float32_one, &env->active_fpu.fp_status);
     fst2 = float32_chs(float32_div(fst2, FLOAT_TWO32, &env->active_fpu.fp_status));
     fsth2 = float32_chs(float32_div(fsth2, FLOAT_TWO32, &env->active_fpu.fp_status));
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return ((uint64_t)fsth2 << 32) | fst2;
 }
 
@@ -3160,7 +3165,7 @@ uint64_t helper_float_addr_ps(CPUMIPSState *env, uint64_t fdt0, uint64_t fdt1)
 
     fst2 = float32_add (fst0, fsth0, &env->active_fpu.fp_status);
     fsth2 = float32_add (fst1, fsth1, &env->active_fpu.fp_status);
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return ((uint64_t)fsth2 << 32) | fst2;
 }
 
@@ -3175,7 +3180,7 @@ uint64_t helper_float_mulr_ps(CPUMIPSState *env, uint64_t fdt0, uint64_t fdt1)
 
     fst2 = float32_mul (fst0, fsth0, &env->active_fpu.fp_status);
     fsth2 = float32_mul (fst1, fsth1, &env->active_fpu.fp_status);
-    update_fcr31(env);
+    update_fcr31(env, GETPC());
     return ((uint64_t)fsth2 << 32) | fst2;
 }
 
@@ -3186,7 +3191,7 @@ void helper_cmp_d_ ## op(CPUMIPSState *env, uint64_t fdt0,     \
 {                                                              \
     int c;                                                     \
     c = cond;                                                  \
-    update_fcr31(env);                                         \
+    update_fcr31(env, GETPC());                                \
     if (c)                                                     \
         SET_FP_COND(cc, env->active_fpu);                      \
     else                                                       \
@@ -3199,7 +3204,7 @@ void helper_cmpabs_d_ ## op(CPUMIPSState *env, uint64_t fdt0,  \
     fdt0 = float64_abs(fdt0);                                  \
     fdt1 = float64_abs(fdt1);                                  \
     c = cond;                                                  \
-    update_fcr31(env);                                         \
+    update_fcr31(env, GETPC());                                \
     if (c)                                                     \
         SET_FP_COND(cc, env->active_fpu);                      \
     else                                                       \
@@ -3233,7 +3238,7 @@ void helper_cmp_s_ ## op(CPUMIPSState *env, uint32_t fst0,     \
 {                                                              \
     int c;                                                     \
     c = cond;                                                  \
-    update_fcr31(env);                                         \
+    update_fcr31(env, GETPC());                                \
     if (c)                                                     \
         SET_FP_COND(cc, env->active_fpu);                      \
     else                                                       \
@@ -3246,7 +3251,7 @@ void helper_cmpabs_s_ ## op(CPUMIPSState *env, uint32_t fst0,  \
     fst0 = float32_abs(fst0);                                  \
     fst1 = float32_abs(fst1);                                  \
     c = cond;                                                  \
-    update_fcr31(env);                                         \
+    update_fcr31(env, GETPC());                                \
     if (c)                                                     \
         SET_FP_COND(cc, env->active_fpu);                      \
     else                                                       \
@@ -3286,7 +3291,7 @@ void helper_cmp_ps_ ## op(CPUMIPSState *env, uint64_t fdt0,     \
     fsth1 = fdt1 >> 32;                                         \
     cl = condl;                                                 \
     ch = condh;                                                 \
-    update_fcr31(env);                                          \
+    update_fcr31(env, GETPC());                                 \
     if (cl)                                                     \
         SET_FP_COND(cc, env->active_fpu);                       \
     else                                                        \
@@ -3307,7 +3312,7 @@ void helper_cmpabs_ps_ ## op(CPUMIPSState *env, uint64_t fdt0,  \
     fsth1 = float32_abs(fdt1 >> 32);                            \
     cl = condl;                                                 \
     ch = condh;                                                 \
-    update_fcr31(env);                                          \
+    update_fcr31(env, GETPC());                                 \
     if (cl)                                                     \
         SET_FP_COND(cc, env->active_fpu);                       \
     else                                                        \
-- 
1.7.10.4
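
A note for readers of the diff above: every helper now passes the value of GETPC() down to update_fcr31(), which hands it on to do_raise_exception() when an enabled FPU exception is pending. The following is only a minimal sketch of that control flow. All names prefixed with toy_ and the ToyEnv type are hypothetical stand-ins rather than QEMU definitions, the host return address is passed in as a plain argument so the behaviour can be checked, and the toy raise path returns to its caller instead of leaving the helper.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-ins for illustration only; none of these are QEMU definitions. */
typedef struct ToyEnv {
    unsigned pending;   /* flags accumulated by the last FP operation */
    unsigned enabled;   /* mask of exceptions that should trap */
    unsigned sticky;    /* flag bits recorded when no trap is taken */
    int raised;         /* set when the toy exception is raised */
    uintptr_t raise_pc; /* host return address handed to the raise path */
} ToyEnv;

/* The toy raise path only records its input and returns. */
static void toy_raise_exception(ToyEnv *env, uintptr_t pc)
{
    env->raised = 1;
    env->raise_pc = pc;
}

/* Same shape as update_fcr31(env, pc) in the patch. */
static inline void toy_update_fcr31(ToyEnv *env, uintptr_t pc)
{
    unsigned tmp = env->pending;

    if (tmp) {
        env->pending = 0;                 /* clear flags for the next insn */
        if (env->enabled & tmp) {
            toy_raise_exception(env, pc); /* trap: forward the caller's pc */
        } else {
            env->sticky |= tmp;           /* no trap: only record the flags */
        }
    }
}

/* A helper obtains its return address (caller_pc here) and passes it down. */
static uint32_t toy_helper(ToyEnv *env, uint32_t value, unsigned flags,
                           uintptr_t caller_pc)
{
    env->pending |= flags;   /* pretend the FP operation set these flags */
    toy_update_fcr31(env, caller_pc);
    return value;
}
```

With an enabled flag pending, the address given to toy_helper() is the one that reaches toy_raise_exception(); with the flag disabled, only the sticky bits change.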

* [Qemu-devel] [PATCH v2 10/19] target-mips: cleanup load/store operations
  2012-10-30  0:11 [Qemu-devel] [PATCH v2 00/19] target-mips: misc fixes and optimizations Aurelien Jarno
                   ` (8 preceding siblings ...)
  2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 09/19] target-mips: restore CPU state after an FPU exception Aurelien Jarno
@ 2012-10-30  0:12 ` Aurelien Jarno
  2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 11/19] target-mips: optimize load operations Aurelien Jarno
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Aurelien Jarno @ 2012-10-30  0:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

Load/store operations use wrapper macros for historical reasons. There is
no longer any reason to keep them, so replace them with direct calls to the
tcg_gen_qemu_ld/st functions.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-mips/translate.c |   91 ++++++++++++++++-------------------------------
 1 file changed, 31 insertions(+), 60 deletions(-)

diff --git a/target-mips/translate.c b/target-mips/translate.c
index 732c65d..4485a81 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -1029,35 +1029,6 @@ FOP_CONDS(abs, 1, ps, FMT_PS, 64)
 #undef gen_ldcmp_fpr64
 
 /* load/store instructions. */
-#define OP_LD(insn,fname)                                                 \
-static inline void op_ld_##insn(TCGv ret, TCGv arg1, DisasContext *ctx)   \
-{                                                                         \
-    tcg_gen_qemu_##fname(ret, arg1, ctx->mem_idx);                        \
-}
-OP_LD(lb,ld8s);
-OP_LD(lbu,ld8u);
-OP_LD(lh,ld16s);
-OP_LD(lhu,ld16u);
-OP_LD(lw,ld32s);
-#if defined(TARGET_MIPS64)
-OP_LD(lwu,ld32u);
-OP_LD(ld,ld64);
-#endif
-#undef OP_LD
-
-#define OP_ST(insn,fname)                                                  \
-static inline void op_st_##insn(TCGv arg1, TCGv arg2, DisasContext *ctx)   \
-{                                                                          \
-    tcg_gen_qemu_##fname(arg1, arg2, ctx->mem_idx);                        \
-}
-OP_ST(sb,st8);
-OP_ST(sh,st16);
-OP_ST(sw,st32);
-#if defined(TARGET_MIPS64)
-OP_ST(sd,st64);
-#endif
-#undef OP_ST
-
 #ifdef CONFIG_USER_ONLY
 #define OP_LD_ATOMIC(insn,fname)                                           \
 static inline void op_ld_##insn(TCGv ret, TCGv arg1, DisasContext *ctx)    \
@@ -1171,12 +1142,12 @@ static void gen_ld (CPUMIPSState *env, DisasContext *ctx, uint32_t opc,
     switch (opc) {
 #if defined(TARGET_MIPS64)
     case OPC_LWU:
-        op_ld_lwu(t0, t0, ctx);
+        tcg_gen_qemu_ld32u(t0, t0, ctx->mem_idx);
         gen_store_gpr(t0, rt);
         opn = "lwu";
         break;
     case OPC_LD:
-        op_ld_ld(t0, t0, ctx);
+        tcg_gen_qemu_ld64(t0, t0, ctx->mem_idx);
         gen_store_gpr(t0, rt);
         opn = "ld";
         break;
@@ -1203,7 +1174,7 @@ static void gen_ld (CPUMIPSState *env, DisasContext *ctx, uint32_t opc,
     case OPC_LDPC:
         tcg_gen_movi_tl(t1, pc_relative_pc(ctx));
         gen_op_addr_add(ctx, t0, t0, t1);
-        op_ld_ld(t0, t0, ctx);
+        tcg_gen_qemu_ld64(t0, t0, ctx->mem_idx);
         gen_store_gpr(t0, rt);
         opn = "ldpc";
         break;
@@ -1211,32 +1182,32 @@ static void gen_ld (CPUMIPSState *env, DisasContext *ctx, uint32_t opc,
     case OPC_LWPC:
         tcg_gen_movi_tl(t1, pc_relative_pc(ctx));
         gen_op_addr_add(ctx, t0, t0, t1);
-        op_ld_lw(t0, t0, ctx);
+        tcg_gen_qemu_ld32s(t0, t0, ctx->mem_idx);
         gen_store_gpr(t0, rt);
         opn = "lwpc";
         break;
     case OPC_LW:
-        op_ld_lw(t0, t0, ctx);
+        tcg_gen_qemu_ld32s(t0, t0, ctx->mem_idx);
         gen_store_gpr(t0, rt);
         opn = "lw";
         break;
     case OPC_LH:
-        op_ld_lh(t0, t0, ctx);
+        tcg_gen_qemu_ld16s(t0, t0, ctx->mem_idx);
         gen_store_gpr(t0, rt);
         opn = "lh";
         break;
     case OPC_LHU:
-        op_ld_lhu(t0, t0, ctx);
+        tcg_gen_qemu_ld16u(t0, t0, ctx->mem_idx);
         gen_store_gpr(t0, rt);
         opn = "lhu";
         break;
     case OPC_LB:
-        op_ld_lb(t0, t0, ctx);
+        tcg_gen_qemu_ld8s(t0, t0, ctx->mem_idx);
         gen_store_gpr(t0, rt);
         opn = "lb";
         break;
     case OPC_LBU:
-        op_ld_lbu(t0, t0, ctx);
+        tcg_gen_qemu_ld8u(t0, t0, ctx->mem_idx);
         gen_store_gpr(t0, rt);
         opn = "lbu";
         break;
@@ -1280,7 +1251,7 @@ static void gen_st (DisasContext *ctx, uint32_t opc, int rt,
     switch (opc) {
 #if defined(TARGET_MIPS64)
     case OPC_SD:
-        op_st_sd(t1, t0, ctx);
+        tcg_gen_qemu_st64(t1, t0, ctx->mem_idx);
         opn = "sd";
         break;
     case OPC_SDL:
@@ -1295,15 +1266,15 @@ static void gen_st (DisasContext *ctx, uint32_t opc, int rt,
         break;
 #endif
     case OPC_SW:
-        op_st_sw(t1, t0, ctx);
+        tcg_gen_qemu_st32(t1, t0, ctx->mem_idx);
         opn = "sw";
         break;
     case OPC_SH:
-        op_st_sh(t1, t0, ctx);
+        tcg_gen_qemu_st16(t1, t0, ctx->mem_idx);
         opn = "sh";
         break;
     case OPC_SB:
-        op_st_sb(t1, t0, ctx);
+        tcg_gen_qemu_st8(t1, t0, ctx->mem_idx);
         opn = "sb";
         break;
     case OPC_SWL:
@@ -8778,22 +8749,22 @@ static void gen_mips16_save (DisasContext *ctx,
     case 4:
         gen_base_offset_addr(ctx, t0, 29, 12);
         gen_load_gpr(t1, 7);
-        op_st_sw(t1, t0, ctx);
+        tcg_gen_qemu_st32(t1, t0, ctx->mem_idx);
         /* Fall through */
     case 3:
         gen_base_offset_addr(ctx, t0, 29, 8);
         gen_load_gpr(t1, 6);
-        op_st_sw(t1, t0, ctx);
+        tcg_gen_qemu_st32(t1, t0, ctx->mem_idx);
         /* Fall through */
     case 2:
         gen_base_offset_addr(ctx, t0, 29, 4);
         gen_load_gpr(t1, 5);
-        op_st_sw(t1, t0, ctx);
+        tcg_gen_qemu_st32(t1, t0, ctx->mem_idx);
         /* Fall through */
     case 1:
         gen_base_offset_addr(ctx, t0, 29, 0);
         gen_load_gpr(t1, 4);
-        op_st_sw(t1, t0, ctx);
+        tcg_gen_qemu_st32(t1, t0, ctx->mem_idx);
     }
 
     gen_load_gpr(t0, 29);
@@ -8801,7 +8772,7 @@ static void gen_mips16_save (DisasContext *ctx,
 #define DECR_AND_STORE(reg) do {                \
         tcg_gen_subi_tl(t0, t0, 4);             \
         gen_load_gpr(t1, reg);                  \
-        op_st_sw(t1, t0, ctx);                  \
+        tcg_gen_qemu_st32(t1, t0, ctx->mem_idx); \
     } while (0)
 
     if (do_ra) {
@@ -8899,10 +8870,10 @@ static void gen_mips16_restore (DisasContext *ctx,
 
     tcg_gen_addi_tl(t0, cpu_gpr[29], framesize);
 
-#define DECR_AND_LOAD(reg) do {                 \
-        tcg_gen_subi_tl(t0, t0, 4);             \
-        op_ld_lw(t1, t0, ctx);                  \
-        gen_store_gpr(t1, reg);                 \
+#define DECR_AND_LOAD(reg) do {                   \
+        tcg_gen_subi_tl(t0, t0, 4);               \
+        tcg_gen_qemu_ld32s(t1, t0, ctx->mem_idx); \
+        gen_store_gpr(t1, reg);                   \
     } while (0)
 
     if (do_ra) {
@@ -10408,7 +10379,7 @@ static void gen_ldxs (DisasContext *ctx, int base, int index, int rd)
         gen_op_addr_add(ctx, t0, t1, t0);
     }
 
-    op_ld_lw(t1, t0, ctx);
+    tcg_gen_qemu_ld32s(t1, t0, ctx->mem_idx);
     gen_store_gpr(t1, rd);
 
     tcg_temp_free(t0);
@@ -10437,21 +10408,21 @@ static void gen_ldst_pair (DisasContext *ctx, uint32_t opc, int rd,
             generate_exception(ctx, EXCP_RI);
             return;
         }
-        op_ld_lw(t1, t0, ctx);
+        tcg_gen_qemu_ld32s(t1, t0, ctx->mem_idx);
         gen_store_gpr(t1, rd);
         tcg_gen_movi_tl(t1, 4);
         gen_op_addr_add(ctx, t0, t0, t1);
-        op_ld_lw(t1, t0, ctx);
+        tcg_gen_qemu_ld32s(t1, t0, ctx->mem_idx);
         gen_store_gpr(t1, rd+1);
         opn = "lwp";
         break;
     case SWP:
         gen_load_gpr(t1, rd);
-        op_st_sw(t1, t0, ctx);
+        tcg_gen_qemu_st32(t1, t0, ctx->mem_idx);
         tcg_gen_movi_tl(t1, 4);
         gen_op_addr_add(ctx, t0, t0, t1);
         gen_load_gpr(t1, rd+1);
-        op_st_sw(t1, t0, ctx);
+        tcg_gen_qemu_st32(t1, t0, ctx->mem_idx);
         opn = "swp";
         break;
 #ifdef TARGET_MIPS64
@@ -10460,21 +10431,21 @@ static void gen_ldst_pair (DisasContext *ctx, uint32_t opc, int rd,
             generate_exception(ctx, EXCP_RI);
             return;
         }
-        op_ld_ld(t1, t0, ctx);
+        tcg_gen_qemu_ld64(t1, t0, ctx->mem_idx);
         gen_store_gpr(t1, rd);
         tcg_gen_movi_tl(t1, 8);
         gen_op_addr_add(ctx, t0, t0, t1);
-        op_ld_ld(t1, t0, ctx);
+        tcg_gen_qemu_ld64(t1, t0, ctx->mem_idx);
         gen_store_gpr(t1, rd+1);
         opn = "ldp";
         break;
     case SDP:
         gen_load_gpr(t1, rd);
-        op_st_sd(t1, t0, ctx);
+        tcg_gen_qemu_st64(t1, t0, ctx->mem_idx);
         tcg_gen_movi_tl(t1, 8);
         gen_op_addr_add(ctx, t0, t0, t1);
         gen_load_gpr(t1, rd+1);
-        op_st_sd(t1, t0, ctx);
+        tcg_gen_qemu_st64(t1, t0, ctx->mem_idx);
         opn = "sdp";
         break;
 #endif
-- 
1.7.10.4
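
A note for readers of the diff above: the removed OP_LD/OP_ST macros only generated one-line forwarding wrappers, so each call site can name the underlying operation directly. The mapping still has to be kept when inlining: op_ld_lw forwarded to the ld32s variant, and on a 64-bit target the signed and unsigned 32-bit loads differ in the upper half of the result. The following is a minimal sketch of both points. The toy_ functions are hypothetical and read from host memory; they are not the TCG API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy loaders for illustration; hypothetical names, not the TCG API. */
static int64_t toy_ld32s(const void *p)
{
    int32_t v;
    memcpy(&v, p, sizeof(v));
    return (int64_t)v;            /* sign-extend to 64 bits */
}

static int64_t toy_ld32u(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof(v));
    return (int64_t)(uint64_t)v;  /* zero-extend to 64 bits */
}

/* Same shape as the removed OP_LD macro: a one-line forwarding wrapper. */
#define TOY_OP_LD(insn, fname)                        \
static inline int64_t toy_op_ld_##insn(const void *p) \
{                                                     \
    return toy_##fname(p);                            \
}
TOY_OP_LD(lw, ld32s)
TOY_OP_LD(lwu, ld32u)
#undef TOY_OP_LD
```

Calling toy_op_ld_lw() is indistinguishable from calling toy_ld32s(), which is why the wrapper can be dropped; swapping in toy_ld32u() would change the result for any word with bit 31 set.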

* [Qemu-devel] [PATCH v2 11/19] target-mips: optimize load operations
  2012-10-30  0:11 [Qemu-devel] [PATCH v2 00/19] target-mips: misc fixes and optimizations Aurelien Jarno
                   ` (9 preceding siblings ...)
  2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 10/19] target-mips: cleanup load/store operations Aurelien Jarno
@ 2012-10-30  0:12 ` Aurelien Jarno
  2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 12/19] target-mips: simplify load/store microMIPS helpers Aurelien Jarno
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Aurelien Jarno @ 2012-10-30  0:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

Only allocate t1 when needed.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-mips/translate.c |   16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/target-mips/translate.c b/target-mips/translate.c
index 4485a81..c46129d 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -1136,7 +1136,6 @@ static void gen_ld (CPUMIPSState *env, DisasContext *ctx, uint32_t opc,
     }
 
     t0 = tcg_temp_new();
-    t1 = tcg_temp_new();
     gen_base_offset_addr(ctx, t0, base, offset);
 
     switch (opc) {
@@ -1159,29 +1158,35 @@ static void gen_ld (CPUMIPSState *env, DisasContext *ctx, uint32_t opc,
         break;
     case OPC_LDL:
         save_cpu_state(ctx, 1);
+        t1 = tcg_temp_new();
         gen_load_gpr(t1, rt);
         gen_helper_1e2i(ldl, t1, t1, t0, ctx->mem_idx);
         gen_store_gpr(t1, rt);
+        tcg_temp_free(t1);
         opn = "ldl";
         break;
     case OPC_LDR:
         save_cpu_state(ctx, 1);
+        t1 = tcg_temp_new();
         gen_load_gpr(t1, rt);
         gen_helper_1e2i(ldr, t1, t1, t0, ctx->mem_idx);
         gen_store_gpr(t1, rt);
+        tcg_temp_free(t1);
         opn = "ldr";
         break;
     case OPC_LDPC:
-        tcg_gen_movi_tl(t1, pc_relative_pc(ctx));
+        t1 = tcg_const_tl(pc_relative_pc(ctx));
         gen_op_addr_add(ctx, t0, t0, t1);
+        tcg_temp_free(t1);
         tcg_gen_qemu_ld64(t0, t0, ctx->mem_idx);
         gen_store_gpr(t0, rt);
         opn = "ldpc";
         break;
 #endif
     case OPC_LWPC:
-        tcg_gen_movi_tl(t1, pc_relative_pc(ctx));
+        t1 = tcg_const_tl(pc_relative_pc(ctx));
         gen_op_addr_add(ctx, t0, t0, t1);
+        tcg_temp_free(t1);
         tcg_gen_qemu_ld32s(t0, t0, ctx->mem_idx);
         gen_store_gpr(t0, rt);
         opn = "lwpc";
@@ -1213,16 +1218,20 @@ static void gen_ld (CPUMIPSState *env, DisasContext *ctx, uint32_t opc,
         break;
     case OPC_LWL:
         save_cpu_state(ctx, 1);
+        t1 = tcg_temp_new();
         gen_load_gpr(t1, rt);
         gen_helper_1e2i(lwl, t1, t1, t0, ctx->mem_idx);
         gen_store_gpr(t1, rt);
+        tcg_temp_free(t1);
         opn = "lwl";
         break;
     case OPC_LWR:
         save_cpu_state(ctx, 1);
+        t1 = tcg_temp_new();
         gen_load_gpr(t1, rt);
         gen_helper_1e2i(lwr, t1, t1, t0, ctx->mem_idx);
         gen_store_gpr(t1, rt);
+        tcg_temp_free(t1);
         opn = "lwr";
         break;
     case OPC_LL:
@@ -1235,7 +1244,6 @@ static void gen_ld (CPUMIPSState *env, DisasContext *ctx, uint32_t opc,
     (void)opn; /* avoid a compiler warning */
     MIPS_DEBUG("%s %s, %d(%s)", opn, regnames[rt], offset, regnames[base]);
     tcg_temp_free(t0);
-    tcg_temp_free(t1);
 }
 
 /* Store */
-- 
1.7.10.4


* [Qemu-devel] [PATCH v2 12/19] target-mips: simplify load/store microMIPS helpers
  2012-10-30  0:11 [Qemu-devel] [PATCH v2 00/19] target-mips: misc fixes and optimizations Aurelien Jarno
                   ` (10 preceding siblings ...)
  2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 11/19] target-mips: optimize load operations Aurelien Jarno
@ 2012-10-30  0:12 ` Aurelien Jarno
  2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 13/19] target-mips: implement unaligned loads using TCG Aurelien Jarno
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Aurelien Jarno @ 2012-10-30  0:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

The load/store microMIPS helpers are reinventing the wheel. Call do_lw,
do_ld, do_sw and do_sd instead of using a macro or a function pointer
wrapping the cpu_* load/store functions.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-mips/op_helper.c |   73 ++++++-----------------------------------------
 1 file changed, 9 insertions(+), 64 deletions(-)

diff --git a/target-mips/op_helper.c b/target-mips/op_helper.c
index a7509ca..78497d9 100644
--- a/target-mips/op_helper.c
+++ b/target-mips/op_helper.c
@@ -594,32 +594,19 @@ void helper_lwm(CPUMIPSState *env, target_ulong addr, target_ulong reglist,
 {
     target_ulong base_reglist = reglist & 0xf;
     target_ulong do_r31 = reglist & 0x10;
-#ifdef CONFIG_USER_ONLY
-#undef ldfun
-#define ldfun(env, addr) ldl_raw(addr)
-#else
-    uint32_t (*ldfun)(CPUMIPSState *env, target_ulong);
-
-    switch (mem_idx)
-    {
-    case 0: ldfun = cpu_ldl_kernel; break;
-    case 1: ldfun = cpu_ldl_super; break;
-    default:
-    case 2: ldfun = cpu_ldl_user; break;
-    }
-#endif
 
     if (base_reglist > 0 && base_reglist <= ARRAY_SIZE (multiple_regs)) {
         target_ulong i;
 
         for (i = 0; i < base_reglist; i++) {
-            env->active_tc.gpr[multiple_regs[i]] = (target_long)ldfun(env, addr);
+            env->active_tc.gpr[multiple_regs[i]] =
+                (target_long)do_lw(env, addr, mem_idx);
             addr += 4;
         }
     }
 
     if (do_r31) {
-        env->active_tc.gpr[31] = (target_long)ldfun(env, addr);
+        env->active_tc.gpr[31] = (target_long)do_lw(env, addr, mem_idx);
     }
 }
 
@@ -628,32 +615,18 @@ void helper_swm(CPUMIPSState *env, target_ulong addr, target_ulong reglist,
 {
     target_ulong base_reglist = reglist & 0xf;
     target_ulong do_r31 = reglist & 0x10;
-#ifdef CONFIG_USER_ONLY
-#undef stfun
-#define stfun(env, addr, val) stl_raw(addr, val)
-#else
-    void (*stfun)(CPUMIPSState *env, target_ulong, uint32_t);
-
-    switch (mem_idx)
-    {
-    case 0: stfun = cpu_stl_kernel; break;
-    case 1: stfun = cpu_stl_super; break;
-     default:
-    case 2: stfun = cpu_stl_user; break;
-    }
-#endif
 
     if (base_reglist > 0 && base_reglist <= ARRAY_SIZE (multiple_regs)) {
         target_ulong i;
 
         for (i = 0; i < base_reglist; i++) {
-            stfun(env, addr, env->active_tc.gpr[multiple_regs[i]]);
+            do_sw(env, addr, env->active_tc.gpr[multiple_regs[i]], mem_idx);
             addr += 4;
         }
     }
 
     if (do_r31) {
-        stfun(env, addr, env->active_tc.gpr[31]);
+        do_sw(env, addr, env->active_tc.gpr[31], mem_idx);
     }
 }
 
@@ -663,32 +636,18 @@ void helper_ldm(CPUMIPSState *env, target_ulong addr, target_ulong reglist,
 {
     target_ulong base_reglist = reglist & 0xf;
     target_ulong do_r31 = reglist & 0x10;
-#ifdef CONFIG_USER_ONLY
-#undef ldfun
-#define ldfun(env, addr) ldq_raw(addr)
-#else
-    uint64_t (*ldfun)(CPUMIPSState *env, target_ulong);
-
-    switch (mem_idx)
-    {
-    case 0: ldfun = cpu_ldq_kernel; break;
-    case 1: ldfun = cpu_ldq_super; break;
-    default:
-    case 2: ldfun = cpu_ldq_user; break;
-    }
-#endif
 
     if (base_reglist > 0 && base_reglist <= ARRAY_SIZE (multiple_regs)) {
         target_ulong i;
 
         for (i = 0; i < base_reglist; i++) {
-            env->active_tc.gpr[multiple_regs[i]] = ldfun(env, addr);
+            env->active_tc.gpr[multiple_regs[i]] = do_ld(env, addr, mem_idx);
             addr += 8;
         }
     }
 
     if (do_r31) {
-        env->active_tc.gpr[31] = ldfun(env, addr);
+        env->active_tc.gpr[31] = do_ld(env, addr, mem_idx);
     }
 }
 
@@ -697,32 +656,18 @@ void helper_sdm(CPUMIPSState *env, target_ulong addr, target_ulong reglist,
 {
     target_ulong base_reglist = reglist & 0xf;
     target_ulong do_r31 = reglist & 0x10;
-#ifdef CONFIG_USER_ONLY
-#undef stfun
-#define stfun(env, addr, val) stq_raw(addr, val)
-#else
-    void (*stfun)(CPUMIPSState *env, target_ulong, uint64_t);
-
-    switch (mem_idx)
-    {
-    case 0: stfun = cpu_stq_kernel; break;
-    case 1: stfun = cpu_stq_super; break;
-     default:
-    case 2: stfun = cpu_stq_user; break;
-    }
-#endif
 
     if (base_reglist > 0 && base_reglist <= ARRAY_SIZE (multiple_regs)) {
         target_ulong i;
 
         for (i = 0; i < base_reglist; i++) {
-            stfun(env, addr, env->active_tc.gpr[multiple_regs[i]]);
+            do_sd(env, addr, env->active_tc.gpr[multiple_regs[i]], mem_idx);
             addr += 8;
         }
     }
 
     if (do_r31) {
-        stfun(env, addr, env->active_tc.gpr[31]);
+        do_sd(env, addr, env->active_tc.gpr[31], mem_idx);
     }
 }
 #endif
-- 
1.7.10.4


* [Qemu-devel] [PATCH v2 13/19] target-mips: implement unaligned loads using TCG
  2012-10-30  0:11 [Qemu-devel] [PATCH v2 00/19] target-mips: misc fixes and optimizations Aurelien Jarno
                   ` (11 preceding siblings ...)
  2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 12/19] target-mips: simplify load/store microMIPS helpers Aurelien Jarno
@ 2012-10-30  0:12 ` Aurelien Jarno
  2012-10-30 18:59   ` Blue Swirl
  2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 14/19] target-mips: don't use local temps for store conditional Aurelien Jarno
                   ` (6 subsequent siblings)
  19 siblings, 1 reply; 23+ messages in thread
From: Aurelien Jarno @ 2012-10-30  0:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

Load/store from helpers should be avoided as they are quite
inefficient. Rewrite the unaligned load instructions using TCG and
aligned loads. The number of actual load operations needed to implement
an unaligned load instruction is reduced from up to 8 to 1.

Note: As shifting by 32 or 64 is undefined behaviour and can't be
relied upon, the code loads mask constants that are already shifted
by one.
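
For illustration only, the same idea can be sketched in plain C for a
little-endian target with 32-bit registers. The names guest_mem,
load32_le, lwl_le and lwr_le are hypothetical and are not part of QEMU;
the sign extension that the real LWL does on 64-bit targets is left out.
Each instruction does a single aligned load, and the mask constants
0x7fffffff and 0xfffffffe are the "already shifted by one" values that
avoid a shift by 32:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical guest memory for this sketch; not a QEMU structure. */
static const uint8_t guest_mem[8] = {
    0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88
};

/* One aligned little-endian 32-bit load, standing in for the TCG load op. */
static uint32_t load32_le(uint32_t addr)
{
    assert((addr & 3) == 0 && addr + 4 <= sizeof(guest_mem));
    return (uint32_t)guest_mem[addr] |
           (uint32_t)guest_mem[addr + 1] << 8 |
           (uint32_t)guest_mem[addr + 2] << 16 |
           (uint32_t)guest_mem[addr + 3] << 24;
}

/* LWL, little-endian: fill the upper bytes of rt, keep the lower ones. */
static uint32_t lwl_le(uint32_t rt, uint32_t addr)
{
    uint32_t shift = ((addr & 3) ^ 3) * 8;        /* 0, 8, 16 or 24 */
    uint32_t word = load32_le(addr & ~3u) << shift;
    /* Same as 0xffffffff >> (32 - shift), but never shifts by 32. */
    uint32_t keep = 0x7fffffffu >> (shift ^ 31);
    return (rt & keep) | word;
}

/* LWR, little-endian: fill the lower bytes of rt, keep the upper ones. */
static uint32_t lwr_le(uint32_t rt, uint32_t addr)
{
    uint32_t shift = (addr & 3) * 8;              /* 0, 8, 16 or 24 */
    uint32_t word = load32_le(addr & ~3u) >> shift;
    /* Same as 0xffffffff << (32 - shift), but never shifts by 32. */
    uint32_t keep = 0xfffffffeu << (shift ^ 31);
    return (rt & keep) | word;
}
```

With the memory contents above, an unaligned word read at address 1 is
lwr_le(lwl_le(rt, 4), 1), which uses two aligned loads in total instead
of one byte load per byte.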

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-mips/helper.h    |    4 --
 target-mips/op_helper.c |  142 -----------------------------------------------
 target-mips/translate.c |   75 ++++++++++++++++++++-----
 3 files changed, 62 insertions(+), 159 deletions(-)

diff --git a/target-mips/helper.h b/target-mips/helper.h
index 210960f..0e38cdd 100644
--- a/target-mips/helper.h
+++ b/target-mips/helper.h
@@ -4,13 +4,9 @@ DEF_HELPER_3(raise_exception_err, noreturn, env, i32, int)
 DEF_HELPER_2(raise_exception, noreturn, env, i32)
 
 #ifdef TARGET_MIPS64
-DEF_HELPER_4(ldl, tl, env, tl, tl, int)
-DEF_HELPER_4(ldr, tl, env, tl, tl, int)
 DEF_HELPER_4(sdl, void, env, tl, tl, int)
 DEF_HELPER_4(sdr, void, env, tl, tl, int)
 #endif
-DEF_HELPER_4(lwl, tl, env, tl, tl, int)
-DEF_HELPER_4(lwr, tl, env, tl, tl, int)
 DEF_HELPER_4(swl, void, env, tl, tl, int)
 DEF_HELPER_4(swr, void, env, tl, tl, int)
 
diff --git a/target-mips/op_helper.c b/target-mips/op_helper.c
index 78497d9..773c710 100644
--- a/target-mips/op_helper.c
+++ b/target-mips/op_helper.c
@@ -350,56 +350,6 @@ HELPER_ST_ATOMIC(scd, ld, sd, 0x7)
 #define GET_OFFSET(addr, offset) (addr - (offset))
 #endif
 
-target_ulong helper_lwl(CPUMIPSState *env, target_ulong arg1,
-                        target_ulong arg2, int mem_idx)
-{
-    target_ulong tmp;
-
-    tmp = do_lbu(env, arg2, mem_idx);
-    arg1 = (arg1 & 0x00FFFFFF) | (tmp << 24);
-
-    if (GET_LMASK(arg2) <= 2) {
-        tmp = do_lbu(env, GET_OFFSET(arg2, 1), mem_idx);
-        arg1 = (arg1 & 0xFF00FFFF) | (tmp << 16);
-    }
-
-    if (GET_LMASK(arg2) <= 1) {
-        tmp = do_lbu(env, GET_OFFSET(arg2, 2), mem_idx);
-        arg1 = (arg1 & 0xFFFF00FF) | (tmp << 8);
-    }
-
-    if (GET_LMASK(arg2) == 0) {
-        tmp = do_lbu(env, GET_OFFSET(arg2, 3), mem_idx);
-        arg1 = (arg1 & 0xFFFFFF00) | tmp;
-    }
-    return (int32_t)arg1;
-}
-
-target_ulong helper_lwr(CPUMIPSState *env, target_ulong arg1,
-                        target_ulong arg2, int mem_idx)
-{
-    target_ulong tmp;
-
-    tmp = do_lbu(env, arg2, mem_idx);
-    arg1 = (arg1 & 0xFFFFFF00) | tmp;
-
-    if (GET_LMASK(arg2) >= 1) {
-        tmp = do_lbu(env, GET_OFFSET(arg2, -1), mem_idx);
-        arg1 = (arg1 & 0xFFFF00FF) | (tmp << 8);
-    }
-
-    if (GET_LMASK(arg2) >= 2) {
-        tmp = do_lbu(env, GET_OFFSET(arg2, -2), mem_idx);
-        arg1 = (arg1 & 0xFF00FFFF) | (tmp << 16);
-    }
-
-    if (GET_LMASK(arg2) == 3) {
-        tmp = do_lbu(env, GET_OFFSET(arg2, -3), mem_idx);
-        arg1 = (arg1 & 0x00FFFFFF) | (tmp << 24);
-    }
-    return (int32_t)arg1;
-}
-
 void helper_swl(CPUMIPSState *env, target_ulong arg1, target_ulong arg2,
                 int mem_idx)
 {
@@ -440,98 +390,6 @@ void helper_swr(CPUMIPSState *env, target_ulong arg1, target_ulong arg2,
 #define GET_LMASK64(v) (((v) & 7) ^ 7)
 #endif
 
-target_ulong helper_ldl(CPUMIPSState *env, target_ulong arg1,
-                        target_ulong arg2, int mem_idx)
-{
-    uint64_t tmp;
-
-    tmp = do_lbu(env, arg2, mem_idx);
-    arg1 = (arg1 & 0x00FFFFFFFFFFFFFFULL) | (tmp << 56);
-
-    if (GET_LMASK64(arg2) <= 6) {
-        tmp = do_lbu(env, GET_OFFSET(arg2, 1), mem_idx);
-        arg1 = (arg1 & 0xFF00FFFFFFFFFFFFULL) | (tmp << 48);
-    }
-
-    if (GET_LMASK64(arg2) <= 5) {
-        tmp = do_lbu(env, GET_OFFSET(arg2, 2), mem_idx);
-        arg1 = (arg1 & 0xFFFF00FFFFFFFFFFULL) | (tmp << 40);
-    }
-
-    if (GET_LMASK64(arg2) <= 4) {
-        tmp = do_lbu(env, GET_OFFSET(arg2, 3), mem_idx);
-        arg1 = (arg1 & 0xFFFFFF00FFFFFFFFULL) | (tmp << 32);
-    }
-
-    if (GET_LMASK64(arg2) <= 3) {
-        tmp = do_lbu(env, GET_OFFSET(arg2, 4), mem_idx);
-        arg1 = (arg1 & 0xFFFFFFFF00FFFFFFULL) | (tmp << 24);
-    }
-
-    if (GET_LMASK64(arg2) <= 2) {
-        tmp = do_lbu(env, GET_OFFSET(arg2, 5), mem_idx);
-        arg1 = (arg1 & 0xFFFFFFFFFF00FFFFULL) | (tmp << 16);
-    }
-
-    if (GET_LMASK64(arg2) <= 1) {
-        tmp = do_lbu(env, GET_OFFSET(arg2, 6), mem_idx);
-        arg1 = (arg1 & 0xFFFFFFFFFFFF00FFULL) | (tmp << 8);
-    }
-
-    if (GET_LMASK64(arg2) == 0) {
-        tmp = do_lbu(env, GET_OFFSET(arg2, 7), mem_idx);
-        arg1 = (arg1 & 0xFFFFFFFFFFFFFF00ULL) | tmp;
-    }
-
-    return arg1;
-}
-
-target_ulong helper_ldr(CPUMIPSState *env, target_ulong arg1,
-                        target_ulong arg2, int mem_idx)
-{
-    uint64_t tmp;
-
-    tmp = do_lbu(env, arg2, mem_idx);
-    arg1 = (arg1 & 0xFFFFFFFFFFFFFF00ULL) | tmp;
-
-    if (GET_LMASK64(arg2) >= 1) {
-        tmp = do_lbu(env, GET_OFFSET(arg2, -1), mem_idx);
-        arg1 = (arg1 & 0xFFFFFFFFFFFF00FFULL) | (tmp  << 8);
-    }
-
-    if (GET_LMASK64(arg2) >= 2) {
-        tmp = do_lbu(env, GET_OFFSET(arg2, -2), mem_idx);
-        arg1 = (arg1 & 0xFFFFFFFFFF00FFFFULL) | (tmp << 16);
-    }
-
-    if (GET_LMASK64(arg2) >= 3) {
-        tmp = do_lbu(env, GET_OFFSET(arg2, -3), mem_idx);
-        arg1 = (arg1 & 0xFFFFFFFF00FFFFFFULL) | (tmp << 24);
-    }
-
-    if (GET_LMASK64(arg2) >= 4) {
-        tmp = do_lbu(env, GET_OFFSET(arg2, -4), mem_idx);
-        arg1 = (arg1 & 0xFFFFFF00FFFFFFFFULL) | (tmp << 32);
-    }
-
-    if (GET_LMASK64(arg2) >= 5) {
-        tmp = do_lbu(env, GET_OFFSET(arg2, -5), mem_idx);
-        arg1 = (arg1 & 0xFFFF00FFFFFFFFFFULL) | (tmp << 40);
-    }
-
-    if (GET_LMASK64(arg2) >= 6) {
-        tmp = do_lbu(env, GET_OFFSET(arg2, -6), mem_idx);
-        arg1 = (arg1 & 0xFF00FFFFFFFFFFFFULL) | (tmp << 48);
-    }
-
-    if (GET_LMASK64(arg2) == 7) {
-        tmp = do_lbu(env, GET_OFFSET(arg2, -7), mem_idx);
-        arg1 = (arg1 & 0x00FFFFFFFFFFFFFFULL) | (tmp << 56);
-    }
-
-    return arg1;
-}
-
 void helper_sdl(CPUMIPSState *env, target_ulong arg1, target_ulong arg2,
                 int mem_idx)
 {
diff --git a/target-mips/translate.c b/target-mips/translate.c
index c46129d..b385923 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -1125,7 +1125,7 @@ static void gen_ld (CPUMIPSState *env, DisasContext *ctx, uint32_t opc,
                     int rt, int base, int16_t offset)
 {
     const char *opn = "ld";
-    TCGv t0, t1;
+    TCGv t0, t1, t2;
 
     if (rt == 0 && env->insn_flags & (INSN_LOONGSON2E | INSN_LOONGSON2F)) {
         /* Loongson CPU uses a load to zero register for prefetch.
@@ -1157,21 +1157,45 @@ static void gen_ld (CPUMIPSState *env, DisasContext *ctx, uint32_t opc,
         opn = "lld";
         break;
     case OPC_LDL:
-        save_cpu_state(ctx, 1);
         t1 = tcg_temp_new();
+        tcg_gen_andi_tl(t1, t0, 7);
+#ifndef TARGET_WORDS_BIGENDIAN
+        tcg_gen_xori_tl(t1, t1, 7);
+#endif
+        tcg_gen_shli_tl(t1, t1, 3);
+        tcg_gen_andi_tl(t0, t0, ~7);
+        tcg_gen_qemu_ld64(t0, t0, ctx->mem_idx);
+        tcg_gen_shl_tl(t0, t0, t1);
+        tcg_gen_xori_tl(t1, t1, 63);
+        t2 = tcg_const_tl(0x7fffffffffffffffull);
+        tcg_gen_shr_tl(t2, t2, t1);
         gen_load_gpr(t1, rt);
-        gen_helper_1e2i(ldl, t1, t1, t0, ctx->mem_idx);
-        gen_store_gpr(t1, rt);
+        tcg_gen_and_tl(t1, t1, t2);
+        tcg_temp_free(t2);
+        tcg_gen_or_tl(t0, t0, t1);
         tcg_temp_free(t1);
+        gen_store_gpr(t0, rt);
         opn = "ldl";
         break;
     case OPC_LDR:
-        save_cpu_state(ctx, 1);
         t1 = tcg_temp_new();
+        tcg_gen_andi_tl(t1, t0, 7);
+#ifdef TARGET_WORDS_BIGENDIAN
+        tcg_gen_xori_tl(t1, t1, 7);
+#endif
+        tcg_gen_shli_tl(t1, t1, 3);
+        tcg_gen_andi_tl(t0, t0, ~7);
+        tcg_gen_qemu_ld64(t0, t0, ctx->mem_idx);
+        tcg_gen_shr_tl(t0, t0, t1);
+        tcg_gen_xori_tl(t1, t1, 63);
+        t2 = tcg_const_tl(0xfffffffffffffffeull);
+        tcg_gen_shl_tl(t2, t2, t1);
         gen_load_gpr(t1, rt);
-        gen_helper_1e2i(ldr, t1, t1, t0, ctx->mem_idx);
-        gen_store_gpr(t1, rt);
+        tcg_gen_and_tl(t1, t1, t2);
+        tcg_temp_free(t2);
+        tcg_gen_or_tl(t0, t0, t1);
         tcg_temp_free(t1);
+        gen_store_gpr(t0, rt);
         opn = "ldr";
         break;
     case OPC_LDPC:
@@ -1217,21 +1241,46 @@ static void gen_ld (CPUMIPSState *env, DisasContext *ctx, uint32_t opc,
         opn = "lbu";
         break;
     case OPC_LWL:
-        save_cpu_state(ctx, 1);
         t1 = tcg_temp_new();
+        tcg_gen_andi_tl(t1, t0, 3);
+#ifndef TARGET_WORDS_BIGENDIAN
+        tcg_gen_xori_tl(t1, t1, 3);
+#endif
+        tcg_gen_shli_tl(t1, t1, 3);
+        tcg_gen_andi_tl(t0, t0, ~3);
+        tcg_gen_qemu_ld32u(t0, t0, ctx->mem_idx);
+        tcg_gen_shl_tl(t0, t0, t1);
+        tcg_gen_xori_tl(t1, t1, 31);
+        t2 = tcg_const_tl(0x7fffffffull);
+        tcg_gen_shr_tl(t2, t2, t1);
         gen_load_gpr(t1, rt);
-        gen_helper_1e2i(lwl, t1, t1, t0, ctx->mem_idx);
-        gen_store_gpr(t1, rt);
+        tcg_gen_and_tl(t1, t1, t2);
+        tcg_temp_free(t2);
+        tcg_gen_or_tl(t0, t0, t1);
         tcg_temp_free(t1);
+        tcg_gen_ext32s_tl(t0, t0);
+        gen_store_gpr(t0, rt);
         opn = "lwl";
         break;
     case OPC_LWR:
-        save_cpu_state(ctx, 1);
         t1 = tcg_temp_new();
+        tcg_gen_andi_tl(t1, t0, 3);
+#ifdef TARGET_WORDS_BIGENDIAN
+        tcg_gen_xori_tl(t1, t1, 3);
+#endif
+        tcg_gen_shli_tl(t1, t1, 3);
+        tcg_gen_andi_tl(t0, t0, ~3);
+        tcg_gen_qemu_ld32u(t0, t0, ctx->mem_idx);
+        tcg_gen_shr_tl(t0, t0, t1);
+        tcg_gen_xori_tl(t1, t1, 31);
+        t2 = tcg_const_tl(0xfffffffeull);
+        tcg_gen_shl_tl(t2, t2, t1);
         gen_load_gpr(t1, rt);
-        gen_helper_1e2i(lwr, t1, t1, t0, ctx->mem_idx);
-        gen_store_gpr(t1, rt);
+        tcg_gen_and_tl(t1, t1, t2);
+        tcg_temp_free(t2);
+        tcg_gen_or_tl(t0, t0, t1);
         tcg_temp_free(t1);
+        gen_store_gpr(t0, rt);
         opn = "lwr";
         break;
     case OPC_LL:
-- 
1.7.10.4


* [Qemu-devel] [PATCH v2 14/19] target-mips: don't use local temps for store conditional
  2012-10-30  0:11 [Qemu-devel] [PATCH v2 00/19] target-mips: misc fixes and optimizations Aurelien Jarno
                   ` (12 preceding siblings ...)
  2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 13/19] target-mips: implement unaligned loads using TCG Aurelien Jarno
@ 2012-10-30  0:12 ` Aurelien Jarno
  2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 15/19] target-mips: implement movn/movz using movcond Aurelien Jarno
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Aurelien Jarno @ 2012-10-30  0:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

Store conditional operations only need local temps in user mode. Fix
the code to use local temps only in user mode; this spares two memory
stores in system mode.

At the same time remove a wrong copied & pasted comment: store
operations don't have a destination register.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-mips/translate.c |   11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/target-mips/translate.c b/target-mips/translate.c
index b385923..54f309f 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -1359,13 +1359,14 @@ static void gen_st_cond (DisasContext *ctx, uint32_t opc, int rt,
     const char *opn = "st_cond";
     TCGv t0, t1;
 
+#ifdef CONFIG_USER_ONLY
     t0 = tcg_temp_local_new();
-
-    gen_base_offset_addr(ctx, t0, base, offset);
-    /* Don't do NOP if destination is zero: we must perform the actual
-       memory access. */
-
     t1 = tcg_temp_local_new();
+#else
+    t0 = tcg_temp_new();
+    t1 = tcg_temp_new();
+#endif
+    gen_base_offset_addr(ctx, t0, base, offset);
     gen_load_gpr(t1, rt);
     switch (opc) {
 #if defined(TARGET_MIPS64)
-- 
1.7.10.4


* [Qemu-devel] [PATCH v2 15/19] target-mips: implement movn/movz using movcond
  2012-10-30  0:11 [Qemu-devel] [PATCH v2 00/19] target-mips: misc fixes and optimizations Aurelien Jarno
                   ` (13 preceding siblings ...)
  2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 14/19] target-mips: don't use local temps for store conditional Aurelien Jarno
@ 2012-10-30  0:12 ` Aurelien Jarno
  2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 16/19] target-mips: optimize ddiv/ddivu/div/divu with movcond Aurelien Jarno
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Aurelien Jarno @ 2012-10-30  0:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

Avoid the branches in the movn/movz implementation and replace them
with movcond. Also fix a wrong comment.
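
As a rough sketch of the semantics (plain C, not TCG; movn_sketch and
movz_sketch are hypothetical names), each instruction becomes a single
select between the new value and the old content of rd, which is what
movcond expresses without a label:

```c
#include <stdint.h>

/* MOVN: rd = rs if rt != 0, otherwise rd keeps its old value. */
static uint64_t movn_sketch(uint64_t rd, uint64_t rs, uint64_t rt)
{
    return rt != 0 ? rs : rd;
}

/* MOVZ: rd = rs if rt == 0, otherwise rd keeps its old value. */
static uint64_t movz_sketch(uint64_t rd, uint64_t rs, uint64_t rt)
{
    return rt == 0 ? rs : rd;
}
```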

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-mips/translate.c |   27 ++++++++++++---------------
 1 file changed, 12 insertions(+), 15 deletions(-)

diff --git a/target-mips/translate.c b/target-mips/translate.c
index 54f309f..5d5c44e 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -1904,35 +1904,32 @@ static void gen_cond_move(CPUMIPSState *env, DisasContext *ctx, uint32_t opc,
                           int rd, int rs, int rt)
 {
     const char *opn = "cond move";
-    int l1;
+    TCGv t0, t1, t2;
 
     if (rd == 0) {
-        /* If no destination, treat it as a NOP.
-           For add & sub, we must generate the overflow exception when needed. */
+        /* If no destination, treat it as a NOP. */
         MIPS_DEBUG("NOP");
         return;
     }
 
-    l1 = gen_new_label();
+    t0 = tcg_temp_new();
+    gen_load_gpr(t0, rt);
+    t1 = tcg_const_tl(0);
+    t2 = tcg_temp_new();
+    gen_load_gpr(t2, rs);
     switch (opc) {
     case OPC_MOVN:
-        if (likely(rt != 0))
-            tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_gpr[rt], 0, l1);
-        else
-            tcg_gen_br(l1);
+        tcg_gen_movcond_tl(TCG_COND_NE, cpu_gpr[rd], t0, t1, t2, cpu_gpr[rd]);
         opn = "movn";
         break;
     case OPC_MOVZ:
-        if (likely(rt != 0))
-            tcg_gen_brcondi_tl(TCG_COND_NE, cpu_gpr[rt], 0, l1);
+        tcg_gen_movcond_tl(TCG_COND_EQ, cpu_gpr[rd], t0, t1, t2, cpu_gpr[rd]);
         opn = "movz";
         break;
     }
-    if (rs != 0)
-        tcg_gen_mov_tl(cpu_gpr[rd], cpu_gpr[rs]);
-    else
-        tcg_gen_movi_tl(cpu_gpr[rd], 0);
-    gen_set_label(l1);
+    tcg_temp_free(t2);
+    tcg_temp_free(t1);
+    tcg_temp_free(t0);
 
     (void)opn; /* avoid a compiler warning */
     MIPS_DEBUG("%s %s, %s, %s", opn, regnames[rd], regnames[rs], regnames[rt]);
-- 
1.7.10.4


* [Qemu-devel] [PATCH v2 16/19] target-mips: optimize ddiv/ddivu/div/divu with movcond
  2012-10-30  0:11 [Qemu-devel] [PATCH v2 00/19] target-mips: misc fixes and optimizations Aurelien Jarno
                   ` (14 preceding siblings ...)
  2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 15/19] target-mips: implement movn/movz using movcond Aurelien Jarno
@ 2012-10-30  0:12 ` Aurelien Jarno
  2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 17/19] target-mips: use deposit instead of hardcoded version Aurelien Jarno
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Aurelien Jarno @ 2012-10-30  0:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

The result of a division by 0, or of a division of INT_MIN by -1 in the
signed case, is unpredictable. Just replace the divisor by 1 in those
cases so that the division doesn't trigger a floating point exception
on the host.
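
A minimal sketch of the same trick in plain C, for the signed 32-bit
case (divres32 and div32_no_trap are hypothetical names, not QEMU
code): the problematic inputs are detected with comparisons, and the
divisor is replaced by 1 before dividing, so no branch around the
division is needed:

```c
#include <stdint.h>

typedef struct {
    int32_t quot;
    int32_t rem;
} divres32;

/* Divide without ever trapping on the host. The result for the
   unpredictable inputs is arbitrary but well defined: a / 1 and a % 1. */
static divres32 div32_no_trap(int32_t a, int32_t b)
{
    /* 1 if the division would trap (b == 0, or INT32_MIN / -1), else 0. */
    int32_t bad = ((a == INT32_MIN) & (b == -1)) | (b == 0);
    /* Select the replacement divisor, like movcond in the patch. */
    int32_t divisor = bad ? 1 : b;
    divres32 r = { a / divisor, a % divisor };
    return r;
}
```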

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-mips/translate.c |   85 +++++++++++++++++++++--------------------------
 1 file changed, 37 insertions(+), 48 deletions(-)

diff --git a/target-mips/translate.c b/target-mips/translate.c
index 5d5c44e..bfc7cc7 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -2155,60 +2155,48 @@ static void gen_muldiv (DisasContext *ctx, uint32_t opc,
     const char *opn = "mul/div";
     TCGv t0, t1;
 
-    switch (opc) {
-    case OPC_DIV:
-    case OPC_DIVU:
-#if defined(TARGET_MIPS64)
-    case OPC_DDIV:
-    case OPC_DDIVU:
-#endif
-        t0 = tcg_temp_local_new();
-        t1 = tcg_temp_local_new();
-        break;
-    default:
-        t0 = tcg_temp_new();
-        t1 = tcg_temp_new();
-        break;
-    }
+    t0 = tcg_temp_new();
+    t1 = tcg_temp_new();
 
     gen_load_gpr(t0, rs);
     gen_load_gpr(t1, rt);
+
     switch (opc) {
     case OPC_DIV:
         {
-            int l1 = gen_new_label();
-            int l2 = gen_new_label();
-
+            TCGv t2 = tcg_temp_new();
+            TCGv t3 = tcg_temp_new();
             tcg_gen_ext32s_tl(t0, t0);
             tcg_gen_ext32s_tl(t1, t1);
-            tcg_gen_brcondi_tl(TCG_COND_EQ, t1, 0, l1);
-            tcg_gen_brcondi_tl(TCG_COND_NE, t0, INT_MIN, l2);
-            tcg_gen_brcondi_tl(TCG_COND_NE, t1, -1, l2);
-
-            tcg_gen_mov_tl(cpu_LO[0], t0);
-            tcg_gen_movi_tl(cpu_HI[0], 0);
-            tcg_gen_br(l1);
-            gen_set_label(l2);
+            tcg_gen_setcondi_tl(TCG_COND_EQ, t2, t0, INT_MIN);
+            tcg_gen_setcondi_tl(TCG_COND_EQ, t3, t1, -1);
+            tcg_gen_and_tl(t2, t2, t3);
+            tcg_gen_setcondi_tl(TCG_COND_EQ, t3, t1, 0);
+            tcg_gen_or_tl(t2, t2, t3);
+            tcg_gen_movi_tl(t3, 0);
+            tcg_gen_movcond_tl(TCG_COND_NE, t1, t2, t3, t2, t1);
             tcg_gen_div_tl(cpu_LO[0], t0, t1);
             tcg_gen_rem_tl(cpu_HI[0], t0, t1);
             tcg_gen_ext32s_tl(cpu_LO[0], cpu_LO[0]);
             tcg_gen_ext32s_tl(cpu_HI[0], cpu_HI[0]);
-            gen_set_label(l1);
+            tcg_temp_free(t3);
+            tcg_temp_free(t2);
         }
         opn = "div";
         break;
     case OPC_DIVU:
         {
-            int l1 = gen_new_label();
-
+            TCGv t2 = tcg_const_tl(0);
+            TCGv t3 = tcg_const_tl(1);
             tcg_gen_ext32u_tl(t0, t0);
             tcg_gen_ext32u_tl(t1, t1);
-            tcg_gen_brcondi_tl(TCG_COND_EQ, t1, 0, l1);
+            tcg_gen_movcond_tl(TCG_COND_EQ, t1, t1, t2, t3, t1);
             tcg_gen_divu_tl(cpu_LO[0], t0, t1);
             tcg_gen_remu_tl(cpu_HI[0], t0, t1);
             tcg_gen_ext32s_tl(cpu_LO[0], cpu_LO[0]);
             tcg_gen_ext32s_tl(cpu_HI[0], cpu_HI[0]);
-            gen_set_label(l1);
+            tcg_temp_free(t3);
+            tcg_temp_free(t2);
         }
         opn = "divu";
         break;
@@ -2253,30 +2241,31 @@ static void gen_muldiv (DisasContext *ctx, uint32_t opc,
 #if defined(TARGET_MIPS64)
     case OPC_DDIV:
         {
-            int l1 = gen_new_label();
-            int l2 = gen_new_label();
-
-            tcg_gen_brcondi_tl(TCG_COND_EQ, t1, 0, l1);
-            tcg_gen_brcondi_tl(TCG_COND_NE, t0, -1LL << 63, l2);
-            tcg_gen_brcondi_tl(TCG_COND_NE, t1, -1LL, l2);
-            tcg_gen_mov_tl(cpu_LO[0], t0);
-            tcg_gen_movi_tl(cpu_HI[0], 0);
-            tcg_gen_br(l1);
-            gen_set_label(l2);
-            tcg_gen_div_i64(cpu_LO[0], t0, t1);
-            tcg_gen_rem_i64(cpu_HI[0], t0, t1);
-            gen_set_label(l1);
+            TCGv t2 = tcg_temp_new();
+            TCGv t3 = tcg_temp_new();
+            tcg_gen_setcondi_tl(TCG_COND_EQ, t2, t0, -1LL << 63);
+            tcg_gen_setcondi_tl(TCG_COND_EQ, t3, t1, -1LL);
+            tcg_gen_and_tl(t2, t2, t3);
+            tcg_gen_setcondi_tl(TCG_COND_EQ, t3, t1, 0);
+            tcg_gen_or_tl(t2, t2, t3);
+            tcg_gen_movi_tl(t3, 0);
+            tcg_gen_movcond_tl(TCG_COND_NE, t1, t2, t3, t2, t1);
+            tcg_gen_div_tl(cpu_LO[0], t0, t1);
+            tcg_gen_rem_tl(cpu_HI[0], t0, t1);
+            tcg_temp_free(t3);
+            tcg_temp_free(t2);
         }
         opn = "ddiv";
         break;
     case OPC_DDIVU:
         {
-            int l1 = gen_new_label();
-
-            tcg_gen_brcondi_tl(TCG_COND_EQ, t1, 0, l1);
+            TCGv t2 = tcg_const_tl(0);
+            TCGv t3 = tcg_const_tl(1);
+            tcg_gen_movcond_tl(TCG_COND_EQ, t1, t1, t2, t3, t1);
             tcg_gen_divu_i64(cpu_LO[0], t0, t1);
             tcg_gen_remu_i64(cpu_HI[0], t0, t1);
-            gen_set_label(l1);
+            tcg_temp_free(t3);
+            tcg_temp_free(t2);
         }
         opn = "ddivu";
         break;
-- 
1.7.10.4


* [Qemu-devel] [PATCH v2 17/19] target-mips: use deposit instead of hardcoded version
  2012-10-30  0:11 [Qemu-devel] [PATCH v2 00/19] target-mips: misc fixes and optimizations Aurelien Jarno
                   ` (15 preceding siblings ...)
  2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 16/19] target-mips: optimize ddiv/ddivu/div/divu with movcond Aurelien Jarno
@ 2012-10-30  0:12 ` Aurelien Jarno
  2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 18/19] target-mips: fix TLBR wrt SEGMask Aurelien Jarno
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Aurelien Jarno @ 2012-10-30  0:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: Никита Канунников,
	Aurelien Jarno

Use the deposit op instead of a hardcoded bit field insertion. It
allows the host to emit the corresponding instruction if available.

At the same time remove the (lsb > msb) test. The MIPS64R2 instruction
set manual says "Because of the instruction format, lsb can never be
greater than msb, so there is no UNPREDICTABLE case for this
instruction."
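
For reference, what a deposit computes can be sketched in plain C as
follows. deposit_sketch is a hypothetical name written for this note,
not the TCG op itself; for the insert instructions the position is lsb
and the length is msb - lsb + 1:

```c
#include <assert.h>
#include <stdint.h>

/* Insert the low 'len' bits of 'field' into 'base' at bit 'pos',
   leaving all other bits of 'base' unchanged. */
static uint64_t deposit_sketch(uint64_t base, unsigned pos, unsigned len,
                               uint64_t field)
{
    assert(len >= 1 && pos + len <= 64);
    /* Build the mask without ever shifting by 64. */
    uint64_t mask = (~0ULL >> (64 - len)) << pos;
    return (base & ~mask) | ((field << pos) & mask);
}
```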

(Bug reported as LP:1071149.)
Cc: Никита Канунников <n.kanunnikov@sbtcom.ru>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-mips/translate.c |   32 ++++----------------------------
 1 file changed, 4 insertions(+), 28 deletions(-)

diff --git a/target-mips/translate.c b/target-mips/translate.c
index bfc7cc7..1734aa7 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -3386,7 +3386,6 @@ static void gen_bitops (DisasContext *ctx, uint32_t opc, int rt,
 {
     TCGv t0 = tcg_temp_new();
     TCGv t1 = tcg_temp_new();
-    target_ulong mask;
 
     gen_load_gpr(t1, rs);
     switch (opc) {
@@ -3419,45 +3418,22 @@ static void gen_bitops (DisasContext *ctx, uint32_t opc, int rt,
     case OPC_INS:
         if (lsb > msb)
             goto fail;
-        mask = ((msb - lsb + 1 < 32) ? ((1 << (msb - lsb + 1)) - 1) : ~0) << lsb;
         gen_load_gpr(t0, rt);
-        tcg_gen_andi_tl(t0, t0, ~mask);
-        tcg_gen_shli_tl(t1, t1, lsb);
-        tcg_gen_andi_tl(t1, t1, mask);
-        tcg_gen_or_tl(t0, t0, t1);
+        tcg_gen_deposit_tl(t0, t0, t1, lsb, msb - lsb + 1);
         tcg_gen_ext32s_tl(t0, t0);
         break;
 #if defined(TARGET_MIPS64)
     case OPC_DINSM:
-        if (lsb > msb)
-            goto fail;
-        mask = ((msb - lsb + 1 + 32 < 64) ? ((1ULL << (msb - lsb + 1 + 32)) - 1) : ~0ULL) << lsb;
         gen_load_gpr(t0, rt);
-        tcg_gen_andi_tl(t0, t0, ~mask);
-        tcg_gen_shli_tl(t1, t1, lsb);
-        tcg_gen_andi_tl(t1, t1, mask);
-        tcg_gen_or_tl(t0, t0, t1);
+        tcg_gen_deposit_tl(t0, t0, t1, lsb, msb + 32 - lsb + 1);
         break;
     case OPC_DINSU:
-        if (lsb > msb)
-            goto fail;
-        mask = ((1ULL << (msb - lsb + 1)) - 1) << (lsb + 32);
         gen_load_gpr(t0, rt);
-        tcg_gen_andi_tl(t0, t0, ~mask);
-        tcg_gen_shli_tl(t1, t1, lsb + 32);
-        tcg_gen_andi_tl(t1, t1, mask);
-        tcg_gen_or_tl(t0, t0, t1);
+        tcg_gen_deposit_tl(t0, t0, t1, lsb + 32, msb - lsb + 1);
         break;
     case OPC_DINS:
-        if (lsb > msb)
-            goto fail;
         gen_load_gpr(t0, rt);
-        mask = ((1ULL << (msb - lsb + 1)) - 1) << lsb;
-        gen_load_gpr(t0, rt);
-        tcg_gen_andi_tl(t0, t0, ~mask);
-        tcg_gen_shli_tl(t1, t1, lsb);
-        tcg_gen_andi_tl(t1, t1, mask);
-        tcg_gen_or_tl(t0, t0, t1);
+        tcg_gen_deposit_tl(t0, t0, t1, lsb, msb - lsb + 1);
         break;
 #endif
     default:
-- 
1.7.10.4
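
For readers following the hunks above, here is a rough plain-C model of what
a deposit does, next to a transcription of the removed open-coded DINS
sequence. The names deposit64_model and dins_open_coded are made up for this
illustration and are not QEMU functions; the sketch only assumes the usual
deposit semantics (insert the low len bits of the source at bit offset ofs).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Reference model of a 64-bit deposit: replace the len-bit field of dst
 * starting at bit ofs with the low len bits of src.
 * Requires len >= 1 and ofs + len <= 64.
 */
static inline uint64_t deposit64_model(uint64_t dst, uint64_t src,
                                       unsigned ofs, unsigned len)
{
    assert(len >= 1 && ofs < 64 && len <= 64 - ofs);
    /* Build the field mask without ever shifting by 64. */
    uint64_t mask = (~0ULL >> (64 - len)) << ofs;
    return (dst & ~mask) | ((src << ofs) & mask);
}

/*
 * The open-coded DINS sequence removed by the patch, transcribed to plain C.
 * Only valid for msb - lsb + 1 < 64, since 1ULL << 64 is undefined in C.
 */
static inline uint64_t dins_open_coded(uint64_t rt, uint64_t rs,
                                       unsigned lsb, unsigned msb)
{
    assert(lsb <= msb && msb - lsb + 1 < 64);
    uint64_t mask = ((1ULL << (msb - lsb + 1)) - 1) << lsb;
    rt &= ~mask;
    rs <<= lsb;
    rs &= mask;
    return rt | rs;
}
```

For any lsb <= msb with a field narrower than 64 bits, both functions should
return the same value when called with ofs = lsb and len = msb - lsb + 1.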

* [Qemu-devel] [PATCH v2 18/19] target-mips: fix TLBR wrt SEGMask
  2012-10-30  0:11 [Qemu-devel] [PATCH v2 00/19] target-mips: misc fixes and optimizations Aurelien Jarno
                   ` (16 preceding siblings ...)
  2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 17/19] target-mips: use deposit instead of hardcoded version Aurelien Jarno
@ 2012-10-30  0:12 ` Aurelien Jarno
  2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 19/19] target-mips: don't flush extra TLB on permissions upgrade Aurelien Jarno
  2012-10-31  6:37 ` [Qemu-devel] [PATCH v2 00/19] target-mips: misc fixes and optimizations Richard Henderson
  19 siblings, 0 replies; 23+ messages in thread
From: Aurelien Jarno @ 2012-10-30  0:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

Like r4k_map_address(), r4k_helper_tlbp() should use SEGMask to mask the
address.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-mips/op_helper.c |    6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/target-mips/op_helper.c b/target-mips/op_helper.c
index 773c710..cdd6880 100644
--- a/target-mips/op_helper.c
+++ b/target-mips/op_helper.c
@@ -1856,6 +1856,9 @@ void r4k_helper_tlbp(CPUMIPSState *env)
         mask = tlb->PageMask | ~(TARGET_PAGE_MASK << 1);
         tag = env->CP0_EntryHi & ~mask;
         VPN = tlb->VPN & ~mask;
+#if defined(TARGET_MIPS64)
+        tag &= env->SEGMask;
+#endif
         /* Check ASID, virtual page number & size */
         if ((tlb->G == 1 || tlb->ASID == ASID) && VPN == tag) {
             /* TLB match */
@@ -1871,6 +1874,9 @@ void r4k_helper_tlbp(CPUMIPSState *env)
             mask = tlb->PageMask | ~(TARGET_PAGE_MASK << 1);
             tag = env->CP0_EntryHi & ~mask;
             VPN = tlb->VPN & ~mask;
+#if defined(TARGET_MIPS64)
+            tag &= env->SEGMask;
+#endif
             /* Check ASID, virtual page number & size */
             if ((tlb->G == 1 || tlb->ASID == ASID) && VPN == tag) {
                 r4k_mips_tlb_flush_extra (env, i);
-- 
1.7.10.4
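
To illustrate why the probe needs the extra masking, here is a small
standalone model of the comparison done in the hunks above. It is only a
sketch: tlb_entry_model, tlb_probe_match and MODEL_PAGE_MASK are hypothetical
names, the page size is assumed to be 4 KiB, and the stored VPN is assumed to
have been masked with the segment mask when the entry was written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for TARGET_PAGE_MASK, assuming 4 KiB pages. */
#define MODEL_PAGE_MASK (~0xFFFULL)

/* Reduced, hypothetical TLB entry: only the fields the probe compares. */
typedef struct {
    uint64_t VPN;       /* assumed to be stored already segment-masked */
    uint64_t PageMask;
    uint8_t ASID;
    bool G;
} tlb_entry_model;

/* Mirrors the comparison in the patched probe loop, with the mask passed in. */
static inline bool tlb_probe_match(const tlb_entry_model *tlb,
                                   uint64_t entry_hi, uint8_t asid,
                                   uint64_t seg_mask)
{
    uint64_t mask = tlb->PageMask | ~(MODEL_PAGE_MASK << 1);
    uint64_t tag = entry_hi & ~mask;
    uint64_t vpn = tlb->VPN & ~mask;

    tag &= seg_mask;    /* the step added by the patch */
    return (tlb->G || tlb->ASID == asid) && vpn == tag;
}
```

Passing an all-ones seg_mask reproduces the old behaviour: an EntryHi value
with stray bits above the implemented segment size then no longer matches an
entry whose VPN was stored masked.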

* [Qemu-devel] [PATCH v2 19/19] target-mips: don't flush extra TLB on permissions upgrade
  2012-10-30  0:11 [Qemu-devel] [PATCH v2 00/19] target-mips: misc fixes and optimizations Aurelien Jarno
                   ` (17 preceding siblings ...)
  2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 18/19] target-mips: fix TLBR wrt SEGMask Aurelien Jarno
@ 2012-10-30  0:12 ` Aurelien Jarno
  2012-10-31  6:37 ` [Qemu-devel] [PATCH v2 00/19] target-mips: misc fixes and optimizations Richard Henderson
  19 siblings, 0 replies; 23+ messages in thread
From: Aurelien Jarno @ 2012-10-30  0:12 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

If the guest uses a TLBWI instruction for upgrading permissions, we
don't need to flush the extra TLBs. This improves boot time performance
by about 10%.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-mips/op_helper.c |   28 +++++++++++++++++++++++-----
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/target-mips/op_helper.c b/target-mips/op_helper.c
index cdd6880..f45d494 100644
--- a/target-mips/op_helper.c
+++ b/target-mips/op_helper.c
@@ -1819,14 +1819,32 @@ static void r4k_fill_tlb(CPUMIPSState *env, int idx)
 
 void r4k_helper_tlbwi(CPUMIPSState *env)
 {
+    r4k_tlb_t *tlb;
     int idx;
+    target_ulong VPN;
+    uint8_t ASID;
+    bool G, V0, D0, V1, D1;
 
     idx = (env->CP0_Index & ~0x80000000) % env->tlb->nb_tlb;
-
-    /* Discard cached TLB entries.  We could avoid doing this if the
-       tlbwi is just upgrading access permissions on the current entry;
-       that might be a further win.  */
-    r4k_mips_tlb_flush_extra (env, env->tlb->nb_tlb);
+    tlb = &env->tlb->mmu.r4k.tlb[idx];
+    VPN = env->CP0_EntryHi & (TARGET_PAGE_MASK << 1);
+#if defined(TARGET_MIPS64)
+    VPN &= env->SEGMask;
+#endif
+    ASID = env->CP0_EntryHi & 0xff;
+    G = env->CP0_EntryLo0 & env->CP0_EntryLo1 & 1;
+    V0 = (env->CP0_EntryLo0 & 2) != 0;
+    D0 = (env->CP0_EntryLo0 & 4) != 0;
+    V1 = (env->CP0_EntryLo1 & 2) != 0;
+    D1 = (env->CP0_EntryLo1 & 4) != 0;
+
+    /* Discard cached TLB entries, unless tlbwi is just upgrading access
+       permissions on the current entry. */
+    if (tlb->VPN != VPN || tlb->ASID != ASID || tlb->G != G ||
+        (tlb->V0 && !V0) || (tlb->D0 && !D0) ||
+        (tlb->V1 && !V1) || (tlb->D1 && !D1)) {
+        r4k_mips_tlb_flush_extra(env, env->tlb->nb_tlb);
+    }
 
     r4k_invalidate_tlb(env, idx, 0);
     r4k_fill_tlb(env, idx);
-- 
1.7.10.4
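
The flush condition in the hunk above can be read as a pure predicate on the
old and new entry fields. Below is a minimal standalone sketch of it; the
struct tlb_fields_model and the function tlbwi_needs_flush are hypothetical
names used only for this illustration, with fields reduced to the ones the
patch compares.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of a TLB entry: only the fields compared by the patch. */
typedef struct {
    uint64_t VPN;
    uint8_t ASID;
    bool G, V0, D0, V1, D1;
} tlb_fields_model;

/*
 * Returns true when the cached extra entries must be discarded: either the
 * mapping identity changed (VPN, ASID, G), or a valid/dirty bit that was set
 * is being cleared. Setting additional V/D bits is a pure upgrade and needs
 * no flush.
 */
static inline bool tlbwi_needs_flush(const tlb_fields_model *cur,
                                     const tlb_fields_model *req)
{
    return cur->VPN != req->VPN || cur->ASID != req->ASID ||
           cur->G != req->G ||
           (cur->V0 && !req->V0) || (cur->D0 && !req->D0) ||
           (cur->V1 && !req->V1) || (cur->D1 && !req->D1);
}
```

Turning a clean page dirty (D0 going from 0 to 1) therefore skips the flush,
while the reverse transition or any change of VPN still triggers it.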

* Re: [Qemu-devel] [PATCH v2 13/19] target-mips: implement unaligned loads using TCG
  2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 13/19] target-mips: implement unaligned loads using TCG Aurelien Jarno
@ 2012-10-30 18:59   ` Blue Swirl
  2012-10-30 20:00     ` Aurelien Jarno
  0 siblings, 1 reply; 23+ messages in thread
From: Blue Swirl @ 2012-10-30 18:59 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel

On Tue, Oct 30, 2012 at 12:12 AM, Aurelien Jarno <aurelien@aurel32.net> wrote:
> Load/store from helpers should be avoided as they are quite
> inefficient. Rewrite unaligned load instructions using TCG and
> aligned loads. The number of actual load operations to implement
> an unaligned load instruction is reduced from up to 8 to 1.

There are still other ops around the load operation. How about
implementing unaligned accesses at TCG level, then targets like x86
which don't care about alignment can implement them with normal
accesses more efficiently?

>
> Note: As shifting by 32 or 64 is undefined behaviour we can't rely on,
> the code loads constants that are already shifted by one.
>
> Reviewed-by: Richard Henderson <rth@twiddle.net>
> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
> ---
>  target-mips/helper.h    |    4 --
>  target-mips/op_helper.c |  142 -----------------------------------------------
>  target-mips/translate.c |   75 ++++++++++++++++++++-----
>  3 files changed, 62 insertions(+), 159 deletions(-)
>
> diff --git a/target-mips/helper.h b/target-mips/helper.h
> index 210960f..0e38cdd 100644
> --- a/target-mips/helper.h
> +++ b/target-mips/helper.h
> @@ -4,13 +4,9 @@ DEF_HELPER_3(raise_exception_err, noreturn, env, i32, int)
>  DEF_HELPER_2(raise_exception, noreturn, env, i32)
>
>  #ifdef TARGET_MIPS64
> -DEF_HELPER_4(ldl, tl, env, tl, tl, int)
> -DEF_HELPER_4(ldr, tl, env, tl, tl, int)
>  DEF_HELPER_4(sdl, void, env, tl, tl, int)
>  DEF_HELPER_4(sdr, void, env, tl, tl, int)
>  #endif
> -DEF_HELPER_4(lwl, tl, env, tl, tl, int)
> -DEF_HELPER_4(lwr, tl, env, tl, tl, int)
>  DEF_HELPER_4(swl, void, env, tl, tl, int)
>  DEF_HELPER_4(swr, void, env, tl, tl, int)
>
> diff --git a/target-mips/op_helper.c b/target-mips/op_helper.c
> index 78497d9..773c710 100644
> --- a/target-mips/op_helper.c
> +++ b/target-mips/op_helper.c
> @@ -350,56 +350,6 @@ HELPER_ST_ATOMIC(scd, ld, sd, 0x7)
>  #define GET_OFFSET(addr, offset) (addr - (offset))
>  #endif
>
> -target_ulong helper_lwl(CPUMIPSState *env, target_ulong arg1,
> -                        target_ulong arg2, int mem_idx)
> -{
> -    target_ulong tmp;
> -
> -    tmp = do_lbu(env, arg2, mem_idx);
> -    arg1 = (arg1 & 0x00FFFFFF) | (tmp << 24);
> -
> -    if (GET_LMASK(arg2) <= 2) {
> -        tmp = do_lbu(env, GET_OFFSET(arg2, 1), mem_idx);
> -        arg1 = (arg1 & 0xFF00FFFF) | (tmp << 16);
> -    }
> -
> -    if (GET_LMASK(arg2) <= 1) {
> -        tmp = do_lbu(env, GET_OFFSET(arg2, 2), mem_idx);
> -        arg1 = (arg1 & 0xFFFF00FF) | (tmp << 8);
> -    }
> -
> -    if (GET_LMASK(arg2) == 0) {
> -        tmp = do_lbu(env, GET_OFFSET(arg2, 3), mem_idx);
> -        arg1 = (arg1 & 0xFFFFFF00) | tmp;
> -    }
> -    return (int32_t)arg1;
> -}
> -
> -target_ulong helper_lwr(CPUMIPSState *env, target_ulong arg1,
> -                        target_ulong arg2, int mem_idx)
> -{
> -    target_ulong tmp;
> -
> -    tmp = do_lbu(env, arg2, mem_idx);
> -    arg1 = (arg1 & 0xFFFFFF00) | tmp;
> -
> -    if (GET_LMASK(arg2) >= 1) {
> -        tmp = do_lbu(env, GET_OFFSET(arg2, -1), mem_idx);
> -        arg1 = (arg1 & 0xFFFF00FF) | (tmp << 8);
> -    }
> -
> -    if (GET_LMASK(arg2) >= 2) {
> -        tmp = do_lbu(env, GET_OFFSET(arg2, -2), mem_idx);
> -        arg1 = (arg1 & 0xFF00FFFF) | (tmp << 16);
> -    }
> -
> -    if (GET_LMASK(arg2) == 3) {
> -        tmp = do_lbu(env, GET_OFFSET(arg2, -3), mem_idx);
> -        arg1 = (arg1 & 0x00FFFFFF) | (tmp << 24);
> -    }
> -    return (int32_t)arg1;
> -}
> -
>  void helper_swl(CPUMIPSState *env, target_ulong arg1, target_ulong arg2,
>                  int mem_idx)
>  {
> @@ -440,98 +390,6 @@ void helper_swr(CPUMIPSState *env, target_ulong arg1, target_ulong arg2,
>  #define GET_LMASK64(v) (((v) & 7) ^ 7)
>  #endif
>
> -target_ulong helper_ldl(CPUMIPSState *env, target_ulong arg1,
> -                        target_ulong arg2, int mem_idx)
> -{
> -    uint64_t tmp;
> -
> -    tmp = do_lbu(env, arg2, mem_idx);
> -    arg1 = (arg1 & 0x00FFFFFFFFFFFFFFULL) | (tmp << 56);
> -
> -    if (GET_LMASK64(arg2) <= 6) {
> -        tmp = do_lbu(env, GET_OFFSET(arg2, 1), mem_idx);
> -        arg1 = (arg1 & 0xFF00FFFFFFFFFFFFULL) | (tmp << 48);
> -    }
> -
> -    if (GET_LMASK64(arg2) <= 5) {
> -        tmp = do_lbu(env, GET_OFFSET(arg2, 2), mem_idx);
> -        arg1 = (arg1 & 0xFFFF00FFFFFFFFFFULL) | (tmp << 40);
> -    }
> -
> -    if (GET_LMASK64(arg2) <= 4) {
> -        tmp = do_lbu(env, GET_OFFSET(arg2, 3), mem_idx);
> -        arg1 = (arg1 & 0xFFFFFF00FFFFFFFFULL) | (tmp << 32);
> -    }
> -
> -    if (GET_LMASK64(arg2) <= 3) {
> -        tmp = do_lbu(env, GET_OFFSET(arg2, 4), mem_idx);
> -        arg1 = (arg1 & 0xFFFFFFFF00FFFFFFULL) | (tmp << 24);
> -    }
> -
> -    if (GET_LMASK64(arg2) <= 2) {
> -        tmp = do_lbu(env, GET_OFFSET(arg2, 5), mem_idx);
> -        arg1 = (arg1 & 0xFFFFFFFFFF00FFFFULL) | (tmp << 16);
> -    }
> -
> -    if (GET_LMASK64(arg2) <= 1) {
> -        tmp = do_lbu(env, GET_OFFSET(arg2, 6), mem_idx);
> -        arg1 = (arg1 & 0xFFFFFFFFFFFF00FFULL) | (tmp << 8);
> -    }
> -
> -    if (GET_LMASK64(arg2) == 0) {
> -        tmp = do_lbu(env, GET_OFFSET(arg2, 7), mem_idx);
> -        arg1 = (arg1 & 0xFFFFFFFFFFFFFF00ULL) | tmp;
> -    }
> -
> -    return arg1;
> -}
> -
> -target_ulong helper_ldr(CPUMIPSState *env, target_ulong arg1,
> -                        target_ulong arg2, int mem_idx)
> -{
> -    uint64_t tmp;
> -
> -    tmp = do_lbu(env, arg2, mem_idx);
> -    arg1 = (arg1 & 0xFFFFFFFFFFFFFF00ULL) | tmp;
> -
> -    if (GET_LMASK64(arg2) >= 1) {
> -        tmp = do_lbu(env, GET_OFFSET(arg2, -1), mem_idx);
> -        arg1 = (arg1 & 0xFFFFFFFFFFFF00FFULL) | (tmp  << 8);
> -    }
> -
> -    if (GET_LMASK64(arg2) >= 2) {
> -        tmp = do_lbu(env, GET_OFFSET(arg2, -2), mem_idx);
> -        arg1 = (arg1 & 0xFFFFFFFFFF00FFFFULL) | (tmp << 16);
> -    }
> -
> -    if (GET_LMASK64(arg2) >= 3) {
> -        tmp = do_lbu(env, GET_OFFSET(arg2, -3), mem_idx);
> -        arg1 = (arg1 & 0xFFFFFFFF00FFFFFFULL) | (tmp << 24);
> -    }
> -
> -    if (GET_LMASK64(arg2) >= 4) {
> -        tmp = do_lbu(env, GET_OFFSET(arg2, -4), mem_idx);
> -        arg1 = (arg1 & 0xFFFFFF00FFFFFFFFULL) | (tmp << 32);
> -    }
> -
> -    if (GET_LMASK64(arg2) >= 5) {
> -        tmp = do_lbu(env, GET_OFFSET(arg2, -5), mem_idx);
> -        arg1 = (arg1 & 0xFFFF00FFFFFFFFFFULL) | (tmp << 40);
> -    }
> -
> -    if (GET_LMASK64(arg2) >= 6) {
> -        tmp = do_lbu(env, GET_OFFSET(arg2, -6), mem_idx);
> -        arg1 = (arg1 & 0xFF00FFFFFFFFFFFFULL) | (tmp << 48);
> -    }
> -
> -    if (GET_LMASK64(arg2) == 7) {
> -        tmp = do_lbu(env, GET_OFFSET(arg2, -7), mem_idx);
> -        arg1 = (arg1 & 0x00FFFFFFFFFFFFFFULL) | (tmp << 56);
> -    }
> -
> -    return arg1;
> -}
> -
>  void helper_sdl(CPUMIPSState *env, target_ulong arg1, target_ulong arg2,
>                  int mem_idx)
>  {
> diff --git a/target-mips/translate.c b/target-mips/translate.c
> index c46129d..b385923 100644
> --- a/target-mips/translate.c
> +++ b/target-mips/translate.c
> @@ -1125,7 +1125,7 @@ static void gen_ld (CPUMIPSState *env, DisasContext *ctx, uint32_t opc,
>                      int rt, int base, int16_t offset)
>  {
>      const char *opn = "ld";
> -    TCGv t0, t1;
> +    TCGv t0, t1, t2;
>
>      if (rt == 0 && env->insn_flags & (INSN_LOONGSON2E | INSN_LOONGSON2F)) {
>          /* Loongson CPU uses a load to zero register for prefetch.
> @@ -1157,21 +1157,45 @@ static void gen_ld (CPUMIPSState *env, DisasContext *ctx, uint32_t opc,
>          opn = "lld";
>          break;
>      case OPC_LDL:
> -        save_cpu_state(ctx, 1);
>          t1 = tcg_temp_new();
> +        tcg_gen_andi_tl(t1, t0, 7);
> +#ifndef TARGET_WORDS_BIGENDIAN
> +        tcg_gen_xori_tl(t1, t1, 7);
> +#endif
> +        tcg_gen_shli_tl(t1, t1, 3);
> +        tcg_gen_andi_tl(t0, t0, ~7);
> +        tcg_gen_qemu_ld64(t0, t0, ctx->mem_idx);
> +        tcg_gen_shl_tl(t0, t0, t1);
> +        tcg_gen_xori_tl(t1, t1, 63);
> +        t2 = tcg_const_tl(0x7fffffffffffffffull);
> +        tcg_gen_shr_tl(t2, t2, t1);
>          gen_load_gpr(t1, rt);
> -        gen_helper_1e2i(ldl, t1, t1, t0, ctx->mem_idx);
> -        gen_store_gpr(t1, rt);
> +        tcg_gen_and_tl(t1, t1, t2);
> +        tcg_temp_free(t2);
> +        tcg_gen_or_tl(t0, t0, t1);
>          tcg_temp_free(t1);
> +        gen_store_gpr(t0, rt);
>          opn = "ldl";
>          break;
>      case OPC_LDR:
> -        save_cpu_state(ctx, 1);
>          t1 = tcg_temp_new();
> +        tcg_gen_andi_tl(t1, t0, 7);
> +#ifdef TARGET_WORDS_BIGENDIAN
> +        tcg_gen_xori_tl(t1, t1, 7);
> +#endif
> +        tcg_gen_shli_tl(t1, t1, 3);
> +        tcg_gen_andi_tl(t0, t0, ~7);
> +        tcg_gen_qemu_ld64(t0, t0, ctx->mem_idx);
> +        tcg_gen_shr_tl(t0, t0, t1);
> +        tcg_gen_xori_tl(t1, t1, 63);
> +        t2 = tcg_const_tl(0xfffffffffffffffeull);
> +        tcg_gen_shl_tl(t2, t2, t1);
>          gen_load_gpr(t1, rt);
> -        gen_helper_1e2i(ldr, t1, t1, t0, ctx->mem_idx);
> -        gen_store_gpr(t1, rt);
> +        tcg_gen_and_tl(t1, t1, t2);
> +        tcg_temp_free(t2);
> +        tcg_gen_or_tl(t0, t0, t1);
>          tcg_temp_free(t1);
> +        gen_store_gpr(t0, rt);
>          opn = "ldr";
>          break;
>      case OPC_LDPC:
> @@ -1217,21 +1241,46 @@ static void gen_ld (CPUMIPSState *env, DisasContext *ctx, uint32_t opc,
>          opn = "lbu";
>          break;
>      case OPC_LWL:
> -        save_cpu_state(ctx, 1);
>          t1 = tcg_temp_new();
> +        tcg_gen_andi_tl(t1, t0, 3);
> +#ifndef TARGET_WORDS_BIGENDIAN
> +        tcg_gen_xori_tl(t1, t1, 3);
> +#endif
> +        tcg_gen_shli_tl(t1, t1, 3);
> +        tcg_gen_andi_tl(t0, t0, ~3);
> +        tcg_gen_qemu_ld32u(t0, t0, ctx->mem_idx);
> +        tcg_gen_shl_tl(t0, t0, t1);
> +        tcg_gen_xori_tl(t1, t1, 31);
> +        t2 = tcg_const_tl(0x7fffffffull);
> +        tcg_gen_shr_tl(t2, t2, t1);
>          gen_load_gpr(t1, rt);
> -        gen_helper_1e2i(lwl, t1, t1, t0, ctx->mem_idx);
> -        gen_store_gpr(t1, rt);
> +        tcg_gen_and_tl(t1, t1, t2);
> +        tcg_temp_free(t2);
> +        tcg_gen_or_tl(t0, t0, t1);
>          tcg_temp_free(t1);
> +        tcg_gen_ext32s_tl(t0, t0);
> +        gen_store_gpr(t0, rt);
>          opn = "lwl";
>          break;
>      case OPC_LWR:
> -        save_cpu_state(ctx, 1);
>          t1 = tcg_temp_new();
> +        tcg_gen_andi_tl(t1, t0, 3);
> +#ifdef TARGET_WORDS_BIGENDIAN
> +        tcg_gen_xori_tl(t1, t1, 3);
> +#endif
> +        tcg_gen_shli_tl(t1, t1, 3);
> +        tcg_gen_andi_tl(t0, t0, ~3);
> +        tcg_gen_qemu_ld32u(t0, t0, ctx->mem_idx);
> +        tcg_gen_shr_tl(t0, t0, t1);
> +        tcg_gen_xori_tl(t1, t1, 31);
> +        t2 = tcg_const_tl(0xfffffffeull);
> +        tcg_gen_shl_tl(t2, t2, t1);
>          gen_load_gpr(t1, rt);
> -        gen_helper_1e2i(lwr, t1, t1, t0, ctx->mem_idx);
> -        gen_store_gpr(t1, rt);
> +        tcg_gen_and_tl(t1, t1, t2);
> +        tcg_temp_free(t2);
> +        tcg_gen_or_tl(t0, t0, t1);
>          tcg_temp_free(t1);
> +        gen_store_gpr(t0, rt);
>          opn = "lwr";
>          break;
>      case OPC_LL:
> --
> 1.7.10.4
>
>

* Re: [Qemu-devel] [PATCH v2 13/19] target-mips: implement unaligned loads using TCG
  2012-10-30 18:59   ` Blue Swirl
@ 2012-10-30 20:00     ` Aurelien Jarno
  0 siblings, 0 replies; 23+ messages in thread
From: Aurelien Jarno @ 2012-10-30 20:00 UTC (permalink / raw)
  To: Blue Swirl; +Cc: qemu-devel

On Tue, Oct 30, 2012 at 06:59:36PM +0000, Blue Swirl wrote:
> On Tue, Oct 30, 2012 at 12:12 AM, Aurelien Jarno <aurelien@aurel32.net> wrote:
> > Load/store from helpers should be avoided as they are quite
> > inefficient. Rewrite unaligned load instructions using TCG and
> > aligned loads. The number of actual load operations to implement
> > an unaligned load instruction is reduced from up to 8 to 1.
> 
> There are still other ops around the load operation. How about
> implementing unaligned accesses at TCG level, then targets like x86
> which don't care about alignment can implement them with normal
> accesses more efficiently?

Well, maybe the name "unaligned load instructions" is misleading. These
instructions do not actually perform any unaligned access; instead they
merge the value from memory into the left (LWL, LDL) or right (LWR, LDR)
part of the register, and the number of merged bytes depends on the actual
alignment of the address. That way a combination of LWL + LWR or LDL +
LDR instructions provides an effective unaligned access. That's why there
are still ops around the actual load: they merge the value into the
register.
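
To make the merging concrete, here is a rough plain-C model of LWL and LWR
for a little-endian guest with 32-bit registers, following the same
shift-and-mask sequence as the TCG code in the patch (including the constants
that are already shifted by one, so the code never shifts by 32). The helper
names load32_le, lwl_le, lwr_le and load_unaligned_le are made up for this
sketch, and sign extension to 64 bits is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Load an aligned 32-bit little-endian word from a byte array. */
static inline uint32_t load32_le(const uint8_t *mem, uint32_t addr)
{
    assert((addr & 3) == 0);
    return (uint32_t)mem[addr] | ((uint32_t)mem[addr + 1] << 8) |
           ((uint32_t)mem[addr + 2] << 16) | ((uint32_t)mem[addr + 3] << 24);
}

/* LWL: merge memory bytes into the left (most significant) part of reg. */
static inline uint32_t lwl_le(const uint8_t *mem, uint32_t reg, uint32_t addr)
{
    uint32_t shift = ((addr & 3) ^ 3) << 3;            /* 0, 8, 16 or 24 */
    uint32_t val = load32_le(mem, addr & ~UINT32_C(3)) << shift;
    /* Equals (1 << shift) - 1; the shift amount is at most 31. */
    uint32_t keep = UINT32_C(0x7fffffff) >> (shift ^ 31);
    return val | (reg & keep);
}

/* LWR: merge memory bytes into the right (least significant) part of reg. */
static inline uint32_t lwr_le(const uint8_t *mem, uint32_t reg, uint32_t addr)
{
    uint32_t shift = (addr & 3) << 3;                  /* 0, 8, 16 or 24 */
    uint32_t val = load32_le(mem, addr & ~UINT32_C(3)) >> shift;
    /* Keeps the top 'shift' bits of reg; the shift amount is at most 31. */
    uint32_t keep = (uint32_t)(UINT32_C(0xfffffffe) << (shift ^ 31));
    return val | (reg & keep);
}

/* The usual little-endian pairing: LWR at addr, then LWL at addr + 3. */
static inline uint32_t load_unaligned_le(const uint8_t *mem, uint32_t addr)
{
    return lwl_le(mem, lwr_le(mem, 0, addr), addr + 3);
}
```

Each helper issues exactly one aligned load; everything else is the merge
into the destination register.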

If you want to reduce the number of ops around the load, the way to go
is to add a deposit op that takes registers for ofs and len. I'm not sure
it's worthwhile here.

> >
> > Note: As we can't rely on shift by 32 or 64 undefined behaviour,
> > the code loads already shift by one constants.
> >
> > Reviewed-by: Richard Henderson <rth@twiddle.net>
> > Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
> > ---
> >  target-mips/helper.h    |    4 --
> >  target-mips/op_helper.c |  142 -----------------------------------------------
> >  target-mips/translate.c |   75 ++++++++++++++++++++-----
> >  3 files changed, 62 insertions(+), 159 deletions(-)
> >
> > diff --git a/target-mips/helper.h b/target-mips/helper.h
> > index 210960f..0e38cdd 100644
> > --- a/target-mips/helper.h
> > +++ b/target-mips/helper.h
> > @@ -4,13 +4,9 @@ DEF_HELPER_3(raise_exception_err, noreturn, env, i32, int)
> >  DEF_HELPER_2(raise_exception, noreturn, env, i32)
> >
> >  #ifdef TARGET_MIPS64
> > -DEF_HELPER_4(ldl, tl, env, tl, tl, int)
> > -DEF_HELPER_4(ldr, tl, env, tl, tl, int)
> >  DEF_HELPER_4(sdl, void, env, tl, tl, int)
> >  DEF_HELPER_4(sdr, void, env, tl, tl, int)
> >  #endif
> > -DEF_HELPER_4(lwl, tl, env, tl, tl, int)
> > -DEF_HELPER_4(lwr, tl, env, tl, tl, int)
> >  DEF_HELPER_4(swl, void, env, tl, tl, int)
> >  DEF_HELPER_4(swr, void, env, tl, tl, int)
> >
> > diff --git a/target-mips/op_helper.c b/target-mips/op_helper.c
> > index 78497d9..773c710 100644
> > --- a/target-mips/op_helper.c
> > +++ b/target-mips/op_helper.c
> > @@ -350,56 +350,6 @@ HELPER_ST_ATOMIC(scd, ld, sd, 0x7)
> >  #define GET_OFFSET(addr, offset) (addr - (offset))
> >  #endif
> >
> > -target_ulong helper_lwl(CPUMIPSState *env, target_ulong arg1,
> > -                        target_ulong arg2, int mem_idx)
> > -{
> > -    target_ulong tmp;
> > -
> > -    tmp = do_lbu(env, arg2, mem_idx);
> > -    arg1 = (arg1 & 0x00FFFFFF) | (tmp << 24);
> > -
> > -    if (GET_LMASK(arg2) <= 2) {
> > -        tmp = do_lbu(env, GET_OFFSET(arg2, 1), mem_idx);
> > -        arg1 = (arg1 & 0xFF00FFFF) | (tmp << 16);
> > -    }
> > -
> > -    if (GET_LMASK(arg2) <= 1) {
> > -        tmp = do_lbu(env, GET_OFFSET(arg2, 2), mem_idx);
> > -        arg1 = (arg1 & 0xFFFF00FF) | (tmp << 8);
> > -    }
> > -
> > -    if (GET_LMASK(arg2) == 0) {
> > -        tmp = do_lbu(env, GET_OFFSET(arg2, 3), mem_idx);
> > -        arg1 = (arg1 & 0xFFFFFF00) | tmp;
> > -    }
> > -    return (int32_t)arg1;
> > -}
> > -
> > -target_ulong helper_lwr(CPUMIPSState *env, target_ulong arg1,
> > -                        target_ulong arg2, int mem_idx)
> > -{
> > -    target_ulong tmp;
> > -
> > -    tmp = do_lbu(env, arg2, mem_idx);
> > -    arg1 = (arg1 & 0xFFFFFF00) | tmp;
> > -
> > -    if (GET_LMASK(arg2) >= 1) {
> > -        tmp = do_lbu(env, GET_OFFSET(arg2, -1), mem_idx);
> > -        arg1 = (arg1 & 0xFFFF00FF) | (tmp << 8);
> > -    }
> > -
> > -    if (GET_LMASK(arg2) >= 2) {
> > -        tmp = do_lbu(env, GET_OFFSET(arg2, -2), mem_idx);
> > -        arg1 = (arg1 & 0xFF00FFFF) | (tmp << 16);
> > -    }
> > -
> > -    if (GET_LMASK(arg2) == 3) {
> > -        tmp = do_lbu(env, GET_OFFSET(arg2, -3), mem_idx);
> > -        arg1 = (arg1 & 0x00FFFFFF) | (tmp << 24);
> > -    }
> > -    return (int32_t)arg1;
> > -}
> > -
> >  void helper_swl(CPUMIPSState *env, target_ulong arg1, target_ulong arg2,
> >                  int mem_idx)
> >  {
> > @@ -440,98 +390,6 @@ void helper_swr(CPUMIPSState *env, target_ulong arg1, target_ulong arg2,
> >  #define GET_LMASK64(v) (((v) & 7) ^ 7)
> >  #endif
> >
> > -target_ulong helper_ldl(CPUMIPSState *env, target_ulong arg1,
> > -                        target_ulong arg2, int mem_idx)
> > -{
> > -    uint64_t tmp;
> > -
> > -    tmp = do_lbu(env, arg2, mem_idx);
> > -    arg1 = (arg1 & 0x00FFFFFFFFFFFFFFULL) | (tmp << 56);
> > -
> > -    if (GET_LMASK64(arg2) <= 6) {
> > -        tmp = do_lbu(env, GET_OFFSET(arg2, 1), mem_idx);
> > -        arg1 = (arg1 & 0xFF00FFFFFFFFFFFFULL) | (tmp << 48);
> > -    }
> > -
> > -    if (GET_LMASK64(arg2) <= 5) {
> > -        tmp = do_lbu(env, GET_OFFSET(arg2, 2), mem_idx);
> > -        arg1 = (arg1 & 0xFFFF00FFFFFFFFFFULL) | (tmp << 40);
> > -    }
> > -
> > -    if (GET_LMASK64(arg2) <= 4) {
> > -        tmp = do_lbu(env, GET_OFFSET(arg2, 3), mem_idx);
> > -        arg1 = (arg1 & 0xFFFFFF00FFFFFFFFULL) | (tmp << 32);
> > -    }
> > -
> > -    if (GET_LMASK64(arg2) <= 3) {
> > -        tmp = do_lbu(env, GET_OFFSET(arg2, 4), mem_idx);
> > -        arg1 = (arg1 & 0xFFFFFFFF00FFFFFFULL) | (tmp << 24);
> > -    }
> > -
> > -    if (GET_LMASK64(arg2) <= 2) {
> > -        tmp = do_lbu(env, GET_OFFSET(arg2, 5), mem_idx);
> > -        arg1 = (arg1 & 0xFFFFFFFFFF00FFFFULL) | (tmp << 16);
> > -    }
> > -
> > -    if (GET_LMASK64(arg2) <= 1) {
> > -        tmp = do_lbu(env, GET_OFFSET(arg2, 6), mem_idx);
> > -        arg1 = (arg1 & 0xFFFFFFFFFFFF00FFULL) | (tmp << 8);
> > -    }
> > -
> > -    if (GET_LMASK64(arg2) == 0) {
> > -        tmp = do_lbu(env, GET_OFFSET(arg2, 7), mem_idx);
> > -        arg1 = (arg1 & 0xFFFFFFFFFFFFFF00ULL) | tmp;
> > -    }
> > -
> > -    return arg1;
> > -}
> > -
> > -target_ulong helper_ldr(CPUMIPSState *env, target_ulong arg1,
> > -                        target_ulong arg2, int mem_idx)
> > -{
> > -    uint64_t tmp;
> > -
> > -    tmp = do_lbu(env, arg2, mem_idx);
> > -    arg1 = (arg1 & 0xFFFFFFFFFFFFFF00ULL) | tmp;
> > -
> > -    if (GET_LMASK64(arg2) >= 1) {
> > -        tmp = do_lbu(env, GET_OFFSET(arg2, -1), mem_idx);
> > -        arg1 = (arg1 & 0xFFFFFFFFFFFF00FFULL) | (tmp  << 8);
> > -    }
> > -
> > -    if (GET_LMASK64(arg2) >= 2) {
> > -        tmp = do_lbu(env, GET_OFFSET(arg2, -2), mem_idx);
> > -        arg1 = (arg1 & 0xFFFFFFFFFF00FFFFULL) | (tmp << 16);
> > -    }
> > -
> > -    if (GET_LMASK64(arg2) >= 3) {
> > -        tmp = do_lbu(env, GET_OFFSET(arg2, -3), mem_idx);
> > -        arg1 = (arg1 & 0xFFFFFFFF00FFFFFFULL) | (tmp << 24);
> > -    }
> > -
> > -    if (GET_LMASK64(arg2) >= 4) {
> > -        tmp = do_lbu(env, GET_OFFSET(arg2, -4), mem_idx);
> > -        arg1 = (arg1 & 0xFFFFFF00FFFFFFFFULL) | (tmp << 32);
> > -    }
> > -
> > -    if (GET_LMASK64(arg2) >= 5) {
> > -        tmp = do_lbu(env, GET_OFFSET(arg2, -5), mem_idx);
> > -        arg1 = (arg1 & 0xFFFF00FFFFFFFFFFULL) | (tmp << 40);
> > -    }
> > -
> > -    if (GET_LMASK64(arg2) >= 6) {
> > -        tmp = do_lbu(env, GET_OFFSET(arg2, -6), mem_idx);
> > -        arg1 = (arg1 & 0xFF00FFFFFFFFFFFFULL) | (tmp << 48);
> > -    }
> > -
> > -    if (GET_LMASK64(arg2) == 7) {
> > -        tmp = do_lbu(env, GET_OFFSET(arg2, -7), mem_idx);
> > -        arg1 = (arg1 & 0x00FFFFFFFFFFFFFFULL) | (tmp << 56);
> > -    }
> > -
> > -    return arg1;
> > -}
> > -
> >  void helper_sdl(CPUMIPSState *env, target_ulong arg1, target_ulong arg2,
> >                  int mem_idx)
> >  {
> > diff --git a/target-mips/translate.c b/target-mips/translate.c
> > index c46129d..b385923 100644
> > --- a/target-mips/translate.c
> > +++ b/target-mips/translate.c
> > @@ -1125,7 +1125,7 @@ static void gen_ld (CPUMIPSState *env, DisasContext *ctx, uint32_t opc,
> >                      int rt, int base, int16_t offset)
> >  {
> >      const char *opn = "ld";
> > -    TCGv t0, t1;
> > +    TCGv t0, t1, t2;
> >
> >      if (rt == 0 && env->insn_flags & (INSN_LOONGSON2E | INSN_LOONGSON2F)) {
> >          /* Loongson CPU uses a load to zero register for prefetch.
> > @@ -1157,21 +1157,45 @@ static void gen_ld (CPUMIPSState *env, DisasContext *ctx, uint32_t opc,
> >          opn = "lld";
> >          break;
> >      case OPC_LDL:
> > -        save_cpu_state(ctx, 1);
> >          t1 = tcg_temp_new();
> > +        tcg_gen_andi_tl(t1, t0, 7);
> > +#ifndef TARGET_WORDS_BIGENDIAN
> > +        tcg_gen_xori_tl(t1, t1, 7);
> > +#endif
> > +        tcg_gen_shli_tl(t1, t1, 3);
> > +        tcg_gen_andi_tl(t0, t0, ~7);
> > +        tcg_gen_qemu_ld64(t0, t0, ctx->mem_idx);
> > +        tcg_gen_shl_tl(t0, t0, t1);
> > +        tcg_gen_xori_tl(t1, t1, 63);
> > +        t2 = tcg_const_tl(0x7fffffffffffffffull);
> > +        tcg_gen_shr_tl(t2, t2, t1);
> >          gen_load_gpr(t1, rt);
> > -        gen_helper_1e2i(ldl, t1, t1, t0, ctx->mem_idx);
> > -        gen_store_gpr(t1, rt);
> > +        tcg_gen_and_tl(t1, t1, t2);
> > +        tcg_temp_free(t2);
> > +        tcg_gen_or_tl(t0, t0, t1);
> >          tcg_temp_free(t1);
> > +        gen_store_gpr(t0, rt);
> >          opn = "ldl";
> >          break;
> >      case OPC_LDR:
> > -        save_cpu_state(ctx, 1);
> >          t1 = tcg_temp_new();
> > +        tcg_gen_andi_tl(t1, t0, 7);
> > +#ifdef TARGET_WORDS_BIGENDIAN
> > +        tcg_gen_xori_tl(t1, t1, 7);
> > +#endif
> > +        tcg_gen_shli_tl(t1, t1, 3);
> > +        tcg_gen_andi_tl(t0, t0, ~7);
> > +        tcg_gen_qemu_ld64(t0, t0, ctx->mem_idx);
> > +        tcg_gen_shr_tl(t0, t0, t1);
> > +        tcg_gen_xori_tl(t1, t1, 63);
> > +        t2 = tcg_const_tl(0xfffffffffffffffeull);
> > +        tcg_gen_shl_tl(t2, t2, t1);
> >          gen_load_gpr(t1, rt);
> > -        gen_helper_1e2i(ldr, t1, t1, t0, ctx->mem_idx);
> > -        gen_store_gpr(t1, rt);
> > +        tcg_gen_and_tl(t1, t1, t2);
> > +        tcg_temp_free(t2);
> > +        tcg_gen_or_tl(t0, t0, t1);
> >          tcg_temp_free(t1);
> > +        gen_store_gpr(t0, rt);
> >          opn = "ldr";
> >          break;
> >      case OPC_LDPC:
> > @@ -1217,21 +1241,46 @@ static void gen_ld (CPUMIPSState *env, DisasContext *ctx, uint32_t opc,
> >          opn = "lbu";
> >          break;
> >      case OPC_LWL:
> > -        save_cpu_state(ctx, 1);
> >          t1 = tcg_temp_new();
> > +        tcg_gen_andi_tl(t1, t0, 3);
> > +#ifndef TARGET_WORDS_BIGENDIAN
> > +        tcg_gen_xori_tl(t1, t1, 3);
> > +#endif
> > +        tcg_gen_shli_tl(t1, t1, 3);
> > +        tcg_gen_andi_tl(t0, t0, ~3);
> > +        tcg_gen_qemu_ld32u(t0, t0, ctx->mem_idx);
> > +        tcg_gen_shl_tl(t0, t0, t1);
> > +        tcg_gen_xori_tl(t1, t1, 31);
> > +        t2 = tcg_const_tl(0x7fffffffull);
> > +        tcg_gen_shr_tl(t2, t2, t1);
> >          gen_load_gpr(t1, rt);
> > -        gen_helper_1e2i(lwl, t1, t1, t0, ctx->mem_idx);
> > -        gen_store_gpr(t1, rt);
> > +        tcg_gen_and_tl(t1, t1, t2);
> > +        tcg_temp_free(t2);
> > +        tcg_gen_or_tl(t0, t0, t1);
> >          tcg_temp_free(t1);
> > +        tcg_gen_ext32s_tl(t0, t0);
> > +        gen_store_gpr(t0, rt);
> >          opn = "lwl";
> >          break;
> >      case OPC_LWR:
> > -        save_cpu_state(ctx, 1);
> >          t1 = tcg_temp_new();
> > +        tcg_gen_andi_tl(t1, t0, 3);
> > +#ifdef TARGET_WORDS_BIGENDIAN
> > +        tcg_gen_xori_tl(t1, t1, 3);
> > +#endif
> > +        tcg_gen_shli_tl(t1, t1, 3);
> > +        tcg_gen_andi_tl(t0, t0, ~3);
> > +        tcg_gen_qemu_ld32u(t0, t0, ctx->mem_idx);
> > +        tcg_gen_shr_tl(t0, t0, t1);
> > +        tcg_gen_xori_tl(t1, t1, 31);
> > +        t2 = tcg_const_tl(0xfffffffeull);
> > +        tcg_gen_shl_tl(t2, t2, t1);
> >          gen_load_gpr(t1, rt);
> > -        gen_helper_1e2i(lwr, t1, t1, t0, ctx->mem_idx);
> > -        gen_store_gpr(t1, rt);
> > +        tcg_gen_and_tl(t1, t1, t2);
> > +        tcg_temp_free(t2);
> > +        tcg_gen_or_tl(t0, t0, t1);
> >          tcg_temp_free(t1);
> > +        gen_store_gpr(t0, rt);
> >          opn = "lwr";
> >          break;
> >      case OPC_LL:
> > --
> > 1.7.10.4
> >
> >
> 

-- 
Aurelien Jarno	                        GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

* Re: [Qemu-devel] [PATCH v2 00/19] target-mips: misc fixes and optimizations
  2012-10-30  0:11 [Qemu-devel] [PATCH v2 00/19] target-mips: misc fixes and optimizations Aurelien Jarno
                   ` (18 preceding siblings ...)
  2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 19/19] target-mips: don't flush extra TLB on permissions upgrade Aurelien Jarno
@ 2012-10-31  6:37 ` Richard Henderson
  19 siblings, 0 replies; 23+ messages in thread
From: Richard Henderson @ 2012-10-31  6:37 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel

On 2012-10-30 11:11, Aurelien Jarno wrote:
> Aurelien Jarno (19):
>   target-mips: correctly restore btarget upon exception
>   target-mips: do not save CPU state when using retranslation
>   softfloat: implement fused multiply-add NaN propagation for MIPS
>   target-mips: use the softfloat floatXX_muladd functions
>   target-mips: keep softfloat exception set to 0 between instructions
>   target-mips: fix FPU exceptions
>   target-mips: cleanup float to int conversion helpers
>   target-mips: use softfloat constants when possible
>   target-mips: restore CPU state after an FPU exception
>   target-mips: cleanup load/store operations
>   target-mips: optimize load operations
>   target-mips: simplify load/store microMIPS helpers
>   target-mips: implement unaligned loads using TCG
>   target-mips: don't use local temps for store conditional
>   target-mips: implement movn/movz using movcond
>   target-mips: optimize ddiv/ddivu/div/divu with movcond
>   target-mips: use deposit instead of hardcoded version

Reviewed-by: Richard Henderson <rth@twiddle.net>

>   target-mips: fix TLBR wrt SEGMask
>   target-mips: don't flush extra TLB on permissions upgrade

Not reviewed; I'm not that familiar with MIPS at anything other than the ISA level.


r~

Thread overview: 23+ messages
2012-10-30  0:11 [Qemu-devel] [PATCH v2 00/19] target-mips: misc fixes and optimizations Aurelien Jarno
2012-10-30  0:11 ` [Qemu-devel] [PATCH v2 01/19] target-mips: correctly restore btarget upon exception Aurelien Jarno
2012-10-30  0:11 ` [Qemu-devel] [PATCH v2 02/19] target-mips: do not save CPU state when using retranslation Aurelien Jarno
2012-10-30  0:11 ` [Qemu-devel] [PATCH v2 03/19] softfloat: implement fused multiply-add NaN propagation for MIPS Aurelien Jarno
2012-10-30  0:11 ` [Qemu-devel] [PATCH v2 04/19] target-mips: use the softfloat floatXX_muladd functions Aurelien Jarno
2012-10-30  0:11 ` [Qemu-devel] [PATCH v2 05/19] target-mips: keep softfloat exception set to 0 between instructions Aurelien Jarno
2012-10-30  0:11 ` [Qemu-devel] [PATCH v2 06/19] target-mips: fix FPU exceptions Aurelien Jarno
2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 07/19] target-mips: cleanup float to int conversion helpers Aurelien Jarno
2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 08/19] target-mips: use softfloat constants when possible Aurelien Jarno
2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 09/19] target-mips: restore CPU state after an FPU exception Aurelien Jarno
2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 10/19] target-mips: cleanup load/store operations Aurelien Jarno
2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 11/19] target-mips: optimize load operations Aurelien Jarno
2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 12/19] target-mips: simplify load/store microMIPS helpers Aurelien Jarno
2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 13/19] target-mips: implement unaligned loads using TCG Aurelien Jarno
2012-10-30 18:59   ` Blue Swirl
2012-10-30 20:00     ` Aurelien Jarno
2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 14/19] target-mips: don't use local temps for store conditional Aurelien Jarno
2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 15/19] target-mips: implement movn/movz using movcond Aurelien Jarno
2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 16/19] target-mips: optimize ddiv/ddivu/div/divu with movcond Aurelien Jarno
2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 17/19] target-mips: use deposit instead of hardcoded version Aurelien Jarno
2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 18/19] target-mips: fix TLBR wrt SEGMask Aurelien Jarno
2012-10-30  0:12 ` [Qemu-devel] [PATCH v2 19/19] target-mips: don't flush extra TLB on permissions upgrade Aurelien Jarno
2012-10-31  6:37 ` [Qemu-devel] [PATCH v2 00/19] target-mips: misc fixes and optimizations Richard Henderson
