* [Qemu-devel] [PATCH v2 00/16] Sparc FPU/VIS improvements
@ 2011-10-26 21:15 Richard Henderson
From: Richard Henderson @ 2011-10-26 21:15 UTC
  To: qemu-devel; +Cc: blauwirbel

Changes v1->v2:
  * sparc-linux-user and unrelated tcg patches removed.
  * fabsd env/constification folded into patch 5.
  * always_inline hack and fallout in patch 6 mitigated by marking all
    of the helper functions inline as well.
  * Some coding-style issues cleaned up.
  * Rebased against mainline, now that blueswirl's series is installed.


r~


Richard Henderson (16):
  target-sparc: Add accessors for single-precision fpr access.
  target-sparc: Mark fprs dirty in store accessor.
  target-sparc: Add accessors for double-precision fpr access.
  target-sparc: Pass float64 parameters instead of dt0/1 temporaries.
  target-sparc: Make FPU/VIS helpers const when possible.
  target-sparc: Extract common code for floating-point operations.
  target-sparc: Extract float128 move to a function.
  target-sparc: Undo cpu_fpr rename.
  target-sparc: Change fpr representation to doubles.
  target-sparc: Do exceptions management fully inside the helpers.
  target-sparc: Implement PDIST.
  target-sparc: Implement fpack{16,32,fix}.
  target-sparc: Implement EDGE* instructions.
  target-sparc: Implement ALIGNADDR* inline.
  target-sparc: Implement BMASK/BSHUFFLE.
  target-sparc: Implement FALIGNDATA inline.

 gdbstub.c                  |   35 +-
 linux-user/signal.c        |   28 +-
 monitor.c                  |   96 ++--
 target-sparc/cpu.h         |    8 +-
 target-sparc/cpu_init.c    |    6 +-
 target-sparc/fop_helper.c  |  294 ++++++----
 target-sparc/helper.h      |  122 ++--
 target-sparc/ldst_helper.c |  123 +---
 target-sparc/machine.c     |   20 +-
 target-sparc/translate.c   | 1460 +++++++++++++++++++++++++-------------------
 target-sparc/vis_helper.c  |  251 +++++---
 11 files changed, 1400 insertions(+), 1043 deletions(-)

-- 
1.7.6.4


* [Qemu-devel] [PATCH 01/16] target-sparc: Add accessors for single-precision fpr access.
@ 2011-10-26 21:15 ` Richard Henderson
From: Richard Henderson @ 2011-10-26 21:15 UTC
  To: qemu-devel; +Cc: blauwirbel

Add three accessors for the single-precision fprs: load, store, and
"create destination".  This version attempts to change the behaviour
of the translator as little as possible: we previously used cpu_tmp32
as the temporary destination, and we continue to use that.  Routing
accesses through the accessors will eventually allow a change in
representation of the fprs.

Rename the cpu_fpr array to cpu__fpr so that any use that has not
been updated fails to compile, making certain that all instances are
converted.
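
To illustrate why funnelling accesses through accessors helps, here is a
minimal stand-alone sketch.  It is not QEMU code: it uses plain integers
instead of TCG values, and the names ToyFpuState, load_fpr_F, store_fpr_F,
dest_fpr_F and toy_fnegs are hypothetical.  It also assumes one possible
packed layout (two singles per 64-bit slot, even register in the upper
half) purely for demonstration.  Callers such as toy_fnegs only see the
load / dest / store sequence, so the storage layout could change without
touching them.

```c
#include <stdint.h>

#define NUM_SINGLE_REGS 32

/* Hypothetical register file: singles packed pairwise into 64-bit slots.
   The even-numbered register occupies the upper 32 bits (assumed layout). */
typedef struct {
    uint64_t fpr[NUM_SINGLE_REGS / 2];
    uint32_t tmp32;                 /* plays the role of cpu_tmp32 */
} ToyFpuState;

/* Load accessor: extract one single-precision register. */
static uint32_t load_fpr_F(const ToyFpuState *s, unsigned int src)
{
    uint64_t slot = s->fpr[src / 2];
    return (src & 1) ? (uint32_t)slot : (uint32_t)(slot >> 32);
}

/* Store accessor: overwrite one half of the slot, keep the other half. */
static void store_fpr_F(ToyFpuState *s, unsigned int dst, uint32_t v)
{
    uint64_t *slot = &s->fpr[dst / 2];
    if (dst & 1) {
        *slot = (*slot & 0xffffffff00000000ULL) | v;
    } else {
        *slot = (*slot & 0x00000000ffffffffULL) | ((uint64_t)v << 32);
    }
}

/* "Create destination": hand out the shared temporary. */
static uint32_t *dest_fpr_F(ToyFpuState *s)
{
    return &s->tmp32;
}

/* A caller written only in terms of the accessors: negate by flipping
   the IEEE sign bit, mirroring the load / dest / op / store sequence. */
static void toy_fnegs(ToyFpuState *s, unsigned int rd, unsigned int rs2)
{
    uint32_t src = load_fpr_F(s, rs2);
    uint32_t *dst = dest_fpr_F(s);
    *dst = src ^ 0x80000000u;
    store_fpr_F(s, rd, *dst);
}
```

In this sketch only the three accessors know about the packed layout;
toy_fnegs would compile unchanged against a flat array of 32-bit values.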

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-sparc/translate.c |  532 +++++++++++++++++++++++++++++-----------------
 1 files changed, 337 insertions(+), 195 deletions(-)

diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 383fd9c..da52ce6 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -63,7 +63,7 @@ static TCGv cpu_tmp0;
 static TCGv_i32 cpu_tmp32;
 static TCGv_i64 cpu_tmp64;
 /* Floating point registers */
-static TCGv_i32 cpu_fpr[TARGET_FPREGS];
+static TCGv_i32 cpu__fpr[TARGET_FPREGS];
 
 static target_ulong gen_opc_npc[OPC_BUF_SIZE];
 static target_ulong gen_opc_jump_pc[2];
@@ -115,63 +115,78 @@ static int sign_extend(int x, int len)
 #define IS_IMM (insn & (1<<13))
 
 /* floating point registers moves */
+static TCGv_i32 gen_load_fpr_F(DisasContext *dc, unsigned int src)
+{
+    return cpu__fpr[src];
+}
+
+static void gen_store_fpr_F(DisasContext *dc, unsigned int dst, TCGv_i32 v)
+{
+    tcg_gen_mov_i32 (cpu__fpr[dst], v);
+}
+
+static TCGv_i32 gen_dest_fpr_F(void)
+{
+    return cpu_tmp32;
+}
+
 static void gen_op_load_fpr_DT0(unsigned int src)
 {
-    tcg_gen_st_i32(cpu_fpr[src], cpu_env, offsetof(CPUSPARCState, dt0) +
+    tcg_gen_st_i32(cpu__fpr[src], cpu_env, offsetof(CPUSPARCState, dt0) +
                    offsetof(CPU_DoubleU, l.upper));
-    tcg_gen_st_i32(cpu_fpr[src + 1], cpu_env, offsetof(CPUSPARCState, dt0) +
+    tcg_gen_st_i32(cpu__fpr[src + 1], cpu_env, offsetof(CPUSPARCState, dt0) +
                    offsetof(CPU_DoubleU, l.lower));
 }
 
 static void gen_op_load_fpr_DT1(unsigned int src)
 {
-    tcg_gen_st_i32(cpu_fpr[src], cpu_env, offsetof(CPUSPARCState, dt1) +
+    tcg_gen_st_i32(cpu__fpr[src], cpu_env, offsetof(CPUSPARCState, dt1) +
                    offsetof(CPU_DoubleU, l.upper));
-    tcg_gen_st_i32(cpu_fpr[src + 1], cpu_env, offsetof(CPUSPARCState, dt1) +
+    tcg_gen_st_i32(cpu__fpr[src + 1], cpu_env, offsetof(CPUSPARCState, dt1) +
                    offsetof(CPU_DoubleU, l.lower));
 }
 
 static void gen_op_store_DT0_fpr(unsigned int dst)
 {
-    tcg_gen_ld_i32(cpu_fpr[dst], cpu_env, offsetof(CPUSPARCState, dt0) +
+    tcg_gen_ld_i32(cpu__fpr[dst], cpu_env, offsetof(CPUSPARCState, dt0) +
                    offsetof(CPU_DoubleU, l.upper));
-    tcg_gen_ld_i32(cpu_fpr[dst + 1], cpu_env, offsetof(CPUSPARCState, dt0) +
+    tcg_gen_ld_i32(cpu__fpr[dst + 1], cpu_env, offsetof(CPUSPARCState, dt0) +
                    offsetof(CPU_DoubleU, l.lower));
 }
 
 static void gen_op_load_fpr_QT0(unsigned int src)
 {
-    tcg_gen_st_i32(cpu_fpr[src], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_st_i32(cpu__fpr[src], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.upmost));
-    tcg_gen_st_i32(cpu_fpr[src + 1], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_st_i32(cpu__fpr[src + 1], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.upper));
-    tcg_gen_st_i32(cpu_fpr[src + 2], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_st_i32(cpu__fpr[src + 2], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.lower));
-    tcg_gen_st_i32(cpu_fpr[src + 3], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_st_i32(cpu__fpr[src + 3], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.lowest));
 }
 
 static void gen_op_load_fpr_QT1(unsigned int src)
 {
-    tcg_gen_st_i32(cpu_fpr[src], cpu_env, offsetof(CPUSPARCState, qt1) +
+    tcg_gen_st_i32(cpu__fpr[src], cpu_env, offsetof(CPUSPARCState, qt1) +
                    offsetof(CPU_QuadU, l.upmost));
-    tcg_gen_st_i32(cpu_fpr[src + 1], cpu_env, offsetof(CPUSPARCState, qt1) +
+    tcg_gen_st_i32(cpu__fpr[src + 1], cpu_env, offsetof(CPUSPARCState, qt1) +
                    offsetof(CPU_QuadU, l.upper));
-    tcg_gen_st_i32(cpu_fpr[src + 2], cpu_env, offsetof(CPUSPARCState, qt1) +
+    tcg_gen_st_i32(cpu__fpr[src + 2], cpu_env, offsetof(CPUSPARCState, qt1) +
                    offsetof(CPU_QuadU, l.lower));
-    tcg_gen_st_i32(cpu_fpr[src + 3], cpu_env, offsetof(CPUSPARCState, qt1) +
+    tcg_gen_st_i32(cpu__fpr[src + 3], cpu_env, offsetof(CPUSPARCState, qt1) +
                    offsetof(CPU_QuadU, l.lowest));
 }
 
 static void gen_op_store_QT0_fpr(unsigned int dst)
 {
-    tcg_gen_ld_i32(cpu_fpr[dst], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_ld_i32(cpu__fpr[dst], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.upmost));
-    tcg_gen_ld_i32(cpu_fpr[dst + 1], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_ld_i32(cpu__fpr[dst + 1], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.upper));
-    tcg_gen_ld_i32(cpu_fpr[dst + 2], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_ld_i32(cpu__fpr[dst + 2], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.lower));
-    tcg_gen_ld_i32(cpu_fpr[dst + 3], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_ld_i32(cpu__fpr[dst + 3], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.lowest));
 }
 
@@ -1892,6 +1907,7 @@ static void disas_sparc_insn(DisasContext * dc)
 {
     unsigned int insn, opc, rs1, rs2, rd;
     TCGv cpu_src1, cpu_src2, cpu_tmp1, cpu_tmp2;
+    TCGv_i32 cpu_src1_32, cpu_src2_32, cpu_dst_32;
     target_long simm;
 
     if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP)))
@@ -2369,23 +2385,32 @@ static void disas_sparc_insn(DisasContext * dc)
                 save_state(dc, cpu_cond);
                 switch (xop) {
                 case 0x1: /* fmovs */
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
+                    gen_store_fpr_F(dc, rd, cpu_src1_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x5: /* fnegs */
-                    gen_helper_fnegs(cpu_fpr[rd], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fnegs(cpu_dst_32, cpu_src1_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x9: /* fabss */
-                    gen_helper_fabss(cpu_fpr[rd], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fabss(cpu_dst_32, cpu_src1_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x29: /* fsqrts */
                     CHECK_FPU_FEATURE(dc, FSQRT);
                     gen_clear_float_exceptions();
-                    gen_helper_fsqrts(cpu_tmp32, cpu_env, cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fsqrts(cpu_dst_32, cpu_env, cpu_src1_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_tmp32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x2a: /* fsqrtd */
@@ -2408,10 +2433,13 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x41: /* fadds */
                     gen_clear_float_exceptions();
-                    gen_helper_fadds(cpu_tmp32, cpu_env, cpu_fpr[rs1],
-                                     cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fadds(cpu_dst_32, cpu_env,
+                                     cpu_src1_32, cpu_src2_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_tmp32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x42: /* faddd */
@@ -2435,10 +2463,13 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x45: /* fsubs */
                     gen_clear_float_exceptions();
-                    gen_helper_fsubs(cpu_tmp32, cpu_env, cpu_fpr[rs1],
-                                     cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fsubs(cpu_dst_32, cpu_env,
+                                     cpu_src1_32, cpu_src2_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_tmp32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x46: /* fsubd */
@@ -2463,10 +2494,13 @@ static void disas_sparc_insn(DisasContext * dc)
                 case 0x49: /* fmuls */
                     CHECK_FPU_FEATURE(dc, FMUL);
                     gen_clear_float_exceptions();
-                    gen_helper_fmuls(cpu_tmp32, cpu_env, cpu_fpr[rs1],
-                                     cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fmuls(cpu_dst_32, cpu_env,
+                                     cpu_src1_32, cpu_src2_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_tmp32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x4a: /* fmuld */
@@ -2492,10 +2526,13 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x4d: /* fdivs */
                     gen_clear_float_exceptions();
-                    gen_helper_fdivs(cpu_tmp32, cpu_env, cpu_fpr[rs1],
-                                     cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fdivs(cpu_dst_32, cpu_env,
+                                     cpu_src1_32, cpu_src2_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_tmp32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x4e: /* fdivd */
@@ -2520,7 +2557,9 @@ static void disas_sparc_insn(DisasContext * dc)
                 case 0x69: /* fsmuld */
                     CHECK_FPU_FEATURE(dc, FSMULD);
                     gen_clear_float_exceptions();
-                    gen_helper_fsmuld(cpu_env, cpu_fpr[rs1], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    gen_helper_fsmuld(cpu_env, cpu_src1_32, cpu_src2_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_op_store_DT0_fpr(DFPREG(rd));
                     gen_update_fprs_dirty(DFPREG(rd));
@@ -2537,35 +2576,41 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0xc4: /* fitos */
                     gen_clear_float_exceptions();
-                    gen_helper_fitos(cpu_tmp32, cpu_env, cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fitos(cpu_dst_32, cpu_env, cpu_src1_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_tmp32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0xc6: /* fdtos */
                     gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fdtos(cpu_tmp32, cpu_env);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fdtos(cpu_dst_32, cpu_env);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_tmp32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0xc7: /* fqtos */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
                     gen_op_load_fpr_QT1(QFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fqtos(cpu_tmp32, cpu_env);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fqtos(cpu_dst_32, cpu_env);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_tmp32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0xc8: /* fitod */
-                    gen_helper_fitod(cpu_env, cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
+                    gen_helper_fitod(cpu_env, cpu_src1_32);
                     gen_op_store_DT0_fpr(DFPREG(rd));
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0xc9: /* fstod */
-                    gen_helper_fstod(cpu_env, cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
+                    gen_helper_fstod(cpu_env, cpu_src1_32);
                     gen_op_store_DT0_fpr(DFPREG(rd));
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
@@ -2580,13 +2625,15 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0xcc: /* fitoq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_helper_fitoq(cpu_env, cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
+                    gen_helper_fitoq(cpu_env, cpu_src1_32);
                     gen_op_store_QT0_fpr(QFPREG(rd));
                     gen_update_fprs_dirty(QFPREG(rd));
                     break;
                 case 0xcd: /* fstoq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_helper_fstoq(cpu_env, cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
+                    gen_helper_fstoq(cpu_env, cpu_src1_32);
                     gen_op_store_QT0_fpr(QFPREG(rd));
                     gen_update_fprs_dirty(QFPREG(rd));
                     break;
@@ -2599,44 +2646,50 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0xd1: /* fstoi */
                     gen_clear_float_exceptions();
-                    gen_helper_fstoi(cpu_tmp32, cpu_env, cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fstoi(cpu_dst_32, cpu_env, cpu_src1_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_tmp32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0xd2: /* fdtoi */
                     gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fdtoi(cpu_tmp32, cpu_env);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fdtoi(cpu_dst_32, cpu_env);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_tmp32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0xd3: /* fqtoi */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
                     gen_op_load_fpr_QT1(QFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fqtoi(cpu_tmp32, cpu_env);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fqtoi(cpu_dst_32, cpu_env);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_tmp32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
 #ifdef TARGET_SPARC64
                 case 0x2: /* V9 fmovd */
-                    tcg_gen_mov_i32(cpu_fpr[DFPREG(rd)], cpu_fpr[DFPREG(rs2)]);
-                    tcg_gen_mov_i32(cpu_fpr[DFPREG(rd) + 1],
-                                    cpu_fpr[DFPREG(rs2) + 1]);
+                    tcg_gen_mov_i32(cpu__fpr[DFPREG(rd)],
+                                    cpu__fpr[DFPREG(rs2)]);
+                    tcg_gen_mov_i32(cpu__fpr[DFPREG(rd) + 1],
+                                    cpu__fpr[DFPREG(rs2) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x3: /* V9 fmovq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    tcg_gen_mov_i32(cpu_fpr[QFPREG(rd)], cpu_fpr[QFPREG(rs2)]);
-                    tcg_gen_mov_i32(cpu_fpr[QFPREG(rd) + 1],
-                                    cpu_fpr[QFPREG(rs2) + 1]);
-                    tcg_gen_mov_i32(cpu_fpr[QFPREG(rd) + 2],
-                                    cpu_fpr[QFPREG(rs2) + 2]);
-                    tcg_gen_mov_i32(cpu_fpr[QFPREG(rd) + 3],
-                                    cpu_fpr[QFPREG(rs2) + 3]);
+                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd)],
+                                    cpu__fpr[QFPREG(rs2)]);
+                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 1],
+                                    cpu__fpr[QFPREG(rs2) + 1]);
+                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 2],
+                                    cpu__fpr[QFPREG(rs2) + 2]);
+                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 3],
+                                    cpu__fpr[QFPREG(rs2) + 3]);
                     gen_update_fprs_dirty(QFPREG(rd));
                     break;
                 case 0x6: /* V9 fnegd */
@@ -2667,7 +2720,8 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x81: /* V9 fstox */
                     gen_clear_float_exceptions();
-                    gen_helper_fstox(cpu_env, cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
+                    gen_helper_fstox(cpu_env, cpu_src1_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_op_store_DT0_fpr(DFPREG(rd));
                     gen_update_fprs_dirty(DFPREG(rd));
@@ -2692,9 +2746,10 @@ static void disas_sparc_insn(DisasContext * dc)
                 case 0x84: /* V9 fxtos */
                     gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fxtos(cpu_tmp32, cpu_env);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fxtos(cpu_dst_32, cpu_env);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_tmp32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x88: /* V9 fxtod */
@@ -2738,7 +2793,8 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_src1 = get_src1(insn, cpu_src1);
                     tcg_gen_brcondi_tl(gen_tcg_cond_reg[cond], cpu_src1,
                                        0, l1);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
+                    gen_store_fpr_F(dc, rd, cpu_src1_32);
                     gen_update_fprs_dirty(rd);
                     gen_set_label(l1);
                     break;
@@ -2750,8 +2806,8 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_src1 = get_src1(insn, cpu_src1);
                     tcg_gen_brcondi_tl(gen_tcg_cond_reg[cond], cpu_src1,
                                        0, l1);
-                    tcg_gen_mov_i32(cpu_fpr[DFPREG(rd)], cpu_fpr[DFPREG(rs2)]);
-                    tcg_gen_mov_i32(cpu_fpr[DFPREG(rd) + 1], cpu_fpr[DFPREG(rs2) + 1]);
+                    tcg_gen_mov_i32(cpu__fpr[DFPREG(rd)], cpu__fpr[DFPREG(rs2)]);
+                    tcg_gen_mov_i32(cpu__fpr[DFPREG(rd) + 1], cpu__fpr[DFPREG(rs2) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     gen_set_label(l1);
                     break;
@@ -2764,10 +2820,10 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_src1 = get_src1(insn, cpu_src1);
                     tcg_gen_brcondi_tl(gen_tcg_cond_reg[cond], cpu_src1,
                                        0, l1);
-                    tcg_gen_mov_i32(cpu_fpr[QFPREG(rd)], cpu_fpr[QFPREG(rs2)]);
-                    tcg_gen_mov_i32(cpu_fpr[QFPREG(rd) + 1], cpu_fpr[QFPREG(rs2) + 1]);
-                    tcg_gen_mov_i32(cpu_fpr[QFPREG(rd) + 2], cpu_fpr[QFPREG(rs2) + 2]);
-                    tcg_gen_mov_i32(cpu_fpr[QFPREG(rd) + 3], cpu_fpr[QFPREG(rs2) + 3]);
+                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd)], cpu__fpr[QFPREG(rs2)]);
+                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 1], cpu__fpr[QFPREG(rs2) + 1]);
+                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 2], cpu__fpr[QFPREG(rs2) + 2]);
+                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 3], cpu__fpr[QFPREG(rs2) + 3]);
                     gen_update_fprs_dirty(QFPREG(rd));
                     gen_set_label(l1);
                     break;
@@ -2786,7 +2842,8 @@ static void disas_sparc_insn(DisasContext * dc)
                         gen_fcond(r_cond, fcc, cond);                   \
                         tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond,         \
                                            0, l1);                      \
-                        tcg_gen_mov_i32(cpu_fpr[rd], cpu_fpr[rs2]);     \
+                        cpu_src1_32 = gen_load_fpr_F(dc, rs2);          \
+                        gen_store_fpr_F(dc, rd, cpu_src1_32);           \
                         gen_update_fprs_dirty(rd);                      \
                         gen_set_label(l1);                              \
                         tcg_temp_free(r_cond);                          \
@@ -2802,10 +2859,10 @@ static void disas_sparc_insn(DisasContext * dc)
                         gen_fcond(r_cond, fcc, cond);                   \
                         tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond,         \
                                            0, l1);                      \
-                        tcg_gen_mov_i32(cpu_fpr[DFPREG(rd)],            \
-                                        cpu_fpr[DFPREG(rs2)]);          \
-                        tcg_gen_mov_i32(cpu_fpr[DFPREG(rd) + 1],        \
-                                        cpu_fpr[DFPREG(rs2) + 1]);      \
+                        tcg_gen_mov_i32(cpu__fpr[DFPREG(rd)],           \
+                                        cpu__fpr[DFPREG(rs2)]);         \
+                        tcg_gen_mov_i32(cpu__fpr[DFPREG(rd) + 1],       \
+                                        cpu__fpr[DFPREG(rs2) + 1]);     \
                         gen_update_fprs_dirty(DFPREG(rd));              \
                         gen_set_label(l1);                              \
                         tcg_temp_free(r_cond);                          \
@@ -2821,14 +2878,14 @@ static void disas_sparc_insn(DisasContext * dc)
                         gen_fcond(r_cond, fcc, cond);                   \
                         tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond,         \
                                            0, l1);                      \
-                        tcg_gen_mov_i32(cpu_fpr[QFPREG(rd)],            \
-                                        cpu_fpr[QFPREG(rs2)]);          \
-                        tcg_gen_mov_i32(cpu_fpr[QFPREG(rd) + 1],        \
-                                        cpu_fpr[QFPREG(rs2) + 1]);      \
-                        tcg_gen_mov_i32(cpu_fpr[QFPREG(rd) + 2],        \
-                                        cpu_fpr[QFPREG(rs2) + 2]);      \
-                        tcg_gen_mov_i32(cpu_fpr[QFPREG(rd) + 3],        \
-                                        cpu_fpr[QFPREG(rs2) + 3]);      \
+                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd)],           \
+                                        cpu__fpr[QFPREG(rs2)]);         \
+                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 1],       \
+                                        cpu__fpr[QFPREG(rs2) + 1]);     \
+                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 2],       \
+                                        cpu__fpr[QFPREG(rs2) + 2]);     \
+                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 3],       \
+                                        cpu__fpr[QFPREG(rs2) + 3]);     \
                         gen_update_fprs_dirty(QFPREG(rd));              \
                         gen_set_label(l1);                              \
                         tcg_temp_free(r_cond);                          \
@@ -2887,7 +2944,8 @@ static void disas_sparc_insn(DisasContext * dc)
                         gen_cond(r_cond, icc, cond, dc);                \
                         tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond,         \
                                            0, l1);                      \
-                        tcg_gen_mov_i32(cpu_fpr[rd], cpu_fpr[rs2]);     \
+                        cpu_src1_32 = gen_load_fpr_F(dc, rs2);          \
+                        gen_store_fpr_F(dc, rd, cpu_src1_32);           \
                         gen_update_fprs_dirty(rd);                      \
                         gen_set_label(l1);                              \
                         tcg_temp_free(r_cond);                          \
@@ -2903,10 +2961,10 @@ static void disas_sparc_insn(DisasContext * dc)
                         gen_cond(r_cond, icc, cond, dc);                \
                         tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond,         \
                                            0, l1);                      \
-                        tcg_gen_mov_i32(cpu_fpr[DFPREG(rd)],            \
-                                        cpu_fpr[DFPREG(rs2)]);          \
-                        tcg_gen_mov_i32(cpu_fpr[DFPREG(rd) + 1],        \
-                                        cpu_fpr[DFPREG(rs2) + 1]);      \
+                        tcg_gen_mov_i32(cpu__fpr[DFPREG(rd)],           \
+                                        cpu__fpr[DFPREG(rs2)]);         \
+                        tcg_gen_mov_i32(cpu__fpr[DFPREG(rd) + 1],       \
+                                        cpu__fpr[DFPREG(rs2) + 1]);     \
                         gen_update_fprs_dirty(DFPREG(rd));              \
                         gen_set_label(l1);                              \
                         tcg_temp_free(r_cond);                          \
@@ -2922,14 +2980,14 @@ static void disas_sparc_insn(DisasContext * dc)
                         gen_cond(r_cond, icc, cond, dc);                \
                         tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond,         \
                                            0, l1);                      \
-                        tcg_gen_mov_i32(cpu_fpr[QFPREG(rd)],            \
-                                        cpu_fpr[QFPREG(rs2)]);          \
-                        tcg_gen_mov_i32(cpu_fpr[QFPREG(rd) + 1],        \
-                                        cpu_fpr[QFPREG(rs2) + 1]);      \
-                        tcg_gen_mov_i32(cpu_fpr[QFPREG(rd) + 2],        \
-                                        cpu_fpr[QFPREG(rs2) + 2]);      \
-                        tcg_gen_mov_i32(cpu_fpr[QFPREG(rd) + 3],        \
-                                        cpu_fpr[QFPREG(rs2) + 3]);      \
+                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd)],           \
+                                        cpu__fpr[QFPREG(rs2)]);         \
+                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 1],       \
+                                        cpu__fpr[QFPREG(rs2) + 1]);     \
+                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 2],       \
+                                        cpu__fpr[QFPREG(rs2) + 2]);     \
+                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 3],       \
+                                        cpu__fpr[QFPREG(rs2) + 3]);     \
                         gen_update_fprs_dirty(QFPREG(rd));              \
                         gen_set_label(l1);                              \
                         tcg_temp_free(r_cond);                          \
@@ -2960,7 +3018,9 @@ static void disas_sparc_insn(DisasContext * dc)
 #undef FMOVQCC
 #endif
                     case 0x51: /* fcmps, V9 %fcc */
-                        gen_op_fcmps(rd & 3, cpu_fpr[rs1], cpu_fpr[rs2]);
+                        cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                        cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                        gen_op_fcmps(rd & 3, cpu_src1_32, cpu_src2_32);
                         break;
                     case 0x52: /* fcmpd, V9 %fcc */
                         gen_op_load_fpr_DT0(DFPREG(rs1));
@@ -2974,7 +3034,9 @@ static void disas_sparc_insn(DisasContext * dc)
                         gen_op_fcmpq(rd & 3);
                         break;
                     case 0x55: /* fcmpes, V9 %fcc */
-                        gen_op_fcmpes(rd & 3, cpu_fpr[rs1], cpu_fpr[rs2]);
+                        cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                        cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                        gen_op_fcmpes(rd & 3, cpu_src1_32, cpu_src2_32);
                         break;
                     case 0x56: /* fcmped, V9 %fcc */
                         gen_op_load_fpr_DT0(DFPREG(rs1));
@@ -4021,8 +4083,12 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x051: /* VIS I fpadd16s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_helper_fpadd16s(cpu_fpr[rd], cpu_env,
-                                        cpu_fpr[rs1], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fpadd16s(cpu_dst_32, cpu_env,
+                                        cpu_src1_32, cpu_src2_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x052: /* VIS I fpadd32 */
@@ -4035,8 +4101,11 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x053: /* VIS I fpadd32s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_helper_fpadd32s(cpu_fpr[rd], cpu_env,
-                                        cpu_fpr[rs1], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_add_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x054: /* VIS I fpsub16 */
@@ -4049,8 +4118,12 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x055: /* VIS I fpsub16s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_helper_fpsub16s(cpu_fpr[rd], cpu_env,
-                                        cpu_fpr[rs1], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fpsub16s(cpu_dst_32, cpu_env,
+                                        cpu_src1_32, cpu_src2_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x056: /* VIS I fpsub32 */
@@ -4063,169 +4136,222 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x057: /* VIS I fpsub32s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_helper_fpsub32s(cpu_fpr[rd], cpu_env,
-                                        cpu_fpr[rs1], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_sub_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x060: /* VIS I fzero */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_movi_i32(cpu_fpr[DFPREG(rd)], 0);
-                    tcg_gen_movi_i32(cpu_fpr[DFPREG(rd) + 1], 0);
+                    tcg_gen_movi_i32(cpu__fpr[DFPREG(rd)], 0);
+                    tcg_gen_movi_i32(cpu__fpr[DFPREG(rd) + 1], 0);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x061: /* VIS I fzeros */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_movi_i32(cpu_fpr[rd], 0);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_movi_i32(cpu_dst_32, 0);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x062: /* VIS I fnor */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_nor_i32(cpu_fpr[DFPREG(rd)], cpu_fpr[DFPREG(rs1)],
-                                    cpu_fpr[DFPREG(rs2)]);
-                    tcg_gen_nor_i32(cpu_fpr[DFPREG(rd) + 1],
-                                    cpu_fpr[DFPREG(rs1) + 1],
-                                    cpu_fpr[DFPREG(rs2) + 1]);
+                    tcg_gen_nor_i32(cpu__fpr[DFPREG(rd)],
+                                    cpu__fpr[DFPREG(rs1)],
+                                    cpu__fpr[DFPREG(rs2)]);
+                    tcg_gen_nor_i32(cpu__fpr[DFPREG(rd) + 1],
+                                    cpu__fpr[DFPREG(rs1) + 1],
+                                    cpu__fpr[DFPREG(rs2) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x063: /* VIS I fnors */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_nor_i32(cpu_fpr[rd], cpu_fpr[rs1], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_nor_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x064: /* VIS I fandnot2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_andc_i32(cpu_fpr[DFPREG(rd)], cpu_fpr[DFPREG(rs1)],
-                                     cpu_fpr[DFPREG(rs2)]);
-                    tcg_gen_andc_i32(cpu_fpr[DFPREG(rd) + 1],
-                                     cpu_fpr[DFPREG(rs1) + 1],
-                                     cpu_fpr[DFPREG(rs2) + 1]);
+                    tcg_gen_andc_i32(cpu__fpr[DFPREG(rd)],
+                                     cpu__fpr[DFPREG(rs1)],
+                                     cpu__fpr[DFPREG(rs2)]);
+                    tcg_gen_andc_i32(cpu__fpr[DFPREG(rd) + 1],
+                                     cpu__fpr[DFPREG(rs1) + 1],
+                                     cpu__fpr[DFPREG(rs2) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x065: /* VIS I fandnot2s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_andc_i32(cpu_fpr[rd], cpu_fpr[rs1], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_andc_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x066: /* VIS I fnot2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_not_i32(cpu_fpr[DFPREG(rd)], cpu_fpr[DFPREG(rs2)]);
-                    tcg_gen_not_i32(cpu_fpr[DFPREG(rd) + 1],
-                                    cpu_fpr[DFPREG(rs2) + 1]);
+                    tcg_gen_not_i32(cpu__fpr[DFPREG(rd)],
+                                    cpu__fpr[DFPREG(rs2)]);
+                    tcg_gen_not_i32(cpu__fpr[DFPREG(rd) + 1],
+                                    cpu__fpr[DFPREG(rs2) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x067: /* VIS I fnot2s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_not_i32(cpu_fpr[rd], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_not_i32(cpu_dst_32, cpu_src1_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x068: /* VIS I fandnot1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_andc_i32(cpu_fpr[DFPREG(rd)], cpu_fpr[DFPREG(rs2)],
-                                     cpu_fpr[DFPREG(rs1)]);
-                    tcg_gen_andc_i32(cpu_fpr[DFPREG(rd) + 1],
-                                     cpu_fpr[DFPREG(rs2) + 1],
-                                     cpu_fpr[DFPREG(rs1) + 1]);
+                    tcg_gen_andc_i32(cpu__fpr[DFPREG(rd)],
+                                     cpu__fpr[DFPREG(rs2)],
+                                     cpu__fpr[DFPREG(rs1)]);
+                    tcg_gen_andc_i32(cpu__fpr[DFPREG(rd) + 1],
+                                     cpu__fpr[DFPREG(rs2) + 1],
+                                     cpu__fpr[DFPREG(rs1) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x069: /* VIS I fandnot1s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_andc_i32(cpu_fpr[rd], cpu_fpr[rs2], cpu_fpr[rs1]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_andc_i32(cpu_dst_32, cpu_src2_32, cpu_src1_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x06a: /* VIS I fnot1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_not_i32(cpu_fpr[DFPREG(rd)], cpu_fpr[DFPREG(rs1)]);
-                    tcg_gen_not_i32(cpu_fpr[DFPREG(rd) + 1],
-                                    cpu_fpr[DFPREG(rs1) + 1]);
+                    tcg_gen_not_i32(cpu__fpr[DFPREG(rd)],
+                                    cpu__fpr[DFPREG(rs1)]);
+                    tcg_gen_not_i32(cpu__fpr[DFPREG(rd) + 1],
+                                    cpu__fpr[DFPREG(rs1) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x06b: /* VIS I fnot1s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_not_i32(cpu_fpr[rd], cpu_fpr[rs1]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_not_i32(cpu_dst_32, cpu_src1_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x06c: /* VIS I fxor */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_xor_i32(cpu_fpr[DFPREG(rd)], cpu_fpr[DFPREG(rs1)],
-                                    cpu_fpr[DFPREG(rs2)]);
-                    tcg_gen_xor_i32(cpu_fpr[DFPREG(rd) + 1],
-                                    cpu_fpr[DFPREG(rs1) + 1],
-                                    cpu_fpr[DFPREG(rs2) + 1]);
+                    tcg_gen_xor_i32(cpu__fpr[DFPREG(rd)],
+                                    cpu__fpr[DFPREG(rs1)],
+                                    cpu__fpr[DFPREG(rs2)]);
+                    tcg_gen_xor_i32(cpu__fpr[DFPREG(rd) + 1],
+                                    cpu__fpr[DFPREG(rs1) + 1],
+                                    cpu__fpr[DFPREG(rs2) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x06d: /* VIS I fxors */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_xor_i32(cpu_fpr[rd], cpu_fpr[rs1], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_xor_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x06e: /* VIS I fnand */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_nand_i32(cpu_fpr[DFPREG(rd)], cpu_fpr[DFPREG(rs1)],
-                                     cpu_fpr[DFPREG(rs2)]);
-                    tcg_gen_nand_i32(cpu_fpr[DFPREG(rd) + 1],
-                                     cpu_fpr[DFPREG(rs1) + 1],
-                                     cpu_fpr[DFPREG(rs2) + 1]);
+                    tcg_gen_nand_i32(cpu__fpr[DFPREG(rd)],
+                                     cpu__fpr[DFPREG(rs1)],
+                                     cpu__fpr[DFPREG(rs2)]);
+                    tcg_gen_nand_i32(cpu__fpr[DFPREG(rd) + 1],
+                                     cpu__fpr[DFPREG(rs1) + 1],
+                                     cpu__fpr[DFPREG(rs2) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x06f: /* VIS I fnands */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_nand_i32(cpu_fpr[rd], cpu_fpr[rs1], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_nand_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x070: /* VIS I fand */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_and_i32(cpu_fpr[DFPREG(rd)], cpu_fpr[DFPREG(rs1)],
-                                    cpu_fpr[DFPREG(rs2)]);
-                    tcg_gen_and_i32(cpu_fpr[DFPREG(rd) + 1],
-                                    cpu_fpr[DFPREG(rs1) + 1],
-                                    cpu_fpr[DFPREG(rs2) + 1]);
+                    tcg_gen_and_i32(cpu__fpr[DFPREG(rd)],
+                                    cpu__fpr[DFPREG(rs1)],
+                                    cpu__fpr[DFPREG(rs2)]);
+                    tcg_gen_and_i32(cpu__fpr[DFPREG(rd) + 1],
+                                    cpu__fpr[DFPREG(rs1) + 1],
+                                    cpu__fpr[DFPREG(rs2) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x071: /* VIS I fands */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_and_i32(cpu_fpr[rd], cpu_fpr[rs1], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_and_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x072: /* VIS I fxnor */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_xori_i32(cpu_tmp32, cpu_fpr[DFPREG(rs2)], -1);
-                    tcg_gen_xor_i32(cpu_fpr[DFPREG(rd)], cpu_tmp32,
-                                    cpu_fpr[DFPREG(rs1)]);
-                    tcg_gen_xori_i32(cpu_tmp32, cpu_fpr[DFPREG(rs2) + 1], -1);
-                    tcg_gen_xor_i32(cpu_fpr[DFPREG(rd) + 1], cpu_tmp32,
-                                    cpu_fpr[DFPREG(rs1) + 1]);
+                    tcg_gen_eqv_i32(cpu__fpr[DFPREG(rd)],
+                                    cpu__fpr[DFPREG(rs1)],
+                                    cpu__fpr[DFPREG(rs2)]);
+                    tcg_gen_eqv_i32(cpu__fpr[DFPREG(rd) + 1],
+                                    cpu__fpr[DFPREG(rs1) + 1],
+                                    cpu__fpr[DFPREG(rs2) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x073: /* VIS I fxnors */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_xori_i32(cpu_tmp32, cpu_fpr[rs2], -1);
-                    tcg_gen_xor_i32(cpu_fpr[rd], cpu_tmp32, cpu_fpr[rs1]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_eqv_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x074: /* VIS I fsrc1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_mov_i32(cpu_fpr[DFPREG(rd)], cpu_fpr[DFPREG(rs1)]);
-                    tcg_gen_mov_i32(cpu_fpr[DFPREG(rd) + 1],
-                                    cpu_fpr[DFPREG(rs1) + 1]);
+                    tcg_gen_mov_i32(cpu__fpr[DFPREG(rd)],
+                                    cpu__fpr[DFPREG(rs1)]);
+                    tcg_gen_mov_i32(cpu__fpr[DFPREG(rd) + 1],
+                                    cpu__fpr[DFPREG(rs1) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x075: /* VIS I fsrc1s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_fpr[rs1]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    gen_store_fpr_F(dc, rd, cpu_src1_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x076: /* VIS I fornot2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_orc_i32(cpu_fpr[DFPREG(rd)], cpu_fpr[DFPREG(rs1)],
-                                    cpu_fpr[DFPREG(rs2)]);
-                    tcg_gen_orc_i32(cpu_fpr[DFPREG(rd) + 1],
-                                    cpu_fpr[DFPREG(rs1) + 1],
-                                    cpu_fpr[DFPREG(rs2) + 1]);
+                    tcg_gen_orc_i32(cpu__fpr[DFPREG(rd)],
+                                    cpu__fpr[DFPREG(rs1)],
+                                    cpu__fpr[DFPREG(rs2)]);
+                    tcg_gen_orc_i32(cpu__fpr[DFPREG(rd) + 1],
+                                    cpu__fpr[DFPREG(rs1) + 1],
+                                    cpu__fpr[DFPREG(rs2) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x077: /* VIS I fornot2s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_orc_i32(cpu_fpr[rd], cpu_fpr[rs1], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_orc_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x078: /* VIS I fsrc2 */
@@ -4236,46 +4362,59 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x079: /* VIS I fsrc2s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
+                    gen_store_fpr_F(dc, rd, cpu_src1_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x07a: /* VIS I fornot1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_orc_i32(cpu_fpr[DFPREG(rd)], cpu_fpr[DFPREG(rs2)],
-                                    cpu_fpr[DFPREG(rs1)]);
-                    tcg_gen_orc_i32(cpu_fpr[DFPREG(rd) + 1],
-                                    cpu_fpr[DFPREG(rs2) + 1],
-                                    cpu_fpr[DFPREG(rs1) + 1]);
+                    tcg_gen_orc_i32(cpu__fpr[DFPREG(rd)],
+                                    cpu__fpr[DFPREG(rs2)],
+                                    cpu__fpr[DFPREG(rs1)]);
+                    tcg_gen_orc_i32(cpu__fpr[DFPREG(rd) + 1],
+                                    cpu__fpr[DFPREG(rs2) + 1],
+                                    cpu__fpr[DFPREG(rs1) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x07b: /* VIS I fornot1s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_orc_i32(cpu_fpr[rd], cpu_fpr[rs2], cpu_fpr[rs1]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_orc_i32(cpu_dst_32, cpu_src2_32, cpu_src1_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x07c: /* VIS I for */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_or_i32(cpu_fpr[DFPREG(rd)], cpu_fpr[DFPREG(rs1)],
-                                   cpu_fpr[DFPREG(rs2)]);
-                    tcg_gen_or_i32(cpu_fpr[DFPREG(rd) + 1],
-                                   cpu_fpr[DFPREG(rs1) + 1],
-                                   cpu_fpr[DFPREG(rs2) + 1]);
+                    tcg_gen_or_i32(cpu__fpr[DFPREG(rd)],
+                                   cpu__fpr[DFPREG(rs1)],
+                                   cpu__fpr[DFPREG(rs2)]);
+                    tcg_gen_or_i32(cpu__fpr[DFPREG(rd) + 1],
+                                   cpu__fpr[DFPREG(rs1) + 1],
+                                   cpu__fpr[DFPREG(rs2) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x07d: /* VIS I fors */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_or_i32(cpu_fpr[rd], cpu_fpr[rs1], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_or_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x07e: /* VIS I fone */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_movi_i32(cpu_fpr[DFPREG(rd)], -1);
-                    tcg_gen_movi_i32(cpu_fpr[DFPREG(rd) + 1], -1);
+                    tcg_gen_movi_i32(cpu__fpr[DFPREG(rd)], -1);
+                    tcg_gen_movi_i32(cpu__fpr[DFPREG(rd) + 1], -1);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x07f: /* VIS I fones */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_movi_i32(cpu_fpr[rd], -1);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_movi_i32(cpu_dst_32, -1);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x080: /* VIS I shutdown */
@@ -4659,7 +4798,9 @@ static void disas_sparc_insn(DisasContext * dc)
                 case 0x20:      /* ldf, load fpreg */
                     gen_address_mask(dc, cpu_addr);
                     tcg_gen_qemu_ld32u(cpu_tmp0, cpu_addr, dc->mem_idx);
-                    tcg_gen_trunc_tl_i32(cpu_fpr[rd], cpu_tmp0);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_trunc_tl_i32(cpu_dst_32, cpu_tmp0);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x21:      /* ldfsr, V9 ldxfsr */
@@ -4810,7 +4951,8 @@ static void disas_sparc_insn(DisasContext * dc)
                 switch (xop) {
                 case 0x24: /* stf, store fpreg */
                     gen_address_mask(dc, cpu_addr);
-                    tcg_gen_ext_i32_tl(cpu_tmp0, cpu_fpr[rd]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rd);
+                    tcg_gen_ext_i32_tl(cpu_tmp0, cpu_src1_32);
                     tcg_gen_qemu_st32(cpu_tmp0, cpu_addr, dc->mem_idx);
                     break;
                 case 0x25: /* stfsr, V9 stxfsr */
@@ -5242,9 +5384,9 @@ void gen_intermediate_code_init(CPUSPARCState *env)
                                               offsetof(CPUState, gregs[i]),
                                               gregnames[i]);
         for (i = 0; i < TARGET_FPREGS; i++)
-            cpu_fpr[i] = tcg_global_mem_new_i32(TCG_AREG0,
-                                                offsetof(CPUState, fpr[i]),
-                                                fregnames[i]);
+            cpu__fpr[i] = tcg_global_mem_new_i32(TCG_AREG0,
+                                                 offsetof(CPUState, fpr[i]),
+                                                 fregnames[i]);
 
         /* register helpers */
 
-- 
1.7.6.4
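
A note for readers of the hunks above: besides switching the
single-precision cases over to the accessors, the patch replaces the
two-step xori/xor expansion of fxnor/fxnors with tcg_gen_eqv_i32, and it
keeps the swapped operand order for the fandnot1/fornot1 forms. The
sketch below is a plain-C model of that bit logic only, not translator
code. The op_* helpers are hypothetical stand-ins written from the usual
TCG definitions (andc = a & ~b, orc = a | ~b, eqv = ~(a ^ b)), and the
remaining function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical plain-C models of the TCG ops; not QEMU functions. */
static uint32_t op_andc(uint32_t a, uint32_t b) { return a & (uint32_t)~b; }
static uint32_t op_orc(uint32_t a, uint32_t b)  { return a | (uint32_t)~b; }
static uint32_t op_eqv(uint32_t a, uint32_t b)  { return (uint32_t)~(a ^ b); }

/* Old fxnors expansion: xori tmp, rs2, -1 ; xor rd, tmp, rs1. */
static uint32_t old_fxnors(uint32_t rs1, uint32_t rs2)
{
    uint32_t tmp = rs2 ^ UINT32_C(0xffffffff);
    return tmp ^ rs1;
}

/* Operand order exactly as the patch passes it to andc/orc. */
static uint32_t fandnot2s(uint32_t rs1, uint32_t rs2) { return op_andc(rs1, rs2); }
static uint32_t fandnot1s(uint32_t rs1, uint32_t rs2) { return op_andc(rs2, rs1); }
static uint32_t fornot2s(uint32_t rs1, uint32_t rs2)  { return op_orc(rs1, rs2); }
static uint32_t fornot1s(uint32_t rs1, uint32_t rs2)  { return op_orc(rs2, rs1); }

/* Check the identities over a handful of bit patterns. */
static void check_identities(void)
{
    static const uint32_t v[] = {
        0u, 1u, 0x80000000u, 0x0f0f0f0fu, 0xdeadbeefu, 0xffffffffu
    };
    const size_t n = sizeof(v) / sizeof(v[0]);

    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            uint32_t a = v[i], b = v[j];

            /* eqv gives the same result as the old two-op sequence. */
            assert(old_fxnors(a, b) == op_eqv(a, b));
            /* The "1" forms negate rs1, the "2" forms negate rs2. */
            assert(fandnot1s(a, b) == ((uint32_t)~a & b));
            assert(fandnot2s(a, b) == (a & (uint32_t)~b));
            assert(fornot1s(a, b) == ((uint32_t)~a | b));
            assert(fornot2s(a, b) == (a | (uint32_t)~b));
        }
    }
}
```

Calling check_identities() from a test harness exercises all of the
above. It says nothing about the generated host code, only that the
new TCG op is a drop-in replacement at the bit level.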


* [Qemu-devel] [PATCH 02/16] target-sparc: Mark fprs dirty in store accessor.
From: Richard Henderson @ 2011-10-26 21:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-sparc/translate.c |   54 ++++++---------------------------------------
 1 files changed, 8 insertions(+), 46 deletions(-)
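
Since this patch carries no description beyond its subject, here is a
small model of what the change amounts to: gen_store_fpr_F now marks
the FPRS dirty bit itself, so each single-precision case no longer has
to call gen_update_fprs_dirty after the store. The sketch below is
plain C, not TCG. ModelFPU and the model_* functions are hypothetical
names, and the 64-entry register array is an assumption for the sketch.
The constants 1 and 2 are taken from the patch; on SPARC V9 they would
correspond to FPRS.DL and FPRS.DU. The real function is a no-op unless
TARGET_SPARC64 is defined, which the model ignores.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the translator state; not a QEMU type. */
typedef struct {
    uint32_t fpr[64];   /* models cpu__fpr[], size assumed */
    uint32_t fprs;      /* models cpu_fprs */
} ModelFPU;

/* Mirrors the constant used by gen_update_fprs_dirty():
 * 1 for registers below 32, 2 for the upper bank. */
static void model_update_fprs_dirty(ModelFPU *s, unsigned int rd)
{
    s->fprs |= (rd < 32) ? 1u : 2u;
}

/* After this patch the store accessor sets the dirty bit itself. */
static void model_store_fpr_F(ModelFPU *s, unsigned int dst, uint32_t v)
{
    s->fpr[dst] = v;
    model_update_fprs_dirty(s, dst);
}
```

With the dirty update folded in, a caller such as the fmovs case only
needs the load followed by the store. That is why the diffstat below
shows far more deletions than insertions.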

diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index da52ce6..26f7e36 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -114,6 +114,13 @@ static int sign_extend(int x, int len)
 
 #define IS_IMM (insn & (1<<13))
 
+static inline void gen_update_fprs_dirty(int rd)
+{
+#if defined(TARGET_SPARC64)
+    tcg_gen_ori_i32(cpu_fprs, cpu_fprs, (rd < 32) ? 1 : 2);
+#endif
+}
+
 /* floating point registers moves */
 static TCGv_i32 gen_load_fpr_F(DisasContext *dc, unsigned int src)
 {
@@ -123,6 +130,7 @@ static TCGv_i32 gen_load_fpr_F(DisasContext *dc, unsigned int src)
 static void gen_store_fpr_F(DisasContext *dc, unsigned int dst, TCGv_i32 v)
 {
     tcg_gen_mov_i32 (cpu__fpr[dst], v);
+    gen_update_fprs_dirty(dst);
 }
 
 static TCGv_i32 gen_dest_fpr_F(void)
@@ -1585,13 +1593,6 @@ static int gen_trap_ifnofpu(DisasContext *dc, TCGv r_cond)
     return 0;
 }
 
-static inline void gen_update_fprs_dirty(int rd)
-{
-#if defined(TARGET_SPARC64)
-    tcg_gen_ori_i32(cpu_fprs, cpu_fprs, (rd < 32) ? 1 : 2);
-#endif
-}
-
 static inline void gen_op_clear_ieee_excp_and_FTT(void)
 {
     tcg_gen_andi_tl(cpu_fsr, cpu_fsr, FSR_FTT_CEXC_NMASK);
@@ -2387,21 +2388,18 @@ static void disas_sparc_insn(DisasContext * dc)
                 case 0x1: /* fmovs */
                     cpu_src1_32 = gen_load_fpr_F(dc, rs2);
                     gen_store_fpr_F(dc, rd, cpu_src1_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x5: /* fnegs */
                     cpu_src1_32 = gen_load_fpr_F(dc, rs2);
                     cpu_dst_32 = gen_dest_fpr_F();
                     gen_helper_fnegs(cpu_dst_32, cpu_src1_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x9: /* fabss */
                     cpu_src1_32 = gen_load_fpr_F(dc, rs2);
                     cpu_dst_32 = gen_dest_fpr_F();
                     gen_helper_fabss(cpu_dst_32, cpu_src1_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x29: /* fsqrts */
                     CHECK_FPU_FEATURE(dc, FSQRT);
@@ -2411,7 +2409,6 @@ static void disas_sparc_insn(DisasContext * dc)
                     gen_helper_fsqrts(cpu_dst_32, cpu_env, cpu_src1_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x2a: /* fsqrtd */
                     CHECK_FPU_FEATURE(dc, FSQRT);
@@ -2440,7 +2437,6 @@ static void disas_sparc_insn(DisasContext * dc)
                                      cpu_src1_32, cpu_src2_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x42: /* faddd */
                     gen_op_load_fpr_DT0(DFPREG(rs1));
@@ -2470,7 +2466,6 @@ static void disas_sparc_insn(DisasContext * dc)
                                      cpu_src1_32, cpu_src2_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x46: /* fsubd */
                     gen_op_load_fpr_DT0(DFPREG(rs1));
@@ -2501,7 +2496,6 @@ static void disas_sparc_insn(DisasContext * dc)
                                      cpu_src1_32, cpu_src2_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x4a: /* fmuld */
                     CHECK_FPU_FEATURE(dc, FMUL);
@@ -2533,7 +2527,6 @@ static void disas_sparc_insn(DisasContext * dc)
                                      cpu_src1_32, cpu_src2_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x4e: /* fdivd */
                     gen_op_load_fpr_DT0(DFPREG(rs1));
@@ -2581,7 +2574,6 @@ static void disas_sparc_insn(DisasContext * dc)
                     gen_helper_fitos(cpu_dst_32, cpu_env, cpu_src1_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0xc6: /* fdtos */
                     gen_op_load_fpr_DT1(DFPREG(rs2));
@@ -2590,7 +2582,6 @@ static void disas_sparc_insn(DisasContext * dc)
                     gen_helper_fdtos(cpu_dst_32, cpu_env);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0xc7: /* fqtos */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
@@ -2600,7 +2591,6 @@ static void disas_sparc_insn(DisasContext * dc)
                     gen_helper_fqtos(cpu_dst_32, cpu_env);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0xc8: /* fitod */
                     cpu_src1_32 = gen_load_fpr_F(dc, rs2);
@@ -2651,7 +2641,6 @@ static void disas_sparc_insn(DisasContext * dc)
                     gen_helper_fstoi(cpu_dst_32, cpu_env, cpu_src1_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0xd2: /* fdtoi */
                     gen_op_load_fpr_DT1(DFPREG(rs2));
@@ -2660,7 +2649,6 @@ static void disas_sparc_insn(DisasContext * dc)
                     gen_helper_fdtoi(cpu_dst_32, cpu_env);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0xd3: /* fqtoi */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
@@ -2670,7 +2658,6 @@ static void disas_sparc_insn(DisasContext * dc)
                     gen_helper_fqtoi(cpu_dst_32, cpu_env);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
 #ifdef TARGET_SPARC64
                 case 0x2: /* V9 fmovd */
@@ -2750,7 +2737,6 @@ static void disas_sparc_insn(DisasContext * dc)
                     gen_helper_fxtos(cpu_dst_32, cpu_env);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x88: /* V9 fxtod */
                     gen_op_load_fpr_DT1(DFPREG(rs2));
@@ -2795,7 +2781,6 @@ static void disas_sparc_insn(DisasContext * dc)
                                        0, l1);
                     cpu_src1_32 = gen_load_fpr_F(dc, rs2);
                     gen_store_fpr_F(dc, rd, cpu_src1_32);
-                    gen_update_fprs_dirty(rd);
                     gen_set_label(l1);
                     break;
                 } else if ((xop & 0x11f) == 0x006) { // V9 fmovdr
@@ -2844,7 +2829,6 @@ static void disas_sparc_insn(DisasContext * dc)
                                            0, l1);                      \
                         cpu_src1_32 = gen_load_fpr_F(dc, rs2);          \
                         gen_store_fpr_F(dc, rd, cpu_src1_32);           \
-                        gen_update_fprs_dirty(rd);                      \
                         gen_set_label(l1);                              \
                         tcg_temp_free(r_cond);                          \
                     }
@@ -2946,7 +2930,6 @@ static void disas_sparc_insn(DisasContext * dc)
                                            0, l1);                      \
                         cpu_src1_32 = gen_load_fpr_F(dc, rs2);          \
                         gen_store_fpr_F(dc, rd, cpu_src1_32);           \
-                        gen_update_fprs_dirty(rd);                      \
                         gen_set_label(l1);                              \
                         tcg_temp_free(r_cond);                          \
                     }
@@ -4089,7 +4072,6 @@ static void disas_sparc_insn(DisasContext * dc)
                     gen_helper_fpadd16s(cpu_dst_32, cpu_env,
                                         cpu_src1_32, cpu_src2_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x052: /* VIS I fpadd32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4106,7 +4088,6 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_add_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x054: /* VIS I fpsub16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4124,7 +4105,6 @@ static void disas_sparc_insn(DisasContext * dc)
                     gen_helper_fpsub16s(cpu_dst_32, cpu_env,
                                         cpu_src1_32, cpu_src2_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x056: /* VIS I fpsub32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4141,7 +4121,6 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_sub_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x060: /* VIS I fzero */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4154,7 +4133,6 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_movi_i32(cpu_dst_32, 0);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x062: /* VIS I fnor */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4173,7 +4151,6 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_nor_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x064: /* VIS I fandnot2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4192,7 +4169,6 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_andc_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x066: /* VIS I fnot2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4208,7 +4184,6 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_not_i32(cpu_dst_32, cpu_src1_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x068: /* VIS I fandnot1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4227,7 +4202,6 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_andc_i32(cpu_dst_32, cpu_src2_32, cpu_src1_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x06a: /* VIS I fnot1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4243,7 +4217,6 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_not_i32(cpu_dst_32, cpu_src1_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x06c: /* VIS I fxor */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4262,7 +4235,6 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_xor_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x06e: /* VIS I fnand */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4281,7 +4253,6 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_nand_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x070: /* VIS I fand */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4300,7 +4271,6 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_and_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x072: /* VIS I fxnor */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4319,7 +4289,6 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_eqv_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x074: /* VIS I fsrc1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4333,7 +4302,6 @@ static void disas_sparc_insn(DisasContext * dc)
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1_32 = gen_load_fpr_F(dc, rs1);
                     gen_store_fpr_F(dc, rd, cpu_src1_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x076: /* VIS I fornot2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4352,7 +4320,6 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_orc_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x078: /* VIS I fsrc2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4364,7 +4331,6 @@ static void disas_sparc_insn(DisasContext * dc)
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1_32 = gen_load_fpr_F(dc, rs2);
                     gen_store_fpr_F(dc, rd, cpu_src1_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x07a: /* VIS I fornot1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4383,7 +4349,6 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_orc_i32(cpu_dst_32, cpu_src2_32, cpu_src1_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x07c: /* VIS I for */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4402,7 +4367,6 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_or_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x07e: /* VIS I fone */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4415,7 +4379,6 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_movi_i32(cpu_dst_32, -1);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x080: /* VIS I shutdown */
                 case 0x081: /* VIS II siam */
@@ -4801,7 +4764,6 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_trunc_tl_i32(cpu_dst_32, cpu_tmp0);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x21:      /* ldfsr, V9 ldxfsr */
 #ifdef TARGET_SPARC64
-- 
1.7.6.4

* [Qemu-devel] [PATCH 03/16] target-sparc: Add accessors for double-precision fpr access.
  2011-10-26 21:15 [Qemu-devel] [PATCH v2 00/16] Sparc FPU/VIS improvements Richard Henderson
  2011-10-26 21:15 ` [Qemu-devel] [PATCH 01/16] target-sparc: Add accessors for single-precision fpr access Richard Henderson
  2011-10-26 21:15 ` [Qemu-devel] [PATCH 02/16] target-sparc: Mark fprs dirty in store accessor Richard Henderson
@ 2011-10-26 21:15 ` Richard Henderson
  2011-10-26 21:15 ` [Qemu-devel] [PATCH 04/16] target-sparc: Pass float64 parameters instead of dt0/1 temporaries Richard Henderson
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2011-10-26 21:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

Begin using i64 quantities to manipulate double-precision values.
On a 64-bit host this will, for the moment, generate less efficient
code, because each access packs or unpacks a pair of 32-bit registers;
on a 32-bit host code quality should be largely unchanged.
Code quality on 64-bit hosts will be addressed by a subsequent patch.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
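For readers following along, here is a minimal host-side sketch of what
the 64-bit path of the new accessors computes. It is plain C rather than
TCG, and the names load_fpr_d, store_fpr_d, fandnot2_d and NUM_FPR are
made up for illustration; none of them are part of this patch. It assumes
the caller passes an already-even register index, so it does not model
DFPREG().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the guest register file: an array of
   32-bit slots, where a double occupies an even/odd pair. */
#define NUM_FPR 64

/* Same data movement as the 64-bit branch of gen_load_fpr_D: the even
   slot supplies the high 32 bits, the odd slot the low 32 bits. */
static uint64_t load_fpr_d(const uint32_t *fpr, unsigned int src)
{
    assert((src & 1) == 0 && src + 1 < NUM_FPR);
    return ((uint64_t)fpr[src] << 32) | fpr[src + 1];
}

/* Same data movement as the 64-bit branch of gen_store_fpr_D: split the
   64-bit value back into the even (high) and odd (low) slots. */
static void store_fpr_d(uint32_t *fpr, unsigned int dst, uint64_t v)
{
    assert((dst & 1) == 0 && dst + 1 < NUM_FPR);
    fpr[dst + 1] = (uint32_t)v;
    fpr[dst] = (uint32_t)(v >> 32);
}

/* fandnot2 on a register pair as one 64-bit operation: rd = rs1 & ~rs2,
   which is what tcg_gen_andc_i64(dst, src1, src2) produces. */
static void fandnot2_d(uint32_t *fpr, unsigned int rd,
                       unsigned int rs1, unsigned int rs2)
{
    uint64_t a = load_fpr_d(fpr, rs1);
    uint64_t b = load_fpr_d(fpr, rs2);

    store_fpr_d(fpr, rd, a & ~b);
}
```

The point of the sketch is only that one 64-bit bitwise operation on the
packed value gives the same result as the two 32-bit operations on the
halves that the old code emitted.
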
 target-sparc/translate.c |  242 +++++++++++++++++++++++++---------------------
 1 files changed, 130 insertions(+), 112 deletions(-)

diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 26f7e36..937e711 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -82,6 +82,8 @@ typedef struct DisasContext {
     uint32_t cc_op;  /* current CC operation */
     struct TranslationBlock *tb;
     sparc_def_t *def;
+    TCGv_i64 t64[3];
+    int n_t64;
 } DisasContext;
 
 // This function uses non-native bit order
@@ -129,7 +131,7 @@ static TCGv_i32 gen_load_fpr_F(DisasContext *dc, unsigned int src)
 
 static void gen_store_fpr_F(DisasContext *dc, unsigned int dst, TCGv_i32 v)
 {
-    tcg_gen_mov_i32 (cpu__fpr[dst], v);
+    tcg_gen_mov_i32(cpu__fpr[dst], v);
     gen_update_fprs_dirty(dst);
 }
 
@@ -138,6 +140,52 @@ static TCGv_i32 gen_dest_fpr_F(void)
     return cpu_tmp32;
 }
 
+static TCGv_i64 gen_load_fpr_D(DisasContext *dc, unsigned int src)
+{
+    TCGv_i64 ret = tcg_temp_new_i64();
+    src = DFPREG(src);
+
+#if TCG_TARGET_REG_BITS == 32
+    tcg_gen_mov_i32(TCGV_HIGH(ret), cpu__fpr[src]);
+    tcg_gen_mov_i32(TCGV_LOW(ret), cpu__fpr[src + 1]);
+#else
+    {
+        TCGv_i64 t = tcg_temp_new_i64();
+        tcg_gen_extu_i32_i64(ret, cpu__fpr[src]);
+        tcg_gen_extu_i32_i64(t, cpu__fpr[src + 1]);
+        tcg_gen_shli_i64(ret, ret, 32);
+        tcg_gen_or_i64(ret, ret, t);
+        tcg_temp_free_i64(t);
+    }
+#endif
+
+    dc->t64[dc->n_t64++] = ret;
+    assert(dc->n_t64 <= ARRAY_SIZE(dc->t64));
+
+    return ret;
+}
+
+static void gen_store_fpr_D(DisasContext *dc, unsigned int dst, TCGv_i64 v)
+{
+    dst = DFPREG(dst);
+
+#if TCG_TARGET_REG_BITS == 32
+    tcg_gen_mov_i32(cpu__fpr[dst], TCGV_HIGH(v));
+    tcg_gen_mov_i32(cpu__fpr[dst + 1], TCGV_LOW(v));
+#else
+    tcg_gen_trunc_i64_i32(cpu__fpr[dst + 1], v);
+    tcg_gen_shri_i64(v, v, 32);
+    tcg_gen_trunc_i64_i32(cpu__fpr[dst], v);
+#endif
+
+    gen_update_fprs_dirty(dst);
+}
+
+static TCGv_i64 gen_dest_fpr_D(void)
+{
+    return cpu_tmp64;
+}
+
 static void gen_op_load_fpr_DT0(unsigned int src)
 {
     tcg_gen_st_i32(cpu__fpr[src], cpu_env, offsetof(CPUSPARCState, dt0) +
@@ -1909,6 +1957,7 @@ static void disas_sparc_insn(DisasContext * dc)
     unsigned int insn, opc, rs1, rs2, rd;
     TCGv cpu_src1, cpu_src2, cpu_tmp1, cpu_tmp2;
     TCGv_i32 cpu_src1_32, cpu_src2_32, cpu_dst_32;
+    TCGv_i64 cpu_src1_64, cpu_src2_64, cpu_dst_64;
     target_long simm;
 
     if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP)))
@@ -2661,11 +2710,8 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
 #ifdef TARGET_SPARC64
                 case 0x2: /* V9 fmovd */
-                    tcg_gen_mov_i32(cpu__fpr[DFPREG(rd)],
-                                    cpu__fpr[DFPREG(rs2)]);
-                    tcg_gen_mov_i32(cpu__fpr[DFPREG(rd) + 1],
-                                    cpu__fpr[DFPREG(rs2) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
+                    gen_store_fpr_D(dc, rd, cpu_src1_64);
                     break;
                 case 0x3: /* V9 fmovq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
@@ -2791,9 +2837,8 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_src1 = get_src1(insn, cpu_src1);
                     tcg_gen_brcondi_tl(gen_tcg_cond_reg[cond], cpu_src1,
                                        0, l1);
-                    tcg_gen_mov_i32(cpu__fpr[DFPREG(rd)], cpu__fpr[DFPREG(rs2)]);
-                    tcg_gen_mov_i32(cpu__fpr[DFPREG(rd) + 1], cpu__fpr[DFPREG(rs2) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
+                    gen_store_fpr_D(dc, rd, cpu_src1_64);
                     gen_set_label(l1);
                     break;
                 } else if ((xop & 0x11f) == 0x007) { // V9 fmovqr
@@ -2843,11 +2888,8 @@ static void disas_sparc_insn(DisasContext * dc)
                         gen_fcond(r_cond, fcc, cond);                   \
                         tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond,         \
                                            0, l1);                      \
-                        tcg_gen_mov_i32(cpu__fpr[DFPREG(rd)],           \
-                                        cpu__fpr[DFPREG(rs2)]);         \
-                        tcg_gen_mov_i32(cpu__fpr[DFPREG(rd) + 1],       \
-                                        cpu__fpr[DFPREG(rs2) + 1]);     \
-                        gen_update_fprs_dirty(DFPREG(rd));              \
+                        cpu_src1_64 = gen_load_fpr_D(dc, rs2);          \
+                        gen_store_fpr_D(dc, rd, cpu_src1_64);           \
                         gen_set_label(l1);                              \
                         tcg_temp_free(r_cond);                          \
                     }
@@ -2944,10 +2986,8 @@ static void disas_sparc_insn(DisasContext * dc)
                         gen_cond(r_cond, icc, cond, dc);                \
                         tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond,         \
                                            0, l1);                      \
-                        tcg_gen_mov_i32(cpu__fpr[DFPREG(rd)],           \
-                                        cpu__fpr[DFPREG(rs2)]);         \
-                        tcg_gen_mov_i32(cpu__fpr[DFPREG(rd) + 1],       \
-                                        cpu__fpr[DFPREG(rs2) + 1]);     \
+                        cpu_src1_64 = gen_load_fpr_D(dc, rs2);          \
+                        gen_store_fpr_D(dc, rd, cpu_src1_64);           \
                         gen_update_fprs_dirty(DFPREG(rd));              \
                         gen_set_label(l1);                              \
                         tcg_temp_free(r_cond);                          \
@@ -4124,9 +4164,9 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x060: /* VIS I fzero */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_movi_i32(cpu__fpr[DFPREG(rd)], 0);
-                    tcg_gen_movi_i32(cpu__fpr[DFPREG(rd) + 1], 0);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_movi_i64(cpu_dst_64, 0);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x061: /* VIS I fzeros */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4136,13 +4176,11 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x062: /* VIS I fnor */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_nor_i32(cpu__fpr[DFPREG(rd)],
-                                    cpu__fpr[DFPREG(rs1)],
-                                    cpu__fpr[DFPREG(rs2)]);
-                    tcg_gen_nor_i32(cpu__fpr[DFPREG(rd) + 1],
-                                    cpu__fpr[DFPREG(rs1) + 1],
-                                    cpu__fpr[DFPREG(rs2) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_nor_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x063: /* VIS I fnors */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4154,13 +4192,11 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x064: /* VIS I fandnot2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_andc_i32(cpu__fpr[DFPREG(rd)],
-                                     cpu__fpr[DFPREG(rs1)],
-                                     cpu__fpr[DFPREG(rs2)]);
-                    tcg_gen_andc_i32(cpu__fpr[DFPREG(rd) + 1],
-                                     cpu__fpr[DFPREG(rs1) + 1],
-                                     cpu__fpr[DFPREG(rs2) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_andc_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x065: /* VIS I fandnot2s */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4172,11 +4208,10 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x066: /* VIS I fnot2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_not_i32(cpu__fpr[DFPREG(rd)],
-                                    cpu__fpr[DFPREG(rs2)]);
-                    tcg_gen_not_i32(cpu__fpr[DFPREG(rd) + 1],
-                                    cpu__fpr[DFPREG(rs2) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_not_i64(cpu_dst_64, cpu_src1_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x067: /* VIS I fnot2s */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4187,13 +4222,11 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x068: /* VIS I fandnot1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_andc_i32(cpu__fpr[DFPREG(rd)],
-                                     cpu__fpr[DFPREG(rs2)],
-                                     cpu__fpr[DFPREG(rs1)]);
-                    tcg_gen_andc_i32(cpu__fpr[DFPREG(rd) + 1],
-                                     cpu__fpr[DFPREG(rs2) + 1],
-                                     cpu__fpr[DFPREG(rs1) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_andc_i64(cpu_dst_64, cpu_src2_64, cpu_src1_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x069: /* VIS I fandnot1s */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4205,11 +4238,10 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x06a: /* VIS I fnot1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_not_i32(cpu__fpr[DFPREG(rd)],
-                                    cpu__fpr[DFPREG(rs1)]);
-                    tcg_gen_not_i32(cpu__fpr[DFPREG(rd) + 1],
-                                    cpu__fpr[DFPREG(rs1) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_not_i64(cpu_dst_64, cpu_src1_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x06b: /* VIS I fnot1s */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4220,13 +4252,11 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x06c: /* VIS I fxor */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_xor_i32(cpu__fpr[DFPREG(rd)],
-                                    cpu__fpr[DFPREG(rs1)],
-                                    cpu__fpr[DFPREG(rs2)]);
-                    tcg_gen_xor_i32(cpu__fpr[DFPREG(rd) + 1],
-                                    cpu__fpr[DFPREG(rs1) + 1],
-                                    cpu__fpr[DFPREG(rs2) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_xor_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x06d: /* VIS I fxors */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4238,13 +4268,11 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x06e: /* VIS I fnand */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_nand_i32(cpu__fpr[DFPREG(rd)],
-                                     cpu__fpr[DFPREG(rs1)],
-                                     cpu__fpr[DFPREG(rs2)]);
-                    tcg_gen_nand_i32(cpu__fpr[DFPREG(rd) + 1],
-                                     cpu__fpr[DFPREG(rs1) + 1],
-                                     cpu__fpr[DFPREG(rs2) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_nand_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x06f: /* VIS I fnands */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4256,13 +4284,11 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x070: /* VIS I fand */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_and_i32(cpu__fpr[DFPREG(rd)],
-                                    cpu__fpr[DFPREG(rs1)],
-                                    cpu__fpr[DFPREG(rs2)]);
-                    tcg_gen_and_i32(cpu__fpr[DFPREG(rd) + 1],
-                                    cpu__fpr[DFPREG(rs1) + 1],
-                                    cpu__fpr[DFPREG(rs2) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_and_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x071: /* VIS I fands */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4274,13 +4300,11 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x072: /* VIS I fxnor */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_eqv_i32(cpu__fpr[DFPREG(rd)],
-                                    cpu__fpr[DFPREG(rs1)],
-                                    cpu__fpr[DFPREG(rs2)]);
-                    tcg_gen_eqv_i32(cpu__fpr[DFPREG(rd) + 1],
-                                    cpu__fpr[DFPREG(rs1) + 1],
-                                    cpu__fpr[DFPREG(rs2) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_eqv_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x073: /* VIS I fxnors */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4292,11 +4316,8 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x074: /* VIS I fsrc1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_mov_i32(cpu__fpr[DFPREG(rd)],
-                                    cpu__fpr[DFPREG(rs1)]);
-                    tcg_gen_mov_i32(cpu__fpr[DFPREG(rd) + 1],
-                                    cpu__fpr[DFPREG(rs1) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    gen_store_fpr_D(dc, rd, cpu_src1_64);
                     break;
                 case 0x075: /* VIS I fsrc1s */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4305,13 +4326,11 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x076: /* VIS I fornot2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_orc_i32(cpu__fpr[DFPREG(rd)],
-                                    cpu__fpr[DFPREG(rs1)],
-                                    cpu__fpr[DFPREG(rs2)]);
-                    tcg_gen_orc_i32(cpu__fpr[DFPREG(rd) + 1],
-                                    cpu__fpr[DFPREG(rs1) + 1],
-                                    cpu__fpr[DFPREG(rs2) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_orc_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x077: /* VIS I fornot2s */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4323,9 +4342,8 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x078: /* VIS I fsrc2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs2));
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
+                    gen_store_fpr_D(dc, rd, cpu_src1_64);
                     break;
                 case 0x079: /* VIS I fsrc2s */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4334,13 +4352,11 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x07a: /* VIS I fornot1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_orc_i32(cpu__fpr[DFPREG(rd)],
-                                    cpu__fpr[DFPREG(rs2)],
-                                    cpu__fpr[DFPREG(rs1)]);
-                    tcg_gen_orc_i32(cpu__fpr[DFPREG(rd) + 1],
-                                    cpu__fpr[DFPREG(rs2) + 1],
-                                    cpu__fpr[DFPREG(rs1) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_orc_i64(cpu_dst_64, cpu_src2_64, cpu_src1_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x07b: /* VIS I fornot1s */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4352,13 +4368,11 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x07c: /* VIS I for */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_or_i32(cpu__fpr[DFPREG(rd)],
-                                   cpu__fpr[DFPREG(rs1)],
-                                   cpu__fpr[DFPREG(rs2)]);
-                    tcg_gen_or_i32(cpu__fpr[DFPREG(rd) + 1],
-                                   cpu__fpr[DFPREG(rs1) + 1],
-                                   cpu__fpr[DFPREG(rs2) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_or_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x07d: /* VIS I fors */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4370,9 +4384,9 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x07e: /* VIS I fone */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_movi_i32(cpu__fpr[DFPREG(rd)], -1);
-                    tcg_gen_movi_i32(cpu__fpr[DFPREG(rd) + 1], -1);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_movi_i64(cpu_dst_64, -1);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x07f: /* VIS I fones */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -5199,6 +5213,10 @@ static inline void gen_intermediate_code_internal(TranslationBlock * tb,
     tcg_temp_free_i64(cpu_tmp64);
     tcg_temp_free_i32(cpu_tmp32);
     tcg_temp_free(cpu_tmp0);
+    for (j = dc->n_t64 - 1; j >= 0; --j) {
+        tcg_temp_free_i64(dc->t64[j]);
+    }
+
     if (tb->cflags & CF_LAST_IO)
         gen_io_end();
     if (!dc->is_br) {
-- 
1.7.6.4

* [Qemu-devel] [PATCH 04/16] target-sparc: Pass float64 parameters instead of dt0/1 temporaries.
  2011-10-26 21:15 [Qemu-devel] [PATCH v2 00/16] Sparc FPU/VIS improvements Richard Henderson
                   ` (2 preceding siblings ...)
  2011-10-26 21:15 ` [Qemu-devel] [PATCH 03/16] target-sparc: Add accessors for double-precision fpr access Richard Henderson
@ 2011-10-26 21:15 ` Richard Henderson
  2011-10-26 21:15 ` [Qemu-devel] [PATCH 05/16] target-sparc: Make FPU/VIS helpers const when possible Richard Henderson
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2011-10-26 21:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
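Note on the calling convention: the subject line is the only description
of this change, so here is a minimal sketch of the idea in plain C. It is
not QEMU code. SketchEnv, f64_t and the sketch_* functions are hypothetical
stand-ins; the real helpers take CPUState and softfloat's float64, and pass
&env->fp_status to the softfloat routines. The first function mimics the
old DT0/DT1 style, where operands and the result travel through scratch
slots in env. The second mimics the new style, where operands are
parameters and the result is the return value.

```c
/* Hypothetical stand-ins for illustration only; not QEMU's real types. */
typedef double f64_t;

typedef struct SketchEnv {
    f64_t dt0, dt1;   /* scratch slots of the kind this patch removes */
    int fp_status;    /* placeholder for the real float_status */
} SketchEnv;

/* Old convention: the caller stores both operands into env first,
 * the helper overwrites dt0, and the caller reads dt0 back out. */
static void sketch_faddd_old(SketchEnv *env)
{
    env->dt0 = env->dt0 + env->dt1;
}

/* New convention: operands arrive as arguments and the result is
 * returned, so no round trip through env is needed. */
static f64_t sketch_faddd_new(SketchEnv *env, f64_t src1, f64_t src2)
{
    (void)env;  /* the real helper would use env for rounding/exception state */
    return src1 + src2;
}

/* Both conventions compute the same value; only the data path differs. */
static int sketch_conventions_agree(f64_t a, f64_t b)
{
    SketchEnv env = { a, b, 0 };
    sketch_faddd_old(&env);
    return env.dt0 == sketch_faddd_new(&env, a, b);
}
```

In the translator this corresponds to replacing the gen_op_load_fpr_DT0 /
gen_op_store_DT0_fpr pairs with gen_load_fpr_D, a helper call that takes
the 64-bit values directly, and gen_store_fpr_D.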
 target-sparc/cpu.h         |    1 -
 target-sparc/fop_helper.c  |  120 ++++++------
 target-sparc/helper.h      |   95 +++++-----
 target-sparc/ldst_helper.c |   52 -----
 target-sparc/translate.c   |  449 ++++++++++++++++++++++----------------------
 target-sparc/vis_helper.c  |  113 ++++++------
 6 files changed, 381 insertions(+), 449 deletions(-)

diff --git a/target-sparc/cpu.h b/target-sparc/cpu.h
index 25b4f1a..4eace33 100644
--- a/target-sparc/cpu.h
+++ b/target-sparc/cpu.h
@@ -463,7 +463,6 @@ typedef struct CPUSPARCState {
     uint64_t prom_addr;
 #endif
     /* temporary float registers */
-    float64 dt0, dt1;
     float128 qt0, qt1;
     float_status fp_status;
 #if defined(TARGET_SPARC64)
diff --git a/target-sparc/fop_helper.c b/target-sparc/fop_helper.c
index 23502f3..f6348c2 100644
--- a/target-sparc/fop_helper.c
+++ b/target-sparc/fop_helper.c
@@ -20,8 +20,6 @@
 #include "cpu.h"
 #include "helper.h"
 
-#define DT0 (env->dt0)
-#define DT1 (env->dt1)
 #define QT0 (env->qt0)
 #define QT1 (env->qt1)
 
@@ -33,9 +31,10 @@
     {                                                           \
         return float32_ ## name (src1, src2, &env->fp_status);  \
     }                                                           \
-    F_HELPER(name, d)                                           \
+    float64 helper_f ## name ## d (CPUState * env, float64 src1,\
+                                   float64 src2)                \
     {                                                           \
-        DT0 = float64_ ## name (DT0, DT1, &env->fp_status);     \
+        return float64_ ## name (src1, src2, &env->fp_status);  \
     }                                                           \
     F_HELPER(name, q)                                           \
     {                                                           \
@@ -48,17 +47,17 @@ F_BINOP(mul);
 F_BINOP(div);
 #undef F_BINOP
 
-void helper_fsmuld(CPUState *env, float32 src1, float32 src2)
+float64 helper_fsmuld(CPUState *env, float32 src1, float32 src2)
 {
-    DT0 = float64_mul(float32_to_float64(src1, &env->fp_status),
-                      float32_to_float64(src2, &env->fp_status),
-                      &env->fp_status);
+    return float64_mul(float32_to_float64(src1, &env->fp_status),
+                       float32_to_float64(src2, &env->fp_status),
+                       &env->fp_status);
 }
 
-void helper_fdmulq(CPUState *env)
+void helper_fdmulq(CPUState *env, float64 src1, float64 src2)
 {
-    QT0 = float128_mul(float64_to_float128(DT0, &env->fp_status),
-                       float64_to_float128(DT1, &env->fp_status),
+    QT0 = float128_mul(float64_to_float128(src1, &env->fp_status),
+                       float64_to_float128(src2, &env->fp_status),
                        &env->fp_status);
 }
 
@@ -68,9 +67,9 @@ float32 helper_fnegs(float32 src)
 }
 
 #ifdef TARGET_SPARC64
-F_HELPER(neg, d)
+float64 helper_fnegd(float64 src)
 {
-    DT0 = float64_chs(DT1);
+    return float64_chs(src);
 }
 
 F_HELPER(neg, q)
@@ -85,9 +84,9 @@ float32 helper_fitos(CPUState *env, int32_t src)
     return int32_to_float32(src, &env->fp_status);
 }
 
-void helper_fitod(CPUState *env, int32_t src)
+float64 helper_fitod(CPUState *env, int32_t src)
 {
-    DT0 = int32_to_float64(src, &env->fp_status);
+    return int32_to_float64(src, &env->fp_status);
 }
 
 void helper_fitoq(CPUState *env, int32_t src)
@@ -96,32 +95,32 @@ void helper_fitoq(CPUState *env, int32_t src)
 }
 
 #ifdef TARGET_SPARC64
-float32 helper_fxtos(CPUState *env)
+float32 helper_fxtos(CPUState *env, int64_t src)
 {
-    return int64_to_float32(*((int64_t *)&DT1), &env->fp_status);
+    return int64_to_float32(src, &env->fp_status);
 }
 
-F_HELPER(xto, d)
+float64 helper_fxtod(CPUState *env, int64_t src)
 {
-    DT0 = int64_to_float64(*((int64_t *)&DT1), &env->fp_status);
+    return int64_to_float64(src, &env->fp_status);
 }
 
-F_HELPER(xto, q)
+void helper_fxtoq(CPUState *env, int64_t src)
 {
-    QT0 = int64_to_float128(*((int64_t *)&DT1), &env->fp_status);
+    QT0 = int64_to_float128(src, &env->fp_status);
 }
 #endif
 #undef F_HELPER
 
 /* floating point conversion */
-float32 helper_fdtos(CPUState *env)
+float32 helper_fdtos(CPUState *env, float64 src)
 {
-    return float64_to_float32(DT1, &env->fp_status);
+    return float64_to_float32(src, &env->fp_status);
 }
 
-void helper_fstod(CPUState *env, float32 src)
+float64 helper_fstod(CPUState *env, float32 src)
 {
-    DT0 = float32_to_float64(src, &env->fp_status);
+    return float32_to_float64(src, &env->fp_status);
 }
 
 float32 helper_fqtos(CPUState *env)
@@ -134,14 +133,14 @@ void helper_fstoq(CPUState *env, float32 src)
     QT0 = float32_to_float128(src, &env->fp_status);
 }
 
-void helper_fqtod(CPUState *env)
+float64 helper_fqtod(CPUState *env)
 {
-    DT0 = float128_to_float64(QT1, &env->fp_status);
+    return float128_to_float64(QT1, &env->fp_status);
 }
 
-void helper_fdtoq(CPUState *env)
+void helper_fdtoq(CPUState *env, float64 src)
 {
-    QT0 = float64_to_float128(DT1, &env->fp_status);
+    QT0 = float64_to_float128(src, &env->fp_status);
 }
 
 /* Float to integer conversion.  */
@@ -150,9 +149,9 @@ int32_t helper_fstoi(CPUState *env, float32 src)
     return float32_to_int32_round_to_zero(src, &env->fp_status);
 }
 
-int32_t helper_fdtoi(CPUState *env)
+int32_t helper_fdtoi(CPUState *env, float64 src)
 {
-    return float64_to_int32_round_to_zero(DT1, &env->fp_status);
+    return float64_to_int32_round_to_zero(src, &env->fp_status);
 }
 
 int32_t helper_fqtoi(CPUState *env)
@@ -161,19 +160,19 @@ int32_t helper_fqtoi(CPUState *env)
 }
 
 #ifdef TARGET_SPARC64
-void helper_fstox(CPUState *env, float32 src)
+int64_t helper_fstox(CPUState *env, float32 src)
 {
-    *((int64_t *)&DT0) = float32_to_int64_round_to_zero(src, &env->fp_status);
+    return float32_to_int64_round_to_zero(src, &env->fp_status);
 }
 
-void helper_fdtox(CPUState *env)
+int64_t helper_fdtox(CPUState *env, float64 src)
 {
-    *((int64_t *)&DT0) = float64_to_int64_round_to_zero(DT1, &env->fp_status);
+    return float64_to_int64_round_to_zero(src, &env->fp_status);
 }
 
-void helper_fqtox(CPUState *env)
+int64_t helper_fqtox(CPUState *env)
 {
-    *((int64_t *)&DT0) = float128_to_int64_round_to_zero(QT1, &env->fp_status);
+    return float128_to_int64_round_to_zero(QT1, &env->fp_status);
 }
 #endif
 
@@ -183,9 +182,9 @@ float32 helper_fabss(float32 src)
 }
 
 #ifdef TARGET_SPARC64
-void helper_fabsd(CPUState *env)
+float64 helper_fabsd(CPUState *env, float64 src)
 {
-    DT0 = float64_abs(DT1);
+    return float64_abs(src);
 }
 
 void helper_fabsq(CPUState *env)
@@ -199,9 +198,9 @@ float32 helper_fsqrts(CPUState *env, float32 src)
     return float32_sqrt(src, &env->fp_status);
 }
 
-void helper_fsqrtd(CPUState *env)
+float64 helper_fsqrtd(CPUState *env, float64 src)
 {
-    DT0 = float64_sqrt(DT1, &env->fp_status);
+    return float64_sqrt(src, &env->fp_status);
 }
 
 void helper_fsqrtq(CPUState *env)
@@ -245,8 +244,8 @@ void helper_fsqrtq(CPUState *env)
             break;                                                      \
         }                                                               \
     }
-#define GEN_FCMPS(name, size, FS, E)                                    \
-    void glue(helper_, name)(CPUState *env, float32 src1, float32 src2) \
+#define GEN_FCMP_T(name, size, FS, E)                                   \
+    void glue(helper_, name)(CPUState *env, size src1, size src2)       \
     {                                                                   \
         env->fsr &= FSR_FTT_NMASK;                                      \
         if (E && (glue(size, _is_any_nan)(src1) ||                      \
@@ -282,41 +281,42 @@ void helper_fsqrtq(CPUState *env)
         }                                                               \
     }
 
-GEN_FCMPS(fcmps, float32, 0, 0);
-GEN_FCMP(fcmpd, float64, DT0, DT1, 0, 0);
+GEN_FCMP_T(fcmps, float32, 0, 0);
+GEN_FCMP_T(fcmpd, float64, 0, 0);
 
-GEN_FCMPS(fcmpes, float32, 0, 1);
-GEN_FCMP(fcmped, float64, DT0, DT1, 0, 1);
+GEN_FCMP_T(fcmpes, float32, 0, 1);
+GEN_FCMP_T(fcmped, float64, 0, 1);
 
 GEN_FCMP(fcmpq, float128, QT0, QT1, 0, 0);
 GEN_FCMP(fcmpeq, float128, QT0, QT1, 0, 1);
 
 #ifdef TARGET_SPARC64
-GEN_FCMPS(fcmps_fcc1, float32, 22, 0);
-GEN_FCMP(fcmpd_fcc1, float64, DT0, DT1, 22, 0);
+GEN_FCMP_T(fcmps_fcc1, float32, 22, 0);
+GEN_FCMP_T(fcmpd_fcc1, float64, 22, 0);
 GEN_FCMP(fcmpq_fcc1, float128, QT0, QT1, 22, 0);
 
-GEN_FCMPS(fcmps_fcc2, float32, 24, 0);
-GEN_FCMP(fcmpd_fcc2, float64, DT0, DT1, 24, 0);
+GEN_FCMP_T(fcmps_fcc2, float32, 24, 0);
+GEN_FCMP_T(fcmpd_fcc2, float64, 24, 0);
 GEN_FCMP(fcmpq_fcc2, float128, QT0, QT1, 24, 0);
 
-GEN_FCMPS(fcmps_fcc3, float32, 26, 0);
-GEN_FCMP(fcmpd_fcc3, float64, DT0, DT1, 26, 0);
+GEN_FCMP_T(fcmps_fcc3, float32, 26, 0);
+GEN_FCMP_T(fcmpd_fcc3, float64, 26, 0);
 GEN_FCMP(fcmpq_fcc3, float128, QT0, QT1, 26, 0);
 
-GEN_FCMPS(fcmpes_fcc1, float32, 22, 1);
-GEN_FCMP(fcmped_fcc1, float64, DT0, DT1, 22, 1);
+GEN_FCMP_T(fcmpes_fcc1, float32, 22, 1);
+GEN_FCMP_T(fcmped_fcc1, float64, 22, 1);
 GEN_FCMP(fcmpeq_fcc1, float128, QT0, QT1, 22, 1);
 
-GEN_FCMPS(fcmpes_fcc2, float32, 24, 1);
-GEN_FCMP(fcmped_fcc2, float64, DT0, DT1, 24, 1);
+GEN_FCMP_T(fcmpes_fcc2, float32, 24, 1);
+GEN_FCMP_T(fcmped_fcc2, float64, 24, 1);
 GEN_FCMP(fcmpeq_fcc2, float128, QT0, QT1, 24, 1);
 
-GEN_FCMPS(fcmpes_fcc3, float32, 26, 1);
-GEN_FCMP(fcmped_fcc3, float64, DT0, DT1, 26, 1);
+GEN_FCMP_T(fcmpes_fcc3, float32, 26, 1);
+GEN_FCMP_T(fcmped_fcc3, float64, 26, 1);
 GEN_FCMP(fcmpeq_fcc3, float128, QT0, QT1, 26, 1);
 #endif
-#undef GEN_FCMPS
+#undef GEN_FCMP_T
+#undef GEN_FCMP
 
 void helper_check_ieee_exceptions(CPUState *env)
 {
diff --git a/target-sparc/helper.h b/target-sparc/helper.h
index 615ddef..86fad6e 100644
--- a/target-sparc/helper.h
+++ b/target-sparc/helper.h
@@ -39,8 +39,6 @@ DEF_HELPER_3(udiv, tl, env, tl, tl)
 DEF_HELPER_3(udiv_cc, tl, env, tl, tl)
 DEF_HELPER_3(sdiv, tl, env, tl, tl)
 DEF_HELPER_3(sdiv_cc, tl, env, tl, tl)
-DEF_HELPER_2(stdf, void, tl, int)
-DEF_HELPER_2(lddf, void, tl, int)
 DEF_HELPER_2(ldqf, void, tl, int)
 DEF_HELPER_2(stqf, void, tl, int)
 #if !defined(CONFIG_USER_ONLY) || defined(TARGET_SPARC64)
@@ -52,29 +50,29 @@ DEF_HELPER_1(check_ieee_exceptions, void, env)
 DEF_HELPER_1(clear_float_exceptions, void, env)
 DEF_HELPER_1(fabss, f32, f32)
 DEF_HELPER_2(fsqrts, f32, env, f32)
-DEF_HELPER_1(fsqrtd, void, env)
+DEF_HELPER_2(fsqrtd, f64, env, f64)
 DEF_HELPER_3(fcmps, void, env, f32, f32)
-DEF_HELPER_1(fcmpd, void, env)
+DEF_HELPER_3(fcmpd, void, env, f64, f64)
 DEF_HELPER_3(fcmpes, void, env, f32, f32)
-DEF_HELPER_1(fcmped, void, env)
+DEF_HELPER_3(fcmped, void, env, f64, f64)
 DEF_HELPER_1(fsqrtq, void, env)
 DEF_HELPER_1(fcmpq, void, env)
 DEF_HELPER_1(fcmpeq, void, env)
 #ifdef TARGET_SPARC64
 DEF_HELPER_2(ldxfsr, void, env, i64)
-DEF_HELPER_1(fabsd, void, env)
+DEF_HELPER_2(fabsd, f64, env, f64)
 DEF_HELPER_3(fcmps_fcc1, void, env, f32, f32)
 DEF_HELPER_3(fcmps_fcc2, void, env, f32, f32)
 DEF_HELPER_3(fcmps_fcc3, void, env, f32, f32)
-DEF_HELPER_1(fcmpd_fcc1, void, env)
-DEF_HELPER_1(fcmpd_fcc2, void, env)
-DEF_HELPER_1(fcmpd_fcc3, void, env)
+DEF_HELPER_3(fcmpd_fcc1, void, env, f64, f64)
+DEF_HELPER_3(fcmpd_fcc2, void, env, f64, f64)
+DEF_HELPER_3(fcmpd_fcc3, void, env, f64, f64)
 DEF_HELPER_3(fcmpes_fcc1, void, env, f32, f32)
 DEF_HELPER_3(fcmpes_fcc2, void, env, f32, f32)
 DEF_HELPER_3(fcmpes_fcc3, void, env, f32, f32)
-DEF_HELPER_1(fcmped_fcc1, void, env)
-DEF_HELPER_1(fcmped_fcc2, void, env)
-DEF_HELPER_1(fcmped_fcc3, void, env)
+DEF_HELPER_3(fcmped_fcc1, void, env, f64, f64)
+DEF_HELPER_3(fcmped_fcc2, void, env, f64, f64)
+DEF_HELPER_3(fcmped_fcc3, void, env, f64, f64)
 DEF_HELPER_1(fabsq, void, env)
 DEF_HELPER_1(fcmpq_fcc1, void, env)
 DEF_HELPER_1(fcmpq_fcc2, void, env)
@@ -86,77 +84,78 @@ DEF_HELPER_1(fcmpeq_fcc3, void, env)
 DEF_HELPER_2(raise_exception, void, env, int)
 DEF_HELPER_0(shutdown, void)
 #define F_HELPER_0_1(name) DEF_HELPER_1(f ## name, void, env)
-#define F_HELPER_DQ_0_1(name)                   \
-    F_HELPER_0_1(name ## d);                    \
-    F_HELPER_0_1(name ## q)
 
-F_HELPER_DQ_0_1(add);
-F_HELPER_DQ_0_1(sub);
-F_HELPER_DQ_0_1(mul);
-F_HELPER_DQ_0_1(div);
+DEF_HELPER_3(faddd, f64, env, f64, f64)
+DEF_HELPER_3(fsubd, f64, env, f64, f64)
+DEF_HELPER_3(fmuld, f64, env, f64, f64)
+DEF_HELPER_3(fdivd, f64, env, f64, f64)
+F_HELPER_0_1(addq)
+F_HELPER_0_1(subq)
+F_HELPER_0_1(mulq)
+F_HELPER_0_1(divq)
 
 DEF_HELPER_3(fadds, f32, env, f32, f32)
 DEF_HELPER_3(fsubs, f32, env, f32, f32)
 DEF_HELPER_3(fmuls, f32, env, f32, f32)
 DEF_HELPER_3(fdivs, f32, env, f32, f32)
 
-DEF_HELPER_3(fsmuld, void, env, f32, f32)
-F_HELPER_0_1(dmulq);
+DEF_HELPER_3(fsmuld, f64, env, f32, f32)
+DEF_HELPER_3(fdmulq, void, env, f64, f64);
 
 DEF_HELPER_1(fnegs, f32, f32)
-DEF_HELPER_2(fitod, void, env, s32)
+DEF_HELPER_2(fitod, f64, env, s32)
 DEF_HELPER_2(fitoq, void, env, s32)
 
 DEF_HELPER_2(fitos, f32, env, s32)
 
 #ifdef TARGET_SPARC64
-DEF_HELPER_1(fnegd, void, env)
+DEF_HELPER_1(fnegd, f64, f64)
 DEF_HELPER_1(fnegq, void, env)
-DEF_HELPER_1(fxtos, i32, env)
-F_HELPER_DQ_0_1(xto);
+DEF_HELPER_2(fxtos, f32, env, s64)
+DEF_HELPER_2(fxtod, f64, env, s64)
+DEF_HELPER_2(fxtoq, void, env, s64)
 #endif
-DEF_HELPER_1(fdtos, f32, env)
-DEF_HELPER_2(fstod, void, env, f32)
+DEF_HELPER_2(fdtos, f32, env, f64)
+DEF_HELPER_2(fstod, f64, env, f32)
 DEF_HELPER_1(fqtos, f32, env)
 DEF_HELPER_2(fstoq, void, env, f32)
-F_HELPER_0_1(qtod);
-F_HELPER_0_1(dtoq);
+DEF_HELPER_1(fqtod, f64, env)
+DEF_HELPER_2(fdtoq, void, env, f64)
 DEF_HELPER_2(fstoi, s32, env, f32)
-DEF_HELPER_1(fdtoi, s32, env)
+DEF_HELPER_2(fdtoi, s32, env, f64)
 DEF_HELPER_1(fqtoi, s32, env)
 #ifdef TARGET_SPARC64
-DEF_HELPER_2(fstox, void, env, i32)
-F_HELPER_0_1(dtox);
-F_HELPER_0_1(qtox);
-F_HELPER_0_1(aligndata);
+DEF_HELPER_2(fstox, s64, env, f32)
+DEF_HELPER_2(fdtox, s64, env, f64)
+DEF_HELPER_1(fqtox, s64, env)
+DEF_HELPER_3(faligndata, i64, env, i64, i64)
 
-F_HELPER_0_1(pmerge);
-F_HELPER_0_1(mul8x16);
-F_HELPER_0_1(mul8x16al);
-F_HELPER_0_1(mul8x16au);
-F_HELPER_0_1(mul8sux16);
-F_HELPER_0_1(mul8ulx16);
-F_HELPER_0_1(muld8sux16);
-F_HELPER_0_1(muld8ulx16);
-F_HELPER_0_1(expand);
+DEF_HELPER_3(fpmerge, i64, env, i64, i64)
+DEF_HELPER_3(fmul8x16, i64, env, i64, i64)
+DEF_HELPER_3(fmul8x16al, i64, env, i64, i64)
+DEF_HELPER_3(fmul8x16au, i64, env, i64, i64)
+DEF_HELPER_3(fmul8sux16, i64, env, i64, i64)
+DEF_HELPER_3(fmul8ulx16, i64, env, i64, i64)
+DEF_HELPER_3(fmuld8sux16, i64, env, i64, i64)
+DEF_HELPER_3(fmuld8ulx16, i64, env, i64, i64)
+DEF_HELPER_3(fexpand, i64, env, i64, i64)
 #define VIS_HELPER(name)                                 \
-    F_HELPER_0_1(name##16);                              \
+    DEF_HELPER_3(f ## name ## 16, i64, env, i64, i64)    \
     DEF_HELPER_3(f ## name ## 16s, i32, env, i32, i32)   \
-    F_HELPER_0_1(name##32);                              \
+    DEF_HELPER_3(f ## name ## 32, i64, env, i64, i64)    \
     DEF_HELPER_3(f ## name ## 32s, i32, env, i32, i32)
 
 VIS_HELPER(padd);
 VIS_HELPER(psub);
 #define VIS_CMPHELPER(name)                              \
-    DEF_HELPER_1(f##name##16, i64, env);                 \
-    DEF_HELPER_1(f##name##32, i64, env)
+    DEF_HELPER_3(f##name##16, i64, env, i64, i64)        \
+    DEF_HELPER_3(f##name##32, i64, env, i64, i64)
 VIS_CMPHELPER(cmpgt);
 VIS_CMPHELPER(cmpeq);
 VIS_CMPHELPER(cmple);
 VIS_CMPHELPER(cmpne);
 #endif
 #undef F_HELPER_0_1
-#undef F_HELPER_DQ_0_1
 #undef VIS_HELPER
 #undef VIS_CMPHELPER
 DEF_HELPER_1(compute_psr, void, env);
diff --git a/target-sparc/ldst_helper.c b/target-sparc/ldst_helper.c
index 1fb3996..80e5408 100644
--- a/target-sparc/ldst_helper.c
+++ b/target-sparc/ldst_helper.c
@@ -66,8 +66,6 @@
 #endif
 #endif
 
-#define DT0 (env->dt0)
-#define DT1 (env->dt1)
 #define QT0 (env->qt0)
 #define QT1 (env->qt1)
 
@@ -2214,56 +2212,6 @@ target_ulong helper_casx_asi(target_ulong addr, target_ulong val1,
 }
 #endif /* TARGET_SPARC64 */
 
-void helper_stdf(target_ulong addr, int mem_idx)
-{
-    helper_check_align(addr, 7);
-#if !defined(CONFIG_USER_ONLY)
-    switch (mem_idx) {
-    case MMU_USER_IDX:
-        stfq_user(addr, DT0);
-        break;
-    case MMU_KERNEL_IDX:
-        stfq_kernel(addr, DT0);
-        break;
-#ifdef TARGET_SPARC64
-    case MMU_HYPV_IDX:
-        stfq_hypv(addr, DT0);
-        break;
-#endif
-    default:
-        DPRINTF_MMU("helper_stdf: need to check MMU idx %d\n", mem_idx);
-        break;
-    }
-#else
-    stfq_raw(address_mask(env, addr), DT0);
-#endif
-}
-
-void helper_lddf(target_ulong addr, int mem_idx)
-{
-    helper_check_align(addr, 7);
-#if !defined(CONFIG_USER_ONLY)
-    switch (mem_idx) {
-    case MMU_USER_IDX:
-        DT0 = ldfq_user(addr);
-        break;
-    case MMU_KERNEL_IDX:
-        DT0 = ldfq_kernel(addr);
-        break;
-#ifdef TARGET_SPARC64
-    case MMU_HYPV_IDX:
-        DT0 = ldfq_hypv(addr);
-        break;
-#endif
-    default:
-        DPRINTF_MMU("helper_lddf: need to check MMU idx %d\n", mem_idx);
-        break;
-    }
-#else
-    DT0 = ldfq_raw(address_mask(env, addr));
-#endif
-}
-
 void helper_ldqf(target_ulong addr, int mem_idx)
 {
     /* XXX add 128 bit load */
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 937e711..f41ef98 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -186,30 +186,6 @@ static TCGv_i64 gen_dest_fpr_D(void)
     return cpu_tmp64;
 }
 
-static void gen_op_load_fpr_DT0(unsigned int src)
-{
-    tcg_gen_st_i32(cpu__fpr[src], cpu_env, offsetof(CPUSPARCState, dt0) +
-                   offsetof(CPU_DoubleU, l.upper));
-    tcg_gen_st_i32(cpu__fpr[src + 1], cpu_env, offsetof(CPUSPARCState, dt0) +
-                   offsetof(CPU_DoubleU, l.lower));
-}
-
-static void gen_op_load_fpr_DT1(unsigned int src)
-{
-    tcg_gen_st_i32(cpu__fpr[src], cpu_env, offsetof(CPUSPARCState, dt1) +
-                   offsetof(CPU_DoubleU, l.upper));
-    tcg_gen_st_i32(cpu__fpr[src + 1], cpu_env, offsetof(CPUSPARCState, dt1) +
-                   offsetof(CPU_DoubleU, l.lower));
-}
-
-static void gen_op_store_DT0_fpr(unsigned int dst)
-{
-    tcg_gen_ld_i32(cpu__fpr[dst], cpu_env, offsetof(CPUSPARCState, dt0) +
-                   offsetof(CPU_DoubleU, l.upper));
-    tcg_gen_ld_i32(cpu__fpr[dst + 1], cpu_env, offsetof(CPUSPARCState, dt0) +
-                   offsetof(CPU_DoubleU, l.lower));
-}
-
 static void gen_op_load_fpr_QT0(unsigned int src)
 {
     tcg_gen_st_i32(cpu__fpr[src], cpu_env, offsetof(CPUSPARCState, qt0) +
@@ -1490,20 +1466,20 @@ static inline void gen_op_fcmps(int fccno, TCGv_i32 r_rs1, TCGv_i32 r_rs2)
     }
 }
 
-static inline void gen_op_fcmpd(int fccno)
+static inline void gen_op_fcmpd(int fccno, TCGv_i64 r_rs1, TCGv_i64 r_rs2)
 {
     switch (fccno) {
     case 0:
-        gen_helper_fcmpd(cpu_env);
+        gen_helper_fcmpd(cpu_env, r_rs1, r_rs2);
         break;
     case 1:
-        gen_helper_fcmpd_fcc1(cpu_env);
+        gen_helper_fcmpd_fcc1(cpu_env, r_rs1, r_rs2);
         break;
     case 2:
-        gen_helper_fcmpd_fcc2(cpu_env);
+        gen_helper_fcmpd_fcc2(cpu_env, r_rs1, r_rs2);
         break;
     case 3:
-        gen_helper_fcmpd_fcc3(cpu_env);
+        gen_helper_fcmpd_fcc3(cpu_env, r_rs1, r_rs2);
         break;
     }
 }
@@ -1544,20 +1520,20 @@ static inline void gen_op_fcmpes(int fccno, TCGv_i32 r_rs1, TCGv_i32 r_rs2)
     }
 }
 
-static inline void gen_op_fcmped(int fccno)
+static inline void gen_op_fcmped(int fccno, TCGv_i64 r_rs1, TCGv_i64 r_rs2)
 {
     switch (fccno) {
     case 0:
-        gen_helper_fcmped(cpu_env);
+        gen_helper_fcmped(cpu_env, r_rs1, r_rs2);
         break;
     case 1:
-        gen_helper_fcmped_fcc1(cpu_env);
+        gen_helper_fcmped_fcc1(cpu_env, r_rs1, r_rs2);
         break;
     case 2:
-        gen_helper_fcmped_fcc2(cpu_env);
+        gen_helper_fcmped_fcc2(cpu_env, r_rs1, r_rs2);
         break;
     case 3:
-        gen_helper_fcmped_fcc3(cpu_env);
+        gen_helper_fcmped_fcc3(cpu_env, r_rs1, r_rs2);
         break;
     }
 }
@@ -1587,9 +1563,9 @@ static inline void gen_op_fcmps(int fccno, TCGv r_rs1, TCGv r_rs2)
     gen_helper_fcmps(cpu_env, r_rs1, r_rs2);
 }
 
-static inline void gen_op_fcmpd(int fccno)
+static inline void gen_op_fcmpd(int fccno, TCGv_i64 r_rs1, TCGv_i64 r_rs2)
 {
-    gen_helper_fcmpd(cpu_env);
+    gen_helper_fcmpd(cpu_env, r_rs1, r_rs2);
 }
 
 static inline void gen_op_fcmpq(int fccno)
@@ -1602,9 +1578,9 @@ static inline void gen_op_fcmpes(int fccno, TCGv r_rs1, TCGv r_rs2)
     gen_helper_fcmpes(cpu_env, r_rs1, r_rs2);
 }
 
-static inline void gen_op_fcmped(int fccno)
+static inline void gen_op_fcmped(int fccno, TCGv_i64 r_rs1, TCGv_i64 r_rs2)
 {
-    gen_helper_fcmped(cpu_env);
+    gen_helper_fcmped(cpu_env, r_rs1, r_rs2);
 }
 
 static inline void gen_op_fcmpeq(int fccno)
@@ -2461,12 +2437,12 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x2a: /* fsqrtd */
                     CHECK_FPU_FEATURE(dc, FSQRT);
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fsqrtd(cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fsqrtd(cpu_dst_64, cpu_env, cpu_src1_64);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x2b: /* fsqrtq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
@@ -2488,13 +2464,14 @@ static void disas_sparc_insn(DisasContext * dc)
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
                     break;
                 case 0x42: /* faddd */
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_faddd(cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_faddd(cpu_dst_64, cpu_env,
+                                     cpu_src1_64, cpu_src2_64);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x43: /* faddq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
@@ -2517,13 +2494,14 @@ static void disas_sparc_insn(DisasContext * dc)
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
                     break;
                 case 0x46: /* fsubd */
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fsubd(cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fsubd(cpu_dst_64, cpu_env,
+                                     cpu_src1_64, cpu_src2_64);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x47: /* fsubq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
@@ -2548,13 +2526,14 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x4a: /* fmuld */
                     CHECK_FPU_FEATURE(dc, FMUL);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fmuld(cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fmuld(cpu_dst_64, cpu_env,
+                                     cpu_src1_64, cpu_src2_64);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x4b: /* fmulq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
@@ -2578,13 +2557,14 @@ static void disas_sparc_insn(DisasContext * dc)
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
                     break;
                 case 0x4e: /* fdivd */
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fdivd(cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fdivd(cpu_dst_64, cpu_env,
+                                     cpu_src1_64, cpu_src2_64);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x4f: /* fdivq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
@@ -2601,17 +2581,18 @@ static void disas_sparc_insn(DisasContext * dc)
                     gen_clear_float_exceptions();
                     cpu_src1_32 = gen_load_fpr_F(dc, rs1);
                     cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    gen_helper_fsmuld(cpu_env, cpu_src1_32, cpu_src2_32);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fsmuld(cpu_dst_64, cpu_env,
+                                      cpu_src1_32, cpu_src2_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x6e: /* fdmulq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fdmulq(cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    gen_helper_fdmulq(cpu_env, cpu_src1_64, cpu_src2_64);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_op_store_QT0_fpr(QFPREG(rd));
                     gen_update_fprs_dirty(QFPREG(rd));
@@ -2625,10 +2606,10 @@ static void disas_sparc_insn(DisasContext * dc)
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
                     break;
                 case 0xc6: /* fdtos */
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fdtos(cpu_dst_32, cpu_env);
+                    gen_helper_fdtos(cpu_dst_32, cpu_env, cpu_src1_64);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
                     break;
@@ -2643,24 +2624,24 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0xc8: /* fitod */
                     cpu_src1_32 = gen_load_fpr_F(dc, rs2);
-                    gen_helper_fitod(cpu_env, cpu_src1_32);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fitod(cpu_dst_64, cpu_env, cpu_src1_32);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0xc9: /* fstod */
                     cpu_src1_32 = gen_load_fpr_F(dc, rs2);
-                    gen_helper_fstod(cpu_env, cpu_src1_32);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fstod(cpu_dst_64, cpu_env, cpu_src1_32);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0xcb: /* fqtod */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_op_load_fpr_QT1(QFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fqtod(cpu_env);
+                    gen_op_load_fpr_QT1(QFPREG(rs2));
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fqtod(cpu_dst_64, cpu_env);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0xcc: /* fitoq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
@@ -2678,8 +2659,8 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0xce: /* fdtoq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fdtoq(cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
+                    gen_helper_fdtoq(cpu_env, cpu_src1_64);
                     gen_op_store_QT0_fpr(QFPREG(rd));
                     gen_update_fprs_dirty(QFPREG(rd));
                     break;
@@ -2692,10 +2673,10 @@ static void disas_sparc_insn(DisasContext * dc)
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
                     break;
                 case 0xd2: /* fdtoi */
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fdtoi(cpu_dst_32, cpu_env);
+                    gen_helper_fdtoi(cpu_dst_32, cpu_env, cpu_src1_64);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
                     break;
@@ -2726,10 +2707,10 @@ static void disas_sparc_insn(DisasContext * dc)
                     gen_update_fprs_dirty(QFPREG(rd));
                     break;
                 case 0x6: /* V9 fnegd */
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fnegd(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fnegd(cpu_dst_64, cpu_src1_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x7: /* V9 fnegq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
@@ -2739,10 +2720,10 @@ static void disas_sparc_insn(DisasContext * dc)
                     gen_update_fprs_dirty(QFPREG(rd));
                     break;
                 case 0xa: /* V9 fabsd */
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fabsd(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fabsd(cpu_dst_64, cpu_env, cpu_src1_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0xb: /* V9 fabsq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
@@ -2754,49 +2735,49 @@ static void disas_sparc_insn(DisasContext * dc)
                 case 0x81: /* V9 fstox */
                     gen_clear_float_exceptions();
                     cpu_src1_32 = gen_load_fpr_F(dc, rs2);
-                    gen_helper_fstox(cpu_env, cpu_src1_32);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fstox(cpu_dst_64, cpu_env, cpu_src1_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x82: /* V9 fdtox */
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fdtox(cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fdtox(cpu_dst_64, cpu_env, cpu_src1_64);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x83: /* V9 fqtox */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
                     gen_op_load_fpr_QT1(QFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fqtox(cpu_env);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fqtox(cpu_dst_64, cpu_env);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x84: /* V9 fxtos */
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fxtos(cpu_dst_32, cpu_env);
+                    gen_helper_fxtos(cpu_dst_32, cpu_env, cpu_src1_64);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
                     break;
                 case 0x88: /* V9 fxtod */
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fxtod(cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fxtod(cpu_dst_64, cpu_env, cpu_src1_64);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x8c: /* V9 fxtoq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fxtoq(cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
+                    gen_helper_fxtoq(cpu_env, cpu_src1_64);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_op_store_QT0_fpr(QFPREG(rd));
                     gen_update_fprs_dirty(QFPREG(rd));
@@ -3046,9 +3027,9 @@ static void disas_sparc_insn(DisasContext * dc)
                         gen_op_fcmps(rd & 3, cpu_src1_32, cpu_src2_32);
                         break;
                     case 0x52: /* fcmpd, V9 %fcc */
-                        gen_op_load_fpr_DT0(DFPREG(rs1));
-                        gen_op_load_fpr_DT1(DFPREG(rs2));
-                        gen_op_fcmpd(rd & 3);
+                        cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                        cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                        gen_op_fcmpd(rd & 3, cpu_src1_64, cpu_src2_64);
                         break;
                     case 0x53: /* fcmpq, V9 %fcc */
                         CHECK_FPU_FEATURE(dc, FLOAT128);
@@ -3062,9 +3043,9 @@ static void disas_sparc_insn(DisasContext * dc)
                         gen_op_fcmpes(rd & 3, cpu_src1_32, cpu_src2_32);
                         break;
                     case 0x56: /* fcmped, V9 %fcc */
-                        gen_op_load_fpr_DT0(DFPREG(rs1));
-                        gen_op_load_fpr_DT1(DFPREG(rs2));
-                        gen_op_fcmped(rd & 3);
+                        cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                        cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                        gen_op_fcmped(rd & 3, cpu_src1_64, cpu_src2_64);
                         break;
                     case 0x57: /* fcmpeq, V9 %fcc */
                         CHECK_FPU_FEATURE(dc, FLOAT128);
@@ -3953,115 +3934,130 @@ static void disas_sparc_insn(DisasContext * dc)
                     goto illegal_insn;
                 case 0x020: /* VIS I fcmple16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fcmple16(cpu_dst, cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    gen_helper_fcmple16(cpu_dst, cpu_env,
+                                        cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x022: /* VIS I fcmpne16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fcmpne16(cpu_dst, cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    gen_helper_fcmpne16(cpu_dst, cpu_env,
+                                        cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x024: /* VIS I fcmple32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fcmple32(cpu_dst, cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    gen_helper_fcmple32(cpu_dst, cpu_env,
+                                        cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x026: /* VIS I fcmpne32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fcmpne32(cpu_dst, cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    gen_helper_fcmpne32(cpu_dst, cpu_env,
+                                        cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x028: /* VIS I fcmpgt16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fcmpgt16(cpu_dst, cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    gen_helper_fcmpgt16(cpu_dst, cpu_env,
+                                        cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x02a: /* VIS I fcmpeq16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fcmpeq16(cpu_dst, cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    gen_helper_fcmpeq16(cpu_dst, cpu_env,
+                                        cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x02c: /* VIS I fcmpgt32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fcmpgt32(cpu_dst, cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    gen_helper_fcmpgt32(cpu_dst, cpu_env,
+                                        cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x02e: /* VIS I fcmpeq32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fcmpeq32(cpu_dst, cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    gen_helper_fcmpeq32(cpu_dst, cpu_env,
+                                        cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x031: /* VIS I fmul8x16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fmul8x16(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fmul8x16(cpu_dst_64, cpu_env,
+                                        cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x033: /* VIS I fmul8x16au */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fmul8x16au(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fmul8x16au(cpu_dst_64, cpu_env,
+                                          cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x035: /* VIS I fmul8x16al */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fmul8x16al(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fmul8x16al(cpu_dst_64, cpu_env,
+                                          cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x036: /* VIS I fmul8sux16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fmul8sux16(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fmul8sux16(cpu_dst_64, cpu_env,
+                                          cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x037: /* VIS I fmul8ulx16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fmul8ulx16(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fmul8ulx16(cpu_dst_64, cpu_env,
+                                          cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x038: /* VIS I fmuld8sux16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fmuld8sux16(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fmuld8sux16(cpu_dst_64, cpu_env,
+                                           cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x039: /* VIS I fmuld8ulx16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fmuld8ulx16(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fmuld8ulx16(cpu_dst_64, cpu_env,
+                                           cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x03a: /* VIS I fpack32 */
                 case 0x03b: /* VIS I fpack16 */
@@ -4071,38 +4067,42 @@ static void disas_sparc_insn(DisasContext * dc)
                     goto illegal_insn;
                 case 0x048: /* VIS I faligndata */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_faligndata(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_faligndata(cpu_dst_64, cpu_env,
+                                          cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x04b: /* VIS I fpmerge */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fpmerge(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fpmerge(cpu_dst_64, cpu_env,
+                                       cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x04c: /* VIS II bshuffle */
                     // XXX
                     goto illegal_insn;
                 case 0x04d: /* VIS I fexpand */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fexpand(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fexpand(cpu_dst_64, cpu_env,
+                                       cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x050: /* VIS I fpadd16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fpadd16(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fpadd16(cpu_dst_64, cpu_env,
+                                       cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x051: /* VIS I fpadd16s */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4115,11 +4115,12 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x052: /* VIS I fpadd32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fpadd32(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fpadd32(cpu_dst_64, cpu_env,
+                                       cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x053: /* VIS I fpadd32s */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4131,11 +4132,12 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x054: /* VIS I fpsub16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fpsub16(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fpsub16(cpu_dst_64, cpu_env,
+                                       cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x055: /* VIS I fpsub16s */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4148,11 +4150,12 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x056: /* VIS I fpsub32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fpsub32(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fpsub32(cpu_dst_64, cpu_env,
+                                       cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x057: /* VIS I fpsub32s */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4811,16 +4814,10 @@ static void disas_sparc_insn(DisasContext * dc)
                     }
                     break;
                 case 0x23:      /* lddf, load double fpreg */
-                    {
-                        TCGv_i32 r_const;
-
-                        r_const = tcg_const_i32(dc->mem_idx);
-                        gen_address_mask(dc, cpu_addr);
-                        gen_helper_lddf(cpu_addr, r_const);
-                        tcg_temp_free_i32(r_const);
-                        gen_op_store_DT0_fpr(DFPREG(rd));
-                        gen_update_fprs_dirty(DFPREG(rd));
-                    }
+                    gen_address_mask(dc, cpu_addr);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_qemu_ld64(cpu_dst_64, cpu_addr, dc->mem_idx);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 default:
                     goto illegal_insn;
@@ -4971,15 +4968,9 @@ static void disas_sparc_insn(DisasContext * dc)
 #endif
 #endif
                 case 0x27: /* stdf, store double fpreg */
-                    {
-                        TCGv_i32 r_const;
-
-                        gen_op_load_fpr_DT0(DFPREG(rd));
-                        r_const = tcg_const_i32(dc->mem_idx);
-                        gen_address_mask(dc, cpu_addr);
-                        gen_helper_stdf(cpu_addr, r_const);
-                        tcg_temp_free_i32(r_const);
-                    }
+                    gen_address_mask(dc, cpu_addr);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rd);
+                    tcg_gen_qemu_st64(cpu_src1_64, cpu_addr, dc->mem_idx);
                     break;
                 default:
                     goto illegal_insn;
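A note on the calling convention the translate.c hunks above move to: each double-precision helper now takes its operands as 64-bit values and returns the result, rather than reading and writing the env->dt0/env->dt1 temporaries. The following is only a minimal standalone sketch of that style, not code from this patch. The name faligndata_sketch and the explicit gsr parameter are assumptions made so the example builds without CPUState.

```c
#include <stdint.h>

/*
 * Hypothetical standalone variant of a value-passing helper.
 * Operands arrive as uint64_t arguments and the result is returned,
 * so no global temporaries are involved. gsr is passed explicitly
 * here (an assumption) to keep the sketch independent of CPUState.
 */
static uint64_t faligndata_sketch(uint64_t gsr, uint64_t src1, uint64_t src2)
{
    /* Only the low three bits of gsr select the byte offset. */
    unsigned shift = (unsigned)(gsr & 7) * 8;
    uint64_t tmp = src1 << shift;

    /*
     * Shifting a 64-bit value by 64 is undefined in C, so the second
     * term is skipped when the byte offset is zero.
     */
    if (shift != 0) {
        tmp |= src2 >> (64 - shift);
    }
    return tmp;
}
```

Because the function depends only on its arguments, it can be exercised directly with constant inputs, which is harder to do when operands are staged through shared temporaries.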
diff --git a/target-sparc/vis_helper.c b/target-sparc/vis_helper.c
index a22c10b..a007b0f 100644
--- a/target-sparc/vis_helper.c
+++ b/target-sparc/vis_helper.c
@@ -20,11 +20,6 @@
 #include "cpu.h"
 #include "helper.h"
 
-#define DT0 (env->dt0)
-#define DT1 (env->dt1)
-#define QT0 (env->qt0)
-#define QT1 (env->qt1)
-
 /* This function uses non-native bit order */
 #define GET_FIELD(X, FROM, TO)                                  \
     ((X) >> (63 - (TO)) & ((1ULL << ((TO) - (FROM) + 1)) - 1))
@@ -58,16 +53,16 @@ target_ulong helper_alignaddr(CPUState *env, target_ulong addr,
     return tmp & ~7ULL;
 }
 
-void helper_faligndata(CPUState *env)
+uint64_t helper_faligndata(CPUState *env, uint64_t src1, uint64_t src2)
 {
     uint64_t tmp;
 
-    tmp = (*((uint64_t *)&DT0)) << ((env->gsr & 7) * 8);
+    tmp = src1 << ((env->gsr & 7) * 8);
     /* on many architectures a shift of 64 does nothing */
     if ((env->gsr & 7) != 0) {
-        tmp |= (*((uint64_t *)&DT1)) >> (64 - (env->gsr & 7) * 8);
+        tmp |= src2 >> (64 - (env->gsr & 7) * 8);
     }
-    *((uint64_t *)&DT0) = tmp;
+    return tmp;
 }
 
 #ifdef HOST_WORDS_BIGENDIAN
@@ -102,12 +97,12 @@ typedef union {
     float32 f;
 } VIS32;
 
-void helper_fpmerge(CPUState *env)
+uint64_t helper_fpmerge(CPUState *env, uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
 
-    s.d = DT0;
-    d.d = DT1;
+    s.ll = src1;
+    d.ll = src2;
 
     /* Reverse calculation order to handle overlap */
     d.VIS_B64(7) = s.VIS_B64(3);
@@ -119,16 +114,16 @@ void helper_fpmerge(CPUState *env)
     d.VIS_B64(1) = s.VIS_B64(0);
     /* d.VIS_B64(0) = d.VIS_B64(0); */
 
-    DT0 = d.d;
+    return d.ll;
 }
 
-void helper_fmul8x16(CPUState *env)
+uint64_t helper_fmul8x16(CPUState *env, uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
     uint32_t tmp;
 
-    s.d = DT0;
-    d.d = DT1;
+    s.ll = src1;
+    d.ll = src2;
 
 #define PMUL(r)                                                 \
     tmp = (int32_t)d.VIS_SW64(r) * (int32_t)s.VIS_B64(r);       \
@@ -143,16 +138,16 @@ void helper_fmul8x16(CPUState *env)
     PMUL(3);
 #undef PMUL
 
-    DT0 = d.d;
+    return d.ll;
 }
 
-void helper_fmul8x16al(CPUState *env)
+uint64_t helper_fmul8x16al(CPUState *env, uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
     uint32_t tmp;
 
-    s.d = DT0;
-    d.d = DT1;
+    s.ll = src1;
+    d.ll = src2;
 
 #define PMUL(r)                                                 \
     tmp = (int32_t)d.VIS_SW64(1) * (int32_t)s.VIS_B64(r);       \
@@ -167,16 +162,16 @@ void helper_fmul8x16al(CPUState *env)
     PMUL(3);
 #undef PMUL
 
-    DT0 = d.d;
+    return d.ll;
 }
 
-void helper_fmul8x16au(CPUState *env)
+uint64_t helper_fmul8x16au(CPUState *env, uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
     uint32_t tmp;
 
-    s.d = DT0;
-    d.d = DT1;
+    s.ll = src1;
+    d.ll = src2;
 
 #define PMUL(r)                                                 \
     tmp = (int32_t)d.VIS_SW64(0) * (int32_t)s.VIS_B64(r);       \
@@ -191,16 +186,16 @@ void helper_fmul8x16au(CPUState *env)
     PMUL(3);
 #undef PMUL
 
-    DT0 = d.d;
+    return d.ll;
 }
 
-void helper_fmul8sux16(CPUState *env)
+uint64_t helper_fmul8sux16(CPUState *env, uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
     uint32_t tmp;
 
-    s.d = DT0;
-    d.d = DT1;
+    s.ll = src1;
+    d.ll = src2;
 
 #define PMUL(r)                                                         \
     tmp = (int32_t)d.VIS_SW64(r) * ((int32_t)s.VIS_SW64(r) >> 8);       \
@@ -215,16 +210,16 @@ void helper_fmul8sux16(CPUState *env)
     PMUL(3);
 #undef PMUL
 
-    DT0 = d.d;
+    return d.ll;
 }
 
-void helper_fmul8ulx16(CPUState *env)
+uint64_t helper_fmul8ulx16(CPUState *env, uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
     uint32_t tmp;
 
-    s.d = DT0;
-    d.d = DT1;
+    s.ll = src1;
+    d.ll = src2;
 
 #define PMUL(r)                                                         \
     tmp = (int32_t)d.VIS_SW64(r) * ((uint32_t)s.VIS_B64(r * 2));        \
@@ -239,16 +234,16 @@ void helper_fmul8ulx16(CPUState *env)
     PMUL(3);
 #undef PMUL
 
-    DT0 = d.d;
+    return d.ll;
 }
 
-void helper_fmuld8sux16(CPUState *env)
+uint64_t helper_fmuld8sux16(CPUState *env, uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
     uint32_t tmp;
 
-    s.d = DT0;
-    d.d = DT1;
+    s.ll = src1;
+    d.ll = src2;
 
 #define PMUL(r)                                                         \
     tmp = (int32_t)d.VIS_SW64(r) * ((int32_t)s.VIS_SW64(r) >> 8);       \
@@ -262,16 +257,16 @@ void helper_fmuld8sux16(CPUState *env)
     PMUL(0);
 #undef PMUL
 
-    DT0 = d.d;
+    return d.ll;
 }
 
-void helper_fmuld8ulx16(CPUState *env)
+uint64_t helper_fmuld8ulx16(CPUState *env, uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
     uint32_t tmp;
 
-    s.d = DT0;
-    d.d = DT1;
+    s.ll = src1;
+    d.ll = src2;
 
 #define PMUL(r)                                                         \
     tmp = (int32_t)d.VIS_SW64(r) * ((uint32_t)s.VIS_B64(r * 2));        \
@@ -285,38 +280,38 @@ void helper_fmuld8ulx16(CPUState *env)
     PMUL(0);
 #undef PMUL
 
-    DT0 = d.d;
+    return d.ll;
 }
 
-void helper_fexpand(CPUState *env)
+uint64_t helper_fexpand(CPUState *env, uint64_t src1, uint64_t src2)
 {
     VIS32 s;
     VIS64 d;
 
-    s.l = (uint32_t)(*(uint64_t *)&DT0 & 0xffffffff);
-    d.d = DT1;
+    s.l = (uint32_t)src1;
+    d.ll = src2;
     d.VIS_W64(0) = s.VIS_B32(0) << 4;
     d.VIS_W64(1) = s.VIS_B32(1) << 4;
     d.VIS_W64(2) = s.VIS_B32(2) << 4;
     d.VIS_W64(3) = s.VIS_B32(3) << 4;
 
-    DT0 = d.d;
+    return d.ll;
 }
 
 #define VIS_HELPER(name, F)                             \
-    void name##16(CPUState *env)                        \
+    uint64_t name##16(CPUState *env, uint64_t src1, uint64_t src2) \
     {                                                   \
         VIS64 s, d;                                     \
                                                         \
-        s.d = DT0;                                      \
-        d.d = DT1;                                      \
+        s.ll = src1;                                    \
+        d.ll = src2;                                    \
                                                         \
         d.VIS_W64(0) = F(d.VIS_W64(0), s.VIS_W64(0));   \
         d.VIS_W64(1) = F(d.VIS_W64(1), s.VIS_W64(1));   \
         d.VIS_W64(2) = F(d.VIS_W64(2), s.VIS_W64(2));   \
         d.VIS_W64(3) = F(d.VIS_W64(3), s.VIS_W64(3));   \
                                                         \
-        DT0 = d.d;                                      \
+        return d.ll;                                    \
     }                                                   \
                                                         \
     uint32_t name##16s(CPUState *env, uint32_t src1,    \
@@ -333,17 +328,17 @@ void helper_fexpand(CPUState *env)
         return d.l;                                     \
     }                                                   \
                                                         \
-    void name##32(CPUState *env)                        \
+    uint64_t name##32(CPUState *env, uint64_t src1, uint64_t src2) \
     {                                                   \
         VIS64 s, d;                                     \
                                                         \
-        s.d = DT0;                                      \
-        d.d = DT1;                                      \
+        s.ll = src1;                                    \
+        d.ll = src2;                                    \
                                                         \
         d.VIS_L64(0) = F(d.VIS_L64(0), s.VIS_L64(0));   \
         d.VIS_L64(1) = F(d.VIS_L64(1), s.VIS_L64(1));   \
                                                         \
-        DT0 = d.d;                                      \
+        return d.ll;                                    \
     }                                                   \
                                                         \
     uint32_t name##32s(CPUState *env, uint32_t src1,    \
@@ -365,12 +360,12 @@ VIS_HELPER(helper_fpadd, FADD)
 VIS_HELPER(helper_fpsub, FSUB)
 
 #define VIS_CMPHELPER(name, F)                                    \
-    uint64_t name##16(CPUState *env)                              \
+    uint64_t name##16(CPUState *env, uint64_t src1, uint64_t src2) \
     {                                                             \
         VIS64 s, d;                                               \
                                                                   \
-        s.d = DT0;                                                \
-        d.d = DT1;                                                \
+        s.ll = src1;                                              \
+        d.ll = src2;                                              \
                                                                   \
         d.VIS_W64(0) = F(s.VIS_W64(0), d.VIS_W64(0)) ? 1 : 0;     \
         d.VIS_W64(0) |= F(s.VIS_W64(1), d.VIS_W64(1)) ? 2 : 0;    \
@@ -381,12 +376,12 @@ VIS_HELPER(helper_fpsub, FSUB)
         return d.ll;                                              \
     }                                                             \
                                                                   \
-    uint64_t name##32(CPUState *env)                              \
+    uint64_t name##32(CPUState *env, uint64_t src1, uint64_t src2) \
     {                                                             \
         VIS64 s, d;                                               \
                                                                   \
-        s.d = DT0;                                                \
-        d.d = DT1;                                                \
+        s.ll = src1;                                              \
+        d.ll = src2;                                              \
                                                                   \
         d.VIS_L64(0) = F(s.VIS_L64(0), d.VIS_L64(0)) ? 1 : 0;     \
         d.VIS_L64(0) |= F(s.VIS_L64(1), d.VIS_L64(1)) ? 2 : 0;    \
-- 
1.7.6.4

* [Qemu-devel] [PATCH 05/16] target-sparc: Make FPU/VIS helpers const when possible.
From: Richard Henderson @ 2011-10-26 21:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

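Mark the FPU/VIS helpers whose results depend only on their operands
as TCG_CALL_CONST | TCG_CALL_PURE.  This also removes the unused env
parameter from these helpers.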

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
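
Illustrative note, not part of the patch: the helpers touched here can
be flagged const/pure because each result depends only on the operand
values.  As I understand the TCG flags, TCG_CALL_CONST says the helper
does not use TCG globals, and TCG_CALL_PURE says it has no side effects,
so a call whose result is unused may be dropped.  The standalone sketch
below shows the shape of such a helper once env is gone.  The name
sketch_fpadd16 is hypothetical, and plain shifts stand in for the VIS64
union used in vis_helper.c.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for a partitioned 16-bit add such as fpadd16.
 * It reads only its two arguments: no CPUState, no globals, no side
 * effects.  That is the property the const/pure flags describe.
 */
uint64_t sketch_fpadd16(uint64_t src1, uint64_t src2)
{
    uint64_t result = 0;
    int lane;

    for (lane = 0; lane < 4; lane++) {
        /* Extract one 16-bit lane from each operand. */
        uint16_t a = (uint16_t)(src1 >> (lane * 16));
        uint16_t b = (uint16_t)(src2 >> (lane * 16));
        /* Add modulo 2^16 so no carry leaks into the next lane. */
        uint16_t sum = (uint16_t)(a + b);

        result |= (uint64_t)sum << (lane * 16);
    }
    return result;
}
```

Nothing in the body consults CPU state, so an env argument would only
be an extra value that the helper never reads.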
 target-sparc/fop_helper.c |    2 +-
 target-sparc/helper.h     |   50 +++++++++++++++------------
 target-sparc/translate.c  |   83 ++++++++++++++++++---------------------------
 target-sparc/vis_helper.c |   35 +++++++++----------
 4 files changed, 78 insertions(+), 92 deletions(-)

diff --git a/target-sparc/fop_helper.c b/target-sparc/fop_helper.c
index f6348c2..e652021 100644
--- a/target-sparc/fop_helper.c
+++ b/target-sparc/fop_helper.c
@@ -182,7 +182,7 @@ float32 helper_fabss(float32 src)
 }
 
 #ifdef TARGET_SPARC64
-float64 helper_fabsd(CPUState *env, float64 src)
+float64 helper_fabsd(float64 src)
 {
     return float64_abs(src);
 }
diff --git a/target-sparc/helper.h b/target-sparc/helper.h
index 86fad6e..ba0ad81 100644
--- a/target-sparc/helper.h
+++ b/target-sparc/helper.h
@@ -16,7 +16,7 @@ DEF_HELPER_1(rdccr, tl, env)
 DEF_HELPER_2(wrccr, void, env, tl)
 DEF_HELPER_1(rdcwp, tl, env)
 DEF_HELPER_2(wrcwp, void, env, tl)
-DEF_HELPER_3(array8, tl, env, tl, tl)
+DEF_HELPER_FLAGS_2(array8, TCG_CALL_CONST | TCG_CALL_PURE, tl, tl, tl)
 DEF_HELPER_3(alignaddr, tl, env, tl, tl)
 DEF_HELPER_1(popc, tl, tl)
 DEF_HELPER_3(ldda_asi, void, tl, int, int)
@@ -48,7 +48,7 @@ DEF_HELPER_4(st_asi, void, tl, i64, int, int)
 DEF_HELPER_2(ldfsr, void, env, i32)
 DEF_HELPER_1(check_ieee_exceptions, void, env)
 DEF_HELPER_1(clear_float_exceptions, void, env)
-DEF_HELPER_1(fabss, f32, f32)
+DEF_HELPER_FLAGS_1(fabss, TCG_CALL_CONST | TCG_CALL_PURE, f32, f32)
 DEF_HELPER_2(fsqrts, f32, env, f32)
 DEF_HELPER_2(fsqrtd, f64, env, f64)
 DEF_HELPER_3(fcmps, void, env, f32, f32)
@@ -60,7 +60,7 @@ DEF_HELPER_1(fcmpq, void, env)
 DEF_HELPER_1(fcmpeq, void, env)
 #ifdef TARGET_SPARC64
 DEF_HELPER_2(ldxfsr, void, env, i64)
-DEF_HELPER_2(fabsd, f64, env, f64)
+DEF_HELPER_FLAGS_1(fabsd, TCG_CALL_CONST | TCG_CALL_PURE, f64, f64)
 DEF_HELPER_3(fcmps_fcc1, void, env, f32, f32)
 DEF_HELPER_3(fcmps_fcc2, void, env, f32, f32)
 DEF_HELPER_3(fcmps_fcc3, void, env, f32, f32)
@@ -102,14 +102,14 @@ DEF_HELPER_3(fdivs, f32, env, f32, f32)
 DEF_HELPER_3(fsmuld, f64, env, f32, f32)
 DEF_HELPER_3(fdmulq, void, env, f64, f64);
 
-DEF_HELPER_1(fnegs, f32, f32)
+DEF_HELPER_FLAGS_1(fnegs, TCG_CALL_CONST | TCG_CALL_PURE, f32, f32)
 DEF_HELPER_2(fitod, f64, env, s32)
 DEF_HELPER_2(fitoq, void, env, s32)
 
 DEF_HELPER_2(fitos, f32, env, s32)
 
 #ifdef TARGET_SPARC64
-DEF_HELPER_1(fnegd, f64, f64)
+DEF_HELPER_FLAGS_1(fnegd, TCG_CALL_CONST | TCG_CALL_PURE, f64, f64)
 DEF_HELPER_1(fnegq, void, env)
 DEF_HELPER_2(fxtos, f32, env, s64)
 DEF_HELPER_2(fxtod, f64, env, s64)
@@ -130,26 +130,32 @@ DEF_HELPER_2(fdtox, s64, env, f64)
 DEF_HELPER_1(fqtox, s64, env)
 DEF_HELPER_3(faligndata, i64, env, i64, i64)
 
-DEF_HELPER_3(fpmerge, i64, env, i64, i64)
-DEF_HELPER_3(fmul8x16, i64, env, i64, i64)
-DEF_HELPER_3(fmul8x16al, i64, env, i64, i64)
-DEF_HELPER_3(fmul8x16au, i64, env, i64, i64)
-DEF_HELPER_3(fmul8sux16, i64, env, i64, i64)
-DEF_HELPER_3(fmul8ulx16, i64, env, i64, i64)
-DEF_HELPER_3(fmuld8sux16, i64, env, i64, i64)
-DEF_HELPER_3(fmuld8ulx16, i64, env, i64, i64)
-DEF_HELPER_3(fexpand, i64, env, i64, i64)
-#define VIS_HELPER(name)                                 \
-    DEF_HELPER_3(f ## name ## 16, i64, env, i64, i64)    \
-    DEF_HELPER_3(f ## name ## 16s, i32, env, i32, i32)   \
-    DEF_HELPER_3(f ## name ## 32, i64, env, i64, i64)    \
-    DEF_HELPER_3(f ## name ## 32s, i32, env, i32, i32)
+DEF_HELPER_FLAGS_2(fpmerge, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fmul8x16, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fmul8x16al, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fmul8x16au, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fmul8sux16, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fmul8ulx16, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fmuld8sux16, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fmuld8ulx16, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fexpand, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
+#define VIS_HELPER(name)                                                 \
+    DEF_HELPER_FLAGS_2(f ## name ## 16, TCG_CALL_CONST | TCG_CALL_PURE,  \
+                       i64, i64, i64)                                    \
+    DEF_HELPER_FLAGS_2(f ## name ## 16s, TCG_CALL_CONST | TCG_CALL_PURE, \
+                       i32, i32, i32)                                    \
+    DEF_HELPER_FLAGS_2(f ## name ## 32, TCG_CALL_CONST | TCG_CALL_PURE,  \
+                       i64, i64, i64)                                    \
+    DEF_HELPER_FLAGS_2(f ## name ## 32s, TCG_CALL_CONST | TCG_CALL_PURE, \
+                       i32, i32, i32)
 
 VIS_HELPER(padd);
 VIS_HELPER(psub);
-#define VIS_CMPHELPER(name)                              \
-    DEF_HELPER_3(f##name##16, i64, env, i64, i64)        \
-    DEF_HELPER_3(f##name##32, i64, env, i64, i64)
+#define VIS_CMPHELPER(name)                                              \
+    DEF_HELPER_FLAGS_2(f##name##16, TCG_CALL_CONST | TCG_CALL_PURE,      \
+                       i64, i64, i64)                                    \
+    DEF_HELPER_FLAGS_2(f##name##32, TCG_CALL_CONST | TCG_CALL_PURE,      \
+                       i64, i64, i64)
 VIS_CMPHELPER(cmpgt);
 VIS_CMPHELPER(cmpeq);
 VIS_CMPHELPER(cmple);
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index f41ef98..80f0058 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -2722,7 +2722,7 @@ static void disas_sparc_insn(DisasContext * dc)
                 case 0xa: /* V9 fabsd */
                     cpu_src1_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fabsd(cpu_dst_64, cpu_env, cpu_src1_64);
+                    gen_helper_fabsd(cpu_dst_64, cpu_src1_64);
                     gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0xb: /* V9 fabsq */
@@ -3902,14 +3902,14 @@ static void disas_sparc_insn(DisasContext * dc)
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1 = get_src1(insn, cpu_src1);
                     gen_movl_reg_TN(rs2, cpu_src2);
-                    gen_helper_array8(cpu_dst, cpu_env, cpu_src1, cpu_src2);
+                    gen_helper_array8(cpu_dst, cpu_src1, cpu_src2);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x012: /* VIS I array16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1 = get_src1(insn, cpu_src1);
                     gen_movl_reg_TN(rs2, cpu_src2);
-                    gen_helper_array8(cpu_dst, cpu_env, cpu_src1, cpu_src2);
+                    gen_helper_array8(cpu_dst, cpu_src1, cpu_src2);
                     tcg_gen_shli_i64(cpu_dst, cpu_dst, 1);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
@@ -3917,7 +3917,7 @@ static void disas_sparc_insn(DisasContext * dc)
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1 = get_src1(insn, cpu_src1);
                     gen_movl_reg_TN(rs2, cpu_src2);
-                    gen_helper_array8(cpu_dst, cpu_env, cpu_src1, cpu_src2);
+                    gen_helper_array8(cpu_dst, cpu_src1, cpu_src2);
                     tcg_gen_shli_i64(cpu_dst, cpu_dst, 2);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
@@ -3936,64 +3936,56 @@ static void disas_sparc_insn(DisasContext * dc)
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    gen_helper_fcmple16(cpu_dst, cpu_env,
-                                        cpu_src1_64, cpu_src2_64);
+                    gen_helper_fcmple16(cpu_dst, cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x022: /* VIS I fcmpne16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    gen_helper_fcmpne16(cpu_dst, cpu_env,
-                                        cpu_src1_64, cpu_src2_64);
+                    gen_helper_fcmpne16(cpu_dst, cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x024: /* VIS I fcmple32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    gen_helper_fcmple32(cpu_dst, cpu_env,
-                                        cpu_src1_64, cpu_src2_64);
+                    gen_helper_fcmple32(cpu_dst, cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x026: /* VIS I fcmpne32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    gen_helper_fcmpne32(cpu_dst, cpu_env,
-                                        cpu_src1_64, cpu_src2_64);
+                    gen_helper_fcmpne32(cpu_dst, cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x028: /* VIS I fcmpgt16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    gen_helper_fcmpgt16(cpu_dst, cpu_env,
-                                        cpu_src1_64, cpu_src2_64);
+                    gen_helper_fcmpgt16(cpu_dst, cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x02a: /* VIS I fcmpeq16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    gen_helper_fcmpeq16(cpu_dst, cpu_env,
-                                        cpu_src1_64, cpu_src2_64);
+                    gen_helper_fcmpeq16(cpu_dst, cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x02c: /* VIS I fcmpgt32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    gen_helper_fcmpgt32(cpu_dst, cpu_env,
-                                        cpu_src1_64, cpu_src2_64);
+                    gen_helper_fcmpgt32(cpu_dst, cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x02e: /* VIS I fcmpeq32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    gen_helper_fcmpeq32(cpu_dst, cpu_env,
-                                        cpu_src1_64, cpu_src2_64);
+                    gen_helper_fcmpeq32(cpu_dst, cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x031: /* VIS I fmul8x16 */
@@ -4001,8 +3993,7 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmul8x16(cpu_dst_64, cpu_env,
-                                        cpu_src1_64, cpu_src2_64);
+                    gen_helper_fmul8x16(cpu_dst_64, cpu_src1_64, cpu_src2_64);
                     gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x033: /* VIS I fmul8x16au */
@@ -4010,8 +4001,8 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmul8x16au(cpu_dst_64, cpu_env,
-                                          cpu_src1_64, cpu_src2_64);
+                    gen_helper_fmul8x16au(cpu_dst_64, cpu_src1_64,
+                                          cpu_src2_64);
                     gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x035: /* VIS I fmul8x16al */
@@ -4019,8 +4010,8 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmul8x16al(cpu_dst_64, cpu_env,
-                                          cpu_src1_64, cpu_src2_64);
+                    gen_helper_fmul8x16al(cpu_dst_64, cpu_src1_64,
+                                          cpu_src2_64);
                     gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x036: /* VIS I fmul8sux16 */
@@ -4028,8 +4019,8 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmul8sux16(cpu_dst_64, cpu_env,
-                                          cpu_src1_64, cpu_src2_64);
+                    gen_helper_fmul8sux16(cpu_dst_64, cpu_src1_64,
+                                          cpu_src2_64);
                     gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x037: /* VIS I fmul8ulx16 */
@@ -4037,8 +4028,8 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmul8ulx16(cpu_dst_64, cpu_env,
-                                          cpu_src1_64, cpu_src2_64);
+                    gen_helper_fmul8ulx16(cpu_dst_64, cpu_src1_64,
+                                          cpu_src2_64);
                     gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x038: /* VIS I fmuld8sux16 */
@@ -4046,8 +4037,8 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmuld8sux16(cpu_dst_64, cpu_env,
-                                           cpu_src1_64, cpu_src2_64);
+                    gen_helper_fmuld8sux16(cpu_dst_64, cpu_src1_64,
+                                           cpu_src2_64);
                     gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x039: /* VIS I fmuld8ulx16 */
@@ -4055,8 +4046,8 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmuld8ulx16(cpu_dst_64, cpu_env,
-                                           cpu_src1_64, cpu_src2_64);
+                    gen_helper_fmuld8ulx16(cpu_dst_64, cpu_src1_64,
+                                           cpu_src2_64);
                     gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x03a: /* VIS I fpack32 */
@@ -4079,8 +4070,7 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fpmerge(cpu_dst_64, cpu_env,
-                                       cpu_src1_64, cpu_src2_64);
+                    gen_helper_fpmerge(cpu_dst_64, cpu_src1_64, cpu_src2_64);
                     gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x04c: /* VIS II bshuffle */
@@ -4091,8 +4081,7 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fexpand(cpu_dst_64, cpu_env,
-                                       cpu_src1_64, cpu_src2_64);
+                    gen_helper_fexpand(cpu_dst_64, cpu_src1_64, cpu_src2_64);
                     gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x050: /* VIS I fpadd16 */
@@ -4100,8 +4089,7 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fpadd16(cpu_dst_64, cpu_env,
-                                       cpu_src1_64, cpu_src2_64);
+                    gen_helper_fpadd16(cpu_dst_64, cpu_src1_64, cpu_src2_64);
                     gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x051: /* VIS I fpadd16s */
@@ -4109,8 +4097,7 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_src1_32 = gen_load_fpr_F(dc, rs1);
                     cpu_src2_32 = gen_load_fpr_F(dc, rs2);
                     cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fpadd16s(cpu_dst_32, cpu_env,
-                                        cpu_src1_32, cpu_src2_32);
+                    gen_helper_fpadd16s(cpu_dst_32, cpu_src1_32, cpu_src2_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
                     break;
                 case 0x052: /* VIS I fpadd32 */
@@ -4118,8 +4105,7 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fpadd32(cpu_dst_64, cpu_env,
-                                       cpu_src1_64, cpu_src2_64);
+                    gen_helper_fpadd32(cpu_dst_64, cpu_src1_64, cpu_src2_64);
                     gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x053: /* VIS I fpadd32s */
@@ -4135,8 +4121,7 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fpsub16(cpu_dst_64, cpu_env,
-                                       cpu_src1_64, cpu_src2_64);
+                    gen_helper_fpsub16(cpu_dst_64, cpu_src1_64, cpu_src2_64);
                     gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x055: /* VIS I fpsub16s */
@@ -4144,8 +4129,7 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_src1_32 = gen_load_fpr_F(dc, rs1);
                     cpu_src2_32 = gen_load_fpr_F(dc, rs2);
                     cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fpsub16s(cpu_dst_32, cpu_env,
-                                        cpu_src1_32, cpu_src2_32);
+                    gen_helper_fpsub16s(cpu_dst_32, cpu_src1_32, cpu_src2_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
                     break;
                 case 0x056: /* VIS I fpsub32 */
@@ -4153,8 +4137,7 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fpsub32(cpu_dst_64, cpu_env,
-                                       cpu_src1_64, cpu_src2_64);
+                    gen_helper_fpsub32(cpu_dst_64, cpu_src1_64, cpu_src2_64);
                     gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x057: /* VIS I fpsub32s */
diff --git a/target-sparc/vis_helper.c b/target-sparc/vis_helper.c
index a007b0f..39c8d9a 100644
--- a/target-sparc/vis_helper.c
+++ b/target-sparc/vis_helper.c
@@ -28,8 +28,7 @@
 #define GET_FIELD_SP(X, FROM, TO)               \
     GET_FIELD(X, 63 - (TO), 63 - (FROM))
 
-target_ulong helper_array8(CPUState *env, target_ulong pixel_addr,
-                           target_ulong cubesize)
+target_ulong helper_array8(target_ulong pixel_addr, target_ulong cubesize)
 {
     return (GET_FIELD_SP(pixel_addr, 60, 63) << (17 + 2 * cubesize)) |
         (GET_FIELD_SP(pixel_addr, 39, 39 + cubesize - 1) << (17 + cubesize)) |
@@ -97,7 +96,7 @@ typedef union {
     float32 f;
 } VIS32;
 
-uint64_t helper_fpmerge(CPUState *env, uint64_t src1, uint64_t src2)
+uint64_t helper_fpmerge(uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
 
@@ -117,7 +116,7 @@ uint64_t helper_fpmerge(CPUState *env, uint64_t src1, uint64_t src2)
     return d.ll;
 }
 
-uint64_t helper_fmul8x16(CPUState *env, uint64_t src1, uint64_t src2)
+uint64_t helper_fmul8x16(uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
     uint32_t tmp;
@@ -141,7 +140,7 @@ uint64_t helper_fmul8x16(CPUState *env, uint64_t src1, uint64_t src2)
     return d.ll;
 }
 
-uint64_t helper_fmul8x16al(CPUState *env, uint64_t src1, uint64_t src2)
+uint64_t helper_fmul8x16al(uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
     uint32_t tmp;
@@ -165,7 +164,7 @@ uint64_t helper_fmul8x16al(CPUState *env, uint64_t src1, uint64_t src2)
     return d.ll;
 }
 
-uint64_t helper_fmul8x16au(CPUState *env, uint64_t src1, uint64_t src2)
+uint64_t helper_fmul8x16au(uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
     uint32_t tmp;
@@ -189,7 +188,7 @@ uint64_t helper_fmul8x16au(CPUState *env, uint64_t src1, uint64_t src2)
     return d.ll;
 }
 
-uint64_t helper_fmul8sux16(CPUState *env, uint64_t src1, uint64_t src2)
+uint64_t helper_fmul8sux16(uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
     uint32_t tmp;
@@ -213,7 +212,7 @@ uint64_t helper_fmul8sux16(CPUState *env, uint64_t src1, uint64_t src2)
     return d.ll;
 }
 
-uint64_t helper_fmul8ulx16(CPUState *env, uint64_t src1, uint64_t src2)
+uint64_t helper_fmul8ulx16(uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
     uint32_t tmp;
@@ -237,7 +236,7 @@ uint64_t helper_fmul8ulx16(CPUState *env, uint64_t src1, uint64_t src2)
     return d.ll;
 }
 
-uint64_t helper_fmuld8sux16(CPUState *env, uint64_t src1, uint64_t src2)
+uint64_t helper_fmuld8sux16(uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
     uint32_t tmp;
@@ -260,7 +259,7 @@ uint64_t helper_fmuld8sux16(CPUState *env, uint64_t src1, uint64_t src2)
     return d.ll;
 }
 
-uint64_t helper_fmuld8ulx16(CPUState *env, uint64_t src1, uint64_t src2)
+uint64_t helper_fmuld8ulx16(uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
     uint32_t tmp;
@@ -283,7 +282,7 @@ uint64_t helper_fmuld8ulx16(CPUState *env, uint64_t src1, uint64_t src2)
     return d.ll;
 }
 
-uint64_t helper_fexpand(CPUState *env, uint64_t src1, uint64_t src2)
+uint64_t helper_fexpand(uint64_t src1, uint64_t src2)
 {
     VIS32 s;
     VIS64 d;
@@ -299,7 +298,7 @@ uint64_t helper_fexpand(CPUState *env, uint64_t src1, uint64_t src2)
 }
 
 #define VIS_HELPER(name, F)                             \
-    uint64_t name##16(CPUState *env, uint64_t src1, uint64_t src2) \
+    uint64_t name##16(uint64_t src1, uint64_t src2)     \
     {                                                   \
         VIS64 s, d;                                     \
                                                         \
@@ -314,8 +313,7 @@ uint64_t helper_fexpand(CPUState *env, uint64_t src1, uint64_t src2)
         return d.ll;                                    \
     }                                                   \
                                                         \
-    uint32_t name##16s(CPUState *env, uint32_t src1,    \
-                       uint32_t src2)                   \
+    uint32_t name##16s(uint32_t src1, uint32_t src2)    \
     {                                                   \
         VIS32 s, d;                                     \
                                                         \
@@ -328,7 +326,7 @@ uint64_t helper_fexpand(CPUState *env, uint64_t src1, uint64_t src2)
         return d.l;                                     \
     }                                                   \
                                                         \
-    uint64_t name##32(CPUState *env, uint64_t src1, uint64_t src2) \
+    uint64_t name##32(uint64_t src1, uint64_t src2)     \
     {                                                   \
         VIS64 s, d;                                     \
                                                         \
@@ -341,8 +339,7 @@ uint64_t helper_fexpand(CPUState *env, uint64_t src1, uint64_t src2)
         return d.ll;                                    \
     }                                                   \
                                                         \
-    uint32_t name##32s(CPUState *env, uint32_t src1,    \
-                       uint32_t src2)                   \
+    uint32_t name##32s(uint32_t src1, uint32_t src2)    \
     {                                                   \
         VIS32 s, d;                                     \
                                                         \
@@ -360,7 +357,7 @@ VIS_HELPER(helper_fpadd, FADD)
 VIS_HELPER(helper_fpsub, FSUB)
 
 #define VIS_CMPHELPER(name, F)                                    \
-    uint64_t name##16(CPUState *env, uint64_t src1, uint64_t src2) \
+    uint64_t name##16(uint64_t src1, uint64_t src2)               \
     {                                                             \
         VIS64 s, d;                                               \
                                                                   \
@@ -376,7 +373,7 @@ VIS_HELPER(helper_fpsub, FSUB)
         return d.ll;                                              \
     }                                                             \
                                                                   \
-    uint64_t name##32(CPUState *env, uint64_t src1, uint64_t src2) \
+    uint64_t name##32(uint64_t src1, uint64_t src2)               \
     {                                                             \
         VIS64 s, d;                                               \
                                                                   \
-- 
1.7.6.4

* [Qemu-devel] [PATCH 06/16] target-sparc: Extract common code for floating-point operations.
  2011-10-26 21:15 [Qemu-devel] [PATCH v2 00/16] Sparc FPU/VIS improvements Richard Henderson
                   ` (4 preceding siblings ...)
  2011-10-26 21:15 ` [Qemu-devel] [PATCH 05/16] target-sparc: Make FPU/VIS helpers const when possible Richard Henderson
@ 2011-10-26 21:15 ` Richard Henderson
  2011-10-26 21:15 ` [Qemu-devel] [PATCH 07/16] target-sparc: Extract float128 move to a function Richard Henderson
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2011-10-26 21:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-sparc/translate.c |  835 +++++++++++++++++++++-------------------------
 1 files changed, 381 insertions(+), 454 deletions(-)
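
Note for readers (illustrative only, not part of the patch): the refactoring
below replaces each open-coded load/op/check/store sequence in
disas_sparc_insn with one wrapper that takes the helper as a function
pointer. The standalone sketch below mimics that shape in plain C. It is an
assumption-laden toy: ToyEnv, toy_fop_FF, toy_ne_fop_FF, toy_frecip and
toy_fnegs are hypothetical names, and the toy executes operations directly,
whereas the real gen_fop_* functions emit TCG ops at translation time.

```c
#include <assert.h>

/* Hypothetical stand-in for CPUState: a few single-precision registers,
   an exception-flag word, and a counter of exception checks. */
typedef struct {
    float fpr[8];
    unsigned exc_flags;
    unsigned exc_checks;
} ToyEnv;

/* Operations that may raise exceptions receive the environment. */
typedef float (*toy_op_env)(ToyEnv *env, float src);
/* "ne" (no-exception) operations only see their operand. */
typedef float (*toy_op_ne)(float src);

/* Shape of gen_fop_FF: clear flags, load, apply, check, store. */
static void toy_fop_FF(ToyEnv *env, int rd, int rs, toy_op_env op)
{
    float src, dst;

    env->exc_flags = 0;         /* cf. gen_clear_float_exceptions */
    src = env->fpr[rs];
    dst = op(env, src);
    env->exc_checks++;          /* cf. gen_helper_check_ieee_exceptions */
    env->fpr[rd] = dst;
}

/* Shape of gen_ne_fop_FF: load, apply, store; no exception handling. */
static void toy_ne_fop_FF(ToyEnv *env, int rd, int rs, toy_op_ne op)
{
    env->fpr[rd] = op(env->fpr[rs]);
}

/* Toy reciprocal: flags a zero divisor instead of dividing. */
static float toy_frecip(ToyEnv *env, float src)
{
    if (src == 0.0f) {
        env->exc_flags |= 1u;
        return 0.0f;
    }
    return 1.0f / src;
}

/* Toy negate: cannot raise an exception. */
static float toy_fnegs(float src)
{
    return -src;
}

/* Each "instruction" is now a single call, as in the switch cases. */
static int toy_demo(void)
{
    ToyEnv env = { { 0 }, 0, 0 };

    env.fpr[1] = 4.0f;
    toy_fop_FF(&env, 2, 1, toy_frecip);
    toy_ne_fop_FF(&env, 3, 1, toy_fnegs);

    assert(env.fpr[2] == 0.25f);
    assert(env.fpr[3] == -4.0f);
    assert(env.exc_checks == 1);
    return 0;
}
```

The split into an exception-aware wrapper and an "ne" wrapper mirrors the
gen_fop_* / gen_ne_fop_* pairs in the diff; the suffix letters there (F, D,
Q) appear to encode destination and source operand widths.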

diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 80f0058..6c13f1c 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -1627,6 +1627,305 @@ static inline void gen_clear_float_exceptions(void)
     gen_helper_clear_float_exceptions(cpu_env);
 }
 
+static inline void gen_fop_FF(DisasContext *dc, int rd, int rs,
+                              void (*gen)(TCGv_i32, TCGv_ptr, TCGv_i32))
+{
+    TCGv_i32 dst, src;
+
+    gen_clear_float_exceptions();
+    src = gen_load_fpr_F(dc, rs);
+    dst = gen_dest_fpr_F();
+
+    gen(dst, cpu_env, src);
+
+    gen_helper_check_ieee_exceptions(cpu_env);
+    gen_store_fpr_F(dc, rd, dst);
+}
+
+static inline void gen_ne_fop_FF(DisasContext *dc, int rd, int rs,
+                                 void (*gen)(TCGv_i32, TCGv_i32))
+{
+    TCGv_i32 dst, src;
+
+    src = gen_load_fpr_F(dc, rs);
+    dst = gen_dest_fpr_F();
+
+    gen(dst, src);
+
+    gen_store_fpr_F(dc, rd, dst);
+}
+
+static inline void gen_fop_FFF(DisasContext *dc, int rd, int rs1, int rs2,
+                        void (*gen)(TCGv_i32, TCGv_ptr, TCGv_i32, TCGv_i32))
+{
+    TCGv_i32 dst, src1, src2;
+
+    gen_clear_float_exceptions();
+    src1 = gen_load_fpr_F(dc, rs1);
+    src2 = gen_load_fpr_F(dc, rs2);
+    dst = gen_dest_fpr_F();
+
+    gen(dst, cpu_env, src1, src2);
+
+    gen_helper_check_ieee_exceptions(cpu_env);
+    gen_store_fpr_F(dc, rd, dst);
+}
+
+#ifdef TARGET_SPARC64
+static inline void gen_ne_fop_FFF(DisasContext *dc, int rd, int rs1, int rs2,
+                                  void (*gen)(TCGv_i32, TCGv_i32, TCGv_i32))
+{
+    TCGv_i32 dst, src1, src2;
+
+    src1 = gen_load_fpr_F(dc, rs1);
+    src2 = gen_load_fpr_F(dc, rs2);
+    dst = gen_dest_fpr_F();
+
+    gen(dst, src1, src2);
+
+    gen_store_fpr_F(dc, rd, dst);
+}
+#endif
+
+static inline void gen_fop_DD(DisasContext *dc, int rd, int rs,
+                              void (*gen)(TCGv_i64, TCGv_ptr, TCGv_i64))
+{
+    TCGv_i64 dst, src;
+
+    gen_clear_float_exceptions();
+    src = gen_load_fpr_D(dc, rs);
+    dst = gen_dest_fpr_D();
+
+    gen(dst, cpu_env, src);
+
+    gen_helper_check_ieee_exceptions(cpu_env);
+    gen_store_fpr_D(dc, rd, dst);
+}
+
+#ifdef TARGET_SPARC64
+static inline void gen_ne_fop_DD(DisasContext *dc, int rd, int rs,
+                                 void (*gen)(TCGv_i64, TCGv_i64))
+{
+    TCGv_i64 dst, src;
+
+    src = gen_load_fpr_D(dc, rs);
+    dst = gen_dest_fpr_D();
+
+    gen(dst, src);
+
+    gen_store_fpr_D(dc, rd, dst);
+}
+#endif
+
+static inline void gen_fop_DDD(DisasContext *dc, int rd, int rs1, int rs2,
+                        void (*gen)(TCGv_i64, TCGv_ptr, TCGv_i64, TCGv_i64))
+{
+    TCGv_i64 dst, src1, src2;
+
+    gen_clear_float_exceptions();
+    src1 = gen_load_fpr_D(dc, rs1);
+    src2 = gen_load_fpr_D(dc, rs2);
+    dst = gen_dest_fpr_D();
+
+    gen(dst, cpu_env, src1, src2);
+
+    gen_helper_check_ieee_exceptions(cpu_env);
+    gen_store_fpr_D(dc, rd, dst);
+}
+
+#ifdef TARGET_SPARC64
+static inline void gen_ne_fop_DDD(DisasContext *dc, int rd, int rs1, int rs2,
+                                  void (*gen)(TCGv_i64, TCGv_i64, TCGv_i64))
+{
+    TCGv_i64 dst, src1, src2;
+
+    src1 = gen_load_fpr_D(dc, rs1);
+    src2 = gen_load_fpr_D(dc, rs2);
+    dst = gen_dest_fpr_D();
+
+    gen(dst, src1, src2);
+
+    gen_store_fpr_D(dc, rd, dst);
+}
+#endif
+
+static inline void gen_fop_QQ(DisasContext *dc, int rd, int rs,
+                              void (*gen)(TCGv_ptr))
+{
+    gen_clear_float_exceptions();
+    gen_op_load_fpr_QT1(QFPREG(rs));
+
+    gen(cpu_env);
+
+    gen_helper_check_ieee_exceptions(cpu_env);
+    gen_op_store_QT0_fpr(QFPREG(rd));
+    gen_update_fprs_dirty(QFPREG(rd));
+}
+
+#ifdef TARGET_SPARC64
+static inline void gen_ne_fop_QQ(DisasContext *dc, int rd, int rs,
+                                 void (*gen)(TCGv_ptr))
+{
+    gen_op_load_fpr_QT1(QFPREG(rs));
+
+    gen(cpu_env);
+
+    gen_op_store_QT0_fpr(QFPREG(rd));
+    gen_update_fprs_dirty(QFPREG(rd));
+}
+#endif
+
+static inline void gen_fop_QQQ(DisasContext *dc, int rd, int rs1, int rs2,
+                               void (*gen)(TCGv_ptr))
+{
+    gen_clear_float_exceptions();
+    gen_op_load_fpr_QT0(QFPREG(rs1));
+    gen_op_load_fpr_QT1(QFPREG(rs2));
+
+    gen(cpu_env);
+
+    gen_helper_check_ieee_exceptions(cpu_env);
+    gen_op_store_QT0_fpr(QFPREG(rd));
+    gen_update_fprs_dirty(QFPREG(rd));
+}
+
+static inline void gen_fop_DFF(DisasContext *dc, int rd, int rs1, int rs2,
+                        void (*gen)(TCGv_i64, TCGv_ptr, TCGv_i32, TCGv_i32))
+{
+    TCGv_i64 dst;
+    TCGv_i32 src1, src2;
+
+    gen_clear_float_exceptions();
+    src1 = gen_load_fpr_F(dc, rs1);
+    src2 = gen_load_fpr_F(dc, rs2);
+    dst = gen_dest_fpr_D();
+
+    gen(dst, cpu_env, src1, src2);
+
+    gen_helper_check_ieee_exceptions(cpu_env);
+    gen_store_fpr_D(dc, rd, dst);
+}
+
+static inline void gen_fop_QDD(DisasContext *dc, int rd, int rs1, int rs2,
+                               void (*gen)(TCGv_ptr, TCGv_i64, TCGv_i64))
+{
+    TCGv_i64 src1, src2;
+
+    gen_clear_float_exceptions();
+    src1 = gen_load_fpr_D(dc, rs1);
+    src2 = gen_load_fpr_D(dc, rs2);
+
+    gen(cpu_env, src1, src2);
+
+    gen_helper_check_ieee_exceptions(cpu_env);
+    gen_op_store_QT0_fpr(QFPREG(rd));
+    gen_update_fprs_dirty(QFPREG(rd));
+}
+
+#ifdef TARGET_SPARC64
+static inline void gen_fop_DF(DisasContext *dc, int rd, int rs,
+                              void (*gen)(TCGv_i64, TCGv_ptr, TCGv_i32))
+{
+    TCGv_i64 dst;
+    TCGv_i32 src;
+
+    gen_clear_float_exceptions();
+    src = gen_load_fpr_F(dc, rs);
+    dst = gen_dest_fpr_D();
+
+    gen(dst, cpu_env, src);
+
+    gen_helper_check_ieee_exceptions(cpu_env);
+    gen_store_fpr_D(dc, rd, dst);
+}
+#endif
+
+static inline void gen_ne_fop_DF(DisasContext *dc, int rd, int rs,
+                                 void (*gen)(TCGv_i64, TCGv_ptr, TCGv_i32))
+{
+    TCGv_i64 dst;
+    TCGv_i32 src;
+
+    src = gen_load_fpr_F(dc, rs);
+    dst = gen_dest_fpr_D();
+
+    gen(dst, cpu_env, src);
+
+    gen_store_fpr_D(dc, rd, dst);
+}
+
+static inline void gen_fop_FD(DisasContext *dc, int rd, int rs,
+                              void (*gen)(TCGv_i32, TCGv_ptr, TCGv_i64))
+{
+    TCGv_i32 dst;
+    TCGv_i64 src;
+
+    gen_clear_float_exceptions();
+    src = gen_load_fpr_D(dc, rs);
+    dst = gen_dest_fpr_F();
+
+    gen(dst, cpu_env, src);
+
+    gen_helper_check_ieee_exceptions(cpu_env);
+    gen_store_fpr_F(dc, rd, dst);
+}
+
+static inline void gen_fop_FQ(DisasContext *dc, int rd, int rs,
+                              void (*gen)(TCGv_i32, TCGv_ptr))
+{
+    TCGv_i32 dst;
+
+    gen_clear_float_exceptions();
+    gen_op_load_fpr_QT1(QFPREG(rs));
+    dst = gen_dest_fpr_F();
+
+    gen(dst, cpu_env);
+
+    gen_helper_check_ieee_exceptions(cpu_env);
+    gen_store_fpr_F(dc, rd, dst);
+}
+
+static inline void gen_fop_DQ(DisasContext *dc, int rd, int rs,
+                              void (*gen)(TCGv_i64, TCGv_ptr))
+{
+    TCGv_i64 dst;
+
+    gen_clear_float_exceptions();
+    gen_op_load_fpr_QT1(QFPREG(rs));
+    dst = gen_dest_fpr_D();
+
+    gen(dst, cpu_env);
+
+    gen_helper_check_ieee_exceptions(cpu_env);
+    gen_store_fpr_D(dc, rd, dst);
+}
+
+static inline void gen_ne_fop_QF(DisasContext *dc, int rd, int rs,
+                                 void (*gen)(TCGv_ptr, TCGv_i32))
+{
+    TCGv_i32 src;
+
+    src = gen_load_fpr_F(dc, rs);
+
+    gen(cpu_env, src);
+
+    gen_op_store_QT0_fpr(QFPREG(rd));
+    gen_update_fprs_dirty(QFPREG(rd));
+}
+
+static inline void gen_ne_fop_QD(DisasContext *dc, int rd, int rs,
+                                 void (*gen)(TCGv_ptr, TCGv_i64))
+{
+    TCGv_i64 src;
+
+    src = gen_load_fpr_D(dc, rs);
+
+    gen(cpu_env, src);
+
+    gen_op_store_QT0_fpr(QFPREG(rd));
+    gen_update_fprs_dirty(QFPREG(rd));
+}
+
 /* asi moves */
 #ifdef TARGET_SPARC64
 static inline TCGv_i32 gen_get_asi(int insn, TCGv r_addr)
@@ -2415,279 +2714,115 @@ static void disas_sparc_insn(DisasContext * dc)
                     gen_store_fpr_F(dc, rd, cpu_src1_32);
                     break;
                 case 0x5: /* fnegs */
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fnegs(cpu_dst_32, cpu_src1_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FF(dc, rd, rs2, gen_helper_fnegs);
                     break;
                 case 0x9: /* fabss */
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fabss(cpu_dst_32, cpu_src1_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FF(dc, rd, rs2, gen_helper_fabss);
                     break;
                 case 0x29: /* fsqrts */
                     CHECK_FPU_FEATURE(dc, FSQRT);
-                    gen_clear_float_exceptions();
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fsqrts(cpu_dst_32, cpu_env, cpu_src1_32);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_fop_FF(dc, rd, rs2, gen_helper_fsqrts);
                     break;
                 case 0x2a: /* fsqrtd */
                     CHECK_FPU_FEATURE(dc, FSQRT);
-                    gen_clear_float_exceptions();
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fsqrtd(cpu_dst_64, cpu_env, cpu_src1_64);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_fop_DD(dc, rd, rs2, gen_helper_fsqrtd);
                     break;
                 case 0x2b: /* fsqrtq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_op_load_fpr_QT1(QFPREG(rs2));
-                    gen_clear_float_exceptions();
-                    gen_helper_fsqrtq(cpu_env);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_QT0_fpr(QFPREG(rd));
-                    gen_update_fprs_dirty(QFPREG(rd));
+                    gen_fop_QQ(dc, rd, rs2, gen_helper_fsqrtq);
                     break;
                 case 0x41: /* fadds */
-                    gen_clear_float_exceptions();
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fadds(cpu_dst_32, cpu_env,
-                                     cpu_src1_32, cpu_src2_32);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_fop_FFF(dc, rd, rs1, rs2, gen_helper_fadds);
                     break;
                 case 0x42: /* faddd */
-                    gen_clear_float_exceptions();
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_faddd(cpu_dst_64, cpu_env,
-                                     cpu_src1_64, cpu_src2_64);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_fop_DDD(dc, rd, rs1, rs2, gen_helper_faddd);
                     break;
                 case 0x43: /* faddq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_op_load_fpr_QT0(QFPREG(rs1));
-                    gen_op_load_fpr_QT1(QFPREG(rs2));
-                    gen_clear_float_exceptions();
-                    gen_helper_faddq(cpu_env);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_QT0_fpr(QFPREG(rd));
-                    gen_update_fprs_dirty(QFPREG(rd));
+                    gen_fop_QQQ(dc, rd, rs1, rs2, gen_helper_faddq);
                     break;
                 case 0x45: /* fsubs */
-                    gen_clear_float_exceptions();
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fsubs(cpu_dst_32, cpu_env,
-                                     cpu_src1_32, cpu_src2_32);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_fop_FFF(dc, rd, rs1, rs2, gen_helper_fsubs);
                     break;
                 case 0x46: /* fsubd */
-                    gen_clear_float_exceptions();
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fsubd(cpu_dst_64, cpu_env,
-                                     cpu_src1_64, cpu_src2_64);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_fop_DDD(dc, rd, rs1, rs2, gen_helper_fsubd);
                     break;
                 case 0x47: /* fsubq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_op_load_fpr_QT0(QFPREG(rs1));
-                    gen_op_load_fpr_QT1(QFPREG(rs2));
-                    gen_clear_float_exceptions();
-                    gen_helper_fsubq(cpu_env);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_QT0_fpr(QFPREG(rd));
-                    gen_update_fprs_dirty(QFPREG(rd));
+                    gen_fop_QQQ(dc, rd, rs1, rs2, gen_helper_fsubq);
                     break;
                 case 0x49: /* fmuls */
                     CHECK_FPU_FEATURE(dc, FMUL);
-                    gen_clear_float_exceptions();
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fmuls(cpu_dst_32, cpu_env,
-                                     cpu_src1_32, cpu_src2_32);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_fop_FFF(dc, rd, rs1, rs2, gen_helper_fmuls);
                     break;
                 case 0x4a: /* fmuld */
                     CHECK_FPU_FEATURE(dc, FMUL);
-                    gen_clear_float_exceptions();
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmuld(cpu_dst_64, cpu_env,
-                                     cpu_src1_64, cpu_src2_64);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmuld);
                     break;
                 case 0x4b: /* fmulq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
                     CHECK_FPU_FEATURE(dc, FMUL);
-                    gen_op_load_fpr_QT0(QFPREG(rs1));
-                    gen_op_load_fpr_QT1(QFPREG(rs2));
-                    gen_clear_float_exceptions();
-                    gen_helper_fmulq(cpu_env);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_QT0_fpr(QFPREG(rd));
-                    gen_update_fprs_dirty(QFPREG(rd));
+                    gen_fop_QQQ(dc, rd, rs1, rs2, gen_helper_fmulq);
                     break;
                 case 0x4d: /* fdivs */
-                    gen_clear_float_exceptions();
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fdivs(cpu_dst_32, cpu_env,
-                                     cpu_src1_32, cpu_src2_32);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_fop_FFF(dc, rd, rs1, rs2, gen_helper_fdivs);
                     break;
                 case 0x4e: /* fdivd */
-                    gen_clear_float_exceptions();
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fdivd(cpu_dst_64, cpu_env,
-                                     cpu_src1_64, cpu_src2_64);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_fop_DDD(dc, rd, rs1, rs2, gen_helper_fdivd);
                     break;
                 case 0x4f: /* fdivq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_op_load_fpr_QT0(QFPREG(rs1));
-                    gen_op_load_fpr_QT1(QFPREG(rs2));
-                    gen_clear_float_exceptions();
-                    gen_helper_fdivq(cpu_env);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_QT0_fpr(QFPREG(rd));
-                    gen_update_fprs_dirty(QFPREG(rd));
+                    gen_fop_QQQ(dc, rd, rs1, rs2, gen_helper_fdivq);
                     break;
                 case 0x69: /* fsmuld */
                     CHECK_FPU_FEATURE(dc, FSMULD);
-                    gen_clear_float_exceptions();
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fsmuld(cpu_dst_64, cpu_env,
-                                      cpu_src1_32, cpu_src2_32);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_fop_DFF(dc, rd, rs1, rs2, gen_helper_fsmuld);
                     break;
                 case 0x6e: /* fdmulq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_clear_float_exceptions();
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    gen_helper_fdmulq(cpu_env, cpu_src1_64, cpu_src2_64);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_QT0_fpr(QFPREG(rd));
-                    gen_update_fprs_dirty(QFPREG(rd));
+                    gen_fop_QDD(dc, rd, rs1, rs2, gen_helper_fdmulq);
                     break;
                 case 0xc4: /* fitos */
-                    gen_clear_float_exceptions();
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fitos(cpu_dst_32, cpu_env, cpu_src1_32);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_fop_FF(dc, rd, rs2, gen_helper_fitos);
                     break;
                 case 0xc6: /* fdtos */
-                    gen_clear_float_exceptions();
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fdtos(cpu_dst_32, cpu_env, cpu_src1_64);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_fop_FD(dc, rd, rs2, gen_helper_fdtos);
                     break;
                 case 0xc7: /* fqtos */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_op_load_fpr_QT1(QFPREG(rs2));
-                    gen_clear_float_exceptions();
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fqtos(cpu_dst_32, cpu_env);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_fop_FQ(dc, rd, rs2, gen_helper_fqtos);
                     break;
                 case 0xc8: /* fitod */
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fitod(cpu_dst_64, cpu_env, cpu_src1_32);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DF(dc, rd, rs2, gen_helper_fitod);
                     break;
                 case 0xc9: /* fstod */
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fstod(cpu_dst_64, cpu_env, cpu_src1_32);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DF(dc, rd, rs2, gen_helper_fstod);
                     break;
                 case 0xcb: /* fqtod */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_clear_float_exceptions();
-                    gen_op_load_fpr_QT1(QFPREG(rs2));
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fqtod(cpu_dst_64, cpu_env);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_fop_DQ(dc, rd, rs2, gen_helper_fqtod);
                     break;
                 case 0xcc: /* fitoq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
-                    gen_helper_fitoq(cpu_env, cpu_src1_32);
-                    gen_op_store_QT0_fpr(QFPREG(rd));
-                    gen_update_fprs_dirty(QFPREG(rd));
+                    gen_ne_fop_QF(dc, rd, rs2, gen_helper_fitoq);
                     break;
                 case 0xcd: /* fstoq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
-                    gen_helper_fstoq(cpu_env, cpu_src1_32);
-                    gen_op_store_QT0_fpr(QFPREG(rd));
-                    gen_update_fprs_dirty(QFPREG(rd));
+                    gen_ne_fop_QF(dc, rd, rs2, gen_helper_fstoq);
                     break;
                 case 0xce: /* fdtoq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
-                    gen_helper_fdtoq(cpu_env, cpu_src1_64);
-                    gen_op_store_QT0_fpr(QFPREG(rd));
-                    gen_update_fprs_dirty(QFPREG(rd));
+                    gen_ne_fop_QD(dc, rd, rs2, gen_helper_fdtoq);
                     break;
                 case 0xd1: /* fstoi */
-                    gen_clear_float_exceptions();
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fstoi(cpu_dst_32, cpu_env, cpu_src1_32);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_fop_FF(dc, rd, rs2, gen_helper_fstoi);
                     break;
                 case 0xd2: /* fdtoi */
-                    gen_clear_float_exceptions();
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fdtoi(cpu_dst_32, cpu_env, cpu_src1_64);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_fop_FD(dc, rd, rs2, gen_helper_fdtoi);
                     break;
                 case 0xd3: /* fqtoi */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_op_load_fpr_QT1(QFPREG(rs2));
-                    gen_clear_float_exceptions();
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fqtoi(cpu_dst_32, cpu_env);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_fop_FQ(dc, rd, rs2, gen_helper_fqtoi);
                     break;
 #ifdef TARGET_SPARC64
                 case 0x2: /* V9 fmovd */
@@ -2707,80 +2842,38 @@ static void disas_sparc_insn(DisasContext * dc)
                     gen_update_fprs_dirty(QFPREG(rd));
                     break;
                 case 0x6: /* V9 fnegd */
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fnegd(cpu_dst_64, cpu_src1_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DD(dc, rd, rs2, gen_helper_fnegd);
                     break;
                 case 0x7: /* V9 fnegq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_op_load_fpr_QT1(QFPREG(rs2));
-                    gen_helper_fnegq(cpu_env);
-                    gen_op_store_QT0_fpr(QFPREG(rd));
-                    gen_update_fprs_dirty(QFPREG(rd));
+                    gen_ne_fop_QQ(dc, rd, rs2, gen_helper_fnegq);
                     break;
                 case 0xa: /* V9 fabsd */
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fabsd(cpu_dst_64, cpu_src1_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DD(dc, rd, rs2, gen_helper_fabsd);
                     break;
                 case 0xb: /* V9 fabsq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_op_load_fpr_QT1(QFPREG(rs2));
-                    gen_helper_fabsq(cpu_env);
-                    gen_op_store_QT0_fpr(QFPREG(rd));
-                    gen_update_fprs_dirty(QFPREG(rd));
+                    gen_ne_fop_QQ(dc, rd, rs2, gen_helper_fabsq);
                     break;
                 case 0x81: /* V9 fstox */
-                    gen_clear_float_exceptions();
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fstox(cpu_dst_64, cpu_env, cpu_src1_32);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_fop_DF(dc, rd, rs2, gen_helper_fstox);
                     break;
                 case 0x82: /* V9 fdtox */
-                    gen_clear_float_exceptions();
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fdtox(cpu_dst_64, cpu_env, cpu_src1_64);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_fop_DD(dc, rd, rs2, gen_helper_fdtox);
                     break;
                 case 0x83: /* V9 fqtox */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_op_load_fpr_QT1(QFPREG(rs2));
-                    gen_clear_float_exceptions();
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fqtox(cpu_dst_64, cpu_env);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_fop_DQ(dc, rd, rs2, gen_helper_fqtox);
                     break;
                 case 0x84: /* V9 fxtos */
-                    gen_clear_float_exceptions();
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fxtos(cpu_dst_32, cpu_env, cpu_src1_64);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_fop_FD(dc, rd, rs2, gen_helper_fxtos);
                     break;
                 case 0x88: /* V9 fxtod */
-                    gen_clear_float_exceptions();
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fxtod(cpu_dst_64, cpu_env, cpu_src1_64);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_fop_DD(dc, rd, rs2, gen_helper_fxtod);
                     break;
                 case 0x8c: /* V9 fxtoq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_clear_float_exceptions();
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
-                    gen_helper_fxtoq(cpu_env, cpu_src1_64);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_QT0_fpr(QFPREG(rd));
-                    gen_update_fprs_dirty(QFPREG(rd));
+                    gen_ne_fop_QD(dc, rd, rs2, gen_helper_fxtoq);
                     break;
 #endif
                 default:
@@ -3990,65 +4083,31 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x031: /* VIS I fmul8x16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmul8x16(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmul8x16);
                     break;
                 case 0x033: /* VIS I fmul8x16au */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmul8x16au(cpu_dst_64, cpu_src1_64,
-                                          cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmul8x16au);
                     break;
                 case 0x035: /* VIS I fmul8x16al */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmul8x16al(cpu_dst_64, cpu_src1_64,
-                                          cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmul8x16al);
                     break;
                 case 0x036: /* VIS I fmul8sux16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmul8sux16(cpu_dst_64, cpu_src1_64,
-                                          cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmul8sux16);
                     break;
                 case 0x037: /* VIS I fmul8ulx16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmul8ulx16(cpu_dst_64, cpu_src1_64,
-                                          cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmul8ulx16);
                     break;
                 case 0x038: /* VIS I fmuld8sux16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmuld8sux16(cpu_dst_64, cpu_src1_64,
-                                           cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmuld8sux16);
                     break;
                 case 0x039: /* VIS I fmuld8ulx16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmuld8ulx16(cpu_dst_64, cpu_src1_64,
-                                           cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmuld8ulx16);
                     break;
                 case 0x03a: /* VIS I fpack32 */
                 case 0x03b: /* VIS I fpack16 */
@@ -4067,86 +4126,46 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x04b: /* VIS I fpmerge */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fpmerge(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fpmerge);
                     break;
                 case 0x04c: /* VIS II bshuffle */
                     // XXX
                     goto illegal_insn;
                 case 0x04d: /* VIS I fexpand */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fexpand(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fexpand);
                     break;
                 case 0x050: /* VIS I fpadd16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fpadd16(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fpadd16);
                     break;
                 case 0x051: /* VIS I fpadd16s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fpadd16s(cpu_dst_32, cpu_src1_32, cpu_src2_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FFF(dc, rd, rs1, rs2, gen_helper_fpadd16s);
                     break;
                 case 0x052: /* VIS I fpadd32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fpadd32(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fpadd32);
                     break;
                 case 0x053: /* VIS I fpadd32s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    tcg_gen_add_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_add_i32);
                     break;
                 case 0x054: /* VIS I fpsub16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fpsub16(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fpsub16);
                     break;
                 case 0x055: /* VIS I fpsub16s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fpsub16s(cpu_dst_32, cpu_src1_32, cpu_src2_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FFF(dc, rd, rs1, rs2, gen_helper_fpsub16s);
                     break;
                 case 0x056: /* VIS I fpsub32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fpsub32(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fpsub32);
                     break;
                 case 0x057: /* VIS I fpsub32s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    tcg_gen_sub_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_sub_i32);
                     break;
                 case 0x060: /* VIS I fzero */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4162,143 +4181,75 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x062: /* VIS I fnor */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    tcg_gen_nor_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, tcg_gen_nor_i64);
                     break;
                 case 0x063: /* VIS I fnors */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    tcg_gen_nor_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_nor_i32);
                     break;
                 case 0x064: /* VIS I fandnot2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    tcg_gen_andc_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, tcg_gen_andc_i64);
                     break;
                 case 0x065: /* VIS I fandnot2s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    tcg_gen_andc_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_andc_i32);
                     break;
                 case 0x066: /* VIS I fnot2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    tcg_gen_not_i64(cpu_dst_64, cpu_src1_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DD(dc, rd, rs2, tcg_gen_not_i64);
                     break;
                 case 0x067: /* VIS I fnot2s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    tcg_gen_not_i32(cpu_dst_32, cpu_src1_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FF(dc, rd, rs2, tcg_gen_not_i32);
                     break;
                 case 0x068: /* VIS I fandnot1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    tcg_gen_andc_i64(cpu_dst_64, cpu_src2_64, cpu_src1_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs2, rs1, tcg_gen_andc_i64);
                     break;
                 case 0x069: /* VIS I fandnot1s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    tcg_gen_andc_i32(cpu_dst_32, cpu_src2_32, cpu_src1_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FFF(dc, rd, rs2, rs1, tcg_gen_andc_i32);
                     break;
                 case 0x06a: /* VIS I fnot1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    tcg_gen_not_i64(cpu_dst_64, cpu_src1_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DD(dc, rd, rs1, tcg_gen_not_i64);
                     break;
                 case 0x06b: /* VIS I fnot1s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    tcg_gen_not_i32(cpu_dst_32, cpu_src1_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FF(dc, rd, rs1, tcg_gen_not_i32);
                     break;
                 case 0x06c: /* VIS I fxor */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    tcg_gen_xor_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, tcg_gen_xor_i64);
                     break;
                 case 0x06d: /* VIS I fxors */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    tcg_gen_xor_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_xor_i32);
                     break;
                 case 0x06e: /* VIS I fnand */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    tcg_gen_nand_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, tcg_gen_nand_i64);
                     break;
                 case 0x06f: /* VIS I fnands */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    tcg_gen_nand_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_nand_i32);
                     break;
                 case 0x070: /* VIS I fand */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    tcg_gen_and_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, tcg_gen_and_i64);
                     break;
                 case 0x071: /* VIS I fands */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    tcg_gen_and_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_and_i32);
                     break;
                 case 0x072: /* VIS I fxnor */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    tcg_gen_eqv_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, tcg_gen_eqv_i64);
                     break;
                 case 0x073: /* VIS I fxnors */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    tcg_gen_eqv_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_eqv_i32);
                     break;
                 case 0x074: /* VIS I fsrc1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4312,19 +4263,11 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x076: /* VIS I fornot2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    tcg_gen_orc_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, tcg_gen_orc_i64);
                     break;
                 case 0x077: /* VIS I fornot2s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    tcg_gen_orc_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_orc_i32);
                     break;
                 case 0x078: /* VIS I fsrc2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4338,35 +4281,19 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x07a: /* VIS I fornot1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    tcg_gen_orc_i64(cpu_dst_64, cpu_src2_64, cpu_src1_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs2, rs1, tcg_gen_orc_i64);
                     break;
                 case 0x07b: /* VIS I fornot1s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    tcg_gen_orc_i32(cpu_dst_32, cpu_src2_32, cpu_src1_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FFF(dc, rd, rs2, rs1, tcg_gen_orc_i32);
                     break;
                 case 0x07c: /* VIS I for */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    tcg_gen_or_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, tcg_gen_or_i64);
                     break;
                 case 0x07d: /* VIS I fors */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    tcg_gen_or_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_or_i32);
                     break;
                 case 0x07e: /* VIS I fone */
                     CHECK_FPU_FEATURE(dc, VIS1);
-- 
1.7.6.4


* [Qemu-devel] [PATCH 07/16] target-sparc: Extract float128 move to a function.
From: Richard Henderson @ 2011-10-26 21:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-sparc/translate.c |   50 ++++++++++++++++-----------------------------
 1 files changed, 18 insertions(+), 32 deletions(-)
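
Reader's note, not part of the original submission: a standalone model of
what the new gen_move_Q helper does. Four call sites each spelled out the
same copy of four 32-bit register words followed by a dirty-bit update, and
the patch folds them into one function. FakeFpuState, move_q and
mark_fprs_dirty are hypothetical names, and the QFPREG definition and the
dirty-bit rule are assumptions about the surrounding QEMU code rather than
something this patch shows.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the translator state; only QFPREG is a name
 * taken from the patch, and its definition here is an assumption. */
#define NB_FPREGS 64
#define QFPREG(r) ((((r) & 1) << 5) | ((r) & 0x1c))

typedef struct {
    uint32_t fpr[NB_FPREGS]; /* models cpu__fpr[] as plain 32-bit words */
    uint32_t fprs;           /* models the FPRS dirty bits */
} FakeFpuState;

/* Assumed rule: bit 0 covers the lower 32 words, bit 1 the upper 32. */
static void mark_fprs_dirty(FakeFpuState *s, int rd)
{
    s->fprs |= (rd < 32) ? 1u : 2u;
}

/* Mirrors the shape of gen_move_Q: canonicalise both register numbers
 * once, copy the four words of the quad, then mark the destination. */
static void move_q(FakeFpuState *s, int rd, int rs)
{
    int i;

    rd = QFPREG(rd);
    rs = QFPREG(rs);
    assert(rd + 3 < NB_FPREGS && rs + 3 < NB_FPREGS);

    for (i = 0; i < 4; i++) {
        s->fpr[rd + i] = s->fpr[rs + i];
    }
    mark_fprs_dirty(s, rd);
}
```

Under these assumptions, move_q(&s, 8, 4) copies words 4..7 into words
8..11 and sets only the lower dirty bit. An odd register number such as 1
maps to word 32 and sets the upper dirty bit instead.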

diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 6c13f1c..106b406 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -222,6 +222,20 @@ static void gen_op_store_QT0_fpr(unsigned int dst)
                    offsetof(CPU_QuadU, l.lowest));
 }
 
+#ifdef TARGET_SPARC64
+static void gen_move_Q(int rd, int rs)
+{
+    rd = QFPREG(rd);
+    rs = QFPREG(rs);
+
+    tcg_gen_mov_i32(cpu__fpr[rd], cpu__fpr[rs]);
+    tcg_gen_mov_i32(cpu__fpr[rd + 1], cpu__fpr[rs + 1]);
+    tcg_gen_mov_i32(cpu__fpr[rd + 2], cpu__fpr[rs + 2]);
+    tcg_gen_mov_i32(cpu__fpr[rd + 3], cpu__fpr[rs + 3]);
+    gen_update_fprs_dirty(rd);
+}
+#endif
+
 /* moves */
 #ifdef CONFIG_USER_ONLY
 #define supervisor(dc) 0
@@ -2831,15 +2845,7 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x3: /* V9 fmovq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd)],
-                                    cpu__fpr[QFPREG(rs2)]);
-                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 1],
-                                    cpu__fpr[QFPREG(rs2) + 1]);
-                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 2],
-                                    cpu__fpr[QFPREG(rs2) + 2]);
-                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 3],
-                                    cpu__fpr[QFPREG(rs2) + 3]);
-                    gen_update_fprs_dirty(QFPREG(rd));
+                    gen_move_Q(rd, rs2);
                     break;
                 case 0x6: /* V9 fnegd */
                     gen_ne_fop_DD(dc, rd, rs2, gen_helper_fnegd);
@@ -2924,11 +2930,7 @@ static void disas_sparc_insn(DisasContext * dc)
                     cpu_src1 = get_src1(insn, cpu_src1);
                     tcg_gen_brcondi_tl(gen_tcg_cond_reg[cond], cpu_src1,
                                        0, l1);
-                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd)], cpu__fpr[QFPREG(rs2)]);
-                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 1], cpu__fpr[QFPREG(rs2) + 1]);
-                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 2], cpu__fpr[QFPREG(rs2) + 2]);
-                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 3], cpu__fpr[QFPREG(rs2) + 3]);
-                    gen_update_fprs_dirty(QFPREG(rd));
+                    gen_move_Q(rd, rs2);
                     gen_set_label(l1);
                     break;
                 }
@@ -2978,15 +2980,7 @@ static void disas_sparc_insn(DisasContext * dc)
                         gen_fcond(r_cond, fcc, cond);                   \
                         tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond,         \
                                            0, l1);                      \
-                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd)],           \
-                                        cpu__fpr[QFPREG(rs2)]);         \
-                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 1],       \
-                                        cpu__fpr[QFPREG(rs2) + 1]);     \
-                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 2],       \
-                                        cpu__fpr[QFPREG(rs2) + 2]);     \
-                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 3],       \
-                                        cpu__fpr[QFPREG(rs2) + 3]);     \
-                        gen_update_fprs_dirty(QFPREG(rd));              \
+                        gen_move_Q(rd, rs2);                            \
                         gen_set_label(l1);                              \
                         tcg_temp_free(r_cond);                          \
                     }
@@ -3077,15 +3071,7 @@ static void disas_sparc_insn(DisasContext * dc)
                         gen_cond(r_cond, icc, cond, dc);                \
                         tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond,         \
                                            0, l1);                      \
-                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd)],           \
-                                        cpu__fpr[QFPREG(rs2)]);         \
-                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 1],       \
-                                        cpu__fpr[QFPREG(rs2) + 1]);     \
-                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 2],       \
-                                        cpu__fpr[QFPREG(rs2) + 2]);     \
-                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 3],       \
-                                        cpu__fpr[QFPREG(rs2) + 3]);     \
-                        gen_update_fprs_dirty(QFPREG(rd));              \
+                        gen_move_Q(rd, rs2);                            \
                         gen_set_label(l1);                              \
                         tcg_temp_free(r_cond);                          \
                     }
-- 
1.7.6.4


* [Qemu-devel] [PATCH 08/16] target-sparc: Undo cpu_fpr rename.
From: Richard Henderson @ 2011-10-26 21:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-sparc/translate.c |   56 +++++++++++++++++++++++-----------------------
 1 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 106b406..0b95b64 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -63,7 +63,7 @@ static TCGv cpu_tmp0;
 static TCGv_i32 cpu_tmp32;
 static TCGv_i64 cpu_tmp64;
 /* Floating point registers */
-static TCGv_i32 cpu__fpr[TARGET_FPREGS];
+static TCGv_i32 cpu_fpr[TARGET_FPREGS];
 
 static target_ulong gen_opc_npc[OPC_BUF_SIZE];
 static target_ulong gen_opc_jump_pc[2];
@@ -126,12 +126,12 @@ static inline void gen_update_fprs_dirty(int rd)
 /* floating point registers moves */
 static TCGv_i32 gen_load_fpr_F(DisasContext *dc, unsigned int src)
 {
-    return cpu__fpr[src];
+    return cpu_fpr[src];
 }
 
 static void gen_store_fpr_F(DisasContext *dc, unsigned int dst, TCGv_i32 v)
 {
-    tcg_gen_mov_i32(cpu__fpr[dst], v);
+    tcg_gen_mov_i32(cpu_fpr[dst], v);
     gen_update_fprs_dirty(dst);
 }
 
@@ -146,13 +146,13 @@ static TCGv_i64 gen_load_fpr_D(DisasContext *dc, unsigned int src)
     src = DFPREG(src);
 
 #if TCG_TARGET_REG_BITS == 32
-    tcg_gen_mov_i32(TCGV_HIGH(ret), cpu__fpr[src]);
-    tcg_gen_mov_i32(TCGV_LOW(ret), cpu__fpr[src + 1]);
+    tcg_gen_mov_i32(TCGV_HIGH(ret), cpu_fpr[src]);
+    tcg_gen_mov_i32(TCGV_LOW(ret), cpu_fpr[src + 1]);
 #else
     {
         TCGv_i64 t = tcg_temp_new_i64();
-        tcg_gen_extu_i32_i64(ret, cpu__fpr[src]);
-        tcg_gen_extu_i32_i64(t, cpu__fpr[src + 1]);
+        tcg_gen_extu_i32_i64(ret, cpu_fpr[src]);
+        tcg_gen_extu_i32_i64(t, cpu_fpr[src + 1]);
         tcg_gen_shli_i64(ret, ret, 32);
         tcg_gen_or_i64(ret, ret, t);
         tcg_temp_free_i64(t);
@@ -173,9 +173,9 @@ static void gen_store_fpr_D(DisasContext *dc, unsigned int dst, TCGv_i64 v)
     tcg_gen_mov_i32(cpu__fpu[dst], TCGV_HIGH(v));
     tcg_gen_mov_i32(cpu__fpu[dst + 1], TCGV_LOW(v));
 #else
-    tcg_gen_trunc_i64_i32(cpu__fpr[dst + 1], v);
+    tcg_gen_trunc_i64_i32(cpu_fpr[dst + 1], v);
     tcg_gen_shri_i64(v, v, 32);
-    tcg_gen_trunc_i64_i32(cpu__fpr[dst], v);
+    tcg_gen_trunc_i64_i32(cpu_fpr[dst], v);
 #endif
 
     gen_update_fprs_dirty(dst);
@@ -188,37 +188,37 @@ static TCGv_i64 gen_dest_fpr_D(void)
 
 static void gen_op_load_fpr_QT0(unsigned int src)
 {
-    tcg_gen_st_i32(cpu__fpr[src], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_st_i32(cpu_fpr[src], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.upmost));
-    tcg_gen_st_i32(cpu__fpr[src + 1], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_st_i32(cpu_fpr[src + 1], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.upper));
-    tcg_gen_st_i32(cpu__fpr[src + 2], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_st_i32(cpu_fpr[src + 2], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.lower));
-    tcg_gen_st_i32(cpu__fpr[src + 3], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_st_i32(cpu_fpr[src + 3], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.lowest));
 }
 
 static void gen_op_load_fpr_QT1(unsigned int src)
 {
-    tcg_gen_st_i32(cpu__fpr[src], cpu_env, offsetof(CPUSPARCState, qt1) +
+    tcg_gen_st_i32(cpu_fpr[src], cpu_env, offsetof(CPUSPARCState, qt1) +
                    offsetof(CPU_QuadU, l.upmost));
-    tcg_gen_st_i32(cpu__fpr[src + 1], cpu_env, offsetof(CPUSPARCState, qt1) +
+    tcg_gen_st_i32(cpu_fpr[src + 1], cpu_env, offsetof(CPUSPARCState, qt1) +
                    offsetof(CPU_QuadU, l.upper));
-    tcg_gen_st_i32(cpu__fpr[src + 2], cpu_env, offsetof(CPUSPARCState, qt1) +
+    tcg_gen_st_i32(cpu_fpr[src + 2], cpu_env, offsetof(CPUSPARCState, qt1) +
                    offsetof(CPU_QuadU, l.lower));
-    tcg_gen_st_i32(cpu__fpr[src + 3], cpu_env, offsetof(CPUSPARCState, qt1) +
+    tcg_gen_st_i32(cpu_fpr[src + 3], cpu_env, offsetof(CPUSPARCState, qt1) +
                    offsetof(CPU_QuadU, l.lowest));
 }
 
 static void gen_op_store_QT0_fpr(unsigned int dst)
 {
-    tcg_gen_ld_i32(cpu__fpr[dst], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_ld_i32(cpu_fpr[dst], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.upmost));
-    tcg_gen_ld_i32(cpu__fpr[dst + 1], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_ld_i32(cpu_fpr[dst + 1], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.upper));
-    tcg_gen_ld_i32(cpu__fpr[dst + 2], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_ld_i32(cpu_fpr[dst + 2], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.lower));
-    tcg_gen_ld_i32(cpu__fpr[dst + 3], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_ld_i32(cpu_fpr[dst + 3], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.lowest));
 }
 
@@ -228,10 +228,10 @@ static void gen_move_Q(int rd, int rs)
     rd = QFPREG(rd);
     rs = QFPREG(rs);
 
-    tcg_gen_mov_i32(cpu__fpr[rd], cpu__fpr[rs]);
-    tcg_gen_mov_i32(cpu__fpr[rd + 1], cpu__fpr[rs + 1]);
-    tcg_gen_mov_i32(cpu__fpr[rd + 2], cpu__fpr[rs + 2]);
-    tcg_gen_mov_i32(cpu__fpr[rd + 3], cpu__fpr[rs + 3]);
+    tcg_gen_mov_i32(cpu_fpr[rd], cpu_fpr[rs]);
+    tcg_gen_mov_i32(cpu_fpr[rd + 1], cpu_fpr[rs + 1]);
+    tcg_gen_mov_i32(cpu_fpr[rd + 2], cpu_fpr[rs + 2]);
+    tcg_gen_mov_i32(cpu_fpr[rd + 3], cpu_fpr[rs + 3]);
     gen_update_fprs_dirty(rd);
 }
 #endif
@@ -5251,9 +5251,9 @@ void gen_intermediate_code_init(CPUSPARCState *env)
                                               offsetof(CPUState, gregs[i]),
                                               gregnames[i]);
         for (i = 0; i < TARGET_FPREGS; i++)
-            cpu__fpr[i] = tcg_global_mem_new_i32(TCG_AREG0,
-                                                 offsetof(CPUState, fpr[i]),
-                                                 fregnames[i]);
+            cpu_fpr[i] = tcg_global_mem_new_i32(TCG_AREG0,
+                                                offsetof(CPUState, fpr[i]),
+                                                fregnames[i]);
 
         /* register helpers */
 
-- 
1.7.6.4


* [Qemu-devel] [PATCH 09/16] target-sparc: Change fpr representation to doubles.
  2011-10-26 21:15 [Qemu-devel] [PATCH v2 00/16] Sparc FPU/VIS improvements Richard Henderson
                   ` (7 preceding siblings ...)
  2011-10-26 21:15 ` [Qemu-devel] [PATCH 08/16] target-sparc: Undo cpu_fpr rename Richard Henderson
@ 2011-10-26 21:15 ` Richard Henderson
  2011-10-26 21:15 ` [Qemu-devel] [PATCH 10/16] target-sparc: Do exceptions management fully inside the helpers Richard Henderson
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2011-10-26 21:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

This allows a more efficient representation for 64-bit hosts.
It should be about the same for 32-bit hosts, as we can still
access the individual pieces of the double.
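
As a rough illustration of the packed layout (a minimal sketch, not code
from this patch): the array name, NB_DPREGS, read_single and write_single
below are hypothetical stand-ins, and a plain uint64_t replaces CPU_DoubleU
so the example does not depend on host endianness. Even-numbered single
registers live in the upper 32 bits of fpr[n / 2], odd-numbered ones in the
lower 32 bits, which is the same split the diff uses for l.upper/l.lower.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for TARGET_DPREGS on sparc64. */
#define NB_DPREGS 32

/* Each entry holds one double-precision register, i.e. two singles. */
static uint64_t fpr[NB_DPREGS];

/* Read single-precision register %f<n>.
   Even n selects the upper half, odd n the lower half. */
static uint32_t read_single(unsigned int n)
{
    assert(n < 2 * NB_DPREGS);
    uint64_t v = fpr[n / 2];
    return (n & 1) ? (uint32_t)v : (uint32_t)(v >> 32);
}

/* Write single-precision register %f<n> without disturbing its sibling.
   This is a 32-bit deposit at bit 0 (odd n) or bit 32 (even n). */
static void write_single(unsigned int n, uint32_t val)
{
    assert(n < 2 * NB_DPREGS);
    unsigned int shift = (n & 1) ? 0 : 32;
    uint64_t mask = (uint64_t)0xffffffffu << shift;
    fpr[n / 2] = (fpr[n / 2] & ~mask) | ((uint64_t)val << shift);
}
```

With this layout a double-precision access is a single 64-bit load or
store of fpr[n / 2], while a single-precision access costs a shift or a
deposit on 64-bit hosts.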

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 gdbstub.c                  |   35 +++++++---
 linux-user/signal.c        |   28 +++++----
 monitor.c                  |   96 ++++++++++++++--------------
 target-sparc/cpu.h         |    7 +-
 target-sparc/cpu_init.c    |    6 +-
 target-sparc/ldst_helper.c |   71 +++++++++------------
 target-sparc/machine.c     |   20 ++----
 target-sparc/translate.c   |  150 +++++++++++++++++++++-----------------------
 8 files changed, 202 insertions(+), 211 deletions(-)

diff --git a/gdbstub.c b/gdbstub.c
index 4009058..a25f404 100644
--- a/gdbstub.c
+++ b/gdbstub.c
@@ -814,7 +814,11 @@ static int cpu_gdb_read_register(CPUState *env, uint8_t *mem_buf, int n)
 #if defined(TARGET_ABI32) || !defined(TARGET_SPARC64)
     if (n < 64) {
         /* fprs */
-        GET_REG32(*((uint32_t *)&env->fpr[n - 32]));
+        if (n & 1) {
+            GET_REG32(env->fpr[(n - 32) / 2].l.lower);
+        } else {
+            GET_REG32(env->fpr[(n - 32) / 2].l.upper);
+        }
     }
     /* Y, PSR, WIM, TBR, PC, NPC, FPSR, CPSR */
     switch (n) {
@@ -831,15 +835,15 @@ static int cpu_gdb_read_register(CPUState *env, uint8_t *mem_buf, int n)
 #else
     if (n < 64) {
         /* f0-f31 */
-        GET_REG32(*((uint32_t *)&env->fpr[n - 32]));
+        if (n & 1) {
+            GET_REG32(env->fpr[(n - 32) / 2].l.lower);
+        } else {
+            GET_REG32(env->fpr[(n - 32) / 2].l.upper);
+        }
     }
     if (n < 80) {
         /* f32-f62 (double width, even numbers only) */
-        uint64_t val;
-
-        val = (uint64_t)*((uint32_t *)&env->fpr[(n - 64) * 2 + 32]) << 32;
-        val |= *((uint32_t *)&env->fpr[(n - 64) * 2 + 33]);
-        GET_REG64(val);
+        GET_REG64(env->fpr[(n - 64) + 16].ll);
     }
     switch (n) {
     case 80: GET_REGL(env->pc);
@@ -878,7 +882,12 @@ static int cpu_gdb_write_register(CPUState *env, uint8_t *mem_buf, int n)
 #if defined(TARGET_ABI32) || !defined(TARGET_SPARC64)
     else if (n < 64) {
         /* fprs */
-        *((uint32_t *)&env->fpr[n - 32]) = tmp;
+        /* f0-f31 */
+        if (n & 1) {
+            env->fpr[(n - 32) / 2].l.lower = tmp;
+        } else {
+            env->fpr[(n - 32) / 2].l.upper = tmp;
+        }
     } else {
         /* Y, PSR, WIM, TBR, PC, NPC, FPSR, CPSR */
         switch (n) {
@@ -896,12 +905,16 @@ static int cpu_gdb_write_register(CPUState *env, uint8_t *mem_buf, int n)
 #else
     else if (n < 64) {
         /* f0-f31 */
-        env->fpr[n] = ldfl_p(mem_buf);
+        tmp = ldl_p(mem_buf);
+        if (n & 1) {
+            env->fpr[(n - 32) / 2].l.lower = tmp;
+        } else {
+            env->fpr[(n - 32) / 2].l.upper = tmp;
+        }
         return 4;
     } else if (n < 80) {
         /* f32-f62 (double width, even numbers only) */
-        *((uint32_t *)&env->fpr[(n - 64) * 2 + 32]) = tmp >> 32;
-        *((uint32_t *)&env->fpr[(n - 64) * 2 + 33]) = tmp;
+        env->fpr[(n - 64) + 16].ll = tmp;
     } else {
         switch (n) {
         case 80: env->pc = tmp; break;
diff --git a/linux-user/signal.c b/linux-user/signal.c
index 40c5eb1..f3b767e 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -2296,12 +2296,14 @@ void sparc64_set_context(CPUSPARCState *env)
      */
     err |= __get_user(env->fprs, &(ucp->tuc_mcontext.mc_fpregs.mcfpu_fprs));
     {
-        uint32_t *src, *dst;
-        src = ucp->tuc_mcontext.mc_fpregs.mcfpu_fregs.sregs;
-        dst = env->fpr;
-        /* XXX: check that the CPU storage is the same as user context */
-        for (i = 0; i < 64; i++, dst++, src++)
-            err |= __get_user(*dst, src);
+        uint32_t *src = ucp->tuc_mcontext.mc_fpregs.mcfpu_fregs.sregs;
+        for (i = 0; i < 64; i++, src++) {
+            if (i & 1) {
+                err |= __get_user(env->fpr[i/2].l.lower, src);
+            } else {
+                err |= __get_user(env->fpr[i/2].l.upper, src);
+            }
+        }
     }
     err |= __get_user(env->fsr,
                       &(ucp->tuc_mcontext.mc_fpregs.mcfpu_fsr));
@@ -2390,12 +2392,14 @@ void sparc64_get_context(CPUSPARCState *env)
     err |= __put_user(i7, &(mcp->mc_i7));
 
     {
-        uint32_t *src, *dst;
-        src = env->fpr;
-        dst = ucp->tuc_mcontext.mc_fpregs.mcfpu_fregs.sregs;
-        /* XXX: check that the CPU storage is the same as user context */
-        for (i = 0; i < 64; i++, dst++, src++)
-            err |= __put_user(*src, dst);
+        uint32_t *dst = ucp->tuc_mcontext.mc_fpregs.mcfpu_fregs.sregs;
+        for (i = 0; i < 64; i++, dst++) {
+            if (i & 1) {
+                err |= __put_user(env->fpr[i/2].l.lower, dst);
+            } else {
+                err |= __put_user(env->fpr[i/2].l.upper, dst);
+            }
+        }
     }
     err |= __put_user(env->fsr, &(mcp->mc_fpregs.mcfpu_fsr));
     err |= __put_user(env->gsr, &(mcp->mc_fpregs.mcfpu_gsr));
diff --git a/monitor.c b/monitor.c
index ffda0fe..d13bd15 100644
--- a/monitor.c
+++ b/monitor.c
@@ -3471,55 +3471,55 @@ static const MonitorDef monitor_defs[] = {
 #endif
     { "tbr", offsetof(CPUState, tbr) },
     { "fsr", offsetof(CPUState, fsr) },
-    { "f0", offsetof(CPUState, fpr[0]) },
-    { "f1", offsetof(CPUState, fpr[1]) },
-    { "f2", offsetof(CPUState, fpr[2]) },
-    { "f3", offsetof(CPUState, fpr[3]) },
-    { "f4", offsetof(CPUState, fpr[4]) },
-    { "f5", offsetof(CPUState, fpr[5]) },
-    { "f6", offsetof(CPUState, fpr[6]) },
-    { "f7", offsetof(CPUState, fpr[7]) },
-    { "f8", offsetof(CPUState, fpr[8]) },
-    { "f9", offsetof(CPUState, fpr[9]) },
-    { "f10", offsetof(CPUState, fpr[10]) },
-    { "f11", offsetof(CPUState, fpr[11]) },
-    { "f12", offsetof(CPUState, fpr[12]) },
-    { "f13", offsetof(CPUState, fpr[13]) },
-    { "f14", offsetof(CPUState, fpr[14]) },
-    { "f15", offsetof(CPUState, fpr[15]) },
-    { "f16", offsetof(CPUState, fpr[16]) },
-    { "f17", offsetof(CPUState, fpr[17]) },
-    { "f18", offsetof(CPUState, fpr[18]) },
-    { "f19", offsetof(CPUState, fpr[19]) },
-    { "f20", offsetof(CPUState, fpr[20]) },
-    { "f21", offsetof(CPUState, fpr[21]) },
-    { "f22", offsetof(CPUState, fpr[22]) },
-    { "f23", offsetof(CPUState, fpr[23]) },
-    { "f24", offsetof(CPUState, fpr[24]) },
-    { "f25", offsetof(CPUState, fpr[25]) },
-    { "f26", offsetof(CPUState, fpr[26]) },
-    { "f27", offsetof(CPUState, fpr[27]) },
-    { "f28", offsetof(CPUState, fpr[28]) },
-    { "f29", offsetof(CPUState, fpr[29]) },
-    { "f30", offsetof(CPUState, fpr[30]) },
-    { "f31", offsetof(CPUState, fpr[31]) },
+    { "f0", offsetof(CPUState, fpr[0].l.upper) },
+    { "f1", offsetof(CPUState, fpr[0].l.lower) },
+    { "f2", offsetof(CPUState, fpr[1].l.upper) },
+    { "f3", offsetof(CPUState, fpr[1].l.lower) },
+    { "f4", offsetof(CPUState, fpr[2].l.upper) },
+    { "f5", offsetof(CPUState, fpr[2].l.lower) },
+    { "f6", offsetof(CPUState, fpr[3].l.upper) },
+    { "f7", offsetof(CPUState, fpr[3].l.lower) },
+    { "f8", offsetof(CPUState, fpr[4].l.upper) },
+    { "f9", offsetof(CPUState, fpr[4].l.lower) },
+    { "f10", offsetof(CPUState, fpr[5].l.upper) },
+    { "f11", offsetof(CPUState, fpr[5].l.lower) },
+    { "f12", offsetof(CPUState, fpr[6].l.upper) },
+    { "f13", offsetof(CPUState, fpr[6].l.lower) },
+    { "f14", offsetof(CPUState, fpr[7].l.upper) },
+    { "f15", offsetof(CPUState, fpr[7].l.lower) },
+    { "f16", offsetof(CPUState, fpr[8].l.upper) },
+    { "f17", offsetof(CPUState, fpr[8].l.lower) },
+    { "f18", offsetof(CPUState, fpr[9].l.upper) },
+    { "f19", offsetof(CPUState, fpr[9].l.lower) },
+    { "f20", offsetof(CPUState, fpr[10].l.upper) },
+    { "f21", offsetof(CPUState, fpr[10].l.lower) },
+    { "f22", offsetof(CPUState, fpr[11].l.upper) },
+    { "f23", offsetof(CPUState, fpr[11].l.lower) },
+    { "f24", offsetof(CPUState, fpr[12].l.upper) },
+    { "f25", offsetof(CPUState, fpr[12].l.lower) },
+    { "f26", offsetof(CPUState, fpr[13].l.upper) },
+    { "f27", offsetof(CPUState, fpr[13].l.lower) },
+    { "f28", offsetof(CPUState, fpr[14].l.upper) },
+    { "f29", offsetof(CPUState, fpr[14].l.lower) },
+    { "f30", offsetof(CPUState, fpr[15].l.upper) },
+    { "f31", offsetof(CPUState, fpr[15].l.lower) },
 #ifdef TARGET_SPARC64
-    { "f32", offsetof(CPUState, fpr[32]) },
-    { "f34", offsetof(CPUState, fpr[34]) },
-    { "f36", offsetof(CPUState, fpr[36]) },
-    { "f38", offsetof(CPUState, fpr[38]) },
-    { "f40", offsetof(CPUState, fpr[40]) },
-    { "f42", offsetof(CPUState, fpr[42]) },
-    { "f44", offsetof(CPUState, fpr[44]) },
-    { "f46", offsetof(CPUState, fpr[46]) },
-    { "f48", offsetof(CPUState, fpr[48]) },
-    { "f50", offsetof(CPUState, fpr[50]) },
-    { "f52", offsetof(CPUState, fpr[52]) },
-    { "f54", offsetof(CPUState, fpr[54]) },
-    { "f56", offsetof(CPUState, fpr[56]) },
-    { "f58", offsetof(CPUState, fpr[58]) },
-    { "f60", offsetof(CPUState, fpr[60]) },
-    { "f62", offsetof(CPUState, fpr[62]) },
+    { "f32", offsetof(CPUState, fpr[16]) },
+    { "f34", offsetof(CPUState, fpr[17]) },
+    { "f36", offsetof(CPUState, fpr[18]) },
+    { "f38", offsetof(CPUState, fpr[19]) },
+    { "f40", offsetof(CPUState, fpr[20]) },
+    { "f42", offsetof(CPUState, fpr[21]) },
+    { "f44", offsetof(CPUState, fpr[22]) },
+    { "f46", offsetof(CPUState, fpr[23]) },
+    { "f48", offsetof(CPUState, fpr[24]) },
+    { "f50", offsetof(CPUState, fpr[25]) },
+    { "f52", offsetof(CPUState, fpr[26]) },
+    { "f54", offsetof(CPUState, fpr[27]) },
+    { "f56", offsetof(CPUState, fpr[28]) },
+    { "f58", offsetof(CPUState, fpr[29]) },
+    { "f60", offsetof(CPUState, fpr[30]) },
+    { "f62", offsetof(CPUState, fpr[31]) },
     { "asi", offsetof(CPUState, asi) },
     { "pstate", offsetof(CPUState, pstate) },
     { "cansave", offsetof(CPUState, cansave) },
diff --git a/target-sparc/cpu.h b/target-sparc/cpu.h
index 4eace33..38a7074 100644
--- a/target-sparc/cpu.h
+++ b/target-sparc/cpu.h
@@ -3,16 +3,17 @@
 
 #include "config.h"
 #include "qemu-common.h"
+#include "bswap.h"
 
 #if !defined(TARGET_SPARC64)
 #define TARGET_LONG_BITS 32
-#define TARGET_FPREGS 32
+#define TARGET_DPREGS 16
 #define TARGET_PAGE_BITS 12 /* 4k */
 #define TARGET_PHYS_ADDR_SPACE_BITS 36
 #define TARGET_VIRT_ADDR_SPACE_BITS 32
 #else
 #define TARGET_LONG_BITS 64
-#define TARGET_FPREGS 64
+#define TARGET_DPREGS 32
 #define TARGET_PAGE_BITS 13 /* 8k */
 #define TARGET_PHYS_ADDR_SPACE_BITS 41
 # ifdef TARGET_ABI32
@@ -395,7 +396,7 @@ typedef struct CPUSPARCState {
 
     uint32_t psr;      /* processor state register */
     target_ulong fsr;      /* FPU state register */
-    float32 fpr[TARGET_FPREGS];  /* floating point registers */
+    CPU_DoubleU fpr[TARGET_DPREGS];  /* floating point registers */
     uint32_t cwp;      /* index of current register window (extracted
                           from PSR) */
 #if !defined(TARGET_SPARC64) || defined(TARGET_ABI32)
diff --git a/target-sparc/cpu_init.c b/target-sparc/cpu_init.c
index 6954800..c7269b5 100644
--- a/target-sparc/cpu_init.c
+++ b/target-sparc/cpu_init.c
@@ -813,11 +813,11 @@ void cpu_dump_state(CPUState *env, FILE *f, fprintf_function cpu_fprintf,
         }
     }
     cpu_fprintf(f, "\nFloating Point Registers:\n");
-    for (i = 0; i < TARGET_FPREGS; i++) {
+    for (i = 0; i < TARGET_DPREGS; i++) {
         if ((i & 3) == 0) {
-            cpu_fprintf(f, "%%f%02d:", i);
+            cpu_fprintf(f, "%%f%02d:", i * 2);
         }
-        cpu_fprintf(f, " %016f", *(float *)&env->fpr[i]);
+        cpu_fprintf(f, " %016" PRIx64, env->fpr[i].ll);
         if ((i & 3) == 3) {
             cpu_fprintf(f, "\n");
         }
diff --git a/target-sparc/ldst_helper.c b/target-sparc/ldst_helper.c
index 80e5408..b59707e 100644
--- a/target-sparc/ldst_helper.c
+++ b/target-sparc/ldst_helper.c
@@ -2045,7 +2045,7 @@ void helper_ldda_asi(target_ulong addr, int asi, int rd)
 void helper_ldf_asi(target_ulong addr, int asi, int size, int rd)
 {
     unsigned int i;
-    CPU_DoubleU u;
+    target_ulong val;
 
     helper_check_align(addr, 3);
     addr = asi_address_mask(env, asi, addr);
@@ -2060,13 +2060,11 @@ void helper_ldf_asi(target_ulong addr, int asi, int size, int rd)
             return;
         }
         helper_check_align(addr, 0x3f);
-        for (i = 0; i < 16; i++) {
-            *(uint32_t *)&env->fpr[rd++] = helper_ld_asi(addr, asi & 0x8f, 4,
-                                                         0);
-            addr += 4;
+        for (i = 0; i < 8; i++, rd += 2, addr += 8) {
+            env->fpr[rd/2].ll = helper_ld_asi(addr, asi & 0x8f, 8, 0);
         }
-
         return;
+
     case 0x16: /* UA2007 Block load primary, user privilege */
     case 0x17: /* UA2007 Block load secondary, user privilege */
     case 0x1e: /* UA2007 Block load primary LE, user privilege */
@@ -2080,13 +2078,11 @@ void helper_ldf_asi(target_ulong addr, int asi, int size, int rd)
             return;
         }
         helper_check_align(addr, 0x3f);
-        for (i = 0; i < 16; i++) {
-            *(uint32_t *)&env->fpr[rd++] = helper_ld_asi(addr, asi & 0x19, 4,
-                                                         0);
-            addr += 4;
+        for (i = 0; i < 8; i++, rd += 2, addr += 8) {
+            env->fpr[rd/2].ll = helper_ld_asi(addr, asi & 0x19, 8, 0);
         }
-
         return;
+
     default:
         break;
     }
@@ -2094,20 +2090,19 @@ void helper_ldf_asi(target_ulong addr, int asi, int size, int rd)
     switch (size) {
     default:
     case 4:
-        *((uint32_t *)&env->fpr[rd]) = helper_ld_asi(addr, asi, size, 0);
+        val = helper_ld_asi(addr, asi, size, 0);
+        if (rd & 1) {
+            env->fpr[rd/2].l.lower = val;
+        } else {
+            env->fpr[rd/2].l.upper = val;
+        }
         break;
     case 8:
-        u.ll = helper_ld_asi(addr, asi, size, 0);
-        *((uint32_t *)&env->fpr[rd++]) = u.l.upper;
-        *((uint32_t *)&env->fpr[rd++]) = u.l.lower;
+        env->fpr[rd/2].ll = helper_ld_asi(addr, asi, size, 0);
         break;
     case 16:
-        u.ll = helper_ld_asi(addr, asi, 8, 0);
-        *((uint32_t *)&env->fpr[rd++]) = u.l.upper;
-        *((uint32_t *)&env->fpr[rd++]) = u.l.lower;
-        u.ll = helper_ld_asi(addr + 8, asi, 8, 0);
-        *((uint32_t *)&env->fpr[rd++]) = u.l.upper;
-        *((uint32_t *)&env->fpr[rd++]) = u.l.lower;
+        env->fpr[rd/2].ll = helper_ld_asi(addr, asi, 8, 0);
+        env->fpr[rd/2 + 1].ll = helper_ld_asi(addr + 8, asi, 8, 0);
         break;
     }
 }
@@ -2115,8 +2110,7 @@ void helper_ldf_asi(target_ulong addr, int asi, int size, int rd)
 void helper_stf_asi(target_ulong addr, int asi, int size, int rd)
 {
     unsigned int i;
-    target_ulong val = 0;
-    CPU_DoubleU u;
+    target_ulong val;
 
     helper_check_align(addr, 3);
     addr = asi_address_mask(env, asi, addr);
@@ -2133,10 +2127,8 @@ void helper_stf_asi(target_ulong addr, int asi, int size, int rd)
             return;
         }
         helper_check_align(addr, 0x3f);
-        for (i = 0; i < 16; i++) {
-            val = *(uint32_t *)&env->fpr[rd++];
-            helper_st_asi(addr, val, asi & 0x8f, 4);
-            addr += 4;
+        for (i = 0; i < 8; i++, rd += 2, addr += 8) {
+            helper_st_asi(addr, env->fpr[rd/2].ll, asi & 0x8f, 8);
         }
 
         return;
@@ -2153,10 +2145,8 @@ void helper_stf_asi(target_ulong addr, int asi, int size, int rd)
             return;
         }
         helper_check_align(addr, 0x3f);
-        for (i = 0; i < 16; i++) {
-            val = *(uint32_t *)&env->fpr[rd++];
-            helper_st_asi(addr, val, asi & 0x19, 4);
-            addr += 4;
+        for (i = 0; i < 8; i++, rd += 2, addr += 8) {
+            helper_st_asi(addr, env->fpr[rd/2].ll, asi & 0x19, 8);
         }
 
         return;
@@ -2167,20 +2157,19 @@ void helper_stf_asi(target_ulong addr, int asi, int size, int rd)
     switch (size) {
     default:
     case 4:
-        helper_st_asi(addr, *(uint32_t *)&env->fpr[rd], asi, size);
+        if (rd & 1) {
+            val = env->fpr[rd/2].l.lower;
+        } else {
+            val = env->fpr[rd/2].l.upper;
+        }
+        helper_st_asi(addr, val, asi, size);
         break;
     case 8:
-        u.l.upper = *(uint32_t *)&env->fpr[rd++];
-        u.l.lower = *(uint32_t *)&env->fpr[rd++];
-        helper_st_asi(addr, u.ll, asi, size);
+        helper_st_asi(addr, env->fpr[rd/2].ll, asi, size);
         break;
     case 16:
-        u.l.upper = *(uint32_t *)&env->fpr[rd++];
-        u.l.lower = *(uint32_t *)&env->fpr[rd++];
-        helper_st_asi(addr, u.ll, asi, 8);
-        u.l.upper = *(uint32_t *)&env->fpr[rd++];
-        u.l.lower = *(uint32_t *)&env->fpr[rd++];
-        helper_st_asi(addr + 8, u.ll, asi, 8);
+        helper_st_asi(addr, env->fpr[rd/2].ll, asi, 8);
+        helper_st_asi(addr + 8, env->fpr[rd/2 + 1].ll, asi, 8);
         break;
     }
 }
diff --git a/target-sparc/machine.c b/target-sparc/machine.c
index 56ae041..235b088 100644
--- a/target-sparc/machine.c
+++ b/target-sparc/machine.c
@@ -21,13 +21,9 @@ void cpu_save(QEMUFile *f, void *opaque)
         qemu_put_betls(f, &env->regbase[i]);
 
     /* FPU */
-    for(i = 0; i < TARGET_FPREGS; i++) {
-        union {
-            float32 f;
-            uint32_t i;
-        } u;
-        u.f = env->fpr[i];
-        qemu_put_be32(f, u.i);
+    for (i = 0; i < TARGET_DPREGS; i++) {
+        qemu_put_be32(f, env->fpr[i].l.upper);
+        qemu_put_be32(f, env->fpr[i].l.lower);
     }
 
     qemu_put_betls(f, &env->pc);
@@ -128,13 +124,9 @@ int cpu_load(QEMUFile *f, void *opaque, int version_id)
         qemu_get_betls(f, &env->regbase[i]);
 
     /* FPU */
-    for(i = 0; i < TARGET_FPREGS; i++) {
-        union {
-            float32 f;
-            uint32_t i;
-        } u;
-        u.i = qemu_get_be32(f);
-        env->fpr[i] = u.f;
+    for (i = 0; i < TARGET_DPREGS; i++) {
+        env->fpr[i].l.upper = qemu_get_be32(f);
+        env->fpr[i].l.lower = qemu_get_be32(f);
     }
 
     qemu_get_betls(f, &env->pc);
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 0b95b64..2c123b1 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -63,7 +63,7 @@ static TCGv cpu_tmp0;
 static TCGv_i32 cpu_tmp32;
 static TCGv_i64 cpu_tmp64;
 /* Floating point registers */
-static TCGv_i32 cpu_fpr[TARGET_FPREGS];
+static TCGv_i64 cpu_fpr[TARGET_DPREGS];
 
 static target_ulong gen_opc_npc[OPC_BUF_SIZE];
 static target_ulong gen_opc_jump_pc[2];
@@ -82,8 +82,8 @@ typedef struct DisasContext {
     uint32_t cc_op;  /* current CC operation */
     struct TranslationBlock *tb;
     sparc_def_t *def;
-    TCGv_i64 t64[3];
-    int n_t64;
+    TCGv_i32 t32[3];
+    int n_t32;
 } DisasContext;
 
 // This function uses non-native bit order
@@ -126,12 +126,44 @@ static inline void gen_update_fprs_dirty(int rd)
 /* floating point registers moves */
 static TCGv_i32 gen_load_fpr_F(DisasContext *dc, unsigned int src)
 {
-    return cpu_fpr[src];
+#if TCG_TARGET_REG_BITS == 32
+    if (src & 1) {
+        return TCGV_LOW(cpu_fpr[src / 2]);
+    } else {
+        return TCGV_HIGH(cpu_fpr[src / 2]);
+    }
+#else
+    if (src & 1) {
+        return MAKE_TCGV_I32(GET_TCGV_I64(cpu_fpr[src / 2]));
+    } else {
+        TCGv_i32 ret = tcg_temp_local_new_i32();
+        TCGv_i64 t = tcg_temp_new_i64();
+
+        tcg_gen_shri_i64(t, cpu_fpr[src / 2], 32);
+        tcg_gen_trunc_i64_i32(ret, t);
+        tcg_temp_free_i64(t);
+
+        dc->t32[dc->n_t32++] = ret;
+        assert(dc->n_t32 <= ARRAY_SIZE(dc->t32));
+
+        return ret;
+    }
+#endif
 }
 
 static void gen_store_fpr_F(DisasContext *dc, unsigned int dst, TCGv_i32 v)
 {
-    tcg_gen_mov_i32(cpu_fpr[dst], v);
+#if TCG_TARGET_REG_BITS == 32
+    if (dst & 1) {
+        tcg_gen_mov_i32(TCGV_LOW(cpu_fpr[dst / 2]), v);
+    } else {
+        tcg_gen_mov_i32(TCGV_HIGH(cpu_fpr[dst / 2]), v);
+    }
+#else
+    TCGv_i64 t = MAKE_TCGV_I64(GET_TCGV_I32(v));
+    tcg_gen_deposit_i64(cpu_fpr[dst / 2], cpu_fpr[dst / 2], t,
+                        (dst & 1 ? 0 : 32), 32);
+#endif
     gen_update_fprs_dirty(dst);
 }
 
@@ -142,42 +174,14 @@ static TCGv_i32 gen_dest_fpr_F(void)
 
 static TCGv_i64 gen_load_fpr_D(DisasContext *dc, unsigned int src)
 {
-    TCGv_i64 ret = tcg_temp_new_i64();
     src = DFPREG(src);
-
-#if TCG_TARGET_REG_BITS == 32
-    tcg_gen_mov_i32(TCGV_HIGH(ret), cpu_fpr[src]);
-    tcg_gen_mov_i32(TCGV_LOW(ret), cpu_fpr[src + 1]);
-#else
-    {
-        TCGv_i64 t = tcg_temp_new_i64();
-        tcg_gen_extu_i32_i64(ret, cpu_fpr[src]);
-        tcg_gen_extu_i32_i64(t, cpu_fpr[src + 1]);
-        tcg_gen_shli_i64(ret, ret, 32);
-        tcg_gen_or_i64(ret, ret, t);
-        tcg_temp_free_i64(t);
-    }
-#endif
-
-    dc->t64[dc->n_t64++] = ret;
-    assert(dc->n_t64 <= ARRAY_SIZE(dc->t64));
-
-    return ret;
+    return cpu_fpr[src / 2];
 }
 
 static void gen_store_fpr_D(DisasContext *dc, unsigned int dst, TCGv_i64 v)
 {
     dst = DFPREG(dst);
-
-#if TCG_TARGET_REG_BITS == 32
-    tcg_gen_mov_i32(cpu__fpu[dst], TCGV_HIGH(v));
-    tcg_gen_mov_i32(cpu__fpu[dst + 1], TCGV_LOW(v));
-#else
-    tcg_gen_trunc_i64_i32(cpu_fpr[dst + 1], v);
-    tcg_gen_shri_i64(v, v, 32);
-    tcg_gen_trunc_i64_i32(cpu_fpr[dst], v);
-#endif
-
+    tcg_gen_mov_i64(cpu_fpr[dst / 2], v);
     gen_update_fprs_dirty(dst);
 }
 
@@ -188,50 +192,36 @@ static TCGv_i64 gen_dest_fpr_D(void)
 
 static void gen_op_load_fpr_QT0(unsigned int src)
 {
-    tcg_gen_st_i32(cpu_fpr[src], cpu_env, offsetof(CPUSPARCState, qt0) +
-                   offsetof(CPU_QuadU, l.upmost));
-    tcg_gen_st_i32(cpu_fpr[src + 1], cpu_env, offsetof(CPUSPARCState, qt0) +
-                   offsetof(CPU_QuadU, l.upper));
-    tcg_gen_st_i32(cpu_fpr[src + 2], cpu_env, offsetof(CPUSPARCState, qt0) +
-                   offsetof(CPU_QuadU, l.lower));
-    tcg_gen_st_i32(cpu_fpr[src + 3], cpu_env, offsetof(CPUSPARCState, qt0) +
-                   offsetof(CPU_QuadU, l.lowest));
+    tcg_gen_st_i64(cpu_fpr[src / 2], cpu_env, offsetof(CPUSPARCState, qt0) +
+                   offsetof(CPU_QuadU, ll.upper));
+    tcg_gen_st_i64(cpu_fpr[src/2 + 1], cpu_env, offsetof(CPUSPARCState, qt0) +
+                   offsetof(CPU_QuadU, ll.lower));
 }
 
 static void gen_op_load_fpr_QT1(unsigned int src)
 {
-    tcg_gen_st_i32(cpu_fpr[src], cpu_env, offsetof(CPUSPARCState, qt1) +
-                   offsetof(CPU_QuadU, l.upmost));
-    tcg_gen_st_i32(cpu_fpr[src + 1], cpu_env, offsetof(CPUSPARCState, qt1) +
-                   offsetof(CPU_QuadU, l.upper));
-    tcg_gen_st_i32(cpu_fpr[src + 2], cpu_env, offsetof(CPUSPARCState, qt1) +
-                   offsetof(CPU_QuadU, l.lower));
-    tcg_gen_st_i32(cpu_fpr[src + 3], cpu_env, offsetof(CPUSPARCState, qt1) +
-                   offsetof(CPU_QuadU, l.lowest));
+    tcg_gen_st_i64(cpu_fpr[src / 2], cpu_env, offsetof(CPUSPARCState, qt1) +
+                   offsetof(CPU_QuadU, ll.upper));
+    tcg_gen_st_i64(cpu_fpr[src/2 + 1], cpu_env, offsetof(CPUSPARCState, qt1) +
+                   offsetof(CPU_QuadU, ll.lower));
 }
 
 static void gen_op_store_QT0_fpr(unsigned int dst)
 {
-    tcg_gen_ld_i32(cpu_fpr[dst], cpu_env, offsetof(CPUSPARCState, qt0) +
-                   offsetof(CPU_QuadU, l.upmost));
-    tcg_gen_ld_i32(cpu_fpr[dst + 1], cpu_env, offsetof(CPUSPARCState, qt0) +
-                   offsetof(CPU_QuadU, l.upper));
-    tcg_gen_ld_i32(cpu_fpr[dst + 2], cpu_env, offsetof(CPUSPARCState, qt0) +
-                   offsetof(CPU_QuadU, l.lower));
-    tcg_gen_ld_i32(cpu_fpr[dst + 3], cpu_env, offsetof(CPUSPARCState, qt0) +
-                   offsetof(CPU_QuadU, l.lowest));
+    tcg_gen_ld_i64(cpu_fpr[dst / 2], cpu_env, offsetof(CPUSPARCState, qt0) +
+                   offsetof(CPU_QuadU, ll.upper));
+    tcg_gen_ld_i64(cpu_fpr[dst/2 + 1], cpu_env, offsetof(CPUSPARCState, qt0) +
+                   offsetof(CPU_QuadU, ll.lower));
 }
 
 #ifdef TARGET_SPARC64
-static void gen_move_Q(int rd, int rs)
+static void gen_move_Q(unsigned int rd, unsigned int rs)
 {
     rd = QFPREG(rd);
     rs = QFPREG(rs);
 
-    tcg_gen_mov_i32(cpu_fpr[rd], cpu_fpr[rs]);
-    tcg_gen_mov_i32(cpu_fpr[rd + 1], cpu_fpr[rs + 1]);
-    tcg_gen_mov_i32(cpu_fpr[rd + 2], cpu_fpr[rs + 2]);
-    tcg_gen_mov_i32(cpu_fpr[rd + 3], cpu_fpr[rs + 3]);
+    tcg_gen_mov_i64(cpu_fpr[rd / 2], cpu_fpr[rs / 2]);
+    tcg_gen_mov_i64(cpu_fpr[rd / 2 + 1], cpu_fpr[rs / 2 + 1]);
     gen_update_fprs_dirty(rd);
 }
 #endif
@@ -5001,6 +4991,13 @@ static void disas_sparc_insn(DisasContext * dc)
  egress:
     tcg_temp_free(cpu_tmp1);
     tcg_temp_free(cpu_tmp2);
+    if (dc->n_t32 != 0) {
+        int i;
+        for (i = dc->n_t32 - 1; i >= 0; --i) {
+            tcg_temp_free_i32(dc->t32[i]);
+        }
+        dc->n_t32 = 0;
+    }
 }
 
 static inline void gen_intermediate_code_internal(TranslationBlock * tb,
@@ -5100,9 +5097,6 @@ static inline void gen_intermediate_code_internal(TranslationBlock * tb,
     tcg_temp_free_i64(cpu_tmp64);
     tcg_temp_free_i32(cpu_tmp32);
     tcg_temp_free(cpu_tmp0);
-    for (j = dc->n_t64 - 1; j >= 0; --j) {
-        tcg_temp_free_i64(dc->t64[j]);
-    }
 
     if (tb->cflags & CF_LAST_IO)
         gen_io_end();
@@ -5168,15 +5162,11 @@ void gen_intermediate_code_init(CPUSPARCState *env)
         "g6",
         "g7",
     };
-    static const char * const fregnames[64] = {
-        "f0", "f1", "f2", "f3", "f4", "f5", "f6", "f7",
-        "f8", "f9", "f10", "f11", "f12", "f13", "f14", "f15",
-        "f16", "f17", "f18", "f19", "f20", "f21", "f22", "f23",
-        "f24", "f25", "f26", "f27", "f28", "f29", "f30", "f31",
-        "f32", "f33", "f34", "f35", "f36", "f37", "f38", "f39",
-        "f40", "f41", "f42", "f43", "f44", "f45", "f46", "f47",
-        "f48", "f49", "f50", "f51", "f52", "f53", "f54", "f55",
-        "f56", "f57", "f58", "f59", "f60", "f61", "f62", "f63",
+    static const char * const fregnames[32] = {
+        "f0", "f2", "f4", "f6", "f8", "f10", "f12", "f14",
+        "f16", "f18", "f20", "f22", "f24", "f26", "f28", "f30",
+        "f32", "f34", "f36", "f38", "f40", "f42", "f44", "f46",
+        "f48", "f50", "f52", "f54", "f56", "f58", "f60", "f62",
     };
 
     /* init various static tables */
@@ -5246,14 +5236,16 @@ void gen_intermediate_code_init(CPUSPARCState *env)
         cpu_tbr = tcg_global_mem_new(TCG_AREG0, offsetof(CPUState, tbr),
                                      "tbr");
 #endif
-        for (i = 1; i < 8; i++)
+        for (i = 1; i < 8; i++) {
             cpu_gregs[i] = tcg_global_mem_new(TCG_AREG0,
                                               offsetof(CPUState, gregs[i]),
                                               gregnames[i]);
-        for (i = 0; i < TARGET_FPREGS; i++)
-            cpu_fpr[i] = tcg_global_mem_new_i32(TCG_AREG0,
+        }
+        for (i = 0; i < TARGET_DPREGS; i++) {
+            cpu_fpr[i] = tcg_global_mem_new_i64(TCG_AREG0,
                                                 offsetof(CPUState, fpr[i]),
                                                 fregnames[i]);
+        }
 
         /* register helpers */
 
-- 
1.7.6.4

* [Qemu-devel] [PATCH 10/16] target-sparc: Do exceptions management fully inside the helpers.
  2011-10-26 21:15 [Qemu-devel] [PATCH v2 00/16] Sparc FPU/VIS improvements Richard Henderson
                   ` (8 preceding siblings ...)
  2011-10-26 21:15 ` [Qemu-devel] [PATCH 09/16] target-sparc: Change fpr representation to doubles Richard Henderson
@ 2011-10-26 21:15 ` Richard Henderson
  2011-10-26 21:15 ` [Qemu-devel] [PATCH 11/16] target-sparc: Implement PDIST Richard Henderson
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2011-10-26 21:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

This reduces the size of the individual translation blocks, since
we now emit a single helper call for each FOP rather than three
(clear exceptions, operate, check exceptions).  In addition,
clear_float_exceptions expands inline to a single byte store.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
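For readers skimming the thread, here is a minimal standalone sketch of
the clear / operate / check sequence that each helper now performs
internally.  It is not QEMU code: ToyEnv, toy_fitos and the TOY_*
constants are hypothetical stand-ins, only the inexact flag is modelled,
and the trap is reduced to a boolean (the real helper also sets
FSR_FTT_IEEE_EXCP and raises TT_FP_EXCP).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* All names below are hypothetical stand-ins, not QEMU identifiers. */
enum { TOY_FLAG_INEXACT = 1 };      /* plays the role of float_flag_inexact */
enum {
    TOY_FSR_NXC = 1 << 0,           /* inexact bit within cexc */
    TOY_FSR_CEXC_MASK = 0x1f,       /* current exceptions, bits 4:0 */
    TOY_FSR_TEM_SHIFT = 23          /* trap enable mask starts at bit 23 */
};

typedef struct {
    uint8_t flags;                  /* per-operation exception flags */
    uint32_t fsr;                   /* toy floating-point state register */
    bool trapped;                   /* set where the patch would raise a trap */
} ToyEnv;

static void toy_clear(ToyEnv *env)
{
    env->flags = 0;                 /* the "single byte store" */
}

static void toy_check(ToyEnv *env)
{
    uint32_t cexc, tem;

    if (!env->flags) {
        return;
    }
    if (env->flags & TOY_FLAG_INEXACT) {
        env->fsr |= TOY_FSR_NXC;
    }
    cexc = env->fsr & TOY_FSR_CEXC_MASK;
    tem = (env->fsr >> TOY_FSR_TEM_SHIFT) & TOY_FSR_CEXC_MASK;
    if (cexc & tem) {
        env->trapped = true;        /* unmasked exception */
    } else {
        env->fsr |= cexc << 5;      /* accumulate into aexc */
    }
}

/* One call does all three steps, as the reworked helpers do. */
float toy_fitos(ToyEnv *env, int32_t src)
{
    volatile float ret;             /* volatile: force rounding to float */

    toy_clear(env);
    ret = (float)src;
    if ((int64_t)ret != src) {
        env->flags |= TOY_FLAG_INEXACT;
    }
    toy_check(env);
    return ret;
}

void toy_selfcheck(void)
{
    ToyEnv env = { 0, 0, false };

    toy_fitos(&env, 16777217);      /* 2^24 + 1 has no exact float */
    assert(env.fsr == 0x21);        /* NXC in cexc and in aexc */
}
```

The point of the sketch is only the call shape: the translator emits one
call, and the flag bookkeeping lives next to the arithmetic.
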
 target-sparc/fop_helper.c |  206 ++++++++++++++++++++++++++++++++-------------
 target-sparc/helper.h     |    2 -
 target-sparc/translate.c  |   29 -------
 3 files changed, 146 insertions(+), 91 deletions(-)

diff --git a/target-sparc/fop_helper.c b/target-sparc/fop_helper.c
index e652021..c7a2512 100644
--- a/target-sparc/fop_helper.c
+++ b/target-sparc/fop_helper.c
@@ -23,22 +23,71 @@
 #define QT0 (env->qt0)
 #define QT1 (env->qt1)
 
+static void check_ieee_exceptions(CPUState *env)
+{
+    target_ulong status;
+
+    status = get_float_exception_flags(&env->fp_status);
+    if (status) {
+        /* Copy IEEE 754 flags into FSR */
+        if (status & float_flag_invalid) {
+            env->fsr |= FSR_NVC;
+        }
+        if (status & float_flag_overflow) {
+            env->fsr |= FSR_OFC;
+        }
+        if (status & float_flag_underflow) {
+            env->fsr |= FSR_UFC;
+        }
+        if (status & float_flag_divbyzero) {
+            env->fsr |= FSR_DZC;
+        }
+        if (status & float_flag_inexact) {
+            env->fsr |= FSR_NXC;
+        }
+
+        if ((env->fsr & FSR_CEXC_MASK) & ((env->fsr & FSR_TEM_MASK) >> 23)) {
+            /* Unmasked exception, generate a trap */
+            env->fsr |= FSR_FTT_IEEE_EXCP;
+            helper_raise_exception(env, TT_FP_EXCP);
+        } else {
+            /* Accumulate exceptions */
+            env->fsr |= (env->fsr & FSR_CEXC_MASK) << 5;
+        }
+    }
+}
+
+static inline void clear_float_exceptions(CPUState *env)
+{
+    set_float_exception_flags(0, &env->fp_status);
+}
+
 #define F_HELPER(name, p) void helper_f##name##p(CPUState *env)
 
 #define F_BINOP(name)                                           \
-    float32 helper_f ## name ## s (CPUState * env, float32 src1,\
+    float32 helper_f ## name ## s (CPUState *env, float32 src1, \
                                    float32 src2)                \
     {                                                           \
-        return float32_ ## name (src1, src2, &env->fp_status);  \
+        float32 ret;                                            \
+        clear_float_exceptions(env);                            \
+        ret = float32_ ## name (src1, src2, &env->fp_status);   \
+        check_ieee_exceptions(env);                             \
+        return ret;                                             \
     }                                                           \
     float64 helper_f ## name ## d (CPUState * env, float64 src1,\
                                    float64 src2)                \
     {                                                           \
-        return float64_ ## name (src1, src2, &env->fp_status);  \
+        float64 ret;                                            \
+        clear_float_exceptions(env);                            \
+        ret = float64_ ## name (src1, src2, &env->fp_status);   \
+        check_ieee_exceptions(env);                             \
+        return ret;                                             \
     }                                                           \
     F_HELPER(name, q)                                           \
     {                                                           \
+        clear_float_exceptions(env);                            \
         QT0 = float128_ ## name (QT0, QT1, &env->fp_status);    \
+        check_ieee_exceptions(env);                             \
     }
 
 F_BINOP(add);
@@ -49,16 +98,22 @@ F_BINOP(div);
 
 float64 helper_fsmuld(CPUState *env, float32 src1, float32 src2)
 {
-    return float64_mul(float32_to_float64(src1, &env->fp_status),
-                       float32_to_float64(src2, &env->fp_status),
-                       &env->fp_status);
+    float64 ret;
+    clear_float_exceptions(env);
+    ret = float64_mul(float32_to_float64(src1, &env->fp_status),
+                      float32_to_float64(src2, &env->fp_status),
+                      &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 void helper_fdmulq(CPUState *env, float64 src1, float64 src2)
 {
+    clear_float_exceptions(env);
     QT0 = float128_mul(float64_to_float128(src1, &env->fp_status),
                        float64_to_float128(src2, &env->fp_status),
                        &env->fp_status);
+    check_ieee_exceptions(env);
 }
 
 float32 helper_fnegs(float32 src)
@@ -81,32 +136,48 @@ F_HELPER(neg, q)
 /* Integer to float conversion.  */
 float32 helper_fitos(CPUState *env, int32_t src)
 {
-    return int32_to_float32(src, &env->fp_status);
+    /* Inexact error possible converting int to float.  */
+    float32 ret;
+    clear_float_exceptions(env);
+    ret = int32_to_float32(src, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 float64 helper_fitod(CPUState *env, int32_t src)
 {
+    /* No possible exceptions converting int to double.  */
     return int32_to_float64(src, &env->fp_status);
 }
 
 void helper_fitoq(CPUState *env, int32_t src)
 {
+    /* No possible exceptions converting int to long double.  */
     QT0 = int32_to_float128(src, &env->fp_status);
 }
 
 #ifdef TARGET_SPARC64
 float32 helper_fxtos(CPUState *env, int64_t src)
 {
-    return int64_to_float32(src, &env->fp_status);
+    float32 ret;
+    clear_float_exceptions(env);
+    ret = int64_to_float32(src, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 float64 helper_fxtod(CPUState *env, int64_t src)
 {
-    return int64_to_float64(src, &env->fp_status);
+    float64 ret;
+    clear_float_exceptions(env);
+    ret = int64_to_float64(src, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 void helper_fxtoq(CPUState *env, int64_t src)
 {
+    /* No possible exceptions converting long long to long double.  */
     QT0 = int64_to_float128(src, &env->fp_status);
 }
 #endif
@@ -115,64 +186,108 @@ void helper_fxtoq(CPUState *env, int64_t src)
 /* floating point conversion */
 float32 helper_fdtos(CPUState *env, float64 src)
 {
-    return float64_to_float32(src, &env->fp_status);
+    float32 ret;
+    clear_float_exceptions(env);
+    ret = float64_to_float32(src, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 float64 helper_fstod(CPUState *env, float32 src)
 {
-    return float32_to_float64(src, &env->fp_status);
+    float64 ret;
+    clear_float_exceptions(env);
+    ret = float32_to_float64(src, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 float32 helper_fqtos(CPUState *env)
 {
-    return float128_to_float32(QT1, &env->fp_status);
+    float32 ret;
+    clear_float_exceptions(env);
+    ret = float128_to_float32(QT1, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 void helper_fstoq(CPUState *env, float32 src)
 {
+    clear_float_exceptions(env);
     QT0 = float32_to_float128(src, &env->fp_status);
+    check_ieee_exceptions(env);
 }
 
 float64 helper_fqtod(CPUState *env)
 {
-    return float128_to_float64(QT1, &env->fp_status);
+    float64 ret;
+    clear_float_exceptions(env);
+    ret = float128_to_float64(QT1, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 void helper_fdtoq(CPUState *env, float64 src)
 {
+    clear_float_exceptions(env);
     QT0 = float64_to_float128(src, &env->fp_status);
+    check_ieee_exceptions(env);
 }
 
 /* Float to integer conversion.  */
 int32_t helper_fstoi(CPUState *env, float32 src)
 {
-    return float32_to_int32_round_to_zero(src, &env->fp_status);
+    int32_t ret;
+    clear_float_exceptions(env);
+    ret = float32_to_int32_round_to_zero(src, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 int32_t helper_fdtoi(CPUState *env, float64 src)
 {
-    return float64_to_int32_round_to_zero(src, &env->fp_status);
+    int32_t ret;
+    clear_float_exceptions(env);
+    ret = float64_to_int32_round_to_zero(src, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 int32_t helper_fqtoi(CPUState *env)
 {
-    return float128_to_int32_round_to_zero(QT1, &env->fp_status);
+    int32_t ret;
+    clear_float_exceptions(env);
+    ret = float128_to_int32_round_to_zero(QT1, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 #ifdef TARGET_SPARC64
 int64_t helper_fstox(CPUState *env, float32 src)
 {
-    return float32_to_int64_round_to_zero(src, &env->fp_status);
+    int64_t ret;
+    clear_float_exceptions(env);
+    ret = float32_to_int64_round_to_zero(src, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 int64_t helper_fdtox(CPUState *env, float64 src)
 {
-    return float64_to_int64_round_to_zero(src, &env->fp_status);
+    int64_t ret;
+    clear_float_exceptions(env);
+    ret = float64_to_int64_round_to_zero(src, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 int64_t helper_fqtox(CPUState *env)
 {
-    return float128_to_int64_round_to_zero(QT1, &env->fp_status);
+    int64_t ret;
+    clear_float_exceptions(env);
+    ret = float128_to_int64_round_to_zero(QT1, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 #endif
 
@@ -195,17 +310,27 @@ void helper_fabsq(CPUState *env)
 
 float32 helper_fsqrts(CPUState *env, float32 src)
 {
-    return float32_sqrt(src, &env->fp_status);
+    float32 ret;
+    clear_float_exceptions(env);
+    ret = float32_sqrt(src, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 float64 helper_fsqrtd(CPUState *env, float64 src)
 {
-    return float64_sqrt(src, &env->fp_status);
+    float64 ret;
+    clear_float_exceptions(env);
+    ret = float64_sqrt(src, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 void helper_fsqrtq(CPUState *env)
 {
+    clear_float_exceptions(env);
     QT0 = float128_sqrt(QT1, &env->fp_status);
+    check_ieee_exceptions(env);
 }
 
 #define GEN_FCMP(name, size, reg1, reg2, FS, E)                         \
@@ -318,45 +443,6 @@ GEN_FCMP(fcmpeq_fcc3, float128, QT0, QT1, 26, 1);
 #undef GEN_FCMP_T
 #undef GEN_FCMP
 
-void helper_check_ieee_exceptions(CPUState *env)
-{
-    target_ulong status;
-
-    status = get_float_exception_flags(&env->fp_status);
-    if (status) {
-        /* Copy IEEE 754 flags into FSR */
-        if (status & float_flag_invalid) {
-            env->fsr |= FSR_NVC;
-        }
-        if (status & float_flag_overflow) {
-            env->fsr |= FSR_OFC;
-        }
-        if (status & float_flag_underflow) {
-            env->fsr |= FSR_UFC;
-        }
-        if (status & float_flag_divbyzero) {
-            env->fsr |= FSR_DZC;
-        }
-        if (status & float_flag_inexact) {
-            env->fsr |= FSR_NXC;
-        }
-
-        if ((env->fsr & FSR_CEXC_MASK) & ((env->fsr & FSR_TEM_MASK) >> 23)) {
-            /* Unmasked exception, generate a trap */
-            env->fsr |= FSR_FTT_IEEE_EXCP;
-            helper_raise_exception(env, TT_FP_EXCP);
-        } else {
-            /* Accumulate exceptions */
-            env->fsr |= (env->fsr & FSR_CEXC_MASK) << 5;
-        }
-    }
-}
-
-void helper_clear_float_exceptions(CPUState *env)
-{
-    set_float_exception_flags(0, &env->fp_status);
-}
-
 static inline void set_fsr(CPUState *env)
 {
     int rnd_mode;
diff --git a/target-sparc/helper.h b/target-sparc/helper.h
index ba0ad81..22fb8ef 100644
--- a/target-sparc/helper.h
+++ b/target-sparc/helper.h
@@ -46,8 +46,6 @@ DEF_HELPER_4(ld_asi, i64, tl, int, int, int)
 DEF_HELPER_4(st_asi, void, tl, i64, int, int)
 #endif
 DEF_HELPER_2(ldfsr, void, env, i32)
-DEF_HELPER_1(check_ieee_exceptions, void, env)
-DEF_HELPER_1(clear_float_exceptions, void, env)
 DEF_HELPER_FLAGS_1(fabss, TCG_CALL_CONST | TCG_CALL_PURE, f32, f32)
 DEF_HELPER_2(fsqrts, f32, env, f32)
 DEF_HELPER_2(fsqrtd, f64, env, f64)
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 2c123b1..0b9ace3 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -1626,23 +1626,16 @@ static inline void gen_op_clear_ieee_excp_and_FTT(void)
     tcg_gen_andi_tl(cpu_fsr, cpu_fsr, FSR_FTT_CEXC_NMASK);
 }
 
-static inline void gen_clear_float_exceptions(void)
-{
-    gen_helper_clear_float_exceptions(cpu_env);
-}
-
 static inline void gen_fop_FF(DisasContext *dc, int rd, int rs,
                               void (*gen)(TCGv_i32, TCGv_ptr, TCGv_i32))
 {
     TCGv_i32 dst, src;
 
-    gen_clear_float_exceptions();
     src = gen_load_fpr_F(dc, rs);
     dst = gen_dest_fpr_F();
 
     gen(dst, cpu_env, src);
 
-    gen_helper_check_ieee_exceptions(cpu_env);
     gen_store_fpr_F(dc, rd, dst);
 }
 
@@ -1664,14 +1657,12 @@ static inline void gen_fop_FFF(DisasContext *dc, int rd, int rs1, int rs2,
 {
     TCGv_i32 dst, src1, src2;
 
-    gen_clear_float_exceptions();
     src1 = gen_load_fpr_F(dc, rs1);
     src2 = gen_load_fpr_F(dc, rs2);
     dst = gen_dest_fpr_F();
 
     gen(dst, cpu_env, src1, src2);
 
-    gen_helper_check_ieee_exceptions(cpu_env);
     gen_store_fpr_F(dc, rd, dst);
 }
 
@@ -1696,13 +1687,11 @@ static inline void gen_fop_DD(DisasContext *dc, int rd, int rs,
 {
     TCGv_i64 dst, src;
 
-    gen_clear_float_exceptions();
     src = gen_load_fpr_D(dc, rs);
     dst = gen_dest_fpr_D();
 
     gen(dst, cpu_env, src);
 
-    gen_helper_check_ieee_exceptions(cpu_env);
     gen_store_fpr_D(dc, rd, dst);
 }
 
@@ -1726,14 +1715,12 @@ static inline void gen_fop_DDD(DisasContext *dc, int rd, int rs1, int rs2,
 {
     TCGv_i64 dst, src1, src2;
 
-    gen_clear_float_exceptions();
     src1 = gen_load_fpr_D(dc, rs1);
     src2 = gen_load_fpr_D(dc, rs2);
     dst = gen_dest_fpr_D();
 
     gen(dst, cpu_env, src1, src2);
 
-    gen_helper_check_ieee_exceptions(cpu_env);
     gen_store_fpr_D(dc, rd, dst);
 }
 
@@ -1756,12 +1743,10 @@ static inline void gen_ne_fop_DDD(DisasContext *dc, int rd, int rs1, int rs2,
 static inline void gen_fop_QQ(DisasContext *dc, int rd, int rs,
                               void (*gen)(TCGv_ptr))
 {
-    gen_clear_float_exceptions();
     gen_op_load_fpr_QT1(QFPREG(rs));
 
     gen(cpu_env);
 
-    gen_helper_check_ieee_exceptions(cpu_env);
     gen_op_store_QT0_fpr(QFPREG(rd));
     gen_update_fprs_dirty(QFPREG(rd));
 }
@@ -1782,13 +1767,11 @@ static inline void gen_ne_fop_QQ(DisasContext *dc, int rd, int rs,
 static inline void gen_fop_QQQ(DisasContext *dc, int rd, int rs1, int rs2,
                                void (*gen)(TCGv_ptr))
 {
-    gen_clear_float_exceptions();
     gen_op_load_fpr_QT0(QFPREG(rs1));
     gen_op_load_fpr_QT1(QFPREG(rs2));
 
     gen(cpu_env);
 
-    gen_helper_check_ieee_exceptions(cpu_env);
     gen_op_store_QT0_fpr(QFPREG(rd));
     gen_update_fprs_dirty(QFPREG(rd));
 }
@@ -1799,14 +1782,12 @@ static inline void gen_fop_DFF(DisasContext *dc, int rd, int rs1, int rs2,
     TCGv_i64 dst;
     TCGv_i32 src1, src2;
 
-    gen_clear_float_exceptions();
     src1 = gen_load_fpr_F(dc, rs1);
     src2 = gen_load_fpr_F(dc, rs2);
     dst = gen_dest_fpr_D();
 
     gen(dst, cpu_env, src1, src2);
 
-    gen_helper_check_ieee_exceptions(cpu_env);
     gen_store_fpr_D(dc, rd, dst);
 }
 
@@ -1815,13 +1796,11 @@ static inline void gen_fop_QDD(DisasContext *dc, int rd, int rs1, int rs2,
 {
     TCGv_i64 src1, src2;
 
-    gen_clear_float_exceptions();
     src1 = gen_load_fpr_D(dc, rs1);
     src2 = gen_load_fpr_D(dc, rs2);
 
     gen(cpu_env, src1, src2);
 
-    gen_helper_check_ieee_exceptions(cpu_env);
     gen_op_store_QT0_fpr(QFPREG(rd));
     gen_update_fprs_dirty(QFPREG(rd));
 }
@@ -1833,13 +1812,11 @@ static inline void gen_fop_DF(DisasContext *dc, int rd, int rs,
     TCGv_i64 dst;
     TCGv_i32 src;
 
-    gen_clear_float_exceptions();
     src = gen_load_fpr_F(dc, rs);
     dst = gen_dest_fpr_D();
 
     gen(dst, cpu_env, src);
 
-    gen_helper_check_ieee_exceptions(cpu_env);
     gen_store_fpr_D(dc, rd, dst);
 }
 #endif
@@ -1864,13 +1841,11 @@ static inline void gen_fop_FD(DisasContext *dc, int rd, int rs,
     TCGv_i32 dst;
     TCGv_i64 src;
 
-    gen_clear_float_exceptions();
     src = gen_load_fpr_D(dc, rs);
     dst = gen_dest_fpr_F();
 
     gen(dst, cpu_env, src);
 
-    gen_helper_check_ieee_exceptions(cpu_env);
     gen_store_fpr_F(dc, rd, dst);
 }
 
@@ -1879,13 +1854,11 @@ static inline void gen_fop_FQ(DisasContext *dc, int rd, int rs,
 {
     TCGv_i32 dst;
 
-    gen_clear_float_exceptions();
     gen_op_load_fpr_QT1(QFPREG(rs));
     dst = gen_dest_fpr_F();
 
     gen(dst, cpu_env);
 
-    gen_helper_check_ieee_exceptions(cpu_env);
     gen_store_fpr_F(dc, rd, dst);
 }
 
@@ -1894,13 +1867,11 @@ static inline void gen_fop_DQ(DisasContext *dc, int rd, int rs,
 {
     TCGv_i64 dst;
 
-    gen_clear_float_exceptions();
     gen_op_load_fpr_QT1(QFPREG(rs));
     dst = gen_dest_fpr_D();
 
     gen(dst, cpu_env);
 
-    gen_helper_check_ieee_exceptions(cpu_env);
     gen_store_fpr_D(dc, rd, dst);
 }
 
-- 
1.7.6.4

* [Qemu-devel] [PATCH 11/16] target-sparc: Implement PDIST.
  2011-10-26 21:15 [Qemu-devel] [PATCH v2 00/16] Sparc FPU/VIS improvements Richard Henderson
                   ` (9 preceding siblings ...)
  2011-10-26 21:15 ` [Qemu-devel] [PATCH 10/16] target-sparc: Do exceptions management fully inside the helpers Richard Henderson
@ 2011-10-26 21:15 ` Richard Henderson
  2011-10-26 21:15 ` [Qemu-devel] [PATCH 12/16] target-sparc: Implement fpack{16, 32, fix} Richard Henderson
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2011-10-26 21:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
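A standalone sketch of the operation helper_pdist computes, for anyone
who wants to check values by hand: the eight absolute byte differences
of the two sources are added to the incoming accumulator.  The names
pdist_sketch and pdist_selfcheck are illustrative only; the byte order
of the loop differs from the helper, which does not matter because the
sum is order-independent.

```c
#include <assert.h>
#include <stdint.h>

/* Sum of absolute differences over eight unsigned bytes, plus 'sum'. */
uint64_t pdist_sketch(uint64_t sum, uint64_t src1, uint64_t src2)
{
    int i;

    for (i = 0; i < 8; i++) {
        int a = (src1 >> (8 * i)) & 0xff;
        int b = (src2 >> (8 * i)) & 0xff;

        sum += (a > b) ? a - b : b - a;
    }
    return sum;
}

void pdist_selfcheck(void)
{
    /* Byte differences are 7, 5, 3, 1, 1, 3, 5, 7. */
    assert(pdist_sketch(0, 0x0102030405060708ULL,
                        0x0807060504030201ULL) == 32);
}
```

Note that the accumulator is the old value of rd, which is why the
translator side needs a four-operand generator.
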
 target-sparc/helper.h     |    1 +
 target-sparc/translate.c  |   21 +++++++++++++++++++--
 target-sparc/vis_helper.c |   21 +++++++++++++++++++++
 3 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/target-sparc/helper.h b/target-sparc/helper.h
index 22fb8ef..22f9dce 100644
--- a/target-sparc/helper.h
+++ b/target-sparc/helper.h
@@ -137,6 +137,7 @@ DEF_HELPER_FLAGS_2(fmul8ulx16, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fmuld8sux16, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fmuld8ulx16, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fexpand, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
+DEF_HELPER_FLAGS_3(pdist, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64, i64)
 #define VIS_HELPER(name)                                                 \
     DEF_HELPER_FLAGS_2(f ## name ## 16, TCG_CALL_CONST | TCG_CALL_PURE,  \
                        i64, i64, i64)                                    \
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 0b9ace3..2646aaf 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -1738,6 +1738,21 @@ static inline void gen_ne_fop_DDD(DisasContext *dc, int rd, int rs1, int rs2,
 
     gen_store_fpr_D(dc, rd, dst);
 }
+
+static inline void gen_ne_fop_DDDD(DisasContext *dc, int rd, int rs1, int rs2,
+                           void (*gen)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64))
+{
+    TCGv_i64 dst, src0, src1, src2;
+
+    src1 = gen_load_fpr_D(dc, rs1);
+    src2 = gen_load_fpr_D(dc, rs2);
+    src0 = gen_load_fpr_D(dc, rd);
+    dst = gen_dest_fpr_D();
+
+    gen(dst, src0, src1, src2);
+
+    gen_store_fpr_D(dc, rd, dst);
+}
 #endif
 
 static inline void gen_fop_QQ(DisasContext *dc, int rd, int rs,
@@ -4059,9 +4074,11 @@ static void disas_sparc_insn(DisasContext * dc)
                 case 0x03a: /* VIS I fpack32 */
                 case 0x03b: /* VIS I fpack16 */
                 case 0x03d: /* VIS I fpackfix */
-                case 0x03e: /* VIS I pdist */
-                    // XXX
                     goto illegal_insn;
+                case 0x03e: /* VIS I pdist */
+                    CHECK_FPU_FEATURE(dc, VIS1);
+                    gen_ne_fop_DDDD(dc, rd, rs1, rs2, gen_helper_pdist);
+                    break;
                 case 0x048: /* VIS I faligndata */
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
diff --git a/target-sparc/vis_helper.c b/target-sparc/vis_helper.c
index 39c8d9a..cd5d4a7 100644
--- a/target-sparc/vis_helper.c
+++ b/target-sparc/vis_helper.c
@@ -396,3 +396,24 @@ VIS_CMPHELPER(helper_fcmpgt, FCMPGT)
 VIS_CMPHELPER(helper_fcmpeq, FCMPEQ)
 VIS_CMPHELPER(helper_fcmple, FCMPLE)
 VIS_CMPHELPER(helper_fcmpne, FCMPNE)
+
+uint64_t helper_pdist(uint64_t sum, uint64_t src1, uint64_t src2)
+{
+    int i;
+    for (i = 0; i < 8; i++) {
+        int s1, s2;
+
+        s1 = (src1 >> (56 - (i * 8))) & 0xff;
+        s2 = (src2 >> (56 - (i * 8))) & 0xff;
+
+        /* Absolute value of difference. */
+        s1 -= s2;
+        if (s1 < 0) {
+            s1 = -s1;
+        }
+
+        sum += s1;
+    }
+
+    return sum;
+}
-- 
1.7.6.4

* [Qemu-devel] [PATCH 12/16] target-sparc: Implement fpack{16, 32, fix}.
  2011-10-26 21:15 [Qemu-devel] [PATCH v2 00/16] Sparc FPU/VIS improvements Richard Henderson
                   ` (10 preceding siblings ...)
  2011-10-26 21:15 ` [Qemu-devel] [PATCH 11/16] target-sparc: Implement PDIST Richard Henderson
@ 2011-10-26 21:15 ` Richard Henderson
  2011-10-26 21:15 ` [Qemu-devel] [PATCH 13/16] target-sparc: Implement EDGE* instructions Richard Henderson
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2011-10-26 21:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
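To make the fixed-point handling in helper_fpack16 easier to verify, a
standalone sketch follows.  It is meant to compute the same values as
the helper but is written with a multiply instead of a left shift of a
possibly negative value; fpack16_sketch and fpack16_selfcheck are
illustrative names, and the scale field position (GSR bits 6:3) is taken
from the helper in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Pack four signed 16-bit lanes into four saturated unsigned bytes. */
uint32_t fpack16_sketch(uint64_t gsr, uint64_t rs2)
{
    int scale = (gsr >> 3) & 0xf;
    uint32_t ret = 0;
    int lane;

    for (lane = 0; lane < 4; lane++) {
        int16_t src = (int16_t)(rs2 >> (16 * lane));
        int32_t scaled = (int32_t)src * (1 << scale);
        /* Negative inputs clamp to 0, so only shift non-negative values. */
        int32_t val = scaled < 0 ? 0 : scaled >> 7;

        if (val > 255) {
            val = 255;
        }
        ret |= (uint32_t)val << (8 * lane);
    }
    return ret;
}

void fpack16_selfcheck(void)
{
    /* scale = 7 cancels the >> 7, so lanes are clamped directly:
       0x8000 -> 0, 0x0100 -> 255, 0x00ff -> 255, 0x0001 -> 1. */
    assert(fpack16_sketch(7 << 3, 0x8000010000ff0001ULL) == 0x00ffff01u);
}
```

With scale = 0 the same function maps 0x0080 to 1 and 0x7fff to 255,
which is a quick way to sanity-check the >> 7.
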
 target-sparc/helper.h     |    3 ++
 target-sparc/translate.c  |   30 ++++++++++++++++++++-
 target-sparc/vis_helper.c |   64 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 96 insertions(+), 1 deletions(-)

diff --git a/target-sparc/helper.h b/target-sparc/helper.h
index 22f9dce..07c39a9 100644
--- a/target-sparc/helper.h
+++ b/target-sparc/helper.h
@@ -138,6 +138,9 @@ DEF_HELPER_FLAGS_2(fmuld8sux16, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fmuld8ulx16, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fexpand, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
 DEF_HELPER_FLAGS_3(pdist, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fpack16, TCG_CALL_CONST | TCG_CALL_PURE, i32, i64, i64)
+DEF_HELPER_FLAGS_3(fpack32, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fpackfix, TCG_CALL_CONST | TCG_CALL_PURE, i32, i64, i64)
 #define VIS_HELPER(name)                                                 \
     DEF_HELPER_FLAGS_2(f ## name ## 16, TCG_CALL_CONST | TCG_CALL_PURE,  \
                        i64, i64, i64)                                    \
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 2646aaf..102c83a 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -1739,6 +1739,20 @@ static inline void gen_ne_fop_DDD(DisasContext *dc, int rd, int rs1, int rs2,
     gen_store_fpr_D(dc, rd, dst);
 }
 
+static inline void gen_gsr_fop_DDD(DisasContext *dc, int rd, int rs1, int rs2,
+                           void (*gen)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64))
+{
+    TCGv_i64 dst, src1, src2;
+
+    src1 = gen_load_fpr_D(dc, rs1);
+    src2 = gen_load_fpr_D(dc, rs2);
+    dst = gen_dest_fpr_D();
+
+    gen(dst, cpu_gsr, src1, src2);
+
+    gen_store_fpr_D(dc, rd, dst);
+}
+
 static inline void gen_ne_fop_DDDD(DisasContext *dc, int rd, int rs1, int rs2,
                            void (*gen)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64))
 {
@@ -4072,9 +4086,23 @@ static void disas_sparc_insn(DisasContext * dc)
                     gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmuld8ulx16);
                     break;
                 case 0x03a: /* VIS I fpack32 */
+                    CHECK_FPU_FEATURE(dc, VIS1);
+                    gen_gsr_fop_DDD(dc, rd, rs1, rs2, gen_helper_fpack32);
+                    break;
                 case 0x03b: /* VIS I fpack16 */
+                    CHECK_FPU_FEATURE(dc, VIS1);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fpack16(cpu_dst_32, cpu_gsr, cpu_src1_64);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    break;
                 case 0x03d: /* VIS I fpackfix */
-                    goto illegal_insn;
+                    CHECK_FPU_FEATURE(dc, VIS1);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fpackfix(cpu_dst_32, cpu_gsr, cpu_src1_64);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    break;
                 case 0x03e: /* VIS I pdist */
                     CHECK_FPU_FEATURE(dc, VIS1);
                     gen_ne_fop_DDDD(dc, rd, rs1, rs2, gen_helper_pdist);
diff --git a/target-sparc/vis_helper.c b/target-sparc/vis_helper.c
index cd5d4a7..59ca8d7 100644
--- a/target-sparc/vis_helper.c
+++ b/target-sparc/vis_helper.c
@@ -417,3 +417,67 @@ uint64_t helper_pdist(uint64_t sum, uint64_t src1, uint64_t src2)
 
     return sum;
 }
+
+uint32_t helper_fpack16(uint64_t gsr, uint64_t rs2)
+{
+    int scale = (gsr >> 3) & 0xf;
+    uint32_t ret = 0;
+    int byte;
+
+    for (byte = 0; byte < 4; byte++) {
+        uint32_t val;
+        int16_t src = rs2 >> (byte * 16);
+        int32_t scaled = src << scale;
+        int32_t from_fixed = scaled >> 7;
+
+        val = (from_fixed < 0 ?  0 :
+               from_fixed > 255 ?  255 : from_fixed);
+
+        ret |= val << (8 * byte);
+    }
+
+    return ret;
+}
+
+uint64_t helper_fpack32(uint64_t gsr, uint64_t rs1, uint64_t rs2)
+{
+    int scale = (gsr >> 3) & 0x1f;
+    uint64_t ret = 0;
+    int word;
+
+    ret = (rs1 << 8) & ~(0x000000ff000000ffULL);
+    for (word = 0; word < 2; word++) {
+        uint64_t val;
+        int32_t src = rs2 >> (word * 32);
+        int64_t scaled = (int64_t)src << scale;
+        int64_t from_fixed = scaled >> 23;
+
+        val = (from_fixed < 0 ? 0 :
+               (from_fixed > 255) ? 255 : from_fixed);
+
+        ret |= val << (32 * word);
+    }
+
+    return ret;
+}
+
+uint32_t helper_fpackfix(uint64_t gsr, uint64_t rs2)
+{
+    int scale = (gsr >> 3) & 0x1f;
+    uint32_t ret = 0;
+    int word;
+
+    for (word = 0; word < 2; word++) {
+        uint32_t val;
+        int32_t src = rs2 >> (word * 32);
+        int64_t scaled = (int64_t)src << scale;
+        int64_t from_fixed = scaled >> 16;
+
+        val = (from_fixed < -32768 ? -32768 :
+               from_fixed > 32767 ?  32767 : from_fixed);
+
+        ret |= (val & 0xffff) << (word * 16);
+    }
+
+    return ret;
+}
-- 
1.7.6.4

* [Qemu-devel] [PATCH 13/16] target-sparc: Implement EDGE* instructions.
  2011-10-26 21:15 [Qemu-devel] [PATCH v2 00/16] Sparc FPU/VIS improvements Richard Henderson
                   ` (11 preceding siblings ...)
  2011-10-26 21:15 ` [Qemu-devel] [PATCH 12/16] target-sparc: Implement fpack{16, 32, fix} Richard Henderson
@ 2011-10-26 21:15 ` Richard Henderson
  2011-10-26 21:15 ` [Qemu-devel] [PATCH 14/16] target-sparc: Implement ALIGNADDR* inline Richard Henderson
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2011-10-26 21:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
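The table lookup that gen_edge emits as TCG ops can be tried out in
plain C.  The sketch below covers only the lookup step from the "theory
of operation" comment (index = (input & imask) << shift, then
val = (table >> index) & omask), not the final combination of the two
masks.  edge_lookup and edge_lookup_selfcheck are hypothetical names;
the table constants are the ones used in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Select one packed table entry using the low bits of 'input'. */
uint64_t edge_lookup(uint64_t table, uint64_t input,
                     unsigned imask, unsigned shift, uint64_t omask)
{
    unsigned index = (unsigned)(input & imask) << shift;

    return (table >> index) & omask;
}

void edge_lookup_selfcheck(void)
{
    /* Width 8: eight 8-bit entries, selected by address bits 2:0. */
    assert(edge_lookup(0x0103070f1f3f7fffULL, 0x1001, 0x7, 3, 0xff) == 0x7f);
    /* Width 16: four 4-bit entries, selected by address bits 2:1. */
    assert(edge_lookup(0x137f, 0x1006, 0x6, 1, 0xf) == 0x1);
}
```

Packing each table into one constant is what lets the translator do the
lookup with a shift and a mask instead of a memory load.
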
 target-sparc/translate.c |  177 +++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 175 insertions(+), 2 deletions(-)

diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 102c83a..d02cf06 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -2221,6 +2221,109 @@ static inline void gen_load_trap_state_at_tl(TCGv_ptr r_tsptr, TCGv_ptr cpu_env)
 
     tcg_temp_free_i32(r_tl);
 }
+
+static void gen_edge(DisasContext *dc, TCGv dst, TCGv s1, TCGv s2,
+                     int width, bool cc, bool left)
+{
+    TCGv lo1, lo2, t1, t2;
+    uint64_t amask, tabl, tabr;
+    int shift, imask, omask;
+
+    if (cc) {
+        tcg_gen_mov_tl(cpu_cc_src, s1);
+        tcg_gen_mov_tl(cpu_cc_src2, s2);
+        tcg_gen_sub_tl(cpu_cc_dst, s1, s2);
+        tcg_gen_movi_i32(cpu_cc_op, CC_OP_SUB);
+        dc->cc_op = CC_OP_SUB;
+    }
+
+    /* Theory of operation: there are two tables, left and right (not to
+       be confused with the left and right versions of the opcode).  These
+       are indexed by the low 3 bits of the inputs.  To make things "easy",
+       these tables are loaded into two constants, TABL and TABR below.
+       The operation index = (input & imask) << shift calculates the index
+       into the constant, while val = (table >> index) & omask calculates
+       the value we're looking for.  */
+    switch (width) {
+    case 8:
+        imask = 0x7;
+        shift = 3;
+        omask = 0xff;
+        if (left) {
+            tabl = 0x80c0e0f0f8fcfeffULL;
+            tabr = 0xff7f3f1f0f070301ULL;
+        } else {
+            tabl = 0x0103070f1f3f7fffULL;
+            tabr = 0xfffefcf8f0e0c080ULL;
+        }
+        break;
+    case 16:
+        imask = 0x6;
+        shift = 1;
+        omask = 0xf;
+        if (left) {
+            tabl = 0x8cef;
+            tabr = 0xf731;
+        } else {
+            tabl = 0x137f;
+            tabr = 0xfec8;
+        }
+        break;
+    case 32:
+        imask = 0x4;
+        shift = 0;
+        omask = 0x3;
+        if (left) {
+            tabl = (2 << 4) | 3;
+            tabr = (3 << 4) | 1;
+        } else {
+            tabl = (1 << 4) | 3;
+            tabr = (3 << 4) | 2;
+        }
+        break;
+    default:
+        abort();
+    }
+
+    lo1 = tcg_temp_new();
+    lo2 = tcg_temp_new();
+    tcg_gen_andi_tl(lo1, s1, imask);
+    tcg_gen_andi_tl(lo2, s2, imask);
+    tcg_gen_shli_tl(lo1, lo1, shift);
+    tcg_gen_shli_tl(lo2, lo2, shift);
+
+    t1 = tcg_const_tl(tabl);
+    t2 = tcg_const_tl(tabr);
+    tcg_gen_shr_tl(lo1, t1, lo1);
+    tcg_gen_shr_tl(lo2, t2, lo2);
+    tcg_gen_andi_tl(dst, lo1, omask);
+    tcg_gen_andi_tl(lo2, lo2, omask);
+
+    amask = -8;
+    if (AM_CHECK(dc)) {
+        amask &= 0xffffffffULL;
+    }
+    tcg_gen_andi_tl(s1, s1, amask);
+    tcg_gen_andi_tl(s2, s2, amask);
+
+    /* We want to compute
+        dst = (s1 == s2 ? lo1 : lo1 & lo2).
+       We've already done dst = lo1, so this reduces to
+        dst &= (s1 == s2 ? -1 : lo2)
+       Which we perform by
+        lo2 |= -(s1 == s2)
+        dst &= lo2
+    */
+    tcg_gen_setcond_tl(TCG_COND_EQ, t1, s1, s2);
+    tcg_gen_neg_tl(t1, t1);
+    tcg_gen_or_tl(lo2, lo2, t1);
+    tcg_gen_and_tl(dst, dst, lo2);
+
+    tcg_temp_free(lo1);
+    tcg_temp_free(lo2);
+    tcg_temp_free(t1);
+    tcg_temp_free(t2);
+}
 #endif
 
 #define CHECK_IU_FEATURE(dc, FEATURE)                      \
@@ -3954,19 +4057,89 @@ static void disas_sparc_insn(DisasContext * dc)
 
                 switch (opf) {
                 case 0x000: /* VIS I edge8cc */
+                    CHECK_FPU_FEATURE(dc, VIS1);
+                    gen_movl_reg_TN(rs1, cpu_src1);
+                    gen_movl_reg_TN(rs2, cpu_src2);
+                    gen_edge(dc, cpu_dst, cpu_src1, cpu_src2, 8, 1, 0);
+                    gen_movl_TN_reg(rd, cpu_dst);
+                    break;
                 case 0x001: /* VIS II edge8n */
+                    CHECK_FPU_FEATURE(dc, VIS2);
+                    gen_movl_reg_TN(rs1, cpu_src1);
+                    gen_movl_reg_TN(rs2, cpu_src2);
+                    gen_edge(dc, cpu_dst, cpu_src1, cpu_src2, 8, 0, 0);
+                    gen_movl_TN_reg(rd, cpu_dst);
+                    break;
                 case 0x002: /* VIS I edge8lcc */
+                    CHECK_FPU_FEATURE(dc, VIS1);
+                    gen_movl_reg_TN(rs1, cpu_src1);
+                    gen_movl_reg_TN(rs2, cpu_src2);
+                    gen_edge(dc, cpu_dst, cpu_src1, cpu_src2, 8, 1, 1);
+                    gen_movl_TN_reg(rd, cpu_dst);
+                    break;
                 case 0x003: /* VIS II edge8ln */
+                    CHECK_FPU_FEATURE(dc, VIS2);
+                    gen_movl_reg_TN(rs1, cpu_src1);
+                    gen_movl_reg_TN(rs2, cpu_src2);
+                    gen_edge(dc, cpu_dst, cpu_src1, cpu_src2, 8, 0, 1);
+                    gen_movl_TN_reg(rd, cpu_dst);
+                    break;
                 case 0x004: /* VIS I edge16cc */
+                    CHECK_FPU_FEATURE(dc, VIS1);
+                    gen_movl_reg_TN(rs1, cpu_src1);
+                    gen_movl_reg_TN(rs2, cpu_src2);
+                    gen_edge(dc, cpu_dst, cpu_src1, cpu_src2, 16, 1, 0);
+                    gen_movl_TN_reg(rd, cpu_dst);
+                    break;
                 case 0x005: /* VIS II edge16n */
+                    CHECK_FPU_FEATURE(dc, VIS2);
+                    gen_movl_reg_TN(rs1, cpu_src1);
+                    gen_movl_reg_TN(rs2, cpu_src2);
+                    gen_edge(dc, cpu_dst, cpu_src1, cpu_src2, 16, 0, 0);
+                    gen_movl_TN_reg(rd, cpu_dst);
+                    break;
                 case 0x006: /* VIS I edge16lcc */
+                    CHECK_FPU_FEATURE(dc, VIS1);
+                    gen_movl_reg_TN(rs1, cpu_src1);
+                    gen_movl_reg_TN(rs2, cpu_src2);
+                    gen_edge(dc, cpu_dst, cpu_src1, cpu_src2, 16, 1, 1);
+                    gen_movl_TN_reg(rd, cpu_dst);
+                    break;
                 case 0x007: /* VIS II edge16ln */
+                    CHECK_FPU_FEATURE(dc, VIS2);
+                    gen_movl_reg_TN(rs1, cpu_src1);
+                    gen_movl_reg_TN(rs2, cpu_src2);
+                    gen_edge(dc, cpu_dst, cpu_src1, cpu_src2, 16, 0, 1);
+                    gen_movl_TN_reg(rd, cpu_dst);
+                    break;
                 case 0x008: /* VIS I edge32cc */
+                    CHECK_FPU_FEATURE(dc, VIS1);
+                    gen_movl_reg_TN(rs1, cpu_src1);
+                    gen_movl_reg_TN(rs2, cpu_src2);
+                    gen_edge(dc, cpu_dst, cpu_src1, cpu_src2, 32, 1, 0);
+                    gen_movl_TN_reg(rd, cpu_dst);
+                    break;
                 case 0x009: /* VIS II edge32n */
+                    CHECK_FPU_FEATURE(dc, VIS2);
+                    gen_movl_reg_TN(rs1, cpu_src1);
+                    gen_movl_reg_TN(rs2, cpu_src2);
+                    gen_edge(dc, cpu_dst, cpu_src1, cpu_src2, 32, 0, 0);
+                    gen_movl_TN_reg(rd, cpu_dst);
+                    break;
                 case 0x00a: /* VIS I edge32lcc */
+                    CHECK_FPU_FEATURE(dc, VIS1);
+                    gen_movl_reg_TN(rs1, cpu_src1);
+                    gen_movl_reg_TN(rs2, cpu_src2);
+                    gen_edge(dc, cpu_dst, cpu_src1, cpu_src2, 32, 1, 1);
+                    gen_movl_TN_reg(rd, cpu_dst);
+                    break;
                 case 0x00b: /* VIS II edge32ln */
-                    // XXX
-                    goto illegal_insn;
+                    CHECK_FPU_FEATURE(dc, VIS2);
+                    gen_movl_reg_TN(rs1, cpu_src1);
+                    gen_movl_reg_TN(rs2, cpu_src2);
+                    gen_edge(dc, cpu_dst, cpu_src1, cpu_src2, 32, 0, 1);
+                    gen_movl_TN_reg(rd, cpu_dst);
+                    break;
                 case 0x010: /* VIS I array8 */
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1 = get_src1(insn, cpu_src1);
-- 
1.7.6.4
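
For readers following the table trick in gen_edge above, here is a small
host-side C sketch of the lookup described in the "theory of operation"
comment.  It is an illustration only, not code from the patch: the names
edge_lookup and edge_select are made up here.  edge_select simply
reproduces the expression as written in the patch comment; it is not an
independent statement of the architectural semantics.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: fetch one entry from a table packed into a constant.
   index = (input & imask) << shift;  value = (table >> index) & omask.  */
static uint64_t edge_lookup(uint64_t table, uint64_t input,
                            unsigned imask, unsigned shift, uint64_t omask)
{
    uint64_t index = (input & imask) << shift;

    assert(index < 64);         /* the shift below must stay in range */
    return (table >> index) & omask;
}

/* Hypothetical helper: the branch-free selection from the patch comment,
       dst = (same ? lo1 : lo1 & lo2)
   computed as lo1 & (lo2 | -(same)).  */
static uint64_t edge_select(uint64_t lo1, uint64_t lo2, bool same)
{
    lo2 |= -(uint64_t)same;     /* all ones when same, unchanged otherwise */
    return lo1 & lo2;
}
```

With the width-8 constants from the left == false case, an input whose
low three bits are 1 selects 0x7f from TABL (0x0103070f1f3f7fff), and an
input whose low three bits are 6 selects 0xfe from TABR
(0xfffefcf8f0e0c080).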

* [Qemu-devel] [PATCH 14/16] target-sparc: Implement ALIGNADDR* inline.
  2011-10-26 21:15 [Qemu-devel] [PATCH v2 00/16] Sparc FPU/VIS improvements Richard Henderson
                   ` (12 preceding siblings ...)
  2011-10-26 21:15 ` [Qemu-devel] [PATCH 13/16] target-sparc: Implement EDGE* instructions Richard Henderson
@ 2011-10-26 21:15 ` Richard Henderson
  2011-10-26 21:15 ` [Qemu-devel] [PATCH 15/16] target-sparc: Implement BMASK/BSHUFFLE Richard Henderson
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2011-10-26 21:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

While ALIGNADDR was implemented out-of-line, ALIGNADDRL was not
implemented at all.  However, this is a very simple operation,
so we're better off doing this inline.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-sparc/helper.h     |    1 -
 target-sparc/translate.c  |   24 ++++++++++++++++++++++--
 target-sparc/vis_helper.c |   11 -----------
 3 files changed, 22 insertions(+), 14 deletions(-)

diff --git a/target-sparc/helper.h b/target-sparc/helper.h
index 07c39a9..73fb0ee 100644
--- a/target-sparc/helper.h
+++ b/target-sparc/helper.h
@@ -17,7 +17,6 @@ DEF_HELPER_2(wrccr, void, env, tl)
 DEF_HELPER_1(rdcwp, tl, env)
 DEF_HELPER_2(wrcwp, void, env, tl)
 DEF_HELPER_FLAGS_2(array8, TCG_CALL_CONST | TCG_CALL_PURE, tl, tl, tl)
-DEF_HELPER_3(alignaddr, tl, env, tl, tl)
 DEF_HELPER_1(popc, tl, tl)
 DEF_HELPER_3(ldda_asi, void, tl, int, int)
 DEF_HELPER_4(ldf_asi, void, tl, int, int, int)
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index d02cf06..685a907 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -2324,6 +2324,20 @@ static void gen_edge(DisasContext *dc, TCGv dst, TCGv s1, TCGv s2,
     tcg_temp_free(t1);
     tcg_temp_free(t2);
 }
+
+static void gen_alignaddr(TCGv dst, TCGv s1, TCGv s2, bool left)
+{
+    TCGv tmp = tcg_temp_new();
+
+    tcg_gen_add_tl(tmp, s1, s2);
+    tcg_gen_andi_tl(dst, tmp, -8);
+    if (left) {
+        tcg_gen_neg_tl(tmp, tmp);
+    }
+    tcg_gen_deposit_tl(cpu_gsr, cpu_gsr, tmp, 0, 3);
+
+    tcg_temp_free(tmp);
+}
 #endif
 
 #define CHECK_IU_FEATURE(dc, FEATURE)                      \
@@ -4167,11 +4181,17 @@ static void disas_sparc_insn(DisasContext * dc)
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1 = get_src1(insn, cpu_src1);
                     gen_movl_reg_TN(rs2, cpu_src2);
-                    gen_helper_alignaddr(cpu_dst, cpu_env, cpu_src1, cpu_src2);
+                    gen_alignaddr(cpu_dst, cpu_src1, cpu_src2, 0);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
-                case 0x019: /* VIS II bmask */
                 case 0x01a: /* VIS I alignaddrl */
+                    CHECK_FPU_FEATURE(dc, VIS1);
+                    cpu_src1 = get_src1(insn, cpu_src1);
+                    gen_movl_reg_TN(rs2, cpu_src2);
+                    gen_alignaddr(cpu_dst, cpu_src1, cpu_src2, 1);
+                    gen_movl_TN_reg(rd, cpu_dst);
+                    break;
+                case 0x019: /* VIS II bmask */
                     // XXX
                     goto illegal_insn;
                 case 0x020: /* VIS I fcmple16 */
diff --git a/target-sparc/vis_helper.c b/target-sparc/vis_helper.c
index 59ca8d7..40adb47 100644
--- a/target-sparc/vis_helper.c
+++ b/target-sparc/vis_helper.c
@@ -41,17 +41,6 @@ target_ulong helper_array8(target_ulong pixel_addr, target_ulong cubesize)
         GET_FIELD_SP(pixel_addr, 11, 12);
 }
 
-target_ulong helper_alignaddr(CPUState *env, target_ulong addr,
-                              target_ulong offset)
-{
-    uint64_t tmp;
-
-    tmp = addr + offset;
-    env->gsr &= ~7ULL;
-    env->gsr |= tmp & 7ULL;
-    return tmp & ~7ULL;
-}
-
 uint64_t helper_faligndata(CPUState *env, uint64_t src1, uint64_t src2)
 {
     uint64_t tmp;
-- 
1.7.6.4
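
As a cross-check of the inline sequence in gen_alignaddr, the same
computation can be written as a plain C model.  This is a sketch under
assumptions, not code from the patch: AlignAddrResult and alignaddr_model
are hypothetical names, and the GSR is passed in and returned by value
rather than living in CPU state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result pair: the aligned address and the updated GSR.  */
typedef struct {
    uint64_t addr;
    uint64_t gsr;
} AlignAddrResult;

static AlignAddrResult alignaddr_model(uint64_t gsr, uint64_t s1,
                                       uint64_t s2, bool left)
{
    AlignAddrResult r;
    uint64_t tmp = s1 + s2;

    r.addr = tmp & ~7ULL;               /* tcg_gen_andi_tl(dst, tmp, -8) */
    if (left) {
        tmp = -tmp;                     /* tcg_gen_neg_tl(tmp, tmp) */
    }
    /* tcg_gen_deposit_tl(cpu_gsr, cpu_gsr, tmp, 0, 3): replace GSR
       bits 2:0 with the low three bits of tmp.  */
    r.gsr = (gsr & ~7ULL) | (tmp & 7);
    assert((r.addr & 7) == 0);
    return r;
}
```

With left == false this matches the removed helper_alignaddr: the sum is
rounded down to a multiple of 8 and its low three bits go into the GSR.
With left == true the low three bits of the negated sum are stored
instead.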

* [Qemu-devel] [PATCH 15/16] target-sparc: Implement BMASK/BSHUFFLE.
  2011-10-26 21:15 [Qemu-devel] [PATCH v2 00/16] Sparc FPU/VIS improvements Richard Henderson
                   ` (13 preceding siblings ...)
  2011-10-26 21:15 ` [Qemu-devel] [PATCH 14/16] target-sparc: Implement ALIGNADDR* inline Richard Henderson
@ 2011-10-26 21:15 ` Richard Henderson
  2011-10-26 21:15 ` [Qemu-devel] [PATCH 16/16] target-sparc: Implement FALIGNDATA inline Richard Henderson
  2011-10-26 21:18 ` [Qemu-devel] [PATCH v2 00/16] Sparc FPU/VIS improvements Richard Henderson
  16 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2011-10-26 21:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-sparc/helper.h     |    1 +
 target-sparc/translate.c  |   14 ++++++++++----
 target-sparc/vis_helper.c |   29 +++++++++++++++++++++++++++++
 3 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/target-sparc/helper.h b/target-sparc/helper.h
index 73fb0ee..3ee12a9 100644
--- a/target-sparc/helper.h
+++ b/target-sparc/helper.h
@@ -140,6 +140,7 @@ DEF_HELPER_FLAGS_3(pdist, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fpack16, TCG_CALL_CONST | TCG_CALL_PURE, i32, i64, i64)
 DEF_HELPER_FLAGS_3(fpack32, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fpackfix, TCG_CALL_CONST | TCG_CALL_PURE, i32, i64, i64)
+DEF_HELPER_FLAGS_3(bshuffle, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64, i64)
 #define VIS_HELPER(name)                                                 \
     DEF_HELPER_FLAGS_2(f ## name ## 16, TCG_CALL_CONST | TCG_CALL_PURE,  \
                        i64, i64, i64)                                    \
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 685a907..50fc587 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -4192,8 +4192,13 @@ static void disas_sparc_insn(DisasContext * dc)
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x019: /* VIS II bmask */
-                    // XXX
-                    goto illegal_insn;
+                    CHECK_FPU_FEATURE(dc, VIS2);
+                    cpu_src1 = get_src1(insn, cpu_src1);
+                    gen_movl_reg_TN(rs2, cpu_src2);
+                    tcg_gen_add_tl(cpu_dst, cpu_src1, cpu_src2);
+                    tcg_gen_deposit_tl(cpu_gsr, cpu_gsr, cpu_dst, 32, 32);
+                    gen_movl_TN_reg(rd, cpu_dst);
+                    break;
                 case 0x020: /* VIS I fcmple16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
@@ -4314,8 +4319,9 @@ static void disas_sparc_insn(DisasContext * dc)
                     gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fpmerge);
                     break;
                 case 0x04c: /* VIS II bshuffle */
-                    // XXX
-                    goto illegal_insn;
+                    CHECK_FPU_FEATURE(dc, VIS2);
+                    gen_gsr_fop_DDD(dc, rd, rs1, rs2, gen_helper_bshuffle);
+                    break;
                 case 0x04d: /* VIS I fexpand */
                     CHECK_FPU_FEATURE(dc, VIS1);
                     gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fexpand);
diff --git a/target-sparc/vis_helper.c b/target-sparc/vis_helper.c
index 40adb47..7830120 100644
--- a/target-sparc/vis_helper.c
+++ b/target-sparc/vis_helper.c
@@ -470,3 +470,32 @@ uint32_t helper_fpackfix(uint64_t gsr, uint64_t rs2)
 
     return ret;
 }
+
+uint64_t helper_bshuffle(uint64_t gsr, uint64_t src1, uint64_t src2)
+{
+    union {
+        uint64_t ll[2];
+        uint8_t b[16];
+    } s;
+    VIS64 r;
+    uint32_t i, mask, host;
+
+    /* Set up S such that we can index across all of the bytes.  */
+#ifdef HOST_WORDS_BIGENDIAN
+    s.ll[0] = src1;
+    s.ll[1] = src2;
+    host = 0;
+#else
+    s.ll[1] = src1;
+    s.ll[0] = src2;
+    host = 15;
+#endif
+    mask = gsr >> 32;
+
+    for (i = 0; i < 8; ++i) {
+        unsigned e = (mask >> (28 - i*4)) & 0xf;
+        r.VIS_B64(i) = s.b[e ^ host];
+    }
+
+    return r.ll;
+}
-- 
1.7.6.4
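
Since this patch carries no description, here is a rough C model of the
pieces it adds, for illustration only.  The names bmask_model,
bshuffle_selector and bshuffle_pick are hypothetical.  BMASK also writes
the sum to rd, which is not modelled here.  The sketch deliberately stops
short of assembling the result register, because the VIS_B64 macro that
places each byte is only partly shown in these patches.

```c
#include <assert.h>
#include <stdint.h>

/* BMASK as generated above: add the operands and deposit the low 32 bits
   of the sum into GSR bits 63:32.  Returns the new GSR value.  */
static uint64_t bmask_model(uint64_t gsr, uint64_t s1, uint64_t s2)
{
    uint64_t sum = s1 + s2;

    return (gsr & 0xffffffffULL) | (sum << 32);
}

/* Selector used in loop iteration i of helper_bshuffle: nibble i of the
   mask, counting from the most significant nibble.  */
static unsigned bshuffle_selector(uint64_t gsr, unsigned i)
{
    uint32_t mask = gsr >> 32;

    assert(i < 8);
    return (mask >> (28 - i * 4)) & 0xf;
}

/* Host-independent equivalent of s.b[e ^ host]: byte e of the 16-byte
   value src1:src2, where byte 0 is the most significant byte of src1.  */
static uint8_t bshuffle_pick(uint64_t src1, uint64_t src2, unsigned e)
{
    assert(e < 16);
    if (e < 8) {
        return (uint8_t)(src1 >> (56 - 8 * e));
    }
    return (uint8_t)(src2 >> (56 - 8 * (e - 8)));
}
```

The "e ^ host" expression in the helper exists so that the same selector
picks the same guest byte on both big-endian and little-endian hosts;
bshuffle_pick gets the same effect with shifts instead of a union.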

* [Qemu-devel] [PATCH 16/16] target-sparc: Implement FALIGNDATA inline.
  2011-10-26 21:15 [Qemu-devel] [PATCH v2 00/16] Sparc FPU/VIS improvements Richard Henderson
                   ` (14 preceding siblings ...)
  2011-10-26 21:15 ` [Qemu-devel] [PATCH 15/16] target-sparc: Implement BMASK/BSHUFFLE Richard Henderson
@ 2011-10-26 21:15 ` Richard Henderson
  2011-10-26 21:18 ` [Qemu-devel] [PATCH v2 00/16] Sparc FPU/VIS improvements Richard Henderson
  16 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2011-10-26 21:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

This is a relatively simple sequence of shifts.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-sparc/helper.h     |    1 -
 target-sparc/translate.c  |   32 ++++++++++++++++++++++++++------
 target-sparc/vis_helper.c |   12 ------------
 3 files changed, 26 insertions(+), 19 deletions(-)

diff --git a/target-sparc/helper.h b/target-sparc/helper.h
index 3ee12a9..faaf8dc 100644
--- a/target-sparc/helper.h
+++ b/target-sparc/helper.h
@@ -125,7 +125,6 @@ DEF_HELPER_1(fqtoi, s32, env)
 DEF_HELPER_2(fstox, s64, env, f32)
 DEF_HELPER_2(fdtox, s64, env, f64)
 DEF_HELPER_1(fqtox, s64, env)
-DEF_HELPER_3(faligndata, i64, env, i64, i64)
 
 DEF_HELPER_FLAGS_2(fpmerge, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fmul8x16, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 50fc587..9318540 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -2338,6 +2338,31 @@ static void gen_alignaddr(TCGv dst, TCGv s1, TCGv s2, bool left)
 
     tcg_temp_free(tmp);
 }
+
+static void gen_faligndata(TCGv dst, TCGv gsr, TCGv s1, TCGv s2)
+{
+    TCGv t1, t2, shift;
+
+    t1 = tcg_temp_new();
+    t2 = tcg_temp_new();
+    shift = tcg_temp_new();
+
+    tcg_gen_andi_tl(shift, gsr, 7);
+    tcg_gen_shli_tl(shift, shift, 3);
+    tcg_gen_shl_tl(t1, s1, shift);
+
+    /* A shift of 64 does not produce 0 in TCG.  Divide this into a
+       shift of (up to 63) followed by a constant shift of 1.  */
+    tcg_gen_xori_tl(shift, shift, 63);
+    tcg_gen_shr_tl(t2, s2, shift);
+    tcg_gen_shri_tl(t2, t2, 1);
+
+    tcg_gen_or_tl(dst, t1, t2);
+
+    tcg_temp_free(t1);
+    tcg_temp_free(t2);
+    tcg_temp_free(shift);
+}
 #endif
 
 #define CHECK_IU_FEATURE(dc, FEATURE)                      \
@@ -4307,12 +4332,7 @@ static void disas_sparc_insn(DisasContext * dc)
                     break;
                 case 0x048: /* VIS I faligndata */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_faligndata(cpu_dst_64, cpu_env,
-                                          cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_gsr_fop_DDD(dc, rd, rs1, rs2, gen_faligndata);
                     break;
                 case 0x04b: /* VIS I fpmerge */
                     CHECK_FPU_FEATURE(dc, VIS1);
diff --git a/target-sparc/vis_helper.c b/target-sparc/vis_helper.c
index 7830120..a992c29 100644
--- a/target-sparc/vis_helper.c
+++ b/target-sparc/vis_helper.c
@@ -41,18 +41,6 @@ target_ulong helper_array8(target_ulong pixel_addr, target_ulong cubesize)
         GET_FIELD_SP(pixel_addr, 11, 12);
 }
 
-uint64_t helper_faligndata(CPUState *env, uint64_t src1, uint64_t src2)
-{
-    uint64_t tmp;
-
-    tmp = src1 << ((env->gsr & 7) * 8);
-    /* on many architectures a shift of 64 does nothing */
-    if ((env->gsr & 7) != 0) {
-        tmp |= src2 >> (64 - (env->gsr & 7) * 8);
-    }
-    return tmp;
-}
-
 #ifdef HOST_WORDS_BIGENDIAN
 #define VIS_B64(n) b[7 - (n)]
 #define VIS_W64(n) w[3 - (n)]
-- 
1.7.6.4
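
The split shift in gen_faligndata can be checked against a plain C model.
This is a sketch, not code from the patch; faligndata_model is a
hypothetical name and the GSR is passed by value.  The same concern
applies on the host side: in C, shifting a 64-bit value by 64 is
undefined, so the model uses the same "shift by up to 63, then by 1"
split instead of the conditional in the removed helper.

```c
#include <assert.h>
#include <stdint.h>

static uint64_t faligndata_model(uint64_t gsr, uint64_t s1, uint64_t s2)
{
    unsigned shift = (gsr & 7) * 8;     /* 0, 8, ..., 56 bits */
    uint64_t hi, lo;

    assert(shift <= 56);
    hi = s1 << shift;

    /* For 0 <= shift <= 63, shift ^ 63 equals 63 - shift, so the two
       right shifts together move s2 by 64 - shift bits.  When shift
       is 0 this yields 0 without ever shifting by 64.  */
    lo = (s2 >> (shift ^ 63)) >> 1;

    return hi | lo;
}
```

For example, with GSR.align = 3 the result is the low five bytes of s1
followed by the high three bytes of s2; with GSR.align = 0 it is s1
unchanged.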

* Re: [Qemu-devel] [PATCH v2 00/16] Sparc FPU/VIS improvements
  2011-10-26 21:15 [Qemu-devel] [PATCH v2 00/16] Sparc FPU/VIS improvements Richard Henderson
                   ` (15 preceding siblings ...)
  2011-10-26 21:15 ` [Qemu-devel] [PATCH 16/16] target-sparc: Implement FALIGNDATA inline Richard Henderson
@ 2011-10-26 21:18 ` Richard Henderson
  2011-10-27 20:59   ` Blue Swirl
  16 siblings, 1 reply; 19+ messages in thread
From: Richard Henderson @ 2011-10-26 21:18 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

On 10/26/2011 02:15 PM, Richard Henderson wrote:
> Changes v1->v2:
>   * sparc-linux-user and unrelated tcg patches removed,
>   * fabsd env/constification folded into patch 5
>   * always_inline hack and fallout in patch 6 mitigated by marking all
>     of the helper functions inline as well.
>   * some coding-style issues cleaned up
>   * rebased vs mainline, now that blueswirl's series is installed

Oh, it's also pushed to

  git://repo.or.cz/qemu/rth.git rth/vis2


r~

* Re: [Qemu-devel] [PATCH v2 00/16] Sparc FPU/VIS improvements
  2011-10-26 21:18 ` [Qemu-devel] [PATCH v2 00/16] Sparc FPU/VIS improvements Richard Henderson
@ 2011-10-27 20:59   ` Blue Swirl
  0 siblings, 0 replies; 19+ messages in thread
From: Blue Swirl @ 2011-10-27 20:59 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel

On Wed, Oct 26, 2011 at 21:18, Richard Henderson <rth@twiddle.net> wrote:
> On 10/26/2011 02:15 PM, Richard Henderson wrote:
>> Changes v1->v2:
>>   * sparc-linux-user and unrelated tcg patches removed,
>>   * fabsd env/constification folded into patch 5
>>   * always_inline hack and fallout in patch 6 mitigated by marking all
>>     of the helper functions inline as well.
>>   * some coding-style issues cleaned up
>>   * rebased vs mainline, now that blueswirl's series is installed
>
> Oh, it's also pushed to
>
>  git://repo.or.cz/qemu/rth.git rth/vis2

Thanks, pulled.
